Archive for June, 2012

VirtualBox + kgdb: How to analyze ext4 filesystem processing in Linux kernel?

After the “gdb ./vmlinux” at the root of linux kernel source code directory, do a “rbreak ext4_*”:

(gdb) rbreak ext4_*
Breakpoint 1 at 0xffffffff8120ceb9: file fs/ext4/balloc.c, line 640.
int ext4_bg_has_super(struct super_block *, ext4_group_t);
Breakpoint 2 at 0xffffffff8120cf79: file fs/ext4/balloc.c, line 688.
long unsigned int ext4_bg_num_gdb(struct super_block *, ext4_group_t);
Breakpoint 3 at 0xffffffff8120cc65: file fs/ext4/balloc.c, line 478.
int ext4_claim_free_clusters(struct ext4_sb_info *, s64, unsigned int);
Breakpoint 4 at 0xffffffff8120ce50: file fs/ext4/balloc.c, line 565.
ext4_fsblk_t ext4_count_free_clusters(struct super_block *);
Breakpoint 5 at 0xffffffff8120d90e: file fs/ext4/balloc.c, line 222.
unsigned int ext4_free_clusters_after_init(struct super_block *, ext4_group_t,
struct ext4_group_desc *);
Breakpoint 6 at 0xffffffff8120c9fd: file fs/ext4/balloc.c, line 250.
struct ext4_group_desc *ext4_get_group_desc(struct super_block *,
ext4_group_t, struct buffer_head **);
Breakpoint 7 at 0xffffffff8120c9a9: file fs/ext4/balloc.c, line 765.
void ext4_get_group_no_and_offset(struct super_block *, ext4_fsblk_t,
ext4_group_t *, ext4_grpblk_t *);
Breakpoint 8 at 0xffffffff8120d0c6: file fs/ext4/balloc.c, line 167.
void ext4_init_block_bitmap(struct super_block *, struct buffer_head *,
ext4_group_t, struct ext4_group_desc *);
Breakpoint 9 at 0xffffffff8120d969: file fs/ext4/balloc.c, line 765. (2 locations)
ext4_fsblk_t ext4_inode_to_goal_block(struct inode *);
---Type <return> to continue, or q <return> to quit---

And so keep entering “enter” to set the breakpoints automatically. Not all the breakpoints can be set successfully. At about page 7 of the above some error occurred.

/build/buildd/gdb-7.1/gdb/breakpoint.c:6481: internal-error: expand_line_sal_maybe: Assertion `found' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)
Please answer y or n.
/build/buildd/gdb-7.1/gdb/breakpoint.c:6481: internal-error: expand_line_sal_maybe: Assertion `found' failed.

So I stopped at about page 6 of setting breakpoints, good enough numbers of breakpoints.

(gdb) cont
Continuing.
[New Thread 180]
[Switching to Thread 180]

Breakpoint 6, ext4_get_group_desc (sb=0xffff880030c4f800, block_group=0,
bh=0x0) at fs/ext4/balloc.c:250
250 if (EXT4_HAS_INCOMPAT_FEATURE(sb, EXT4_FEATURE_INCOMPAT_FLEX_BG)) {
(gdb) bt
#0 ext4_get_group_desc (sb=0xffff880030c4f800, block_group=0, bh=0x0)
at fs/ext4/balloc.c:250
#1 0xffffffff812351e5 in ext4_check_descriptors (sb=0xffff880030c4f800,
data=<value optimized out>, silent=6) at fs/ext4/super.c:1965
#2 ext4_fill_super (sb=0xffff880030c4f800, data=<value optimized out>,
silent=6) at fs/ext4/super.c:3401
#3 0xffffffff8117cdd6 in mount_bdev (fs_type=<value optimized out>,
flags=<value optimized out>, dev_name=<value optimized out>, data=0x0,
fill_super=0xffffffff81234100 <ext4_fill_super>) at fs/super.c:1024
#4 0xffffffff812236e5 in ext4_mount (fs_type=<value optimized out>,
flags=<value optimized out>, dev_name=<value optimized out>,
data=<value optimized out>) at fs/ext4/super.c:4779
#5 0xffffffff8117dc13 in mount_fs (type=0xffffffff81c5c8c0, flags=32769,
name=<value optimized out>, data=<value optimized out>) at fs/super.c:1130
#6 0xffffffff81197972 in vfs_kern_mount (type=0xffffffff81c5c8c0,
flags=32769,
name=0xffff88001c6fee00 "/dev/disk/by-uuid/30532f8e-6a1e-4cc1-96d9-e0b01a27e68e", data=0x0) at fs/namespace.c:707
#7 0xffffffff811981b4 in do_kern_mount (fstype=0xffff88001c69ebf0 "ext4",
flags=32769,
name=0xffff88001c6fee00 "/dev/disk/by-uuid/30532f8e-6a1e-4cc1-96d9-e0b01a27e68e", data=<value optimized out>) at fs/namespace.c:1804
#8 0xffffffff811999a4 in do_new_mount (
---Type <return> to continue, or q <return> to quit---
dev_name=0xffff88001c6fee00 "/dev/disk/by-uuid/30532f8e-6a1e-4cc1-96d9-e0b01a27e68e", dir_name=<value optimized out>, type_page=0xffff88001c69ebf0 "ext4",
flags=<value optimized out>, data_page=0x0) at fs/namespace.c:1864
#9 do_mount (
dev_name=0xffff88001c6fee00 "/dev/disk/by-uuid/30532f8e-6a1e-4cc1-96d9-e0b01a27e68e", dir_name=<value optimized out>, type_page=0xffff88001c69ebf0 "ext4",
flags=<value optimized out>, data_page=0x0) at fs/namespace.c:2188
#10 0xffffffff8119a190 in sys_mount (
dev_name=0x7fff22af4e30 "/dev/disk/by-uuid/30532f8e-6a1e-4cc1-96d9-e0b01a27e68e", dir_name=<value optimized out>, type=<value optimized out>, flags=32769,
data=0x0) at fs/namespace.c:2387
#11 <signal handler called>
#12 0x00007f5a220d9cea in __brk_reservation_fn_dmi_alloc__ ()
#13 0xffff880030ceadc0 in __brk_reservation_fn_dmi_alloc__ ()
#14 0xffffffff81c1ece0 in ?? ()
#15 0x0000000000000000 in ?? ()

Analysing the stacktrace, we should know something:

  • These stacktrace are kernel stacks, and so since transitioning from userspace to kernel is always starting with syscall, the lowest part of the stacktrace is always the syscall function.
  • Reading upwards, it shows the calling sequence from syscall functions to other functions.
  • So in the stack trace above, we know the calling sequence start with syscall sys_mount(), and subsequently other functions – reading upwards.

    Some error occurred, so in the starting up of second attempt, the following (different) stacktrace were recorded:

    (gdb) bt
    #0 ext4_get_group_desc (sb=0xffff880030cdc800, block_group=0, bh=0x0)
    at fs/ext4/balloc.c:250
    #1 0xffffffff812351e5 in ext4_check_descriptors (sb=0xffff880030cdc800,
    data=<value optimized out>, silent=6) at fs/ext4/super.c:1965
    #2 ext4_fill_super (sb=0xffff880030cdc800, data=<value optimized out>,
    silent=6) at fs/ext4/super.c:3401
    #3 0xffffffff8117cdd6 in mount_bdev (fs_type=<value optimized out>,
    flags=<value optimized out>, dev_name=<value optimized out>, data=0x0,
    fill_super=0xffffffff81234100 <ext4_fill_super>) at fs/super.c:1024
    #4 0xffffffff812236e5 in ext4_mount (fs_type=<value optimized out>,
    flags=<value optimized out>, dev_name=<value optimized out>,
    data=<value optimized out>) at fs/ext4/super.c:4779
    #5 0xffffffff8117dc13 in mount_fs (type=0xffffffff81c5c8c0, flags=32769,
    name=<value optimized out>, data=<value optimized out>) at fs/super.c:1130
    #6 0xffffffff81197972 in vfs_kern_mount (type=0xffffffff81c5c8c0,
    flags=32769,
    name=0xffff88001c754140 "/dev/disk/by-uuid/30532f8e-6a1e-4cc1-96d9-e0b01a27e68e", data=0x0) at fs/namespace.c:707
    #7 0xffffffff811981b4 in do_kern_mount (fstype=0xffff88001c683570 "ext4",
    flags=32769,
    name=0xffff88001c754140 "/dev/disk/by-uuid/30532f8e-6a1e-4cc1-96d9-e0b01a27e68e", data=<value optimized out>) at fs/namespace.c:1804
    #8 0xffffffff811999a4 in do_new_mount (
    ---Type <return> to continue, or q <return> to quit---
    dev_name=0xffff88001c754140 "/dev/disk/by-uuid/30532f8e-6a1e-4cc1-96d9-e0b01a27e68e", dir_name=<value optimized out>, type_page=0xffff88001c683570 "ext4",
    flags=<value optimized out>, data_page=0x0) at fs/namespace.c:1864
    #9 do_mount (
    dev_name=0xffff88001c754140 "/dev/disk/by-uuid/30532f8e-6a1e-4cc1-96d9-e0b01a27e68e", dir_name=<value optimized out>, type_page=0xffff88001c683570 "ext4",
    flags=<value optimized out>, data_page=0x0) at fs/namespace.c:2188
    #10 0xffffffff8119a190 in sys_mount (
    dev_name=0x7fffa93e9e30 "/dev/disk/by-uuid/30532f8e-6a1e-4cc1-96d9-e0b01a27e68e", dir_name=<value optimized out>, type=<value optimized out>, flags=32769,
    data=0x0) at fs/namespace.c:2387
    #11 <signal handler called>
    #12 0x00007fa7b17decea in __brk_reservation_fn_dmi_alloc__ ()
    #13 0x0000000000000000 in ?? ()
    (gdb) cont
    Continuing.

    And it continue, never seemingly terminate or return back, neither any displayed in VirtualBox.

    And so I restart the whole thing again, now setting only one breakpoint – ext4_get_group_desc:

    Breakpoint 1, ext4_get_group_desc (sb=0xffff880030e8f000, block_group=0,
    bh=0x0) at fs/ext4/balloc.c:250
    250 if (EXT4_HAS_INCOMPAT_FEATURE(sb, EXT4_FEATURE_INCOMPAT_FLEX_BG)) {
    (gdb) bt
    #0 ext4_get_group_desc (sb=0xffff880030e8f000, block_group=0, bh=0x0)
    at fs/ext4/balloc.c:250
    #1 0xffffffff812351e5 in ext4_check_descriptors (sb=0xffff880030e8f000,
    data=<value optimized out>, silent=6) at fs/ext4/super.c:1965
    #2 ext4_fill_super (sb=0xffff880030e8f000, data=<value optimized out>,
    silent=6) at fs/ext4/super.c:3401
    #3 0xffffffff8117cdd6 in mount_bdev (fs_type=<value optimized out>,
    flags=<value optimized out>, dev_name=<value optimized out>, data=0x0,
    fill_super=0xffffffff81234100 <ext4_fill_super>) at fs/super.c:1024
    #4 0xffffffff812236e5 in ext4_mount (fs_type=<value optimized out>,
    flags=<value optimized out>, dev_name=<value optimized out>,
    data=<value optimized out>) at fs/ext4/super.c:4779
    #5 0xffffffff8117dc13 in mount_fs (type=0xffffffff81c5c8c0, flags=32769,
    name=<value optimized out>, data=<value optimized out>) at fs/super.c:1130
    #6 0xffffffff81197972 in vfs_kern_mount (type=0xffffffff81c5c8c0,
    flags=32769,
    name=0xffff88001c679f40 "/dev/disk/by-uuid/30532f8e-6a1e-4cc1-96d9-e0b01a27e68e", data=0x0) at fs/namespace.c:707
    #7 0xffffffff811981b4 in do_kern_mount (fstype=0xffff88001c6816a8 "ext4",
    flags=32769,
    name=0xffff88001c679f40 "/dev/disk/by-uuid/30532f8e-6a1e-4cc1-96d9-e0b01a27e68e", data=<value optimized out>) at fs/namespace.c:1804
    #8 0xffffffff811999a4 in do_new_mount (
    ---Type <return> to continue, or q <return> to quit---
    dev_name=0xffff88001c679f40 "/dev/disk/by-uuid/30532f8e-6a1e-4cc1-96d9-e0b01a27e68e", dir_name=<value optimized out>, type_page=0xffff88001c6816a8 "ext4",
    flags=<value optimized out>, data_page=0x0) at fs/namespace.c:1864
    #9 do_mount (
    dev_name=0xffff88001c679f40 "/dev/disk/by-uuid/30532f8e-6a1e-4cc1-96d9-e0b01a27e68e", dir_name=<value optimized out>, type_page=0xffff88001c6816a8 "ext4",
    flags=<value optimized out>, data_page=0x0) at fs/namespace.c:2188
    #10 0xffffffff8119a190 in sys_mount (
    dev_name=0x7fff299bfe30 "/dev/disk/by-uuid/30532f8e-6a1e-4cc1-96d9-e0b01a27e68e", dir_name=<value optimized out>, type=<value optimized out>, flags=32769,
    data=0x0) at fs/namespace.c:2387
    #11 <signal handler called>
    #12 0x00007fa6fa9b4cea in __brk_reservation_fn_dmi_alloc__ ()
    #13 0x000000000000ff4f in __brk_reservation_fn_dmi_alloc__ ()
    #14 0x000000000001eaaf in __brk_reservation_fn_dmi_alloc__ ()
    xxx
    xxx (near 500 lines of __brk_reservation_fn_dmi_alloc__())
    xxx
    #521 0x000000000001fc19 in __brk_reservation_fn_dmi_alloc__ ()
    #522 0x0000001b0000000a in __brk_reservation_fn_dmi_alloc__ ()
    #523 0x000000000000e4a6 in __brk_reservation_fn_dmi_alloc__ ()
    #524 0x000000000001fc1f in __brk_reservation_fn_dmi_alloc__ ()

    #525 0x0000000000000000 in ?? ()
    (gdb) cont
    Continuing.

    What is happening, and why is the __brk_reservation_fn_dmi_alloc__() called recursively so many times?

    VirtualBox + kgdb analysis of Linux kernel (v3.4.0-rc3)

    After compiling the kernel inside VirtualBox, copy out the vmlinux to the host, where “gdb” command is to be issued.

    Add “kgdboc=ttyS0,115200 kgdbwait” at the end of the kernel line inside the /boot/grub/grub.cfg. Eg:

    linux /boot/xxxxx kgdboc=ttyS0,115200 kgdbwait

    Then issue at host level:

    socat -d -d /tmp/serial1204 pty

    after that “/dev/pts/6” is printed:

    2012/06/17 08:04:20 socat[6051.18446744073196852992] N opening connection to AF=1 "/tmp/serial1204"
    2012/06/17 08:04:20 socat[6051.18446744073196852992] N successfully connected from local address AF=1 "\x02\xE0"
    2012/06/17 08:04:20 socat[6051.18446744073196852992] N successfully connected via s\xD1]\x7E
    2012/06/17 08:04:20 socat[6051.18446744073196852992] N PTY is /dev/pts/6
    2012/06/17 08:04:20 socat[6051.18446744073196852992] N starting data transfer loop with FDs [3,3] and [4,4]

    Use that information below (at the host level):

    gdb ./vmlinux (where vmlinux is the compiled vmlinux image from inside the VirtualBox).

    (gdb) set remotebaud 115200
    (gdb) target remote /dev/pts/6
    Remote debugging using /dev/pts/6
    kgdb_breakpoint () at kernel/debug/debug_core.c:987
    987 kernel/debug/debug_core.c: No such file or directory.
    in kernel/debug/debug_core.c

    From above, “No such file” found is because we are not located at the root of the kernel source code. Restarting “gdb ./vmlinux” at the root of the kernel source code, and using the “list” command in gdb, we can list the source code.

    (gdb) list
    982 void kgdb_breakpoint(void)
    983 {
    984 atomic_inc(&kgdb_setting_breakpoint);
    985 wmb(); /* Sync point before breakpoint */
    986 arch_kgdb_breakpoint();
    987 wmb(); /* Sync point after breakpoint */
    988 atomic_dec(&kgdb_setting_breakpoint);
    989 }
    990 EXPORT_SYMBOL_GPL(kgdb_breakpoint);
    991

    From above we can see that the very first breakpoint when kgdb stopped is at 986. So all codes that execute before this point cannot be traced by kgdb. As for how this kgdb_breakpoint() is reached, can be seen from the stacktrace:

    (gdb) bt
    #0 kgdb_breakpoint () at kernel/debug/debug_core.c:987
    #1 0xffffffff810cf486 in kgdb_initial_breakpoint (
    new_dbg_io_ops=0xffffffff81c7f9e0) at kernel/debug/debug_core.c:885
    #2 kgdb_register_io_module (new_dbg_io_ops=0xffffffff81c7f9e0)
    at kernel/debug/debug_core.c:927
    #3 0xffffffff813e4d94 in configure_kgdboc ()
    at drivers/tty/serial/kgdboc.c:197
    #4 0xffffffff81d2709e in init_kgdboc () at drivers/tty/serial/kgdboc.c:219
    #5 0xffffffff8100203f in do_one_initcall (fn=0xffffffff81d2708a )
    at init/main.c:678
    #6 0xffffffff81cf2d53 in do_initcall_level (unused=)
    at init/main.c:753
    #7 do_initcalls (unused=) at init/main.c:761
    #8 do_basic_setup (unused=) at init/main.c:780
    #9 kernel_init (unused=) at init/main.c:863
    #10 0xffffffff816608e4 in ?? () at arch/x86/kernel/entry_64.S:1204
    #11 0x0000000000000000 in ?? ()

    Essentially the caller is kernel_init()—> do_basic_setup()—> do_initcalls()—>do_initcall_level()—>do_one_initcall()—>init_kgdboc()—>configure_kgdboc()—>kgdb_register_io_module()—>kgdb_initial_breakpoint()—>kgdb_breakpoint().

    By setting breakpoints at “printk”, we can see all the stacktrace when the kernel is printing its output to the dmesg:

    (gdb) break printk
    Breakpoint 1 at 0xffffffff8164c7c6: file kernel/printk.c, line 753.
    (gdb) cont
    Continuing.
    [New Thread 19]
    [Switching to Thread 19]

    From above, we can see there is a switching of execution from one thread to another.


    Breakpoint 1, printk (
    fmt=0xffffffff819ed780 "<6>Refined TSC clocksource calibration: %lu.%03lu MHz.\n") at kernel/printk.c:753
    753 }
    (gdb) bt
    #0 printk (
    fmt=0xffffffff819ed780 "<6>Refined TSC clocksource calibration: %lu.%03lu MHz.\n") at kernel/printk.c:753
    #1 0xffffffff8101a4a1 in tsc_refine_calibration_work (
    work=<value optimized out>) at arch/x86/kernel/tsc.c:915
    #2 0xffffffff8106d26a in process_one_work (worker=<value optimized out>,
    work=0xffffffff81c18e60) at kernel/workqueue.c:1866
    #3 0xffffffff8106e2d5 in worker_thread (__worker=0xffff88003dbb4b80)
    at kernel/workqueue.c:1977
    #4 0xffffffff81072d63 in kthread (_create=0xffff88003da57d28)
    at kernel/kthread.c:121
    #5 0xffffffff816608e4 in ?? () at arch/x86/kernel/entry_64.S:1204
    #6 0x0000000000000000 in ?? ()
    (gdb) cont
    Continuing.
    [Switching to Thread 1]

    Breakpoint 1, printk (fmt=0xffffffff81a37d88 "<6>brd: module loaded\n")
    at kernel/printk.c:753
    753 }
    (gdb) bt
    #0 printk (fmt=0xffffffff81a37d88 "<6>brd: module loaded\n")
    at kernel/printk.c:753
    #1 0xffffffff81d2818b in brd_init () at drivers/block/brd.c:624
    #2 0xffffffff8100203f in do_one_initcall (fn=0xffffffff81d2803c <brd_init>)
    at init/main.c:678
    #3 0xffffffff81cf2d53 in do_initcall_level (unused=<value optimized out>)
    at init/main.c:753
    #4 do_initcalls (unused=<value optimized out>) at init/main.c:761
    #5 do_basic_setup (unused=<value optimized out>) at init/main.c:780
    #6 kernel_init (unused=<value optimized out>) at init/main.c:863
    #7 0xffffffff816608e4 in ?? () at arch/x86/kernel/entry_64.S:1204
    #8 0x0000000000000000 in ?? ()
    (gdb) cont
    Continuing.
    [Switching to Thread 19]

    Now boot up all the way by deleting all the breakpoints:

    (gdb) delete
    Delete all breakpoints? (y or n) y
    (gdb) cont
    Continuing.

    %d bloggers like this: