How to do kernel debugging via gdb over serial port via QEMU?

The host OS is Ubuntu 12.04 LTS 64-bit, with the qemu binaries installed (this includes qemu-img, qemu-system-x86_64, etc.). The Qemu guest runs 64-bit CentOS, and the Linux kernel running inside it is the one to be debugged.

First create the Qemu image if needed:

qemu-img create centos64.img 20G

Install CentOS (as the guest OS) inside the Qemu guest (assuming that the CentOS-6.4-x86_64-minimal.iso has been downloaded):

qemu-system-x86_64 -m 2048 -net user -net nic -enable-kvm -hda centos64.img -cdrom CentOS-6.4-x86_64-minimal.iso

When no debugging is needed, this is how to run the guest:

qemu-system-x86_64 -m 2048 -net user -net nic -enable-kvm centos64.img

If an additional hard disk image (e.g., ext4.img) is needed, just append “-hdb ext4.img” to the above line. Inside the guest, you may need to run “dhclient” to acquire an IP address before the guest can connect to the outside world.

When debugging is needed:

qemu-system-x86_64 -s -S -m 2048 -net user -net nic -enable-kvm centos64.img

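The “-S” option stops the guest just before it executes its first instruction, and “-s” makes Qemu listen for a gdb connection on TCP port 1234 (the connection goes over TCP to Qemu's built-in gdb stub, not through an emulated serial port). From “man qemu”: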

-S Do not start CPU at startup (you must type 'c' in the monitor).

-gdb dev
Wait for gdb connection on device dev. Typical connections will
likely be TCP-based, but also UDP, pseudo TTY, or even stdio are
reasonable use case. The latter is allowing to start qemu from
within gdb and establish the connection via a pipe:

(gdb) target remote | exec qemu -gdb stdio ...

-s Shorthand for -gdb tcp::1234, i.e. open a gdbserver on TCP port 1234.
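Since “-s” is only shorthand for “-gdb tcp::1234”, the port can be spelled out when 1234 is already in use. The following is a minimal sketch, not part of the original setup: qemu_debug_cmd is a hypothetical helper name, and it only prints the command line instead of starting Qemu.

```shell
# Print (do not run) a Qemu command line for a debug session.
# qemu_debug_cmd is a hypothetical helper; the image name and port are parameters.
qemu_debug_cmd() {
    image=$1
    port=${2:-1234}    # 1234 is the port that -s would use
    echo "qemu-system-x86_64 -S -gdb tcp::${port} -m 2048 -net user -net nic -enable-kvm ${image}"
}

# Example: same options as above, but with the gdb stub on port 4321.
qemu_debug_cmd centos64.img 4321
```

If a non-default port is used, the “target remote” command in gdb has to name the same port.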

So on the Ubuntu host, issue the gdb command:

gdb vmlinux

where “vmlinux” is the uncompressed image produced by compiling the same Linux kernel that runs INSIDE the guest machine. In other words, after installing the guest OS, download the Linux kernel source and compile it, as some Linux distributions do not provide the uncompressed vmlinux image.

In Ubuntu this is not a problem, but in CentOS the image is not available.

So the best option is to compile the kernel inside the guest, modify grub.conf to boot into that kernel version (not the default one!!), and then copy the vmlinux image out to the host machine for use with gdb.
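Before pointing gdb at the copied vmlinux, it may be worth checking that the image actually carries debug information, since gdb can only show source files and line numbers when the kernel was built with CONFIG_DEBUG_INFO. A small sketch, assuming binutils (readelf) is installed on the host; has_debug_info is a hypothetical helper name.

```shell
# Report whether an ELF section listing (as printed by `readelf -S`)
# mentions a DWARF debug_info section. Reads the listing from stdin.
has_debug_info() {
    if grep -q 'debug_info'; then
        echo "debug info present"
    else
        echo "no debug info"
    fi
}

# Usage on the host:
#   readelf -S vmlinux | has_debug_info
```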

After “gdb vmlinux”, enter “target remote localhost:1234” to connect to the gdb stub that Qemu exposes for the guest, and then “cont” to resume execution of the guest.

And doing a “bt” inside gdb:

(gdb) bt
Remote 'g' packet reply is too long: 000000000000000000000000000000000000000000000000000000000000000001000000000000002802de81ffffffffc81ea081ffffffffc81ea081ffffffff0000000000000000000000000000000002000000000000000000000000000000c075c081ffffffff0000000000000000ffffffffffffffff50370900000000002bb90381ffffffff4602000010000000180000001800000018000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000007f03000000000000000000000000000000000000000000000000000000000000000000000000ff0000000000000000ff6564645f6964002d2d6578706f727400000000000000000000000000000000000000000000000000ff0000000000ffff20000000000000004100000000000000404040404040404040404040404040405b5b5b5b5b5b5b5b5b5b5b5b5b5b5b5b202020202020202020202020202020200000000000000000000000000000000000000000000000ff0000000000ffffff20202020202000202020202020202000ffffffffffffffffffffffffffffffff000000000000000000000000000000000000000000000000000000 000000000 00000000000000000000000000000000000000000000000000000000000000000801f0000

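What is this error? Googling revealed the answer: the register packet sent by Qemu is larger than gdb expects for the architecture it has assumed, so the architecture has to be set explicitly: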

(gdb) set architecture i386:x86-64:intel
The target architecture is assumed to be i386:x86-64:intel

Trying again, the backtrace is now displayed beautifully:

(gdb) bt
#0 native_safe_halt ()
at /root/arch/x86/include/asm/irqflags.h:50
#1 0xffffffff81014a5d in raw_safe_halt ()
at /root/arch/x86/include/asm/paravirt.h:110
#2 default_idle () at arch/x86/kernel/process.c:328
#3 0xffffffff81009fc6 in cpu_idle () at arch/x86/kernel/process_64.c:140
#4 0xffffffff814f160a in rest_init () at init/main.c:483
#5 0xffffffff81c27f7b in start_kernel () at init/main.c:740
#6 0xffffffff81c2733a in x86_64_start_reservations (
real_mode_data=<optimized out>) at arch/x86/kernel/head64.c:123
#7 0xffffffff81c27438 in x86_64_start_kernel (
real_mode_data=0x93750 <Address 0x93750 out of bounds>)
at arch/x86/kernel/head64.c:94
#8 0x0000000000000000 in ?? ()
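The connection steps used so far (set the architecture, attach to the stub on port 1234) can be collected in a gdb command file so they do not have to be retyped for every session. This is a sketch only: write_gdb_script is a hypothetical helper, and whether setting the architecture before connecting avoids the 'g' packet error in every case is an assumption, not something verified here.

```shell
# Write a gdb command file that sets the architecture and then connects.
# write_gdb_script is a hypothetical helper; OUT is the file to create.
write_gdb_script() {
    out=$1
    port=${2:-1234}    # must match the port Qemu listens on
    cat > "$out" <<EOF
set architecture i386:x86-64:intel
target remote localhost:${port}
EOF
}

# Usage on the host:
#   write_gdb_script qemu.gdb
#   gdb -x qemu.gdb vmlinux
```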

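Next you can try “rbreak ^vfs_” to set a breakpoint on every function whose name starts with vfs_ (rbreak takes a regular expression, not a shell glob, so “vfs_*” would also match any name that merely contains “vfs”):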

Breakpoint 50, vfs_dq_drop (inode=0xffff88003786ca48) at fs/quota/dquot.c:1327
1327 fs/quota/dquot.c: No such file or directory.
(gdb) bt
#0 vfs_dq_drop (inode=0xffff88003786ca48) at fs/quota/dquot.c:1327
#1 0xffffffff8119c8b3 in clear_inode (inode=0xffff88003786ca48)
at fs/inode.c:334
#2 0xffffffff8119d0f3 in generic_forget_inode (inode=0xffff88003786ca48)
at fs/inode.c:1306
#3 generic_drop_inode (inode=0xffff88003786ca48) at fs/inode.c:1321
#4 0xffffffff8119bf72 in iput_final (inode=0xffff88003786ca48)
at fs/inode.c:1343
#5 iput (inode=0xffff88003786ca48) at fs/inode.c:1361
#6 0xffffffff81198bc0 in dentry_iput (dentry=0xffff880037a84cc0)
at fs/dcache.c:118
#7 0xffffffff81198d21 in d_kill (dentry=0xffff880037a84cc0) at fs/dcache.c:177
#8 0xffffffff8119a8cc in dput (dentry=0xffff880037a84cc0) at fs/dcache.c:256
#9 0xffffffff81182189 in __fput (file=0xffff88007caa1140)
at fs/file_table.c:266
#10 0xffffffff81182235 in fput (file=<optimized out>) at fs/file_table.c:199
#11 0xffffffff8117d68d in filp_close (filp=0xffff88007caa1140,
id=0xffff880037c24e80) at fs/open.c:971
#12 0xffffffff810713cf in close_files (files=0xffff880037c24e80)
at kernel/exit.c:495
#13 put_files_struct (files=0xffff880037c24e80) at kernel/exit.c:523
#14 0xffffffff81071493 in exit_files (tsk=0xffff88007c96d500)
at kernel/exit.c:557
---Type <return> to continue, or q <return> to quit---
#15 0xffffffff8107350d in do_exit (code=256) at kernel/exit.c:986
#16 0xffffffff81073c48 in do_group_exit (exit_code=256) at kernel/exit.c:1091
#17 0xffffffff81073cd7 in sys_exit_group (error_code=<optimized out>)
at kernel/exit.c:1102
#18 0xffffffff8100b072 in ?? () at arch/x86/kernel/entry_64.S:488
#19 0x00007fbd9438b028 in ?? ()
#20 0x0000000000000000 in ?? ()

From the above you can easily see which functions are the callers and which are the callees.
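Because rbreak treats its argument as a regular expression, it can help to preview what a pattern matches before setting dozens of breakpoints. A rough sketch, assuming the symbol list comes from “nm vmlinux” on the host; preview_rbreak is a hypothetical helper, and awk's regex dialect is only an approximation of the one gdb uses.

```shell
# List function names from `nm` output (stdin) that match a regex.
# Only text-section symbols (type T or t) are considered.
preview_rbreak() {
    awk -v re="$1" '($2 == "T" || $2 == "t") && $3 ~ re { print $3 }'
}

# Usage on the host:
#   nm vmlinux | preview_rbreak '^vfs_'
```

With the pattern “vfs_*” the same helper also prints names that merely contain “vfs”, since the trailing “_*” means zero or more underscores.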

But there is another problem. If you try “rbreak ext4_*” you will find that no symbols are found. Checking for “ext4” symbols in the vmlinux file (“nm vmlinux | grep ext4”) confirms that the function names are not present there. But inside the guest, “grep ext4 /proc/kallsyms” (run as root, since non-root users are not shown the real addresses) shows that the function names and addresses are available; at this point it is not obvious where kallsyms gets these names from. By the way, if the addresses in /proc/kallsyms (inside the guest) do not match the addresses in the vmlinux file (on the host), you will either be unable to set breakpoints, or the breakpoints will never be hit even after being set.
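The address check mentioned above can be done per symbol with a few lines of shell. This is a sketch under stated assumptions: the host side has been dumped with “nm vmlinux > vmlinux.syms”, the guest's /proc/kallsyms has been copied out (as root) to kallsyms.txt, and both file names as well as the helpers sym_addr and compare_sym are hypothetical.

```shell
# Print the address of an exactly named symbol from nm- or kallsyms-style text.
sym_addr() {    # usage: sym_addr SYMBOL FILE
    awk -v s="$1" '$3 == s { print $1; exit }' "$2"
}

# Compare a symbol's address in the host listing and in the guest's kallsyms copy.
compare_sym() {    # usage: compare_sym SYMBOL NM_FILE KALLSYMS_FILE
    a=$(sym_addr "$1" "$2")
    b=$(sym_addr "$1" "$3")
    if [ -z "$a" ] || [ -z "$b" ]; then
        echo "$1: missing"
    elif [ "$a" = "$b" ]; then
        echo "$1: match"
    else
        echo "$1: MISMATCH ($a vs $b)"
    fi
}

# Usage on the host:
#   compare_sym vfs_readdir vmlinux.syms kallsyms.txt
```

A “missing” result for an ext4 function is the situation described above: the name is in kallsyms but not in vmlinux.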

To work around the above problem, copy /proc/kallsyms out as a file, extract the addresses needed for the breakpoints, and use “break *0xyyyyyyyy” to set each breakpoint explicitly by address. E.g.:

Breakpoint 17, 0xffffffffa00a6a70 in ?? ()
#0 0xffffffffa00a6a70 in ?? ()
#1 0xffffffffa00a736d in ?? ()
#2 0x0000000000000099 in ?? ()
#3 0xffff88003790bcc0 in ?? ()
#4 0xffffffffffffff10 in ?? ()
#5 0xffff88003790bde0 in ?? ()
#6 0x0000000000000099 in ?? ()
#7 0xffff88003790bcc0 in ?? ()

#8 0xffff880037ab2438 in ?? ()
#9 0xffff880037ab2588 in ?? ()
#10 0xffff88007c43bc80 in ?? ()
#11 0xffffffff811b3b0f in generic_block_bmap (mapping=<optimized out>,
block=<optimized out>, get_block=<optimized out>) at fs/buffer.c:3038
Backtrace stopped: frame did not save the PC
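The extraction step that produces such address breakpoints can be scripted. A minimal sketch, assuming the guest's /proc/kallsyms was copied to a file; kallsyms_to_breaks and the file names in the usage comment are hypothetical.

```shell
# Turn kallsyms lines (stdin) into gdb "break *ADDRESS" commands for every
# text symbol (type T or t) whose name matches the given regex.
kallsyms_to_breaks() {
    awk -v re="$1" '($2 == "T" || $2 == "t") && $3 ~ re { printf "break *0x%s\n", $1 }'
}

# Usage on the host:
#   kallsyms_to_breaks '^ext4_get' < kallsyms.txt > ext4-breaks.gdb
# and then inside gdb:
#   (gdb) source ext4-breaks.gdb
```

Breakpoints set this way still stop at bare addresses, as in the backtrace above.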

No names available, quite difficult to read. There is a solution!!!

Inside the guest OS, where CentOS is running, run “lsmod | grep ext4” and you can see that ext4 is loaded as a kernel module (ext4.ko), which is why its symbols are not in vmlinux.

Next, run “grep ext4 /proc/modules” (as root) and you can read off the load address of ext4.ko (0xffffffffa009e000 in the case below).

cat /proc/modules

i2c_piix4 12608 0 - Live 0xffffffffa0127000
i2c_core 31084 1 i2c_piix4, Live 0xffffffffa0119000
sg 29350 0 - Live 0xffffffffa010c000
ext4 363344 2 - Live 0xffffffffa009e000
jbd2 91554 1 ext4, Live 0xffffffffa007c000
mbcache 8193 1 ext4, Live 0xffffffffa0075000
sd_mod 38976 3 - Live 0xffffffffa0062000
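Reading the load address off this listing can also be scripted. A small sketch, assuming the six-column layout shown above, where the address is the sixth field; module_base is a hypothetical helper name.

```shell
# Print the load address of a module from /proc/modules-style text (stdin).
# Assumes the address is the sixth field, as in the listing above.
module_base() {
    awk -v m="$1" '$1 == m { print $6; exit }'
}

# Usage inside the guest, as root:
#   module_base ext4 < /proc/modules
```

On kernels that expose it, /sys/module/ext4/sections/.text reports the address of the module's .text section directly, which is the value gdb actually wants; in the session shown here it coincides with the base address from /proc/modules.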

Now, on the Ubuntu host, inside gdb:

(gdb) add-symbol-file /tmp/ext4.ko 0xffffffffa009e000

(where 0xffffffffa009e000 is the address at which ext4.ko is loaded, as shown in /proc/modules of the guest OS, and /tmp/ext4.ko is a copy of the module file from the guest). To test whether it works:

(gdb) info breakpoints

Num Type Disp Enb Address What
1 breakpoint keep n 0xffffffffa00a6a70 in ext4_get_blocks
at fs/ext4/inode.c:1277
2 breakpoint keep n 0xffffffff81195cc0 in vfs_readdir
at fs/readdir.c:23
breakpoint already hit 13 times
3 breakpoint keep n 0xffffffffa009f970 in ext4_dir_llseek
at fs/ext4/dir.c:309
4 breakpoint keep n 0xffffffffa00a3270 in ext4_get_branch
at fs/ext4/inode.c:458
5 breakpoint keep n 0xffffffffa00a7180 in ext4_get_block_dio_write
at fs/ext4/inode.c:3694
6 breakpoint keep n 0xffffffffa00ba600 in ext4_get_sb
at fs/ext4/super.c:4519
7 breakpoint keep n 0xffffffffa00d8e00 in ext4_get_acl
at fs/ext4/acl.c:136
breakpoint already hit 9 times
8 breakpoint keep n 0xffffffffa00a5a90 in ext4_get_inode_loc
at fs/ext4/inode.c:5281
breakpoint already hit 18 times
9 breakpoint keep n 0xffffffffa00a2b30 in ext4_get_reserved_space
at fs/ext4/inode.c:1072
breakpoint already hit 2 times
10 breakpoint keep n 0xffffffffa00a3100 in ext4_getattr
at fs/ext4/inode.c:5911
breakpoint already hit 74 times
11 breakpoint keep n 0xffffffffa009e500 in ext4_get_group_desc
at fs/ext4/balloc.c:202
breakpoint already hit 20 times
12 breakpoint keep n 0xffffffffa00a2d60 in ext4_get_inode_flags
at fs/ext4/inode.c:5306
breakpoint already hit 15 times
13 breakpoint keep n 0xffffffffa009e000 in ext4_get_group_no_and_offset at fs/ext4/balloc.c:33
breakpoint already hit 2 times
14 breakpoint keep n 0xffffffffa00a72b0 in ext4_get_block
at fs/ext4/inode.c:1398
breakpoint already hit 15 times
15 breakpoint keep n 0xffffffffa00af1b0 in ext4_get_parent
at fs/ext4/namei.c:1065
16 breakpoint keep n 0xffffffffa00a86a0 in ext4_getblk
at fs/ext4/inode.c:1471
breakpoint already hit 3 times
17 breakpoint keep n 0xffffffffa00a6a70 in ext4_get_blocks
at fs/ext4/inode.c:1277
breakpoint already hit 20 times

Now all the ext4 functions are listed by name, along with their source file and line number. Doing a backtrace (“bt”):

(gdb) bt
#0 ext4_get_block (inode=0xffff88003790bcc0, iblock=324,
bh_result=0xffff88007c43bbe0, create=0) at fs/ext4/inode.c:1398
#1 0xffffffff811b3b0f in generic_block_bmap (mapping=<optimized out>,
block=<optimized out>, get_block=<optimized out>) at fs/buffer.c:3038
#2 0xffffffffa00a4752 in ext4_bmap (mapping=0xffff88003790bde0, block=324)
at fs/ext4/inode.c:3556
#3 0xffffffff8119b891 in bmap (inode=<optimized out>, block=<optimized out>)
at fs/inode.c:1381
#4 0xffffffffa0085fe3 in ?? ()
#5 0xffff88007c43bd20 in ?? ()
#6 0xffff88007cb00000 in ?? ()
#7 0xffff88007cb00024 in ?? ()
#8 0xffff88007c43bd08 in ?? ()
#9 0xffff88007c43bcf0 in ?? ()
#10 0xffffffffa008628a in ?? ()
#11 0xffff88007c8609c0 in ?? ()
#12 0xffff88007cb00000 in ?? ()
#13 0xffff88007c8609c0 in ?? ()
#14 0xffff88007cb00000 in ?? ()
#15 0xffff88007c43bd20 in ?? ()
#16 0xffffffffa0086e91 in ?? ()
#17 0xffff88007c8609c0 in ?? ()
#18 0xffff88007cb00000 in ?? ()
#19 0x0000000000000000 in ?? ()

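Okay, the addresses are partially resolved. The frames that are still unresolved, such as 0xffffffffa0085fe3, fall inside jbd2 (loaded at 0xffffffffa007c000 with size 91554 according to /proc/modules above), whose symbols have not been loaded yet.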
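To resolve frames that belong to other modules, the same add-symbol-file step can be repeated for every loaded module. A sketch under stated assumptions: the guest's /proc/modules was saved to a file, the matching .ko files were copied from the guest into one directory on the host, and modules_to_gdb plus all paths in the usage comment are hypothetical.

```shell
# Emit one gdb add-symbol-file command per module listed on stdin
# (/proc/modules format), assuming DIR holds copies named <module>.ko.
modules_to_gdb() {
    awk -v dir="$1" '{ printf "add-symbol-file %s/%s.ko %s\n", dir, $1, $6 }'
}

# Usage on the host:
#   modules_to_gdb /tmp/modules < proc-modules.txt > modules.gdb
# and then inside gdb:
#   (gdb) source modules.gdb
```

Note that /proc/modules prints module names with underscores, while some module files use hyphens in their names, so a few paths may need adjusting by hand.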

Analysing other symbols (e.g., vfs_readdir()):

Breakpoint 18, vfs_readdir (file=0xffff88007db05d80,
filler=0xffffffff81195b00 <filldir>, buf=0xffff88007d849f38)
at fs/readdir.c:23
23 {

Since the kernel source files are placed on the host at the same path where the kernel was compiled inside the guest, gdb can list the source directly:

(gdb) list
18 #include <linux/unistd.h>
20 #include <asm/uaccess.h>
22 int vfs_readdir(struct file *file, filldir_t filler, void *buf)
23 {
24 struct inode *inode = file->f_path.dentry->d_inode;
25 int res = -ENOTDIR;
26 if (!file->f_op || !file->f_op->readdir)
27 goto out;

Next, “print *file” dereferences the pointer and decodes the structure it points to in detail:

(gdb) print *file
$4 = {f_u = {fu_list = {next = 0xffff88007cfe3d80, prev = 0xffff88007cb550e8},
fu_rcuhead = {next = 0xffff88007cfe3d80, func = 0xffff88007cb550e8}},
f_path = {mnt = 0xffff88007cfed9c0, dentry = 0xffff880037bd7240},
f_op = 0xffffffffa00d9c40, f_lock = {raw_lock = {slock = 0}}, f_count = {
counter = 2}, f_flags = 624640, f_mode = 29, f_pos = 0, f_owner = {lock = {
raw_lock = {lock = 16777216}}, pid = 0x0, pid_type = PIDTYPE_PID,
uid = 0, euid = 0, signum = 0}, f_cred = 0xffff880037c01480, f_ra = {
start = 0, size = 0, async_size = 0, ra_pages = 32, mmap_miss = 0,
prev_pos = -1}, f_version = 0, f_security = 0xffff88007d4d1f20,
private_data = 0x0, f_ep_links = {next = 0xffff88007db05e28,
prev = 0xffff88007db05e28}, f_mapping = 0xffff880037bdf5e0}

(gdb) print file->f_path
$7 = {mnt = 0xffff88007cfed9c0, dentry = 0xffff880037bd7240}

(gdb) print file->f_path->dentry
$9 = (struct dentry *) 0xffff880037bd7240

Given that the type is “struct dentry *”, you can also dereference it in gdb:

(gdb) print *(struct dentry *)file->f_path->dentry
$10 = {d_count = {counter = 1}, d_flags = 8, d_lock = {raw_lock = {
slock = 0}}, d_mounted = 0, d_inode = 0xffff880037bdf4c0, d_hash = {
next = 0xffff88003794c998, pprev = 0xffffc9000002ee48},
d_parent = 0xffff880037bc56c0, d_name = {hash = 3818298688, len = 8,
name = 0xffff880037bd72e0 "incoming"}, d_lru = {next = 0xffff880037bd7340,
prev = 0xffff880037bd71c0}, d_u = {d_child = {next = 0xffff880037bd7350,
prev = 0xffff880037bd71d0}, d_rcu = {next = 0xffff880037bd7350,
func = 0xffff880037bd71d0}}, d_subdirs = {next = 0xffff880037bd72a0,
prev = 0xffff880037bd72a0}, d_alias = {next = 0xffff880037bdf4f0,
prev = 0xffff880037bdf4f0}, d_time = 14757395258967641292, d_op = 0x0,
d_sb = 0xffff88007cb55000, d_fsdata = 0x0,
d_iname = "incoming00\314\314\314\314\314\314\314\314\314\314\314\314\314\314\314\314\314\314\314\314\314\314",
<incomplete sequence \314>}

And gdb can walk the structure and print the details of each field (with names) for you.


Have fun!!!


2 responses to this post.

  1. Remember too that gdb’s “print” is a garbage-in-garbage-out command: if you supply the wrong cast, such as (struct inode *) in “print *(struct inode *)file->f_path->dentry”, print will still print out the memory contents laid out as a “struct inode”.

