Illumos Startup Analysis/Debugging using VirtualBox

Illumos is the kernel underlying the OpenIndiana OS. Bootable ISO images built around it can be downloaded at:

https://download.joyent.com/pub/iso/

OpenIndiana itself can be downloaded from:

http://openindiana.org/download/

but it will not be covered here. This analysis only looks at how the Illumos kernel starts up. (Incidentally, Illumos is an open-source fork of OpenSolaris, which is itself no longer “open”.)

First I installed VirtualBox on my Ubuntu machine. Because the kernel had been upgraded, I had to rebuild the VirtualBox kernel drivers with “/etc/init.d/vboxdrv setup”. Several problems had to be resolved along the way. One of them is that the VirtualBox setup script does not tolerate spaces in the path to the kernel source. Mine had one, because I had moved the kernel source to an external NTFS hard disk whose mount point contains a space. After fixing that, I successfully compiled the VirtualBox drivers and got VirtualBox running on Linux kernel 3.3.0-rc3, Ubuntu 10.04 (LTS, 64-bit).

Using this procedure:

https://www.illumos.org/projects/illumos-gate/wiki/Serial_console_in_VirtualBox

The VirtualBox VM was set up with its CD/DVD drive pointing at the ISO image of the Illumos kernel (smartos-20120210T043623Z.iso for me) and with its serial port pointing at a host pipe named “/tmp/serial_host_pipe”. I then downloaded and compiled socat as described in the instructions above. (Alternatively, instead of downloading and compiling it, on Ubuntu you can simply run “apt-get install socat”.)

Next create the following file: mysocat.sh

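#!/bin/bash
# Attach this terminal to the VirtualBox serial host pipe given as $1.
[ "$1" ] || { echo "usage: $0 PIPE"; exit 1; }
# Restore sane terminal settings on exit or on HUP/INT/QUIT/TERM.
trap "stty sane" 0 1 2 3 15
socat "unix-connect:$1" stdio,raw,echo=0,icanon=0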

Then start Illumos in VirtualBox and select “live 64-ttya-kmdb” from the boot menu. VirtualBox shows a black screen with dots and pauses for input. At this point the serial host pipe has been created, so run “mysocat.sh /tmp/serial_host_pipe” in another terminal to connect to the VM. All input and analysis is done via that terminal:

[0]> moddebug/W 80000000
moddebug: 0 = 0x80000000

First we list all the kernel threads that exist at this point:

[0]> ::threadlist -v
ADDR PROC LWP CLS PRI WCHAN
fffffffffbc2fa40 fffffffffbc2eb00 fffffffffbc31540 0 96 0
PC: 0 CMD:
stack pointer for thread fffffffffbc2fa40: fffffffffbc72240
0x38()
mlsetup+0x715()
_locore_start+0x8b()

From the above stacktrace of the only kernel thread, we can identify the symbols involved in the startup process.

[0]> ::dis _locore_start
_locore_start: leaq +0x40f149(%rip),%rbp <_edata>
_locore_start+7: movq $0x0,0x0(%rbp)
_locore_start+0xf: leaq +0x46d37a(%rip),%rsp <t0stack>
_locore_start+0x16: addq $0x4f10,%rsp
_locore_start+0x1d: subq $0x8,%rsp
_locore_start+0x21: movq %rdi,+0x431bf0(%rip) <sysp>
_locore_start+0x28: movq %rdx,+0x4138c9(%rip) <bootops>
_locore_start+0x2f: movq $0xfffffffffbc13908,+0x45c5ee(%rip) <bootops> <bootopsp>
_locore_start+0x3a: movq %rdi,0x10(%rsp)
_locore_start+0x3f: movq %rsi,0x18(%rsp)
_locore_start+0x44: movq %rdx,0x20(%rsp)
_locore_start+0x49: movq %rcx,0x28(%rsp)
_locore_start+0x4e: movq %r8,0x30(%rsp)
_locore_start+0x53: movq %r9,0x38(%rsp)
_locore_start+0x58: pushfq
_locore_start+0x59: popq %r11
_locore_start+0x5b: movq %r11,0xd8(%rsp)
_locore_start+0x63: movq %cr0,%rax
_locore_start+0x66: orq $0x50000,%rax
_locore_start+0x6c: andq $0xffffffff9fffffff,%rax
_locore_start+0x72: movq %rax,%cr0
_locore_start+0x75: btsl $0x18,+0x431be3(%rip) <x86_featureset>
_locore_start+0x7d: xorl %ebp,%ebp
_locore_start+0x7f: movq %rsp,%rdi
_locore_start+0x82: pushq %rbp
_locore_start+0x83: movq %rsp,%rbp
_locore_start+0x86: call +0x380a5 <mlsetup>
_locore_start+0x8b: call +0x2634e0 <main>
_locore_start+0x90: leaq +0x7(%rip),%rdi <0xfffffffffb8000ae>
_locore_start+0x97: xorl %eax,%eax
_locore_start+0x99: call +0x813d2 <panic>

From the above, we can see that _locore_start() calls mlsetup() and then main(). The trailing call to panic() is reached only if main() ever returns.
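The same listing also shows %cr0 being read, modified with an orq/andq pair, and written back. As a rough sketch of what those two masks do, the C below applies them to a plain integer. The CR0_* names are the architectural bit names and the function name is made up for this example; nothing here is taken from the Illumos headers.

```c
#include <stdint.h>

/* Architectural CR0 bit positions, named here for readability. */
#define CR0_WP (1ULL << 16) /* write protect */
#define CR0_AM (1ULL << 18) /* alignment mask */
#define CR0_NW (1ULL << 29) /* not write-through */
#define CR0_CD (1ULL << 30) /* cache disable */

/*
 * Hypothetical helper: apply the same two masks as the orq/andq pair
 * at _locore_start+0x66 and _locore_start+0x6c.
 */
uint64_t locore_adjust_cr0(uint64_t cr0)
{
    cr0 |= 0x50000ULL;            /* orq  $0x50000,%rax : set WP and AM */
    cr0 &= 0xffffffff9fffffffULL; /* andq $0xffffffff9fffffff,%rax : clear NW and CD */
    return cr0;
}
```

So 0x50000 is CR0_WP | CR0_AM, and the andq mask is the complement of CR0_NW | CR0_CD. For instance, locore_adjust_cr0(0x60000011) gives 0x50011: write protection and alignment checking are turned on, and the two cache-control bits are cleared.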

Looking into the file:

http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/i86pc/ml/locore.s

 * XXX  Make this less vile, please.
 */
        ENTRY_NP(_locore_start)

        /*
         * %rdi = boot services (should die someday)
         * %rdx = bootops
         * end
         */

        leaq    edata(%rip), %rbp       /* reference edata for ksyms */
        movq    $0, (%rbp)              /* limit stack back trace */

        /*
         * Initialize our stack pointer to the thread 0 stack (t0stack)
         * and leave room for a "struct regs" for lwp0.  Note that the
         * stack doesn't actually align to a 16-byte boundary until just
         * before we call mlsetup because we want to use %rsp to point at
         * our regs structure.
         */
        leaq    t0stack(%rip), %rsp
        addq    $_CONST(DEFAULTSTKSZ - REGSIZE), %rsp
#if (REGSIZE & 15) == 0
        subq    $8, %rsp
#endif
        /*
         * Save call back for special x86 boot services vector
         */
        movq    %rdi, sysp(%rip)

        movq    %rdx, bootops(%rip)     /* save bootops */
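Comparing this source with the disassembly, `addq $_CONST(DEFAULTSTKSZ - REGSIZE), %rsp` was assembled as `addq $0x4f10,%rsp`, and the conditional `subq $8, %rsp` is present. The sketch below works through that arithmetic. It assumes DEFAULTSTKSZ is 0x5000 bytes, which the post does not state, and the function names are invented for the example.

```c
#include <stdint.h>

/* Immediate taken from "addq $0x4f10,%rsp" in the disassembly. */
#define STACK_ADD_IMM 0x4f10ULL

/* Assumption (not stated in the post): DEFAULTSTKSZ is 0x5000 bytes. */
#define ASSUMED_DEFAULTSTKSZ 0x5000ULL

/* REGSIZE implied by DEFAULTSTKSZ - REGSIZE == 0x4f10 under that assumption. */
uint64_t implied_regsize(void)
{
    return ASSUMED_DEFAULTSTKSZ - STACK_ADD_IMM;
}

/* Value of %rsp after the leaq/addq/subq sequence, given the address of t0stack. */
uint64_t initial_rsp(uint64_t t0stack)
{
    uint64_t rsp = t0stack + STACK_ADD_IMM; /* addq $0x4f10,%rsp */
    if ((implied_regsize() & 15) == 0)      /* #if (REGSIZE & 15) == 0 */
        rsp -= 8;                           /* subq $0x8,%rsp */
    return rsp;
}
```

Under that assumption REGSIZE comes out as 0xf0, a multiple of 16, which would explain why the subq was assembled in. It is also consistent with the highest store in the listing, 0xd8(%rsp), falling inside a 0xf0-byte structure. If t0stack is at 0xfffffffffbc6d3a0 (a value inferred from the RIP-relative operands in the disassembly, not printed in the session), initial_rsp() returns 0xfffffffffbc722a8, which is the value shown next to mlsetup in the later stack traces.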

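And finally, further down in the same file. Note that this excerpt is from the 32-bit code path (it uses pushl and %esp). The 64-bit kernel disassembled above passes the same pointer in %rdi instead: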

#endif  /* !__xpv */

        /*
         * mlsetup(%esp) gets called.
         */
        pushl   %esp
        call    mlsetup
        addl    $4, %esp

The source of mlsetup() is at http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/i86pc/os/mlsetup.c and its disassembly around the return address mlsetup+0x715 is below:

[0]> mlsetup+0x715::dis
mlsetup+0x6e9: movq %rax,%rdi
mlsetup+0x6ec: call +0x2505f <setcr4>
mlsetup+0x6f1: jmp -0x588 <mlsetup+0x16e>
mlsetup+0x6f6: nop
mlsetup+0x6f8: movq $0xfffffffffb9375e8,%rdi
mlsetup+0x6ff: xorl %eax,%eax
mlsetup+0x701: call +0x1790ca <prom_printf>
mlsetup+0x706: call +0x176335 <prom_enter_mon>
mlsetup+0x70b: jmp -0x1a1 <mlsetup+0x56f>
mlsetup+0x710: call +0x2598b <kmdb_enter>
mlsetup+0x715: nopl (%rax)
mlsetup+0x718: jmp -0x275 <mlsetup+0x4a8>
mlsetup+0x71d: nopl (%rax)
mlsetup+0x720: movq %gs:0x10,%rdi
mlsetup+0x729: call -0x2e53e <cpuid_getfamily>
mlsetup+0x72e: cmpl $0x6,%eax
mlsetup+0x731: ja -0x5f7 <mlsetup+0x140>
mlsetup+0x737: movl $0xe,%esi
mlsetup+0x73c: movq $0xfffffffffbc31c70,%rdi <x86_featureset>
mlsetup+0x743: call -0x312e8 <is_x86_feature>
mlsetup+0x748: testl %eax,%eax
[0]>
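The relative offsets in these listings (for example `jmp -0x588 <mlsetup+0x16e>`) are measured from the end of the instruction, not its start. The sketch below shows that calculation and uses it to cross-check a few addresses from the session. The function names are made up for the example, and the start address of _locore_start is inferred from the listing rather than printed by kmdb.

```c
#include <stdint.h>

/*
 * Target of a RIP-relative operand or relative branch: the address of
 * the *next* instruction plus the signed displacement.
 */
uint64_t rel_target(uint64_t insn_addr, uint64_t insn_len, int64_t disp)
{
    return insn_addr + insn_len + (uint64_t)disp;
}

/*
 * Inferred start of _locore_start: the 7-byte leaq at +0x90 has
 * displacement +7 and kmdb prints its target as 0xfffffffffb8000ae.
 */
uint64_t locore_start_addr(void)
{
    return 0xfffffffffb8000aeULL - 7 - 7 - 0x90;
}
```

With these, rel_target(0x6f1, 5, -0x588) gives 0x16e, matching the jmp at mlsetup+0x6f1. The inferred base is 0xfffffffffb800010, and rel_target(base + 0x75, 8, 0x431be3) gives 0xfffffffffbc31c70, which is the same x86_featureset address that appears as an immediate at mlsetup+0x73c.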

Single-stepping further:

[0]> :s
kmdb: target stopped at:
kmdb_enter+0xe: call -0x5c3 <intr_restore>

[0]> $c
kmdb_enter+0xe()
mlsetup+0x715(fffffffffbc722a8)
_locore_start+0x8b()

[0]> :s
kmdb: target stopped at:
intr_restore: testq $0x200,%rdi

Now looking into the disassembly at the present point:

[0]> ::dis
intr_restore: testq $0x200,%rdi
intr_restore+7: je +0x1 <intr_restore+0xa>
intr_restore+9: sti
intr_restore+0xa: ret
0xfffffffffb85dc3b: nopl 0x0(%rax,%rax)
sti: sti
sti+1: ret
0xfffffffffb85dc42: nopw %cs:0x0(%rax,%rax)
cli: cli
cli+1: ret
0xfffffffffb85dc52: nopw %cs:0x0(%rax,%rax)
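The four instructions of intr_restore above amount to: if bit 0x200 of the flags value passed in %rdi is set, execute sti, otherwise just return. Bit 0x200 is the interrupt-enable flag in the x86 flags register. A small C model of that decision, with names invented for the example, could look like this:

```c
#include <stdbool.h>
#include <stdint.h>

/* Interrupt-enable flag (bit 9 of the flags register); local name for this sketch. */
#define FLAGS_IF 0x200ULL

/*
 * Hypothetical model of intr_restore: returns true when the real routine
 * would execute sti for the given saved flags value.
 */
bool intr_restore_would_enable(uint64_t saved_flags)
{
    /* testq $0x200,%rdi ; je skips the sti when the bit is clear */
    return (saved_flags & FLAGS_IF) != 0;
}
```

In other words, intr_restore only ever re-enables interrupts. If they were disabled when the flags were saved, it leaves them disabled.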

And then looking into the stacktrace:

[0]> $c
kmdb_enter+0xe()
mlsetup+0x715(fffffffffbc722a8)
_locore_start+0x8b()

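We are therefore inside kmdb_enter(), which mlsetup() calls when the kernel is booted with the debugger-entry flag. The check looks like this in the mlsetup() source (the two-argument signature and the hsvc_setup() call suggest this excerpt was taken from the sun4 flavour of mlsetup.c rather than the i86pc one linked above, but the RB_DEBUGENTER test is the part that matters here):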

void
mlsetup(struct regs *rp, kfpu_t *fp)
{
        ...
        /* drop into kmdb on boot -d */
        if (boothowto & RB_DEBUGENTER)
                kmdb_enter();
        ...
        /*
         * Initialize thread/cpu microstate accounting
         */
        init_mstate(&t0, LMS_SYSTEM);
        init_cpu_mstate(CPU, CMS_SYSTEM);

        /*
         * Initialize lists of available and active CPUs.
         */
        cpu_list_init(CPU);

        cpu_vm_data_init(CPU);

        pg_cpu_bootstrap(CPU);

        (void) prom_set_preprom(kern_splr_preprom);
        (void) prom_set_postprom(kern_splx_postprom);
        PRM_INFO("mlsetup: now ok to call prom_printf");
        ...
        /*
         * Negotiate hypervisor services, if any
         */
        hsvc_setup();
        mach_soft_state_init();
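The RB_DEBUGENTER test near the top of this excerpt is an ordinary bit-flag check on boothowto. The sketch below shows the pattern in isolation. The DEMO_* values are placeholders chosen for this example and are not the real RB_* constants from the Illumos headers.

```c
#include <stdbool.h>

/* Placeholder flag values for illustration only; NOT the real RB_* constants. */
#define DEMO_RB_VERBOSE    0x1
#define DEMO_RB_DEBUGENTER 0x2

/* Hypothetical helper mirroring "if (boothowto & RB_DEBUGENTER)". */
bool should_enter_debugger(int boothowto)
{
    return (boothowto & DEMO_RB_DEBUGENTER) != 0;
}
```

Other flags in boothowto do not affect the result. Only the debugger-entry bit decides whether kmdb_enter() is called, which is presumably what the “live 64-ttya-kmdb” boot entry arranges.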

Searching the web for these symbols turned up the following notes, which give a detailed description of the boot process:

http://www.cs.dartmouth.edu/~sergey/cs108/2009/l6.txt

3 responses to this post.

  1. An even earlier starting point is when the CPU is still in 16-bit real mode – and the code is here:

    http://src.illumos.org/source/xref/illumos-gate/usr/src/grub/grub-0.97/stage1/stage1.S

    Here the starting point is 0x7c00, as documented here:

    http://en.wikibooks.org/wiki/X86_Assembly/Bootloaders

    BIOS is responsible for jumping to that memory location.

  2. Just found out that the kvm modules must not be loaded while VirtualBox is running:

    lsmod |grep kvm ==>

    rmmod kvm_amd (or rmmod kvm_intel)
    rmmod kvm

    They are not compatible.
