How to add new entries to “sysctl” with the same root?

For example, entering “sudo sysctl -a | grep '^dev'” gives me the following list:

dev.cdrom.autoclose = 1
dev.cdrom.autoeject = 0
dev.cdrom.check_media = 0
dev.cdrom.debug = 0

dev.cdrom.info = CD-ROM information, Id: cdrom.c 3.20 2003/12/17
dev.cdrom.info =
dev.cdrom.info = drive name:
dev.cdrom.info = drive speed:
dev.cdrom.info = drive # of slots:
dev.cdrom.info = Can close tray:
dev.cdrom.lock = 0
dev.hpet.max-user-freq = 64
dev.mac_hid.mouse_button2_keycode = 97
dev.mac_hid.mouse_button3_keycode = 100
dev.mac_hid.mouse_button_emulation = 0
dev.parport.default.spintime = 500
dev.parport.default.timeslice = 200
dev.raid.speed_limit_max = 200000
dev.raid.speed_limit_min = 1000
dev.scsi.logging_level = 0

As you can see, under the “dev” branch are many different children. How is it done?

For example, the “scsi” directory under “dev” is created by a call to register_sysctl_table() in this Linux kernel file:

./scsi/scsi_sysctl.c:
scsi_table_header = register_sysctl_table(scsi_root_table);

static struct ctl_table_header *scsi_table_header;

int __init scsi_init_sysctl(void)
{
    scsi_table_header = register_sysctl_table(scsi_root_table);
    if (!scsi_table_header)
        return -ENOMEM;
    return 0;
}

And under scsi_root_table:

static struct ctl_table scsi_root_table[] = {
    { .procname = "dev",
      .mode     = 0555,
      .child    = scsi_dir_table },
    { }
};

And under scsi_dir_table:

static struct ctl_table scsi_dir_table[] = {
    { .procname = "scsi",
      .mode     = 0555,
      .child    = scsi_table },
    { }
};

And under scsi_table:

static struct ctl_table scsi_table[] = {
    { .procname     = "logging_level",
      .data         = &scsi_logging_level,
      .maxlen       = sizeof(scsi_logging_level),
      .mode         = 0644,
      .proc_handler = proc_dointvec },
    { }
};

So the multi-level tables together implement:

dev.scsi.logging_level = 0

And similarly:

./cdrom/cdrom.c:
cdrom_sysctl_header = register_sysctl_table(cdrom_root_table);

So traversing from cdrom_root_table (through cdrom_cdrom_table) all the way down to “cdrom_table”:

static struct ctl_table cdrom_root_table[] = {
    {
        .procname = "dev",
        .maxlen   = 0,
        .mode     = 0555,
        .child    = cdrom_cdrom_table,
    },
    { }
};

static struct ctl_table cdrom_table[] = {
    {
        .procname     = "info",
        .data         = &cdrom_sysctl_settings.info,
        .maxlen       = CDROM_STR_SIZE,
        .mode         = 0444,
        .proc_handler = cdrom_sysctl_info,
    },
    {
        .procname     = "autoclose",
        .data         = &cdrom_sysctl_settings.autoclose,
        .maxlen       = sizeof(int),
        .mode         = 0644,
        .proc_handler = cdrom_sysctl_handler,
    },
    <…>

Notice the handler function cdrom_sysctl_info() above.

It is where all the information below is printed:

dev.cdrom.info = CD-ROM information, Id: cdrom.c 3.20 2003/12/17
dev.cdrom.info =
dev.cdrom.info = drive name:
dev.cdrom.info = drive speed:
dev.cdrom.info = drive # of slots:
dev.cdrom.info = Can close tray:
<…>

A snippet of the function:

pos = sprintf(info, "CD-ROM information, " VERSION "\n");

if (cdrom_print_info("\ndrive name:\t", 0, info, &pos, CTL_NAME))
    goto done;
if (cdrom_print_info("\ndrive speed:\t", 0, info, &pos, CTL_SPEED))
    goto done;
if (cdrom_print_info("\ndrive # of slots:", 0, info, &pos, CTL_SLOTS))
    goto done;
if (cdrom_print_info("\nCan close tray:\t",
        CDC_CLOSE_TRAY, info, &pos, CTL_CAPABILITY))
    goto done;
<…>

And so that’s how multiple entries under the same root “dev” can be achieved.
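As a rough illustration of the mapping: each level of the nested tables becomes one directory under /proc/sys, and sysctl shows the same path with dots instead of slashes. The helper name sysctl_path below is made up for this sketch and is not part of sysctl itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of sysctl): convert a dotted sysctl name
# into the /proc/sys path created by the nested ctl_table entries.
# Assumes that no path component itself contains a dot.
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Example (commented out so that sourcing this file has no side effects):
#   cat "$(sysctl_path dev.scsi.logging_level)"
```

So “dev” from scsi_root_table, “scsi” from scsi_dir_table and “logging_level” from scsi_table end up as /proc/sys/dev/scsi/logging_level.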

This also answers the question posted here:

http://stackoverflow.com/questions/20164041/dynamically-adding-entries-to-sysctl

virt-manager error

While trying to create a VM in virt-manager, I got a “bind socket” permission denied error. This happens whether CentOS or Ubuntu is used as the VM guest.

Error as follows:

Unable to complete install: 'internal error: process exited while connecting to monitor: 2016-03-19T04:58:53.268413Z qemu-system-x86_64: -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-centos7.0/org.qemu.guest_agent.0,server,nowait: Failed to bind socket to /var/lib/libvirt/qemu/channel/target/domain-centos7.0/org.qemu.guest_agent.0: Permission denied'
Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 90, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/create.py", line 2277, in _do_async_install
    guest.start_install(meter=meter)
  File "/usr/share/virt-manager/virtinst/guest.py", line 501, in start_install
    noboot)
  File "/usr/share/virt-manager/virtinst/guest.py", line 416, in _create_guest
    dom = self.conn.createLinux(start_xml or final_xml, 0)
  File "/usr/lib/python2.7/dist-packages/libvirt.py", line 3606, in createLinux
    if ret is None:raise libvirtError('virDomainCreateLinux() failed', conn=self)
libvirtError: internal error: process exited while connecting to monitor: 2016-03-19T04:58:53.268413Z qemu-system-x86_64: -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-centos7.0/org.qemu.guest_agent.0,server,nowait: Failed to bind socket to /var/lib/libvirt/qemu/channel/target/domain-centos7.0/org.qemu.guest_agent.0: Permission denied

Cause of error:

The error arises from the “Channel qemu-ga” virtual hardware not being emulated correctly.

Workaround Steps:

a. Create new VM -> select ISO images.

b. Use ISO images -> select ISO file.

c. Set memory / CPU.

d. Set disk image size.

e. Set the filename of the image, and then select “custom configuration before install”.

f. Inside the custom configuration screen, you see “Channel qemu-ga” as the hardware. Remove this hardware.

g. After the removal, everything works.
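For a guest that has already been defined, a quick way to see whether its XML still carries this channel is to search for the channel name that appears in the error message. This is only a sketch: the function name is made up, and the guest name “centos7.0” in the usage comment is my guess from the path in the error.

```shell
#!/bin/sh
# Hypothetical helper: read a libvirt domain XML on stdin and succeed (exit 0)
# if it references the QEMU guest agent channel named in the error message.
has_guest_agent_channel() {
    grep -q 'org\.qemu\.guest_agent\.0'
}

# Example use, assuming a guest named "centos7.0":
#   sudo virsh dumpxml centos7.0 | has_guest_agent_channel && echo "channel present"
```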

XFS: how to extend the filesystem size when full?

Scenario: My CentOS7 is running inside QEMU.

Looking at my CentOS7 filesystem using "df":

You can see the /home is near 100%. How to extend it?

Luckily, the default filesystem in CentOS7 is XFS:

Just do a "sudo blkid /dev/mapper/centos-home" and you can see that it is "XFS".

To extend it I need to do a few things:

a. Add a new disk. Since the OS is running inside QEMU, just do:

qemu-img create -f qcow2 centos7_hdd2.img 80G

to create a new "harddisk" named centos7_hdd2.img. If you are not using QEMU, the equivalent is shutting down the system and installing an additional hard disk.

b. Reboot CentOS7. If you are using QEMU, then remember to include the new harddisk image when you start your CentOS7 guest, for example part of it shown below:

qemu-system-x86_64 -hda centos7_hdd.img -hdb centos7_hdd2.img …

c. Now the new harddisk is recognized as /dev/sdb. Create a new partition table using "fdisk /dev/sdb" and add a new partition called /dev/sdb1.

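Here the XFS filesystem sits on top of LVM. A PV (physical volume) is a disk or partition handed over to LVM; we will now have two PVs, one on /dev/sda and one on /dev/sdb. PVs are grouped into a VG (volume group): there is nothing new to create as we are reusing the existing VG, which only needs to be extended with the new PV. LVs (logical volumes) are carved out of the VG: again nothing new to create, but the size of the existing LV needs to be extended. So here it goes: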

d. Add the new partition to PV:

sudo pvcreate /dev/sdb1

And check:

sudo pvdisplay

e. Extend the existing VG with the new PV:

sudo vgextend centos /dev/sdb1

And check:

sudo vgdisplay

f. Now extend the size of the LV (note that -L80G sets the new total size to 80G, whereas -L+80G would add 80G to the current size):

sudo lvextend -L80G /dev/centos/home

And check:

sudo lvdisplay

g. Finally extend the filesystem (XFS) on the LV:

sudo xfs_growfs /home

And check:

sudo df

And now the diskspace utilization is 35%. Cool.
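Steps d to g can be collected into one small script. The following is only a sketch under my own assumptions: the function names are made up, DRY_RUN is a convention of this script (not of LVM), and instead of the fixed -L80G it uses “-l +100%FREE”, which gives all remaining free space in the VG to the LV.

```shell
#!/bin/sh
# Sketch: grow an XFS filesystem on LVM after a new partition has been created.
# With DRY_RUN=1 the commands are printed instead of being executed.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Arguments: new partition, VG name, LV name, mount point of the filesystem.
grow_xfs_lv() {
    part=$1 vg=$2 lv=$3 mnt=$4
    run pvcreate "$part" &&
    run vgextend "$vg" "$part" &&
    run lvextend -l +100%FREE "/dev/$vg/$lv" &&
    run xfs_growfs "$mnt"
}

# Example with the names used in this post (run as root):
#   DRY_RUN=1 grow_xfs_lv /dev/sdb1 centos home /home   # preview only
#   grow_xfs_lv /dev/sdb1 centos home /home
```

Previewing with DRY_RUN=1 first is cheap insurance, since pvcreate on the wrong partition destroys whatever was on it.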

https://ma.ttias.be/increase-expand-xfs-filesystem-in-red-hat-rhel-7-cento7/

http://serverfault.com/questions/610973/how-to-increase-the-size-of-an-xfs-file-system

http://linoxide.com/file-system/create-mount-extend-xfs-filesystem/

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/xfsgrow.html

https://www.centos.org/docs/5/html/Cluster_Logical_Volume_Manager/VG_grow.html

http://www.microhowto.info/howto/increase_the_size_of_an_lvm_logical_volume.html

https://www.centos.org/docs/5/html/Cluster_Logical_Volume_Manager/lv_extend.html

Setting up virtual machine via virsh/virt-manager/virt-viewer

My requirements for the setup: set up a VM guest with VMX and VT-d.

Compared with an earlier article, which focused entirely on the virsh and virt-viewer commands, the focus here is on virt-manager:

https://tthtlc.wordpress.com/2016/02/03/how-to-setup-virsh-or-libvirt-or-virtio-in-a-custom-kernel/

The tool virt-manager is a complex and sophisticated tool to generate the command line for QEMU, using libvirtd:

(Screenshot virt-manager-vm-list.png, from https://virt-manager.org/)

Someone on the web said that the command line is normally the preferred way to work in Linux, as it makes automation easy. But the ease of use provided by virt-manager, compared with the complex XML that it generates to create the VM guest, really justifies its use.

The relationship between libvirtd, QEMU, and virt-manager or virt-viewer is as follows (from http://www.ibm.com/developerworks/cloud/library/cl-managingvms/):

(Figure: figure2.jpg)

Environment:

Ubuntu 16.04 Xenial (64-bit):

http://askubuntu.com/questions/103965/how-to-determine-if-cpu-vt-extensions-enabled-in-bios

http://serverfault.com/questions/633183/how-do-i-enable-kvm-device-passthrough-in-linux

https://www.centos.org/forums/viewtopic.php?f=47&t=48115 (note that the basic problem of achieving VT-d is not solved in that thread, as the hardware is not capable of VT-d).

For a list of hardware that support VT-d (+VMX):

http://www.intel.com/content/www/us/en/support/boards-and-kits/desktop-boards/000005758.html

First edit /etc/default/grub (“sudo vi /etc/default/grub”) and change the line:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"

to:

GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on"

Do a sudo update-grub, and reboot.
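The same edit can be scripted with sed. This is a minimal sketch; the function name is made up, and it deliberately prints the result instead of modifying the file so that it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print the contents of a grub defaults file with the
# GRUB_CMDLINE_LINUX_DEFAULT line replaced, as done by hand above.
# It writes to stdout and does not modify the file.
set_iommu_cmdline() {
    sed 's/^GRUB_CMDLINE_LINUX_DEFAULT=.*/GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on"/' "$1"
}

# Example (review the output before installing it):
#   set_iommu_cmdline /etc/default/grub > /tmp/grub.new
#   sudo cp /tmp/grub.new /etc/default/grub && sudo update-grub
# After rebooting, /proc/cmdline should contain intel_iommu=on.
```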

Next install the following (in addition to existing packages which you have installed in the past):

apt-get update

apt-get install openssh-server

apt-get install virt-manager virt-viewer lsscsi

apt-get install qemu-system-x86

apt-get install libvirt-dev

apt-get install ssh-askpass

apt-get install kvm

Edit the file /etc/ssh/sshd_config and ensure the following:

PermitRootLogin yes
StrictModes yes

Then run “sudo service ssh restart” to restart the ssh server, so that you can now log in as root on localhost.

Next, enter “sudo virt-manager” and then “File->Create new connection”:

Add a localhost connection as root; you will be prompted for the password later (which is why we need to permit root login locally).

The following are the commands to directly interact with libvirtd, which can be equally done by virt-manager’s GUI interface:

=======================================================

sudo virsh sysinfo

# other commands are
#virsh -c qemu+ssh://bozz@SERVER/system sysinfo
#virsh -c qemu+ssh://tteikhua@localhost/system sysinfo
#virsh -c qemu+ssh://tteikhua@localhost sysinfo
#virsh -c qemu+ssh://user@localhost/system sysinfo

sudo virsh pool-list

sudo virsh net-list --all

sudo virsh net-start default

sudo virsh list

sudo virsh destroy myguest_domain

### the following command will create the VM guest from the XML file “myguest_domain.xml”, which has been saved beforehand.
sudo virsh define myguest_domain.xml

# after creating the domain, you have to start it to get it booted up.

sudo virsh start myguest_domain

sudo virt-host-validate

============================================================

The following is the output of sysinfo:
<sysinfo type='smbios'>
  <bios>
    <entry name='vendor'>American Megatrends Inc.</entry>
    <entry name='version'>G56JK.201</entry>
    <entry name='date'>05/13/2014</entry>
    <entry name='release'>4.6</entry>
  </bios>
  <system>
    <entry name='manufacturer'>ASUSTeK COMPUTER INC.</entry>
    <entry name='product'>G56JK</entry>
    <entry name='version'>1.0       </entry>
    <entry name='serial'>EBN0BC00467945B     </entry>
    <entry name='uuid'>837152F2-5681-444E-92DB-54A05089F05E</entry>
    <entry name='sku'>ASUS-NotebookSKU</entry>
    <entry name='family'>G</entry>
  </system>
  <baseBoard>
    <entry name='manufacturer'>ASUSTeK COMPUTER INC.</entry>
    <entry name='product'>G56JK</entry>
    <entry name='version'>1.0       </entry>
    <entry name='serial'>BSN12345678901234567</entry>
    <entry name='asset'>ATN12345678901234567</entry>
    <entry name='location'>MIDDLE              </entry>
  </baseBoard>
  <processor>
    <entry name='socket_destination'>SOCKET 0</entry>
    <entry name='type'>Central Processor</entry>
    <entry name='family'>Core i7</entry>
    <entry name='manufacturer'>Intel</entry>
    <entry name='signature'>Type 0, Family 6, Model 60, Stepping 3</entry>
    <entry name='version'>Intel(R) Core(TM) i7-4710HQ CPU @ 2.50GHz</entry>
    <entry name='external_clock'>100 MHz</entry>
    <entry name='max_speed'>3800 MHz</entry>
    <entry name='status'>Populated, Enabled</entry>
    <entry name='serial_number'>Not Specified</entry>
    <entry name='part_number'>Fill By OEM</entry>
  </processor>
  <memory_device>
    <entry name='size'>8192 MB</entry>
    <entry name='form_factor'>SODIMM</entry>
    <entry name='locator'>ChannelB-DIMM0</entry>
    <entry name='bank_locator'>BANK 2</entry>
    <entry name='type'>DDR3</entry>
    <entry name='type_detail'>Synchronous</entry>
    <entry name='speed'>1600 MHz</entry>
    <entry name='manufacturer'>Samsung</entry>
    <entry name='serial_number'>E187DCC9</entry>
    <entry name='part_number'>M471B1G73DB0-YK0</entry>
  </memory_device>
</sysinfo>
And the following is the output of “virsh start” followed by that of virt-host-validate:
Domain myguest_domain started

  QEMU: Checking for hardware virtualization                                 : PASS
  QEMU: Checking if device /dev/kvm exists                                   : PASS
  QEMU: Checking if device /dev/kvm is accessible                            : PASS
  QEMU: Checking if device /dev/vhost-net exists                             : PASS
  QEMU: Checking if device /dev/net/tun exists                               : PASS
  QEMU: Checking for cgroup 'memory' controller support                      : PASS
  QEMU: Checking for cgroup 'memory' controller mount-point                  : PASS
  QEMU: Checking for cgroup 'cpu' controller support                         : PASS
  QEMU: Checking for cgroup 'cpu' controller mount-point                     : PASS
  QEMU: Checking for cgroup 'cpuacct' controller support                     : PASS
  QEMU: Checking for cgroup 'cpuacct' controller mount-point                 : PASS
  QEMU: Checking for cgroup 'devices' controller support                     : PASS
  QEMU: Checking for cgroup 'devices' controller mount-point                 : PASS
  QEMU: Checking for cgroup 'net_cls' controller support                     : PASS
  QEMU: Checking for cgroup 'net_cls' controller mount-point                 : PASS
  QEMU: Checking for cgroup 'blkio' controller support                       : PASS
  QEMU: Checking for cgroup 'blkio' controller mount-point                   : PASS
  QEMU: Checking for device assignment IOMMU support                         : PASS
  QEMU: Checking if IOMMU is enabled by kernel                               : PASS
   LXC: Checking for Linux >= 2.6.26                                         : PASS
   LXC: Checking for namespace ipc                                           : PASS
   LXC: Checking for namespace mnt                                           : PASS
   LXC: Checking for namespace pid                                           : PASS
   LXC: Checking for namespace uts                                           : PASS
   LXC: Checking for namespace net                                           : PASS
   LXC: Checking for namespace user                                          : PASS
   LXC: Checking for cgroup 'memory' controller support                      : PASS
   LXC: Checking for cgroup 'memory' controller mount-point                  : PASS
   LXC: Checking for cgroup 'cpu' controller support                         : PASS
   LXC: Checking for cgroup 'cpu' controller mount-point                     : PASS
   LXC: Checking for cgroup 'cpuacct' controller support                     : PASS
   LXC: Checking for cgroup 'cpuacct' controller mount-point                 : PASS
   LXC: Checking for cgroup 'devices' controller support                     : PASS
   LXC: Checking for cgroup 'devices' controller mount-point                 : PASS
   LXC: Checking for cgroup 'net_cls' controller support                     : PASS
   LXC: Checking for cgroup 'net_cls' controller mount-point                 : PASS
   LXC: Checking for cgroup 'freezer' controller support                     : PASS
   LXC: Checking for cgroup 'freezer' controller mount-point                 : PASS

 

Unboxing DragonBoard 410c and booting up on uart console

As a start a lot of details have been covered here:

http://www.cnx-software.com/2015/11/21/dragonboard-410c-development-board-quick-start-guide-and-android-benchmarks/

and here is the setup for uart console:

https://github.com/96boards/documentation/wiki/Dragonboard-410c-Installation-Guide-for-Linux-and-Android#setting-up-the-uart-console (A)

And yes, the UART console is done using a 1.8V FTDI adapter cable (which I used previously for Raspberry Pi):

http://www.aliexpress.com/item/2pcs-Lot-USB-To-RS232-TTL-Serial-Adapter-Cable-for-Raspberry-Pi-and-Banana-Pi/32262456765.html

And the setting of the S6 pins on the DragonBoard itself is 0-1-0-0 (this is after the stage where the SD card image has been flashed to the internal eMMC).

For details, see link (A) above (“Flashing the SD Card Image to the DB410c”).

And here is the bootup captured over the uart console of booting up from internal eMMC:

http://pastebin.com/EBiuuuAu

(which is Ubuntu 15.04 as you can see)

References:

https://www.96boards.org/wp-content/uploads/2015/02/96BoardsCESpecificationv1.0-EA1.pdf

https://developer.qualcomm.com/download/db410c/linux_android_board_support_package_vla.br_.1.2.4-01810-8x16.0-2.zip

https://github.com/96boards/documentation/wiki/Dragonboard-410c-Installation-Guide-for-Linux-and-Android#installing-image-using-an-sd-card-image

builds.96boards.org/releases/dragonboard410c/linaro/ubuntu/latest/dragonboard410c_sdcard_install_ubuntu*.zip

builds.96boards.org/releases/dragonboard410c/qualcomm/android/latest/dragonboard410c_sdcard_install_android*.zip

https://github.com/96boards/documentation/wiki/Dragonboard-410c-Installation-Guide-for-Linux-and-Android#create--install-a-rescue-image

builds.96boards.org/releases/dragonboard410c/linaro/rescue/latest/dragonboard410c_bootloader_emmc_linux*.zip

builds.96boards.org/releases/dragonboard410c/linaro/ubuntu/latest/boot-linaro-vivid-qcom-snapdragon-arm64*.img.gz

builds.96boards.org/releases/dragonboard410c/linaro/ubuntu/latest/linaro-vivid-developer-qcom-snapdragon-arm64*.img.gz

https://developer.qualcomm.com/hardware/dragonboard-410c/tools

https://developer.qualcomm.com/download/db410c/little-kernel-boot-loader-overview.pdf

https://developer.qualcomm.com/download/db410c/android-display-overview.pdf

https://developer.qualcomm.com/download/db410c/sensors-porting-guide-dragonboard-410c.pdf

https://developer.qualcomm.com/download/db410c/linux-android-board-support-package-vla.br.1.2.4-00310-8x16-1.zip

https://developer.qualcomm.com/download/db410c/linux-ubuntu-board-support-package-v1.1.zip

https://developer.qualcomm.com/hardware/dragonboard-410c

https://github.com/96boards/documentation/wiki/Dragonboard-410c-Installation-Guide-for-Linux-and-Android

How to setup libvirt or virtio in a custom kernel

While setting up a custom kernel for Ubuntu 14.04 LTS 64-bit to be able to run “virsh” / libvirt tools, many problems have been encountered (see references).

First install all the essential tools for Ubuntu the easy way:

https://help.ubuntu.com/lts/serverguide/libvirt.html

sudo apt-get install qemu-kvm libvirt-bin
sudo apt-get install virtinst
sudo apt-get install qemu-system
sudo apt-get install virt-viewer

Next a custom kernel is needed, so download a stable kernel source from www.kernel.org. The existing config from current version of Ubuntu 14.04 is copied to “.config”, and the following additional changes made:

CONFIG_NETFILTER_XT_NAT=m
CONFIG_NF_NAT_MASQUERADE_IPV4=m
CONFIG_IP_NF_NAT=m
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_NETMAP=m
CONFIG_IP_NF_TARGET_REDIRECT=m
CONFIG_NF_NAT_MASQUERADE_IPV6=m

Then run “make oldconfig”, followed by “make” to start the compilation. After that you will need to run “sudo make modules_install” and “sudo make install”.

Reboot into the new kernel, and ensure that “nat” shows up in /proc/net/ip_tables_names and that “kvm” is listed in “lsmod”.
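These two checks are easy to script. The sketch below uses function names of my own invention; the optional second argument only exists so that the functions can be tried against a sample file instead of the real /proc entries. Note that the file under /proc/net is spelled ip_tables_names and is readable only by root.

```shell
#!/bin/sh
# Hypothetical helpers. The optional second argument substitutes another
# file for the /proc entry (useful for trying the functions out).

# Succeeds if the named iptables table (e.g. "nat") is registered.
iptables_table_present() {
    grep -qx "$1" "${2:-/proc/net/ip_tables_names}"
}

# Succeeds if the named module (e.g. "kvm") is loaded.
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}

# Example (run as root):
#   iptables_table_present nat && module_loaded kvm && echo "kernel looks ready"
```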

So to test the setup:

a. sudo virsh net-start default

b. sudo virsh list

c. sudo virsh sysinfo

d. sudo virsh pool-list

e. sudo virsh net-list --all

f. To create a guest VM, first create a file called “guest.xml”:

<domain type='kvm'>
 <name>guest</name>
 <uuid>f5fe9230-6ef3-4eec-af54-65363a68f3ce</uuid>
 <memory>524288</memory>
<currentMemory>524288</currentMemory>
<vcpu>1</vcpu>
<os>
 <type arch='x86_64' machine='pc-i440fx-1.5-qemu-kvm'>hvm</type>
 <boot dev='cdrom'/>
</os>
<features>
<acpi/>
<apic/>
<pae/>
</features>
<clock offset='localtime'/>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>restart</on_crash>
<devices>
<emulator>/usr/bin/kvm</emulator>
<disk type='file' device='disk'>
 <source file='/var/lib/libvirt/images/guest.img'/>
 <target dev='hda' bus='ide'/>
</disk>
<disk type='file' device='cdrom'>
 <source file='/home/user/ubuntu1404_x86_64/ubuntu-14.04-desktop-amd64.iso'/>
 <target dev='hdc' bus='ide'/>
<readonly/>
</disk>
<interface type='network'>
<mac address='54:52:00:2a:58:0d'/>
<source network='default'/>
</interface>
<input type='mouse' bus='ps2'/>
<graphics type='vnc' port='-1' autoport='yes' keymap='en-us'/>
</devices>
</domain>

For each of the items highlighted above (they were in bold in the original post):

1. The unique number is generated by “uuidgen”.

2. The machine type must be one of the items listed by:

/usr/bin/qemu-system-x86_64 --machine ?

Depending on the machine type chosen, QEMU may end up using “tcg” as the emulation mode instead of “kvm” (which is based on hardware virtualization and thus much faster).

So beware if you find the emulation abnormally slow: just run “ps -ef” and ensure that “accel=kvm” is displayed instead of “accel=tcg”.

3. Create the guest image beforehand (a size given without a suffix is taken as bytes, so specify the unit):

sudo qemu-img create -f qcow2 /var/lib/libvirt/images/guest.img 8G

4. The cdrom ISO is just the ISO downloaded.

Now issue “sudo virsh define guest.xml”. (What if you hate XML files, or are not sure how to set up the correct XML for your version of libvirt, which might differ from the one here? See below.)

g. Run “sudo virsh list --all” to ensure that the “guest” KVM domain is listed (a plain “virsh list” shows only running guests), then do “sudo virsh start guest” to start the guest booting up from the CDROM.

h. With the VM running, use “ps -ef” to list the process:

libvirt+ 4737 1 5 11:44 ? 00:02:04 qemu-system-x86_64 -enable-kvm -name guest -S -machine pc-i440fx-1.5-qemu-kvm,accel=kvm,usb=off -m 512 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid f5fe9230-6ef3-4eec-af54-65363a68f3ce -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/kvm1.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/libvirt/images/guest.img,if=none,id=drive-ide0-0-0,format=raw -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive file=/home/user/ubuntu1404_x86_64/ubuntu-14.04-desktop-amd64.iso,if=none,id=drive-ide0-1-0,readonly=on,format=raw -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1 -netdev tap,fd=24,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=54:52:00:2a:58:0d,bus=pci.0,addr=0x3 -vnc 127.0.0.1:0 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4

Notice how the complexity of the qemu command line is now solved by properly entering the correct parameter in the “guest.xml” file.

i. Finally, to connect to the running VM, you can use either “sudo virt-viewer <vm_name>”, where <vm_name> is the name of the guest itself (which is “guest” in our case), or “sudo virt-viewer -c qemu:///system guest”.

j. And “virsh shutdown guest” to shut down the running VM.

k. And “virsh destroy guest” to forcibly stop the running VM (this does not delete the domain definition).
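The “accel=kvm” check mentioned under item 2 can be done with a one-line sed filter on the process command line. A minimal sketch (the function name is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: extract the accelerator name from a QEMU command line
# passed as one string. Prints e.g. "kvm" or "tcg" when an accel= option is
# present, and nothing otherwise.
qemu_accel() {
    printf '%s\n' "$1" | sed -n 's/.*accel=\([a-z]*\).*/\1/p'
}

# Example: check every running QEMU process.
#   ps -eo args | grep '[q]emu-system' | while IFS= read -r line; do
#       qemu_accel "$line"
#   done
```

If this prints “tcg” for the guest, pick a different machine type from the list and define the guest again.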

How to setup virtio based on the libvirt infrastructure?

An architectural visualization of virtio and its use in a VM guest setup is as follows (http://www.openstack.cn/?p=580):

In summary, the diagram essentially means that some of the I/O processing can pass directly from the host into the guest without going through the QEMU event loop (which uses the /dev/kvm interface inside the host). This is also shown in another diagram (http://slides.com/braoru/kvm/fullscreen#/):

By default the following should be enabled before compiling the custom kernel:

CONFIG_VHOST_NET=m
CONFIG_VHOST_SCSI=m
CONFIG_VHOST_RING=m
CONFIG_VHOST=m

After compiling and rebooting into the new kernel:

a. “modprobe -c | grep vhost” to search for all the vhost-related kernel modules (which essentially are just vhost, vhost_scsi and vhost_net).

b. “modprobe vhost”, “modprobe vhost_scsi” and “modprobe vhost_net” to load the kernel modules.

c. Use “sudo virt-host-validate” to check the kernel setup:
QEMU: Checking for hardware virtualization                                 : PASS
QEMU: Checking for device /dev/kvm                                         : PASS
QEMU: Checking for device /dev/vhost-net                                   : PASS
QEMU: Checking for device /dev/net/tun                                     : PASS
LXC: Checking for Linux >= 2.6.26                                         : PASS

And to setup the virtio direct I/O between the host and the VM guest, you can follow through here:

https://easyengine.io/tutorials/kvm/enable-virtio-existing-vms/

How to setup the guest if no XML is given:

This is made possible by the package “virtinst” installed earlier.

And the command is:

sudo virt-install --virt-type qemu --arch x86_64 --machine 'pc-i440fx-2.0' --debug --name guest --ram 1024 --disk path=/var/lib/libvirt/images/guest.qcow2 --cdrom /home/user/ubuntu1404_x86_64/ubuntu-14.04-desktop-amd64.iso --boot cdrom

Again the machine type must come from one of those listed in:

/usr/bin/qemu-system-x86_64 --machine ?

And now “sudo virsh dumpxml guest” to extract out the XML file.

References:

How libvirt works internally: https://libvirt.org/internals.html

http://blog.vmsplice.net/2011/09/qemu-internals-vhost-architecture.html

http://www.linux-kvm.org/images/4/41/2011-forum-virtio_net_whatsnew.pdf

http://www.ibm.com/developerworks/library/l-virtio/l-virtio-pdf.pdf
From here: http://dpdk.org/doc/guides/sample_app_ug/vhost.html

(Figure: virtio_linux_vhost.png)

More complex setup for virtio:

(Figure: vhost_net_arch1.png)

Architectural internals of virtio:

https://jipanyang.wordpress.com/2014/10/27/virtio-guest-side-implementation-pci-virtio-device-virtio-net-and-virtqueue/

Common setup problems:
https://www.digitalocean.com/community/questions/problem-with-iptables

https://forums.gentoo.org/viewtopic-t-1009770.html?sid=7822e8eefcdb28edcedf9db7526b7b1e

http://stackoverflow.com/questions/21983554/iptables-v1-4-14-cant-initialize-iptables-table-nat-table-does-not-exist-d

http://serverfault.com/questions/593263/iptables-nat-does-not-exist/593289

Intel Processor Trace: How to use it

Note that “Processor Trace” is a feature of recent Intel x86 processors (e.g. Skylake):

https://software.intel.com/en-us/blogs/2013/09/18/processor-tracing

First clone this:

https://github.com/01org/processor-trace

And do a “sudo make install” to install the libipt.so libraries.

Next clone Andi Kleen’s simple-pt tracing tool:

git clone https://github.com/andikleen/simple-pt

Running “make install” builds the kernel module; follow it with “sudo insmod simple-pt.ko” to load the module.

Next run “make user” to build all the relevant binaries (see the Makefile; these are essentially sptdump, fastdecode, sptdecode and ptfeature).

During compilation you may encounter errors about missing files. The following are the packages I needed to install before “make user” succeeded (your mileage may differ):

sudo apt-get install libelf-dev
 sudo apt-get install libdw-dev dwarfdump
 sudo apt-get install libdwarf-dev

There are other binaries like “rdmsr” which the tester program depends on as well.

Next is to run tester as root: “sudo ./tester”, and the output is listed here:

http://pastebin.com/d3QeVNsV

And looking further into “stest.out”:

0 [+0] [+   1] native_write_msr_safe+18
 [+  31] trace_event_raw_event_msr+116 -> trace_event_buffer_reserve
 [+  21] trace_event_buffer_reserve+143 -> trace_event_buffer_lock_reserve
 [+  23] trace_event_buffer_lock_reserve+67 -> trace_buffer_lock_reserve
 [+  13] trace_buffer_lock_reserve+33 -> ring_buffer_lock_reserve
 [+  41] ring_buffer_lock_reserve+184 -> rb_reserve_next_event
 [+  36] rb_reserve_next_event+180 -> trace_clock_local
 [+   5] trace_clock_local+20 -> sched_clock
 [+   4] sched_clock+12 -> native_sched_clock
 [+   7] trace_buffer_lock_reserve+65 -> ring_buffer_event_data
 [+   4] ring_buffer_event_data+12 -> rb_event_data
 [+   6] trace_buffer_lock_reserve+90 -> tracing_generic_entry_update
 [+   7] trace_event_buffer_reserve+176 -> ring_buffer_event_data
 [+   4] ring_buffer_event_data+12 -> rb_event_data
 [+  11] trace_event_raw_event_msr+164 -> trace_event_buffer_commit
 [+  33] trace_event_buffer_commit+124 -> filter_check_discard
 [+  11] trace_event_buffer_commit+226 -> trace_buffer_unlock_commit
 [+  15] trace_buffer_unlock_commit+45 -> ring_buffer_unlock_commit
 [+  16] ring_buffer_unlock_commit+54 -> rb_update_write_stamp
 [+   7] trace_buffer_unlock_commit+115 -> ftrace_trace_userstack

It is generated by the sptdecode command:

sptdecode --sideband ${PREFIX}.sideband --pt ${PREFIX}.0 $DARGS > ${PREFIX}.out

The function “trace_event_raw_event_msr” is nowhere to be found inside simple-pt.c, nor part of the kernel symbols. But “trace_event_buffer_reserve” is part of the kernel symbol (sudo cat /proc/kallsyms |grep trace_event_buffer_reserve).
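The kallsyms lookup above uses grep, which also matches any longer symbol containing the same string. For an exact match, something like the following sketch may be handier (the function name is made up; the optional second argument is only there so that it can be tried on a sample file):

```shell
#!/bin/sh
# Hypothetical helper: print the address of an exact symbol name from a
# file in /proc/kallsyms format ("address type name [module]").
ksym_addr() {
    awk -v sym="$1" '$3 == sym { print $1 }' "${2:-/proc/kallsyms}"
}

# Example (root is needed to see the real addresses):
#   sudo awk '$3 == "trace_event_buffer_reserve"' /proc/kallsyms
```

An empty result means that the symbol is not in the table at all.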

So now we shall disassemble the functions shown in stest.out from the live running kernel (everything is accessed read-only, so it is safe and no modification is possible, but you will need root).

To see the functions live, first identify the “vmlinux” from which the kernel was built. Mine was built by myself, so I do this (sudo is needed because /proc/kcore is readable only by root):

sudo gdb ./vmlinux /proc/kcore ==> gdb prompt.

Next, identify the location of the simple-pt.ko kernel module file and its load address in memory:

sudo cat /proc/modules |grep simple
 simple_pt 61440 0 - Live 0xffffffffa1021000 (OE)
 

And so add the following inside the gdb prompt:

 add-symbol-file /home/tteikhua/simple-pt/simple-pt.ko 0xffffffffa1021000
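To avoid copying the address by hand, the gdb command can be generated from /proc/modules, where the sixth field is the load address. This is a sketch with a made-up function name; the optional third argument lets it be tried on a sample file.

```shell
#!/bin/sh
# Hypothetical helper: build the gdb "add-symbol-file" command for a module
# from a file in /proc/modules format, where field 6 is the load address.
# Arguments: module name, path to the .ko file, optional modules file.
gdb_add_symbol_cmd() {
    awk -v mod="$1" -v ko="$2" \
        '$1 == mod { print "add-symbol-file " ko " " $6 }' \
        "${3:-/proc/modules}"
}

# Example (read /proc/modules as root, otherwise the address may show as zero):
#   gdb_add_symbol_cmd simple_pt /home/tteikhua/simple-pt/simple-pt.ko
```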

Now you can list in assembly by name:

(gdb) x /10i trace_event_raw_event_msr
 0xffffffffa1022c70 <trace_event_raw_event_msr>: push %rbp
 0xffffffffa1022c71 <trace_event_raw_event_msr+1>: mov %rsp,%rbp
 0xffffffffa1022c74 <trace_event_raw_event_msr+4>: push %r15
 0xffffffffa1022c76 <trace_event_raw_event_msr+6>: push %r14
 0xffffffffa1022c78 <trace_event_raw_event_msr+8>: push %r13
<snip>
 0xffffffffa1022ccd <trace_event_raw_event_msr+93>: lea -0x58(%rbp),%rdi
 0xffffffffa1022cd1 <trace_event_raw_event_msr+97>: mov $0x20,%edx
 0xffffffffa1022cd6 <trace_event_raw_event_msr+102>: mov %r12,%rsi
 0xffffffffa1022cd9 <trace_event_raw_event_msr+105>: addq $0x1,0x7d27(%rip) # 0xffffffffa102aa08
 0xffffffffa1022ce1 <trace_event_raw_event_msr+113>: mov %ecx,-0x5c(%rbp)
 0xffffffffa1022ce4 <trace_event_raw_event_msr+116>: callq 0xffffffff81205060 <trace_event_buffer_reserve>
0xffffffffa1022ce9 <trace_event_raw_event_msr+121>: addq $0x1,0x7d1f(%rip) # 0xffffffffa102aa10

From the “stest.out” output above, we can also see that each line of output corresponds to one “basic block” (https://en.wikipedia.org/wiki/Basic_block) in the assembly listing.

References:

https://lwn.net/Articles/576551/

https://lwn.net/Articles/584539/

https://lwn.net/Articles/654705/

https://lwn.net/Articles/648154/
