Xen

xend

You can actually kill the xend process and restart it again without disturbing the running guests. So, if xm list or whatever doesn't respond, that's something to try.


xm console

To exit the console hit Ctrl+]

Linux

In /etc/inittab add a line like this:

s1:12345:respawn:/sbin/agetty xvc0 115200 vt100

..and add 'xvc0' to /etc/securetty.

This could be added to a kickstart file

Changed in Fedora 9

In most cases, the xen serial device is xvc0, but in FC9, it is hvc0. The block device is no longer hda but xvda. The bridge is no longer 'xenbr0' but 'eth0'. Yeeps...


Networking

By default on FC6 (and possibly elsewhere) Xen does all kinds of weird things to the networking. If they work, you won't know that they happened. If they don't work, it will look really weird. It involves the birth of peth0, vif0.0 and xenbr0. See http://wiki.xensource.com/xenwiki/XenNetworking.


Creating a New domU

booting an installer

I tried creating a domU config file with reasonable variables and it wouldn't install an OS because the installer would pop off to some other virtual console. The xen console isn't even a vt100. It's just a serial connection.

Fedora's xen stuff comes with virt-install and it works but I wanted to know how. Basically, it creates a working domU config file with a few nifty variables.

In FC6 in the os/images dir of the install media there is a dir called xen which contains initrd.img and vmlinuz. These are special versions of the kernel and installer initrd that play nice in the console. By creating a domU config file that uses these, I can boot to an installer.

Interestingly, I can't just use the special initrd with a normal kernel. Somehow the special kernel tells anaconda to attach to the xen console rather than zip off to another virtual console like normal.
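
A rough sketch of the kind of domU config file that boots the installer this way. xm config files use Python syntax. The guest name, memory size, paths and LV name are placeholders I made up for illustration; only the idea of pointing kernel and ramdisk at the xen versions of vmlinuz and initrd.img comes from the notes above.

```python
# Hypothetical /etc/xen/newguest: boots the installer, not an installed system.
name = "newguest"                              # placeholder guest name
memory = 512                                   # MB, placeholder
vcpus = 1

# Copies of the vmlinuz and initrd.img from the xen dir on the install media
kernel = "/var/lib/xen/install/vmlinuz"        # placeholder path on dom0
ramdisk = "/var/lib/xen/install/initrd.img"    # placeholder path on dom0

disk = ["phy:/dev/vg/newguest,xvda,w"]         # placeholder LV; device name varies by release
vif = ["bridge=xenbr0"]                        # bridge name varies by release

on_reboot = "destroy"                          # stop after the install instead of re-running the installer
```

Once the install is done, the kernel and ramdisk lines would be swapped for a bootloader line so the guest boots its own kernel.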

domain config file

The command

xm create --help_config

will print a list of possible config vars for the domain config files.


The Block Devices

I want to be able to resize partitions on guests without rebooting them or the host.

1 fdisked lv on host

In an attempt to make the guest's disk look just like the disks I use, I made a single large LV on dom0 and then used fdisk to divide it into a /boot and an LVM partition. I later tried to resize the LVM partition, but I couldn't get the kernel of domU to recognise that the partition size had changed.

2 lvs on host, no fdisk

I created 1 small LV to use as the boot partition, and the other LV was a whole-disk PV for LVM. I'm hoping that because there's no partition table on the LVM disk, I can resize while the thing is running.

I was able to create the PV, VG and LVs from inside dom0 and have them found by the fedora installer in domU.

To recap:

1 [physical disk]
2 [vg on physical disk]
3 [lv on vg]
4 [domU vg containing lv as only pv]
5 [domU lv on domU vg]

So, to resize an lv on domU, I first expand the lv on dom0, then I run pvresize on that lv on dom0. Then I can lvresize the domU lv from domU.
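
The order of operations can be sketched as a small dry-run helper that only prints the commands. The device names and the 2G figure are made-up examples, and the function name is my own. Growing the filesystem inside the domU lv is a separate step that this sketch does not cover.

```python
def resize_plan(dom0_lv, domu_lv, grow_by):
    """Return (where, argv) pairs for growing a domU lv, in the order described above."""
    return [
        ("dom0", ["lvextend", "-L", "+" + grow_by, dom0_lv]),  # grow the backing lv
        ("dom0", ["pvresize", dom0_lv]),                       # let the PV fill the bigger lv
        ("domU", ["lvresize", "-L", "+" + grow_by, domu_lv]),  # grow the guest's own lv
    ]


# Hypothetical names: /dev/vg/guest_pv on dom0, /dev/guestvg/home inside the guest.
for where, argv in resize_plan("/dev/vg/guest_pv", "/dev/guestvg/home", "2G"):
    print(where + "#", " ".join(argv))
```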


Updated DomU Install Notes

Restoring from Backup

In this case, you need to have the disk ready, which you can do from dom0.


Tar

If you are un-tarring a backup onto a set of dirs mounted on dom0, make sure you use --numeric-owner, otherwise it will change the owner/group numeric IDs of the files to match the IDs of the users and groups on dom0. E.g. Fred's uid is 500 on hostA and 503 on dom0. If you untar hostA onto volumes mounted on dom0, it will change his files to belong to uid 503.
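
The reason this happens is that a tar archive stores both the user name and the numeric id for each file, and a restore can trust either one. A small self-contained sketch using Python's tarfile module shows both fields side by side. The file name is made up; the uid and user name are the Fred example from above.

```python
import io
import tarfile


def make_demo_archive():
    """Build a tiny in-memory tar holding one file owned by fred (uid 500)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        data = b"hello\n"
        info = tarfile.TarInfo("home/fred/notes.txt")  # made-up file name
        info.size = len(data)
        info.uid, info.gid = 500, 500                  # numeric ids as on hostA
        info.uname, info.gname = "fred", "fred"        # names stored alongside them
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def owners(archive):
    """List (path, stored uid, stored user name) for each member of the archive."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        return [(m.name, m.uid, m.uname) for m in tar.getmembers()]


print(owners(make_demo_archive()))
```

If the restore goes by the name "fred", the files end up with whatever uid fred has on dom0. Going by the stored number keeps 500. In Python the equivalent of --numeric-owner is extractall(path, numeric_owner=True).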


Kickstart

Don't set the 'logging' option in your kickstart file. It makes python throw weird errors and you can't see any useful errors because they are on a different vt from the serial console.

In the kickstart file, don't set install mode to 'text', set it to 'cmdline'. You get better error messages that way.

Any %post scripts should begin with

exec &> /dev/xvc0

This will make STDOUT and STDERR go to the virtual serial device instead of the nonexistent ptty3.

The irritating "Could not get identity of device ... Ignore/Cancel" error means you have to sit at the console and hit "Ignore" at least 6 times before the install proper begins. I hacked stage2.img (usr/lib/anaconda/text.py) and now I'm OK.

My 'repo' lines didn't work. I copied --mirrorlist directly from yum repos...D'oh! Need to replace vars like $repoversion with values like '5.2'.
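
A quick way to expand those variables before pasting a URL into a kickstart file, as a sketch. I'm assuming the variables are spelled the usual yum way ($releasever, $basearch), and the mirrorlist URL is modeled on the stock CentOS repo file rather than copied from my kickstart. The function name is my own.

```python
from string import Template


def expand_yum_vars(url, **values):
    """Replace $name placeholders in a repo URL; unknown placeholders are left alone."""
    return Template(url).safe_substitute(values)


# Example URL in the style of a yum .repo file (an assumption, adjust to your repo).
mirrorlist = "http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os"
print(expand_yum_vars(mirrorlist, releasever="5.2", basearch="x86_64"))
```

The printed URL is what goes after --mirrorlist= on the 'repo' line, since the installer does not expand the variables for you.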


CPUs

To allow a domU to use more than one CPU, you not only have to tell it how many cpus it's allowed to use, but also which specific cpus are OK for it to use. E.g.

cpus=0-3  # you're allowed to run on all 4 cpus
vcpus=4   # you're allowed to use 4 cpus at once

Changed in 3.1.0

The line  cpus=0-3  causes a really obscure error. I think that's because the config file is read as Python, and an unquoted 0-3 is not a cpu range there. If you comment it out, the new default seems to be to use all available cpus. So, if you just have the line  vcpus=4  on a 4 proc dom0, then domU will be able to use all 4.
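
One plausible reading of that error, assuming xm evaluates the config file as ordinary Python: without quotes, 0-3 is integer subtraction. The example configs shipped with Xen write cpus as a quoted string, so quoting it is the thing to try. The variable names below are only for the comparison.

```python
# The same characters, read two different ways by Python.
cpus_unquoted = 0-3      # integer arithmetic: this is the number -3
cpus_quoted = "0-3"      # a string, which xm can read as "cpus 0 through 3"
vcpus = 4                # how many cpus the guest may use at once

print(repr(cpus_unquoted), repr(cpus_quoted))
```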


Cloning and LVM

I tried to clone a system by creating similar LVs and using  dd  to copy the contents of the 1st system's disks onto the new system. The new system did not boot because the initrd from the old system had lvm configuration stuff for the old system. It was looking for the wrong VGs.

To fix this, I had to mount the new system's lvs on a running system (dom0 in my case) and run mkinitrd. I had to make sure that /proc and /sys were available and most importantly, that /dev was populated. I populated /dev by doing a file copy from the system I was cloning. I'm sure there's a better way to do that.

To get everything mounted in dom0 so I could edit things, I had to use kpartx to get the partition tables read for the lvs that are whole disks on domU. If you use kpartx to map the partition table of an LV and then mount it, you must use kpartx again to unmap them or the changes you've made to any files won't actually be written to disk. It's a bit maddening to change grub.conf and then see that pygrub is ignoring your changes.
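
To stop myself forgetting the unmap step, the map/mount/umount/unmap sequence can be wrapped so the cleanup always runs. This is a sketch: kpartx -a and kpartx -d are the real add and delete flags, but the function names and the device, partition and mountpoint arguments are placeholders of my own. It needs root to do anything real, so nothing runs on import.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def mapped_partitions(device, run=subprocess.check_call):
    """Map the partitions of a whole-disk LV, and always unmap them afterwards."""
    run(["kpartx", "-a", device])
    try:
        yield
    finally:
        run(["kpartx", "-d", device])


def with_guest_mounted(device, partition, mountpoint, edit, run=subprocess.check_call):
    """Mount one mapped partition, call edit(mountpoint), then undo everything in reverse."""
    with mapped_partitions(device, run):
        run(["mount", partition, mountpoint])
        try:
            edit(mountpoint)
        finally:
            run(["umount", mountpoint])
```

The run argument exists so the sequence can be checked with a fake runner before pointing it at a real LV.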

The next problem I had was that I had made /home 5G instead of 7G and because I had copied all the data with dd, the filesystem was totally confused. It expected to be on a 7G partition, not a 5G partition. I wound up blowing /home away on the new system and using tar to copy from the old system.

More with mkinitrd

In order to get mkinitrd to recognize that I have LVM on my system, I had to create /dev/hda, /dev/hda1 and /dev/hda2 using mknod. I copied the major/minor numbers from the devices in dom0. Since this is all to fool mkinitrd, which is running in dom0, that's what we want to do. Sure enough, when I chroot to the new system and run mkinitrd, it thinks hda2 has LVM stuff on it.

It still won't boot, but at least it's doing a vgscan. I think if I fix /dev, all will be right in the world.


Confusing Kernel Panic

I had a confusing kernel panic in a paravirt domU that I was booting using kernel=blah, ramdisk=blah in the xm config file, rather than just using the bootloader=blah line. I discovered that by adding  extra="console=xvc0"  to the xm config file, I got a lot more info during the failed boot. I assume that xen is printing things to the serial console, but the actual kernel doesn't know to do that unless you tell it to via that option.


OpenBSD as HVM Guest

If you have a processor with special virtualization support, you're supposed to be able to run any OS, right? Ha ha.. Well, you probably can but it ain't easy.

Use 3.1.0

After giving myself yet another ulcer, I gave up on xen 3.0.3 and I waited for Fedora 7 which has xen 3.1.0 to come out. Things look much much better!

QEMU

It looks like all the IO emulation is actually handled by QEMU, not xen. It looks like xen takes config variables and passes them onto a patched version of QEMU. That means some configuration things that worked fine with a PV guest don't work with an HVM guest. Did I mention 'ARRGH!'?

If you run the command ps auxw | grep qemu on dom0, you'll see the qemu command with its args. You can then check the qemu docs to see what the heck you've actually done.

root@syldavia#
root@syldavia# ps auxw | grep qemu 
root     10032  1.2  0.2  64576  6980 ?  Sl   15:32   0:48 
/usr/lib64/xen/bin/qemu-dm -d 23 -vcpus 4 -boot dc -acpi -domain-name 
celeste -net nic,vlan=1,macaddr=00:16:3e:5f:b7:1a,model=rtl8139 -net 
tap,vlan=1,bridge=eth0 -vncunused -vnclisten 127.0.0.1 
root@syldavia#

Block Device Requires Full Path

In an example of the above phenomenon, you must specify the full path to the block device or file back-end on Dom0 in the xm config file, unlike with the PVs in 3.0.3.

disk = [ 'phy:/dev/vg/celeste,ioemu:sda,w',
         'file:/etc/xen/cd41.iso,hdc:cdrom,r'
       ]

Can't boot from SCSI disk

I installed OpenBSD onto an emulated SCSI disk. However, the xen bios didn't see this disk when looking for disks to boot from. I changed it to an IDE drive and changed /etc/fstab to reflect this change and it started booting OK.

To change from scsi to ide, change the name of the destination device in the disk entry in your xm config file. See the example below:

disk = [ 'phy:/dev/vg/celeste,ioemu:sda,w',      # sda = scsi 
         'file:/etc/xen/cd41.iso,hdc:cdrom,r'
       ]

disk = [ 'phy:/dev/vg/celeste,ioemu:hda,w',      # hda = ide
         'file:/etc/xen/cd41.iso,hdc:cdrom,r'
       ]

Networking

xenbr0 vs. eth0

Somehow in the transition between FC6/Xen3.0.3 -> F7/Xen3.1.0 xenbr0 fell out of favor and eth0 became the name of the bridge du jour. I needed to specify the bridge as eth0 for it to work. If I didn't, it assumed xenbr0 as the default bridge... ooops.

vif = [ 'type=ioemu,bridge=eth0', ]

No, I don't know what "type=ioemu" means. I just cribbed it from examples on the web and it seems to work.

Emulated Network Card Problem

The OpenBSD rtl8139 driver and the QEMU rtl8139 emulation do not get along. Every few seconds, the OpenBSD terminal says something like

re0: watchdog timeout

(I'm writing that from memory, so it might not be exactly that)

When it does this, network connectivity stops for a few seconds. This makes the networking slower than a 56K modem.

What I've done as a lousy work-around is use the ne2000 emulation in qemu. To tell it to emulate the ne2000,

vif = [ 'type=ioemu,bridge=eth0,model=ne2k_pci', ]

This means you're limited to 10Mbps, which is still a heck of a lot faster than with the busted realtek.

Investigation Ongoing

Looks like the card is actually hung and the watchdog should be resetting it. Oh dear...

I've been tinkering around in the OpenBSD code to see if I can learn where it's not working. It looks like the general behavior of an interface is set in /usr/src/sys/net/if.c and most of the 're' stuff is in /usr/src/sys/dev/ic/re.c. I've found that there is a timer (if->if_timer) that gets decremented by the watchdog and if it hits 1 (not zero) it freaks out. It's not supposed to get to 1 because the interface is supposed to call _something_ to reset it. In the case of the re driver, it looks like re_txeof (transmit end of frame?) and re_stop (stop the interface?) set it to zero (basically, not counting any more) and re_send sets the timer to 5. That makes it look like the txeof is probably the thing that should be happening that isn't. Strangely, I don't see that in normal behavior. What I see is lots of re_sends.
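
To check my own understanding of that timer, here is a toy model of it. This is not OpenBSD code. It assumes the once-a-second pass skips a timer that is already 0, otherwise decrements it, and fires the watchdog when the decrement lands on 0 (which is the "hits 1" case above). The class and method names are made up.

```python
class ToyInterface:
    """Toy model of the if_timer behavior described in these notes."""

    def __init__(self):
        self.if_timer = 0    # 0 means "not counting"
        self.timeouts = 0    # how many times the watchdog has fired

    def send(self):
        """Stands in for the send path: arm the timer."""
        self.if_timer = 5

    def txeof(self):
        """Stands in for re_txeof: transmit finished, stop counting."""
        self.if_timer = 0

    def tick(self):
        """Stands in for the once-a-second watchdog pass."""
        if self.if_timer == 0:
            return
        self.if_timer -= 1
        if self.if_timer == 0:
            self.timeouts += 1   # this is where "watchdog timeout" would be printed


def run(events):
    """Feed a sequence of 'send', 'txeof' and 'tick' events to a fresh interface."""
    nic = ToyInterface()
    for name in events:
        getattr(nic, name)()
    return nic.timeouts


healthy = run(["send", "tick", "tick", "txeof"] + ["tick"] * 5)
hung = run(["send"] + ["tick"] * 5)
print("healthy timeouts:", healthy, "hung timeouts:", hung)
```

In this model, a send that is never followed by a txeof produces a timeout 5 ticks later, which fits the idea that txeof is the thing that isn't happening.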

Ideas


Logging

You can get a xen console log on dom0 by changing the way xenconsoled runs. I have not been able to find a real man page for xenconsoled, but I've figured out that you can edit /etc/rc.d/init.d/xend to include some logging options like:

XENCONSOLED_LOG_HYPERVISOR=yes
XENCONSOLED_LOG_GUESTS=yes
XENCONSOLED_TIMESTAMP_HYPERVISOR_LOG=yes
XENCONSOLED_TIMESTAMP_GUEST_LOG=yes

I haven't actually restarted xen with those options, so I don't know if that's correct or if it works.

What I have done is kill the running xenconsoled, and then run it like this

xenconsoled --log=all --log-dir=/var/log/xen/console/ -t all

It seems to be absolutely fine to kill and restart xenconsoled while xen vms are running.

xenconsoled

And now, ladies and gentlemen, a summary of xenconsoled's options. Gleaned from reading the source. This is valid for CentOS 5 (which is xen 3.0.3 with lots of patches), and probably Xen 3.1.

Usage: xenconsoled [-h] [-V] [-v] [-i] [--log=none|guest|hv|all] [--log-dir=DIR] [--pid-file=PATH] [-t, --timestamp=none|guest|hv|all]

-h - print usage message above
-V - print xenconsoled version
-v - makes xenconsoled log more verbosely to syslog.  This seems to be xenconsoled internal logging, not logging of xen
-i - interactive mode - don't turn into a daemon, no idea what this means other than that

--log - "guest" gives you a log of console output for each guest.  "hv" gives you a log of stuff from dom0.  "all" and "none" do what you think.
--log-dir - Where to put all this stuff.  /var/log/xen/console/ on my system, for example.
--pid-file - if you don't want to use the default pid file location
--timestamp - include timestamps in log messages from your guests and your dom0.  Options do the same as for --log



Xen (last edited 2008-11-07 23:12:23 by dmartin)