README for the Xen packages
===========================
This file contains SUSE-specific instructions and suggestions for using Xen.
For more in-depth documentation of using Xen on SUSE, consult the
virtualization chapter in the SLES or SUSE Linux manual, or read up-to-date
virtualization information, including a list of known issues, at
http://www.novell.com/documentation/vmserver/.
For more complete documentation on Xen itself, please install one of the
xen-doc-* packages and read the documentation installed into
/usr/share/doc/packages/xen/.
About
-----
Xen allows you to run multiple virtual machines on a single physical machine.
See the Xen homepage for more information:
http://www.cl.cam.ac.uk/Research/SRG/netos/xen/
If you want to use Xen, you need to install the Xen hypervisor and a number of
supporting packages. During the initial SUSE installation (or when installing
from YaST) check-mark the "Xen Virtual Machine Host Server" pattern. If,
instead, you wish to install Xen manually later, install the following packages:
bridge-utils
kernel-xen or kernel-xenpae
python
xen
xen-libs
xen-tools
xen-tools-ioemu (Only required for hardware-assisted virtualization)
xen-doc-* (Optional)
tightvnc (Optional, to view VMs)
yast2-vm (Optional, to facilitate creation and management of VMs)
multipath-tools (Required by yast2-vm, for domUloader)
You then need to reboot your machine. Instead of booting a normal Linux
kernel, you will boot the Xen hypervisor and a slightly changed Linux kernel.
This Linux kernel runs in the first virtual machine and will drive most of your
hardware.
This approach is called para-virtualization, since it is a partial
virtualization (the Linux kernel needs to be changed slightly, to make the
virtualization easier). It results in very good performance (consult
http://www.cl.cam.ac.uk/Research/SRG/netos/xen/performance.html) but has the
downside that unmodified operating systems are not supported. However,
upcoming hardware features (e.g., Intel's VT and AMD's Virtualization) will
help overcome this limitation.
Terminology
-----------
The Xen open-source community has a number of terms that you should be
familiar with.
A "domain" is Xen's term for a virtual machine.
"Domain 0" is the first virtual machine. It can control all other virtual
machines. It also (usually) controls the physical hardware. A kernel used in
domain 0 may sometimes be referred to as a dom0 kernel.
"Domain U" is any virtual machine other than domain 0. The "U" indicates it
is unprivileged (that is, it cannot control other domains). A kernel used in
an unprivileged domain may be referred to as a domU kernel.
Novell documentation will use the more industry-standard term "virtual
machine", or "VM", rather than "domain" where possible. And to that end,
domain 0 will be called the "virtual machine server", since it is essentially
the
server on which the other VMs run. All other domains are simply "virtual
machines".
The acronym "HVM" refers to a hardware-assisted virtual machine. These are
VMs whose operating system has not been modified (e.g., Windows) and which
therefore need hardware support such as Intel's VT or AMD's Virtualization
to run on Xen.
Kernels
-------
Xen supports two kinds of kernels: a privileged kernel (which boots the
machine, controls other VMs, and usually controls all your physical hardware)
and unprivileged kernels (which can't control other VMs, and usually don't need
drivers for physical hardware). The privileged kernel boots first (as the VM
server); an unprivileged kernel is used in all subsequent VMs.
The VM server takes control of the boot process after Xen has initialized the
CPU and the memory. This VM contains a privileged kernel and all the hardware
drivers.
For the other virtual machines, you usually don't need the hardware drivers.
(It is possible to hide a PCI device from the VM server and re-assign it to
another VM for direct access, but that is a more advanced topic.) Instead,
the unprivileged VMs use virtual network and block device drivers that
communicate with the physical network and block drivers in the VM server.
For simplicity, SUSE ships a single Xen-enabled Linux kernel, rather than
separate privileged and unprivileged kernels. As most of the hardware drivers
are modules anyway, using this kernel as an unprivileged kernel has very
little extra overhead.
The kernel is contained in the kernel-xen package (or kernel-xenpae for 32-bit
hardware with > 4G of RAM), which you need to install to use Xen.
Booting
-------
If you installed Xen during the initial SUSE installation, or installed one
of the kernel-xen* packages later, a "XEN" option should exist in your Grub
bootloader. Select that to boot SUSE on top of Xen.
If you want to add additional entries, or modify the existing ones, you will
have to edit Grub yourself. All Xen entries in the Grub configuration file
(usually /boot/grub/menu.lst) look something like this:
title XEN
root (hd0,5)
kernel /xen.gz
module /vmlinuz-xen <parameters>
module /initrd-xen
Replace (hd0,5) with the partition that holds your /boot directory, in
GRUB notation (disks and partitions are both counted from zero), e.g.,
hda1 -> (hd0,0), or sda5 -> (hd2,4) on a system where sda is the third disk.
Normally, xen.gz requires no parameters. If you want to add parameters,
see below.
Replace "<parameters>" with the kernel parameters that you want to pass to
your kernel. These should be very similar, if not identical, to those passed
to a normal kernel that you boot on bare iron.
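For example, a complete entry might look like the following (the root device
and kernel parameters are illustrative; use whatever your regular Linux
entry uses):
title XEN
root (hd0,5)
kernel /xen.gz
module /vmlinuz-xen root=/dev/hda6 vga=0x317 showopts
module /initrd-xen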
Once you have booted this configuration successfully, you are running Xen with
a privileged kernel on top of it.
Xen Boot Parameters
-------------------
Normally, xen.gz requires no parameters. However, in special cases (such as
debugging or a dedicated VM server) you may wish to pass it parameters.
We have added the following parameters (as compared to upstream Xen):
reboot=option1[,option2,...]
Options are:
warm Reboots will be warm (no memory testing, etc.)
cold Reboots will be cold (with memory testing, etc.)
no No reboots allowed
bios Reboot by calling the BIOS
hard Reboot by toggling RESET and/or crashing the CPU
For a more complete discussion of possible parameters, see the user
documentation in the xen-doc-* packages.
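For example, to force warm reboots, the hypervisor line of the Grub entry
would become:
kernel /xen.gz reboot=warm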
Start Scripts
-------------
Before you can create additional VMs (or use any other xm command), xend must
be running. This init script is part of the xen-tools package, and it is
activated at installation time. You can (de)activate it using insserv (or
chkconfig). You can also start it manually with "rcxend start".
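For example, to re-enable xend after having deactivated it, and start it
right away:
chkconfig xend on
rcxend start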
One other relevant startup script is xendomains. This script can be used to
start other VMs when the VM server boots. It also cleanly shuts down the
other VMs when the VM server shuts down. To use this feature, place a
symbolic link in /etc/xen/auto that points to the VM's configuration file.
Look in /etc/sysconfig/xendomains for relevant settings.
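For example, to have the VM with the configuration file /etc/xen/vm/my-vm
(name illustrative) start and stop together with the VM server:
ln -s /etc/xen/vm/my-vm /etc/xen/auto/my-vm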
Creating a VM with YaST
-----------------------
YaST is the recommended method to create VMs. The YaST module (from the
yast2-vm package) handles creating both the VM's configuration file and
disk(s). YaST can help install any operating system, not just SUSE.
From the command line, run "yast2 xen". From the GUI, start YaST, select
"System", then start "Virtual Machine Management (Xen)". For full
functionality, YaST must run in graphical mode, not ncurses.
The first screen shows all created and running VMs. To create a new one,
click "Add". Now adjust the VM's configuration to your liking.
Xen does not yet properly support removable media in VMs in paravirtual mode,
so installing an operating system from CDs can be difficult. We recommend
using a network installation source, a DVD, or a DVD ISO. CDs do, however,
work as expected in fully-virtual mode.
Note that paravirtualized SUSE Linux will default to using a text-based
installation. To perform a graphical installation, add "vnc=1" to the
"Installation Options" line in YaST. See this page for further guidance on
installing via VNC:
http://www.novell.com/coolsolutions/feature/15568.html
Once you have the VM configured, click "Next". YaST will now create a
configuration file for the VM, and create a disk image. The disk image will
exist in /var/lib/xen/images, and a corresponding config file will exist in
/etc/xen/vm. The operating system's installation program will then run within
the VM.
When the VM shuts down (because the installation -- or at least the first stage
of it -- is done), YaST gives you a chance to finalize the VM's configuration.
This is useful, for example, if the installer and the application that will
run in the VM have different memory or network requirements.
The creation of VMs can be automated with AutoYaST. A single AutoYaST profile
can control the VM's settings (regardless of OS type) and/or the actual
installation of the OS within the VM (for SUSE only). Perhaps the easiest way
to create such a profile is to install a SUSE OS within a VM and "clone" the
operating system in the final stage of the OS installation. Then copy the
resulting file (/root/autoinst.xml) into the VM server's filesystem, into the
directory /var/lib/autoinstall/repository/. Start the AutoYaST tool (YaST >
Miscellaneous > Autoinstall) and then open the profile. Select the "Virtual
Machine Management (Xen)" heading, and add the settings for the VM. Save the
profile. Now the single profile can direct both the configuration of a VM,
and the installation of the OS within the VM.
Creating a VM Manually
----------------------
If you create a VM manually (as opposed to using YaST, which is the recommended
way), you will need to create a disk (or reuse an existing one) and a
configuration file.
Each VM needs to have its own root filesystem. The root filesystem can live on
a block device (e.g., a hard disk partition, or an LVM2 or EVMS volume) or in
a file that holds the filesystem image.
VMs can share filesystems, such as /usr or /opt, provided they are mounted
read-only in _all_ VMs. Never try to share a filesystem that is mounted
read-write;
filesystem corruption will result. For sharing writable data between VMs, use
NFS or other networked or cluster filesystems.
If you are using a disk or disk image that is already installed with an
operating system, you'll probably need to replace its kernel with a Xen-enabled
kernel.
The kernel and ramdisk used to bootstrap the VM must match any kernel modules
that might be present in the VM's disk. It is possible to manually copy the
kernel and ramdisk from the VM's disk (for example, after updating the kernel
within that VM) to the VM server's filesystem. However, an easier (and less
error-prone) method is to use something called the "domUloader". Before a new
VM is started, this loader automatically copies the kernel and ramdisk into
the VM server's filesystem, so that it can be used to bootstrap the new VM.
See /etc/xen/examples/xmexample.domUloader for an example.
Next, make a copy of one of the /etc/xen/examples/* files, and modify it to
suit your needs. For para-virtualized VMs, start with
/etc/xen/examples/xmexample1; for fully-virtualized VMs, start with
/etc/xen/examples/xmexample.hvm. You'll need to change (at very least) the
"name" and "disk" parameters.
When defining the virtual network adapter(s), we recommend using a static MAC
for the VM rather than allowing Xen to randomly select one each time the VM
boots. (See "Network Troubleshooting" below.) XenSource has been allocated a
range of MAC addresses with the OUI of 00-16-3E. By using MACs from this
range you can be sure they will not conflict with any physical adapters.
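As a minimal sketch, a paravirtualized configuration file might contain the
following (the name, disk image path, and MAC address are illustrative; see
/etc/xen/examples/xmexample1 for the full set of options):
name    = "my-vm"
memory  = 256
kernel  = "/boot/vmlinuz-xen"
ramdisk = "/boot/initrd-xen"
disk    = [ 'file:/var/lib/xen/images/my-vm/disk0,hda,w' ]
vif     = [ 'mac=00:16:3e:00:00:01' ]
root    = "/dev/hda1"
Here the kernel and ramdisk are taken from the VM server's filesystem, as
described above; with the domUloader they come from the VM's disk instead.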
To get started quickly, you can use a modified rescue image from the Novell
SUSE installation CD/DVD. It's on the first CD/DVD in the boot/ directory with
the name "rescue". To make it usable with Xen, run the script
/usr/share/doc/packages/xen/mk-xen-rescue-img.sh (run it with no arguments to
get help). The script replaces the normal Linux kernel in the image with a
Xen-enabled Linux kernel (among other things; read the script for details).
The script also creates a matching configuration file. The disadvantage of
using the rescue way of constructing a root filesystem is that the result does
not have an RPM database, so you can't easily add packages using rpm. On the
positive side, the result is relatively small yet has most of what's needed to
get started with networking.
Managing Virtual Machines
-------------------------
VMs can be managed from the command line or from YaST.
To create a new VM from the command line, use a command like:
xm create my-vm
If your VM's configuration file is not located in /etc/xen/vm, you must
specify the full path.
Have a look at running sessions with "xm list". Note the ID of the newly
created VM. Attach to that VM with "xm console <ID>" (replacing ID with the
VM's ID). Equivalently, you could have passed "-c" during creation to
immediately connect to the console. Attaching to multiple VM consoles is most
conveniently done with the terminal multiplexer "screen".
Have a look at the other xm commands by typing "xm help". Note that most xm
commands must be done as root.
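A typical session might look like this:
xm create -c my-vm    # create the VM and attach to its console
xm list               # list running VMs and their IDs
xm shutdown my-vm     # shut the VM down cleanly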
Using the Mouse via VNC in Fully-Virtual Mode
---------------------------------------------
When accessing a fully-virtualized operating system via VNC, the mouse may be
difficult to control. By default, the VM is emulating a PS/2 mouse. PS/2
provides mouse deltas, but VNC only provides absolute coordinates. The
solution is (when using VNC) to emulate a pointing device that offers absolute
coordinates.
Emulation of a SummaSketch graphics tablet is provided for this reason. To
use the Summa emulation, you will need to configure your fully-virtualized OS.
Note that the virtual tablet is connected to the second virtual serial port
(/dev/ttyS1 or COM2).
Most Linux distributions ship with appropriate drivers, which only need to be
configured. To configure gpm, edit /etc/sysconfig/mouse and add these lines:
MOUSETYPE="summa"
XMOUSETYPE="SUMMA"
DEVICE=/dev/ttyS1
The format and location of your configuration file could vary depending upon
your Linux distribution. The goal is to run the gpm daemon as follows:
gpm -t summa -m /dev/ttyS1
X also needs to be configured to use the Summa emulation. Add the following
stanza to /etc/X11/xorg.conf, or use your distribution's tools to add these
settings:
Section "InputDevice"
Identifier "Mouse0"
Driver "summa"
Option "Device" "/dev/ttyS1"
Option "InputFashion" "Tablet"
Option "Mode" "Absolute"
Option "Name" "EasyPen"
Option "Compatible" "True"
Option "Protocol" "Auto"
Option "SendCoreEvents" "on"
Option "Vendor" "GENIUS"
EndSection
After making these changes, restart gpm and X.
Windows does not ship with a driver for the SummaSketch tablet. You can
obtain an appropriate driver from the device manufacturer's website, or one of
the large Windows driver websites.
HVM Console in Fully-Virtual Mode
---------------------------------
When running a VM in fully-virtual mode, a special console is available that
provides some additional ways to control the VM. Press Ctrl-Alt-2 to access
the console; press Ctrl-Alt-1 to return to the VM.
The two most important commands are "send-key" and "change". The "send-key"
command allows you to send any key sequence to the VM, which might otherwise
be intercepted by your local window manager. The "change" command allows the
target of a block device to be changed; for example, use it to change from one
CD ISO to another. Type "help" for more information.
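For example, to point the emulated CD-ROM drive at a different ISO image
(the device name and path are illustrative; type "help change" at the
console for the exact syntax of your version):
change cdrom /var/lib/xen/isos/disc2.iso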
Networking
----------
Your virtual machines become much more useful if you can reach them via the
network. The default Xen setup creates a virtual bridge (xenbr0) in domain 0
when you start xend. Your eth0 device is enslaved to it. The slave VMs get a
virtual network interface eth0, which is visible to domain 0 as vifN.0 and
connected to the bridge. This means that if you give a slave VM an IP address
on the same subnet as domain 0's eth0, it will be able to communicate not
only with the other slave VMs, but also with domain 0
and with the external network. If you have a DHCP server running in your
network, your slave VMs should succeed in getting an IP address.
Be aware that this may have unwanted security implications. You may want to
opt for routing instead of bridging, so you can set up firewalling rules in
domain 0.
Please read about the network configuration in the Xen manual. You can set up
bridging or routing for other interfaces also.
The network setup is done via the scripts in /etc/xen/scripts. They do not
support IPv6 at the moment, but this is just a limitation of the scripts.
When using SuSEfirewall2 and Xen network bridging, ensure that the Xen
bridges being used (xenbr0, xenbr1, etc.) are listed in
FW_FORWARD_ALWAYS_INOUT_DEV in the SuSEfirewall2 file. The format for
FW_FORWARD_ALWAYS_INOUT_DEV is a list of interfaces separated by a space.
For example, if the Xen bridge xenbr0 is being used, the line should be:
FW_FORWARD_ALWAYS_INOUT_DEV="xenbr0".
If xenbr0 and xenbr1 are being used, the line should be:
FW_FORWARD_ALWAYS_INOUT_DEV="xenbr0 xenbr1".
If you use the rescue images created by the above-mentioned script, a boot
script inside the image parses the ip=.... boot parameter. You can set this
parameter in the config file to have networking configured automatically.
When using bridging, the eth0 device in domain 0 will be renamed to peth0,
its MAC address will be set to fe:ff:ff:ff:ff:ff, and ARP will be disabled.
veth0 will take over the old MAC address, be renamed to eth0, and be enabled
(ifup'ed). vif0.0 and peth0 are then enslaved to xenbr0. veth0 is connected
to vif0.0 behind the scenes.
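You can inspect the resulting setup with "brctl show" (from the bridge-utils
package); with one VM running, the output should look roughly like this:
brctl show
bridge name     bridge id               STP enabled     interfaces
xenbr0          8000.feffffffffff       no              peth0
                                                        vif0.0
                                                        vif1.0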
Caveats:
- rcSuSEfirewall2 is not currently called from the Xen networking scripts,
but is implicitly started by the ifup call; it won't get restarted when
additional domains are started. This issue may be addressed in a future
update.
Configuring network interfaces when using Xen bridging:
Due to the renaming of network interfaces by the network-bridge script
(e.g. eth0 to peth0), network interfaces should not be configured or restarted
while they are enslaved to a Xen bridge. Before configuring a network
interface enslaved to a Xen bridge, shut down all VMs using the interface.
Then use the network-bridge script to remove the Xen bridge and to restore the
network interface back to normal (put peth0 back to eth0). For example, to
remove the Xen bridge and restore eth0 back to normal do the following:
/etc/xen/scripts/network-bridge stop netdev=eth0
With the Xen bridge removed and eth0 put back to normal, eth0 can then be
configured or restarted. Once the configuration is complete, Xen bridging can
be started back up again (creating the Xen bridge and renaming eth0 to peth0)
by doing the following:
/etc/xen/scripts/network-bridge start netdev=eth0
The VMs can then be started again.
For debugging, here's what happens on bootup of a domU:
- xenstored saves the device setup in xenstore
- domU is created
- vifX.0 shows up in domain 0 and a hotplug event is triggered
- hotplug is /sbin/udev; udev looks at /etc/udev/rules.d/40-xen.rules and
calls /etc/xen/scripts/vif-bridge online
- vif-bridge sets the vifX.0 device up and enslaves it to the bridge
- eth0 shows up in domU (hotplug event triggered)
Similar things happen for block devices, except that /etc/xen/scripts/block is
called.
It's not recommended to use ifplugd or NetworkManager for managing the
interfaces if you use bridging mode. Use routing with NAT or proxy-arp in
that case. You also need to do that if you want to send out packets on a
wireless interface; you can't bridge Xen "ethernet" packets into 802.11
packets.
Thread-Local Storage
--------------------
For some time now, the glibc thread library (NPTL) has used a shortcut to
access thread-local variables at a negative segment offset from the segment
selector GS instead of reading the linear address from the TDB (offset 0).
Unfortunately, this optimization has been made the default by the glibc and
gcc maintainers, as it saves one indirection. For Xen this is bad: The access
to these variables will trap, and Xen will need to use some tricks to make the
access work. It does work, but it's very slow.
SUSE Linux 9.1 and SLES 9 predate this change, and thus are not
affected. SUSE Linux 9.2 and 9.3 are affected. For SUSE Linux 10.x and SLES
10, we have disabled negative segment references in gcc and glibc, and so
these are not affected. Other non-SUSE Linux distributions may be affected.
For affected distributions, one way to work around the problem is to rename
the /lib/tls directory, so the pre-i686 version gets used, where no such
tricks are done. An example LSB-compliant init script which automates these
steps is installed at /usr/share/doc/packages/xen/boot.xen. This script
renames /lib/tls when running on Xen, and restores it when not running on Xen.
Modify this script to work with your specific distribution.
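If you prefer to apply the workaround by hand, the rename is simply:
mv /lib/tls /lib/tls.disabled
and, to restore the optimized library when running on bare iron again:
mv /lib/tls.disabled /lib/tls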
Mono has a similar problem, but this has been fixed in SUSE Linux 10.1 and
SLES 10. Older or non-SUSE versions of Mono may have a performance impact.
Security
--------
Domain 0 has control over all domains. This means that care should be taken
to keep domain 0 safe; ideally, strip it down to run as little as possible,
preferably with no local users other than the system administrator.
Most commands in domain 0 can only be performed as root, but this protection
scheme only has moderate security and might be defeated. In case domain 0 is
compromised, all other domains are compromised as well.
To allow relocation of VMs (migration), the receiving machine listens on TCP
port 8002. You might want to put firewall rules in place in domain 0 to
restrict this to machines which you trust. You have some access control in
xend-config.sxp as well by tweaking the xend-relocation-hosts-allow
setting. Relocating VMs with sensitive data is not a good idea in untrusted
networks, since the data is not sent encrypted.
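For example, to accept relocation requests only from localhost and one
trusted host (hostname illustrative), set something like this in
/etc/xen/xend-config.sxp:
(xend-relocation-hosts-allow '^localhost$ ^trusted\\.example\\.com$')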
The memory protections for the domUs are effective; so far no way to break out
of a virtual machine is known. A VM is an effective jail.
Limitations
-----------
When booting, Linux reserves data structures matching the number of (virtual)
processors and the amount of RAM found. This has the side-effect that you
can't dynamically grow the virtual hardware beyond what the kernel was booted
with. You can, however, prepare a domU Linux for a larger amount of RAM by
passing the mem= boot parameter.
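For example, to boot a VM with 256 MB but leave room to grow it to 1 GB
later, keep memory = 256 in the configuration file and pass the kernel
parameter via the "extra" line (a sketch; adjust the size to your needs):
extra = "mem=1024M"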
The export of virtual hard disks from files in Xen is handled via the loopback
driver. You can easily run out of those, as by default only 8 loopback devices
are supported. You can change this by inserting:
options loop max_loop=64
into /etc/modprobe.conf.local in domain 0.
Similarly, the netloop driver comes up with 8 virtual network device pairs
(vif0.X - vethX). You can change this by inserting:
options netloop nloopbacks=64
into /etc/modprobe.conf.local in domain 0.
Network Troubleshooting
-----------------------
First ensure the VM server is configured correctly and can access the network.
To start with, it's easiest to disable any firewall on the VM server, but enable
IP_FORWARD in /etc/sysconfig/sysctl (/proc/sys/net/ipv4/ip_forward). If you
want to enable SuSEfirewall2 with bridging, add xenbr0 to a device class, and
set FW_ROUTE and FW_ALLOW_CLASS_ROUTING. Watch for kernel reject messages.
Switch off ifplugd and NetworkManager. These can interfere with the changes
xend makes to the network setup.
Specify a static virtual MAC in the VM's configuration file. Random MACs can
be problematic, since with each boot of the VM it appears that some hardware
has been removed (the previous random MAC) and new hardware is present (the
new random MAC). This can cause network configuration files (which were
intended for the old MAC) to not be matched up with the new virtual hardware.
In the VM's filesystem, ensure the ifcfg-eth* files are named appropriately.
For example, if you do decide to use a randomly-selected MAC for the VM, the
ifcfg-eth* file must not include the MAC in its name; name it generically
("ifcfg-eth0") instead. If you use a static virtual MAC for the VM, be sure
that is reflected in the file's name.
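For example, on a SUSE guest using the static virtual MAC 00:16:3e:00:00:01
(illustrative), the configuration file would be named
/etc/sysconfig/network/ifcfg-eth-id-00:16:3e:00:00:01; with a random MAC,
use the generic /etc/sysconfig/network/ifcfg-eth0 instead.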
Troubleshooting
---------------
First try to get Linux running on bare iron before trying with Xen.
Be sure your Xen hypervisor (xen) and VM kernels (kernel-xen) are compatible.
The hypervisor and domain 0 kernel are a matched set, and usually must be
upgraded together.
If you have trouble early in the boot, try passing pnpacpi=off to the Linux
kernel. If you have trouble with interrupts or timers, passing lapic to Xen
may help. Xen and Linux understand similar ACPI boot parameters. Try the
options acpi=off,force,strict,ht,noirq or acpi_skip_timer_override. Other
useful debugging options to Xen may be nosmp, noreboot, mem=1024M,
sync_console, noirqbalance (Dell). For a complete list of Xen boot options,
consult chapter 10.3 of the Xen users' manual.
If domain 0 Linux crashes on X11 startup, please try to boot into runlevel 3.
To debug Xen or domain 0 Linux crashes or hangs, it may be useful to use the
debug-enabled hypervisor, and to prevent automatic rebooting. Change your
Grub configuration from something like this:
kernel (hd0,5)/xen.gz
To something like this:
kernel (hd0,5)/xen-dbg.gz noreboot
After rebooting, the Xen hypervisor will record any error messages in its
message buffer, which you can view with the "xm dmesg" command.
If problems persist, check if a newer version is available. Well-tested
versions will be shipped with SUSE and via YaST Online Update. More frequent
(but less supported) updates are available on Novell's Forge site:
http://forge.novell.com/modules/xfmod/project/?xenpreview
Known Issues
------------
For a list of known issues and work-arounds, see
http://www.novell.com/documentation/vmserver/.
Disclaimer
----------
Xen performed amazingly well in our tests and proved very stable. Still, you
should be careful when using it, just like you'd be careful if you boot an
experimental kernel. Expect that it may not boot and be prepared to have a
fall-back solution for that scenario. Be prepared that it may not support all
of your hardware. And for the worst of all cases, have your most valuable
data backed up. (This is always a good idea, of course.)
Feedback
--------
If you have remarks about, problems with, ideas for, or praise for Xen,
please send them to the xen-devel list:
xen-devel@lists.xensource.com
If you find issues with the packaging or setup done by Novell/SUSE, please
report them at:
http://www.suse.de/feedback/
ENJOY!
Your Novell SUSE Team.