Kdump README for SLES 10

Prerequisites
=============

Be sure that you have installed the kexec-tools rpm.  For x86, x86-64
and ppc64, install kernel-kdump.rpm, too.  The version of the
kernel-kdump rpm must match the version of the running system kernel.


Overview
========

Kdump uses kexec to quickly boot to a recovery kernel whenever a dump of
the system kernel's memory needs to be taken (for example, when the
system panics). The system memory image is preserved across the reboot
and is accessible to the recovery kernel. You can use common Linux
commands, such as cp and scp, to copy the memory image to a dump file on
the local host, or across the network to a remote system.

Kdump and kexec are currently supported on the x86, x86_64, and PPC64
architectures.

The system kernel reserves a small section of memory for the capture
kernel at boot time of the system kernel. This ensures that ongoing
Direct Memory Access (DMA) from the system kernel does not corrupt the
capture kernel. The "kexec -p" command loads the capture kernel into
this reserved memory area.

On x86 machines, the first 640 KB of physical memory is needed to boot,
irrespective of where the kernel loads. Therefore, kexec preserves this
region immediately before rebooting into the recovery kernel.

All of the necessary information about the system kernel's core image is
encoded in the ELF format, and stored in a reserved area of memory
before a crash. The physical address of the start of the ELF header is
passed to the recovery kernel through the "elfcorehdr=" boot parameter.

In the capture kernel, you can access the memory image from the system
kernel in two ways:

1) Through a /dev/oldmem device interface. A capture utility can read the
device file and write out the memory in raw format. This is a raw dump
of memory. Analysis and capture tools must be intelligent enough to
determine where to look for the right information.

2) Through /proc/vmcore. This exports the memory dump as an ELF format
file that can be written out using any file copy command such as cp or
scp. Further, you can use analysis tools such as the GNU Debugger (GDB)
or Crash to debug the dump file. This method ensures that the dump pages
are ordered correctly.


Setup of Kdump on SLES 10
=========================

Be sure the prerequisite RPMs are installed.

To enable a crash dump, you need to add an option to the boot loader to
specify the size and offset of the recovery kernel memory area.

An example of this boot loader option is "crashkernel=64M@16M". The 64M
is the size of the memory reserved for the Kdump recovery kernel, and
the 16M is the start offset of the reserved area.  On ia64, the start
offset is calculated by the kernel, so any @offset value is ignored.
 
You can add this option either with the YaST boot loader module, or by
manually editing the boot loader configuration file. 

The recommended values by architecture for the "crashkernel" option are:

i386:   crashkernel=64M@16M
x86_64: crashkernel=64M@16M
ia64:   crashkernel=512M                (on small machines use 256M)
PPC64:  crashkernel=128M@32M
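
As a rough sketch, the table above can be expressed as a small shell
function. The name recommended_crashkernel is hypothetical and not part
of the kdump package; it only maps an architecture name (for example,
the one printed by "uname -m") to the value listed above.

```shell
# Hypothetical helper: print the recommended crashkernel option for an
# architecture name such as the one "uname -m" reports.
recommended_crashkernel() {
    case "$1" in
        i?86)   echo "crashkernel=64M@16M" ;;
        x86_64) echo "crashkernel=64M@16M" ;;
        ia64)   echo "crashkernel=512M" ;;       # 256M on small machines
        ppc64)  echo "crashkernel=128M@32M" ;;
        *)      echo "no recommendation for $1" >&2; return 1 ;;
    esac
}

# Show the recommendation for the running machine, if there is one.
recommended_crashkernel "$(uname -m)" || true
```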

After setting the boot loader option, activate the Kdump init script,
which is not activated by default. To do this, use the YaST System
Services (Runlevel) module. Alternately, enable the service on the
command line with the following command: "/sbin/chkconfig kdump on".

***Warning*** You must activate the kdump service permanently via YaST
or chkconfig as described above.  Starting the kdump service temporarily
(e.g. with "rckdump start") is not sufficient: when a crash occurs, the
system reboots through kexec into the recovery kernel, and a temporary
activation is lost at that boot stage.

After enabling the Kdump init script, reboot the system so that the
Kdump kernel image is loaded properly.
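
After the reboot, one way to confirm that the boot loader option took
effect is to look for it in /proc/cmdline. The following is a minimal
sketch; the function name crashkernel_option is hypothetical and only
used for illustration.

```shell
# Hypothetical helper: print the crashkernel= option found in a kernel
# command line string, or fail if there is none.
crashkernel_option() {
    for word in $1; do
        case "$word" in
            crashkernel=*) echo "$word"; return 0 ;;
        esac
    done
    return 1
}

# Check the command line of the running kernel.
if [ -r /proc/cmdline ]; then
    crashkernel_option "$(cat /proc/cmdline)" \
        || echo "crashkernel= is not set on the running kernel"
fi
```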

***Warning*** The following procedure will crash your system. Shut down
all applications and ensure that no users are logged on before
performing this test.

Test your Kdump setup by issuing the following commands as the root
user:

# sync
# echo u > /proc/sysrq-trigger       (remount file systems read-only to
                                      avoid recovery after reboot)
# echo c > /proc/sysrq-trigger

After the system recovers, verify that a vmcore file was generated in
the save dump directory. By default the vmcore file is located in
/var/log/dump/<date-string>.
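
A minimal sketch for locating the newest dump is shown below. It
assumes that the date-string directory names sort chronologically (as
in 2006-02-21-13:20); the function name latest_vmcore is hypothetical.

```shell
# Hypothetical helper: print the path of the newest vmcore below the
# save directory, relying on the sortable date-string directory names.
latest_vmcore() {
    savedir=${1:-/var/log/dump}
    newest=$(ls -1 "$savedir" 2>/dev/null | sort | tail -n 1)
    [ -n "$newest" ] && [ -f "$savedir/$newest/vmcore" ] || return 1
    echo "$savedir/$newest/vmcore"
}

latest_vmcore /var/log/dump || echo "no vmcore found"
```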

When a crash occurs, the kernel crash handler starts the recovery
kernel that the Kdump init script loaded earlier. The recovery kernel
runs in the reserved memory and boots up to the $KDUMP_RUNLEVEL
runlevel.

During the boot of the recovery kernel, the Kdump init script loads
again, but this time it dumps the core image for later analysis.

When a crash happens in a graphical environment, you will likely have no
GUI in the second kernel boot. If you used a VGA console, you might
still have visual output from the secondary kernel. The default behavior
of the Kdump script is to save the old vmcore image, and then reboot the
system immediately. You can adjust the behavior of the Kdump script
through sysconfig variables described later in this document.


The Default Dumper
==================

By default, the Kdump script saves the vmcore file to a unique
sub-directory consisting of $KDUMP_SAVEDIR and the date string, such as
/var/log/dump/2006-02-21-13:20/vmcore.

Before copying the vmcore file, the default dumper does some system
checks. First, it checks the number of old dump directories and removes
them if there are more than $KDUMP_KEEP_OLD_DUMPS. Then, the dumper
checks the free disk space in the partition of the dump directory. If
the free space is less than the sum of the memory size and the value
given in $KDUMP_FREE_DISK_SIZE, then the dumper will not create a dump.
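
The free space rule can be sketched as follows. This is an illustration
of the comparison described above, not the code of the actual init
script; the function name enough_dump_space is hypothetical and all
values are assumed to be in megabytes.

```shell
# Hypothetical helper; all three arguments are in megabytes.
# Succeeds when the dump may be written.
enough_dump_space() {
    free_mb=$1        # free space in the dump partition
    mem_mb=$2         # size of main memory
    min_free_mb=$3    # value of KDUMP_FREE_DISK_SIZE
    # A value of 0 disables the check (see KDUMP_FREE_DISK_SIZE).
    [ "$min_free_mb" -eq 0 ] && return 0
    [ "$free_mb" -ge $((mem_mb + min_free_mb)) ]
}

# Example: 5000 MB free, 4096 MB of memory, 64 MB to be kept free.
enough_dump_space 5000 4096 64 && echo "dump would be saved"
```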

$KDUMP_RUNLEVEL specifies the runlevel of the Kdump (recovery) kernel
boot. When $KDUMP_IMMEDIATE_REBOOT is set to yes, then the init script
automatically reboots after saving the vmcore. By default, the dumper
uses KDUMP_RUNLEVEL=1 and KDUMP_IMMEDIATE_REBOOT=yes, in order to reduce
the possible risk of disk corruption in the recovery kernel environment.

If you want Kdump to run more complex jobs than set by the default
dumper configuration, set the name of the appropriate command or script
to be run via $KDUMP_TRANSFER, and change $KDUMP_RUNLEVEL and
$KDUMP_IMMEDIATE_REBOOT.

For example, setting $KDUMP_TRANSFER="scp /proc/vmcore remote:/dump" and
KDUMP_RUNLEVEL=3 will make Kdump act like a netdump. You can set
KDUMP_IMMEDIATE_REBOOT=no to prevent the immediate reboot. This could be
useful to check the system over the network, for example.
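
Put together, the corresponding entries in /etc/sysconfig/kdump might
look like the following excerpt. The target "remote:/dump" is only the
placeholder from the example above and must be replaced with a host and
path that exist in your environment.

```shell
# /etc/sysconfig/kdump (excerpt)
# "remote:/dump" is a placeholder, not a real host or path.
KDUMP_TRANSFER="scp /proc/vmcore remote:/dump"
KDUMP_RUNLEVEL="3"
KDUMP_IMMEDIATE_REBOOT="no"
```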

Note that the available memory size for the recovery kernel is limited.
Setting KDUMP_RUNLEVEL=5 (graphical login) is not recommended.


Initrd-based Dump Saving
========================

The problem with the procedure mentioned above is that your root file
system (or whatever partition your KDUMP_SAVEDIR is in) may be corrupted.
In that case, the script may be unable to mount the device and cannot
save the dump to disk.

To handle this, you can set KDUMP_DUMPDEV to an unused partition that is
large enough to hold the dump, i.e. larger than the system's main
memory. Before mounting the root file system, the init script writes
the dump to that device. After rebooting, the normal boot script saves
the dump from that device to KDUMP_SAVEDIR. Because the data has
already been saved to disk, you can safely turn off the computer and/or
repair the file system with an appropriate tool (for example, booting
from a CD to do so is no problem).

After changing that value, you must re-run mkinitrd for the kdump
kernel, or for all kernels.


Tuning parameters
=================

You can adjust the basic behavior of the Kdump script by editing the
/etc/sysconfig/kdump file. Edit the script values with the YaST runlevel
System Services editor, or manually edit the /etc/sysconfig/kdump file,
and then restart the kdump service.


Generic options
---------------

- KDUMP_KERNELVER

This is the kernel version string for the Kdump kernel; an example is
"2.6.16-5-kdump". The init script will use a kernel named
/boot/vmlinux-$KDUMP_KERNELVER. The kdump configuration file is
/etc/sysconfig/kdump.

If you do not specify a version, then the init script will try to find a
Kdump kernel with the same version number as the running kernel. Using
the string "kdump" will default to the most recently installed Kdump
kernel (suitable for x86, x86-64 and ppc64).  For ia64, keep this
string empty to use the same kernel as the running system.


- KDUMP_COMMANDLINE

This sets the command string to be passed to the Kdump kernel. This will
usually match the contents of the grub kernel line. An example is
KDUMP_COMMANDLINE="ro root=LABEL=/".

If you do not give a command line, then the default will be taken from
/proc/cmdline.


- KDUMP_COMMANDLINE_APPEND

Set this variable if you only want to _append_ values to the default
command line string. The string is also appended if KDUMP_COMMANDLINE
is set.


- KEXEC_OPTIONS

You can use this to pass additional arguments to kexec. For i386 and
x86-64, you likely need to pass "--args-linux" here.


- KDUMP_RUNLEVEL

This is the runlevel that the Kdump kernel boots to. The default is "1". 
To enable network support in the Kdump recovery environment, set this to
"3".


- KDUMP_IMMEDIATE_REBOOT

This option specifies whether to reboot immediately after saving the
core in the Kdump kernel. This option is ignored when KDUMP_DUMPDEV is
set to a non-empty string. The default is "yes".


- KDUMP_TRANSFER

This is an option to execute a script or command to process or transfer
the dump image. It can read the dump image either through /proc/vmcore
or /dev/oldmem. An empty string will use the default dumper.


Options for the Default Dumper
------------------------------

- KDUMP_SAVEDIR

This option specifies the path to the directory where the dumps are
saved. The default is "/var/log/dump". See also KDUMP_DUMPDEV if you
want to save the dump on a raw device first, which helps if your root
file system is corrupted.


- KDUMP_DUMPDEV

Specifies the dump device that is used for saving the dump in the kdump
kernel.  Specifying a dump device is optional. If none is given, the
dump is written to KDUMP_SAVEDIR when booting from the kdump kernel.

If KDUMP_DUMPDEV points to a device file, the dump is written to that
device when booting from the kdump kernel. The advantage over writing
directly to KDUMP_SAVEDIR is that you don't have to mount the root file
system (which may be corrupted!) just to write the dump. On the first
normal boot which is able to successfully mount the root file system,
the dump is saved to KDUMP_SAVEDIR as usual.

Important: The KDUMP_DUMPDEV is overwritten by kdump, so don't use it
for saving any data. Also don't use the currently used swap partition.


- KDUMP_KEEP_OLD_DUMPS

This option specifies how many previous dumps are kept. If the number of
saved dump files exceeds this number, the dumper removes older dumps. 
You can prevent automatic removal by setting this to "0" (zero). The
default value is "5".
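
The pruning rule can be sketched as follows. This is an illustration
only, not the code of the default dumper: the function name
old_dumps_to_remove is hypothetical, and it assumes that the dump
directory names sort chronologically. It only prints the directories
that exceed the limit and does not delete anything.

```shell
# Hypothetical helper: list the oldest dump directories that exceed
# the KDUMP_KEEP_OLD_DUMPS limit. Nothing is removed here.
old_dumps_to_remove() {
    savedir=$1
    keep=$2
    # 0 disables automatic removal.
    [ "$keep" -eq 0 ] && return 0
    total=$(ls -1 "$savedir" | wc -l)
    excess=$((total - keep))
    [ "$excess" -gt 0 ] || return 0
    # Date-string names sort oldest first.
    ls -1 "$savedir" | sort | head -n "$excess"
}
```

For example, "old_dumps_to_remove /var/log/dump 5" prints the
directories that would have to go to get back to five dumps.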


- KDUMP_FREE_DISK_SIZE

This specifies the minimum free disk space in megabytes of the dump
partition. If the free disk space is less than the sum of this value and
the memory size, then the default dumper will not save the vmcore file
in order to prevent disk corruption. Setting this option to "0" (zero)
forces the dumper to dump without checking the size. The default value
is "64".


- KDUMP_VERBOSE

Determines if kdump uses verbose output. This value is a bitmask:

  1: kdump command line is written to system log when executing
     /etc/init.d/kdump
  2: progress is written to stdout while dumping 
  4: kdump command line is written to standard output when executing
     /etc/init.d/kdump
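
Because the value is a bitmask, the individual values can be added up,
so "3" enables both 1 and 2. The following sketch decodes a value; the
function name describe_kdump_verbose is hypothetical.

```shell
# Hypothetical helper: list what each bit set in KDUMP_VERBOSE enables.
describe_kdump_verbose() {
    mask=$1
    [ $((mask & 1)) -ne 0 ] && echo "1: command line written to system log"
    [ $((mask & 2)) -ne 0 ] && echo "2: progress written to stdout while dumping"
    [ $((mask & 4)) -ne 0 ] && echo "4: command line written to standard output"
    return 0
}

describe_kdump_verbose 3
```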


Machine-specific Notes
======================

 - IA64
   o On some Hewlett Packard platforms, for example the HP rx3600, you
     need 'machvec=dig' in KDUMP_COMMANDLINE_APPEND.

   o On SGI SN2 machines, kdump doesn't work when the VGA console
     is active. To disable the VGA console, execute the following
     commands in the EFI shell:

         Shell> set NoVGA 1
         Shell> reset


Dump Triggering Methods
=======================

This section talks about the various ways, other than a Kernel Panic,
in which Kdump can be triggered.  These methods will enable the user
to invoke Kdump in cases where the system is experiencing a hard
hang.

1) AltSysRq C

On i386 and x86_64 machines, Kdump can be triggered with the
combination of the 'Alt','SysRq' and 'C' keyboard keys.  This method
will work only on directly attached consoles, and not on remote
consoles.  In cases where the machine is in a hung state with
interrupts disabled, AltSysRq C cannot be used.  If any kind of
terminal access is still possible, the same result may be achieved
from the shell command line like so:

       # echo c > /proc/sysrq-trigger

On PowerPC machines, AltSysRq C can also be used to initiate Kdump if a
directly attached console is available.  In addition, Kdump can be
triggered via the Hardware Management Console (HMC) using the 'Ctrl',
'O' and 'C' keyboard keys.  In order to use the SysRq method for dump
triggering, /proc/sys/kernel/sysrq needs to be enabled, which can be
done as follows:

       # echo 1 > /proc/sys/kernel/sysrq

2) Kernel OOPs

If you want to generate a dump every time the kernel oopses, you can
achieve this by setting the 'Panic On OOPs' option as follows:

    # echo 1 > /proc/sys/kernel/panic_on_oops


3) NMI (Non-Maskable Interrupt) button

In cases where the system is in a hung state and is not accepting
keyboard interrupts, using the NMI button to trigger Kdump can be very
useful.  An NMI button is present on most of the newer x86 and x86_64
machines.  Please refer to the user guides or manuals to locate the
button, though in most cases it is not very well documented. It is
usually hidden behind a small hole on the front or back panel of the
machine.  You could use a toothpick or some other non-conducting probe
to press the button.

For example, on the IBM X series 366 machine, the NMI button is
located behind a small hole on the bottom center of the rear panel.

To enable this method of dump triggering using the NMI button, you
need to set the 'unknown_nmi_panic' option as follows:

   # echo 1 > /proc/sys/kernel/unknown_nmi_panic

When enabling unknown_nmi_panic, be careful not to enable the NMI
watchdog feature at the same time; otherwise the system will panic.

4) NMI WATCHDOG

The NMI watchdog is a feature available in the x86 and x86_64 kernels
which uses NMIs to monitor whether a CPU has locked up.  On i386
machines, the NMI watchdog can be enabled by passing nmi_watchdog=1 on
the kernel command line.  On x86_64 machines, it is enabled by
default.  To verify whether your system has been configured with the
NMI watchdog, look at the NMI entry in /proc/interrupts.  If the count
is greater than zero, the NMI watchdog has been configured; otherwise
it is
not.
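
The check of the NMI entry can be sketched as follows. The function
name nmi_watchdog_counting is hypothetical; it assumes a line starting
with "NMI:" followed by one counter per CPU, as found in
/proc/interrupts.

```shell
# Hypothetical helper: succeed if any per-CPU NMI counter in a file in
# /proc/interrupts format is greater than zero.
nmi_watchdog_counting() {
    awk '$1 == "NMI:" {
             for (i = 2; i <= NF; i++)
                 if ($i ~ /^[0-9]+$/ && $i > 0) seen = 1
         }
         END { exit !seen }' "${1:-/proc/interrupts}"
}

if [ -r /proc/interrupts ]; then
    if nmi_watchdog_counting; then
        echo "NMI count is greater than zero"
    else
        echo "NMI count is zero"
    fi
fi
```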

Please refer to Documentation/nmi_watchdog.txt in the kernel source
for a more detailed description.

Once this feature has been enabled in the kernel, any lockup will
result in an Oops message being generated, followed by Kdump being
triggered.  This also requires 'Panic On OOPs' to be enabled as
explained in method 2 above.

Please refrain from simultaneously enabling 'nmi_watchdog' and setting
/proc/sys/kernel/unknown_nmi_panic, as this would result in a Kernel
Panic from legitimate NMIs generated by the nmi_watchdog.


5) PowerPC specific methods:

On IBM PowerPC machines, the following methods to issue a soft reset
can be used to trigger Kdump.  On SLES10 systems, XMON (debugger) is
turned off by default.  To enable XMON, boot the kernel with the
'xmon=on' option.  With XMON enabled, issuing a soft reset will drop
the user to the XMON prompt, where typing 'X' will trigger Kdump.  If
XMON is not enabled, a soft reset will directly trigger Kdump.

5.1) HMC

The Hardware Management Console (HMC) available on Power4 and Power5
machines allows partitions to be reset remotely.  This is especially
useful in hang situations where the system is not accepting any
keyboard input.

Once you have HMC configured, the following steps will enable you to
trigger Kdump via a soft reset:

On Power4
  Using GUI

    * In the right pane, right click on the partition you wish to
      dump. 
    * Select "Operating System->Reset".
    * Select "Soft Reset".
    * Select "Yes".

  Using HMC Commandline

    # reset_partition -m <machine> -p <partition> -t soft

On Power5
  Using GUI

    * In the right pane, right click on the partition you wish to
      dump. 
    * Select "Restart Partition".
    * Select "Dump".
    * Select "OK".

  Using HMC Commandline

    # chsysstate -m <managed system name> -n <lpar name> -o dumprestart -r lpar

5.2) Blade Management Console for Blade Center

To initiate a dump operation, go to the Power/Restart option under
"Blade Tasks" in the Blade Management Console.  Select the blade for
which you want to initiate the dump and then click "Restart blade with
NMI".  This will issue a soft reset.

5.3) Control Panel function for a standalone Power5 machine

A standalone machine is one which does not have any LPARs configured
and also does not have an HMC available.  In such cases the Control
Panel, usually located on the front panel of the machine (please refer
to the user guide of the specific model for details), can be used for
dump triggering in case the system has a hard hang.

The control panel provides many functions for System Management
purposes; Function 22 is meant for invoking a Partition dump.  This
function is available only in the Manual operating mode.

To check if the system is operating in manual mode,

    * Select function 1 on the panel
    * Press enter
    * Read the Operating mode from the panel display
    * If it is not 'M', then use function 2 to set it (see below)

To set manual mode:

    * Select function 2 on the panel
    * Press enter
    * The current OS IPL type is displayed with a pointer
    * Press enter to move to the Operating mode
    * Use increment, decrement buttons to change the mode to M
    * Press enter

To trigger the dump:

    * Select function 22 on the panel
    * Press enter
    * Select function 22 on the panel
    * Press enter

Invoking function 22 twice will issue a soft reset to the machine.


Dump Analysis
=============

Dump analysis can be performed using GDB or the Crash utility. The Crash
utility is included in the crash RPM package. You must install a
debug-info kernel matching the version of the system kernel (of the
system where the dump was collected) on the system where the analysis is
to be performed. The debug-info kernel provides symbol and type
information that Crash and GDB use. You can find kernel debug
information RPMs on the SUSE support Web site. Alternately, you can
build a debug-info kernel from source by enabling the
CONFIG_DEBUG_INFO kernel configuration option.

Even if you install kernel-debuginfo, you need to uncompress the kernel
image first. This depends on the architecture on which your system is
running. If you don't know, just run "uname -i" to get the architecture.

On i586, i686, x86_64, s390 and s390x, you have to unpack the kernel
image:

   $ gunzip /boot/vmlinux-<version>.gz

On IA64, the default kernel image is already a gzip'ed vmlinux image.
Run

   $ zcat /boot/vmlinuz-<version> > /boot/vmlinux-<version>

On PPC and PPC64, you don't have to do anything, because the boot loader
already loads the vmlinux image.
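
The per-architecture steps above can be summarized in a small sketch.
The function name vmlinux_prepare_command is hypothetical; it only
prints the command described above for a given architecture and kernel
version and does not run it. The i?86 pattern is an assumption meant
to cover i586 and i686.

```shell
# Hypothetical helper: print the command that prepares an uncompressed
# vmlinux for the given architecture and kernel version.
vmlinux_prepare_command() {
    arch=$1
    version=$2
    case "$arch" in
        i?86|x86_64|s390|s390x)
            echo "gunzip /boot/vmlinux-$version.gz" ;;
        ia64)
            echo "zcat /boot/vmlinuz-$version > /boot/vmlinux-$version" ;;
        ppc|ppc64)
            echo "# nothing to unpack on $arch" ;;
        *)
            echo "unknown architecture: $arch" >&2; return 1 ;;
    esac
}
```

For example, "vmlinux_prepare_command ia64 <version>" prints the zcat
command line shown above with the version filled in.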

The symbol information in the debug-info kernel may differ from the
running kernel; therefore, when running crash against a vmcore you
should specify both the System.map file and the debug-info kernel.
For example, to run crash against a vmcore use the following command
line:

    $ crash /boot/System.map-<version> /boot/vmlinux-<version> vmcore

Where:
  /boot/System.map-<version> -- The map file matching the kernel
                                being analyzed.
  /boot/vmlinux-<version>    -- The matching kernel.
  vmcore                     -- The crash dump.

GDB Helper Script
=================

The GDB-kdump script is provided to simplify use of GDB on dump images. 
The usage is "gdb-kdump [vmcore]".

The argument is the vmcore dump image to analyze. If you do not give an
argument, then the latest dump image will be taken. The script starts
GDB with the vmlinux of the currently running kernel. The script assumes
that the vmlinux file is at /boot/vmlinux-$kernel. If the script finds
only a gzip-compressed file, the file is automatically uncompressed.

Note that you need the kernel-<versionnumber>-debuginfo package, which
provides the debug symbols. GDB-kdump also reads some useful macros for the Kdump
image, originally provided in /usr/src/linux/Documentation/kdump, at
startup. The following macros then become available: bttnobp, btt,
btpid, trapinfo, and dmesg. See the help topic of each command in GDB
for details.