forked from pool/kexec-tools
parent 69d44e59e9
commit c99563c960
176
README.SUSE
@@ -316,6 +316,180 @@ Machine-specific Notes
   Shell> reset
 
 
+Dump Triggering Methods
+=======================
+
+This section describes the various ways, other than a kernel panic,
+in which Kdump can be triggered. These methods will enable the user
+to invoke Kdump in cases where the system is experiencing a hard
+hang.
+
+1) AltSysRq C
+
+On i386 and x86_64 machines, Kdump can be triggered with the
+combination of the 'Alt', 'SysRq' and 'C' keyboard keys. This method
+will work only on directly attached consoles, and not on remote
+consoles. In cases where the machine is in a hung state with
+interrupts disabled, AltSysRq C cannot be used. If any kind of
+terminal access is still possible, the same result may be achieved
+from the shell command line like so:
+
+    # echo c > /proc/sysrq-trigger
+
+On PowerPC machines, AltSysRq C can also be used to initiate Kdump if a
+directly attached console is available. In addition, Kdump can also
+be triggered via the Hardware Management Console (HMC) using the 'Ctrl', 'O'
+and 'C' keyboard keys. In order to use the SysRq method for dump
+triggering, /proc/sys/kernel/sysrq needs to be enabled, which can be
+done as follows:
+
+    # echo 1 > /proc/sys/kernel/sysrq
+
+2) Kernel OOPs
+
+If we want to generate a dump every time the kernel OOPSes, we can
+achieve this by setting the 'Panic On OOPs' option as follows:
+
+    # echo 1 > /proc/sys/kernel/panic_on_oops
+
+
+3) NMI (non-maskable interrupt) button
+
+In cases where the system is in a hung state, and is not accepting
+keyboard interrupts, using the NMI button to trigger Kdump can be very
+useful. An NMI button is present on most of the newer x86 and x86_64
+machines. Please refer to the user guides/manuals to locate the
+button, though on most occasions it is not very well documented. In
+most cases it is hidden behind a small hole on the front or back panel
+of the machine. You could use a toothpick or some other
+non-conducting probe to press the button.
+
+For example, on the IBM X series 366 machine, the NMI button is
+located behind a small hole on the bottom center of the rear panel.
+
+To enable this method of dump triggering using the NMI button, you will
+need to set the 'unknown_nmi_panic' option as follows:
+
+    # echo 1 > /proc/sys/kernel/unknown_nmi_panic
+
+When enabling unknown_nmi_panic, please be careful not to enable the NMI
+watchdog feature, else the system will panic.
+
+4) NMI Watchdog
+
+The NMI watchdog is a feature available in the x86 and x86_64 kernels
+which uses NMIs to monitor whether a CPU has locked up. On i386
+machines, the NMI watchdog can be enabled by passing nmi_watchdog=1 on the
+kernel command line. On x86_64 machines, this is enabled by
+default. To verify whether your system has been configured with the NMI
+watchdog, look at the NMI entry in /proc/interrupts. If the count is
+greater than zero, the NMI watchdog has been configured; otherwise it is
+not.
+
+Please refer to Documentation/nmi_watchdog.txt in the kernel source
+for a more detailed description.
+
+Once this feature has been enabled in the kernel, any lockups will
+result in an OOPs message being generated, followed by Kdump being
+triggered. This also requires 'Panic On OOPs' to be enabled as
+explained in method 2 above.
+
+Please refrain from simultaneously enabling 'nmi_watchdog' and setting
+/proc/sys/kernel/unknown_nmi_panic, as this would result in a kernel
+panic from legitimate NMIs generated by the nmi_watchdog.
+
+
+5) PowerPC specific methods:
+
+On IBM PowerPC machines, the following methods to issue a soft reset
+can be used to trigger Kdump. On SLES10 systems, XMON (the debugger) is
+turned off by default. Users who wish to enable XMON can do
+so by booting the kernel with the 'xmon=on' option. With XMON enabled,
+issuing a soft reset will drop the user to the XMON prompt, where
+typing 'X' will trigger Kdump. If XMON is not enabled, a soft
+reset will directly trigger Kdump.
+
+5.1) HMC
+
+The Hardware Management Console (HMC) available on Power4 and Power5
+machines allows partitions to be reset remotely. This is especially
+useful in hang situations where the system is not accepting any
+keyboard input.
+
+Once you have the HMC configured, the following steps will enable you to
+trigger Kdump via a soft reset:
+
+  On Power4
+    Using the GUI
+
+      * In the right pane, right click on the partition you wish to
+        dump.
+      * Select "Operating System->Reset".
+      * Select "Soft Reset".
+      * Select "Yes".
+
+    Using the HMC command line
+
+      # reset_partition -m <machine> -p <partition> -t soft
+
+  On Power5
+    Using the GUI
+
+      * In the right pane, right click on the partition you wish to
+        dump.
+      * Select "Restart Partition".
+      * Select "Dump".
+      * Select "OK".
+
+    Using the HMC command line
+
+      # chsysstate -m <managed system name> -n <lpar name> -o dumprestart -r lpar
+
+5.2) Blade Management Console for Blade Center
+
+To initiate a dump operation, go to the Power/Restart option under "Blade
+Tasks" in the Blade Management Console. Select the corresponding
+blade for which you want to initiate the dump and then click "Restart
+blade with NMI". This will issue a soft reset.
+
+5.3) Control Panel function for a standalone Power5 machine
+
+A standalone machine is one which does not have any LPARs configured
+and also does not have an HMC available. In such cases the Control
+Panel, usually located on the front panel of the machine (please refer
+to the user guide of the specific model for details), can be used for
+dump triggering in case the system has a hard hang.
+
+The control panel provides many functions for system management
+purposes; function 22 is meant for invoking a partition dump. This
+function is available only in the Manual operating mode.
+
+To check if the system is operating in manual mode:
+
+    * Select function 1 on the panel
+    * Press enter
+    * Read the Operating mode from the panel display
+    * If it is not 'M', then use function 2 to set it (see below)
+
+To set manual mode:
+
+    * Select function 2 on the panel
+    * Press enter
+    * The current OS IPL type is displayed with a pointer
+    * Press enter to move to the Operating mode
+    * Use the increment and decrement buttons to change the mode to M
+    * Press enter
+
+To trigger the dump:
+
+    * Select function 22 on the panel
+    * Press enter
+    * Select function 22 on the panel
+    * Press enter
+
+Invoking function 22 twice will issue a soft reset to the machine.
+
+
 Dump Analysis
 =============
 
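
Note on the hunk above: methods 1 to 4 each depend on a file under /proc/sys/kernel being set beforehand. The following is a minimal sketch of how those settings might be summarized in one place; it is not part of this commit. The names read_flag and check_dump_triggers, and the optional directory argument, are assumptions made for illustration. The argument exists only so the function can be tried against a scratch directory rather than the live /proc.

```shell
# Sketch only: summarize the sysctl switches named in README.SUSE.
# read_flag and check_dump_triggers are illustrative names, not part
# of kexec-tools or the kdump init script.

# Print the first line of a file, or "unset" if it cannot be read.
read_flag()
{
    if [ -r "$1" ]; then
        head -n 1 "$1"
    else
        echo "unset"
    fi
}

check_dump_triggers()
{
    # Hypothetical override: defaults to the live sysctl directory.
    dir="${1:-/proc/sys/kernel}"

    sysrq=$(read_flag "$dir/sysrq")
    oops=$(read_flag "$dir/panic_on_oops")
    nmi=$(read_flag "$dir/unknown_nmi_panic")

    # Method 1: the README enables SysRq by writing 1.
    echo "sysrq=$sysrq"
    # Methods 2 and 4: an OOPs only leads to a dump if it panics.
    echo "panic_on_oops=$oops"
    # Method 3: the NMI button relies on unknown NMIs causing a panic.
    echo "unknown_nmi_panic=$nmi"

    if [ "$oops" != 1 ]; then
        echo "note: an OOPs will not trigger Kdump"
    fi
    if [ "$nmi" = 1 ]; then
        echo "note: do not enable the NMI watchdog at the same time"
    fi
}
```

Called without an argument, check_dump_triggers reads the live settings; it only reads and never changes them.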
@@ -361,7 +535,7 @@ Where:
 vmcore -- The crash dump.
 
 GDB Helper Script
------------------
+=================
 
 The GDB-kdump script is provided to simplify use of GDB on dump images.
 The usage is "gdb-kdump [vmcore]".
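
Note on method 4 in README.SUSE: the text suggests looking at the NMI entry in /proc/interrupts and treating a count greater than zero as a sign that the NMI watchdog is configured. A rough sketch of that check follows; it is not part of this commit. The function names nmi_count and nmi_watchdog_active and the optional file argument are hypothetical. The check follows the README's heuristic only: other NMI sources, such as the NMI button, increment the same counters.

```shell
# Sketch only: apply the README's /proc/interrupts heuristic.
# nmi_count and nmi_watchdog_active are illustrative names.

# Print the sum of the per-CPU counters on the "NMI:" line.
# Prints 0 if the file has no such line.
nmi_count()
{
    awk '$1 == "NMI:" {
             for (i = 2; i <= NF; i++)
                 if ($i ~ /^[0-9]+$/)
                     sum += $i
         }
         END { print sum + 0 }' "${1:-/proc/interrupts}"
}

# Succeed if at least one NMI has been counted.
nmi_watchdog_active()
{
    [ "$(nmi_count "$1")" -gt 0 ]
}
```

For example, `nmi_watchdog_active && echo "NMIs are being delivered"` checks the running system.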
18
kdump
@@ -77,12 +77,18 @@ get_size_mb()
 # and save the vmcore
 save_core()
 {
+    dumpsize=`get_mem_size`
+    if [ -z "$dumpsize" -o "$dumpsize" = 0 ]; then
+        echo -n "Null size vmcore"
+        rc_status -s
+        rc_failed 6
+        return
+    fi
+
     if [ $KDUMP_KEEP_OLD_DUMPS -gt 0 ]; then
         purge_old_dumps
     fi
 
-    dumpsize=`get_mem_size`
-
     if [ $KDUMP_FREE_DISK_SIZE -gt 0 ]; then
         restsize=`parse_rest_size "$KDUMP_SAVEDIR"`
         needsize=`expr $dumpsize + $KDUMP_FREE_DISK_SIZE`
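
Note on the hunk above: the new guard in save_core() bails out before any old dumps are purged when get_mem_size reports nothing or zero. The following sketch shows the same idea as a standalone predicate; it is a variation, not the code from this commit. The name dump_size_ok is hypothetical, and unlike the committed test it also rejects non-numeric input and avoids the `-o` operator of `[`.

```shell
# Sketch only: succeed when the argument is a positive whole number.
# dump_size_ok is an illustrative name, not a function of the kdump
# init script.
dump_size_ok()
{
    case "$1" in
        ''|*[!0-9]*)
            # Empty, or contains something other than digits.
            return 1
            ;;
    esac
    [ "$1" -gt 0 ]
}
```

A caller could then write `dump_size_ok "$dumpsize" || return` ahead of any step that deletes or overwrites existing dumps.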
@@ -241,12 +247,8 @@ is_crash_kernel ()
     test -f /proc/vmcore || return 1
     # FIXME: any better way to detect crash environment?
     test -n "$CRASH" && return 0
-    case `uname -i` in
-    ia64)
-        # ia64 has no kdump kernel
-        return 1;;
-    esac
-    return 0
+    grep -q elfcorehdr= /proc/cmdline && return 0
+    return 1
 }
 
 # return success if we have a valid dump on the dump device
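
Note on the hunk above: after this change, is_crash_kernel() treats the system as a capture kernel when /proc/vmcore exists and either $CRASH is set or the kernel command line carries an elfcorehdr= parameter. The sketch below restates that logic with the two file paths passed in as optional arguments, so it can be tried without booting a crash kernel. The function name in_crash_kernel and both arguments are assumptions for illustration; the committed function hard-codes the paths.

```shell
# Sketch only: same decision order as is_crash_kernel() after this
# commit, with hypothetical path arguments for experimentation.
in_crash_kernel()
{
    vmcore="${1:-/proc/vmcore}"
    cmdline="${2:-/proc/cmdline}"

    # Without a vmcore file there is nothing to save.
    test -f "$vmcore" || return 1
    # An explicit CRASH variable counts as a crash environment.
    test -n "$CRASH" && return 0
    # Otherwise look for elfcorehdr= on the kernel command line.
    grep -q 'elfcorehdr=' "$cmdline" && return 0
    return 1
}
```

Unlike the removed `uname -i` test, this makes no assumption about the architecture.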
@@ -1,3 +1,14 @@
+-------------------------------------------------------------------
+Wed Mar 14 18:45:27 CET 2007 - tiwai@suse.de
+
+- add detailed description about dump triggering methods to
+  README.SUSE (#250134)
+
+-------------------------------------------------------------------
+Wed Mar 14 15:18:06 CET 2007 - tiwai@suse.de
+
+- improve the check of crash kernel in kdump init script (#252632)
+
 -------------------------------------------------------------------
 Fri Mar 9 21:34:36 CET 2007 - bwalle@suse.de
 
@@ -19,7 +19,7 @@ Requires: %insserv_prereq %fillup_prereq
 Autoreqprov: on
 Summary: Tools for fast kernel loading
 Version: 1.101
-Release: 82
+Release: 84
 Source: %{name}-%{package_version}.tar.bz2
 Source1: kdump
 Source2: sysconfig.kdump
@@ -128,6 +128,11 @@ true # ignore errors
 %{_sbindir}/kdump-helper
 
 %changelog
+* Wed Mar 14 2007 - tiwai@suse.de
+- add detailed description about dump triggering methods to
+  README.SUSE (#250134)
+* Wed Mar 14 2007 - tiwai@suse.de
+- improve the check of crash kernel in kdump init script (#252632)
 * Fri Mar 09 2007 - bwalle@suse.de
 - added hint that VGA console doesn't work (#253173)
 * Thu Feb 15 2007 - bwalle@suse.de