parent 69d44e59e9
commit c99563c960

176 README.SUSE
@@ -316,6 +316,180 @@ Machine-specific Notes
Shell> reset


Dump Triggering Methods
=======================

This section describes the ways, other than a kernel panic, in which
Kdump can be triggered. These methods let the user invoke Kdump when
the system is experiencing a hard hang.

1) AltSysRq C

On i386 and x86_64 machines, Kdump can be triggered by pressing the
'Alt', 'SysRq' and 'C' keys together. This method works only on
directly attached consoles, not on remote consoles. It cannot be used
when the machine is hung with interrupts disabled. If any kind of
terminal access is still possible, the same result can be achieved
from the shell command line:

    # echo c > /proc/sysrq-trigger

On PowerPC machines, AltSysRq C can also be used to initiate Kdump if
a directly attached console is available. In addition, Kdump can be
triggered from the Hardware Management Console (HMC) using the 'Ctrl',
'O' and 'C' keys. To use the SysRq method for dump triggering,
/proc/sys/kernel/sysrq must be enabled, which can be done as follows:

    # echo 1 > /proc/sys/kernel/sysrq
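
Before relying on the keyboard method, it can help to check the
current setting. The following is a minimal sketch and not part of the
kdump package: the function name sysrq_state and its optional file
argument are made up for illustration. It assumes the usual kernel
convention that 0 means disabled, 1 means all SysRq functions are
enabled, and any other value is a bitmask of allowed functions.

```shell
# sysrq_state [FILE]
# Hypothetical helper: report how SysRq is configured.
# FILE defaults to /proc/sys/kernel/sysrq; the argument exists only so
# the function can be tried against an ordinary file.
sysrq_state() {
    file=${1:-/proc/sys/kernel/sysrq}
    # Fail with status 2 if the setting cannot be read at all.
    value=$(cat "$file" 2>/dev/null) || return 2
    case "$value" in
        0) echo "disabled"; return 1 ;;
        1) echo "enabled"; return 0 ;;
        # Assumption: any other value is a bitmask of allowed functions.
        *) echo "bitmask $value"; return 0 ;;
    esac
}
```

Calling sysrq_state without arguments reads the live setting. A
bitmask result needs to be checked against the kernel's SysRq
documentation to see whether the crash function is allowed.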

2) Kernel OOPs

To generate a dump every time the kernel Oopses, set the 'Panic On
OOPs' option as follows:

    # echo 1 > /proc/sys/kernel/panic_on_oops
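
Values written under /proc/sys are lost at reboot. One way to keep
them is a line in a sysctl configuration file. The helper below is a
sketch under assumptions: persist_sysctl is a hypothetical name, and
/etc/sysctl.conf is assumed to be the file your system applies at
boot. It appends the setting only if the key is not already present,
so running it twice does not duplicate the line.

```shell
# persist_sysctl KEY VALUE [FILE]
# Hypothetical helper: append "KEY = VALUE" to a sysctl-style file
# unless a line for KEY already exists. FILE defaults to
# /etc/sysctl.conf (an assumption about where boot-time settings live).
persist_sysctl() {
    key=$1
    value=$2
    conf=${3:-/etc/sysctl.conf}
    # The dots in KEY act as regex wildcards here, which is harmless
    # for this purpose. An existing entry is left untouched.
    if grep -q "^[[:space:]]*$key[[:space:]]*=" "$conf" 2>/dev/null; then
        return 0
    fi
    echo "$key = $value" >> "$conf"
}
```

For example, "persist_sysctl kernel.panic_on_oops 1" run as root would
add the line "kernel.panic_on_oops = 1" to the default file.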


3) NMI (Non-Maskable Interrupt) button

When the system is hung and is not accepting keyboard interrupts, the
NMI button can be very useful for triggering Kdump. An NMI button is
present on most of the newer x86 and x86_64 machines. Refer to the
user guide or manual to locate the button, though it is often poorly
documented. In most cases it is hidden behind a small hole on the
front or back panel of the machine. Use a toothpick or some other
non-conducting probe to press the button.

For example, on the IBM X series 366 machine, the NMI button is
located behind a small hole on the bottom center of the rear panel.

To enable dump triggering with the NMI button, set the
'unknown_nmi_panic' option as follows:

    # echo 1 > /proc/sys/kernel/unknown_nmi_panic

Do not enable the NMI watchdog feature together with
unknown_nmi_panic; otherwise the NMIs generated by the watchdog will
panic the system.

4) NMI WATCHDOG

The NMI watchdog is a feature of the x86 and x86_64 kernels that uses
NMIs to monitor whether a CPU has locked up. On i386 machines, the NMI
watchdog can be enabled by passing nmi_watchdog=1 on the kernel
command line. On x86_64 machines, it is enabled by default. To verify
whether your system has been configured with the NMI watchdog, look at
the NMI entry in /proc/interrupts. If the count is greater than zero,
the NMI watchdog has been configured; otherwise it has not.
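
The /proc/interrupts check described above can be sketched as a small
shell function. This is an illustration only: nmi_count is a
hypothetical name, and the code assumes the NMI row starts with the
label "NMI:" followed by one counter per CPU.

```shell
# nmi_count [FILE]
# Hypothetical helper: print the sum of the per-CPU NMI counters.
# FILE defaults to /proc/interrupts; the argument exists only so the
# function can be tried against a saved copy.
nmi_count() {
    file=${1:-/proc/interrupts}
    # Add up every purely numeric field on the "NMI:" row and ignore
    # any trailing description text. Prints 0 if there is no such row.
    awk '$1 == "NMI:" {
             for (i = 2; i <= NF; i++)
                 if ($i ~ /^[0-9]+$/) sum += $i
         }
         END { print sum + 0 }' "$file"
}
```

A result greater than zero matches the "watchdog configured" case
described above.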

Please refer to Documentation/nmi_watchdog.txt in the kernel source
for a more detailed description.

Once this feature has been enabled in the kernel, any lockup will
cause an Oops message to be generated, followed by Kdump being
triggered. This also requires 'Panic On OOPs' to be enabled, as
explained in method 2 above.

Do not enable 'nmi_watchdog' and set
/proc/sys/kernel/unknown_nmi_panic at the same time, as this would
result in a kernel panic from the legitimate NMIs generated by the
nmi_watchdog.
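
The conflicting combination can be detected before it causes an
unwanted panic. The sketch below is an assumption-laden illustration:
nmi_conflict is a hypothetical name, and the watchdog is detected only
through an explicit non-zero nmi_watchdog= option on the kernel
command line. On x86_64, where the text above says the watchdog is on
by default, this check would miss it.

```shell
# nmi_conflict [CMDLINE_FILE] [KNOB_FILE]
# Hypothetical helper: succeed (and print a warning) if nmi_watchdog
# is requested on the kernel command line while unknown_nmi_panic is
# set. The file arguments default to the live /proc entries.
nmi_conflict() {
    cmdline=${1:-/proc/cmdline}
    knob=${2:-/proc/sys/kernel/unknown_nmi_panic}
    # No explicit non-zero nmi_watchdog= option: nothing to report.
    grep -q 'nmi_watchdog=[1-9]' "$cmdline" 2>/dev/null || return 1
    # unknown_nmi_panic not set to 1: nothing to report.
    [ "$(cat "$knob" 2>/dev/null)" = 1 ] || return 1
    echo "warning: nmi_watchdog and unknown_nmi_panic are both enabled"
    return 0
}
```

A script could call "nmi_conflict && exit 1" before enabling the NMI
button method.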


5) PowerPC specific methods:

On IBM PowerPC machines, the following methods of issuing a soft reset
can be used to trigger Kdump. On SLES10 systems, XMON (the debugger)
is turned off by default. To enable XMON, boot the kernel with the
'xmon=on' option. With XMON enabled, issuing a soft reset drops the
user to the XMON prompt, where typing 'X' triggers Kdump. If XMON is
not enabled, a soft reset triggers Kdump directly.

5.1) HMC

The Hardware Management Console (HMC) available on Power4 and Power5
machines allows partitions to be reset remotely. This is especially
useful in hang situations where the system is not accepting any
keyboard input.

Once you have the HMC configured, the following steps will trigger
Kdump via a soft reset:

On Power4
  Using GUI

    * In the right pane, right click on the partition you wish to
      dump.
    * Select "Operating System->Reset".
    * Select "Soft Reset".
    * Select "Yes".

  Using HMC Commandline

    # reset_partition -m <machine> -p <partition> -t soft

On Power5
  Using GUI

    * In the right pane, right click on the partition you wish to
      dump.
    * Select "Restart Partition".
    * Select "Dump".
    * Select "OK".

  Using HMC Commandline

    # chsysstate -m <managed system name> -n <lpar name> -o dumprestart -r lpar

5.2) Blade Management Console for Blade Center

To initiate a dump operation, go to the Power/Restart option under
"Blade Tasks" in the Blade Management Console. Select the blade for
which you want to initiate the dump and then click "Restart blade
with NMI". This will issue a soft reset.

5.3) Control Panel function for a standalone Power5 machine

A standalone machine is one that has no LPARs configured and no HMC
available. In such cases the Control Panel, usually located on the
front panel of the machine (refer to the user guide of the specific
model for details), can be used to trigger a dump if the system has a
hard hang.

The control panel provides many functions for system management
purposes; function 22 invokes a partition dump. This function is
available only in the Manual operating mode.

To check whether the system is operating in manual mode:

    * Select function 1 on the panel
    * Press enter
    * Read the Operating mode from the panel display
    * If it is not 'M', then use function 2 to set it (see below)

To set manual mode:

    * Select function 2 on the panel
    * Press enter
    * The current OS IPL type is displayed with a pointer
    * Press enter to move to the Operating mode
    * Use the increment and decrement buttons to change the mode to M
    * Press enter

To trigger the dump:

    * Select function 22 on the panel
    * Press enter
    * Select function 22 on the panel
    * Press enter

Invoking function 22 twice will issue a soft reset to the machine.


Dump Analysis
=============

@@ -361,7 +535,7 @@ Where:
 vmcore -- The crash dump.

 GDB Helper Script
------------------
+=================

 The GDB-kdump script is provided to simplify use of GDB on dump images.
 The usage is "gdb-kdump [vmcore]".

18 kdump
@@ -77,12 +77,18 @@ get_size_mb()
 # and save the vmcore
 save_core()
 {
+    dumpsize=`get_mem_size`
+    if [ -z "$dumpsize" -o "$dumpsize" = 0 ]; then
+        echo -n "Null size vmcore"
+        rc_status -s
+        rc_failed 6
+        return
+    fi
+
     if [ $KDUMP_KEEP_OLD_DUMPS -gt 0 ]; then
         purge_old_dumps
     fi

-    dumpsize=`get_mem_size`
-
     if [ $KDUMP_FREE_DISK_SIZE -gt 0 ]; then
         restsize=`parse_rest_size "$KDUMP_SAVEDIR"`
         needsize=`expr $dumpsize + $KDUMP_FREE_DISK_SIZE`
@@ -241,12 +247,8 @@ is_crash_kernel ()
     test -f /proc/vmcore || return 1
     # FIXME: any better way to detect crash environment?
     test -n "$CRASH" && return 0
-    case `uname -i` in
-        ia64)
-            # ia64 has no kdump kernel
-            return 1;;
-    esac
-    return 0
+    grep -q elfcorehdr= /proc/cmdline && return 0
+    return 1
 }

 # return success if we have a valid dump on the dump device
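
The revised check in the hunk above can be tried outside the init
script with a small stand-alone sketch. It is an illustration, not the
packaged code: the name in_crash_kernel and the two path arguments are
hypothetical and exist only so the logic can be exercised against
ordinary files. The order of the tests follows the patched
is_crash_kernel: a vmcore file must exist, then either $CRASH is set
or the kernel command line carries an elfcorehdr= parameter.

```shell
# in_crash_kernel [VMCORE_FILE] [CMDLINE_FILE]
# Hypothetical stand-alone variant of the patched check. The arguments
# default to /proc/vmcore and /proc/cmdline.
in_crash_kernel() {
    vmcore=${1:-/proc/vmcore}
    cmdline=${2:-/proc/cmdline}
    # Without a vmcore there is nothing to save.
    test -f "$vmcore" || return 1
    # An explicitly set CRASH variable is taken as a yes.
    test -n "${CRASH:-}" && return 0
    # Otherwise look for the elfcorehdr= parameter on the command line.
    grep -q elfcorehdr= "$cmdline" && return 0
    return 1
}
```

Compared with the removed code, this no longer falls through to
"return 0" on every architecture except ia64, so a normal kernel that
happens to expose a vmcore file is not mistaken for a crash kernel.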
@@ -1,3 +1,14 @@
+-------------------------------------------------------------------
+Wed Mar 14 18:45:27 CET 2007 - tiwai@suse.de
+
+- add detailed description about dump triggering methods to
+  README.SUSE (#250134)
+
+-------------------------------------------------------------------
+Wed Mar 14 15:18:06 CET 2007 - tiwai@suse.de
+
+- improve the check of crash kernel in kdump init script (#252632)
+
 -------------------------------------------------------------------
 Fri Mar 9 21:34:36 CET 2007 - bwalle@suse.de

@@ -19,7 +19,7 @@ Requires: %insserv_prereq %fillup_prereq
 Autoreqprov: on
 Summary: Tools for fast kernel loading
 Version: 1.101
-Release: 82
+Release: 84
 Source: %{name}-%{package_version}.tar.bz2
 Source1: kdump
 Source2: sysconfig.kdump
@@ -128,6 +128,11 @@ true # ignore errors
 %{_sbindir}/kdump-helper

 %changelog
+* Wed Mar 14 2007 - tiwai@suse.de
+- add detailed description about dump triggering methods to
+  README.SUSE (#250134)
+* Wed Mar 14 2007 - tiwai@suse.de
+- improve the check of crash kernel in kdump init script (#252632)
 * Fri Mar 09 2007 - bwalle@suse.de
 - added hint that VGA console doesn't work (#253173)
 * Thu Feb 15 2007 - bwalle@suse.de