Kernel Dump configuration
This page describes how to configure open sharedroot clusters with kernel dump functionality enabled.
RHEL5 based and above:
With the introduction of RHEL5, the way kernel memory is dumped to disk after a kernel crash changed completely. The old kernel modules diskdump and netdump were replaced by a generic mechanism: a second kernel is booted into a memory segment that was reserved at boot time. Whenever the system crashes, this new kernel is booted in that segment with the kexec utilities. Because a completely new kernel is booted and can load any type of module needed, this is a more reliable and flexible way of writing vmcores to many different storage devices.
Configuration:
Bootloader:
First, the kernel has to be given a boot parameter so that it reserves the memory segment holding the rescue kernel. Specify the parameter with your favourite bootloader. The parameter is called crashkernel and the advised default is 128M@16M. This means the kernel reserves 128 MB of RAM starting at an offset of 16 MB. A valid kernel command line for an open sharedroot cluster could look as follows:
[root@test ~]# cat /proc/cmdline
root=/dev/vg_streaming_sr/lv_sharedroot rhgb quiet com-debug crashkernel=128M@16M
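To check that the parameter actually reached the running kernel, the command line can be inspected from a script. The following is a minimal sketch; the function names crashkernel_param and crashkernel_split are hypothetical helpers introduced here for illustration, not part of kexec-tools.

```shell
# Print the value of crashkernel= from a kernel command line string,
# or nothing if the parameter is absent.
# $1: the command line, e.g. the contents of /proc/cmdline
crashkernel_param() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^crashkernel=//p'
}

# Split a SIZE@OFFSET value into "SIZE OFFSET".
# A value without an offset is printed unchanged.
crashkernel_split() {
    case "$1" in
        *@*) printf '%s %s\n' "${1%%@*}" "${1#*@}" ;;
        *)   printf '%s\n' "$1" ;;
    esac
}
```

On a running node this could be called as crashkernel_split "$(crashkernel_param "$(cat /proc/cmdline)")". An empty result would suggest that the bootloader entry does not carry the parameter yet.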
Additional software:
As a second prerequisite, it is a good idea to have the debuginfo RPMs and the crash utility for the kernels installed. If you are using the core_collector (see below), these RPMs are required. To install them on RHEL5, run the following:
yum --enablerepo=\*debuginfo install kernel-debuginfo.$(uname -m) crash
Disk preparation:
For the time being, it is assumed that there is a disk partition formatted with ext3 that can hold the crash images. This could also be reconfigured to be an NFS share or the like, but that is not yet tested. If the partition is /dev/sdd1, the filesystem would be created with mkfs.ext3 -L crash /dev/sdd1.
/etc/sysconfig/kdump:
The only option that needs to be configured in this file is the location of the kdump kernel and initrd. This is necessary because an open sharedroot cluster has no /boot filesystem mounted. We therefore create a directory called /var/lib/kdump and set the option KDUMP_BOOTDIR in /etc/sysconfig/kdump to point to it:
#Where to find the boot image
KDUMP_BOOTDIR="/var/lib/kdump"
Two files influence how kdump behaves: /etc/sysconfig/kdump (described above) and /etc/kdump.conf (described next).
/etc/kdump.conf:
This file holds the more interesting kdump parameters. The relevant ones are described below.
<fs type> <partition>: Will mount -t <fs type> <partition> /mnt and copy /proc/vmcore to /mnt/var/crash/%DATE/. NOTE: <partition> can be a device node, label or UUID. The modules needed to mount the filesystem must also be specified (see extra_modules).
extra_modules <module(s)>: This directive allows you to specify extra kernel modules that you want to be loaded in the kdump initrd, typically used to set up access to non-boot-path dump targets that might otherwise not be accessible in the kdump environment. Multiple modules can be listed, separated by a space, and any dependent modules will automatically be included. NOTE: Even for ext3 you need to list ext3 and jbd.
core_collector makedumpfile <options>: This directive allows you to use the dump filtering program makedumpfile to retrieve your core, which on some architectures can drastically reduce core file size. See /sbin/makedumpfile --help for a list of options. NOTE: The -i and -g options are not needed here, as the initrd will automatically be populated with a config file appropriate for the running kernel. The table below shows which page types each dump level (-d) excludes from the dump:
dump level   zero page   cache page   cache private   user data   free page
0
1            X
2                        X
4                        X            X
8                                                     X
16                                                                X
31           X           X            X               X           X
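The dump levels listed above are powers of two, and 31 is their sum, which suggests that a level can be read as a bitmask of excluded page types. The following sketch decodes a level under that assumption. The function name describe_dump_level and the output labels are made up for illustration, and the rule that bit 4 also excludes plain cache pages is taken from the row for level 4.

```shell
# Print the page types excluded by a given makedumpfile dump level,
# assuming the level is a bitmask as the table suggests.
# $1: dump level (0..31)
describe_dump_level() {
    level=$1
    out=""
    if [ $((level & 1)) -ne 0 ]; then out="$out zero"; fi
    # Levels 2 and 4 both exclude cache pages (see the rows for 2 and 4).
    if [ $((level & 6)) -ne 0 ]; then out="$out cache"; fi
    if [ $((level & 4)) -ne 0 ]; then out="$out cache-private"; fi
    if [ $((level & 8)) -ne 0 ]; then out="$out user"; fi
    if [ $((level & 16)) -ne 0 ]; then out="$out free"; fi
    # Drop the leading space; print "none" when nothing is excluded.
    out=${out# }
    printf '%s\n' "${out:-none}"
}
```

For example, describe_dump_level 31 lists all five page types, which matches the last row of the table.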
Cluster configuration changes:
The fence post fail delay (post_fail_delay) should be raised so that a dump can be written before the node is fenced. The maximum time advised by Red Hat is somewhere around 30 seconds. The maximum dump time can be estimated as follows, assuming an average disk write rate of 50 MB/sec. With fully utilized memory, the dump level does not influence the time. Nevertheless, using level 31 is on average the fastest.
time [sec] = RAM size [MB] / 50 MB/sec
6 GB RAM (6144 MB) would result in a maximum of about 123 sec (2 min 3 sec).
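The estimate above can be sketched as a small shell function. The name estimate_dump_seconds is hypothetical, and the default rate of 50 MB/sec is the average write rate assumed in this section. Measure your own dump target before relying on the result.

```shell
# Estimate the worst-case dump time in seconds, rounded up.
# $1: RAM size in MB
# $2: sustained write rate in MB/sec (optional, defaults to 50)
estimate_dump_seconds() {
    ram_mb=$1
    rate=${2:-50}
    # Integer ceiling division: (a + b - 1) / b
    echo $(( (ram_mb + rate - 1) / rate ))
}
```

Calling estimate_dump_seconds 6144 reproduces the 123 seconds from the 6 GB example. The result gives a lower bound for the fence delay if a full dump is to complete before the node is fenced.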
Example /etc/kdump.conf:
ext3 LABEL=crash
core_collector makedumpfile -d 31
extra_modules cciss ext3 jbd
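Since the filesystem module has to be listed in extra_modules even for ext3, a quick consistency check on the file can catch a missing entry before a crash does. The following sketch assumes a kdump.conf with one directive per line and only recognises ext2 and ext3 targets. The function name check_kdump_modules is hypothetical, and dependent modules such as jbd are not checked.

```shell
# Read a kdump.conf from stdin and check that the filesystem type of the
# dump target also appears in extra_modules.
# Prints "ok", "missing: <fstype>" or "no filesystem target".
check_kdump_modules() {
    awk '
        # Collect every module named on the extra_modules line.
        $1 == "extra_modules" { for (i = 2; i <= NF; i++) mods[$i] = 1; next }
        # Remember the filesystem type of the dump target line.
        $1 ~ /^(ext2|ext3)$/ { fstype = $1 }
        END {
            if (fstype == "") { print "no filesystem target"; exit }
            if (fstype in mods) print "ok"; else print "missing: " fstype
        }
    '
}
```

It could be run as check_kdump_modules < /etc/kdump.conf. Comment lines are ignored because their first field matches neither pattern.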
Example /etc/sysconfig/kdump:
# Kernel Version string for the -kdump kernel, such as 2.6.13-1544.FC5kdump
# If no version is specified, then the init script will try to find a
# kdump kernel with the same version number as the running kernel.
KDUMP_KERNELVER=""

# The kdump commandline is the command line that needs to be passed off to
# the kdump kernel. This will likely match the contents of the grub kernel
# line. For example:
#   KDUMP_COMMANDLINE="ro root=LABEL=/"
# If a command line is not specified, the default will be taken from
# /proc/cmdline
KDUMP_COMMANDLINE=""

# This variable lets us append arguments to the current kdump commandline
# As taken from either KDUMP_COMMANDLINE above, or from /proc/cmdline
KDUMP_COMMANDLINE_APPEND="irqpoll maxcpus=1"

# Any additional kexec arguments required. In most situations, this should
# be left empty
#
# Example:
#   KEXEC_ARGS="--elf32-core-headers"
KEXEC_ARGS=" --args-linux"

#Where to find the boot image
KDUMP_BOOTDIR="/var/lib/kdump"

#What is the image type used for kdump
KDUMP_IMG="vmlinuz"

#What is the images extension. Relocatable kernels don't have one
KDUMP_IMG_EXT=""
Example console output:
dm_task_set_name: Device /dev/mapper/Groups-Volume not found
Command failed
connect() failed on local socket: Connection refused
Skipping clustered volume group vg_streaming_data
Skipping clustered volume group vg_streaming_sr
dm_task_set_name: Device /dev/mapper/-LV not found
Command failed
connect() failed on local socket: Connection refused
Skipping clustered volume group vg_streaming_data
Skipping clustered volume group vg_streaming_sr
connect() failed on local socket: Connection refused
Skipping clustered volume group vg_streaming_data
Skipping clustered volume group vg_streaming_sr
dm_task_set_name: Device /dev/mapper/-lv_tmp not found
Command failed
Saving to the local filesystem /dev/cciss/c0d0p2
e2fsck 1.38 (30-Jun-2005)
crash: clean, 13/977280 files, 67781/1954320 blocks
[100 %]
The dumpfile is saved to /mnt//var/crash/127.0.0.1-2008-02-18-17:59:55/vmcore-incomplete.
makedumpfile Completed.
Saving core complete