Thursday, August 04, 2011

Just learned Kdump...

I posted a question about a server crash on the My Oracle Support Community. Someone told me to check the vmcore file (the kernel crash dump) produced by kdump, and I was curious about it.
Kdump is a kexec-based crash dumping mechanism for Linux. So I tried it on my test Linux machine to learn how it works.
# rpm -q kexec-tools
kexec-tools-1.101-164.el5.0.1
*** Make sure the kernel was built with these options ***
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
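
The two options above can also be checked with a small script. This is only a sketch: the function name check_kdump_config is made up for illustration, and it assumes the kernel config is installed as /boot/config-$(uname -r), which may not be true on every system.

```shell
# Report whether the options kdump needs are built into a kernel config file.
# Usage: check_kdump_config [CONFIG_FILE]
# check_kdump_config is a hypothetical helper, not part of kexec-tools.
check_kdump_config() {
    # Assumed default location of the running kernel's config file.
    config="${1:-/boot/config-$(uname -r)}"
    missing=0
    for opt in CONFIG_KEXEC CONFIG_CRASH_DUMP; do
        if grep -q "^${opt}=y" "$config"; then
            echo "${opt}=y"
        else
            echo "${opt} is not set"
            missing=1
        fi
    done
    # Exit status 0 only when both options are enabled.
    return "$missing"
}
```

Calling check_kdump_config with no argument checks the running kernel; passing a path checks another config file.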

Started to test:
# /etc/init.d/kdump status
Kdump is not operational
# /etc/init.d/kdump start
No kdump initial ramdisk found. [WARNING]
Rebuilding /boot/initrd-2.6.18-128.el5kdump.img

Starting kdump: [FAILED]

# /etc/init.d/kdump restart
Stopping kdump: [ OK ]
Starting kdump: [FAILED]
The start fails because memory has to be reserved for the capture kernel by passing the "crashkernel=X@Y" parameter (size X at offset Y) to the kernel. The current entry:
# cat /etc/grub.conf | grep `uname -r`
title Enterprise Linux Enterprise Linux Server (2.6.18-128.el5)
kernel /boot/vmlinuz-2.6.18-128.el5 ro root=LABEL=/ rhgb quiet
initrd /boot/initrd-2.6.18-128.el5.img
Changed:
# cat /etc/grub.conf | grep `uname -r`
title Enterprise Linux Enterprise Linux Server (2.6.18-128.el5)
kernel /boot/vmlinuz-2.6.18-128.el5 ro root=LABEL=/ rhgb quiet crashkernel=128M@16M
initrd /boot/initrd-2.6.18-128.el5.img
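
I made the change above by hand in an editor. A scripted version could look like the sketch below. The function name add_crashkernel is hypothetical, and it only prints the modified file to stdout so the result can be reviewed before anything in /etc/grub.conf is replaced.

```shell
# Append crashkernel=VALUE to the "kernel" line of one kernel version,
# unless that line already has a crashkernel= argument.
# The result goes to stdout; the input file is left untouched.
# Usage: add_crashkernel GRUB_CONF KERNEL_VERSION VALUE
# add_crashkernel is a hypothetical helper for illustration.
add_crashkernel() {
    conf="$1"; kver="$2"; value="$3"
    awk -v kver="$kver" -v value="$value" '
        BEGIN { image = "vmlinuz-" kver }
        # Match only kernel lines whose image path ends in vmlinuz-KERNEL_VERSION.
        $1 == "kernel" &&
        substr($2, length($2) - length(image) + 1) == image &&
        !/crashkernel=/ {
            print $0 " crashkernel=" value
            next
        }
        { print }
    ' "$conf"
}
```

For example, add_crashkernel /etc/grub.conf "$(uname -r)" 128M@16M > /tmp/grub.conf.new writes a candidate file that can be compared with diff before copying it into place.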
Enabled kdump so that it starts at boot:
# chkconfig --level 3 kdump on

# reboot
After reboot ... checked
# /etc/init.d/kdump status
Kdump is operational

# cat /proc/cmdline
ro root=LABEL=/ rhgb quiet crashkernel=128M@16M
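
The same check can be done in a script by pulling the crashkernel value out of the command line. A minimal sketch, where crashkernel_arg is a name I made up and the optional file argument exists only so it can be tried on a saved copy of /proc/cmdline:

```shell
# Print the value of the crashkernel= argument from a kernel command line.
# Returns non-zero when the argument is missing.
# Usage: crashkernel_arg [CMDLINE_FILE]
# crashkernel_arg is a hypothetical helper for illustration.
crashkernel_arg() {
    cmdline_file="${1:-/proc/cmdline}"
    # Put one argument per line, then keep the first crashkernel= value.
    value=$(tr -s ' \t' '\n\n' < "$cmdline_file" |
            sed -n 's/^crashkernel=//p' | head -n 1)
    [ -n "$value" ] && echo "$value"
}
```

On the test machine above, crashkernel_arg would be expected to print 128M@16M.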
Tested (do not do this on a production system):
# echo c > /proc/sysrq-trigger
*** echo c > /proc/sysrq-trigger crashes the kernel on purpose: kexec boots the capture kernel, which writes out a crash dump ***
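
Because this really crashes the machine, a wrapper that refuses to run without an explicit confirmation can help avoid accidents. This is only a sketch: the function name trigger_crash_dump and the confirmation string are my own invention, and the second argument exists so the function can be tried against an ordinary file instead of /proc/sysrq-trigger.

```shell
# Trigger a test crash only when the caller confirms explicitly.
# Usage: trigger_crash_dump CONFIRM [TRIGGER_FILE]
# trigger_crash_dump and the confirmation string are hypothetical.
trigger_crash_dump() {
    confirm="$1"
    trigger="${2:-/proc/sysrq-trigger}"
    if [ "$confirm" != "yes-crash-this-host" ]; then
        echo "refusing: pass 'yes-crash-this-host' to confirm" >&2
        return 1
    fi
    # Flush pending writes before the kernel goes down.
    sync
    echo c > "$trigger"
}
```

Running trigger_crash_dump yes-crash-this-host as root on a test machine has the same effect as the echo command above.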
After reboot:
# last
root pts/0 192.168.1.100 Thu Aug 4 02:45 still logged in
reboot system boot 2.6.18-128.el5 Thu Aug 4 02:43 (00:05)
reboot system boot 2.6.18-128.el5 Thu Aug 4 02:40 (00:01)
The system should reboot twice: first into the capture kernel, which saves the vmcore, and then back into the normal kernel.
# ls /var/crash
2011-08-04-02:40
# ls /var/crash/2011-08-04-02\:40/
vmcore
# file /var/crash/2011-08-04-02\:40/vmcore
/var/crash/2011-08-04-02:40/vmcore: ELF 64-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style
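
When there are several dumps, the newest one can be picked out by directory name, since the names are timestamps that sort in date order. A small sketch, assuming the /var/crash/DATE/vmcore layout shown above (latest_vmcore is a made-up name):

```shell
# Print the path of the newest vmcore under a crash directory.
# Relies on the timestamp-named subdirectories sorting chronologically.
# Usage: latest_vmcore [CRASH_DIR]
# latest_vmcore is a hypothetical helper for illustration.
latest_vmcore() {
    crash_dir="${1:-/var/crash}"
    newest=""
    # The glob expands in sorted order, so the last match is the newest.
    for dir in "$crash_dir"/*/; do
        if [ -f "${dir}vmcore" ]; then
            newest="${dir}vmcore"
        fi
    done
    # Non-zero exit status when no vmcore was found.
    [ -n "$newest" ] && echo "$newest"
}
```

The printed path can then be passed to file or any other tool in place of the path typed by hand.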

# strings /var/crash/2011-08-04-02\:40/vmcore | fgrep -m1 'Linux '
Load an initial ramdisk FILE for a Linux format boot image and set the appropriate parameters in the Linux setup area in memory.
Use the "crash" command to analyse the vmcore file. You also need the kernel-debuginfo packages that match the crashed kernel.
# rpm -ivh kernel-debuginfo-common-2.6.18-128.el5.i686.rpm
Preparing... ########################################### [100%]
1:kernel-debuginfo-common########################################### [100%]
# rpm -ivh kernel-debuginfo-2.6.18-128.el5.i686.rpm
Preparing... ########################################### [100%]
1:kernel-debuginfo ########################################### [100%]

# crash /usr/lib/debug/lib/modules/2.6.18-128.el5/vmlinux /var/crash/2011-08-04-02\:40/vmcore
crash 4.0-3.14
Copyright (C) 2002, 2003, 2004, 2005, 2006 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005 Fujitsu Limited
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...

KERNEL: /usr/lib/debug/lib/modules/2.6.18-128.el5/vmlinux
DUMPFILE: /var/crash/2011-08-04-02:40/vmcore
CPUS: 1
DATE: Thu Aug 4 02:40:00 2011
UPTIME: 00:05:35
LOAD AVERAGE: 2.16, 1.12, 0.47
TASKS: 105
NODENAME: linuxtest01
RELEASE: 2.6.18-128.el5
VERSION: #1 SMP Wed Jan 21 07:58:05 EST 2009
MACHINE: i686 (2298 Mhz)
MEMORY: 1.3 GB
PANIC: "SysRq : Trigger a crashdump"
PID: 2086
COMMAND: "bash"
TASK: f79ea000 [THREAD_INFO: f73ed000]
CPU: 0
STATE: TASK_RUNNING (SYSRQ)

crash> log > log.txt
crash> exit

# ls log.txt
log.txt
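
The same kind of session can be prepared as a command file instead of being typed at the crash> prompt. The sketch below only writes that file. The function name write_crash_commands is hypothetical, and it relies on bt and ps accepting the same "> file" redirection that log does above.

```shell
# Write a crash command file that saves the kernel log, the backtrace
# of the panic task, and the process list into OUTPUT_DIR.
# Usage: write_crash_commands OUTPUT_DIR COMMAND_FILE
# write_crash_commands is a hypothetical helper for illustration.
write_crash_commands() {
    out_dir="$1"; cmd_file="$2"
    cat > "$cmd_file" <<EOF
log > $out_dir/log.txt
bt > $out_dir/bt.txt
ps > $out_dir/ps.txt
exit
EOF
}
```

As far as I know crash can read such a file with its -i option (crash -i COMMAND_FILE vmlinux vmcore), but I have not checked that against the 4.0-3.14 version used here, so treat that invocation as an assumption.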
*** The kernel-debuginfo* packages for Oracle Linux can be found at:
http://oss.oracle.com/ol6/debuginfo/
http://oss.oracle.com/ol5/debuginfo/
Or
http://oss.oracle.com/el6/debuginfo/
http://oss.oracle.com/el5/debuginfo/
