A Linux kernel crash can have several causes, including hangs, hardware faults and software errors. We usually treat a kernel "hang" and a kernel "crash" as just a 'crash'. In fact, these are two entirely different issues: a "hang" occurs when the system stops responding, typically because of a time-consuming operation, whilst a "crash" occurs instantaneously and leads to a reboot. However, during the crash, prior to the reboot, the kernel logs "oops" messages.
In this article, I will focus on installing the tools used to analyze a Linux kernel crash. I will elaborate on Linux kernel errors in a future article. Right now, we will look at the installation of Kdump (kernel dump), a Linux kernel dumping mechanism which uses the 'kexec mechanism' to collect a 'dump' of the Linux kernel's memory called "vmcore" (virtual memory core). The state of the kernel at the time of the crash is recorded in the "vmcore" for later analysis.
"Kdump uses kexec to quickly boot to a dump-capture kernel whenever a dump of the system kernel's memory needs to be taken (for example, when the system panics). The system kernel's memory image is preserved across the reboot and is accessible to the dump-capture kernel." - Kernel.org
Follow the steps below:
1. On both CentOS 6 and CentOS 7, install the kexec-tools package using the command yum install kexec-tools
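2. Open /boot/grub/grub.conf with vim and, in the entry for the kernel you are actually running, replace the parameter crashkernel=auto with crashkernel=128M (I tested it on a virtual machine with 1024 MB of RAM). There must be no spaces around the = sign. On CentOS 7, which uses GRUB 2, the parameter is set in GRUB_CMDLINE_LINUX in /etc/default/grub instead, and the configuration is regenerated with grub2-mkconfig -o /boot/grub2/grub.cfg.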
3. Start the kdump service using the command service kdump start (on CentOS 7 this is redirected to systemctl start kdump).
4. Make sure the parameter edited in step 2 is saved, then verify it using the command cat /proc/cmdline. The new value only appears once the machine has booted with it, and the output should then contain crashkernel=128M.
5. You can list the configuration files that kdump installs using the command rpm -qc kexec-tools
6. You can also choose the location where your vmcore is saved. By default, it is saved in /var/crash/. However, if your /var directory is on a separate partition with little free disk space, you can choose exactly where the vmcore is written by changing the path directive (path /var/crash) in the /etc/kdump.conf file.
7. After modification, you will need to restart the kdump service using the command service kdump restart.
8. Finally, crash the machine deliberately to create a vmcore. As root, use the command echo c > /proc/sysrq-trigger. This will take some time, and the server will then reboot by itself. The crash simulation is complete.
9. After the reboot, you will notice that a vmcore file has been created under the /var/crash directory, in a timestamped subdirectory.
10. The size of the vmcore depends on the circumstances of the crash. In this simulation it is just 19 MB. It also depends on the kernel activity at the time of the crash.
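Before triggering the crash in step 8, it can help to confirm that the earlier steps actually took effect. The following is a minimal sketch and not part of the original procedure: has_crashkernel and dump_path are hypothetical helper names, and the script assumes the usual /proc/cmdline, /sys/kernel/kexec_crash_loaded and /etc/kdump.conf locations. It only reads files and never triggers a crash.

```shell
# has_crashkernel [FILE]: succeeds if the kernel command line in FILE
# (default /proc/cmdline) contains a crashkernel= parameter.
has_crashkernel() {
    grep -q 'crashkernel=' "${1:-/proc/cmdline}" 2>/dev/null
}

# dump_path [FILE]: prints the first uncommented "path" directive from a
# kdump.conf-style file (default /etc/kdump.conf), or /var/crash if none is set.
dump_path() {
    p=$(awk '$1 == "path" { print $2; exit }' "${1:-/etc/kdump.conf}" 2>/dev/null)
    echo "${p:-/var/crash}"
}

if has_crashkernel; then
    echo "crashkernel parameter: present"
else
    echo "crashkernel parameter: missing (edit the boot entry and reboot)"
fi

# This sysfs file reads 1 once a dump-capture kernel has been loaded.
loaded=$(cat /sys/kernel/kexec_crash_loaded 2>/dev/null)
echo "dump-capture kernel loaded: ${loaded:-unknown}"

echo "vmcore directory: $(dump_path)"
```

If the first line reports the parameter as missing or the second reports 0, the crash in step 8 will most likely reboot the machine without leaving a vmcore behind.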
- You can also specify crashkernel=auto on a 64-bit machine. Alternatively, you can calculate the value yourself as follows:
- If your RAM is greater than 0 GB and less than 2 GB use 128 MB
- If your RAM is greater than 2 GB and less than 6 GB use 256 MB
- If your RAM is greater than 6 GB and less than 8 GB use 512 MB and so on
- You can also test with less than 128 MB; it may work, but reliability and consistency are not guaranteed.
- If the kdump service does not start after a fresh installation, you might need to reboot your machine.
- Since a portion of the memory is now reserved for kdump, you might need to reboot your machine again and check the result with free -m; the total memory reported should be lower by roughly the reserved amount.
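The sizing rule from the notes above can be written as a small shell function. This is a sketch only: suggest_crashkernel is a hypothetical name, machines with exactly 2, 6 or 8 GB are assumed to fall into the lower band, and because the bands above 8 GB are not spelled out here, the function falls back to auto in that case.

```shell
# suggest_crashkernel RAM_MB: prints a crashkernel= value for the given
# amount of RAM in MB, following the rule of thumb from the notes.
suggest_crashkernel() {
    ram_mb=$1
    if [ "$ram_mb" -le 2048 ]; then
        echo "128M"
    elif [ "$ram_mb" -le 6144 ]; then
        echo "256M"
    elif [ "$ram_mb" -le 8192 ]; then
        echo "512M"
    else
        # Assumption: the bands above 8 GB are unspecified, so defer to auto.
        echo "auto"
    fi
}

# MemTotal is reported in kB and is slightly below the installed RAM,
# so a 2 GB machine still lands in the first band.
if [ -r /proc/meminfo ]; then
    total_mb=$(awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo)
    echo "crashkernel=$(suggest_crashkernel "$total_mb")"
fi
```

The printed value is what would go on the kernel line in place of crashkernel=auto.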