I am trying to test kernel crash dumps with `echo 1 > /proc/sys/kernel/sysrq; echo c > /proc/sysrq-trigger`. On some servers I get the dump; on others nothing is saved. The kdump configuration is the same across the fleet and writes to the local /var/crash directory. When I trigger the crash manually and watch the console, the affected servers reboot straight away without saving the dump. Could a memory issue be preventing kdump from saving the core?

ram

1 Answer

kdump must be "armed" via a specific kdump service; please check that the service started correctly by running `systemctl status kdump`.

In your logs (/var/log/messages), look for entries similar to these:

systemd[1]: Starting Crash recovery kernel arming...
kdumpctl[542051]: kexec: loaded kdump kernel
kdumpctl[542051]: Starting kdump: [OK]
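
Besides the service status and log entries, the kernel's own view of the crash kernel can be read from sysfs. The following is a minimal sketch, assuming the running kernel exposes `/sys/kernel/kexec_crash_loaded` and `/sys/kernel/kexec_crash_size`; the function name `check_kdump_state` and its optional directory argument are hypothetical, added only so the check can be pointed at another location.

```shell
#!/bin/sh
# Sketch: report whether a crash kernel is loaded and how much memory is
# reserved for it. check_kdump_state is a hypothetical helper name; the
# optional first argument (default /sys/kernel) is an illustration-only override.
check_kdump_state() {
    dir="${1:-/sys/kernel}"
    loaded_file="$dir/kexec_crash_loaded"
    size_file="$dir/kexec_crash_size"

    # Bail out if the kernel does not expose the kexec crash entries here.
    if [ ! -r "$loaded_file" ] || [ ! -r "$size_file" ]; then
        echo "kexec crash files not readable under $dir"
        return 2
    fi

    loaded=$(cat "$loaded_file")   # 1 when a crash kernel is loaded
    size=$(cat "$size_file")       # reserved crash memory in bytes

    echo "crash kernel loaded: $loaded, reserved bytes: $size"

    # Succeed only when a crash kernel is loaded and memory is reserved.
    if [ "$loaded" = "1" ] && [ "$size" -gt 0 ]; then
        return 0
    fi
    return 1
}
```

Running `check_kdump_state` on a server where the dump works and on one where it does not gives a quick side-by-side view; a reserved size of 0 would suggest the `crashkernel=` reservation did not take effect on that machine.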
shodanshok
  • I do see the service armed and running, and I also see the above messages in /var/log/messages, so the service looks fine to me. My guess is that when the crash is triggered, it tries to access the dump memory and cannot for some reason. The server starts rebooting before the dump process begins; I don't even see the system start the kdump vmcore save service that is responsible for saving the memory. Is there any way to validate this? – ram Oct 09 '20 at 14:25
  • Can you show the output of `cat /proc/cmdline` ? – shodanshok Oct 09 '20 at 15:01
  • BOOT_IMAGE=(hd0,gpt2)/boot/vmlinuz-4.18.0-147.8.1.el8_1.x86_64 root=UUID=ace6cfb4-a09f-4d6a-b5cd-5830f8117703 ro crashkernel=auto rd.auto=1 skew_tick=1 intel_idle.max_cstates=0 printk.time=0 processor.max_cstate=0 idle=poll biosdevname=0 nmi_watchdog=0 nosoftlockup mce=off rhash_entries=1024 selinux=0 isolcpus=2-29 nohz=on nohz_full=2-29 transparent_hugepage=never pcie_asmp=off nohalt nowatchdog tsc=reliable audit=1 audit_enable=1 skew_tick=1 – ram Oct 12 '20 at 19:07
  • I tried raising the crashkernel value, and that did not work either. – ram Oct 12 '20 at 19:09
  • Ok, can you check that the `/etc/kdump.conf` file is identical between the two servers (the one where `kdump` works vs the one where it does not work)? – shodanshok Oct 12 '20 at 21:58
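
The comparison suggested in the last comment can be sketched as follows, assuming a copy of `/etc/kdump.conf` from each server has been placed on one machine. The helper names `effective_conf` and `compare_kdump_conf` are hypothetical; comments and blank lines are stripped so that only the active directives are compared, and directive order is treated as significant.

```shell
#!/bin/sh
# Sketch: compare the active directives of two kdump.conf copies.
# effective_conf and compare_kdump_conf are hypothetical helper names.

# Print the file with comment lines and blank lines removed.
effective_conf() {
    sed -e '/^[[:space:]]*#/d' -e '/^[[:space:]]*$/d' "$1"
}

# Print "identical" and return 0 when the active directives match,
# otherwise print "different" and return 1.
compare_kdump_conf() {
    a=$(effective_conf "$1")
    b=$(effective_conf "$2")
    if [ "$a" = "$b" ]; then
        echo "identical"
        return 0
    fi
    echo "different"
    return 1
}
```

For example, `compare_kdump_conf working.conf failing.conf`, where both file names are placeholders for the copied configurations.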