
I am running OEL 6.4 (a RHEL clone), and rsyncing large files over ssh nightly. During the rsync, I regularly (most nights) get a kernel panic:

Message from syslogd@cheshire at Mar 24 00:39:01 ...
 kernel:Oops: 0000 [#1] SMP

Message from syslogd@cheshire at Mar 24 00:39:01 ...
 kernel:Stack:

Message from syslogd@cheshire at Mar 24 00:39:01 ...
 kernel:Call Trace:

Message from syslogd@cheshire at Mar 24 00:39:01 ...
 kernel:Code: 00 55 48 89 e5 48 83 ec 10 48 89 1c 24 4c 89 64 24 08 66 66 66 66 90 41 89 d4 48 89 f3 e8 cf 23 fe ff 41 83 fc 01 48 89 c2 75 1a

Message from syslogd@cheshire at Mar 24 00:39:01 ...
 kernel:CR2: 0000000000000000

Message from syslogd@cheshire at Mar 24 00:39:01 ...
 kernel:Kernel panic - not syncing: Fatal exception

Is there any information in the above that might give me a clue to the reason or is it all generic? There appears to be nothing unusual in /var/log/messages at the point of the crash.


edit:

I should have mentioned that I'm using ocfs2 (local, not clustered). The files being transferred are the backing files for VMs, and they are not in use at the time of the transfer: they are 'reflink' copies taken purely for the purpose of the rsync. The OS is up to date with patches.

  • I've seen a similar thing on servers where there's been a bug in the file system driver. What file system are you using? Are you using the latest version of your kernel/drivers? – Jenny D Mar 25 '13 at 08:12
  • Great question. Sorry I didn't mention it; I'll edit the question. –  Mar 25 '13 at 09:36
  • @JennyD, I switched to btrfs and had the same issue. Then I switched to LVM snapshots and the crashes went away. Still no idea why, unfortunately. –  May 06 '13 at 08:19

1 Answer


A kernel panic can have several causes, for instance:

  • ssh daemon
  • conflict between files being updated / another service using them
  • NIC interface ...

To help identify the source of the problem, I'd try a few `rsync --dry-run` runs (a dry run doesn't copy anything).
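A minimal sketch of such a dry run, assuming a hypothetical source directory and target host (`SRC`, `DEST` and the wrapper function name are placeholders, not taken from the question):

```shell
# Hypothetical values: replace with the real source directory and target.
SRC="${SRC:-/vmstore/snapshots/}"
DEST="${DEST:-backup@backuphost:/backups/vm/}"
RSYNC="${RSYNC:-rsync}"   # overridable, e.g. to point at a different rsync build

# Walk the file list over ssh without transferring any file data.
#   --dry-run          perform a trial run with no changes made
#   --itemize-changes  print what would have been updated
rsync_dry_run() {
    "$RSYNC" --archive --dry-run --itemize-changes -e ssh "$SRC" "$DEST"
}
```

Calling `rsync_dry_run` a few times at the usual hour still scans the source tree and talks to the remote end over ssh, but it doesn't read or write file contents. If the box survives that, the file data transfer itself becomes the more likely suspect.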

I also remember reading a while ago that file systems mounted with the `noatime` option could cause problems, and that `relatime` is the better choice.
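To see which of these options a file system is actually mounted with, something like the following should work. It is only a sketch: the helper name `mount_opts` and the `/vmstore` mount point are made up for illustration.

```shell
# Print the mount options of a mount point, read from /proc/mounts.
# The second argument is optional and only exists so that another
# file in the same format can be inspected instead.
mount_opts() {
    awk -v mp="$1" '$2 == mp { print $4 }' "${2:-/proc/mounts}"
}
```

For example, `mount_opts /vmstore` prints the comma-separated option list for that mount point; look for `noatime` or `relatime` in it.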

I'd also try rsync with the `--delay-updates` option, which minimizes the time window during which the files are actually being updated.
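As a rough sketch, with the same kind of hypothetical source and target placeholders (none of these names come from the question):

```shell
# Hypothetical values: replace with the real source directory and target.
SRC="${SRC:-/vmstore/snapshots/}"
DEST="${DEST:-backup@backuphost:/backups/vm/}"
RSYNC="${RSYNC:-rsync}"   # overridable, e.g. to point at a different rsync build

# Transfer over ssh, keeping each updated file in a holding directory
# and renaming everything into place only at the end of the run.
rsync_delayed() {
    "$RSYNC" --archive --delay-updates -e ssh "$SRC" "$DEST"
}
```

Per the rsync man page, the updated files are held until the end of the transfer, so the target needs enough free space for them, and the option cannot be combined with `--inplace`.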

This is what comes to mind right now. I'll update the answer if something else rings a bell.

Déjà vu
  • Thanks for the suggestions. I'm using OCFS2 (I should have mentioned that earlier, sorry) and `atime` (not `noatime` or `relatime`). –  Mar 25 '13 at 09:47
  • I'm not quite sure what `--delay-updates` does, but it [isn't compatible](http://linux.die.net/man/1/rsync) with `--inplace`, which I think I need in order to avoid doubling the storage requirements at the target end (some of the files being transferred are hundreds of GB in size). –  Mar 26 '13 at 12:16