I'm looking for a way to take a non-intrusive coredump of a running process on Linux.

I'm familiar with gdb's gcore, but that can only be run when gdb is attached to the process and it's stopped for debugging. For a big core dump that might mean many seconds, or even a few minutes, of interrupted execution.

Is there any non-blocking alternative?

Linux supports copy-on-write memory, which it relies upon to support fork() without exec(). So I'm thinking of something kernel-level, where the kernel takes a copy-on-write snapshot of the page tables of the process being dumped, then writes the core out while the original process keeps on running.

I'm pretty sure I could use gdb to force a fork() then dump the child while the parent carries on happily, then wait() in the parent to reap the child after termination. It's messy, though, and still requires two interruptions of the parent process, albeit short ones.
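To make the fork() idea concrete, here is a minimal in-process sketch in Python. It is not a real core dump and does not involve gdb: it only shows that a forked child keeps a copy-on-write view of memory as it was at fork time, so it can write that view out while the parent carries on. The function name `dump_snapshot`, the `buf` argument and the output path are hypothetical, chosen for illustration.

```python
import os


def dump_snapshot(buf: bytearray, path: str) -> int:
    """Fork and let the child write its copy-on-write view of buf to path.

    Returns the child's PID. The caller must reap it later with
    os.waitpid(), which is the second (short) interruption.
    """
    pid = os.fork()
    if pid == 0:
        # Child: its memory is a snapshot taken at fork time, so later
        # writes by the parent are not visible here.
        status = 1
        try:
            with open(path, "wb") as f:
                f.write(buf)
            status = 0
        finally:
            # Exit without running the parent's cleanup handlers.
            os._exit(status)
    # Parent: returns immediately and keeps running.
    return pid
```

A caller would invoke `dump_snapshot(state, path)`, continue modifying `state`, and call `os.waitpid(pid, 0)` at some convenient later point. The file then holds the contents of `state` as of the fork, not the later modifications. Only the calling thread exists in the child, so per-thread state is not captured by this sketch.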

Surely someone's needed this before?

Craig Ringer
  • I am sorry that I can give only a single upvote for this wonderful question. – peterh Sep 11 '14 at 07:36
  • Excellent question and I, for one, am looking forward to the answer. +1 from me – thanasisk Sep 11 '14 at 07:58
  • What about 1) attaching to the process with gdb, 2) making it fork with a "call fork" command, 3) dumping the core of the child process, 4) letting the parent reap the dead child (another "call wait4"), 5) detaching from the process, and 6) automating steps 1-5? Gdb uses simple sys_ptrace() system calls, so this could be a fairly simple C tool, entirely independent of gdb. – peterh Sep 11 '14 at 08:09
  • @PeterHorvath That's the option I outlined at the end as a workaround, yes. It seems like there must be a better way though, given that all the infrastructure is there... – Craig Ringer Sep 11 '14 at 09:26
  • On a virtual machine you could take a snapshot and bring that up as a clone to be analyzed. Perhaps one of the tools listed here will help you: http://www.cyberciti.biz/programming/linux-memory-forensics-analysis-tools/ – Giovanni Tirloni Sep 12 '14 at 17:37
  • @gtirloni Interesting idea ... but a *very large hammer* for this little nail. – Craig Ringer Sep 12 '14 at 17:40
  • @CraigRinger With sys_ptrace and similar calls, you can read the memory of another process as well. From /proc/<pid>/maps, you can read the memory map of that process. Maybe these would be enough to implement a new tool for the task. It seems nobody has done this so far. Alternatively, I think it would be possible to call a modified version of the kernel coredump code, attached to a signal handler. Although it would require a patch to the core kernel, it might be the simplest approach. – peterh Sep 16 '14 at 08:30
  • @PeterHorvath The problem with that is that you'll get an inconsistent snapshot, because the process will continue to run and modify its memory while you're copying. – Barmar Sep 16 '14 at 19:04
  • @Barmar Processes can be blocked by sys_ptrace as well. They can't override this block. – peterh Sep 17 '14 at 08:16
  • But he's looking for a way to do it without blocking the process. – Barmar Sep 17 '14 at 08:49
  • You could avoid the second interruption by having the child process also fork and then exit. Then the parent process can wait for the child immediately and then continue, while the grandchild dumps core. – kasperd Sep 28 '14 at 00:24
  • @kasperd That's a useful improvement on the basic approach. – Craig Ringer Sep 28 '14 at 12:01
  • Could you checkpoint it with blcr? I believe checkpointing is non-blocking. Then somehow use gcore after cr_restart'ing? – gogators Oct 09 '14 at 19:47
  • One problem with fork is that it will not duplicate the thread states; you'd need to capture these from the original process. You'd probably need to freeze the process anyway, grab all the thread states, fork to generate the copy-on-write data, then release the original process and save the memory of the new process as the dump. Should be doable, but I don't know of an implementation of this. – Baruch Even Oct 14 '14 at 09:07
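
The double-fork variant suggested in the comments can be sketched in the same in-process style. This is an illustration under stated assumptions, not a working core dumper: `dump_snapshot_detached`, `buf` and the temporary-file naming are hypothetical, and only memory is captured, not the thread states mentioned above.

```python
import os


def dump_snapshot_detached(buf: bytearray, path: str) -> None:
    """Double-fork so the caller can reap the intermediate child at once.

    The grandchild holds the copy-on-write snapshot and writes it out.
    It is reparented once the intermediate child exits, so the caller
    never has to wait for the dump itself to finish.
    """
    pid = os.fork()
    if pid == 0:
        try:
            if os.fork() == 0:
                # Grandchild: write to a temporary name, then rename, so
                # that path only ever appears with complete contents.
                tmp = path + ".tmp"
                with open(tmp, "wb") as f:
                    f.write(buf)
                os.replace(tmp, path)
        finally:
            # Both the intermediate child and the grandchild end here.
            os._exit(0)
    # Caller: the intermediate child exits right after its own fork,
    # so this wait is short.
    os.waitpid(pid, 0)
```

The caller is interrupted once, for the two forks and a brief wait, and then continues while the grandchild writes the file in the background. The trade-off is that the caller no longer learns directly whether the dump succeeded; it has to check for the output file.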

1 Answer

Google CoreDumper springs to mind. It makes a copy-on-write copy of the process's address space; see WriteCoreDump() (the "Notes" section).

EricM
  • That looks exceedingly useful! I wonder what the underlying technique used is. Presumably it ptraces the process, but the creation of the CoW snapshot without forking and in a way that doesn't affect the stack(s) would be challenging. I'll have to take a look at the code. Great tip. – Craig Ringer Nov 12 '14 at 04:38
  • Looks like it's in-process only, unfortunately, and can't be invoked via gdb or similar because it requires ptrace itself. So it's a bit like the debughelp DLL under Windows, rather than like a non-blocking gcore, but still very handy looking. I guess it'd be possible to use it via an LD_PRELOAD hook: set up a signal handler with gdb, detach, and signal the process. But it doesn't look like it's really designed to dump unmodified programs, and it has the issue shared by any in-process dump tool: if the process is messed up enough, the dump won't work. – Craig Ringer Nov 12 '14 at 04:49
  • Sorry… I missed the "non-intrusive" bit when I first read the question. – EricM Nov 12 '14 at 06:49