3

My application usually crashes and prints a stack trace to its log when it receives a segfault signal.

But in some environments, 'dmesg' shows segfault messages related to my application even though the running process was started well before those messages were logged.

Can a segfault be suppressed so that the application never receives the signal? If not, what else could these dmesg errors mean?

noonex
  • 228
  • 1
  • 10

4 Answers

3

Programs that run in the background can make good use of handling a SIGSEGV, if only to log the fact that it happened along with the context prior to exiting. This gives not only an indication of what went wrong in a log file, but also useful information to include in a bug report. Yes, the signal can be ignored, but this is only through deliberate action and is almost always a bad idea (unless you are testing under an experimental kernel with a known buggy vmm subsystem).

Unfortunately, once that signal is caught, ANYTHING is suspect. For instance, using anything that allocates memory within the SEGV handler is very likely a bad idea. The same goes for functions that are not async-signal-safe, such as printf(). So yes, an app may be handling the signal without doing so effectively, which is why you only see traces of it in dmesg.

Anyway, yes, the signal is sent to the application. However, SIGSEGV is not a real-time signal, so pending instances are not queued and can be merged by the kernel. I.e., if a program accesses memory it has no right to access 15 times, there's a very good chance that only one SEGV will actually be delivered, depending on the timing of the illegal memory accesses.
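
To illustrate the merging, here is a minimal sketch, assuming a POSIX system and a single-threaded program. The names count_segv and count_merged_segv are made up for this example, and raise() stands in for a real illegal memory access. The signal is sent several times while it is blocked, and the handler should run only once after it is unblocked.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Number of times the handler has actually run. */
static volatile sig_atomic_t segv_deliveries = 0;

static void count_segv(int signo)
{
    (void)signo;
    segv_deliveries++;
}

/* Sends SIGSEGV `sends` times while it is blocked, then unblocks it.
 * Returns how many times the handler ran, or -1 on a setup error.
 * Assumes SIGSEGV was not already blocked by the caller. */
int count_merged_segv(int sends)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask;
    int i;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_segv;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old_sa) != 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGSEGV);
    if (sigprocmask(SIG_BLOCK, &block, &old_mask) != 0)
        return -1;

    segv_deliveries = 0;
    for (i = 0; i < sends; i++)
        raise(SIGSEGV);            /* blocked, so it only becomes pending */

    /* Restoring the old mask unblocks SIGSEGV; the pending signal is
     * delivered before sigprocmask() returns. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGSEGV, &old_sa, NULL);
    return (int)segv_deliveries;
}
```

Calling count_merged_segv(15) should report a single delivery, because a standard signal is either pending or not; there is no count attached to it.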

In SEGV handlers, open(), write() and close() are your friends. Write to a special debug log (i.e., not a logging FILE stream that may have been opened previously).
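
A minimal sketch of such a handler, assuming a POSIX system. The log path, the message text and the names segv_logger, install_segv_logger and crash_child_and_wait are all hypothetical, chosen only for this example. The handler sticks to open(), write() and close(), then restores the default action and re-raises so the process still dies from SIGSEGV. The last function is only a demo harness that crashes a child with raise() instead of a real bad access.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical location of the dedicated crash log. */
static const char *crash_log_path = "/tmp/myapp-crash.log";

static void segv_logger(int signo)
{
    static const char msg[] = "caught SIGSEGV, exiting\n";

    /* Only async-signal-safe calls here: no malloc(), no stdio. */
    int fd = open(crash_log_path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd >= 0) {
        if (write(fd, msg, sizeof msg - 1) < 0) {
            /* Nothing safe to do about a failed write. */
        }
        close(fd);
    }

    /* Restore the default action and re-raise, so the process still
     * terminates from SIGSEGV once the handler returns. */
    signal(signo, SIG_DFL);
    raise(signo);
}

/* Installs the logging handler. Returns 0 on success, -1 on error. */
int install_segv_logger(const char *path)
{
    struct sigaction sa;

    crash_log_path = path;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = segv_logger;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Demo harness: crashes a child process that uses the handler and
 * returns the signal that killed it (0 if it exited, -1 on error). */
int crash_child_and_wait(const char *path)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        install_segv_logger(path);
        raise(SIGSEGV);            /* stand-in for a real bad access */
        _exit(0);                  /* not expected to be reached */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

A real handler would probably want to log more context than a fixed string, but anything beyond that (formatting, backtraces) needs the same care about what is safe to call.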

Tim Post
  • 1,515
  • 13
  • 25
2

It is possible for an application to ignore the segmentation fault signal or to do some special handling for it. The manual page for signal and its related pages have the details.

One situation that could lead to the behaviour described (dmesg reports segfaults but the application is still running) is that the application forks and the child process segfaults. To find out whether this is the case, check whether the process ID reported by dmesg is the same as that of the currently running process.
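
The fork scenario can be sketched like this, assuming a POSIX system. The function name spawn_crashing_worker is made up, and raise() stands in for a real invalid memory access in the child. The point is that the child dies from SIGSEGV under its own PID while the parent carries on.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a worker that dies from SIGSEGV while the parent keeps running.
 * Stores the worker's PID in *child_pid and returns the signal that
 * terminated it (0 if it exited normally, -1 on error). */
int spawn_crashing_worker(pid_t *child_pid)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: make sure the default action applies, then crash. */
        signal(SIGSEGV, SIG_DFL);
        raise(SIGSEGV);            /* stand-in for a real bad access */
        _exit(0);                  /* not expected to be reached */
    }

    /* Parent: unaffected by the child's crash, it just reaps it. */
    *child_pid = pid;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

The PID stored in *child_pid differs from the parent's getpid(), which is the comparison to make against the PID shown in dmesg.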

goedson
  • 411
  • 2
  • 5
  • +1, @noonex, do the dmesg segfaults coincide with anything notable (like the program starting up)? – Chris S Apr 07 '10 at 12:51
0

Segmentation Fault usually means something went very wrong with the application's internal state. It may be so broken that it is not able to run the signal handler – the signal handler, which is supposed to print the stack dump, could crash too.

Edit: I didn't quite understand the 'uptime older' part, so I skipped it. Now I see your question was 'why is the application still running', so here is the new answer:

Yes, an application can survive a SIGSEGV. Sometimes SIGSEGV will be sent only to some less important thread (it should kill the whole application, but sometimes it doesn't) or even just to a child process – what you see as one application may be multiple processes or threads.

Jacek Konieczny
  • 3,597
  • 2
  • 21
  • 22
  • But then the application uptime couldn't be from before the time of its last segfault... – Massimo Apr 07 '10 at 11:10
  • A signal handler servicing a SIGSEGV may not be _able_ to print the context. Additionally, SEGV can and will be merged by the kernel. – Tim Post Apr 07 '10 at 13:26
0

SIGSEGV is automatically sent by the kernel if a process does something with memory it shouldn't have done; but the signal can be trapped and the process can run a signal handler, which may try to recover from the fault condition. In this case, the process can keep running.
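
As a rough sketch of what trapping and continuing can look like, assuming a POSIX system: the handler below jumps back to a recovery point with siglongjmp(). All names (run_guarded, faulty_operation, harmless_operation) are hypothetical, and raise() stands in for a real bad memory access. After a real fault the program state may be corrupted, so this only shows the mechanism.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Where the handler jumps back to after a trapped SIGSEGV. */
static sigjmp_buf recovery_point;

static void segv_recover(int signo)
{
    (void)signo;
    siglongjmp(recovery_point, 1);
}

/* Runs `risky` with a SIGSEGV handler installed.
 * Returns 0 if it completed, 1 if SIGSEGV was trapped, -1 on error. */
int run_guarded(void (*risky)(void))
{
    struct sigaction sa, old_sa;
    int result;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = segv_recover;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old_sa) != 0)
        return -1;

    /* Second argument 1: save the signal mask, so SIGSEGV is
     * unblocked again after jumping out of the handler. */
    if (sigsetjmp(recovery_point, 1) == 0) {
        risky();
        result = 0;                /* finished without a fault */
    } else {
        result = 1;                /* arrived here via siglongjmp() */
    }

    sigaction(SIGSEGV, &old_sa, NULL);
    return result;
}

/* Hypothetical operations used to exercise run_guarded(). */
void faulty_operation(void)   { raise(SIGSEGV); }
void harmless_operation(void) { }
```

With this in place, run_guarded(faulty_operation) returns 1 and the process keeps running, which matches a long uptime despite SIGSEGVs having been raised.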

The signal can also be completely ignored, but this is something that should be avoided; getting a SIGSEGV usually means there's something really, really wrong going on.

Massimo
  • 68,714
  • 56
  • 196
  • 319
  • 1
    It's never, ever a good idea to 'keep going' once you get a SIGSEGV, unless you hope to log it prior to exiting. – Tim Post Apr 07 '10 at 12:38
  • I never said it was a good idea (and I don't think it is). But a program *can* try to handle a SIGSEGV and keep going, so this is a very plausible reason for it having a long uptime despite having received SIGSEGVs. This is a good answer for the question "why did it get a SIGSEGV but kept running?", so I don't really see the reason for a downvote here. – Massimo Apr 07 '10 at 13:56