
Yesterday around 1am, our server slowed to a crawl. This doesn't happen often, but I'm trying to get to the bottom of it.

There was no unusual traffic volume and no unusual processes were running; all of a sudden the server just started killing fcgid processes.

[Thu Aug 02 01:17:32 2012] [warn] mod_fcgid: process 26460 graceful kill fail, sending SIGKILL

... for as many fcgid processes as we have...

CPU idle fell to 0% and I/O seemed to take up most of the load. The issue lasted about 5 minutes.

I suspect there was some swap activity, although I'm not sure whether it was caused by killed processes being swapped in to die, or by some process ramping up its memory usage faster than my process-watching scripts could detect.

The oom-killer wasn't triggered (at least it's not logged), so I think this was Apache for some reason restarting the processes. This is not regular, and nothing obvious appears in cron.

Is there a normal Apache process which might cause this? We run dozens of different sites, and it was late at night, so volume was very low (maybe 200 requests in a 10-minute period).
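To see how tightly the kills cluster in time, the warnings could be tallied per minute. This is only a rough sketch: it assumes the error log uses the timestamp format of the warning quoted above, and the function name `count_fcgid_kills` is made up for illustration.

```shell
# Count mod_fcgid "graceful kill fail" warnings per minute.
# Assumes lines shaped like the one quoted above:
#   [Thu Aug 02 01:17:32 2012] [warn] mod_fcgid: process 26460 graceful kill fail, sending SIGKILL
count_fcgid_kills() {
    # $1: path to the Apache error log (location varies by distribution)
    awk '
        /mod_fcgid: process [0-9]+ graceful kill fail/ {
            year = $5
            sub(/\]/, "", year)                       # "2012]" -> "2012"
            key = $2 " " $3 " " substr($4, 1, 5) " " year  # month day HH:MM year
            if (!(key in count)) order[++n] = key     # remember first-seen order
            count[key]++
        }
        END {
            for (i = 1; i <= n; i++) print count[order[i]], order[i]
        }
    ' "$1"
}
```

Calling it with the path to the error log prints one line per minute, with the number of kills first. A burst confined to one or two minutes would suggest a single event (for example a restart or reload) rather than a slow build-up.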

mgjk

4 Answers


Edit the file /etc/httpd/conf.d/fcgid.conf and set FcgidIOTimeout to:

FcgidIOTimeout 90

It works for me.

Thanks, JD

user203987

I had the same problem. The error occurs mainly because the mod_fcgid timeout is exceeded. Here is the solution that worked for me:

Edit the file /etc/httpd/conf.d/fcgid.conf and set FcgidIOTimeout to:

FcgidIOTimeout 500

And restart apache:

/usr/sbin/apachectl restart

Source: http://www.prestashop.com/forums/topic/194377-warn-mod-fcgid-process-graceful-kill-fail-sending-sigkill/
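Before changing anything, it may help to see which timeout values are currently set. A minimal sketch; the helper name `show_fcgid_timeouts` is hypothetical, and it only looks at the one file you pass in, not at other included configs.

```shell
# List the mod_fcgid timeout directives that are active (not commented out)
# in a given config file, prefixed with their line numbers.
show_fcgid_timeouts() {
    # $1: config file, e.g. the fcgid.conf mentioned above
    grep -nE '^[[:space:]]*Fcgid(IOTimeout|ConnectTimeout|BusyTimeout|IdleTimeout)[[:space:]]' "$1"
}
```

For example, `show_fcgid_timeouts /etc/httpd/conf.d/fcgid.conf`. After editing, running `apachectl configtest` before the restart should catch a typo in the file before it takes the server down.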

jruzafa

Server administrators using Ubuntu Server and Webmin/Virtualmin can resolve this issue by editing the fcgid.conf file. The instructions below are for Ubuntu Server; if you are using a different Linux distribution, the location of the configuration file may vary.


Fix this issue in 6 easy steps

  1. Login to SSH.
  2. Type cd /etc/apache2/mods-enabled and press enter.
  3. Type sudo pico fcgid.conf and press enter.
  4. Find the line with FcgidConnectTimeout 20 and change it to read FcgidConnectTimeout 120.
  5. Exit pico by pressing CTRL+X, then press Y and Enter to save.
  6. Type: sudo service apache2 restart and press enter.

If the problem persists, you can raise the value above 120.
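The editing steps above could also be done without opening an editor. This is a sketch under assumptions: the function name `set_fcgid_connect_timeout` is made up, the file must already contain a FcgidConnectTimeout line with a numeric value, and the second argument is expected to be a plain number.

```shell
# Set FcgidConnectTimeout in a config file in place.
# A copy of the original is kept next to it with a .bak suffix.
set_fcgid_connect_timeout() {
    # $1: config file, $2: new timeout in seconds (digits only)
    sed -i.bak -E \
        "s/^([[:space:]]*FcgidConnectTimeout)[[:space:]]+[0-9]+/\1 $2/" "$1"
}
```

Run it from a root shell, then restart Apache as in step 6. On Ubuntu the file in mods-enabled is usually a symlink into mods-available, and an in-place sed can replace the link with a regular file, so it is probably safer to point the function at the real file (I'm assuming /etc/apache2/mods-available/fcgid.conf here).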

TIP: Use Pingdom (it's free) to notify you when the website is not accessible.

Simon Hayter

I had the same issue a couple of nights ago. I found a blog post where someone removed Webmin and Usermin and stopped getting the error.

I upgraded Webmin and it seems to have sorted out my issue. I still get a couple of errors here and there, but it no longer clogs up the CPU like it did before.

HopelessN00b
jezhug