I have a custom Django app that's becoming unresponsive roughly every 5,000 requests. In the Apache logs, I see the following:
Apr 13 11:45:07 www3 apache2[27590]: **successful view render here**
...
Apr 13 11:47:11 www3 apache2[24032]: [error] server is within MinSpareThreads of MaxClients, consider raising the MaxClients setting
Apr 13 11:47:43 www3 apache2[24032]: [error] server reached MaxClients setting, consider raising the MaxClients setting
...
Apr 13 11:50:34 www3 apache2[27617]: [error] [client 10.177.0.204] Script timed out before returning headers: django.wsgi
(repeated 100 times, exactly)
I believe I am running mod_wsgi 2.6 (/usr/lib/apache2/modules/mod_wsgi.so-2.6) with the following config:
apache config
WSGIDaemonProcess site-1 user=django group=django threads=50
WSGIProcessGroup site-1
WSGIScriptAlias / /somepath/django.wsgi
/somepath/django.wsgi
import os, sys
sys.path.append('/home/django')
os.environ['DJANGO_SETTINGS_MODULE'] = 'myapp.settings'
import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()
When this happens, I can kill the wsgi process and the server will recover.
>ps aux|grep django # process is running as user "django"
django 27590 5.3 17.4 908024 178760 ? Sl Apr12 76:09 /usr/sbin/apache2 -k start
>kill -9 27590
This leads me to believe that the problem is a known issue:
deadlock-timeout=sss (2.0+)
Defines the maximum number of seconds allowed to pass before the daemon process is shut down and restarted after a potential deadlock on the Python GIL has been detected. The default is 300 seconds. This option exists to combat the problem of a daemon process freezing as the result of a rogue Python C extension module which doesn't properly release the Python GIL when entering into a blocking or long running operation.
However, I'm not sure why this condition is not clearing automatically. I do see that the script timeout occurs exactly 5 minutes after the last successful page render, so the deadlock-timeout appears to be getting triggered, but it does not actually kill the process.
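To narrow down where the threads are stuck before killing the process, something like the following could be used to dump the Python stack of every thread. This is only a rough sketch: `format_thread_stacks` is a name made up for illustration, it is not part of mod_wsgi or Django, and how it gets called (for example from a background monitor thread) is not shown. It also assumes the calling thread can still acquire the GIL, so it would probably not help if a C extension really is holding the GIL the whole time.

```python
import sys
import threading
import traceback


def format_thread_stacks():
    """Return a text dump of the current Python stack of every thread.

    Hypothetical helper for illustration; only uses the standard library.
    """
    # Map thread ids to names so the dump is easier to read.
    names = dict((t.ident, t.name) for t in threading.enumerate())
    lines = []
    # sys._current_frames() maps each thread id to its topmost stack frame.
    for thread_id, frame in sys._current_frames().items():
        lines.append("Thread %s (%s)" % (thread_id, names.get(thread_id, "unknown")))
        for entry in traceback.format_stack(frame):
            lines.append(entry.rstrip("\n"))
    return "\n".join(lines)
```

Writing the returned string to stderr should make it show up in the Apache error log, which would at least show whether the 50 threads are all blocked in the same call.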
Edit: more info
- Apache version 2.2, using the worker MPM
- mod_wsgi version 2.8
- SELinux NOT installed
- xml package being used, infrequently
- Ubuntu 10.04