
I have an Apache 2.2.3 web server running on an 8-core VM with 8 GB of RAM.

During a load test, the web server stopped responding and the load average climbed to 1000.

When I run top, I see a large number of httpd processes stuck in the "D" state. From what I found searching, "D" means uninterruptible sleep.
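For reference, the state letter that top displays is the third field of `/proc/<pid>/stat`. Below is a minimal sketch, assuming a Linux `/proc` filesystem, that lists the processes currently in a given state. The function names `parse_stat` and `processes_in_state` are made up for illustration and are not part of any tool mentioned here.

```python
import os


def parse_stat(stat_line):
    """Split the start of a /proc/<pid>/stat line into (pid, comm, state)."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so use the first '(' and the last ')'.
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    pid = int(stat_line[:open_paren])
    comm = stat_line[open_paren + 1:close_paren]
    state = stat_line[close_paren + 1:].split()[0]
    return pid, comm, state


def processes_in_state(state="D", proc_root="/proc"):
    """Return a sorted list of (pid, comm) for processes in the given state."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, current = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or entry is unreadable
        if current == state:
            found.append((pid, comm))
    return sorted(found)
```

Calling `processes_in_state("D")` on the affected host would return the PIDs and command names of everything in uninterruptible sleep, which can then be compared against the httpd PIDs seen in top.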

I straced one of the stuck processes and below is the output:

# strace -p 27843
Process 27843 attached - interrupt to quit
fcntl(34, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=1}

I then ran lsof to check what fd 34 is; the output is below:

httpd   27843 apache   34u   REG      8,1        0   131756 /tmp/.xcache.0.0.1292616489.lock (deleted)
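One possible follow-up, sketched here under assumptions, is to look the file up in `/proc/locks` and see which PID holds the lock and which PIDs are queued behind it. This assumes the kernel exposes `/proc/locks` in the usual layout (`id: [->] class mode access pid major:minor:inode start end`, with major and minor in hex and `->` marking a blocked waiter). The function name `lock_holders_and_waiters` is hypothetical; the device 8,1 and inode 131756 are taken from the lsof line above.

```python
def lock_holders_and_waiters(locks_text, major, minor, inode):
    """Return (holders, waiters) PID lists for one file in /proc/locks text."""
    # /proc/locks identifies a file as "<major hex>:<minor hex>:<inode decimal>".
    target = f"{major:02x}:{minor:02x}:{inode}"
    holders, waiters = [], []
    for line in locks_text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue
        blocked = fields[1] == "->"  # "->" marks a request waiting on a lock
        if blocked:
            fields = fields[:1] + fields[2:]
        # fields: id, class, ADVISORY/MANDATORY, access, pid, dev:inode, ...
        if len(fields) < 6 or fields[5] != target:
            continue
        pid = int(fields[4])
        (waiters if blocked else holders).append(pid)
    return holders, waiters


def read_proc_locks(path="/proc/locks"):
    """Read the kernel's lock table as text (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

On the affected host, `lock_holders_and_waiters(read_proc_locks(), 8, 1, 131756)` would return the holder and waiter PIDs for the xcache lock file. If a single holder shows up, running strace on that PID should show what it is doing while everything else waits.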

This looks like it might be a locking issue with xcache, but how should I continue troubleshooting from here?

Allen Qin
  • How exactly are you using xcache? Do you call [xcache_get](http://xcache.lighttpd.net/ticket/237)? – David Schwartz Jan 06 '14 at 06:01
  • This is interesting. The cache file was deleted while the process was still accessing it. Did you check whether /tmp is out of space (if mounted separately)? Also check whether the partition ran out of inodes with df -ih – vagarwal Jan 06 '14 at 17:54
  • I don't have access to the source code so I don't know how xcache is used. There seems to be enough space left under /tmp. – Allen Qin Jan 08 '14 at 01:32

0 Answers