I've got a server running Apache, and I've been seeing occasional Apache processes go to 100% CPU and stay there. Today, with two processes at 100%, I turned off external access to the server (to prevent further requests to Apache). Five minutes later, no requests are coming in to the server, but both processes are still at 100%.
I've run lsof on each process, and it gives me about 9000 lines of output (that might as well be Greek to me). No other processes seem to be behaving strangely, waiting, etc.
My database is on a second server. There, mytop shows two MySQL connections active from the Apache server, both with a state of "sleep". I killed one of those MySQL threads, and there was no change to either process on the Apache server.
This Apache server is one of two behind a simple load balancer. I don't know whether that could be related.
How can I confirm that the apache issue is related to what I'm seeing on the database server? And is this likely to be the result of a dodgy SQL call, or something else?
Edit: Found the issue. It was a code problem with Magento. The image resizing function was failing to open an image because the file extension was incorrect (it was a BMP with a jpg extension). The error handler for this was invoking the resize again, et voilà: a loop. I found this by running strace on the misbehaving Apache process.
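For anyone chasing something similar: a loop like this tends to show up in strace output as the same few calls repeating against the same file. Below is a rough Python sketch of one way to tally that. It assumes the trace was saved to a file first, for example with `strace -p <pid> -o trace.txt`. The function name `tally_calls` and the sample lines are made up for illustration; they are not taken from the actual trace.

```python
import re
import sys
from collections import Counter

# Matches an optional leading PID, the syscall name, and (if present)
# a quoted string as the first argument, e.g. the path passed to open().
CALL_RE = re.compile(r'^(?:\d+\s+)?(\w+)\((?:"([^"]*)")?')


def tally_calls(lines):
    """Count (syscall, first quoted argument) pairs in strace output lines.

    Lines that do not look like a syscall (signals, exit notices,
    "resumed" continuations) are skipped.
    """
    counts = Counter()
    for line in lines:
        match = CALL_RE.match(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Hypothetical sample: the same image opened over and over.
sample = [
    'open("/var/www/media/a.jpg", O_RDONLY) = 3',
    'read(3, "BM", 2) = 2',
    'close(3) = 0',
    'open("/var/www/media/a.jpg", O_RDONLY) = 3',
    'read(3, "BM", 2) = 2',
    'close(3) = 0',
    'open("/var/www/media/a.jpg", O_RDONLY) = 3',
    '--- SIGALRM {si_signo=SIGALRM} ---',
]
sample_counts = tally_calls(sample)

if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python tally.py trace.txt
    with open(sys.argv[1], errors="replace") as trace:
        for (name, arg), count in tally_calls(trace).most_common(10):
            print(count, name, arg if arg is not None else "")
```

If one path dominates the top of that list by a wide margin, that file is a reasonable place to start looking in the application code.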