6

I have a server system running on Mac OS X 10.4 (Darwin Kernel Version 8.10.1). This server is used mainly as a Bugzilla Server, but there are some other web-based services running (Testlink, TikiWiki).

The Bugzilla database has about 60000 bugs in it, and there are about 300 active users on the system.

Bugzilla is on version 3.0, running on Perl 5.8.6, Apache 1.3.33 with mySQL 5.0.38

From time to time, we have serious trouble that Bugzilla throws a database error:

Software error:

Can't connect to the database.
Error: Too many connections

I already have several leads to possible solutions to this problem, but I wanted to bring up a more general question how you would debug these types of problems?

Right now we have set up the following to monitor the mySQL database:

  • A cron job which dumps the full mysql processlist every 5 minutes
  • Enabled log-slow-queries in my.cnf to log queries that take longer than 15 seconds

We just started gathering this data to see if we can find a reason for the "Too many connections" problem.

Are there any other things you can think of to monitor a mySQL database and to help diagnose the root cause of the problem?

Palmin
  • 350
  • 6
  • 10

3 Answers3

6

There are two distinct plans of attack to follow through with when diagnosing these types of errors:

Firstly, there is the potential that it is an issue relating to the actual software in use: something is essentially sucking up connections and not releasing them back (either at all in terms of hanging thread, or in a reasonable amount of time in terms of a slow query).

The slow query log is very beneficial in diagnosing problems, but your value of 15 seconds is next to useless: if a query is taking 15 seconds then you're pretty screwed full stop. As a general rule of thumb, I look for queries taking more than one or two seconds to run. Work through whatever shows up in this log using the EXPLAIN keyword and look at what is causing the slowdown (bad joins, sorting needing a temp table etc.) - some clever magic with query caching and indexes can often help out if it's not possible to go deeper and tinker with code / database design.

Also, don't overlook the general query log in mysql. Whilst you won't want to leave it turned on (for long) on a production server, it can quickly tell you if, rather than a single query taking an age, a particular function in the software is hammering the database with hundreds of small queries. Obviously the only way to solve these sort of issues is via refactoring the code.

Secondly, you need to investigate if the configuration of the software is to blame. How many concurrent connections are you experiencing? What is the actual number of max connections set to in mysql. It may be something as simple as apache is serving say 100 concurrent requests while mysql is only configured to accept 20 connections - obviously something is going to give. If you can gauge how much traffic you are expecting to handle, then it just takes some common sense (and occasionally a dash of Google to find the correct setting) to balance all the components.

Ian
  • 236
  • 2
  • 5
  • I am not sure if your remark about 15 seconds being too long applies to Bugzilla databases. Some queries there can really take some time, just look at http://bugzilla.mozilla.org and search for some generic term on a few products. – Palmin May 16 '09 at 20:07
  • 15 seconds was a comment in general about web-apps - if bugzilla routinely takes this long for queries then there must be something seriously wrong with it period. – Ian May 17 '09 at 13:54
1

How many apache workers do you have ? How what are the max mysql connections you have allowed ? As apache spawns a cgi process per httpd worker when handling the request then f the former is larger than the latter apache can open more connections than mysql will allow.

I would suggest the following logging settings

log_slow_queries
log-queries-not-using-indexes
set-variable = long_query_time=1
Dave Cheney
  • 18,307
  • 7
  • 48
  • 56
  • The maximum number of httpd worker threads I don't know (how would I find this out?). max_connections is set to 100, but I doubt that at any time there are more than 1000 httpd processes. – Palmin Apr 30 '09 at 11:50
1

A cron job for dumping is very useful, but just in case you don't having anything ready to actually graph the thing, I can recommend munin which has MySQL plugins for monitoring

  • throughput
  • queries
  • databases size
  • slow queries
  • threads

which can quite helpful to determine spikes. I runs in five minutes interval by default.

Using it over the last year I've discovered quite interesting situation with it before which would have passed completely unnoticed otherwise.

mark
  • 1,516
  • 4
  • 21
  • 33