I've moved the mysql server on different machine in LAN, 16 core, 8GB RAM, and the same dreaaded result. During a load test, all is going well until around 300 seconds. There are 15 persistent connections from DB's client to DB, and after this period I can see it going flat:
Suddenly all queries sent to it start to time out:
mysql> show processlist;
+-----+-------------+--------------------+----------+---------+------+-------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-----+-------------+--------------------+----------+---------+------+-------+------------------+
| 238 | tigase_user | 58.64.157.39:35595 | NULL | Query | 0 | NULL | show processlist |
| 244 | tigase_user | 58.64.157.25:22624 | tigasedb | Sleep | 0 | | NULL |
| 245 | tigase_user | 58.64.157.25:22625 | tigasedb | Sleep | 0 | | NULL |
| 246 | tigase_user | 58.64.157.25:22626 | tigasedb | Sleep | 0 | | NULL |
| 247 | tigase_user | 58.64.157.25:22627 | tigasedb | Sleep | 0 | | NULL |
| 248 | tigase_user | 58.64.157.25:22628 | tigasedb | Sleep | 0 | | NULL |
| 249 | tigase_user | 58.64.157.25:22629 | tigasedb | Sleep | 0 | | NULL |
| 250 | tigase_user | 58.64.157.25:22630 | tigasedb | Sleep | 0 | | NULL |
| 251 | tigase_user | 58.64.157.25:22631 | tigasedb | Sleep | 0 | | NULL |
| 252 | tigase_user | 58.64.157.25:22632 | tigasedb | Sleep | 0 | | NULL |
| 253 | tigase_user | 58.64.157.25:22633 | tigasedb | Sleep | 0 | | NULL |
| 254 | tigase_user | 58.64.157.25:22634 | tigasedb | Sleep | 619 | | NULL |
| 255 | tigase_user | 58.64.157.25:22635 | tigasedb | Sleep | 619 | | NULL |
| 256 | tigase_user | 58.64.157.25:22636 | tigasedb | Sleep | 620 | | NULL |
| 257 | tigase_user | 58.64.157.25:22637 | tigasedb | Sleep | 619 | | NULL |
| 258 | tigase_user | 58.64.157.25:22638 | tigasedb | Sleep | 620 | | NULL |
+-----+-------------+--------------------+----------+---------+------+-------+------------------+
16 rows in set (0.00 sec)
[...] after 25 seconds [...]
mysql> show processlist;
+-----+-------------+--------------------+----------+---------+------+-------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-----+-------------+--------------------+----------+---------+------+-------+------------------+
| 238 | tigase_user | 58.64.157.39:35595 | NULL | Query | 0 | NULL | show processlist |
| 244 | tigase_user | 58.64.157.25:22624 | tigasedb | Sleep | 25 | | NULL |
| 245 | tigase_user | 58.64.157.25:22625 | tigasedb | Sleep | 25 | | NULL |
| 246 | tigase_user | 58.64.157.25:22626 | tigasedb | Sleep | 25 | | NULL |
| 247 | tigase_user | 58.64.157.25:22627 | tigasedb | Sleep | 25 | | NULL |
| 248 | tigase_user | 58.64.157.25:22628 | tigasedb | Sleep | 25 | | NULL |
| 249 | tigase_user | 58.64.157.25:22629 | tigasedb | Sleep | 25 | | NULL |
| 250 | tigase_user | 58.64.157.25:22630 | tigasedb | Sleep | 25 | | NULL |
| 251 | tigase_user | 58.64.157.25:22631 | tigasedb | Sleep | 25 | | NULL |
| 252 | tigase_user | 58.64.157.25:22632 | tigasedb | Sleep | 25 | | NULL |
| 253 | tigase_user | 58.64.157.25:22633 | tigasedb | Sleep | 25 | | NULL |
| 254 | tigase_user | 58.64.157.25:22634 | tigasedb | Sleep | 644 | | NULL |
| 255 | tigase_user | 58.64.157.25:22635 | tigasedb | Sleep | 644 | | NULL |
| 256 | tigase_user | 58.64.157.25:22636 | tigasedb | Sleep | 645 | | NULL |
| 257 | tigase_user | 58.64.157.25:22637 | tigasedb | Sleep | 644 | | NULL |
| 258 | tigase_user | 58.64.157.25:22638 | tigasedb | Sleep | 645 | | NULL |
+-----+-------------+--------------------+----------+---------+------+-------+------------------+
16 rows in set (0.00 sec)
mysql> show processlist;
+-----+-------------+--------------------+----------+---------+------+---------------+------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-----+-------------+--------------------+----------+---------+------+---------------+------------------------------------------------------------------------------------------------------+
| 238 | tigase_user | 58.64.157.39:35595 | NULL | Query | 0 | NULL | show processlist |
|
244 | tigase_user | 58.64.157.25:22624 | tigasedb | Query | 0 | freeing items | select nid as nid3, node as node3 from tig_nodes, (select nid as nid2, node as node2 from tig_nodes, |
| 245 | tigase_user | 58.64.157.25:22625 | tigasedb | Sleep | 0 | | NULL |
| 246 | tigase_user | 58.64.157.25:22626 | tigasedb | Sleep | 0 | | NULL |
| 247 | tigase_user | 58.64.157.25:22627 | tigasedb | Sleep | 0 | | NULL |
| 248 | tigase_user | 58.64.157.25:22628 | tigasedb | Sleep | 0 | | NULL |
| 249 | tigase_user | 58.64.157.25:22629 | tigasedb | Sleep | 0 | | NULL |
| 250 | tigase_user | 58.64.157.25:22630 | tigasedb | Sleep | 0 | | NULL |
| 251 | tigase_user | 58.64.157.25:22631 | tigasedb | Sleep | 0 | | NULL |
| 252 | tigase_user | 58.64.157.25:22632 | tigasedb | Sleep | 0 | | NULL |
| 253 | tigase_user | 58.64.157.25:22633 | tigasedb | Sleep | 0 | | NULL |
| 254 | tigase_user | 58.64.157.25:22634 | tigasedb | Sleep | 645 | | NULL |
| 255 | tigase_user | 58.64.157.25:22635 | tigasedb | Sleep | 645 | | NULL |
| 256 | tigase_user | 58.64.157.25:22636 | tigasedb | Sleep | 646 | | NULL |
| 257 | tigase_user | 58.64.157.25:22637 | tigasedb | Sleep | 645 | | NULL |
| 258 | tigase_user | 58.64.157.25:22638 | tigasedb | Sleep | 646 | | NULL |
+-----+-------------+--------------------+----------+---------+------+---------------+------------------------------------------------------------------------------------------------------+
16 rows in set (0.01 sec)
This correlates with what I see in my logs at mysql's client:
Caused by: tigase.db.TigaseDBException: Error getting user data for: multi-user-chat/rooms/room3_+391737@muc.xmpp.xgate.com.hk/creation-date; method took 25410 ms, thread: 399
however, the load on mysql machine is almost 0 during all this time (as reported by 'top')
After some more time the connections start to disappear (although DB's client has not died):
mysql> show processlist;
+-----+-------------+--------------------+----------+---------+------+-------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-----+-------------+--------------------+----------+---------+------+-------+------------------+
| 238 | tigase_user | 58.64.157.39:35595 | NULL | Query | 0 | NULL | show processlist |
| 247 | tigase_user | 58.64.157.25:22627 | tigasedb | Sleep | 14 | | NULL |
| 249 | tigase_user | 58.64.157.25:22629 | tigasedb | Sleep | 14 | | NULL |
| 250 | tigase_user | 58.64.157.25:22630 | tigasedb | Sleep | 14 | | NULL |
| 252 | tigase_user | 58.64.157.25:22632 | tigasedb | Sleep | 14 | | NULL |
| 254 | tigase_user | 58.64.157.25:22634 | tigasedb | Sleep | 1609 | | NULL |
| 255 | tigase_user | 58.64.157.25:22635 | tigasedb | Sleep | 1609 | | NULL |
| 256 | tigase_user | 58.64.157.25:22636 | tigasedb | Sleep | 1610 | | NULL |
| 257 | tigase_user | 58.64.157.25:22637 | tigasedb | Sleep | 1609 | | NULL |
| 258 | tigase_user | 58.64.157.25:22638 | tigasedb | Sleep | 1610 | | NULL |
+-----+-------------+--------------------+----------+---------+------+-------+------------------+
10 rows in set (0.00 sec)
The 10 problematic timeout-prone connections have beome 4 as seen above, and those would eventually vanish too, only the last 5 would remain (which are used differently btw)
EDIT: after even some more time, 3 out of those remaining 4 have survived and another 7 new connections have appeared.
EDIT2: after even more hours - 0connections to DB with DB's client still around. wth is going on I wonder...
And as per Daniel's remarks:
mysql> show variables like "max_connections";
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 439
Current database: *** NONE ***
+-----------------+-------+
| Variable_name | Value |
+-----------------+-------+
| max_connections | 151 |
+-----------------+-------+
1 row in set (0.31 sec)
mysql> SHOW VARIABLES LIKE 'wait_timeout';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| wait_timeout | 28800 |
+---------------+-------+
1 row in set (0.00 sec)
EDIT3:
mysql> show variables like 'max_allowed_packet'
-> ;
+--------------------+----------+
| Variable_name | Value |
+--------------------+----------+
| max_allowed_packet | 16777216 |
+--------------------+----------+
1 row in set (0.01 sec)
Where should Ibe looking further ? Thanks!