
I have a Slackware Linux server at work. It serves as our network, database and web server. Our main web application is the backend for administering our public website (which is frequently updated) and for working with clients (a mini ERP). Most employees use it as their only application during working hours, so when it is not working the whole firm is blocked.

Lately the web server often hangs or processes requests very slowly, so short, easy tasks take tens of minutes or even longer. I use PuTTY to connect to the server. When this happens, I cannot connect through PuTTY, and I even have difficulty logging in at the console in the server room, so I cannot check what is going on in order to diagnose and fix the problem. I know that MySQL is sometimes overloaded, but in those cases I can still log in and see it (I have enabled slow query logging).

My main problem is how to detect what's going on.

Here's server information:

  • Version: Slackware 11.0.0 (when I log in I get: Linux 2.4.33.3.)
  • Processor: Intel(R) Xeon(R) CPU 3050 @ 2.13GHz
  • Memory: 3.5GB (swap 4GB)
  • Disk space: 1TB (3 partitions)

Thanks.

Kosta
  • Create a script that runs every minute from cron. The script should include network stats, cpu stats, disk I/O, process listings and anything else that might be useful, and its output should be stored in a place where you can analyze it later. – Jenny D May 10 '13 at 10:18
  • You should use some kind of monitoring, e.g. Zabbix. – Pascal Schmiel May 10 '13 at 11:59
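The cron approach suggested in the comments could look roughly like the sketch below. The output directory `/var/tmp/snapshots`, the `run` argument and the choice of commands are assumptions made for illustration; swap in whatever tools are actually installed on the server.

```shell
#!/bin/sh
# Hypothetical output directory; override with SNAPSHOT_DIR.
OUT_DIR="${SNAPSHOT_DIR:-/var/tmp/snapshots}"

# Print a labelled section with the output of one command,
# or a note if that command is missing or fails.
section() {
    echo "=== $* ==="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || echo "($1 failed)"
    else
        echo "($1 not installed)"
    fi
}

# Write one timestamped snapshot of the system state to stdout.
snapshot() {
    echo "##### $(date '+%Y-%m-%d %H:%M:%S')"
    section uptime          # load averages
    section vmstat 1 2      # CPU, memory, swap and block I/O
    section netstat -i      # per-interface network counters
    section df -k           # disk space
    section ps aux          # process listing
}

# Only write to disk when called with "run" (as cron would do).
if [ "${1:-}" = "run" ]; then
    mkdir -p "$OUT_DIR" && snapshot >> "$OUT_DIR/$(date '+%Y%m%d').log"
fi
```

Saved under a hypothetical path such as `/usr/local/sbin/snapshot.sh`, it could be scheduled with a crontab line like `* * * * * /bin/sh /usr/local/sbin/snapshot.sh run`. After the next slowdown, the entries from the minutes just before it are the ones to read.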

1 Answer


Some kind of monitoring would help you debug this, but it is not strictly required.

Start a `screen` session and run `vmstat 30` in it.

vmstat will print CPU, I/O and RAM usage every 30 seconds, and screen lets you reattach to the running vmstat later to analyse the output. Adjust the interval if 30 seconds is too frequent or too sparse.
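Since the machine may become unreachable, it can help to also write the vmstat output to a file with timestamps. The following is a minimal sketch; the log path, the `run` argument and the `stamp_lines` helper are assumptions for illustration, not part of vmstat or screen.

```shell
#!/bin/sh
# Hypothetical defaults; override through the environment.
LOG="${VMSTAT_LOG:-/var/tmp/vmstat.log}"
INTERVAL="${VMSTAT_INTERVAL:-30}"

# Prefix each line read from stdin with the current date and time.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Log only when called with "run", so the file can also be sourced.
if [ "${1:-}" = "run" ]; then
    vmstat "$INTERVAL" | stamp_lines >> "$LOG"
fi
```

If this is saved as, say, `vmstat-log.sh`, then `screen -dmS vmstat sh vmstat-log.sh run` starts it in a detached session named `vmstat`, and `screen -r vmstat` reattaches to it later.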

If it is so slow that you can't even log in, I suspect one of two issues:

  1. Heavy swap usage. The kernel's memory management has priority over everything else, so under memory pressure everything waits for swap I/O. More RAM and fine-tuning the applications might help.
  2. Heavy load. If you have many clients, you probably have heavy SQL queries that make the site lock up for a few seconds, stalling current requests. Meanwhile more and more requests arrive and people start pressing reload, which increases the load very quickly.
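Both suspicions can be checked quickly from `/proc` while the machine still responds. The sketch below assumes a Linux `/proc/meminfo` that contains `SwapTotal` and `SwapFree` lines; the function name `swap_used_kb` is made up for illustration.

```shell
#!/bin/sh
# Print the amount of swap in use, in kB, from meminfo-style input on stdin.
swap_used_kb() {
    awk '/^SwapTotal:/ {t = $2} /^SwapFree:/ {f = $2} END {print t - f}'
}

# Report swap usage and the 1-minute load average if /proc is available.
if [ -r /proc/meminfo ] && [ -r /proc/loadavg ]; then
    echo "swap used: $(swap_used_kb < /proc/meminfo) kB"
    echo "load (1 min): $(cut -d' ' -f1 /proc/loadavg)"
fi
```

In the vmstat output, the `si` and `so` columns show memory being swapped in and out; values that stay well above zero during a slowdown point to the first issue, while a high load average with little swapping points to the second.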

You can also leave a PuTTY session open with htop running (not top, as it is heavier; htop does more and is lighter). When the load increases, check htop to see the machine's current stats.

higuita
  • Thanks! We changed the disk and it helped a bit, so we finally ported our application to another server and everything is working just fine now. But still, thanks for the answer. I'll use it just in case :) – Kosta Aug 13 '13 at 13:35
  • If it is still the same machine and moving to a different disk helped, maybe the old disk has a problem, such as bad sectors or the disk failing. Use `smartctl -a /dev/sdx` to see its SMART status and errors, or use the disk manufacturer's utility to test the disk (Hiren's BootCD is an easy way to get those tools; they are usually DOS utilities). – higuita Aug 13 '13 at 13:39