I have a WordPress-based website running on shared hosting. Its response time is decent (around 2s to retrieve the HTML page and 5s to load all the resources).
I was planning to move it to a dedicated virtual server (Ubuntu 12.04 LTS), which should theoretically improve things and make them more consistent given it's not shared. However, I observed severe performance degradation, with the page taking 10 seconds to be generated.
I ruled out network issues by editing /etc/hosts on the server and mapping the domain to 127.0.0.1. I used the Apache load tester ab to get the HTML, so JS, CSS and images are all excluded. It still took 10 seconds.
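As a cross-check on the ab figure, here is a minimal sketch that times a single GET over loopback and separates time to first byte from total time. The function name and the example.com Host header are placeholders of mine, not part of the actual setup.

```python
import http.client
import time


def time_request(host, port=80, path="/", host_header=None):
    """Return (seconds to first byte, seconds to full body, body size) for one GET."""
    # Sending an explicit Host header lets us connect to 127.0.0.1
    # while still hitting the right virtual host.
    headers = {"Host": host_header} if host_header else {}
    conn = http.client.HTTPConnection(host, port, timeout=60)
    start = time.perf_counter()
    conn.request("GET", path, headers=headers)
    response = conn.getresponse()   # returns once the status line and headers arrive
    first_byte = time.perf_counter() - start
    body = response.read()          # drain the rest of the HTML
    total = time.perf_counter() - start
    conn.close()
    return first_byte, total, len(body)


# Hypothetical usage on the server itself:
# ttfb, total, size = time_request("127.0.0.1", 80, "/", host_header="example.com")
# print(f"first byte: {ttfb:.2f}s, total: {total:.2f}s, {size} bytes")
```

If nearly all of the time is spent before the first byte, the delay is in generating the page rather than in transferring it.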
I have Zpanel installed on the server, which also uses MySQL, and its pages come up quite fast (1.5s), as does phpMyAdmin. Queries run against the WordPress database directly through phpMyAdmin also return quickly, with query times in the 10 to 30 millisecond range.
Memory is also sufficient, with only 800 MB of the 1 GB of physical memory in use, so it doesn't seem to be a swap issue either. I have also installed APC to try to improve PHP performance, but it didn't have any effect.
What else should I look for? What could be causing this degradation in performance? Could it be some kind of I/O issue since I am running on a cloud based virtual server?
I wish to be able to raise the issue with my provider, but without actual diagnostic data to show I am afraid he will just blame my application.
UPDATE with sar output (every second) when I did an HTTP request:
02:31:29 CPU %user %nice %system %iowait %steal %idle
02:31:30 all 0.00 0.00 0.00 0.00 0.00 100.00
02:31:31 all 2.22 0.00 2.22 0.00 0.00 95.56
02:31:32 all 41.67 0.00 6.25 0.00 2.08 50.00
02:31:33 all 86.36 0.00 13.64 0.00 0.00 0.00
02:31:34 all 75.00 0.00 25.00 0.00 0.00 0.00
02:31:35 all 93.18 0.00 6.82 0.00 0.00 0.00
02:31:36 all 90.70 0.00 9.30 0.00 0.00 0.00
02:31:37 all 71.05 0.00 0.00 0.00 0.00 28.95
02:31:38 all 14.89 0.00 10.64 0.00 2.13 72.34
02:31:39 all 2.56 0.00 0.00 0.00 0.00 97.44
02:31:40 all 0.00 0.00 0.00 0.00 0.00 100.00
02:31:41 all 0.00 0.00 0.00 0.00 0.00 100.00
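To summarise those samples, a small sketch that parses the sar lines above and counts how long the CPU was busy and in which column. The 50% idle threshold is an arbitrary choice of mine, and the result is only as trustworthy as the counters the VM exposes.

```python
SAR_OUTPUT = """\
02:31:29 CPU %user %nice %system %iowait %steal %idle
02:31:30 all 0.00 0.00 0.00 0.00 0.00 100.00
02:31:31 all 2.22 0.00 2.22 0.00 0.00 95.56
02:31:32 all 41.67 0.00 6.25 0.00 2.08 50.00
02:31:33 all 86.36 0.00 13.64 0.00 0.00 0.00
02:31:34 all 75.00 0.00 25.00 0.00 0.00 0.00
02:31:35 all 93.18 0.00 6.82 0.00 0.00 0.00
02:31:36 all 90.70 0.00 9.30 0.00 0.00 0.00
02:31:37 all 71.05 0.00 0.00 0.00 0.00 28.95
02:31:38 all 14.89 0.00 10.64 0.00 2.13 72.34
02:31:39 all 2.56 0.00 0.00 0.00 0.00 97.44
02:31:40 all 0.00 0.00 0.00 0.00 0.00 100.00
02:31:41 all 0.00 0.00 0.00 0.00 0.00 100.00
"""


def parse_sar(text):
    """Turn sar CPU lines into a list of dicts keyed by column name."""
    lines = text.strip().splitlines()
    columns = lines[0].split()[2:]          # names after the timestamp and "CPU"
    rows = []
    for line in lines[1:]:
        parts = line.split()
        row = dict(zip(columns, map(float, parts[2:])))
        row["time"] = parts[0]
        rows.append(row)
    return rows


rows = parse_sar(SAR_OUTPUT)
busy = [r for r in rows if r["%idle"] < 50.0]   # threshold is an assumption
mean_user_busy = sum(r["%user"] for r in busy) / len(busy)

print("one-second samples under 50% idle:", len(busy))
print("mean %user during those samples:", round(mean_user_busy, 2))
print("peak %iowait:", max(r["%iowait"] for r in rows))
print("peak %steal:", max(r["%steal"] for r in rows))
```

Taken at face value, the busy samples are dominated by %user with no %iowait at all, which points at CPU time spent inside the process rather than at waiting for the disk.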
UPDATE 2 After josten's suggestions.
I/O:
iotop fails with "OSError: Netlink error: No such file or directory (2)" and sar -d also fails with "Requested activities not available in file /var/log/sysstat/sa14". I think this is because this is a virtual machine; iostat fails as well. Could that be the reason why the %iowait reported by sar 1 10 is always 0%?
CPU Load:
The process topping the CPU% in htop is actually apache2. I was expecting this to maybe be the database, but it's not. It goes up to 94% for a few seconds when I do a fresh HTTP request. This seems to be the culprit.
I have done an strace -f -t and one summary strace -c -f. There seem to be an awful lot of lstat calls (57786), with 2455 resulting in errors. No idea if this is normal.
Other than that, the topmost call was wait4, which I presume is normal (it's just waiting), followed by munmap. Top entries below.
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
51.06 0.124742 897 139 6 wait4
14.90 0.036388 1 57786 2455 lstat
9.67 0.023622 13 1857 munmap
7.69 0.018790 37 514 brk
6.70 0.016361 481 34 clone
2.87 0.006999 74 94 12 select
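A rough sanity check on those numbers, as a sketch: it works out the average cost per lstat call and estimates the total time covered by the whole summary from wait4's percentage share. This assumes the seconds column is what strace -c accounts by default (time inside system calls, not wall-clock time of the request).

```python
STRACE_SUMMARY = """\
51.06 0.124742 897 139 6 wait4
14.90 0.036388 1 57786 2455 lstat
9.67 0.023622 13 1857 munmap
7.69 0.018790 37 514 brk
6.70 0.016361 481 34 clone
2.87 0.006999 74 94 12 select
"""


def parse_summary(text):
    """Parse strace -c rows; the errors column may be blank, so index from both ends."""
    rows = {}
    for line in text.strip().splitlines():
        parts = line.split()
        rows[parts[-1]] = {
            "pct": float(parts[0]),
            "seconds": float(parts[1]),
            "calls": int(parts[3]),
        }
    return rows


rows = parse_summary(STRACE_SUMMARY)

lstat = rows["lstat"]
lstat_us_per_call = lstat["seconds"] / lstat["calls"] * 1e6

# wait4 accounts for 51.06% of the summary, so scale it up to estimate 100%.
wait4 = rows["wait4"]
estimated_total_s = wait4["seconds"] / (wait4["pct"] / 100)

print(f"lstat: {lstat_us_per_call:.2f} us per call, {lstat['seconds'] * 1000:.1f} ms in total")
print(f"estimated time across all syscalls: {estimated_total_s:.3f} s")
```

On these figures lstat costs well under a microsecond per call and about 36 ms in total, and the whole summary covers roughly a quarter of a second, so the syscalls listed here do not by themselves account for a multi-second page load.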
strace itself slowed down Apache by a factor of 2. I am now going through the long trace to see if anything indicates what was causing the CPU to spike for a few seconds.
What is the typical time for lstat on a well-performing server? I wish to gather some information so that I can complain constructively to the provider if storage access is at fault.
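One way to get a comparable lstat figure from two machines is a small micro-benchmark like the sketch below, run with the same path on each. The function name and iteration count are my own choices. Repeated calls on the same path are served from the kernel's cache, and the Python loop adds its own overhead, so this compares per-call cost between machines rather than measuring disk latency.

```python
import os
import time


def mean_lstat_us(path, iterations=50_000):
    """Return the mean time in microseconds of one os.lstat(path) call."""
    start = time.perf_counter()
    for _ in range(iterations):
        try:
            os.lstat(path)
        except OSError:
            pass  # failed lookups still cost a syscall, like the errors in the trace
    return (time.perf_counter() - start) / iterations * 1e6


# "." always exists; the second path is a made-up name that should not exist.
print(f"existing path: {mean_lstat_us('.'):.2f} us per call")
print(f"missing path:  {mean_lstat_us('./no-such-file-for-lstat-test'):.2f} us per call")
```

Running it on the virtual server and on a machine that performs well gives two numbers that can be put side by side when talking to the provider.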
UPDATE Output of fio random read test:
random-read: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=sync, iodepth=1
fio 1.59
Starting 1 process
random-read: Laying out IO file(s) (1 file(s) / 128MB)
Jobs: 1 (f=1): [r] [100.0% done] [12185K/0K /s] [2975 /0 iops] [eta 00m:00s]
random-read: (groupid=0, jobs=1): err= 0: pid=24264
read : io=131072KB, bw=10298KB/s, iops=2574 , runt= 12728msec
clat (usec): min=119 , max=162219 , avg=380.34, stdev=957.37
lat (usec): min=119 , max=162219 , avg=380.89, stdev=957.40
bw (KB/s) : min= 7200, max=13424, per=99.89%, avg=10285.72, stdev=1608.68
cpu : usr=2.80%, sys=18.65%, ctx=33511, majf=0, minf=23
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued r/w/d: total=32768/0/0, short=0/0/0
lat (usec): 250=45.57%, 500=37.17%, 750=3.41%, 1000=7.83%
lat (msec): 2=5.67%, 4=0.27%, 10=0.08%, 20=0.01%, 250=0.01%
Run status group 0 (all jobs):
READ: io=131072KB, aggrb=10297KB/s, minb=10545KB/s, maxb=10545KB/s, mint=12728msec, maxt=12728msec
The only hint I have now is that the cpu line of the fio output seems to show quite a bit of activity when compared to other systems. I ran it on my local Ubuntu machine and the output was:
cpu : usr=0.19%, sys=0.59%, ctx=32923, majf=0, minf=23
The usr percentage there is a small fraction of what is being reported on my server.
UPDATE Re PHP APC. Yes it is installed. Output from phpinfo:
APC Support enabled
Version 3.1.7
APC Debugging Disabled
MMAP Support Enabled
MMAP File Mask no value
Locking type pthread mutex Locks
Serialization Support php
Revision $Revision: 307215 $
Build Date May 2 2011 19:00:42
Is there any specific setting I should check for? These are the settings I have (local value, Master value):
apc.cache_by_default On On
apc.canonicalize On On
apc.coredump_unmap Off Off
apc.enable_cli Off Off
apc.enabled On On
apc.file_md5 Off Off
apc.file_update_protection 2 2
apc.filters no value no value
apc.gc_ttl 3600 3600
apc.include_once_override Off Off
apc.lazy_classes Off Off
apc.lazy_functions Off Off
apc.max_file_size 1M 1M
apc.mmap_file_mask no value no value
apc.num_files_hint 1000 1000
apc.preload_path no value no value
apc.report_autofilter Off Off
apc.rfc1867 Off Off
apc.rfc1867_freq 0 0
apc.rfc1867_name APC_UPLOAD_PROGRESS APC_UPLOAD_PROGRESS
apc.rfc1867_prefix upload_ upload_
apc.rfc1867_ttl 3600 3600
apc.serializer default default
apc.shm_segments 1 1
apc.shm_size 32M 32M
apc.slam_defense On On
apc.stat On On
apc.stat_ctime Off Off
apc.ttl 0 0
apc.use_request_time On On
apc.user_entries_hint 4096 4096
apc.user_ttl 0 0
apc.write_lock On On
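To put apc.num_files_hint, apc.max_file_size and apc.shm_size in context, a sketch that counts the PHP files under a directory and adds up their source size. The docroot path in the usage comment is hypothetical. Cached opcodes are not the same size as the source files, so the byte total is only a rough indicator.

```python
import os


def php_footprint(root):
    """Return (number of .php files, total source bytes, largest file in bytes) under root."""
    count = total = largest = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".php"):
                continue
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # e.g. a broken symlink
            count += 1
            total += size
            largest = max(largest, size)
    return count, total, largest


# Hypothetical usage; replace the path with the real WordPress directory:
# count, total, largest = php_footprint("/var/www/example")
# print(f"{count} files (apc.num_files_hint is 1000)")
# print(f"{total / 2**20:.1f} MiB of source (apc.shm_size is 32M)")
# print(f"largest file {largest / 2**20:.2f} MiB (apc.max_file_size is 1M)")
```

If the file count or total size is close to or above the configured values, the cache is a likely candidate for filling up.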
UPDATE Increased apc.shm_size to 96M. Cache full count is now 0, and the cache hit rate is 96.5% after a few refreshes of the website here and there. APC memory usage shows 25.4MB free.
It seems to have reduced the loading time by 3 seconds or so, now down to around 4 to 5 seconds if I do a pure wget from the server itself without getting any images etc. That is still more than twice as slow as the other hosting, but it was definitely an improvement.
I still find it strange that it was taking so long to render those pages when the server is totally idle (I don't have APC installed on my development PC and it doesn't behave like that). And it's still unclear where the remaining seconds are being wasted.