
Our Linode server's load has increased over the last month. It runs CentOS 7.5 with 16 GB of RAM and 6 cores, and I have upgraded to MariaDB 10.3 with PHP 7.2. MariaDB runs on the same server and is using 5372.81 MB according to the apache2buddy Perl script. I'm using the default MaxRequestWorkers, which the script says is too high, but values within its suggested range make no difference. We recently put the entire site under HTTPS, though much of it was already HTTPS before the issue began. Where the server used to run at a load of around 1, it now averages 3-4 with spikes to 8+.

top - 12:39:15 up 10:26,  2 users,  load average: 3.27, 3.57, 4.08
Tasks: 181 total,   2 running, 117 sleeping,   0 stopped,   0 zombie
%Cpu(s):  9.3 us,  6.0 sy,  0.0 ni, 54.2 id,  4.2 wa,  0.0 hi,  0.9 si, 25.4 st
KiB Mem : 16419324 total,   462908 free,  7237284 used,  8719132 buff/cache
KiB Swap:   524284 total,   524284 free,        0 used.  8808064 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 3614 mysql     20   0   10.1g   5.2g  21372 S  45.2 33.5 259:48.57 mysqld
15613 wmnf_ad+  20   0  842992 178644  95772 S  20.9  1.1   0:03.67 httpd
15650 wmnf_ad+  20   0  792196 131184  97048 S  16.6  0.8   0:10.31 httpd
15636 wmnf_ad+  20   0  837444 179280 101916 R  15.9  1.1   0:15.06 httpd
15634 wmnf_ad+  20   0  870480 136836 100236 S   7.0  0.8   0:16.07 httpd
15632 wmnf_ad+  20   0  794060 125052  89772 S   5.6  0.8   0:12.00 httpd
 1937 root      20   0       0      0      0 D   2.3  0.0   7:03.28 jbd2/sda-8
    1 root      20   0  191432   5732   3856 S   1.0  0.0   1:01.34 systemd
15654 wmnf_ad+  20   0  795988 123584  88628 S   1.0  0.8   0:05.54 httpd
    8 root      20   0       0      0      0 I   0.7  0.0   3:45.27 rcu_sched
   34 root      20   0       0      0      0 S   0.3  0.0   1:13.51 ksoftirqd/5
 3207 root      20   0  492880  15488  12152 S   0.3  0.1   0:17.17 NetworkManager
15254 root      20   0  161992   4632   3856 R   0.3  0.0   0:03.46 top
15604 wmnf_ad+  20   0  799320 141524 104940 S   0.3  0.9   0:17.32 httpd
15628 wmnf_ad+  20   0  794284 128708  95872 S   0.3  0.8   0:17.75 httpd
15951 wmnf_ad+  20   0  796216 124368  89456 S   0.3  0.8   0:09.05 httpd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.12 kthreadd
    4 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/0:0H

I roughly doubled these values in the httpd conf after Linode upgraded us from 8 GB to 16 GB:

StartServers        4
MinSpareServers     20
MaxSpareServers     40
MaxClients          200
MaxRequestsPerChild 4500

Apache memory usage looks like this after about an hour since last restart:

[root@archives conf.d]# ps -ylC httpd | awk '{x += $8;y += 1} END {print "Apache Memory Usage (MB): "x/1024; print "Average Process Size (MB): "x/((y- 1)*1024)}'
Apache Memory Usage (MB): 6136.5
Average Process Size (MB): 109.58
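A rough way to turn that average process size into a prefork ceiling is to reserve RAM for MySQL and the OS and divide the remainder by the per-process size. This is only a sketch using the figures above; the 1 GB OS reserve is an assumption, not a measured value:

```shell
# Prefork sizing sketch: figures taken from this server's output above;
# the OS/page-cache reserve is an assumed round number.
total_mb=16034        # ~16 GB total RAM
mysql_mb=5373         # mysqld footprint per apache2buddy
os_reserve_mb=1024    # headroom for the kernel, page cache, etc. (assumed)
avg_httpd_mb=110      # average httpd process size from the ps/awk one-liner
max_workers=$(( (total_mb - mysql_mb - os_reserve_mb) / avg_httpd_mb ))
echo "Suggested MaxRequestWorkers ceiling: $max_workers"
```

Anything much above that number risks pushing the box into swap once every worker is busy.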

Looking at the Apache server status page, it does not look like there are a huge amount of requests and it's certainly not using all the resources I've allowed:

Server Version: Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips mod_fcgid/2.3.9 PHP/7.2.6
Server MPM: prefork
Server Built: Apr 20 2018 18:10:38
Current Time: Monday, 28-May-2018 12:27:12 EDT
Restart Time: Monday, 28-May-2018 12:26:14 EDT
Parent Server Config. Generation: 1
Parent Server MPM Generation: 0
Server uptime: 57 seconds
Server load: 4.54 4.76 4.67
Total accesses: 324 - Total Traffic: 127.9 MB
CPU Usage: u47.7 s18.95 cu0 cs0 - 117% CPU load
5.68 requests/sec - 2.2 MB/second - 404.3 kB/request
30 requests currently being processed, 30 idle workers
WKWR__W._W__WWW_RWK_W______RK._.K_R_K_R___K_RW__.R_..___W__KWRRW
.W_.............................................................
................................................................
........

I used pmap on a couple of the processes and noticed a lot of loaded modules, of which I probably only need a few. Do all of these load by default? I did install php7, fcgid, status, and a few others myself...

[root@archives conf.d]# httpd -M
Loaded Modules:
 core_module (static)
 so_module (static)
 http_module (static)
 access_compat_module (shared)
 actions_module (shared)
 alias_module (shared)
 allowmethods_module (shared)
 auth_basic_module (shared)
 auth_digest_module (shared)
 authn_anon_module (shared)
 authn_core_module (shared)
 authn_dbd_module (shared)
 authn_dbm_module (shared)
 authn_file_module (shared)
 authn_socache_module (shared)
 authz_core_module (shared)
 authz_dbd_module (shared)
 authz_dbm_module (shared)
 authz_groupfile_module (shared)
 authz_host_module (shared)
 authz_owner_module (shared)
 authz_user_module (shared)
 autoindex_module (shared)
 cache_module (shared)
 cache_disk_module (shared)
 data_module (shared)
 dbd_module (shared)
 deflate_module (shared)
 dir_module (shared)
 dumpio_module (shared)
 echo_module (shared)
 env_module (shared)
 expires_module (shared)
 ext_filter_module (shared)
 filter_module (shared)
 headers_module (shared)
 include_module (shared)
 info_module (shared)
 log_config_module (shared)
 logio_module (shared)
 mime_magic_module (shared)
 mime_module (shared)
 negotiation_module (shared)
 remoteip_module (shared)
 reqtimeout_module (shared)
 rewrite_module (shared)
 setenvif_module (shared)
 slotmem_plain_module (shared)
 slotmem_shm_module (shared)
 socache_dbm_module (shared)
 socache_memcache_module (shared)
 socache_shmcb_module (shared)
 status_module (shared)
 substitute_module (shared)
 suexec_module (shared)
 unique_id_module (shared)
 unixd_module (shared)
 userdir_module (shared)
 version_module (shared)
 vhost_alias_module (shared)
 dav_module (shared)
 dav_fs_module (shared)
 dav_lock_module (shared)
 lua_module (shared)
 mpm_prefork_module (shared)
 proxy_module (shared)
 lbmethod_bybusyness_module (shared)
 lbmethod_byrequests_module (shared)
 lbmethod_bytraffic_module (shared)
 proxy_ajp_module (shared)
 proxy_balancer_module (shared)
 proxy_connect_module (shared)
 proxy_express_module (shared)
 proxy_fcgi_module (shared)
 proxy_fdpass_module (shared)
 proxy_ftp_module (shared)
 proxy_http_module (shared)
 proxy_scgi_module (shared)
 proxy_wstunnel_module (shared)
 ssl_module (shared)
 systemd_module (shared)
 cgi_module (shared)
 fcgid_module (shared)
 php7_module (shared)

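On CentOS 7 those modules are not compiled in; each shared one is pulled in by a LoadModule line under /etc/httpd/conf.modules.d/ (most via the stock 00-base.conf). Commenting out a line disables the module. A sketch, demonstrated on a scratch copy rather than the live config:

```shell
# Sketch: disable an unneeded module by commenting out its LoadModule line.
# Shown on a temp file; the real files live in /etc/httpd/conf.modules.d/.
conf=$(mktemp)
echo 'LoadModule dav_module modules/mod_dav.so' > "$conf"
sed -i 's|^LoadModule dav_module|#&|' "$conf"
grep '^#' "$conf"    # the module line is now commented out
# On the live system, validate and reload afterwards:
#   httpd -t && systemctl reload httpd
```

Trimming modules mostly shrinks per-process memory; it is unlikely to explain a load jump on its own.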
I also checked the PHP modules and found apc loaded, which shouldn't be needed with the newer opcache running; it's probably a leftover from earlier versions. All in all, nothing has made a difference. Is there more I can do, or how can I determine the cause of the higher load? The load does subside when Apache is stopped.

Running iotop, I see the [jbd2/sda-8] process sitting at the top constantly, between 10-60% IO. Since this is a journaling-related kernel process, could there be an underlying issue with the disk? Perhaps a disk check in single-user mode is needed?

Total DISK READ :       0.00 B/s | Total DISK WRITE :     181.41 K/s
Actual DISK READ:       0.00 B/s | Actual DISK WRITE:     272.12 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
 1937 be/3 root        0.00 B/s    0.00 B/s  0.00 % 27.02 % [jbd2/sda-8]
 3648 be/4 mysql       0.00 B/s  151.18 K/s  0.00 %  4.66 % mysqld
 3645 be/4 mysql       0.00 B/s    0.00 B/s  0.00 %  1.40 % mysqld
19502 be/4 mysql       0.00 B/s   18.14 K/s  0.00 %  0.53 % mysqld
    1 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % init
    2 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kthreadd]
    4 be/0 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kworker/0:0H]
    6 be/0 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [mm_percpu_wq]
    7 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [ksoftirqd/0]
    8 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [rcu_sched]
<snip>
rwfitzy
  • Why are you looking at Apache instead of MySQL? – Gerard H. Pille May 28 '18 at 17:34
  • MySQL doesn't seem to be overworked and responds quickly. Some more digging with iotop shows the jbd2/sda-8 process pinned at the top, between 10-60%; that can't be right? I'll edit to show. – rwfitzy May 28 '18 at 17:49
  • Have you considered that maybe your application just needs more resources and/or optimization? You can't config-tweak your way around genuine increases in traffic/load. – Sammitch May 28 '18 at 18:06
  • Yeah, we have ticked up according to the stats and recently implemented site-wide HTTPS. However, the numbers in the server stats don't show a dramatic change in activity. This came on rather suddenly last week, could be after the Linode upgrade, and may be related to the iotop issue. I was just looking for ways to lower the server load if higher traffic is the cause. It doesn't seem like Apache and MySQL are using all the resources available. – rwfitzy May 28 '18 at 18:15
  • MySQL is the first line in the top output and is the only user process writing. What is it writing? – Gerard H. Pille May 28 '18 at 19:19

2 Answers


I also think MySQL is causing the load. The jbd2 process is a kernel thread that updates the filesystem journal as you surmised. Looks like MySQL is writing to disk heavily and that is causing the load on jbd2.

MySQL sometimes needs to create temporary tables to process queries, notably queries with a GROUP BY clause. That could explain your load if those temp tables are being created on disk. This command shows how many temporary tables have been created on disk and in memory: SHOW GLOBAL STATUS LIKE 'Created_tmp%tables';

Also from that link, these are the two reasons why MySQL creates a temp table on disk rather than in memory:

  • The result is bigger than the smaller one of the MySQL variables max_heap_table_size and tmp_table_size.

  • The result contains columns of type BLOB or TEXT.
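Once you have the two counters, the ratio of on-disk to total temp tables tells you how bad the spill is. A sketch of that arithmetic; the on-disk figure is the one reported later in this thread, while the total is a purely illustrative placeholder:

```shell
# Illustrative ratio check. Created_tmp_disk_tables is the figure reported
# in the comments below; Created_tmp_tables is an assumed placeholder.
created_tmp_disk_tables=301506
created_tmp_tables=350000   # assumed, for illustration only
pct=$(( created_tmp_disk_tables * 100 / created_tmp_tables ))
echo "temp tables spilled to disk: ${pct}%"
```

On a live server the real counters come from SHOW GLOBAL STATUS LIKE 'Created_tmp%tables'; anything more than a few percent on disk is usually worth chasing.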

Jose Quinteiro
  • You're quite right. He's got a bad database design and mysql can't keep up with his queries. – Gerard H. Pille May 28 '18 at 19:19
  • Yes, this is the culprit I guess, thanks. Created_tmp_disk_tables shows 301506. This is a WordPress site with no real changes to the design over time, but I guess it perhaps got worse gradually. I've now read about optimizing MariaDB, and mysqltuner says a few things need adjusting: query_cache_size, join_buffer_size, tmp_table_size, and max_heap_table_size should be increased. I've tried this a bit on the dev server here. Should following the tuner script's suggestions and raising these variables help? – rwfitzy May 28 '18 at 19:26
  • I believe so, yes. I'm guessing what happened is there's some query result that's been getting bigger gradually due to increasing comments on a post or increasing number of posts on your site. At some point you crossed over the threshold that triggered writing the temp table(s) for this query to disk, and that caused the sudden increase in load. – Jose Quinteiro May 28 '18 at 19:42
  • The tuner script keeps suggesting higher values, wanting 16M or more for query_cache_size and join_buffer_size, but I've read that you shouldn't raise join_buffer_size that high globally, as it costs too much to allocate on every query. And I see advice about raising these even higher. I know it depends on available resources, but will these higher sizes actually cause more load issues? – rwfitzy May 28 '18 at 19:43
  • 16M doesn't strike me as a lot on a system with 16GB of RAM. Your top(1) output shows that about half the memory is in use. The thing to watch for is any increase in swap use. Any use of swap will probably slow things down severely unless your swap is on some fast SSDs. – Jose Quinteiro May 28 '18 at 19:46
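For concreteness, the variables discussed in these comments might be sketched out in my.cnf like this. The values are illustrative placeholders, not recommendations; note that MariaDB caps in-memory temp tables at the smaller of tmp_table_size and max_heap_table_size, so the two should be raised together:

```
# /etc/my.cnf.d/server.cnf -- illustrative values only, tune to your workload
[mysqld]
tmp_table_size      = 64M
max_heap_table_size = 64M   # effective limit is the smaller of the two
join_buffer_size    = 1M    # allocated per join operation; keep the global value modest
query_cache_size    = 16M
```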

Yes, the mysqld process was the issue, but only because it was waiting on the host of this virtual server. The high 'st' (steal time) value in my top output was an indicator that the host was very busy with other virtual machines...

top - 12:39:15 up 10:26,  2 users,  load average: 3.27, 3.57, 4.08
Tasks: 181 total,   2 running, 117 sleeping,   0 stopped,   0 zombie
%Cpu(s):  9.3 us,  6.0 sy,  0.0 ni, 54.2 id,  4.2 wa,  0.0 hi,  0.9 si, 25.4 st
                                                                        ^^^^^^^
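For reference, steal time can also be watched without top. A quick Linux-only sketch reading the cumulative counters from /proc/stat:

```shell
# Field 9 of the "cpu" line in /proc/stat is cumulative steal-time ticks:
# CPU time the hypervisor gave to other guests while this VM wanted to run.
read -r _ user nice sys idle iowait irq softirq steal _ < /proc/stat
total=$(( user + nice + sys + idle + iowait + irq + softirq + steal ))
echo "cumulative steal: $steal of $total ticks"
```

Sampling it twice a few seconds apart and comparing gives the same percentage top shows; a persistently high share is a noisy-neighbor signal you can take to the provider.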

After pointing this out to the hosting provider, our VM was migrated to a new host. Problem solved; the load is now very low, as usual.

rwfitzy