I am running an Apache2 Ubuntu server that serves API requests from mobile apps.
Recently I have been hitting a bottleneck: as the number of concurrent users grows, responses get slower. Previously ~1-2 seconds was typical, but once concurrent users increase (at certain peak hours) a request can take 10 seconds or more, even though server load, CPU, and memory remain very low.
My target is to increase the capacity of Apache2 and the Ubuntu server so they can serve as many concurrent users as possible at the lowest response time. Memory and CPU are not major considerations, as the VPS specification can be scaled up if it hits a limit. How can I do that?
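To quantify the slowdown, I can reproduce it with an ApacheBench run along these lines (the URL, request count, and concurrency level below are placeholders, not my real endpoint):

    # simulate 200 concurrent clients sending 5000 requests in total
    ab -n 5000 -c 200 https://api.example.com/v1/endpoint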
What I have done so far:
Configured ulimits in /etc/security/limits.conf by adding the following:
    * soft nofile 40000
    * hard nofile 40000
    * soft nproc 40000
Added the following line to /etc/pam.d/common-session:
session required pam_limits.so
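To confirm the new limits actually reach Apache after a restart, I check the running parent process roughly like this (pgrep and the /proc path assume the Ubuntu defaults, where the binary is named apache2; on other distributions it may be httpd):

    # effective open-file limit of the running Apache parent process
    grep "open files" /proc/$(pgrep -o apache2)/limits
    # limits of a fresh login shell, to confirm pam_limits is applied
    ulimit -n
    ulimit -u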
Configured Apache2 in /etc/apache2/mods-enabled/mpm_prefork.conf:
    <IfModule mpm_prefork_module>
        StartServers            20
        MinSpareServers         25
        MaxSpareServers         100
        MaxRequestWorkers       150
        MaxConnectionsPerChild  0
        MaxClients              8192
        MaxRequestsPerChild     0
        ServerLimit             8192
    </IfModule>
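After editing, I verify which MPM is actually loaded and that the configuration parses, roughly like this (apache2ctl is the Ubuntu wrapper; elsewhere the command is typically apachectl):

    # confirm prefork is the active MPM
    apache2ctl -V | grep -i mpm
    # syntax-check the configuration before restarting
    apache2ctl configtest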
Added the following line to /etc/sysctl.conf:
fs.file-max = 2097152
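Then I reload and verify the kernel setting (sysctl -p reads /etc/sysctl.conf by default):

    # apply the new value without a reboot and confirm it took effect
    sudo sysctl -p
    sysctl fs.file-max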
The response time seems to have improved (though some requests are still delayed), but it is not satisfactory, as it is still much slower than during non-peak hours.
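For reference, during peak hours I watch the worker pool with mod_status output roughly like this (this assumes the status module is enabled and /server-status is reachable from localhost, which may not match every setup):

    # snapshot of busy vs. idle workers from mod_status
    curl -s http://localhost/server-status?auto | grep -E 'BusyWorkers|IdleWorkers'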