
I am tuning the production servers for a portal. We have 4 servers, 2 for web and 2 for app, with a firewall in front of and behind the web servers (so yes, there is a firewall between the app and web servers). The problem started with the firewall dropping idle connections between the app servers and the web servers. I tried a lot of solutions, and the issue now seems to have moved. Originally, broken connections got stuck on the app side because of the firewall drops; that happened when load on the portal was low, and I had to restart all app servers to fix it. Now I have problems on high-load days instead, and the urgent fix was simply a quick restart of the Apache web servers. How can I solve this?

I made the changes with the help of the JBoss load balancing configuration generator: http://lbconfig.appspot.com/?lb=mod_jk&mjv=1.2.28&nca=64&ncj=64&nai=2&nji=2&njips=6&f=true&c=false&lr=false&lrl=&mpm=Prefork

I am monitoring connections on both tiers with the netstat command and with the Google Analytics Real-Time overview. With ~40 visitors, 3 days after the last restart, I got the following stats:

Web side (2 servers; the counts here are per server, not totals):

ESTABLISHED: ~700-750
TIME_WAIT: 100-200 (big jumps from one second to the next: 150, then 200, then 170, then 120, and so on)

App side (here I counted all connections; most are ESTABLISHED, with 0-5 in CLOSE_WAIT each time I check):

S1 (4 instances running) : 900-950
S2 (5 instances running) : 1000-1100
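
The per-state counts above can be collected in one pass instead of by eye. This is a minimal sketch, assuming the connection state is the last column of each `netstat -an` line (true for the usual Solaris and Linux TCP output) and that the state names in `TCP_STATES` cover what the host prints; the function name and the sample addresses are made up for illustration.

```python
from collections import Counter

# Assumed set of state names; Solaris and Linux spell a few of them differently.
TCP_STATES = {
    "ESTABLISHED", "TIME_WAIT", "CLOSE_WAIT", "LAST_ACK", "CLOSING", "LISTEN",
    "SYN_SENT", "SYN_RCVD", "SYN_RECV",
    "FIN_WAIT_1", "FIN_WAIT_2", "FIN_WAIT1", "FIN_WAIT2",
}


def count_states(netstat_output):
    """Count TCP connections per state from the text of `netstat -an`."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Header and non-TCP lines do not end in a known state and are skipped.
        if fields and fields[-1] in TCP_STATES:
            counts[fields[-1]] += 1
    return counts


# Hypothetical sample in the Solaris column layout (addresses are invented).
SAMPLE = """\
   Local Address        Remote Address    Swind Send-Q Rwind Recv-Q    State
10.0.0.1.8009        10.0.0.2.51234       49640      0 49640      0 ESTABLISHED
10.0.0.1.8009        10.0.0.2.51235       49640      0 49640      0 ESTABLISHED
10.0.0.1.8009        10.0.0.2.51236       49640      0 49640      0 TIME_WAIT
10.0.0.1.8009        10.0.0.2.51237       49640      0 49640      0 CLOSE_WAIT
"""

for state, total in sorted(count_states(SAMPLE).items()):
    print(state, total)
```

Running it from cron every minute and logging the totals would show whether ESTABLISHED keeps growing between restarts or levels off.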

Server details:

  • On the 2 web servers: Apache 2.2.14 / mod_jk 1.2.37
  • On the 2 app servers: clustered Glassfish 2.1.1 with ajp13 (6 instances on each server)
  • All servers: Solaris, SPARC, 64 vCPUs, 32 GB RAM.

My configuration is mostly what the generator gave me (see the link above):

httpd.conf:

KeepAlive On
ServerLimit         12800
StartServers        5
MinSpareServers     5
MaxSpareServers     20
MaxClients          12800
MaxRequestsPerChild 5000

ExtendedStatus Off

worker.properties:

worker.maintain=30
worker.template.type=ajp13
worker.template.session_cookie=JSESSIONID
worker.template.lbfactor=1
worker.template.ping_timeout=10000
worker.template.connection_pool_timeout=10
worker.template.socket_keepalive=True
worker.template.socket_timeout=600
worker.template.connect_timeout=10000
worker.template.prepost_timeout=10000
worker.template.connection_ping_interval=20
worker.template.ping_mode=A
worker.template.socket_connect_timeout=600000
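
The mod_jk timeout directives do not share a unit, which makes the list above hard to compare at a glance. The sketch below normalizes them to seconds. It assumes the units given in the mod_jk workers.properties reference (milliseconds for `ping_timeout`, `connect_timeout`, `prepost_timeout` and `socket_connect_timeout`; seconds for `connection_pool_timeout`, `socket_timeout`, `connection_ping_interval` and `worker.maintain`); the function name is hypothetical.

```python
# Units as assumed from the mod_jk workers.properties reference.
MILLISECOND_DIRECTIVES = {
    "ping_timeout", "connect_timeout", "prepost_timeout", "socket_connect_timeout",
}
SECOND_DIRECTIVES = {
    "connection_pool_timeout", "socket_timeout", "connection_ping_interval", "maintain",
}


def timeouts_in_seconds(properties_text):
    """Return {directive: seconds} for the timeout directives found in the text."""
    result = {}
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        directive = key.rsplit(".", 1)[-1]  # worker.template.ping_timeout -> ping_timeout
        if directive in MILLISECOND_DIRECTIVES:
            result[directive] = int(value) / 1000.0
        elif directive in SECOND_DIRECTIVES:
            result[directive] = float(int(value))
    return result


# The values posted in the question.
WORKER_PROPERTIES = """\
worker.maintain=30
worker.template.ping_timeout=10000
worker.template.connection_pool_timeout=10
worker.template.socket_timeout=600
worker.template.connect_timeout=10000
worker.template.prepost_timeout=10000
worker.template.connection_ping_interval=20
worker.template.socket_connect_timeout=600000
"""

for name, seconds in sorted(timeouts_in_seconds(WORKER_PROPERTIES).items()):
    print(f"{name:28s} {seconds:8.1f} s")
```

Read this way, `socket_connect_timeout=600000` comes to 600 seconds, far longer than the 10-second connect-phase limits next to it, and `connection_pool_timeout` (10 s) is shorter than `connection_ping_interval` (20 s).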

On the Glassfish side the time-outs are 10 seconds. In the cluster configuration I have:

HTTP service property :

  • connectionTimeout= 10000

Request Processing:

  • Thread Count: 2133
  • Initial Thread Count : 20
  • Thread Increment : 10

Keep Alive (enabled):

  • Thread Count: 400
  • Max Connections 256
  • Time out : 10 seconds

Connection Pool:

  • Max Pending Count 4096 connections
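
To put the numbers above side by side, here is a rough worst-case estimate of AJP connections. It is a sketch under two assumptions that should be checked against the real setup: with the prefork MPM each Apache child keeps at most one connection per mod_jk worker, and under sustained load every child ends up having talked to every Glassfish instance. The function name is hypothetical.

```python
def worst_case_ajp_connections(max_clients, web_servers, app_servers, instances_per_app_server):
    """Upper bounds on AJP connections, assuming one connection per child per worker."""
    workers = app_servers * instances_per_app_server
    # Every child on every web server may hold a connection to the same instance.
    per_glassfish_instance = max_clients * web_servers
    # Every child on one web server may hold a connection to every worker.
    per_web_server = max_clients * workers
    return per_glassfish_instance, per_web_server


# Values from the question: MaxClients 12800, 2 web servers, 2 app servers x 6 instances.
per_instance, per_web = worst_case_ajp_connections(12800, 2, 2, 6)
glassfish_thread_count = 2133

print("worst case per Glassfish instance:", per_instance)
print("worst case per web server:", per_web)
print("request threads per instance:", glassfish_thread_count)
```

Under these assumptions a single instance could be offered 25600 connections against a thread count of 2133, so MaxClients looks like the first value to size against what the app tier can actually serve.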

So:

  • Is my configuration correct?
  • How do I reduce the high number of established connections, or is that number safe? I don't want downtime for Apache again if the load gets high again.
Al-Mothafar
  • @ScottPack it's totally different! I have a configuration issue to solve after gaining some knowledge; I'm not asking how to do it! – Al-Mothafar Sep 13 '13 at 12:18
  • I don't see how either of the questions are answerable at the moment. You need to define what you mean by "correct" before anyone can determine if your configuration is correct. What *exactly* are you trying to achieve with this configuration? The same problem exists with the second question. We need specific numbers. *How* high is "high load"? How many concurrent requests? Your questions are too vague to be answered. Once you make them more specific, you will probably find you can answer them yourself. – Ladadadada Sep 16 '13 at 10:36

1 Answer

Regarding mod_jk / mod_ajp: we used this in a slightly bigger setup and stumbled upon bugs and errors every now and then, with connections getting dropped, but never found a real solution to any of our problems (though we did find some bugs that still exist).

My advice: build an alternate setup and run perf tests, mod_jk vs. proxy_http, and if proxy_http is within acceptable ranges, skip mod_jk. I have done this in 2 different setups now (and, additionally, was able to replace Apache with nginx -> BIG WIN) and do not regret it.
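
A perf test like the one suggested can start very small. The sketch below times a batch of sequential requests and reports median and 95th-percentile latency; it is not a load generator (a real comparison would also need concurrency), and the two URLs in the usage comment are placeholders, not real endpoints.

```python
import statistics
import time
import urllib.request


def fetch_url(url, timeout=10.0):
    """Fetch one URL and read the whole body; raises on network or HTTP errors."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()


def measure(action, count=100):
    """Call action() count times in sequence and summarize the latencies in seconds."""
    latencies = []
    errors = 0
    for _ in range(count):
        start = time.perf_counter()
        try:
            action()
        except Exception:
            errors += 1
            continue
        latencies.append(time.perf_counter() - start)
    if not latencies:
        return {"ok": 0, "errors": errors, "median": None, "p95": None}
    latencies.sort()
    p95_index = max(0, int(round(0.95 * len(latencies))) - 1)
    return {
        "ok": len(latencies),
        "errors": errors,
        "median": statistics.median(latencies),
        "p95": latencies[p95_index],
    }


# Usage (hypothetical hosts, one vhost per connector under test):
#   jk = measure(lambda: fetch_url("http://web-jk.example.test/portal/"))
#   http = measure(lambda: fetch_url("http://web-proxy.example.test/portal/"))
#   print(jk, http)
```

Comparing error counts matters as much as latency here, since dropped connections were the original symptom.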

pros

  • easier to debug
  • more variety of possible lb/frontend gateways (haproxy, nginx, varnish)
  • fewer heisenbugs

cons

  • haven't found any