I'm running performance tests on a web application hosted on a GlassFish cluster.

Each cluster instance is hosted in a separate Solaris 10 zone, and the HTTP traffic is load balanced between the instances by an F5 BIG-IP load balancer. The problem I'm facing is that SOAP requests are periodically aborted by TCP connection resets.

Now I need to figure out why the connections are closed and whether there is anything I can do to prevent it. I've used tcpdump to monitor the traffic between the load generator and the load balancer. I can see that the TCP connection is established and the SOAP request is sent, and that the load balancer then sends an ACK. 4-5 seconds later I receive a TCP segment from the load balancer with the RST and ACK flags set.

However, I can't monitor the traffic between the load balancer and the cluster instances, so I can't see what happens on the cluster. This is because tcpdump can't listen on the virtual network interfaces in the zones; at least, I haven't found out how to do it.

So I hope there is a way to use DTrace to monitor what's going on in the cluster instances when the connections are reset. I'm guessing some resource runs out, such as a TCP connection queue (I'm not sure about the terminology).

Do you have a working example of a DTrace script that shows why the connections are reset?

I've looked at https://blogs.oracle.com/hkchu/entry/diagnose_networking_problems_on_solaris, but the DTrace script provided on that page does not compile on my Solaris server.
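As a possible starting point, here is a minimal, untested sketch that relies only on the `mib` provider, which Solaris 10 does ship. It counts listen-queue drops (`tcpListenDrop`, `tcpListenDropQ0`) and outgoing resets (`tcpOutRsts`), keyed by the kernel stack that triggered them. The choice of probes, the 10-second interval, and the stack depth of 8 are my own assumptions, not anything taken from the linked page. The probes are kernel-wide, so the script most likely has to be run from the global zone, and the counts will cover all zones on the host.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

/*
 * Count TCP listen-queue drops and outgoing RST segments.
 * Assumption: these three mib probes are the relevant ones here;
 * they fire for every zone on the host, not just one instance.
 */
mib:::tcpListenDrop,
mib:::tcpListenDropQ0,
mib:::tcpOutRsts
{
	/* Key by probe name and a short kernel stack to see the code path. */
	@events[probename, stack(8)] = count();
}

/* Every 10 seconds (arbitrary interval), print the counts and reset them. */
profile:::tick-10sec
{
	printf("%Y\n", walltimestamp);
	printa(@events);
	trunc(@events);
}
```

If none of these probes fire on the cluster hosts while the load generator sees resets, that would suggest the RST originates somewhere other than the TCP stack of the cluster instances.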

Ola Mattsson
  • You might want to switch to Solaris 11 where the loopback interfaces can be snooped and where the dtrace scripts you mention might work. – jlliagre Feb 21 '13 at 12:40
  • Unfortunately I'm stuck with Solaris 10, I need to test on the same OS as we use in production. – Ola Mattsson Feb 21 '13 at 16:43

0 Answers