
I'm new to Solr and I'm load testing our setup to see what it can handle. I'm using SolrMeter, and my problem is a bit odd:

  • When I set SolrMeter to run 8000 queries/min, Tomcat handles a few hundred queries and then stops responding to requests completely (even though, according to lsof -i, it is still listening and the java process is still running).
  • When I set SolrMeter to run 1000 queries/min, it runs fine. I can stop SolrMeter after a few minutes and then run 8000/min without issue.

It's as if it needs a ramp-up time. I also noticed that, regardless of ramp-up, my setup cannot handle 12000/min: the reaction at 12k/min is the same as running 8k/min without the ramp-up. Of note, only the shard that SolrMeter is pointed at stops responding. The other shard hums along without incident.

Setup (everything in AWS):

  • 2x m1.large (7.5 GB RAM) running tomcat7 + Solr 4.2.0 (open-jdk-7-headless) : Ubuntu 12.04
  • 1x m1.micro running ZooKeeper 3.4.5 : Ubuntu 12.04

The vast majority of my Solr/tomcat7 config is the default from Ubuntu's packages and Solr's example. Here are the configs and the end of the catalina.out file: https://gist.github.com/anonymous/ef8fa79ecc1673d11bc0

I redirected the SolrMeter console (stderr and stdout) to a file. It's a large log (67 MB): https://docs.google.com/file/d/0BwPYmFCfmBYsU1hDWjlkUGdGTlU/edit?usp=sharing

My main question is twofold:

  1. Is it normal for Tomcat to stop responding completely when it gets overwhelmed? Is restarting it the only option?
  2. Why does it cope better when I start with a lower query rate and then ramp up? I'm concerned that if I have to restart a server in the cluster and it gets thrown straight back into the pool of machines, things will blow up.
natefox

2 Answers


When Tomcat is frozen, take 2-3 thread dumps, 2-3 minutes apart. Comparing them shows which threads are stuck and what Tomcat is doing.

You can use jvisualvm, which comes with the JDK, to check whether heap usage is high and to obtain the thread dumps.

Mircea Vutcovici

The solution to this ended up being maxThreads on the Connector. Raising it from the default (200) to a much higher number (10000) allowed Solr (and Tomcat) to handle the 'instant' load much better.
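
For anyone looking for where this goes, here is a rough sketch of the Connector element in Tomcat's server.xml with maxThreads raised. Only maxThreads="10000" comes from the fix described above; the port, protocol, connectionTimeout and redirectPort values are assumed stock values used for illustration and may differ from the configs in the gist.

```xml
<!-- Sketch of an HTTP Connector in Tomcat's server.xml.
     Only maxThreads reflects the change described in this answer;
     the other attribute values are assumed defaults, not the original config. -->
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443"
           maxThreads="10000" />
```

Tomcat reads server.xml at startup, so it needs a restart for the new value to take effect. If the Connector references a shared Executor, the thread limit is set on that Executor element instead.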

natefox