
I am experiencing periodic unresponsiveness in Tomcat in our production environment. I cannot reproduce this in a test environment, and nothing appears in the logs prior to or during the event. Tomcat continues running, but stops servicing requests. I've read this thread and have placed garbage collection output options in JAVA_OPTS, though I have yet to restart Tomcat to put them into effect. My situation differs in that Tomcat and the JVM apparently never recover or "wake up". I've confirmed that our app was unresponsive for at least 15 minutes on multiple occasions. The solution is always to restart Tomcat (using daemontools). The frequency varies: it sometimes happens during peak load and sometimes in the middle of the night (very light load).

I've allowed up to 4g of memory for the jvm (-Xms2g -Xmx4g). The server has 16g of memory and is running the 64-bit jvm. Sun's whitepaper on Java tuning claims: "Committing too much of a system's physical memory is likely to result in paging of virtual memory to disk, quite likely during garbage collection operations, leading to significant performance issues." Am I setting the heap size too large? Would I benefit from setting the minimum size to be the same as the maximum?

I don't believe the system is swapping memory to disk. Output of free -m shows no swap usage, and I've set swappiness to 0 on the system.

When the unresponsiveness occurred at 2:30am this morning, I ran a quick jstat and ps prior to restarting tomcat:

jstat showed values similar to the current ones, with these exceptions (at the time of the hang vs. now): YGC 431 vs. 44, YGCT 10 vs. 1, FGC 59 vs. 7, FGCT 39 vs. 2, GCT 49 vs. 3.
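To put those counters in perspective, here is a minimal sketch that divides cumulative GC time by the number of collections, using the rounded figures quoted above for the time of the hang. The class and method names are made up for illustration, and the inputs are only as precise as the rounded values in the post.

```java
public class JstatAverages {
    // Average seconds per collection; returns 0 when no collections have run.
    public static double averagePause(double totalSeconds, long collections) {
        return collections == 0 ? 0.0 : totalSeconds / collections;
    }

    public static void main(String[] args) {
        // YGCT / YGC and FGCT / FGC as reported at the time of the hang.
        System.out.println("young: " + averagePause(10, 431) + " s per collection");
        System.out.println("full:  " + averagePause(39, 59) + " s per collection");
    }
}
```

With these inputs the averages come out at roughly 0.02 s per young collection and 0.66 s per full collection. These are cumulative averages since JVM start, so they cannot show whether a single collection ran far longer than the rest. The GC log timestamps are needed for that.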

The output of ps showed 1422832 KB resident and 5723580 KB virtual memory usage. This compares with 1390036 KB and 5642668 KB from yesterday during normal operation.

I'm no expert in any of this, so any help would be appreciated.


UPDATE: Okay, I've added the following to JAVA_OPTS and will restart tomcat momentarily:

-XX:+UseConcMarkSweepGC -Xms2g -Xmx2g -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails

The changes are: 1) switch the GC algorithm; 2) lower the max heap size, as it seems I don't need 4g, and apparently overcommitting can cause periodic massive GC; 3) turn on verbose GC logging. Thanks all.

tangent

2 Answers


To start, here's a useful link: "Tuning Garbage Collection with the 5.0 Java(TM) Virtual Machine".

This does sound like GC pauses making tomcat unresponsive. One thing to start with is a "low pause" garbage collector with the option -XX:+UseConcMarkSweepGC.
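After changing the collector flag, it can help to confirm from inside the JVM which collectors are actually active and how much work they have done. The sketch below uses the standard java.lang.management API. The class name GcReport is hypothetical, and the collector names it prints depend on the JVM version and flags, so treat any particular name as an assumption.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcReport {
    // One line per active collector: name, cumulative collection count,
    // and cumulative collection time in milliseconds.
    public static List<String> describeCollectors() {
        List<String> lines = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeCollectors()) {
            System.out.println(line);
        }
    }
}
```

The counts and times are the same kind of cumulative totals that jstat reports, so calling this periodically (for example from a small status servlet) gives a rough in-process view of GC activity.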

JimB

We saw this in our production environment a few times, and it did end up being Java's garbage collection halting further requests. The biggest tell for us was 100% processor usage on at least one of the cores for the duration of the unresponsive period.

The answer in our case was to track down a memory leak in the application. I'm not certain this counts as an answer for you, but it's at least another data point.
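One rough way to look for a leak like this is to sample how much heap is still in use right after collections and check whether that figure keeps climbing. The sketch below is a heuristic under stated assumptions, not how we diagnosed our own case: the class name LeakCheck and both method names are hypothetical, and the "steadily growing" rule is deliberately simplistic.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

public class LeakCheck {
    // Sum of bytes still in use in each heap pool after that pool's most
    // recent collection. Pools are collected at different times, so the
    // total is an approximation.
    public static long heapUsedAfterLastGc() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage afterGc = pool.getCollectionUsage(); // null if unsupported
            if (pool.getType() == MemoryType.HEAP && afterGc != null) {
                total += afterGc.getUsed();
            }
        }
        return total;
    }

    // True if every sample is strictly larger than the one before it.
    public static boolean steadilyGrowing(long[] samples) {
        if (samples.length < 2) {
            return false;
        }
        for (int i = 1; i < samples.length; i++) {
            if (samples[i] <= samples[i - 1]) {
                return false;
            }
        }
        return true;
    }
}
```

Sampling heapUsedAfterLastGc() every few minutes and feeding the readings to steadilyGrowing() gives only a hint. A heap dump and a profiler are still needed to find what is actually holding the memory.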

Matt Simmons
I'm just realizing I haven't checked processor usage during one of these intervals in a long time (I'm generally kinda panicked). I'll be sure to do that next time. Based on the behavior, I would expect it to be at 100%. Thanks. – tangent Jun 17 '11 at 15:10