I have a Grails web application (just a standard WAR file) deployed on an Ubuntu 10.10 server running Tomcat 6. My database is PostgreSQL.

The problem is that every so often (once or twice a day, after a period of inactivity) the web application freezes when I try to log in. I can navigate to the login page, but when I submit the login (the first time the DB is hit, which might be a clue) the application hangs indefinitely. There is no 500 response code; the browser just waits and waits.

I followed the instructions detailed here

because the problem described sounded the same as mine. My GC logging showed no long-running GC pauses; all were under a second.

When the application freezes, the jmap heap output is:

using parallel threads in the new generation.
using thread-local object allocation.
Concurrent Mark-Sweep GC

Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize      = 536870912 (512.0MB)
   NewSize          = 21757952 (20.75MB)
   MaxNewSize       = 87228416 (83.1875MB)
   OldSize          = 65404928 (62.375MB)
   NewRatio         = 7
   SurvivorRatio    = 8
   PermSize         = 21757952 (20.75MB)
   MaxPermSize      = 85983232 (82.0MB)

Heap Usage:
New Generation (Eden + 1 Survivor Space):
   capacity = 19595264 (18.6875MB)
   used     = 11411976 (10.883308410644531MB)
   free     = 8183288 (7.804191589355469MB)
   58.23843965562291% used
Eden Space:
   capacity = 17432576 (16.625MB)
   used     = 9249296 (8.820816040039062MB)
   free     = 8183280 (7.8041839599609375MB)
   53.05754009046053% used
From Space:
   capacity = 2162688 (2.0625MB)
   used     = 2162680 (2.0624923706054688MB)
   free     = 8 (7.62939453125E-6MB)
   99.99963008996212% used
To Space:
   capacity = 2162688 (2.0625MB)
   used     = 0 (0.0MB)
   free     = 2162688 (2.0625MB)
   0.0% used
concurrent mark-sweep generation:
   capacity = 101556224 (96.8515625MB)
   used     = 83906080 (80.01907348632812MB)
   free     = 17650144 (16.832489013671875MB)
   82.62032270912317% used
Perm Generation:
   capacity = 85983232 (82.0MB)
   used     = 62866832 (59.95448303222656MB)
   free     = 23116400 (22.045516967773438MB)
   73.1152232100324% used

Anyone know what "From Space:" is?

Any ideas for further fault finding? I don't have much experience with this type of troubleshooting.

tinny

1 Answer

Your delay sounds way too long to be GC-related. I would add some instrumentation code to the login page and measure things like database and page response times. Then reproduce the problem manually or with a load-testing tool like The Grinder.
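As a rough idea of what that instrumentation could look like, here is a minimal sketch in plain Java (which can be called from Grails). The names `TimedCall`, `time` and `simulateSlowLookup` are made up for illustration and are not part of Grails or Tomcat; the idea is only to wrap whatever the login action does (for example the first database lookup) and log how long it took, including when it fails.

```java
import java.util.function.Supplier;

// Hypothetical helper: wraps a unit of work and logs its wall-clock duration.
class TimedCall {

    // Value returned by the work plus the elapsed time in milliseconds.
    static final class Result<T> {
        final T value;
        final long elapsedMillis;

        Result(T value, long elapsedMillis) {
            this.value = value;
            this.elapsedMillis = elapsedMillis;
        }
    }

    private static long millisSince(long startNanos) {
        return (System.nanoTime() - startNanos) / 1_000_000;
    }

    // Runs the work, logs the duration to stderr (even if the work throws),
    // and returns the value together with the measured time.
    static <T> Result<T> time(String label, Supplier<T> work) {
        long start = System.nanoTime();
        boolean ok = false;
        try {
            T value = work.get();
            ok = true;
            return new Result<>(value, millisSince(start));
        } finally {
            System.err.printf("[timing] %s %s after %d ms%n",
                    label, ok ? "finished" : "failed", millisSince(start));
        }
    }

    // Stand-in for a real database lookup; it just sleeps briefly.
    static String simulateSlowLookup() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "user-record";
    }

    public static void main(String[] args) {
        Result<String> r = time("login db lookup", TimedCall::simulateSlowLookup);
        System.out.println("value=" + r.value + ", elapsed=" + r.elapsedMillis + " ms");
    }
}
```

If the "finished" line never shows up in the log during a freeze, the request is stuck inside the wrapped call; if nothing is logged at all, the request probably never reached the application.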

Also, what are you running all of this on? Dedicated hardware or a VM?

HTH!

Tom Purl

  • Running on a Rackspace VM – tinny May 07 '11 at 02:47
  • OK, sometimes VMs have a tendency to freeze, especially when they're using a lot of RAM or CPU. The next thing I would do is find out how much RAM your Tomcat process is using. I know that your max heap size is 512MB, but the Tomcat process may be using more than that. Also, I would look on this forum for help with Rackspace VMs freezing. Good luck! – Tom Purl May 11 '11 at 12:47
  • I'm now starting to think this is actually a problem with the Apache reverse proxy that sits in front of my Tomcat instance. I can reproduce the problem when browsing the site through the Apache reverse proxy (on port 80), but I haven't managed to reproduce it when browsing directly to Tomcat (on port 8080). So the problem seems to be in the Apache area. – tinny May 15 '11 at 10:49
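
One way to make the port 80 versus port 8080 comparison repeatable is a small probe that requests both with a timeout, so a hang is reported instead of waiting forever. This is only a sketch: the class name `ProxyProbe`, the default host `localhost`, the path `/` and the 10-second timeout are assumptions, not details from the setup above, and a real check would need to hit the login action itself.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.SocketTimeoutException;
import java.net.URI;

// Hypothetical probe: requests a URL with connect/read timeouts and
// reports the status code, a timeout, or another I/O error.
class ProxyProbe {

    private static long millisSince(long startNanos) {
        return (System.nanoTime() - startNanos) / 1_000_000;
    }

    static String probe(String url, int timeoutMillis) {
        long start = System.nanoTime();
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(timeoutMillis); // give up if no connection is made
            conn.setReadTimeout(timeoutMillis);    // give up if no response arrives
            int status = conn.getResponseCode();
            return "HTTP " + status + " after " + millisSince(start) + " ms";
        } catch (SocketTimeoutException e) {
            return "TIMEOUT after " + millisSince(start) + " ms";
        } catch (IOException e) {
            return "ERROR (" + e.getClass().getSimpleName() + ") after "
                    + millisSince(start) + " ms";
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        // Assumed host; pass the real server name as the first argument.
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println("via Apache (80):  " + probe("http://" + host + ":80/", 10_000));
        System.out.println("direct (8080):    " + probe("http://" + host + ":8080/", 10_000));
    }
}
```

Run from cron or a loop after a period of inactivity, a TIMEOUT on port 80 alongside a normal response on port 8080 would support the reverse-proxy theory.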