We have an application running on WebSphere (unfortunately it is 6.1 which is no longer supported, it has not yet been migrated in production to a later version) which becomes entirely unresponsive because of hung threads.
As far as I can tell we entirely exhaust one of the thread pools.
I have activated hung thread detection and I get a core/thread dump when hung threads are detected.
The server can run for several days without problems but has crashed twice this week.
When load the core/thread dump in "IBM Thread and Monitor Dump Analyzer for Java", it tells me that there are a certain number of hung threads (this time it was 2, last time 11) and multiple (usually around 40) threads "waiting on condition" and some running threads. I believe one of the thread pool has around that size (50).
Now what I see in there are threads waiting for locks, having locks or in wait.
Most of them show a stack track which always ends like this:
at java/lang/Object.wait(Native Method) at java/lang/Object.wait(Object.java:231)
Now, how can I track this down to either a server configuration problem, application issue, WebSphere problem or something else? How is this supposed to help me track down the problem when almost everything in there refers to IBM code?
I cannot ask IBM's help as 6.1 is now an unsupported version of WebSphere and while work has been done to make it work under WebSphere 7 we are not yet ready to switch to it in Production yet.
These are the stack trace of the threads which are blocked:
HAManager.thread.pool : 0
Blocked
Waiting for Monitor Lock on java/lang/Object@12648F78/12648F84
at java/lang/Object.wait(Native Method)
at java/lang/Object.wait(Object.java:231)
at com/ibm/ws/util/BoundedBuffer.waitGet_(BoundedBuffer.java:203)
at com/ibm/ws/util/BoundedBuffer.take(BoundedBuffer.java:564)
at com/ibm/ws/util/ThreadPool.getTask(ThreadPool.java:869)
at com/ibm/ws/util/ThreadPool$Worker.run(ThreadPool.java:1617)
WebContainer : 5
Blocked
Waiting for Monitor Lock on com/ibm/ejs/ras/TraceLogger@11265878/11265884
at com/ibm/ejs/ras/TraceLogger.doLog(TraceLogger.java:299)
at com/ibm/ejs/ras/TraceLogger.processEvent(TraceLogger.java:286)
at com/ibm/ws/logging/WsHandlerWrapper.publish(WsHandlerWrapper.java:43)
at java/util/logging/Logger.log(Logger.java:526)
at com/ibm/ejs/ras/Tr.logToJSR47Logger(Tr.java:1625)
at com/ibm/ejs/ras/Tr.fireEvent(Tr.java:1587)
at com/ibm/ejs/ras/Tr.fireTraceEvent(Tr.java:1509)
at com/ibm/ejs/ras/Tr.debug(Tr.java:610)
at com/ibm/ws/rsadapter/spi/WSRdbManagedConnectionImpl.inGlobalTransaction(WSRdbManagedConnectionImpl.java:1045)
at com/ibm/ws/rsadapter/jdbc/WSJdbcConnection.commit(WSJdbcConnection.java:961)
...
The threads "waiting on condition" are a mix of:
at java/lang/Thread.sleep(Native Method)
at java/lang/Thread.sleep(Thread.java:941)
....
and lots of
at java/lang/Object.wait(Native Method)
at java/lang/Object.wait(Object.java:231)
...
I have a pretty peculiar one, why is it waiting on this?:
at java/util/Calendar.setFieldsComputed(Calendar.java:1402)
at java/util/Calendar.complete(Calendar.java:1349)
at java/util/Calendar.get(Calendar.java:1120)
at java/text/SimpleDateFormat.subFormat(SimpleDateFormat.java:930)
at java/text/SimpleDateFormat.format(SimpleDateFormat.java:837)
at java/text/SimpleDateFormat.format(SimpleDateFormat.java:809)
at com/ibm/ws/logging/TraceLogFormatter.formatTimeAndThread(TraceLogFormatter.java:125)
at com/ibm/ws/logging/TraceLogFormatter.formatHeaderBasic(TraceLogFormatter.java:223)
at com/ibm/ws/logging/TraceLogFormatter.formatBasic(TraceLogFormatter.java:439)
at com/ibm/ejs/ras/TraceLogger.doLog(TraceLogger.java:295)
at com/ibm/ejs/ras/TraceLogger.processEvent(TraceLogger.java:286)
at com/ibm/ws/logging/WsHandlerWrapper.publish(WsHandlerWrapper.java:43)
...
I also have this:
WebContainer : 6
Owns Monitor Lock on com/ibm/ejs/ras/TraceLogger@11265878/11265884
at sun/io/CharToByteSingleByte.convert(CharToByteSingleByte.java:329)
at com/ibm/misc/IOConverter.convert(IOConverter.java:111)
at sun/io/CharToByteConverter.convertAny(CharToByteConverter.java:191)
at sun/nio/cs/StreamEncoder$ConverterSE.implWrite(StreamEncoder.java:259)
at sun/nio/cs/StreamEncoder.write(StreamEncoder.java:186)
at java/io/OutputStreamWriter.write(OutputStreamWriter.java:214)
at java/io/BufferedWriter.flushBuffer(BufferedWriter.java:132)
at java/io/PrintStream.write(PrintStream.java:492)
at java/io/PrintStream.print(PrintStream.java:636)
at java/io/PrintStream.println(PrintStream.java:773)
at com/ibm/ejs/ras/TraceLogger.doLog(TraceLogger.java:300)
at com/ibm/ejs/ras/TraceLogger.processEvent(TraceLogger.java:286)
at com/ibm/ws/logging/WsHandlerWrapper.publish(WsHandlerWrapper.java:43)
at java/util/logging/Logger.log(Logger.java:526)
at com/ibm/ejs/ras/Tr.logToJSR47Logger(Tr.java:1625)
at com/ibm/ejs/ras/Tr.fireEvent(Tr.java:1587)
at com/ibm/ejs/ras/Tr.fireTraceEvent(Tr.java:1509)
at com/ibm/ejs/ras/Tr.entry(Tr.java:776)
at com/ibm/ws/rsadapter/jdbc/WSJdbcConnection.prepareStatement(WSJdbcConnection.java:2056)
at com/ibm/ws/rsadapter/jdbc/WSJdbcConnection.prepareStatement(WSJdbcConnection.java:2039)
...
and finally this:
WebContainer : 8
at java/lang/StringBuffer.append(StringBuffer.java:239)
at com/ibm/ws/logging/TraceLogFormatter.formatTimeAndThread(TraceLogFormatter.java:137)
at com/ibm/ws/logging/TraceLogFormatter.formatHeaderBasic(TraceLogFormatter.java:223)
at com/ibm/ws/logging/TraceLogFormatter.formatBasic(TraceLogFormatter.java:439)
at com/ibm/ejs/ras/TraceLogger.doLog(TraceLogger.java:295)
at com/ibm/ejs/ras/TraceLogger.processEvent(TraceLogger.java:286)
at com/ibm/ws/logging/WsHandlerWrapper.publish(WsHandlerWrapper.java:43)
at java/util/logging/Logger.log(Logger.java:526)
at com/ibm/ejs/ras/Tr.logToJSR47Logger(Tr.java:1625)
at com/ibm/ejs/ras/Tr.fireEvent(Tr.java:1587)
at com/ibm/ejs/ras/Tr.fireTraceEvent(Tr.java:1509)
at com/ibm/ejs/ras/Tr.exit(Tr.java:796)
at com/ibm/ejs/j2c/PoolManager.processPoolRequestStats(PoolManager.java:6879)
at com/ibm/ejs/j2c/PoolManager.reserve(PoolManager.java:2026)
at com/ibm/ejs/j2c/ConnectionManager.allocateMCWrapper(ConnectionManager.java:940)
at com/ibm/ejs/j2c/ConnectionManager.allocateConnection(ConnectionManager.java:614)
at com/ibm/ws/rsadapter/jdbc/WSJdbcDataSource.getConnection(WSJdbcDataSource.java:449)
at com/ibm/ws/rsadapter/jdbc/WSJdbcDataSource.getConnection(WSJdbcDataSource.java:418)
...