
We are running Lotus Domino Server 6.5 on one of our VMs. The web applications hosted on this server have been running for the past 10 years with no issues, but for the past few months the server has been crashing frequently. I used LND to analyze the crash details, and all the crashes have a similar crash stack.

The links below contain the NSD log file and screenshots of all the details provided by the LND analyzer. Any help is much appreciated.

Also, only 30 to 80 users use the applications, so I don't think the server is under any significant load.

NSD log File

Crash Screenshots

Fatal Stack:

############################################################
### FATAL THREAD 43/72 [   nHTTP:125c: 5332]
### FP=0x0d2dfe3c, PC=0x002a18af, SP=0x0d2dfe38, stksize=4
### EAX=0x00000001, EBX=0x00000000, ECX=0x00010000, EDX=0x0e219b34
### ESI=0x00000000, EDI=0x000000c8, CS=0x0000001b, SS=0x00000023
### DS=0x00000023, ES=0x00000023, FS=0x0000003b, GS=0x00000000 Flags=0x00010202
Exception code: c0000005 (ACCESS_VIOLATION)
############################################################
@[ 1] 0x002a18af nservlet._Remove+15 (0,3,d2dfe78,2a127a)
@[ 2] 0x002a1992 nservlet._ListMgr_RemoveNode+34 (3,a335430,55559ec,9b3c534)
@[ 3] 0x002a127a nservlet._ServletProcessRequest+106 (d2dfe90,40eb6c2c,0,3)
 [ 4] 0x100254ab nhttpstack (40eb6c2c,40eb6ac8,0,3)
 [ 5] 0x1000fea9 nhttpstack (2,4272ef40,1baa6760,0)
 [ 6] 0x1001cd31 nhttpstack (0,4aa0a00,0,d2dff24)
 [ 7] 0x10021246 nhttpstack (4aa0a0c,4aa0a00,0,60091b21)
 [ 8] 0x1002a51d nhttpstack (3,4aa0a00,10027650,1002767a)
 [ 9] 0x1002a3a7 nhttpstack (4aa0a00,0,0,0)
@[10] 0x60114924 nnotes._ThreadWrapper@4+212 (0,0,0,0)
 [11] 0x77e6482f KERNEL32

Thread details.

.Mapped To: PThread [   nHTTP:125c: 5332]
..      SOBJ: addr=0x1b604ddc, h=0xf010440e t=cf02 (BLK_FT_STATIC)
..      SOBJ: addr=0x1b801414, h=0xf0104412 t=c820 (BLK_CLIENT_OPENSESSION_TIME)
..      SOBJ: addr=0x0e9c04bc, h=0xf010427e t=c30a (BLK_LOOKUP_THREAD)
..      SOBJ: addr=0x420a0114, h=0xf01044cf t=ca35 (BLK_TRACECONNECTION)
..      SOBJ: addr=0x0112cfb0, h=0xf010418c t=c130 (BLK_TLA)
..      SOBJ: addr=0x0113e3f4, h=0xf010429b t=c436 (BLK_LSITLS)
..      SOBJ: addr=0x0e9709fc, h=0xf0104295 t=c275 (BLK_NSFT)
..  Database: D:\Lotus\Domino\Data\ew\archives\ew-2012q3.nsf
....       DBH:    383, By: Pradeep Rajan
......       doc: HDB=383, ID=8650, H=7343, class=0001, flags=0300
......       doc: HDB=383, ID=25686, H=7225, class=0001, flags=0300
......       doc: HDB=383, ID=8650, H=7298, class=0001, flags=0300
......       doc: HDB=383, ID=8650, H=7196, class=0001, flags=0300 

One more Crash today. Please help.

Fatal Stack

############################################################
### FATAL THREAD 32/66 [   nHTTP:0568: 5700]
### FP=0x0bddfe3c, PC=0x002a18af, SP=0x0bddfe38, stksize=4
### EAX=0x00000001, EBX=0x00000000, ECX=0x00010000, EDX=0x0d90c694
### ESI=0x00000000, EDI=0x000000c8, CS=0x0000001b, SS=0x00000023
### DS=0x00000023, ES=0x00000023, FS=0x0000003b, GS=0x00000000 Flags=0x00010202
Exception code: c0000005 (ACCESS_VIOLATION)
############################################################
@[ 1] 0x002a18af nservlet._Remove+15 (0,4,bddfe78,2a127a)
@[ 2] 0x002a1992 nservlet._ListMgr_RemoveNode+34 (4,a302eb8,48b58cc,8e98b3c)
@[ 3] 0x002a127a nservlet._ServletProcessRequest+106 (bddfe90,43277dec,0,3)
 [ 4] 0x100254ab nhttpstack (43277dec,43277c88,0,3)
 [ 5] 0x1000fea9 nhttpstack (2,42cf2a64,3784145c,0)
 [ 6] 0x1001cd31 nhttpstack (0,a67a4be,0,bddff24)
 [ 7] 0x10021246 nhttpstack (a67a4ca,a67a4be,0,60091b21)
 [ 8] 0x1002a51d nhttpstack (3,a67a4be,10027650,1002767a)
 [ 9] 0x1002a3a7 nhttpstack (a67a4be,0,0,0)
@[10] 0x60114924 nnotes._ThreadWrapper@4+212 (0,0,0,0)
 [11] 0x77e6482f KERNEL32
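
The two fatal stacks above differ only in thread IDs and call arguments. A minimal Python sketch for checking that mechanically follows. The `stack_signature` helper and its regular expression are hypothetical, based only on how the frames look in these excerpts; they are not part of NSD or LND.

```python
import re

# Assumed frame format, taken from the excerpts above:
#   @[ 1] 0x002a18af nservlet._Remove+15 (0,3,d2dfe78,2a127a)
#    [ 4] 0x100254ab nhttpstack (40eb6c2c,40eb6ac8,0,3)
FRAME_RE = re.compile(r"^@?\[\s*(\d+)\]\s+0x([0-9a-fA-F]+)\s+(\S+)")

def stack_signature(nsd_text):
    """Return [(symbol_or_module, pc), ...] per frame, ignoring call arguments."""
    signature = []
    for line in nsd_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            signature.append((match.group(3), match.group(2).lower()))
    return signature

CRASH_A = """
@[ 1] 0x002a18af nservlet._Remove+15 (0,3,d2dfe78,2a127a)
@[ 2] 0x002a1992 nservlet._ListMgr_RemoveNode+34 (3,a335430,55559ec,9b3c534)
@[ 3] 0x002a127a nservlet._ServletProcessRequest+106 (d2dfe90,40eb6c2c,0,3)
 [ 4] 0x100254ab nhttpstack (40eb6c2c,40eb6ac8,0,3)
 [ 5] 0x1000fea9 nhttpstack (2,4272ef40,1baa6760,0)
 [ 6] 0x1001cd31 nhttpstack (0,4aa0a00,0,d2dff24)
 [ 7] 0x10021246 nhttpstack (4aa0a0c,4aa0a00,0,60091b21)
 [ 8] 0x1002a51d nhttpstack (3,4aa0a00,10027650,1002767a)
 [ 9] 0x1002a3a7 nhttpstack (4aa0a00,0,0,0)
@[10] 0x60114924 nnotes._ThreadWrapper@4+212 (0,0,0,0)
 [11] 0x77e6482f KERNEL32
"""

CRASH_B = """
@[ 1] 0x002a18af nservlet._Remove+15 (0,4,bddfe78,2a127a)
@[ 2] 0x002a1992 nservlet._ListMgr_RemoveNode+34 (4,a302eb8,48b58cc,8e98b3c)
@[ 3] 0x002a127a nservlet._ServletProcessRequest+106 (bddfe90,43277dec,0,3)
 [ 4] 0x100254ab nhttpstack (43277dec,43277c88,0,3)
 [ 5] 0x1000fea9 nhttpstack (2,42cf2a64,3784145c,0)
 [ 6] 0x1001cd31 nhttpstack (0,a67a4be,0,bddff24)
 [ 7] 0x10021246 nhttpstack (a67a4ca,a67a4be,0,60091b21)
 [ 8] 0x1002a51d nhttpstack (3,a67a4be,10027650,1002767a)
 [ 9] 0x1002a3a7 nhttpstack (a67a4be,0,0,0)
@[10] 0x60114924 nnotes._ThreadWrapper@4+212 (0,0,0,0)
 [11] 0x77e6482f KERNEL32
"""

sig_a = stack_signature(CRASH_A)
sig_b = stack_signature(CRASH_B)
print("same code path:", sig_a == sig_b)
print("top frame:", sig_a[0])
```

Comparing only symbols and program counters, not arguments, is a deliberate choice here: the arguments are per-request pointers and will differ between crashes even when the failing code path is identical.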

Pass2

############################################################
### PASS 2 : FATAL THREAD with STACK FRAMES 32/66 [   nHTTP:0568: 5700]
### FP=0bddfe3c, PC=002a18af, SP=0bddfe38, stksize=4
Exception code: c0000005 (ACCESS_VIOLATION)
############################################################
# ---------- Top of the Stack ----------
       # 0bddfe38  00000000 0bddfe4c 002a1992 00000000  |....L.....*.....|

@[ 1] 0x002a18af nservlet._Remove+15 (0,4,bddfe78,2a127a)

       # 0bddfe3c  0bddfe4c 002a1992 00000000 00000004  |L.....*.........|


@[ 2] 0x002a1992 nservlet._ListMgr_RemoveNode+34 (4,a302eb8,48b58cc,8e98b3c)

       # 0bddfe4c  0bddfe78 002a127a 00000004 0a302eb8  |x...z.*.......0.|
       # 0bddfe5c  048b58cc 08e98b3c 00000004 0bddfe90  |.X..<...........|
       # 0bddfe6c  43277e84 4327840c 0a302eb8 0bddfeac  |.~'C..'C..0.....|


@[ 3] 0x002a127a nservlet._ServletProcessRequest+106 (bddfe90,43277dec,0,3)

       # 0bddfe78  0bddfeac 100254ab 0bddfe90 43277dec  |.....T.......}'C|
       # 0bddfe88  00000000 00000003 43277e84 10025530  |.........~'C0U..|
       # 0bddfe98  10025850 100255c0 10025580 10025680  |PX...U...U...V..|
       # 0bddfea8  10025780 0bddfec4 1000fea9 43277dec  |.W...........}'C|


 [ 4] 0x100254ab nhttpstack (43277dec,43277c88,0,3)

       # 0bddfeac  0bddfec4 1000fea9 43277dec 43277c88  |.........}'C.|'C|
       # 0bddfebc  00000000 00000003 0bddfef4 1001cd31  |............1...|


 [ 5] 0x1000fea9 nhttpstack (2,42cf2a64,3784145c,0)

       # 0bddfec4  0bddfef4 1001cd31 00000002 42cf2a64  |....1.......d*.B|
       # 0bddfed4  3784145c 00000000 1002a75c 43277c88  |\..7....\....|'C|
       # 0bddfee4  00000000 0bddff24 10033dec ffffffff  |....$....=......|


 [ 6] 0x1001cd31 nhttpstack (0,a67a4be,0,bddff24)

       # 0bddfef4  0bddff30 10021246 00000000 0a67a4be  |0...F.........g.|
       # 0bddff04  00000000 0bddff24 1000adc9 000007b7  |....$...........|
       # 0bddff14  3784145c 00000001 00000002 0bddfefc  |\..7............|
       # 0bddff24  0bddff70 100340a0 00000000 0bddff7c  |p....@......|...|


 [ 7] 0x10021246 nhttpstack (a67a4ca,a67a4be,0,60091b21)

       # 0bddff30  0bddff7c 1002a51d 0a67a4ca 0a67a4be  ||.........g...g.|
       # 0bddff40  00000000 60091b21 0a67a730 00000001  |....!..`0.g.....|
       # 0bddff50  00000000 0a67a4be 0a67a738 00000003  |......g.8.g.....|
       # 0bddff60  00000002 0000f6b1 00000000 0bddff38  |............8...|
       # 0bddff70  0bddffdc 100344b0 00000000 0bddff98  |.....D..........|


 [ 8] 0x1002a51d nhttpstack (3,a67a4be,10027650,1002767a)

       # 0bddff7c  0bddff98 1002a3a7 00000003 0a67a4be  |..............g.|
       # 0bddff8c  10027650 1002767a 00000000 0bddffb8  |Pv..zv..........|


 [ 9] 0x1002a3a7 nhttpstack (a67a4be,0,0,0)

       # 0bddff98  0bddffb8 60114924 0a67a4be 00000000  |....$I.`..g.....|
       # 0bddffa8  00000000 00000000 00000000 0a67a4be  |..............g.|


@[10] 0x60114924 nnotes._ThreadWrapper@4+212 (0,0,0,0)

       # 0bddffb8  0bddffec 77e6482f 00000000 00000000  |..../H.w........|
       # 0bddffc8  00000000 00000000 c0000005 0bddffc4  |................|
       # 0bddffd8  0bddfa64 ffffffff 77e61a60 77e64838  |d.......`..w8H.w|
       # 0bddffe8  00000000 00000000 00000000 60114850  |............PH.`|
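
For anyone reading the Pass 2 dump: assuming NSD's FP is the x86 frame pointer (EBP), each frame stores the caller's saved FP at `[FP]` and the return address at `[FP+4]`. A rough Python sketch of walking that chain over a few rows copied from the dump (ASCII column dropped) follows. The `parse_dump` and `walk_frames` helpers are hypothetical names of my own, not NSD tooling.

```python
# Rows copied from the Pass 2 dump above, without the ASCII column.
DUMP = """
# 0bddfe38  00000000 0bddfe4c 002a1992 00000000
# 0bddfe4c  0bddfe78 002a127a 00000004 0a302eb8
# 0bddfe78  0bddfeac 100254ab 0bddfe90 43277dec
# 0bddfeac  0bddfec4 1000fea9 43277dec 43277c88
"""

def parse_dump(text):
    """Map each 4-byte-aligned address to the 32-bit word stored there."""
    memory = {}
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 3 or tokens[0] != "#":
            continue
        base = int(tokens[1], 16)
        for offset, word in enumerate(tokens[2:6]):
            memory[base + 4 * offset] = int(word, 16)
    return memory

def walk_frames(memory, fp):
    """Follow saved frame pointers; return [(fp, return_address), ...]."""
    frames = []
    while fp in memory and fp + 4 in memory:
        saved_fp, return_address = memory[fp], memory[fp + 4]
        frames.append((fp, return_address))
        if saved_fp <= fp:  # the chain must move toward higher addresses
            break
        fp = saved_fp
    return frames

memory = parse_dump(DUMP)
frames = walk_frames(memory, 0x0BDDFE3C)  # FP from the fatal thread header
for fp, return_address in frames:
    print(f"FP={fp:08x} returns to {return_address:08x}")
```

The return addresses recovered this way (002a1992, 002a127a, 100254ab, 1000fea9) are the PCs NSD lists for frames [2] through [5], which is a quick sanity check that the stack was not corrupted at the time of the access violation. The walk stops at the first frame pointer that is not covered by the excerpted rows.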

Mapped Thread

.Mapped To: PThread [   nHTTP:0568: 5700]
..      SOBJ: addr=0x334d00f8, h=0xf01040e2 t=ca35 (BLK_TRACECONNECTION)
..      SOBJ: addr=0x3e00029c, h=0xf01043f7 t=c820 (BLK_CLIENT_OPENSESSION_TIME)
..      SOBJ: addr=0x0112c3a4, h=0xf010416b t=c130 (BLK_TLA)
..      SOBJ: addr=0x0113df28, h=0xf01042fe t=c436 (BLK_LSITLS)
..      SOBJ: addr=0x438f0174, h=0xf0104461 t=cf02 (BLK_FT_STATIC)
..      SOBJ: addr=0x04663820, h=0xf01042e4 t=c30a (BLK_LOOKUP_THREAD)
..      SOBJ: addr=0x0114dfe0, h=0xf01042e1 t=c275 (BLK_NSFT)
Saravanan
  • Can you paste the full crash logs into the post here (please remove the company name and IP address if possible)? The screenshots aren't easy to use, as some of the content seems to be cut off or isn't included. – BastianW Jun 11 '13 at 16:56
  • Apologies for delayed reply. I have attached the NSD Crash Log file. – Saravanan Jun 19 '13 at 09:36
  • Still the same issue. – Simon O'Doherty Jun 26 '13 at 09:15
  • But there is no load on the server. At most 50 users would have been connected, as this is an intranet application. Is upgrading the server to Lotus Domino 6.5 and Windows 2003 the issue? Or can this be fixed by running fixup on some of the open databases? I need your advice here, please. – Saravanan Jun 26 '13 at 09:59
  • There's no evidence in those stack traces pointing to any database problems. But more importantly: Simon knows what he is talking about :-) The stack trace matches a known problem, so the load evidently doesn't matter, no matter what the SPR says. Sometimes a problem is blamed on load, but it can have other triggers that may show up in other environments. That could be happening here. Regardless, the solution is to upgrade the software to get the fix that prevents this code path from crashing, or to go back to the version of the software that worked for your client for ten years. – rhsatrhs Jun 26 '13 at 18:01
  • Thanks, all. The client has finally agreed to upgrade the Domino server to 6.5.1, as SPR MIAS5S7F39 is fixed in 6.5.1. But the Domino 6.5 line extends up to 6.5.6. Is it better to go to 6.5.6, or can I stick with 6.5.1? – Saravanan Jul 10 '13 at 08:08
  • Personally? I'd recommend Domino 9, as it is the very latest code stream with numerous improvements. Also, security fixes may not exist for R6 due to its age. If you are set on working only with a discontinued release, then always go for the latest version/fixpack combination released before it was discontinued. – Simon O'Doherty Jul 10 '13 at 13:43

1 Answer


The only issue I can find for that stack is SPR MIAS5S7F39, where the server would crash under heavy load on the servlet manager.

It is marked as fixed in the latest release of the R6 code stream. However, if the issue is still happening, then you need to upgrade to a more recent version, as R6 has been obsolete for years.

Simon O'Doherty
  • SPR MIAS5S7F39 is related to ZSeries, but our environment is Windows 2003. Also, our client is not interested in upgrading the servers. It worked fine for more than 10 years on a Windows 2000 physical server with the same load, but we have been facing this issue since migrating to a Windows 2003 VM. Is there any way to make it work in the current setup? – Saravanan Jun 19 '13 at 10:49
  • It was initially reported on ZSeries, but the crash stack matches exactly what you have. – Simon O'Doherty Jun 19 '13 at 14:15
  • Thanks. But as I said, only a few users use this server, so I can say there is no load. CPU does not go beyond 15%, and memory is around 3000 out of 10000. This has been happening only since we migrated to Windows 2003 in a VM. We also upgraded Domino 6 to Domino 6.5. – Saravanan Jun 20 '13 at 02:55