I have a server running NexentaStor version 4, which I manage through the web interface. The server had been running in a production environment for about two years with no issues until last week, when I could no longer connect to the web interface and lost connection to the storage as well. When I tried to log in to the console as root, authentication took a long time, and then the following lines were displayed:
login as: root
Using keyboard-interactive authentication.
Password:
Initializing recovery session. The operation may take up to 6 minutes.
Remains: 330 seconds. Please stand by...
Remains: 300 seconds. Please stand by...
Remains: 270 seconds. Please stand by...
Remains: 240 seconds. Please stand by...
Remains: 210 seconds. Please stand by...
Remains: 180 seconds. Please stand by...
Remains: 150 seconds. Please stand by...
Remains: 120 seconds. Please stand by...
Remains: 90 seconds. Please stand by...
Remains: 60 seconds. Please stand by...
Remains: 30 seconds. Please stand by...
Remains: 0 seconds. Please stand by...

* * * SYSTEM NOTICE
Failed to initialize NMC: The name com.nexenta.nms was not provided by any .service files

Suggested possible recovery actions:
- Reboot into a known working system checkpoint
- Run 'svcadm clear nms'; then try to re-login
- Run 'svcadm enable -rs nms' to enable nms daemon and then try to re-login

Suggested troubleshooting actions:
- Run 'svcs -vx' and collect output for further analysis
- Run 'dmesg' and look for error messages
- View "/var/log/nms.log" for error messages
- View "/var/svc/log/application-nms:default.log" for error messages

Entering UNIX shell. Type 'exit' to go back to NMC login...
I ran the commands listed under the suggested recovery and troubleshooting actions, but I was not able to resolve the issue. Below are excerpts from the two log files.
/var/log/nms.log
================
Aug 9 17:47:46 myhost nms[543]: [ID 702911 local0.info] Starting...
Aug 9 17:47:46 myhost nms[543]: [ID 702911 local0.info] Syncing devices...
Aug 9 17:47:51 myhost nms[543]: [ID 702911 local0.info] Warning: format timeout: Command timed out
Aug 9 17:47:56 myhost nms[543]: [ID 702911 local0.info] Warning: rmformat timeout: Command timed out
Aug 9 17:47:56 myhost nms[674]: [ID 702911 local0.info] Syncing time...
Aug 9 17:47:58 myhost nms[543]: [ID 702911 local0.info] Syncing LUNs...
Aug 9 17:48:24 myhost nms[543]: [ID 702911 local0.info] Syncing datasets ...
Aug 9 17:48:24 myhost nms[543]: [ID 702911 local0.info] Loading plugins ...
Aug 9 17:48:26 myhost nms[543]: [ID 702911 local0.info] scsitarget: importing Volumes
Aug 9 17:48:26 myhost nms[543]: [ID 702911 local0.info] scsitarget: comstar plugin loaded
Aug 9 17:48:26 myhost nms[543]: [ID 702911 local0.info] Plugin: nms-comstar, v40-0-20, (COMSTAR Target extension)
Aug 9 17:48:26 myhost nms[543]: [ID 702911 local0.info] Starting IPC listener...
Aug 9 17:48:26 myhost nms[881]: [ID 702911 local0.info] (:1.1) Group sync OK
Aug 9 17:48:26 myhost nms[881]: [ID 702911 local0.info] (:1.1) Delayed server "pooling" (initial count = 2)
Aug 9 17:48:27 myhost nms[881]: [ID 702911 local0.info] (:1.1) Nexenta Management Server is ready (881)
Aug 9 17:48:58 myhost nms[967]: [ID 702911 local0.info] (1:) Starting...
Aug 9 17:48:59 myhost nms[967]: [ID 702911 local0.info] (1:) Syncing LUNs...
Aug 9 17:49:01 myhost nms[967]: [ID 702911 local0.info] (1:) Syncing datasets ...
Aug 9 17:49:01 myhost nms[967]: [ID 702911 local0.info] (1:) Loading plugins ...
Aug 9 17:49:03 myhost nms[967]: [ID 702911 local0.info] (1:) scsitarget: importing Volumes
Aug 9 17:49:03 myhost nms[967]: [ID 702911 local0.info] (1:) scsitarget: comstar plugin loaded
Aug 9 17:49:03 myhost nms[967]: [ID 702911 local0.info] (1:) Plugin: nms-comstar, v40-0-20, (COMSTAR Target extension)
Aug 9 17:49:03 myhost nms[967]: [ID 702911 local0.info] (1:) Starting IPC listener...
Aug 9 17:49:03 myhost nms[967]: [ID 702911 local0.info] (1:) Nexenta Management Server is ready (1:967)
Aug 9 17:49:11 myhost nms[1213]: [ID 702911 local0.info] (2:) Starting...
Aug 9 17:49:12 myhost nms[1213]: [ID 702911 local0.info] (2:) Syncing LUNs...
Aug 9 17:49:13 myhost nms[1213]: [ID 702911 local0.info] (2:) Syncing datasets ...
Aug 9 17:49:14 myhost nms[1213]: [ID 702911 local0.info] (2:) Loading plugins ...
Aug 9 17:49:15 myhost nms[1213]: [ID 702911 local0.info] (2:) scsitarget: importing Volumes
Aug 9 17:49:15 myhost nms[1213]: [ID 702911 local0.info] (2:) scsitarget: comstar plugin loaded
Aug 9 17:49:15 myhost nms[1213]: [ID 702911 local0.info] (2:) Plugin: nms-comstar, v40-0-20, (COMSTAR Target extension)
Aug 9 17:49:15 myhost nms[1213]: [ID 702911 local0.info] (2:) Starting IPC listener...
Aug 9 17:49:15 myhost nms[1213]: [ID 702911 local0.info] (2:) Nexenta Management Server is ready (2:1213)
Aug 9 17:49:22 myhost nms[881]: [ID 702911 local0.info] (:1.6) Server "pooling": 2 management servers in a pool, in addition to the main NMS
Aug 10 11:29:41 myhost hosts-check[1860]: [ID 702911 local0.info] Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Aug 10 11:29:52 myhost volume-check[2192]: [ID 702911 local0.info] Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Aug 10 11:30:05 myhost ses-check[1498]: [ID 702911 local0.info] Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Aug 10 11:30:19 myhost nfs-collector[1380]: [ID 702911 local0.info] Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Aug 10 11:34:41 myhost hosts-check[1860]: [ID 702911 local0.info] Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Aug 10 11:34:52 myhost volume-check[2192]: [ID 702911 local0.info] Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Aug 10 11:35:05 myhost ses-check[1498]: [ID 702911 local0.info] Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Aug 10 11:35:19 myhost nfs-collector[1380]: [ID 702911 local0.info] Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Aug 10 11:39:41 myhost hosts-check[1860]: [ID 702911 local0.info] The exception (org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.) can be ignored..
Aug 10 11:39:52 myhost volume-check[2192]: [ID 702911 local0.info] The exception (org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.) can be ignored..
Aug 10 11:40:06 myhost ses-check[1498]: [ID 702911 local0.info] The exception (org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.) can be ignored..
Aug 10 11:40:19 myhost nfs-collector[1380]: [ID 702911 local0.info] The exception (org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.) can be ignored..
Aug 10 13:34:37 myhost nms[967]: [ID 702911 local0.info] (1:1.1) Sending signal 'TERM' to contract process member PID 881
Aug 10 13:34:37 myhost nms[1213]: [ID 702911 local0.info] (2:1.124) Sending signal 'TERM' to contract process member PID 881
Aug 10 13:34:38 myhost nms[881]: [ID 702911 local0.info] (:1.762) Sending signal 'TERM' to contract process member PID 957
Aug 10 13:34:38 myhost nms[881]: [ID 702911 local0.info] (:1.762) Sending signal 'TERM' to contract process member PID 1202
Aug 10 13:34:38 myhost nms[881]: [ID 702911 local0.info] (:1.762) Sending signal 'TERM' to contract process member PID 1213
Aug 10 13:34:38 myhost nms[881]: [ID 702911 local0.info] (:1.762) Sending signal 'TERM' to contract process member PID 2212
Aug 10 13:34:43 myhost nms[881]: [ID 702911 local0.info] (:1.762) Sending signal 'TERM' to contract process member PID 2398
Aug 10 13:34:48 myhost nms[881]: [ID 702911 local0.info] (:1.762) Sending signal 'TERM' to contract process member PID 3314
Aug 10 13:34:53 myhost nms[881]: [ID 702911 local0.info] (:1.762) Sending signal 'TERM' to contract process member PID 3488
Aug 10 13:34:58 myhost nms[881]: [ID 702911 local0.info] (:1.762) Sending signal 'TERM' to contract process member PID 3521
Aug 10 13:35:03 myhost nms[881]: [ID 702911 local0.info] (:1.762) Sending signal 'TERM' to contract process member PID 3609
Aug 10 13:35:03 myhost nms[881]: [ID 702911 local0.info] (:1.762) All child processes terminated now (9)
Aug 10 13:35:03 myhost nms[881]: [ID 702911 local0.info] (:1.762) Closing server log...
Aug 10 13:47:20 myhost nms[3723]: [ID 702911 local0.info] Starting...
Aug 10 13:47:20 myhost nms[3723]: [ID 702911 local0.info] Syncing devices...
Aug 10 13:47:25 myhost nms[3723]: [ID 702911 local0.info] Warning: format timeout: Command timed out
Aug 10 13:47:30 myhost nms[3723]: [ID 702911 local0.info] Warning: rmformat timeout: Command timed out
Aug 10 13:47:30 myhost nms[3751]: [ID 702911 local0.info] Syncing time...
Aug 10 13:47:31 myhost nms[3723]: [ID 702911 local0.info] Syncing LUNs...
Aug 10 13:47:58 myhost nms[3723]: [ID 702911 local0.info] Syncing datasets ...
Aug 10 13:47:58 myhost nms[3723]: [ID 702911 local0.info] Loading plugins ...
Aug 10 13:48:00 myhost nms[3723]: [ID 702911 local0.info] scsitarget: importing Volumes
Aug 10 13:48:00 myhost nms[3723]: [ID 702911 local0.info] scsitarget: comstar plugin loaded
Aug 10 13:48:00 myhost nms[3723]: [ID 702911 local0.info] Plugin: nms-comstar, v40-0-20, (COMSTAR Target extension)
Aug 10 13:48:00 myhost nms[3723]: [ID 702911 local0.info] Starting IPC listener...
Aug 10 13:48:00 myhost nms[3968]: [ID 702911 local0.info] (:1.772) Group sync OK
Aug 10 13:48:00 myhost nms[3968]: [ID 702911 local0.info] (:1.772) Delayed server "pooling" (initial count = 2)
Aug 10 13:48:00 myhost nms[3968]: [ID 702911 local0.info] (:1.772) Warning: NMV maintenance state cleared...
Aug 10 13:48:00 myhost nms[3968]: [ID 702911 local0.info] (:1.772) Nexenta Management Server is ready (3968)
Aug 10 13:48:32 myhost nms[4068]: [ID 702911 local0.info] (1:) Starting...
Aug 10 13:48:33 myhost nms[4068]: [ID 702911 local0.info] (1:) Syncing LUNs...
Aug 10 13:48:35 myhost nms[4068]: [ID 702911 local0.info] (1:) Syncing datasets ...
Aug 10 13:48:35 myhost nms[4068]: [ID 702911 local0.info] (1:) Warning: Database file /var/lib/nza/report.db corrupted and saved as: /var/lib/nza/report.db.corrupted_2017_Aug_10_13_48_35
Aug 10 13:48:35 myhost nms[4068]: [ID 702911 local0.info] (1:) Warning: Database will be recreated
Aug 10 13:48:36 myhost nms[4068]: [ID 702911 local0.info] (1:) Loading plugins ...
Aug 10 13:48:36 myhost nms[4068]: [ID 702911 local0.info] (1:) scsitarget: importing Volumes
Aug 10 13:48:36 myhost nms[4068]: [ID 702911 local0.info] (1:) scsitarget: comstar plugin loaded
Aug 10 13:48:36 myhost nms[4068]: [ID 702911 local0.info] (1:) Plugin: nms-comstar, v40-0-20, (COMSTAR Target extension)
Aug 10 13:48:36 myhost nms[4068]: [ID 702911 local0.info] (1:) Starting IPC listener...
Aug 10 13:48:37 myhost nms[4068]: [ID 702911 local0.info] (1:) Nexenta Management Server is ready (1:4068)
Aug 10 13:48:45 myhost nms[4364]: [ID 702911 local0.info] (2:) Starting...
Aug 10 13:48:46 myhost nms[4364]: [ID 702911 local0.info] (2:) Syncing LUNs...
Aug 10 13:48:47 myhost nms[4364]: [ID 702911 local0.info] (2:) Syncing datasets ...
Aug 10 13:48:48 myhost nms[4364]: [ID 702911 local0.info] (2:) Loading plugins ...
Aug 10 13:48:49 myhost nms[4364]: [ID 702911 local0.info] (2:) scsitarget: importing Volumes
Aug 10 13:48:49 myhost nms[4364]: [ID 702911 local0.info] (2:) scsitarget: comstar plugin loaded
Aug 10 13:48:49 myhost nms[4364]: [ID 702911 local0.info] (2:) Plugin: nms-comstar, v40-0-20, (COMSTAR Target extension)
Aug 10 13:48:49 myhost nms[4364]: [ID 702911 local0.info] (2:) Starting IPC listener...
Aug 10 13:48:49 myhost nms[4364]: [ID 702911 local0.info] (2:) Nexenta Management Server is ready (2:4364)
Aug 10 13:48:56 myhost nms[3968]: [ID 702911 local0.info] (:1.779) Server "pooling": 2 management servers in a pool, in addition to the main NMS
Aug 10 13:58:40 myhost nfs-collector[4304]: [ID 702911 local0.info] Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Aug 10 13:58:44 myhost hosts-check[4342]: [ID 702911 local0.info] Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Aug 10 13:58:55 myhost volume-check[4051]: [ID 702911 local0.info] Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Aug 10 13:59:31 myhost ses-check[3949]: [ID 702911 local0.info] Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Aug 10 14:03:40 myhost nfs-collector[4304]: [ID 702911 local0.info] Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Aug 10 14:03:44 myhost hosts-check[4342]: [ID 702911 local0.info] Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Aug 10 14:03:55 myhost volume-check[4051]: [ID 702911 local0.info] Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Aug 10 14:04:31 myhost ses-check[3949]: [ID 702911 local0.info] Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Aug 10 14:08:40 myhost nfs-collector[4304]: [ID 702911 local0.info] The exception (org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.) can be ignored..
Aug 10 14:08:44 myhost hosts-check[4342]: [ID 702911 local0.info] The exception (org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.) can be ignored..
Aug 10 14:08:55 myhost volume-check[4051]: [ID 702911 local0.info] The exception (org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.) can be ignored..
Aug 10 14:09:31 myhost ses-check[3949]: [ID 702911 local0.info] The exception (org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.) can be ignored..
Aug 10 15:27:19 myhost nms[4068]: [ID 702911 local0.info] (1:1.2) Sending signal 'TERM' to contract process member PID 3968
Aug 10 15:27:19 myhost nms[4364]: [ID 702911 local0.info] (2:1.2) Sending signal 'TERM' to contract process member PID 3968
Aug 10 15:27:20 myhost nms[3968]: [ID 702911 local0.info] (:1.790) Sending signal 'TERM' to contract process member PID 4058
Aug 10 15:27:20 myhost nms[3968]: [ID 702911 local0.info] (:1.790) Sending signal 'TERM' to contract process member PID 4353
Aug 10 15:27:20 myhost nms[3968]: [ID 702911 local0.info] (:1.790) Sending signal 'TERM' to contract process member PID 4364
Aug 10 15:27:20 myhost nms[3968]: [ID 702911 local0.info] (:1.790) Sending signal 'TERM' to contract process member PID 4885
Aug 10 15:27:25 myhost nms[3968]: [ID 702911 local0.info] (:1.790) Sending signal 'TERM' to contract process member PID 5050
Aug 10 15:27:30 myhost nms[3968]: [ID 702911 local0.info] (:1.790) Sending signal 'TERM' to contract process member PID 5209
Aug 10 15:27:35 myhost nms[3968]: [ID 702911 local0.info] (:1.790) Sending signal 'TERM' to contract process member PID 5316
Aug 10 15:27:35 myhost nms[3968]: [ID 702911 local0.info] (:1.790) All child processes terminated now (7)
Aug 10 15:27:35 myhost nms[3968]: [ID 702911 local0.info] (:1.790) Closing server log...

/var/svc/log/application-nms:default.log
========================================
[ Aug 9 16:12:33 Enabled. ]
[ Aug 9 16:12:58 Executing start method ("/lib/svc/method/nms -d"). ]
[ Aug 9 16:13:51 Method "start" exited with status 0. ]
[ Aug 9 16:41:56 Stopping because service disabled. ]
[ Aug 9 16:41:56 Executing stop method ("/lib/svc/method/nms stop"). ]
Stopping NMS daemon (1:1208) ...
NMS daemon (1:1208) stopped (terminated)
Stopping NMS daemon (1107) ...
[ Aug 9 16:42:23 Method "stop" exited with status 0. ]
[ Aug 9 16:47:45 Enabled. ]
[ Aug 9 16:48:01 Executing start method ("/lib/svc/method/nms -d"). ]
[ Aug 9 17:48:26 Method "start" exited with status 0. ]
[ Aug 10 13:34:36 Stopping because service restarting. ]
[ Aug 10 13:34:36 Executing stop method ("/lib/svc/method/nms stop"). ]
Stopping NMS daemon (1:967) ...
NMS daemon (1:967) stopped (terminated)
Stopping NMS daemon (881) ...
NMS daemon (881) stopped (terminated)
[ Aug 10 13:38:19 Method "stop" exited with status 0. ]
[ Aug 10 13:38:36 Method or service exit timed out. Killing contract 80. ]
[ Aug 10 13:38:37 Method or service exit timed out. Killing contract 80. ]
[ Aug 10 13:38:38 Method or service exit timed out. Killing contract 80. ]
... ... ...
[ Aug 10 13:46:55 Method or service exit timed out. Killing contract 80. ]
[ Aug 10 13:46:56 Method or service exit timed out. Killing contract 80. ]
[ Aug 10 13:46:57 Method or service exit timed out. Killing contract 80. ]
[ Aug 10 13:47:18 Leaving maintenance because clear requested. ]
[ Aug 10 13:47:18 Enabled. ]
[ Aug 10 13:47:18 Executing start method ("/lib/svc/method/nms -d"). ]
[ Aug 10 13:48:00 Method "start" exited with status 0. ]
[ Aug 10 15:27:18 Stopping because service restarting. ]
[ Aug 10 15:27:18 Executing stop method ("/lib/svc/method/nms stop"). ]
Stopping NMS daemon (1:4068) ...
NMS daemon (1:4068) stopped (terminated)
Stopping NMS daemon (3968) ...
NMS daemon (3968) stopped (terminated)
[ Aug 10 15:31:00 Method "stop" exited with status 0. ]
[ Aug 10 15:31:19 Method or service exit timed out. Killing contract 128. ]
[ Aug 10 15:31:20 Method or service exit timed out. Killing contract 128. ]
[ Aug 10 15:31:21 Method or service exit timed out. Killing contract 128. ]
[ Aug 10 15:31:22 Method or service exit timed out. Killing contract 128. ]
... ... ...
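Since the /var/log/nms.log excerpt is long, here is a rough Python sketch of how the repeated D-Bus "Did not receive a reply" entries could be tallied per reporting process. It is only an illustration: the regular expression is my guess at the syslog layout based on the lines shown above, the names `LINE_RE`, `SAMPLE` and `count_noreply` are made up for this post, and the sample messages are shortened copies of entries from the log.

```python
import re
from collections import Counter

# Assumed layout, inferred from the excerpt above:
# "<Mon> <day> <HH:MM:SS> <host> <process>[<pid>]: <message>"
LINE_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+(?P<proc>[\w-]+)\[(?P<pid>\d+)\]:\s+(?P<msg>.*)$"
)

# A few entries copied from the log above (messages shortened).
SAMPLE = [
    "Aug 9 17:48:27 myhost nms[881]: [ID 702911 local0.info] (:1.1) Nexenta Management Server is ready (881)",
    "Aug 10 11:29:41 myhost hosts-check[1860]: [ID 702911 local0.info] Did not receive a reply.",
    "Aug 10 11:29:52 myhost volume-check[2192]: [ID 702911 local0.info] Did not receive a reply.",
    "Aug 10 11:39:41 myhost hosts-check[1860]: [ID 702911 local0.info] The exception (org.freedesktop.DBus.Error.NoReply: Did not receive a reply.",
]


def count_noreply(lines):
    """Count entries containing 'Did not receive a reply', keyed by process name."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and "Did not receive a reply" in match.group("msg"):
            counts[match.group("proc")] += 1
    return counts


if __name__ == "__main__":
    # Print each process name with its number of timeout entries.
    for proc, total in count_noreply(SAMPLE).most_common():
        print(proc, total)
```

To run it against the real file, the lines could be read with `open("/var/log/nms.log")` and passed to `count_noreply` instead of `SAMPLE`. Lines that do not fit the assumed layout are simply skipped.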
I hope someone can assist me with resolving this issue.
Here is the output of df:
root@myhost:~# df
/ (syspool/rootfs-nmu-000):819438476 blocks 819438476 files
/devices (/devices ): 0 blocks 0 files
/dev (/dev ): 0 blocks 0 files
/system/contract (ctfs ): 0 blocks 2147483598 files
/proc (proc ): 0 blocks 29922 files
/etc/mnttab (mnttab ): 0 blocks 0 files
/etc/svc/volatile (swap ):224971408 blocks 18910464 files
/system/object (objfs ): 0 blocks 2147483412 files
/etc/dfs/sharetab (sharefs ): 0 blocks 2147483646 files
/lib/libc.so.1 (/usr/lib/libc/libc_hwcap1.so.1):819438476 blocks 819438476 files
/dev/fd (fd ): 0 blocks 0 files
/tmp (swap ):224971408 blocks 18910464 files
/var/run (swap ):224971408 blocks 18910464 files
/syspool (syspool ):819438476 blocks 819438476 files
/var/cores (syspool/cores ): 2097090 blocks 2097090 files
root@myhost:~#
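To make the block counts easier to read, here is a small Python sketch that converts one line of this df output to GiB. It rests on two assumptions that I have not verified on this box: that df without options reports free space, and that it counts 512-byte blocks. The names `DF_RE`, `BLOCK_SIZE` and `free_gib` are made up for this post.

```python
import re

# Assumption: plain `df` here counts 512-byte blocks.
BLOCK_SIZE = 512

# Assumed line layout, taken from the output above:
# "<mount point> (<filesystem>):<N> blocks <M> files"
DF_RE = re.compile(
    r"^(?P<mount>\S+)\s*\((?P<fs>[^)]*)\):\s*(?P<blocks>\d+) blocks\s+(?P<files>\d+) files$"
)


def free_gib(df_line, block_size=BLOCK_SIZE):
    """Return (mount point, GiB) for one df line, under the assumptions above."""
    match = DF_RE.match(df_line.strip())
    if match is None:
        raise ValueError("unrecognised df line: %r" % df_line)
    size_bytes = int(match.group("blocks")) * block_size
    return match.group("mount"), size_bytes / 2**30


if __name__ == "__main__":
    mount, gib = free_gib("/ (syspool/rootfs-nmu-000):819438476 blocks 819438476 files")
    print("%s: %.2f GiB" % (mount, gib))
```

If those assumptions hold, the root filesystem on syspool would have several hundred GiB available, so a full disk does not look like the cause.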