
I have two Linode containers. Box A is our general purpose web server. It occasionally needs to access Box B, which is set up as an NFS server.

When Box B reboots, Box A is unable to access any NFS shares no matter what I do. After several hours of troubleshooting, I was finally able to narrow it down to a single-step fix.

After Box B reboots:

$ sudo service nfs restart

These are both CentOS 6.8 boxes, up to date. NFS-related packages were all installed via yum, I believe. I did have some trouble getting the whole thing set up; it was not a smooth process, but after restarting the nfs service(s), everything works great.

If I

$ sudo service --status-all

there is no difference before and after issuing the restart. Maybe it's a timing issue? But I don't know how to even begin to troubleshoot this. What can I do?

Other things of note:

  • I'm using autofs to automatically mount the share on demand from Box A, but the share won't mount manually either (a sketch of the kind of manual mount I mean follows this list)

  • I spend my days on Windows and Mac desktops and servers, but I've been running websites on Linux for many years. I'm proficient in the things I need to do, but it's not my area of comfort and I spend a lot of time googling how to do new things.
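For illustration, this is roughly the kind of manual mount that fails after Box B reboots until I restart nfs on Box B (the export path comes from the autofs map shown further down; the temporary mount point is just made up for the example):

$ sudo mkdir -p /tmp/xdata-test
$ sudo mount -t nfs -o rw boxb:/srv/nfs/xdata /tmp/xdata-test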

I don't even know where to check. I didn't see anything obvious in the logs, but tell me what to look for and I'll post.

Update

On Box B:

 [shorowitz@BoxB ~]$ sudo chkconfig --list nfs
 nfs                0:off   1:off   2:on    3:on    4:on    5:on    6:off
 [shorowitz@BoxB ~]$ sudo chkconfig --list nfslock
 nfslock            0:off   1:off   2:on    3:on    4:on    5:on    6:off

Update 2

After a fresh reboot of BoxB, running

$ sudo showmount -e BoxB

from BoxA shows the expected mount points, but I'm unable to mount them. Simply restarting nfs on BoxB

 $ sudo service nfs restart
 Shutting down NFS daemon:                                  [  OK  ]
 Shutting down NFS mountd:                                  [  OK  ]
 Shutting down NFS services:                                [  OK  ]
 Shutting down RPC idmapd:                                  [  OK  ]
 FATAL: Module nfsd not found.
 FATAL: Error running install command for nfsd
 Starting NFS services:                                     [  OK  ]
 Starting NFS mountd:                                       [  OK  ]
 Starting NFS daemon:                                       [  OK  ]
 Starting RPC idmapd:                                       [  OK  ]

And the mounts are immediately available on BoxA. Those fatal errors appear on subsequent restarts as well when NFS is already working, so I don't know how relevant they are (I thought I had posted them already).

Additional Log Info

I issued the reboot command at 9:29 on Nov 15

 grep -i "nfs" /var/log/message*
 messages:Nov 15 09:29:08 BoxB kernel: nfsd: last server has exited, flushing export cache
 messages:Nov 15 09:29:54 BoxB kernel: RPC: Registered tcp NFSv4.1 backchannel transport module.
 messages:Nov 15 09:29:54 BoxB kernel: FS-Cache: Netfs 'nfs' registered for caching
 messages:Nov 15 09:29:54 BoxB kernel: NFS: Registering the id_resolver key type
 messages:Nov 15 09:29:54 BoxB kernel: nfs4filelayout_init: NFSv4 File Layout Driver Registering...
 messages:Nov 15 09:29:54 BoxB kernel: Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
 messages:Nov 15 09:29:54 BoxB kernel: xenfs: not registering filesystem on non-xen platform
 messages:Nov 15 09:29:54 BoxB rpc.mountd[2740]: NFS v4 mounts will be disabled unless fsid=0
 messages:Nov 15 09:29:54 BoxB kernel: NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
 messages:Nov 15 09:29:54 BoxB kernel: NFSD: starting 90-second grace period (net ****************)
 messages:Nov 15 09:33:39 BoxB kernel: nfsd: last server has exited, flushing export cache
 messages:Nov 15 09:33:40 BoxB kernel: NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
 messages:Nov 15 09:33:40 BoxB kernel: NFSD: starting 90-second grace period (net ****************)

Update 3:

BoxB

 [shorowitz@BoxB ~]$ sudo chkconfig --list | egrep "nfs|rpc"
 nfs                0:off   1:off   2:on    3:on    4:on    5:on    6:off
 nfslock            0:off   1:off   2:on    3:on    4:on    5:on    6:off
 rpcbind            0:off   1:off   2:on    3:on    4:on    5:on    6:off
 rpcgssd            0:off   1:off   2:off   3:on    4:on    5:on    6:off
 rpcsvcgssd         0:off   1:off   2:off   3:off   4:off   5:off   6:off

 [shorowitz@BoxB ~]$ sudo iptables --list -n -v
 Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
  pkts bytes target     prot opt in     out     source               destination         
     0     0 ACCEPT     all  --  lo     *       0.0.0.0/0            0.0.0.0/0           
     0     0 REJECT     all  --  !lo    *       127.0.0.0/8          0.0.0.0/0           reject-with icmp-port-unreachable 
    18   710 ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0           state NEW icmp type 8 
   471 26200 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp dpt:22 state NEW 
  204K  393M ACCEPT     all  --  *      *       {BoxA IP}            0.0.0.0/0           
  6721  754K ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0           state RELATED,ESTABLISHED 
  2859  168K LOG        all  --  *      *       0.0.0.0/0            0.0.0.0/0           limit: avg 5/min burst 5 LOG flags 0 level 7 prefix `iptables_INPUT_denied: ' 
  9229  628K REJECT     all  --  *      *       0.0.0.0/0            0.0.0.0/0           reject-with icmp-port-unreachable 

 Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
  pkts bytes target     prot opt in     out     source               destination         
     0     0 LOG        all  --  *      *       0.0.0.0/0            0.0.0.0/0           limit: avg 5/min burst 5 LOG flags 0 level 7 prefix `iptables_FORWARD_denied: ' 
     0     0 REJECT     all  --  *      *       0.0.0.0/0            0.0.0.0/0           reject-with icmp-port-unreachable 

 Chain OUTPUT (policy ACCEPT 278K packets, 8386M bytes)
  pkts bytes target     prot opt in     out     source               destination        

 [shorowitz@BoxB ~]$ sudo rpcinfo -p
program vers proto   port  service
 100000    4   tcp    111  portmapper
 100000    3   tcp    111  portmapper
 100000    2   tcp    111  portmapper
 100000    4   udp    111  portmapper
 100000    3   udp    111  portmapper
 100000    2   udp    111  portmapper
 100024    1   udp  38148  status
 100024    1   tcp  45681  status
 100005    1   udp  37846  mountd
 100005    1   tcp  59259  mountd
 100005    2   udp  59934  mountd
 100005    2   tcp  42645  mountd
 100005    3   udp  33867  mountd
 100005    3   tcp  41823  mountd
 100003    2   tcp   2049  nfs
 100003    3   tcp   2049  nfs
 100003    4   tcp   2049  nfs
 100227    2   tcp   2049  nfs_acl
 100227    3   tcp   2049  nfs_acl
 100003    2   udp   2049  nfs
 100003    3   udp   2049  nfs
 100003    4   udp   2049  nfs
 100227    2   udp   2049  nfs_acl
 100227    3   udp   2049  nfs_acl
 100021    1   udp  37287  nlockmgr
 100021    3   udp  37287  nlockmgr
 100021    4   udp  37287  nlockmgr
 100021    1   tcp  37579  nlockmgr
 100021    3   tcp  37579  nlockmgr
 100021    4   tcp  37579  nlockmgr

This returns nothing:

 grep -v "^#" /etc/sysconfig/nfs

BoxA

 $ chkconfig --list | egrep "nfs|rpc"
 nfs                0:off   1:off   2:on    3:on    4:on    5:on    6:off
 nfslock            0:off   1:off   2:on    3:on    4:on    5:on    6:off
 rpcbind            0:off   1:off   2:on    3:on    4:on    5:on    6:off
 rpcgssd            0:off   1:off   2:off   3:on    4:on    5:on    6:off
 rpcsvcgssd         0:off   1:off   2:off   3:off   4:off   5:off   6:off

 $ iptables --list -n -v
 Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
  pkts bytes target     prot opt in     out     source               destination         
  390K   58M ACCEPT     all  --  lo     *       0.0.0.0/0            0.0.0.0/0           
     0     0 REJECT     all  --  *      *       0.0.0.0/0            127.0.0.0/8         reject-with icmp-port-unreachable 
  990K 7850M ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0           state RELATED,ESTABLISHED 
     0     0 DROP       all  --  *      *       43.255.188.145       0.0.0.0/0           
     8   388 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp dpt:587 
 11864  608K ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp dpt:25 
     1    40 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp dpt:636 
  4545  238K ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp dpt:80 
  9759  553K ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp dpt:443 
    24   960 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           tcp dpt:8080 
   320 19152 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0           state NEW tcp dpt:22 
    85  5681 ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0           
  3254  194K LOG        all  --  *      *       0.0.0.0/0            0.0.0.0/0           limit: avg 5/min burst 5 LOG flags 0 level 7 prefix `iptables denied: ' 
  3634  227K DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0           

 Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
  pkts bytes target     prot opt in     out     source               destination         
     0     0 DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0           

 Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
  pkts bytes target     prot opt in     out     source               destination         
 1360K 1907M ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0           

 $ rpcinfo -p
program vers proto   port  service
 100000    4   tcp    111  portmapper
 100000    3   tcp    111  portmapper
 100000    2   tcp    111  portmapper
 100000    4   udp    111  portmapper
 100000    3   udp    111  portmapper
 100000    2   udp    111  portmapper
 100024    1   udp  55882  status
 100024    1   tcp  58283  status
 100011    1   udp    875  rquotad
 100011    2   udp    875  rquotad
 100011    1   tcp    875  rquotad
 100011    2   tcp    875  rquotad
 100005    1   udp  43136  mountd
 100005    1   tcp  55047  mountd
 100005    2   udp  51117  mountd
 100005    2   tcp  42791  mountd
 100005    3   udp  44511  mountd
 100005    3   tcp  46535  mountd
 100003    2   tcp   2049  nfs
 100003    3   tcp   2049  nfs
 100003    4   tcp   2049  nfs
 100227    2   tcp   2049  nfs_acl
 100227    3   tcp   2049  nfs_acl
 100003    2   udp   2049  nfs
 100003    3   udp   2049  nfs
 100003    4   udp   2049  nfs
 100227    2   udp   2049  nfs_acl
 100227    3   udp   2049  nfs_acl
 100021    1   udp  43509  nlockmgr
 100021    3   udp  43509  nlockmgr
 100021    4   udp  43509  nlockmgr
 100021    1   tcp  38725  nlockmgr
 100021    3   tcp  38725  nlockmgr
 100021    4   tcp  38725  nlockmgr

 $ mount | grep nfs
 nfsd on /proc/fs/nfsd type nfsd (rw)
 sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)

Update 14 November

 BoxA:

 $ cat /etc/auto.master.d/nfs
 xdata  -rw boxb:/srv/nfs/xdata
 xbackup    -rw boxb:/srv/nfs/xbackup
 zbackups   -rw boxb:/srv/nfs/zbackups

 $ mount | grep nfs
 nfsd on /proc/fs/nfsd type nfsd (rw)
 sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
 boxb:/srv/nfs/xdata on /mnt/nfs/xdata type nfs (rw,sloppy,vers=4,addr={boxb ip},clientaddr={boxa ip})
samh
  • http://serverfault.com/a/459732/126632 – Michael Hampton Nov 15 '16 at 14:53
  • You should check that the services are enabled at the correct runlevel: `chkconfig --list nfslock` and `chkconfig --list nfs` will confirm. – Brett Levene Nov 15 '16 at 15:36
  • Added results of chkconfig --list. I'm not sure what the runlevels *should* be, but they are consistent with other services that are working on boot. – samh Nov 16 '16 at 13:08
  • Are you using names, IP addresses, or a simple * glob in your exports file to restrict host access? The problem you mention only really applies if you're restricting by name. If that solution you mentioned doesn't do anything, consider looking through your logs on the server and client side to see exactly why this might be happening. This is a common problem when exporting fails during startup. It's usually caused by a delayed mount on the NFS server side, causing exporting to fail because the directory to export doesn't yet exist. If in doubt, provide your /etc/exports and /etc/fstab files. – Spooler Nov 16 '16 at 13:20
  • @SmallLoanOf1M I deleted my comment. After I checked this morning, restrictions are based entirely on IP address (private IP). – samh Nov 16 '16 at 14:27
  • I added logs from /var/log/messages -- I don't see any NFS-related oddities. And again, as soon as I issue **service nfs restart** on the *NFS server* all is immediately well on the client. If the NFS server stays up and the client reboots, all is well. – samh Nov 16 '16 at 14:38
  • Just for giggles, also specify a unique `fsid=`(number) for each export, then reboot both machines and see if the problem comes back. – Aaron Nov 22 '16 at 14:36
  • @Aaron no difference – samh Nov 23 '16 at 19:01

3 Answers


Can you update your question with more information?
On the NFS server, run:

chkconfig --list | egrep "nfs|rpc"
iptables --list -n -v
rpcinfo -p

The next one should return nothing if your NFS server settings are not customized:

grep -v "^#" /etc/sysconfig/nfs

However, if you are running iptables, it should be customized. See here:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/s2-nfs-nfs-firewall-config.html
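In practice that means pinning the otherwise-random helper ports in /etc/sysconfig/nfs and opening them from the client. A minimal sketch, with illustrative port numbers (the common Red Hat examples), not values from your system:

MOUNTD_PORT=892
STATD_PORT=662
LOCKD_TCPPORT=32803
LOCKD_UDPPORT=32769

and then, after restarting nfs, something like:

iptables -A INPUT -s {BoxA IP} -p tcp -m multiport --dports 111,2049,892,662,32803 -j ACCEPT
iptables -A INPUT -s {BoxA IP} -p udp -m multiport --dports 111,2049,892,662,32769 -j ACCEPT

In your case BoxB already accepts everything from {BoxA IP}, so this is mostly for reference.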

On your client, run:

chkconfig --list | egrep "nfs|rpc"
iptables --list -n -v
rpcinfo -p
mount | grep nfs

If you are running NFSv2 or NFSv3, you also need nfslock running on the client:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/s1-nfs-start.html
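On the client that would be roughly (standard CentOS 6 service names):

$ sudo service nfslock start
$ sudo chkconfig nfslock on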

What does look strange in your output when starting NFS are these lines:

FATAL: Module nfsd not found.
FATAL: Error running install command for nfsd

And I just realized one thing: since you are running on OpenVZ, the link below applies to your situation. See if this helps:

https://openvz.org/NFS_server_inside_container

Edit 1: I ran tests today on OpenVZ containers. The message that you see,

FATAL: Module nfsd not found.
FATAL: Error running install command for nfsd

is harmless. It's described here: https://openvz.org/NFS_server_inside_container

I could not reproduce your issue. After a reboot of the NFS server, the NFS client was still able to browse files, create new files, and delete files on the NFS share.

Can you now post your automount configuration and also the output of

mount|grep nfs

from Box A? You already did this, but while the automounted file system was unmounted. So now, on Box A, cd into the automounted directory and then run the command above. That will show which options were used during the mount.
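For example (using the xdata mount point from your later output):

$ cd /mnt/nfs/xdata
$ mount | grep nfs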

Also, next time you reboot Box B and the automount issue is present, run this command from Box A:

netstat -anp|grep ipofB

This will show which ports are involved.
In this kind of situation it's also good to collect tcpdumps on B and A.
I tend to think that there is nothing wrong with your config, but that something weird is going on with iptables on the OpenVZ hosts: not in your containers, but on the hosts themselves.
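For the tcpdump part, something along these lines on Box B would do (the interface name is an assumption; swap the address when capturing on Box A):

$ sudo tcpdump -i eth0 -n -w /tmp/nfs-debug.pcap 'host {BoxA IP} and (port 111 or port 2049)'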

What you can also try is to install nmap on host A and, at the time of the problem, scan host B to see which ports are open from A's point of view (some might suggest netstat or ss, but in this case there is also the OpenVZ host firewall in front of the container, and we don't know whether it is blocking something).
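For instance, a quick sketch that checks the portmapper and NFS ports over both TCP and UDP (UDP scanning needs root and is slow):

$ sudo nmap -sT -sU -p 111,2049 {BoxB IP}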

Edit 2 (Nov 25): Something is very strange with your mount options.

Compare your output

$ mount | grep nfs
 nfsd on /proc/fs/nfsd type nfsd (rw)
 sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
 boxb:/srv/nfs/xdata on /mnt/nfs/xdata type nfs (rw,sloppy,vers=4,addr={boxb ip},clientaddr={boxa ip}) 

to mine below. Here is my /etc/auto.misc; line 6 is the default, and line 16 is what I added:

[root@vznfsclient /]# egrep -vn "^#|^$" /etc/auto.misc
6:cd            -fstype=iso9660,ro,nosuid,nodev :/dev/cdrom
16:nfs          -fstype=nfs             192.168.0.54:/nfs_export

So when I cd to /misc/nfs, my share gets mounted. But look at the default options on line 12:

[root@vznfsclient ~]# mount|egrep -n "nfs|auto"
4:nfsd on /proc/fs/nfsd type nfsd (rw,relatime)
5:sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
10:/etc/auto.misc on /misc type autofs (rw,relatime,fd=6,pgrp=768,timeout=300,minproto=5,maxproto=5,indirect)
11:-hosts on /net type autofs (rw,relatime,fd=12,pgrp=768,timeout=300,minproto=5,maxproto=5,indirect)
12:192.168.0.54:/nfs_export/ on /misc/nfs type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.0.54,mountvers=3,mountport=42089,mountproto=udp,local_lock=none,addr=192.168.0.54) 

First, it's NFSv3 and it's using UDP. OK, UDP we can change to TCP by changing /etc/auto.misc to this:

[root@vznfsclient /]# egrep -vn "^#|^$" /etc/auto.misc
6:cd            -fstype=iso9660,ro,nosuid,nodev :/dev/cdrom
16:nfs          -fstype=nfs,proto=tcp   192.168.0.54:/nfs_export

and the mount options will change to:

192.168.0.54:/nfs_export/ on /misc/nfs type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.0.54,mountvers=3,mountport=45378,mountproto=tcp,local_lock=none,addr=192.168.0.54)

When I try to use -fstype=nfs4 in /etc/auto.misc, I can't even cd to /misc/nfs, which kind of makes sense, since per OpenVZ NFSv4 is not supported inside a container: https://openvz.org/NFS_server_inside_container

Notice that you have sloppy and a plain rw in your mount options. That basically means:

  1. The options passed to mount.nfs are not exactly correct, so it's trying to work around them. Read man mount.nfs and search for /sloppy.
  2. I assume it's trying to use NFSv4. I don't even know how that works if it's not supported inside a container.

I'd suggest changing your automount maps to the correct syntax (see my example), or see the examples here: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/s2-nfs-config-autofs.html
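Applied to the map from your question, that might look something like this (an untested sketch; vers=3 and proto=tcp force the NFSv3-over-TCP behaviour that works inside the container):

xdata      -fstype=nfs,rw,proto=tcp,vers=3    boxb:/srv/nfs/xdata
xbackup    -fstype=nfs,rw,proto=tcp,vers=3    boxb:/srv/nfs/xbackup
zbackups   -fstype=nfs,rw,proto=tcp,vers=3    boxb:/srv/nfs/zbackups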

And test. I just ran tests with autofs in those same two containers running on two different OpenVZ hosts, and after an NFS server reboot the client is still happy.

Edit 3: I can't even reproduce your scenario. I changed my /etc/auto.misc to the following, which is pretty much what you have:

[root@vznfsclient nfs]# egrep -nv "^#|^$" /etc/auto.misc
6:cd            -fstype=iso9660,ro,nosuid,nodev :/dev/cdrom
16:nfs          -rw                     192.168.0.54:/nfs_export

And after a reboot and cd /misc/nfs, I still have this:

[root@vznfsclient nfs]# mount|grep nfs
nfsd on /proc/fs/nfsd type nfsd (rw,relatime)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
192.168.0.54:/nfs_export/ on /misc/nfs type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.0.54,mountvers=3,mountport=46453,mountproto=udp,local_lock=none,addr=192.168.0.54)

So in my case it's correctly trying to use nfsv3.

The only thing left to do now is to turn on debugging.

Activate debug logging in /etc/autofs.conf:

[root@vznfsclient nfs]# grep -i debug /etc/autofs.conf|grep -v "^#"
    logging = debug

Enable debug severity to be sent to /var/log/messages in /etc/rsyslog.conf. Change this:

[root@vznfsclient nfs]# grep info /etc/rsyslog.conf
*.info;mail.none;authpriv.none;cron.none                -/var/log/messages

to this (.info to .debug):

[root@vznfsclient nfs]# grep debug /etc/rsyslog.conf
*.debug;mail.none;authpriv.none;cron.none                -/var/log/messages

Restart autofs and rsyslog, and then, when you change into the automounted location, you should see debug output in /var/log/messages.
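On CentOS 6 that should simply be:

$ sudo service autofs restart
$ sudo service rsyslog restart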

Here is the output from my test system:

Nov 25 03:06:00 vznfsclient automount[583]: attempting to mount entry /misc/nfs
Nov 25 03:06:00 vznfsclient automount[583]: lookup_mount: lookup(file): looking up nfs
Nov 25 03:06:00 vznfsclient automount[583]: lookup_mount: lookup(file): nfs -> -rw#011#011#011192.168.0.54:/nfs_export
Nov 25 03:06:00 vznfsclient automount[583]: parse_mount: parse(sun): expanded entry: -rw#011#011#011192.168.0.54:/nfs_export
Nov 25 03:06:00 vznfsclient automount[583]: parse_mount: parse(sun): gathered options: rw
Nov 25 03:06:00 vznfsclient automount[583]: parse_mount: parse(sun): dequote("192.168.0.54:/nfs_export") -> 192.168.0.54:/nfs_export
Nov 25 03:06:00 vznfsclient automount[583]: parse_mount: parse(sun): core of entry: options=rw, loc=192.168.0.54:/nfs_export
Nov 25 03:06:00 vznfsclient automount[583]: sun_mount: parse(sun): mounting root /misc, mountpoint nfs, what 192.168.0.54:/nfs_export, fstype nfs, options rw
Nov 25 03:06:00 vznfsclient automount[583]: mount_mount: mount(nfs): root=/misc name=nfs what=192.168.0.54:/nfs_export, fstype=nfs, options=rw
Nov 25 03:06:00 vznfsclient automount[583]: mount_mount: mount(nfs): nfs options="rw", nobind=0, nosymlink=0, ro=0
Nov 25 03:06:00 vznfsclient automount[583]: get_nfs_info: called with host 192.168.0.54(192.168.0.54) proto 6 version 0x40
Nov 25 03:06:00 vznfsclient automount[583]: get_nfs_info: called with host 192.168.0.54(192.168.0.54) proto 6 version 0x70
Nov 25 03:06:00 vznfsclient automount[583]: get_nfs_info: nfs v3 rpc ping time: 0.000366
Nov 25 03:06:00 vznfsclient automount[583]: get_nfs_info: host 192.168.0.54 cost 365 weight 0
Nov 25 03:06:00 vznfsclient automount[583]: get_nfs_info: called with host 192.168.0.54(192.168.0.54) proto 17 version 0x70
Nov 25 03:06:00 vznfsclient automount[583]: get_nfs_info: nfs v3 rpc ping time: 0.000507
Nov 25 03:06:00 vznfsclient automount[583]: get_nfs_info: nfs v2 rpc ping time: 0.000692
Nov 25 03:06:00 vznfsclient automount[583]: get_nfs_info: host 192.168.0.54 cost 599 weight 0
Nov 25 03:06:00 vznfsclient automount[583]: prune_host_list: selected subset of hosts that support NFS3 over TCP
Nov 25 03:06:00 vznfsclient automount[583]: mount_mount: mount(nfs): calling mkdir_path /misc/nfs
Nov 25 03:06:00 vznfsclient automount[583]: mount_mount: mount(nfs): calling mount -t nfs -s -o rw 192.168.0.54:/nfs_export /misc/nfs
Nov 25 03:06:00 vznfsclient automount[583]: spawn_mount: mtab link detected, passing -n to mount
Nov 25 03:06:00 vznfsclient automount[583]: mount(nfs): mounted 192.168.0.54:/nfs_export on /misc/nfs
Nov 25 03:06:00 vznfsclient automount[583]: ioctl_send_ready: token = 28
Nov 25 03:06:00 vznfsclient automount[583]: mounted /misc/nfs
Dmitry Zayats
  • I added the requested info -- although this was while NFS was working on both. Let me know if there's anything in particular you want to see while the machines are in a state where the NFS mounts are not available. – samh Nov 23 '16 at 18:30
  • Can you now post your auto mount maps? And also output of the `mount|grep nfs` when you are inside auto mounted dir on Box A. This will give information on what options were used during auto mount. Also when you have a problem after Box B reboot cd to auto mounted dir on Box A and in another terminal window run `netstat -anp|grep ipofBoxB1` this will show which ports A is trying to access on B. – Dmitry Zayats Nov 24 '16 at 01:26
  • It's not an autofs issue. The very first thing I did was disable autofs and try to mount them manually. If the server reboots, this doesn't work either. Restart nfs on the server, run the exact same manual mount command on the client, and it works fine (just the same as autofs, if enabled, works automatically). – samh Nov 28 '16 at 19:30
  • That's not the point. If you enable debug on autofs - you will see what mount command it's trying to use and perhaps why. Bottom line is - your mount is using incorrect mount options, it should not be trying to mount server share as nfsv4. You can find explanation even on serverfault on why nfsv4 doesn't work inside openvz container. Perhaps you should just force your server to use nfsv3 via `/etc/sysconfig/nfs` on server side. – Dmitry Zayats Nov 28 '16 at 20:17

Couple of things... CentOS 6.5 isn't up-to-date. It's worth planning to bring your OS to the current release (6.8 at this time).

Try the ntsysv command instead of chkconfig; with it you can make sure the NFS server daemon is enabled (checked) at boot.

Enable the netfs service, as this is helpful for ensuring that network filesystem clients mount the requisite filesystems at boot.
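With chkconfig that would be something along the lines of:

$ sudo chkconfig netfs on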


Good luck!

ewwhite

Firstly, NFS is generally not that reliable for sharing files between servers. Having BoxB reboot or otherwise become unavailable while BoxA has active mounts is problematic in much the same way as pulling a drive out of a running system: all sorts of things can be left in odd states.

You haven't given us much info about what you use the mount for, but consider alternatives to NFS. If you give some indication of what you are doing this for, I can probably suggest alternatives. GFS, GlusterFS, rsync, lsyncd, or an HTTP(S) based service come to mind as possibilities that might apply in at least some circumstances.

As to your current situation, I'm guessing that something is left hanging from your mount when BoxB goes down, and that the important part of the server restart is shutting down what's already running. If that's right, then you should be able to turn off both boxes, then start BoxA and then BoxB, and the mount should still start up OK. If I'm right here, then I think the logs on BoxA are more to the point than those from BoxB which you've shared with us. I'd look at the mount table for clues also.

mc0e
  • No. I shut both down. I booted A and let it sit for a while. I booted B and let it sit for a while. Unable to mount the NFS share. After restarting the nfs service on BoxB, BoxA is able almost instantly to mount the shares. – samh Nov 22 '16 at 13:23