18

On the server node, it is possible to access an exported folder. However, after reboots (both server and client), the folder is no longer accessible from the clients.

On server

# ls /data
Folder1
Forlder2

and the /etc/exports file contains

/data 192.168.1.0/24(rw,no_subtree_check,async,no_root_squash)

On client

# ls /data
ls: cannot access /data: Stale NFS file handle

I should add that the shared folder worked without problems from the client side before. The message only appeared after the reboots (server and client).

Any way to fix that?

mahmood

4 Answers

27

The order of reboots is important. Rebooting the server after the clients can result in this situation. The stale NFS handle indicates that the client has a file open, but the server no longer recognizes the file handle. In some cases, NFS will clean up its data structures after a timeout. In other cases, you will need to clean the NFS data structures yourself and restart NFS afterwards. Where these structures are located is somewhat OS dependent.

Try restarting NFS first on the server and then on the clients. This may clear the file handles.

Rebooting NFS servers with files opened from other servers is not recommended. This is especially problematic if the open file has been deleted on the server. The server may keep the file open until it is rebooted, but the reboot will remove the in-memory file handle on the server side. Then the client will no longer be able to open the file.

Determining from the server side which mounts are in use is difficult and unreliable. The showmount -a command may show some active mounts, but may not report all of them. Locked files are easier to identify, but this requires locking to be enabled and relies on the client software to lock the files.

You can use lsof on the clients to identify the processes which have files open on the mounts.
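As a minimal sketch of the lsof step, assuming the mount point is /data as in the question and that lsof prints its default column layout (COMMAND, PID, USER, ...), the helper below reduces lsof output to a list of unique PIDs. The name pids_from_lsof is hypothetical, and lsof itself may block or report errors on a mount that is already stale.

```shell
#!/bin/bash
# Hypothetical helper: read default-format lsof output on stdin and print
# the unique PIDs (second column), skipping the header line.
pids_from_lsof() {
    awk 'NR > 1 { print $2 }' | sort -un
}

# Usage on a client (not run here; /data is the mount point from the question):
#   lsof /data | pids_from_lsof
```

The resulting PIDs can then be inspected with ps before deciding whether to stop the processes.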

I use the hard and intr mount options on my NFS mounts. The hard option causes IO to be retried indefinitely. The intr option allows processes to be killed if they are waiting on NFS IO to complete.
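For illustration, a mount command with these options might look like the sketch below. The helper name nfs_mount_cmd and the host name nfs-server are placeholders, not taken from the setup above, and the function only prints the command rather than running it. According to the nfs(5) manual page, Linux kernels after 2.6.25 accept intr but ignore it, so its effect depends on the client's kernel.

```shell
#!/bin/bash
# Hypothetical helper: print (do not run) a mount command that uses the
# hard and intr options. All three arguments are placeholders.
nfs_mount_cmd() {
    local server=$1 export_path=$2 mount_point=$3
    printf 'mount -t nfs -o hard,intr %s:%s %s\n' "$server" "$export_path" "$mount_point"
}

# Example: nfs_mount_cmd nfs-server /data /data
```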

BillThor
  • Using `hard, intr` is good advice. However, note that NFS doubles the timeouts with each try. So you best set `timeo=1` and `retrans=5` or so. Note that this *will* put heavy strain on your NFS server after NFS restart. Try to not restart your NFS service so often ;) – bjanssen Aug 04 '14 at 12:27
  • Your answer is correct. I also found another simple solution: on the node that reports the stale NFS handle, just umount and remount the folder. – mahmood Aug 04 '14 at 15:35
  • The root cause for problems of this type is usually that the `rpc.statd` service fails to resolve the IP address of the other partner by the `uname -n` name it used before the reboot. Yes, even if you mount by IP address, the names must be resolvable, *because the NFS lock protocol uses names internally.* [Please see this answer of mine in another NFS question.](https://serverfault.com/a/1017642/442837) – telcoM Dec 23 '20 at 20:41
5

Try this script I wrote:

#!/bin/bash
# Purpose:
# Detect Stale File handle and remove it
# Script created: July 29, 2015 by Birgit Ducarroz
# Last modification: --
#

# Detect stale file handles and write the affected mount points into a file
df 2>&1 | grep 'Stale file handle' | awk '{print $2}' > NFS_stales.txt
# Remove : ‘ and ’ characters from the output
sed -r -i -e 's/://' -e 's/‘//' -e 's/’//' NFS_stales.txt

# Not used: replace space by a new line
# stales=`cat NFS_stales.txt && sed -r -i ':a;N;$!ba;s/ /\n /g' NFS_stales.txt`

# read NFS_stales.txt output file line by line then unmount stale by stale.
#    IFS='' (or IFS=) prevents leading/trailing whitespace from being trimmed.
#    -r prevents backslash escapes from being interpreted.
#    || [[ -n $line ]] prevents the last line from being ignored if it doesn't end with a \n (since read returns a non-zero exit code when it encounters EOF).

while IFS='' read -r line || [[ -n "$line" ]]; do
    echo "Unmounting due to NFS Stale file handle: $line"
    umount -fl "$line"
done < "NFS_stales.txt"
#EOF

In the meantime, I found that the above script does not work on all servers. Here is an update:

#!/bin/bash
# Purpose:
# Detect Stale File handle and remove it
# Script created: July 29, 2015 by Birgit Ducarroz
# Last modification: 23.12.2020  /bdu
#

MYMAIL="my.mail@something.com"
THIS_HOST=`hostname`

# Detect stale file handles and write the affected mount points into a file
df 2>&1 | grep 'Stale' | awk '{print $2}' > NFS_stales.txt
sleep 8

# Remove : and quote characters (‘ ’ ` ') from the output
sed -r -i -e 's/://' -e 's/‘//' -e 's/’//' -e 's/`//' -e "s/'//" NFS_stales.txt


# Not used: replace space by a new line
# stales=`cat NFS_stales.txt && sed -r -i ':a;N;$!ba;s/ /\n /g' NFS_stales.txt`

# read NFS_stales.txt output file line by line then unmount stale by stale.
#    IFS='' (or IFS=) prevents leading/trailing whitespace from being trimmed.
#    -r prevents backslash escapes from being interpreted.
#    || [[ -n $line ]] prevents the last line from being ignored if it doesn't end with a \n (since read returns a non-zero exit code when it encounters EOF).

while IFS='' read -r line || [[ -n "$line" ]]; do
    message="Unmounting due to NFS Stale file handle: $line"
    # Send the message as the mail body via a here-string
    mail -s "$THIS_HOST: NFS Stale Handle unmounted" "$MYMAIL" <<< "$message"
    umount -f -l "$line"
done < "NFS_stales.txt"
mount -a

#EOF
2

On the NFS server, unexport and then re-export the file system:

exportfs -u nfs-server:/file_system
exportfs nfs-server:/file_system

On the client, mount the file system:

mount -t nfs nfs-server:/file_system /mount_point

Chin
0

Run lsof on the affected path to find the processes that hold files open there, and kill the respective PIDs. Then unmount the file system and mount it again.
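The steps above could be sketched roughly as follows. This is only an illustration under a few assumptions: the function names run and remount_stale are hypothetical, fuser -km is swapped in for the manual lsof-and-kill step (it sends SIGKILL to every process using the mount), and the mount point is expected to have an /etc/fstab entry so that a plain mount can bring it back. Setting DRY_RUN=1 prints the commands instead of executing them.

```shell
#!/bin/bash
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: free a stale mount point and mount it again.
remount_stale() {
    local mount_point=$1
    # Kill processes that still hold files on the mount (SIGKILL by default).
    run fuser -km "$mount_point"
    # Force a lazy unmount, then remount from the /etc/fstab entry.
    run umount -f -l "$mount_point"
    run mount "$mount_point"
}

# Example (prints the three commands without running them):
#   DRY_RUN=1 remount_stale /data
```

Running it in dry-run mode first makes it easy to check what would be killed and unmounted before doing it as root.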