I use PHP's flock() (which calls the system flock()) on servers that share a file system via NFS.
flock() fails when I request an exclusive, blocking lock on the same (shared) file from 2 servers. Of course, only one process should be able to obtain the exclusive lock, but the other should block until the lock is released. What I am seeing instead is that the second flock() call returns immediately with an error.
If I do the same thing (start 2 programs that each request an exclusive, blocking lock) on 1 server, it works.
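For reference, the single-server behaviour can be reproduced with a short shell sketch. It assumes the flock utility from util-linux is installed; the function name contend and the 1-second and 2-second delays are arbitrary choices for illustration only.

```shell
#!/bin/sh
# Illustrative sketch: two processes compete for an exclusive, blocking lock
# on the same file. "contend" is a hypothetical helper name.
contend() {
    lockfile="$1"
    # First process: take the lock, hold it for 2 seconds, then exit.
    flock -x "$lockfile" sh -c 'echo first-acquired; sleep 2; echo first-released' &
    # Crude synchronisation: give the first process time to win the lock.
    sleep 1
    # Second process: blocks here until the first one releases the lock.
    flock -x "$lockfile" echo second-acquired
    # Reap the background process before returning.
    wait
}
```

Calling contend lock.txt on a local file system should print first-acquired, first-released and second-acquired in that order, because the second flock call waits instead of failing.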
The question is: should this work? Is file locking over NFS not recommended in general? (The claims I found that it does not work at all were often based on outdated information.) If it should work, what can I do to debug or solve this?
Test setup
(I originally used a PHP script, but the problem can be reproduced more simply with the command-line flock utility):
System 1:
flock -x lock.txt sleep 10
Result: lock is acquired
System 2 (while System 1 has lock acquired):
flock -x lock.txt sleep 10
This returns immediately with
flock: lock.txt: No locks available
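To make "returns immediately" measurable, a small wrapper along these lines could report the exit status and how long flock waited. This is only a sketch: lock_probe is a hypothetical name, and it runs true instead of sleep 10 so that the probe itself releases the lock at once.

```shell
#!/bin/sh
# Illustrative sketch: report flock's exit status and the time spent waiting.
# "lock_probe" is a hypothetical helper; the default file name matches the test.
lock_probe() {
    lockfile="${1:-lock.txt}"
    start=$(date +%s)
    # Request an exclusive, blocking lock; 'true' exits right after acquiring it.
    flock -x "$lockfile" true
    status=$?
    end=$(date +%s)
    # Whole-second resolution is enough to tell "blocked" from "failed at once".
    echo "status=$status waited=$((end - start))s"
}
```

Run on System 2 while System 1 holds the lock, a working setup should report status=0 after a wait of several seconds. The failure described above would instead show a non-zero status with a wait close to 0s.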
Diagnosis
strace flock -x lock.txt sleep 10
flock(3, LOCK_EX) = -1 ENOLCK (No locks available)
Enable NFS debug logging on the client with rpcdebug -m nfs -s all
This is the log for the failed flock attempt.
/var/log/messages
Feb 4 10:24:51 myclient kernel: NFS: initiated commit call
Feb 4 10:24:51 myclient kernel: NFS: 6791 nfs_commit_done (status 0)
Feb 4 10:24:51 myclient kernel: NFS: nfs_update_inode(0:40/916722366 fh_crc=0xa8927c2a ct=1 info=0x27e7f)
Feb 4 10:24:51 myclient kernel: NFS: commit (0:40/916722366 1358@4096) OK
Feb 4 10:24:59 myclient kernel: NFS: permission(0:41/872433655), mask=0x81, res=-10
Feb 4 10:24:59 myclient kernel: NFS call access
Feb 4 10:24:59 myclient kernel: NFS: nfs_update_inode(0:41/872433655 fh_crc=0x9e46fe1a ct=2 info=0x27e7f)
Feb 4 10:24:59 myclient kernel: NFS reply access: 0
Feb 4 10:24:59 myclient kernel: NFS: permission(0:41/872433655), mask=0x1, res=0
Feb 4 10:24:59 myclient kernel: NFS: nfs_lookup_revalidate(/lock.txt) is valid
Feb 4 10:24:59 myclient kernel: NFS: permission(0:41/915542237), mask=0x10, res=0
Feb 4 10:24:59 myclient kernel: NFS: dentry_delete(/lock.txt, 40808cc)
Feb 4 10:24:59 myclient kernel: NFS: permission(0:41/872433655), mask=0x81, res=0
Feb 4 10:24:59 myclient kernel: NFS: nfs_lookup_revalidate(/lock.txt) is valid
Feb 4 10:24:59 myclient kernel: NFS: revalidating (0:41/915542237)
Feb 4 10:24:59 myclient kernel: NFS call getattr
Feb 4 10:24:59 myclient kernel: NFS reply getattr: 0
Feb 4 10:24:59 myclient kernel: NFS: nfs_update_inode(0:41/915542237 fh_crc=0x35293470 ct=1 info=0x27e7f)
Feb 4 10:24:59 myclient kernel: NFS: nfs3_forget_cached_acls(0:41/915542237)
Feb 4 10:24:59 myclient kernel: NFS: (0:41/915542237) revalidation complete
Feb 4 10:24:59 myclient kernel: NFS: dentry_delete(/lock.txt, 40808cc)
Feb 4 10:24:59 myclient kernel: NFS: nfs_weak_revalidate: inode 872433655 is valid
Feb 4 10:24:59 myclient kernel: NFS: permission(0:41/872433655), mask=0x81, res=0
Feb 4 10:24:59 myclient kernel: NFS: revalidating (0:41/915542237)
Feb 4 10:24:59 myclient kernel: NFS call getattr
Feb 4 10:24:59 myclient kernel: NFS reply getattr: 0
Feb 4 10:24:59 myclient kernel: NFS: nfs_update_inode(0:41/915542237 fh_crc=0x35293470 ct=1 info=0x27e7f)
Feb 4 10:24:59 myclient kernel: NFS: (0:41/915542237) revalidation complete
Feb 4 10:24:59 myclient kernel: NFS: nfs_lookup_revalidate(/lock.txt) is valid
Feb 4 10:24:59 myclient kernel: NFS call access
Feb 4 10:24:59 myclient kernel: NFS: nfs_update_inode(0:41/915542237 fh_crc=0x35293470 ct=1 info=0x27e7f)
Feb 4 10:24:59 myclient kernel: NFS reply access: 0
Feb 4 10:24:59 myclient kernel: NFS: permission(0:41/915542237), mask=0x24, res=0
Feb 4 10:24:59 myclient kernel: NFS: open file(/lock.txt)
Feb 4 10:24:59 myclient kernel: NFS: llseek file(/lock.txt, 0, 1)
Feb 4 10:24:59 myclient kernel: NFS: flock(/lock.txt, t=1, fl=82)
Feb 4 10:24:59 myclient kernel: NFS: flush(/lock.txt)
Feb 4 10:24:59 myclient kernel: NFS: release(/lock.txt)
Feb 4 10:24:59 myclient kernel: NFS: dentry_delete(/lock.txt, 40808cc)
System
RHEL
uname -r
3.10.0-1062.9.1.el7.x86_64
nfsstat -s
Server rpc stats:
calls badcalls badclnt badauth xdrcall
0 0 0 0 0
Client rpc stats:
calls retrans authrefrsh
588092 0 588092
Client nfs v3:
null getattr setattr lookup access readlink
0 0% 350667 59% 0 0% 1714 0% 231693 39% 5 0%
read write create mkdir symlink mknod
748 0% 2243 0% 0 0% 3 0% 0 0% 0 0%
remove rmdir rename link readdir readdirplus
0 0% 0 0% 0 0% 0 0% 0 0% 110 0%
fsstat fsinfo pathconf commit
0 0% 10 0% 5 0% 889 0%
mount options:
rw,nosuid,noexec,noatime,nodiratime,context=system_u:object_r:httpd_sys_rw_content_t:s0,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=someip,mountvers=3,mountport=300,mountproto=udp,local_lock=none,addr=someip
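The two options most relevant to locking here are vers and local_lock. A minimal sketch for pulling a single value out of an option string like the one above (mount_opt is a hypothetical helper, and the shortened option string in the usage note is only for illustration):

```shell
#!/bin/sh
# Illustrative sketch: print the value of one key from a comma-separated
# NFS mount option string. "mount_opt" is a hypothetical helper name.
mount_opt() {
    key="$1"
    opts="$2"
    # Put each option on its own line, then print the value of the option
    # whose name matches exactly (the match is anchored at the line start).
    printf '%s\n' "$opts" | tr ',' '\n' | sed -n "s/^${key}=//p"
}
```

For example, mount_opt local_lock 'rw,hard,vers=3,mountvers=3,local_lock=none' prints none, and mount_opt vers with the same string prints 3. The mountvers option is not matched because the key is anchored at the start of each option.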
I have searched for information on this topic. Some hits are quite old, were never answered, or refer to older Linux versions in which flock() did not yet support locking over NFS.
For example, on my system, man 2 flock gives the following information:
In Linux kernels up to 2.6.11, flock() does not lock files over NFS (i.e., the scope of locks was limited to the local system). Instead, one could use fcntl(2) byte-range locking, which does work over NFS, given a sufficiently recent version of Linux and a server which supports locking. Since Linux 2.6.12, NFS clients support flock() locks by emulating them as byte-range locks on the entire file. This means that fcntl(2) and flock() locks do interact with one another over NFS. Since Linux 2.6.37, the kernel supports a compatibility mode that allows flock() locks (and also fcntl(2) byte region locks) to be treated as local; see the discussion of the local_lock option in nfs(5).