5

I am running a Debian server that exports a large JFS filesystem (22TB) over NFS (nfs-kernel-server). Writes to the NFS share perform very poorly. The 22TB disk sits on a NAS and is mounted over iSCSI.

  • It will burst for a moment near expected line speed, and then sit idle for several seconds, with very little traffic (measured in the low KB/sec).
  • The wait peaks on write.
  • When reading from the NFS mount, the system operates at expected speeds (11MB/sec).
  • The issue does not occur when using SFTP, rsync, or local copying (non-NFS).
  • The issue persists between stable and testing releases.
  • On the same machine I have a 14TB ext4 filesystem using the exact same export configuration that does not share the issue. This share is not in regular use and thus not consuming resources.

NFS Server:

cat /etc/exports
/data2      10.1.20.86(rw,no_subtree_check,async,all_squash)

cat /sys/block/sdb/queue/scheduler
noop [deadline] cfq

cat /etc/default/nfs-kernel-server 
RPCNFSDCOUNT=8
RPCNFSDPRIORITY=0
RPCMOUNTDOPTS=--manage-gids
NEED_SVCGSSD=
RPCSVCGSSDOPTS=

NFS Client:

cat /etc/fstab
10.1.20.100:/data2  /root/incoming  nfs     rw,noatime,soft,intr,noacl 0 2

cat /sys/block/sdb/queue/scheduler
noop [deadline] cfq

cat /proc/mounts
10.1.20.100:/data2/ /root/incoming nfs4 rw,noatime,vers=4,rsize=262144,wsize=262144,namlen=255,soft,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.1.20.86,minorversion=0,addr=10.1.20.100 0 0

This problem has me pretty stumped. Any help would be greatly welcomed. Thanks.

user143546
  • Are the mount options of the 14T and 22T filesystems the same on the NFS server? – John Siu Nov 10 '12 at 01:55
  • Yeah, I don't know that people use JFS in this setup often. If ext4 doesn't exhibit the issue, perhaps that's an indicator that the issue lies with the filesystem. – ewwhite Nov 20 '12 at 14:45
  • Other questions. Network: How many network interfaces does the server have? Are any of them bonded? Is the iSCSI and NFS traffic isolated from other network traffic, either with VLANs or separate switches? Are jumbo frames enabled? Is the iSCSI server accessed via a different VLAN, i.e. does network traffic flow through a router? What is the client network configuration/speed? iSCSI: What is the RAID configuration? Those are a couple of questions that will help with an answer :) – Danie Nov 20 '12 at 13:06
  • Did you try to use NFS v3 (mount -o vers=3)? V4 performance depends very heavily on the kernel version. – kofemann Feb 20 '13 at 19:51
  • I don't have an answer, but some info and questions. You have 2 layers here and 3 points of observation. The layers are: (1) the JFS layer (the one talking to the disk); I haven't noticed you mention the exact list of mount options you're using, so maybe there is room for improvement there; (2) the NFS layer (the shared one); NFS has a statistics tool, `nfsstat`. The observation points are: the local filesystem, the NFS server, and the NFS client. What I'd suggest is running `nfsstat` on client and server, before, during and after the write/read test, AND on both - the **good** and the **bad** v – Max K. Feb 20 '13 at 19:42
  • Are you write heavy, read heavy, or a little bit of both? I noticed you are using deadline for your I/O scheduler. Did you get worse performance when you were using cfq? – madflojo Jun 21 '13 at 14:50

3 Answers

2

My guess is that the number of NFS server threads is too low. Instead of 8, the number should be much higher.

8 threads would probably be enough for shares that contain only small files and are accessed by a very small number of clients (e.g. in a home network) or on slow networks (10 Mbit).

Try to determine the retrans value on your NFS server during writing:

nfsstat -r

If you get transmission retries, increase the number of server threads.

And I think it would be safe to remove the rsize / wsize / tcp settings from your mount options. TCP is the default protocol anyway, and with TCP it is not necessary to limit the transfer size.
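The retransmission check above can be sketched as a before/after comparison around a write test. This is a minimal sketch, not a definitive procedure: as far as I know the `retrans` column is printed in the client-side RPC statistics (`nfsstat -rc`), so the sample is taken on the client here. The function names are made up for illustration, and the thread count of 32 in the comments is only an assumed starting value, not a recommendation from the answer.

```shell
#!/bin/sh
# Read `nfsstat -rc` style output on stdin and print the "retrans" counter.
# Assumes the usual layout: a header line with column names, then one line of values.
retrans_from_nfsstat() {
    awk '
        /retrans/ {
            for (i = 1; i <= NF; i++) if ($i == "retrans") col = i
            if (col && (getline > 0)) print $col
            exit
        }
    '
}

# Print how many retransmissions happened between two samples.
# $1 = counter before the test, $2 = counter after the test.
retrans_delta() {
    echo $(( $2 - $1 ))
}

# Example, not run automatically:
#   before=$(nfsstat -rc | retrans_from_nfsstat)
#   ... run the slow write to the NFS mount ...
#   after=$(nfsstat -rc | retrans_from_nfsstat)
#   retrans_delta "$before" "$after"
#
# If the delta keeps growing, raise the thread count on the server, e.g. set
# RPCNFSDCOUNT=32 (assumed value) in /etc/default/nfs-kernel-server and restart
# nfs-kernel-server, or change it at runtime with: rpc.nfsd 32
```

A delta of zero during the stalls would suggest the thread count is not the bottleneck and the other layers (JFS, iSCSI) deserve a closer look.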

SvennD
life-on-mars
0

I suspect a problem with jumbo frames. Check the offloading configuration on both of your interfaces using

sudo ethtool -k your_nic

and also try capturing the traffic on the wire with Wireshark. You may find some out-of-order packets, duplicates, ...
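One way to test the jumbo frame suspicion directly is to send pings that are not allowed to fragment and that are sized to fill the MTU exactly. The sketch below assumes Linux iputils `ping` (for `-M do`) and an MTU of 9000, which is only an assumed value since the question does not state the configured MTU. The function names are hypothetical.

```shell
#!/bin/sh
# Largest ICMP echo payload that fits in one IPv4 packet of the given MTU:
# MTU minus 20 bytes of IPv4 header (no options) minus 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Send a few non-fragmentable pings sized for the given MTU (Linux iputils ping).
# $1 = target host, $2 = MTU to test (assumed default: 9000).
check_mtu_path() {
    ping -c 3 -M do -s "$(icmp_payload_for_mtu "${2:-9000}")" "$1"
}

# Example, not run automatically (server address taken from the question):
#   check_mtu_path 10.1.20.100 9000
```

If the full-size pings fail while `check_mtu_path 10.1.20.100 1500` succeeds, some device on the path is probably not passing jumbo frames, which would be consistent with large writes stalling while small traffic gets through.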

jary
0

Maybe it is an incompatibility between JFS and the NFS locking used for writing. I found a bug report for Ubuntu: https://bugs.launchpad.net/ubuntu/+source/jfsutils/+bug/754495

Znik