7

The configuration is a Linux server and a NAS box (Netgear) acting as NFS server.

It is easy for a single process on the Linux server to use all I/O bandwidth by simply copying a file from the NFS share to the NFS share. The I/O channel is jammed, and all other processes on the server nearly halt waiting for I/O. Load grows to 10-20 (on four cores), more and more pdflush processes appear... until someone stops the file copy.

How can I limit the I/O bandwidth the cp process uses? nice will not help, of course, but ionice -c3 has no effect either. Does ionice affect NFS mounts at all? Is there something like nfsnice?

Moritz Both

2 Answers

1

What are your "rsize" and "wsize" values set to?

Normally, modern Linux NFS clients negotiate the maximum values with the server, but sometimes they end up way off base. For example, we had rsize=1m,wsize=1m in /proc/mounts, not knowing that the NAS was unable to support more than 32768. We saw the same slowness and the same skyrocketing load that you describe.

Setting both values down to 32k immediately solved the slowness and the rising load for us; the desktop remained perfectly responsive even while copying gigabytes over NFS. And we have our home directories on NFS...
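To see which values a client actually ended up with, the mount options in /proc/mounts can be read programmatically. The following is only a minimal sketch: the function name `nfs_transfer_sizes` is made up for illustration, and it assumes the usual /proc/mounts layout (device, mount point, filesystem type, comma-separated options) with numeric `rsize=` and `wsize=` entries.

```python
def nfs_transfer_sizes(mounts_text):
    """Return {mount_point: {"rsize": int, "wsize": int}} for NFS mounts.

    mounts_text is expected to look like the contents of /proc/mounts.
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        sizes = {}
        for option in fields[3].split(","):
            key, _, value = option.partition("=")
            # Keep only the two transfer-size options, when they are numeric
            if key in ("rsize", "wsize") and value.isdigit():
                sizes[key] = int(value)
        result[fields[1]] = sizes
    return result
```

On a Linux client, passing the result of `open("/proc/mounts").read()` to this function lists the sizes in effect for each NFS mount. Values far above what the NAS can handle would be a hint to set `rsize` and `wsize` explicitly in the mount options.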

Perhaps your NAS's NFS server implementation does a little "show off" by offering more size than it can chew...?

Cheers

Christian
0

It seems the Netgear NAS is not keeping up and is causing blocked I/O. What does the NAS look like? How many drives? What does the RAID config look like? This appears to be a server-side issue.

ewwhite
  • The configuration is "RAID X", a proprietary Netgear level, with three 1 TB disks resulting in 2 TB capacity (similar to RAID 5). The whole disk space is dedicated to the NFS share. – Moritz Both Oct 31 '11 at 23:31
  • It may be a server-side issue, but with only one process accessing the NFS share, things work quite well. As soon as we have more processes, some tend to hang. An `nfsnice` command / API might still help... – Moritz Both Oct 31 '11 at 23:38
  • Do you have any way to see the CPU utilization of the Netgear or its disk activity during the period of blocked I/O on the NFS client? – ewwhite Oct 31 '11 at 23:38