
I have a Solaris 10 ZFS-based NFS server. The physical disks are more or less at their maximum I/O rates and performance is very bad, so we will add spindles. The NFS export solely serves as storage for XenServer hypervisors.

I want to know which VM disks (that is, which .vhd files on the storage) are producing most of the load. How can I query the filesystem, or maybe nfsd, to get an iostat- or top-like output with filename and reads/writes? The numbers can be absolute or relative.

I tried iosnoop, which definitely goes in the right direction, but unfortunately it seems unable to resolve filenames on a ZFS filesystem. I have no experience with DTrace. Maybe there is already a script out there?

Roman

2 Answers


You need to use DTrace for that level of introspection. Here is basically the nfsv3fileio.d example from https://wikis.oracle.com/display/DTrace/nfsv3+Provider, but updated to print live updates (I think; my testing was minimal). That same page has a few more examples. You might also Google for 'nfssvrtop'.

#!/usr/sbin/dtrace -s

#pragma D option quiet

dtrace:::BEGIN
{
        trace("Tracing.. hit CTRL-C to end. Updates every 5 seconds.\n");
}

nfsv3:::op-read-done
{
        /* sum bytes read per file, keyed by the file's path on the server */
        @bytes_read[args[1]->noi_curpath] = sum(args[2]->res_u.ok.data.data_len);
}

nfsv3:::op-write-done
{
        /* sum bytes written per file, keyed by the file's path on the server */
        @bytes_written[args[1]->noi_curpath] = sum(args[2]->res_u.ok.count);
}

profile:::tick-5sec
{
        /* keep the 15 busiest files, print the interval's totals, then reset */
        trunc(@bytes_read, 15);
        trunc(@bytes_written, 15);
        printf("\n%15s   %15s   %s\n", "Bytes Read/5s", "Bytes Written/5s", "Pathname");
        printa("%@15d   %@15d   %s\n", @bytes_read, @bytes_written);
        trunc(@bytes_read);
        trunc(@bytes_written);
}
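
To run it (the filename below is just an example, not from the original script), save it, make it executable, and start it as root on the NFS server; it prints a new top-15 table every 5 seconds:

chmod +x nfsfiletop.d
./nfsfiletop.d        # equivalent to: dtrace -s nfsfiletop.d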
Nex7

Do you have any level of NVRAM write caching in your setup? In the ZFS case, that would be the presence of a dedicated ZIL (log) device. If not, that's probably the key to your performance problems.
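
For reference, a quick way to check whether the pool already has a dedicated log device, and to add one if not (the pool and device names below are placeholders, not from the original post):

zpool status tank            # look for a "logs" section listing a log vdev
zpool add tank log c0t5d0    # attach an SSD as a dedicated ZIL device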

ewwhite
  • Actually, the ZIL has become corrupted (bad SSDs). It will be replaced ASAP. So you are right, no ZIL at the moment. – Roman Mar 15 '13 at 14:59
  • ... And because "ASAP" is a bit too distant in time, I want to move some of the major load to another system. – Roman Mar 15 '13 at 15:06
  • I would disable the ZIL for the moment, as that will help performance while you work to replace your ZIL SSDs. – ewwhite Mar 15 '13 at 17:04
  • Do you mean the pool-wide ZIL, which is now on the spindles? – Roman Mar 15 '13 at 22:39
  • Right, but you need to set sync=disabled for the relevant filesystems (see the example below). – ewwhite Mar 15 '13 at 22:54
  • OK. In the meantime we had the chance to replace the SSDs, so no intermediate action is necessary. Still, it would be really nice to know how the I/O load is distributed over the files. – Roman Mar 17 '13 at 12:24
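
For reference, the sync=disabled workaround suggested in the comments would look roughly like this (the dataset name is a placeholder); disabling sync trades crash safety for speed, so revert it once a working log device is back:

zfs set sync=disabled tank/xenstore    # skip synchronous write semantics (risk of data loss on power failure)
zfs set sync=standard tank/xenstore    # revert once the ZIL SSDs are replaced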