
On my FreeNAS NAS (9.1.1 running zfs v28), I am getting terrible performance for file moves between two directories in the same raidz fs. Is this expected? How can I fault-find this, if not?

The application in this case is Beets (MP3 management software), running in a jail on the NAS itself, so this isn't a case of CIFS performance or network issues - the data never leaves the server. All the software does is rename files into a different directory, but the performance is as if it were copying all the data.

The system is not under any particular load. I have actually stopped the other processes running on the server just to free up some memory and CPU, just in case.

Updated: The two directories are on the same mountpoint within the jail. The pool is 4 x 2TB SATA drives in a raidz1. No dedupe or compression.

Updated 2: disabling atime on the fs also makes no difference (thought I may as well try it).

Update 3: zfs/zpool output.

[root@Stillmatic2] ~# zpool status
  pool: jumbo1
 state: ONLINE
  scan: scrub repaired 0 in 95h19m with 0 errors on Wed Jul 16 23:20:06 2014
config:

        NAME        STATE     READ WRITE CKSUM
        jumbo1      ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ada0    ONLINE       0     0     0
            ada1    ONLINE       0     0     0
            ada2    ONLINE       0     0     0
            ada3    ONLINE       0     0     0

errors: No known data errors

[root@Stillmatic2] ~# zfs list
NAME                                                         USED  AVAIL  REFER  MOUNTPOINT
jumbo1                                                      5.32T  21.4G  40.4K  /mnt/jumbo1
jumbo1/data                                                 76.0G  21.4G  76.0G  /mnt/jumbo1/data
jumbo1/howie                                                2.03G  21.4G  2.03G  /mnt/jumbo1/howie
jumbo1/jails                                                45.1G  21.4G   139M  /mnt/jumbo1/jails
jumbo1/jails/.warden-template-9.1-RELEASE-amd64              347M  21.4G   347M  /mnt/jumbo1/jails/.warden-template-9.1-RELEASE-amd64
jumbo1/jails/.warden-template-9.1-RELEASE-amd64-pluginjail   853M  21.4G   852M  /mnt/jumbo1/jails/.warden-template-9.1-RELEASE-amd64-pluginjail
jumbo1/jails/hj-tools                                       43.8G  21.4G  44.1G  /mnt/jumbo1/jails/hj-tools
jumbo1/movies                                               1.56T  21.4G  1.56T  /mnt/jumbo1/movies
jumbo1/music                                                1.45T  21.4G  1.45T  /mnt/jumbo1/music
jumbo1/tv                                                   2.19T  21.4G  2.19T  /mnt/jumbo1/tv
AnotherHowie
    Are you sure Beets actually *moves* the data, and does not copy and delete it to try to prevent problems from becoming critical? What is the layout of your pool? Does `zpool status` indicate any problems? Is this *really* within the same file system (same pool doesn't count)? – user Aug 11 '14 at 14:41
  • Are you sure they're in the same dataset? – Michael Hampton Aug 11 '14 at 14:43
  • Beets (in Python) is using os.rename(path, dest), although with a fallback to copy+delete if that fails for some reason. I will write a little test to see if it would fall back. – AnotherHowie Aug 11 '14 at 15:17
  • Definitely renaming (by pulling out a minimal testcase from the Beets code). – AnotherHowie Aug 11 '14 at 15:29
  • 1
    What are your server specs, especially RAM? – duenni Aug 11 '14 at 20:23
  • It's a HP N36L (AMD Neo II) with 8GB. Its actual fileserver (CIFS) performance isn't bad at all for a little system. I'm just confused why a local metadata rewrite should slow down so much. – AnotherHowie Aug 11 '14 at 20:27
  • Show your `zfs list` and `zpool status`. – ewwhite Aug 11 '14 at 20:44
  • And where are you moving your files FROM and where are they being copied TO? – ewwhite Aug 11 '14 at 21:24
  • @ewwhite - from jumbo1/music/Incoming to jumbo1/music/Cleaned (i.e. two directories on the same mount) – AnotherHowie Aug 11 '14 at 21:30
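The "minimal testcase" mentioned in the comments would look something like the sketch below (the paths and filename are hypothetical stand-ins for the Incoming/Cleaned directories). It confirms that `os.rename` within one filesystem is a metadata-only operation: the file keeps its inode, so no data is copied. Across filesystems, `os.rename` instead raises `OSError` (EXDEV), which is what triggers Beets' copy+delete fallback.

```python
import os
import tempfile

# Build two sibling directories on the same filesystem, mimicking the
# Incoming -> Cleaned move (names are illustrative, not from Beets).
base = tempfile.mkdtemp()
src_dir = os.path.join(base, "Incoming")
dst_dir = os.path.join(base, "Cleaned")
os.makedirs(src_dir)
os.makedirs(dst_dir)

src = os.path.join(src_dir, "track.mp3")
with open(src, "wb") as f:
    f.write(b"\x00" * 1024)  # dummy payload

inode_before = os.stat(src).st_ino
dst = os.path.join(dst_dir, "track.mp3")
os.rename(src, dst)  # raises OSError (EXDEV) if dst were on another fs
inode_after = os.stat(dst).st_ino

# Same inode => a true rename, not a copy+delete.
print(inode_before == inode_after)
```

If this prints `True` on the affected pool (as it did for the poster), the slowness cannot be explained by a copy fallback, which points the investigation at the filesystem itself.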

1 Answer


21GB free out of ~6TB => less than 1% free space. The usual ZFS guidance is to keep 20% free on a RAIDZ pool, and at least 10% is effectively mandatory for any reasonable performance. You need to free up some space or expand the size of the array.
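A quick way to check whether a pool has crossed that threshold is to compute the free-space percentage for its mountpoint. This is a minimal sketch; the `/mnt/jumbo1` path is the poster's mountpoint and the 10% floor is the rule of thumb above, not a hard ZFS constant:

```python
import shutil

def free_space_pct(mountpoint):
    """Return the percentage of free space on the given mountpoint."""
    usage = shutil.disk_usage(mountpoint)
    return usage.free / usage.total * 100

# Substitute your own pool's mountpoint, e.g. "/mnt/jumbo1".
pct = free_space_pct("/")
print(f"{pct:.1f}% free")
if pct < 10:
    print("warning: below ~10% free - expect ZFS performance to degrade")
```

(`zpool list` and `zfs list` give the same numbers from the ZFS side; the point is simply to watch the percentage, not the absolute gigabytes.)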

Side notes:

  1. SATA drives need to be scrubbed weekly if you expect to detect array failures before you get into likely data-loss territory. Looks like it's been a month since the last scrub.
  2. You're probably looking at whole-percentage odds of array failure during a rebuild, because of the way RAIDZ rebuilds work. See What counts as a 'large' raid 5 array? for details.
Chris S
  • Totally. This is a space issue. – ewwhite Aug 11 '14 at 21:46
  • Noted. More capacity is to be added soon (and I'll adjust the schedule for scrubbing), but would the freespace issue affect metadata updates so badly or mainly actual data throughput? – AnotherHowie Aug 11 '14 at 21:53
  • @AnotherHowie How will you add space? You can't expand a RAIDZ array. And yes, lack of free space will mess you up! – ewwhite Aug 11 '14 at 21:55
  • New server waiting in the wings (no disks yet), so it will either be a partial copy of the data between the two, or a larger disks (still 4) in the new one. External eSATA drive enclosures cost about as much as these servers. – AnotherHowie Aug 11 '14 at 22:15
  • Getting back to 250GB free space does indeed seem to be having a positive effect (even if it's still only 4% free). Thanks for your help! – AnotherHowie Aug 11 '14 at 22:37
  • Yeah, when you get under a few percent ZFS just starts crawling. I wish I understood why better, but it comes with the territory for now. – Chris S Aug 12 '14 at 01:50