Why does ZFS slow down when writing a large file past 2.4 GB?


ZFS on Linux: I'm creating a large tar file and watching it grow. The write goes very quickly until the archive reaches about 2.4 GB, then it crawls for hours.

Creating the same file on ext4 causes no problem. Does anybody have any idea why that might be?

The ZFS filesystem is on a mirrored 1 TB vdev, with plenty of space:

    zpool list z
    NAME   SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
    z      928G   463G   465G         -    28%    49%  1.00x  ONLINE  -
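
For reference, the job itself is roughly the following. The paths and the watch interval are placeholders, not my real ones; the source tree is not on ZFS, only the output archive lands on the pool:

    # Archive a tree of small (2-4 MB) files from a non-ZFS source into the pool
    tar -cf /z/backup.tar -C /mnt/source .

    # In other terminals, watch the archive grow and the pool disks
    watch -n 5 ls -lh /z/backup.tar
    iostat -x 2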

Edit: Adding the results of zfs get all and iostat, as requested in the comments...
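
These are the two commands qasdfdsaq suggested below; the two-second iostat interval is the one from that comment:

    zfs get all z
    iostat -x 2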

    root@io:~# zfs get all z
    NAME  PROPERTY              VALUE                  SOURCE
    z     type                  filesystem             -
    z     creation              Sat Jul 25  8:29 2015  -
    z     used                  472G                   -
    z     available             427G                   -
    z     referenced            19K                    -
    z     compressratio         1.10x                  -
    z     mounted               yes                    -
    z     quota                 none                   default
    z     reservation           none                   default
    z     recordsize            128K                   default
    z     mountpoint            /z                     default
    z     sharenfs              off                    default
    z     checksum              on                     default
    z     compression           lz4                    local
    z     atime                 on                     default
    z     devices               on                     default
    z     exec                  on                     default
    z     setuid                on                     default
    z     readonly              off                    default
    z     zoned                 off                    default
    z     snapdir               hidden                 default
    z     aclinherit            restricted             default
    z     canmount              on                     default
    z     xattr                 on                     default
    z     copies                1                      default
    z     version               5                      -
    z     utf8only              off                    -
    z     normalization         none                   -
    z     casesensitivity       sensitive              -
    z     vscan                 off                    default
    z     nbmand                off                    default
    z     sharesmb              off                    default
    z     refquota              none                   default
    z     refreservation        none                   default
    z     primarycache          all                    default
    z     secondarycache        all                    default
    z     usedbysnapshots       0                      -
    z     usedbydataset         19K                    -
    z     usedbychildren        472G                   -
    z     usedbyrefreservation  0                      -
    z     logbias               latency                default
    z     dedup                 off                    default
    z     mlslabel              none                   default
    z     sync                  standard               default
    z     refcompressratio      1.00x                  -
    z     written               19K                    -
    z     logicalused           521G                   -
    z     logicalreferenced     9.50K                  -
    z     filesystem_limit      none                   default
    z     snapshot_limit        none                   default
    z     filesystem_count      none                   default
    z     snapshot_count        none                   default
    z     snapdev               hidden                 default
    z     acltype               off                    default
    z     context               none                   default
    z     fscontext             none                   default
    z     defcontext            none                   default
    z     rootcontext           none                   default
    z     relatime              off                    default
    z     redundant_metadata    all                    default
    z     overlay               off                    default




When it's going fast:

    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
               4.30    0.25    8.97    9.29    0.00   77.20

    Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
    sda               0.00     0.00    0.00    0.50     0.00     6.00    24.00     0.00    0.00    0.00    0.00   0.00   0.00
    sdb               0.00     0.50    0.00    1.00     0.00     6.00    12.00     0.00    2.00    0.00    2.00   2.00   0.20
    sdc               0.00     0.00  108.50  199.50  3702.50 11780.25   100.54     3.78   12.35   23.93    6.06   2.69  82.80
    sdd               0.00     0.00  255.00  177.50  1930.50 10308.25    56.60     2.39    5.57    5.49    5.68   1.95  84.40

    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
               4.00    0.13   10.98    2.98    0.00   81.92

    Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
    sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
    sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
    sdc               0.00     0.00  401.50    0.00  1168.50     0.00     5.82     1.09    2.71    2.71    0.00   1.48  59.40
    sdd               0.00     0.00  443.50    0.00  9012.00     0.00    40.64     1.70    3.83    3.83    0.00   1.31  58.00


After 2.1 GB:

    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
               2.41    0.00    3.99   15.76    0.00   77.85

    Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
    sda               0.00     0.00    0.00    4.50     0.00    20.00     8.89     0.00    0.00    0.00    0.00   0.00   0.00
    sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
    sdc               1.00     0.00  173.00   51.00  8988.00  1645.75    94.94     3.31   14.79   17.47    5.73   4.09  91.60
    sdd               2.00     0.00  357.50   36.50 21646.00   818.75   114.03     3.90   10.39   11.14    3.01   2.19  86.40

    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
               3.17    0.00    2.09   10.71    0.00   84.03

    Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
    sda               0.00     7.50    0.00    1.50     0.00    40.00    53.33     0.00    2.67    0.00    2.67   2.67   0.40
    sdb               0.00    28.00    0.50   13.00     2.00   162.00    24.30     0.03    1.93    0.00    2.00   1.93   2.60
    sdc               2.00     0.00  360.00    0.00 22623.50     0.00   125.69     5.33   14.65   14.65    0.00   2.71  97.60
    sdd               0.00     0.00  163.50    0.00  7950.25     0.00    97.25     3.46   21.17   21.17    0.00   5.83  95.40

So there's a lot more read I/O when it slows down. Any idea why? I'd think tar should just be writing.

Thanks.

Stu

How much RAM do you have? What sizes are the files? – qasdfdsaq – 2015-10-18T00:08:16.847

8 GB of RAM, 2-4 MB files each. – Stu – 2015-10-18T17:46:55.713

Oh wait, sorry, wrong machine: 2-4 MB files, 32 GB of RAM. – Stu – 2015-10-18T17:47:20.067

Can you do 'zfs get all z' and 'iostat -x 2'? – qasdfdsaq – 2015-10-19T14:06:38.300

I'm curious whether the problem is due to reading or writing. Can you try dd if=/dev/urandom of=/z/somefileonzfs bs=1M, keep an eye on the size of the named output (of) file, and see whether it also slows down after a couple of gigabytes in that case? Note: this will exhaust your system's random-pool entropy, so don't do something like generate a new PGP key immediately afterwards... (You can also turn off compression on your filesystem and use /dev/zero instead of /dev/urandom.) If the write slows down in the dd case, the problem stems from writing; if not, it stems from reading. – a CVn – 2015-10-20T09:55:05.630
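
Spelled out, the two variants of that test look roughly like this. The output path is the placeholder from the comment above, and the count is just an arbitrary cap of about 4 GB:

    # Variant 1: incompressible data, compression left at its current lz4 setting
    dd if=/dev/urandom of=/z/somefileonzfs bs=1M count=4096

    # Variant 2: compressible zeros, with compression temporarily disabled
    zfs set compression=off z
    dd if=/dev/zero of=/z/somefileonzfs bs=1M count=4096
    zfs set compression=lz4 z    # restore the original setting

    # Meanwhile, in another terminal, watch the output file grow
    watch -n 2 ls -lh /z/somefileonzfs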

I'll try that when I get home tonight, but it's also worth mentioning that the original source of the data being tarred up is not coming from ZFS... so what's to read? – Stu – 2015-10-20T13:39:28.817

Okay, I tried it. I/O stayed pretty consistent throughout; it got past 2.2 GB just fine, no hiccups, so maybe it has something to do with tar. – Stu – 2015-10-20T23:29:43.003

You have compression on, which can interfere with creating large archives, but tar is also supposed to be perfectly linear, since it was originally meant for tape backups. There are definitely more reads going on and it's hard to tell why; perhaps iotop would give a clue. Perhaps it's some other, unrelated application reading more as a result of shrinking RAM cache space? – qasdfdsaq – 2015-11-05T14:43:38.603
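
For anyone trying that, a minimal way to capture what is actually reading during the slowdown (standard iotop flags; the log filename is arbitrary):

    # Show only threads doing I/O, accumulate totals, batch mode so it can be logged
    iotop -o -a -b -d 2 | tee iotop.log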

No answers