
Our FreeNAS server log is constantly filling up with messages like these:

Apr  5 15:13:49 Wheelhouse NAS kernel: swap_pager: I/O error - pagein failed; blkno 524312,size 4096, error 6
Apr  5 15:13:49 Wheelhouse NAS kernel: vm_fault: pager read error, pid 1 (init)
Apr  5 15:13:49 Wheelhouse NAS kernel: swap_pager: I/O error - pagein failed; blkno 524312,size 4096, error 6
Apr  5 15:13:49 Wheelhouse NAS kernel: vm_fault: pager read error, pid 1 (init)
Apr  5 15:13:49 Wheelhouse NAS kernel: swap_pager: I/O error - pagein failed; blkno 524312,size 4096, error 6
Apr  5 15:13:49 Wheelhouse NAS kernel: vm_fault: pager read error, pid 1 (init)
Apr  5 15:13:49 Wheelhouse NAS kernel: swap_pager: I/O error - pagein failed; blkno 524312,size 4096, error 6

and so on.

What can we do?

It's already filled up /var/log so that /var is "109%" full! Can I stop the logging somehow?

We are currently replacing a bad drive in one of the RAIDZs...

> zpool status
  pool: raid-5x3
 state: ONLINE
 scrub: scrub completed after 15h52m with 0 errors on Sun Mar 30 13:52:46 2014
config:

    NAME                                            STATE     READ WRITE CKSUM
    raid-5x3                                        ONLINE       0     0     0
      raidz1                                        ONLINE       0     0     0
        ada5p2                                      ONLINE       0     0     0
        gptid/a767b8ef-1c95-11e2-af4c-f46d049aaeca  ONLINE       0     0     0
        ada8p2                                      ONLINE       0     0     0
        ada10p2                                     ONLINE       0     0     0
        ada7p2                                      ONLINE       0     0     0

errors: No known data errors

  pool: raid2
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
    continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h57m, 4.48% done, 20h24m to go
config:

    NAME                                              STATE     READ WRITE CKSUM
    raid2                                             DEGRADED     0     0     0
      raidz1                                          DEGRADED     0     0     0
        gptid/5f3c0517-3ff2-11e2-9437-f46d049aaeca    ONLINE       0     0     0
        replacing                                     DEGRADED     0     0     0
          gptid/5fe33556-3ff2-11e2-9437-f46d049aaeca  UNAVAIL      0     0     0  cannot open
          ada0                                        ONLINE       0     0     0  113G resilvered
        gptid/60570005-3ff2-11e2-9437-f46d049aaeca    ONLINE       0     0     0
        gptid/60ebeaa5-3ff2-11e2-9437-f46d049aaeca    ONLINE       0     0     0
        gptid/61925b86-3ff2-11e2-9437-f46d049aaeca    ONLINE       0     0     0

errors: No known data errors
Dan

1 Answer


It appears that the bad drive was not only part of a RAIDZ but also held an active swap partition, and that pages had actually been swapped out to it.
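The log lines in the question are consistent with this reading. As a rough sketch (not anything FreeNAS ships), the Python snippet below parses two of the lines quoted in the question and summarizes which swap block and which process are involved. The function name `summarize` and the small errno table are my own illustrative choices; the only fact relied on is that errno 6 on FreeBSD is ENXIO ("Device not configured"), which is what you would expect when the device backing the swap has gone away.

```python
import re
from collections import Counter

# Two lines copied verbatim from the log excerpt in the question.
LOG = """\
Apr  5 15:13:49 Wheelhouse NAS kernel: swap_pager: I/O error - pagein failed; blkno 524312,size 4096, error 6
Apr  5 15:13:49 Wheelhouse NAS kernel: vm_fault: pager read error, pid 1 (init)
"""

# Patterns for the two kinds of kernel messages seen in the question.
SWAP_ERR = re.compile(
    r"swap_pager: I/O error - pagein failed; blkno (\d+),\s*size (\d+), error (\d+)"
)
VM_FAULT = re.compile(r"vm_fault: pager read error, pid (\d+) \(([^)]+)\)")

# Illustrative table only: errno 6 on FreeBSD is ENXIO.
ERRNO_NAMES = {6: "ENXIO (Device not configured)"}


def summarize(log_text):
    """Count failed swap page-ins per (block, errno) and faults per process."""
    errors = Counter()
    procs = Counter()
    for line in log_text.splitlines():
        m = SWAP_ERR.search(line)
        if m:
            errors[(int(m.group(1)), int(m.group(3)))] += 1
            continue
        m = VM_FAULT.search(line)
        if m:
            procs[(int(m.group(1)), m.group(2))] += 1
    return errors, procs


errors, procs = summarize(LOG)
for (blkno, err), count in errors.items():
    name = ERRNO_NAMES.get(err, "unknown errno")
    print(f"swap block {blkno}: {count} failed page-in(s), error {err} = {name}")
for (pid, name), count in procs.items():
    print(f"pid {pid} ({name}): {count} pager read error(s)")
```

In the excerpt every error refers to the same swap block and to pid 1 (init), which suggests a single process repeatedly trying to page in data from a device that is no longer there.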

By default, FreeNAS creates a swap partition on each drive it adds. This can be a reliability concern if data actually ends up swapped out there, because the swap has no redundancy. See https://bugs.freenas.org/issues/208 for some discussion of this.

It seems to me that you may want to reboot after this to get back into a known good state, since it's not clear what data the lost swapped-out pages held.

Håkan Lindqvist
  • I see ... so maybe that's why the GUI process was down. I had to restart it manually from shell. And we were getting errors related to CNID stuff for our AFP shares. AND I was unable to reboot the server via shell. Sounds like some of these processes were on/using the swap when the drive failed. Am I reading the situation somewhat correctly? My hope is that after a proper reboot we will be back to normal. I should wait for resilver to finish, then reboot, right? – Dan Apr 05 '14 at 19:53
  • Oh, cr@p ... the server just crashed... – Dan Apr 05 '14 at 20:03
  • Okay, as for the resilvering process, that should actually be resumable AFAIK. I guess the worry here is that something could have gone more wrong than just crashing (like it writing something bad to disk for whatever reason). – Håkan Lindqvist Apr 05 '14 at 20:06