e2fsck on a low-memory machine: can I get more out of scratch_files or swap?

Question

I am running CentOS 6 on a 32-bit machine with 1 GB of RAM.

I have a 1TB external HDD that I am trying to run e2fsck on. It runs for about an hour and a half and then fails with:

Error storing directory block information (inode=45206324, block=0, num=39291127): Memory allocation failed

After finding this question, I created /etc/e2fsck.conf with the indicated contents and, based on the answers to this question, created a 20GB swap file (I only have one disk, so splitting swap across multiple disks is not possible). There was already 2GB of swap space.

At the time of failure, it had used about 325MB in its scratch_files directory and swap usage was at 550MB. The new 20GB swapfile had not been touched. It hung on for another 45 minutes at about 2% CPU usage before the program died with e2fsck: aborted and swap went back to about 65 MB.

Using iostat -dx, I found that the utilization of the main disk was 4.3% and of the external drive 7.2% while e2fsck was still running after the failure. I wasn't running iostat while CPU was at 100%, so I don't know what utilization looked like then. After the program finally aborted, those disk utilization numbers didn't change.

So my question is: why did e2fsck fail without either using up swap space or filling up the disk with scratch files? Is there anything else I can try to fix this disk using this machine? It's 3000 miles away...

Edit: Here are the memory-related lines of top before the memory failure:

Mem:   1029080k total,  1010780k used,    18300k free,   309780k buffers
Swap: 23019504k total,    71796k used, 22947708k free,   433072k cached

And after, while still running:

Mem:  1004.961M total,  991.344M used,   13.617M free, 1728.000k buffers
Swap:   21.953G total,  541.621M used,   21.424G free,   27.949M cached

Edit 2

I ran e2fsck again using strace. Interestingly, it was able to run for much longer, using up about 220 minutes of CPU time while at about 70% usage (strace and tail took up the other 30%). The program made it to 1832 MB of virtual memory, 811 MB of resident memory, and 105 MB of shared memory. Here are the strace lines from the failed memory allocation:

22648 mremap(0x32ebc000, 1675546624, 2513317888, MREMAP_MAYMOVE) = -1 ENOMEM (Cannot allocate memory)
22648 mmap2(NULL, 2513317888, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
22648 mmap2(NULL, 2513453056, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
22648 mmap2(NULL, 2097152, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0xb2ca5000
22648 munmap(0xb2ca5000, 372736)        = 0
22648 munmap(0xb2e00000, 675840)        = 0
22648 mprotect(0xb2d00000, 135168, PROT_READ|PROT_WRITE) = 0
22648 mmap2(NULL, 2513317888, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)

There doesn't seem to have been anything odd going on with reads and writes:

22648 _llseek(3, 755934822400, [755934822400], SEEK_SET) = 0
22648 write(3, "/\276\0\0|}\33xA\236\0d/\243\0\0A\236\0\\\0\0\0\0\177\303\363x8\200\0\4"..., 4096) = 4096
22648 lseek(3, 260845568, SEEK_SET)     = 260845568
22648 read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
22648 _llseek(3, 755934793728, [755934793728], SEEK_SET) = 0
22648 write(3, "\376\355\372\317\1\0\0\22\0\0\0\0\0\0\0\6\0\0\0\23\0\0\0\0\0\21\0\205\0\0\0\0"..., 4096) = 4096
22648 write(1, "I", 1)                  = 1
22648 write(1, "node", 4)               = 4
22648 write(1, " ", 1)                  = 1
22648 write(1, "46217250", 8)           = 8
22648 write(1, " is too big.  ", 14)    = 14
22648 write(1, "Truncate? yes\n\n", 15) = 15

Another helpful option is to run e2fsck through strace: `strace -f -o e2fsck.strace e2fsck...` So we can see which syscall really fails. The dump file can get big though. As only the end of it is relevant you may use a FIFO instead and start in another shell: `tail -f fifo -n 500 > e2fsck.strace`. But check first that this works as expected. — Hauke Laging, Feb 26 '13 at 23:02
To eliminate bad hardware I'd try a `dd if=/dev/sdb of=/dev/null` and see if it succeeds. During the `dd` watch the logs for interesting messages. — Mark Wagner, Feb 27 '13 at 18:18

score 1 · Answer 1 · answered May 28 '20 at 01:15

We can tell from the strace output that e2fsck is trying to allocate about 2.5 GB in one piece for whatever table it is building. Even though you have enough (virtual) RAM available, you don't have the address space for that in a 32-bit process.

Well, you do, but you're asking for 5/6ths of it at once. Odds are that other mappings and allocations, including the roughly 1.6 GB region that the mremap call is trying to grow, already occupy too much of the address space, so the kernel can't find a contiguous 2.5 GB range to satisfy that mmap2.
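
To put rough numbers on that, here is a small Python sketch using the sizes from the strace output in the question. The 3 GiB user address space is an assumption (the usual 3G/1G split on 32-bit x86 Linux), the 1832 MB figure is the virtual size reported in the question, and the function name is made up for illustration. It ignores fragmentation, so it is an optimistic bound.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# Assumption: 3 GiB of user address space per process (3G/1G split).
USER_ADDRESS_SPACE = 3 * GIB

old_size = 1675546624    # size of the existing mapping, from the mremap() call
new_size = 2513317888    # size requested by mremap() and the first mmap2()
vm_in_use = 1832 * MIB   # virtual memory of e2fsck reported in the question


def fits(request, in_use, limit=USER_ADDRESS_SPACE):
    """True if `request` bytes could fit in the unused address space.

    Fragmentation is ignored, so a True result is only a best case.
    """
    return request <= limit - in_use


print(f"existing mapping: {old_size / GIB:.2f} GiB")
print(f"requested size  : {new_size / GIB:.2f} GiB")
print(f"unused at best  : {(USER_ADDRESS_SPACE - vm_in_use) / GIB:.2f} GiB")
print(f"request fits    : {fits(new_size, vm_in_use)}")
```

Under these assumptions the request cannot fit no matter how much swap is configured, which would explain why the new swap file was never touched.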

My advice: boot a 64-bit Linux from USB and make use of that same 20 GB swap file (or have at least 4 GB of swap handy). You're already aware this might take ages to complete.

On a side note: I downloaded the e2fsprogs source code to check whether e2fsck could be demanding real RAM by calling mlock() or mlockall(), but grepping recursively for "mlock" yielded no results, so that explanation seems unlikely.

Last but not least: you can trace all memory-related calls with strace -e memory e2fsck ...
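
Once you have a strace log, something like the following Python sketch could pull out just the allocations that failed with ENOMEM. It assumes the log uses the line format shown in the question (an optional PID prefix, then the call), and the function name and log file name are hypothetical.

```python
import re
import sys

# Matches e.g. "22648 mmap2(NULL, 2513317888, ...) = -1 ENOMEM (...)".
# The leading PID is optional (it appears when strace runs with -f -o).
FAILED = re.compile(r"^(?:\d+\s+)?(\w+)\((.*)\)\s+=\s+-1 ENOMEM\b")

# Position of the requested size in each call's argument list.
SIZE_ARG = {"mmap": 1, "mmap2": 1, "mremap": 2}


def failed_allocations(lines):
    """Return (syscall, requested_bytes) for each allocation that hit ENOMEM."""
    found = []
    for line in lines:
        match = FAILED.match(line)
        if not match:
            continue
        call, args = match.group(1), match.group(2).split(", ")
        index = SIZE_ARG.get(call)
        if index is not None and index < len(args) and args[index].isdigit():
            found.append((call, int(args[index])))
    return found


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (file name is an example): python enomem.py e2fsck.strace
    with open(sys.argv[1]) as log:
        for call, size in failed_allocations(log):
            print(f"{call}: {size} bytes ({size / 1024 ** 3:.2f} GiB)")
```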

score 1 · Answer 2 · edited Feb 27 '13 at 16:25

I am not an e2fsck expert. I assume that e2fsck does care whether the memory it sees is real RAM or swap. Pages can be locked into memory. I assume that the amount of locked memory is available via /proc or tools such as ps or top, so you could monitor this value.

Obviously the only good solution would be to connect the disk to better hardware, which is difficult for you. But it may help to make this connection over the network rather than physically. If there is another Linux system with more RAM and a suitable LAN connection to yours, you could export the device to be checked as a network block device. That is probably still faster than my next idea.

If the problem is that e2fsck requires "real" RAM then you could create a virtual machine with a tiny Linux installation (nothing more needed than e2fsck...). This VM could be configured with 2, 4, 16 GiB of "RAM". The device to be checked can be exported as a block device (appearing as a disk in the VM). It probably makes sense to use the scratch_files feature anyway. This would obviously be a performance nightmare, but I guess you have already accepted that any possible solution is in this category.

Edit 1

You can see the amount of virtual memory a process has locked into RAM by:

grep VmLck /proc/$PID/status
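
If you would rather watch several values together while e2fsck runs, a small Python sketch along these lines could poll the same file. It assumes the VmSize, VmRSS and VmLck fields are present in /proc/$PID/status (values are reported in kB); the function names and the 10-second interval are arbitrary choices for illustration.

```python
import sys
import time


def parse_vm_fields(status_text, fields=("VmSize", "VmRSS", "VmLck")):
    """Extract the selected fields (in kB) from the text of /proc/PID/status."""
    values = {}
    for line in status_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in fields:
            values[key] = int(rest.split()[0])  # e.g. "  1875968 kB" -> 1875968
    return values


def read_vm_fields(pid):
    """Read and parse /proc/<pid>/status for the given process ID."""
    with open(f"/proc/{pid}/status") as status:
        return parse_vm_fields(status.read())


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python vmwatch.py <pid of e2fsck>
    while True:
        try:
            print(time.strftime("%H:%M:%S"), read_vm_fields(sys.argv[1]))
        except FileNotFoundError:
            break  # the process has exited
        time.sleep(10)
```

A VmLck value that stays at 0 kB while VmSize keeps growing would point away from locked pages and toward address space as the limit.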

Edit 2

Here's everything from dmesg related to device sdb. The errors for EXT4-fs are the reason I was running e2fsck in the first place.

sd 0:0:0:0: [sdb] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
sd 0:0:0:0: [sdb] Write Protect is off
sd 0:0:0:0: [sdb] Mode Sense: 28 00 00 00
sd 0:0:0:0: [sdb] Assuming drive cache: write through
sd 0:0:0:0: [sdb] Assuming drive cache: write through
 sdb: sdb1
sd 0:0:0:0: [sdb] Assuming drive cache: write through
sd 0:0:0:0: [sdb] Attached SCSI disk
EXT4-fs (sdb1): barriers disabled
EXT4-fs (sdb1): warning: mounting fs with errors, running e2fsck is recommended
EXT4-fs (sdb1): recovery complete
EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: 
SELinux: initialized (dev sdb1, type ext4), uses xattr
EXT4-fs error (device sdb1): ext4_lookup: deleted inode referenced: 46006273
EXT4-fs (sdb1): ext4_check_descriptors: Checksum for group 0 failed (2332!=0)
EXT4-fs (sdb1): group descriptors corrupted!
EXT4-fs (sdb1): ext4_check_descriptors: Checksum for group 0 failed (34754!=0)
EXT4-fs (sdb1): group descriptors corrupted!
EXT4-fs (sdb1): ext4_check_descriptors: Checksum for group 0 failed (34754!=0)
EXT4-fs (sdb1): group descriptors corrupted!

Not sure what you mean by memory being locked. I've updated the question with some information from `top`, if that gets at what you're saying. — WinnieNicklaus, Feb 26 '13 at 20:27
Locking pages into memory means that the kernel is not allowed to swap these pages out. So if you have 1 GiB of RAM and e2fsck has 3 GiB of virtual address space (no problem thanks to swap) but tries to lock about 1 GiB then the process dies because the kernel cannot fulfill the request. — Hauke Laging, Feb 26 '13 at 20:40
Google just told me that you can see the amount of locked memory for a process by this: `grep VmLck /proc/$PID/status` — Hauke Laging, Feb 26 '13 at 20:52
Thanks. I'll run it again and monitor that value to see if it does anything interesting. — WinnieNicklaus, Feb 26 '13 at 21:15
Well, as it turns out, no virtual memory was locked into RAM, the memory usage of e2fsck was just 180 MB, and swap and scratch_files were used for just a couple hundred MB more. I can't understand why the process would fail under these conditions, but perhaps I'm at a dead end with this disk for now. — WinnieNicklaus, Feb 27 '13 at 15:55
Anything useful from strace? Any hardware related errors in the kernel log (dmesg)? — Hauke Laging, Feb 27 '13 at 16:16
I've attached output from `dmesg`. Not sure how to use `strace`. I tried it out with `strace ls` and got dozens of lines of output, so I'm wary of running it on `e2fsck` unless I know what I'm looking for. Thank you for continuing to think about this. — WinnieNicklaus, Feb 27 '13 at 16:23
The kernel errors from the mount attempt are not helpful. I meant: does e2fsck cause messages about hardware errors? The messages you posted are not related to hardware. I have shown you how to use strace in a comment on your question. You are looking for failing read() or write() (or similar) syscalls on the block device. — Hauke Laging, Feb 27 '13 at 16:29
Oh, sorry, I didn't see that before. I don't see anything interesting in `dmesg`. I'll try strace, though, and see what happens. If nothing else, I'm learning a lot! — WinnieNicklaus, Feb 27 '13 at 16:32
