I am running CentOS 6 on a 32-bit machine with 1 GB of RAM.
I have a 1TB external HDD that I am trying to run e2fsck on. It runs for about an hour and a half and then fails with:
Error storing directory block information (inode=45206324, block=0, num=39291127): Memory allocation failed
After finding this question, I created /etc/e2fsck.conf with the indicated contents, and based on the answers to this question, created a 20GB swap file (I only have one disk, so splitting swap across multiple disks is not possible). There was already 2GB of swap space.
At the time of failure, e2fsck had used about 325MB in its scratch_files directory and swap usage was at 550MB. The new 20GB swapfile had not been touched. The process hung on for another 45 minutes at about 2% CPU usage before it died with "e2fsck: aborted", and swap went back to about 65 MB.
Using iostat -dx, I found that utilization of the main disk was 4.3% and of the external drive 7.2% after the allocation failure, while e2fsck was still running. I was not running iostat while the CPU was at 100%, so I don't know what utilization looked like then. After the program finally aborted, those disk utilization numbers didn't change.
So my question is: why did e2fsck fail without either using up swap space or filling up the disk with scratch files? Is there anything else I can try to fix this disk using this machine? It's 3000 miles away...
Edit:
Here are the memory-related lines of top before the memory failure:
Mem: 1029080k total, 1010780k used, 18300k free, 309780k buffers
Swap: 23019504k total, 71796k used, 22947708k free, 433072k cached
And after, while still running:
Mem: 1004.961M total, 991.344M used, 13.617M free, 1728.000k buffers
Swap: 21.953G total, 541.621M used, 21.424G free, 27.949M cached
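The two top snapshots use different units (k in the first, M and G in the second), which makes them awkward to compare by eye. The snippet below is only a convenience sketch: it copies the figures from the lines above and puts them in MiB, assuming top's k, M and G are powers of 1024 (the totals above agree with that). The dictionary keys are made-up labels, not top field names.

```python
KIB_PER_MIB = 1024

# Figures copied from the "before" snapshot, in KiB.
before_kib = {
    "mem_total": 1029080,
    "buffers": 309780,
    "swap_used": 71796,
    "cached": 433072,
}

# Figures copied from the "after" snapshot, already in MiB
# (buffers was printed as 1728.000k, so it is converted here).
after_mib = {
    "mem_total": 1004.961,
    "buffers": 1728.0 / KIB_PER_MIB,
    "swap_used": 541.621,
    "cached": 27.949,
}

before_mib = {name: kib / KIB_PER_MIB for name, kib in before_kib.items()}

for name in before_mib:
    print(f"{name:10s} before={before_mib[name]:9.3f} MiB  "
          f"after={after_mib[name]:9.3f} MiB")
```

Read this way, buffers and cache dropped from several hundred MiB to a few MiB or tens of MiB, while swap use rose from roughly 70 MiB to about 540 MiB.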
Edit 2:
I ran e2fsck again under strace. Interestingly, it was able to run for much longer, using about 220 minutes of CPU time at about 70% CPU usage (strace and tail took up the other 30%). The process reached 1832 MB of virtual memory, 811 MB of resident memory, and 105 MB of shared memory. Here are the strace lines from the failed memory allocation:
22648 mremap(0x32ebc000, 1675546624, 2513317888, MREMAP_MAYMOVE) = -1 ENOMEM (Cannot allocate memory)
22648 mmap2(NULL, 2513317888, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
22648 mmap2(NULL, 2513453056, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
22648 mmap2(NULL, 2097152, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0xb2ca5000
22648 munmap(0xb2ca5000, 372736) = 0
22648 munmap(0xb2e00000, 675840) = 0
22648 mprotect(0xb2d00000, 135168, PROT_READ|PROT_WRITE) = 0
22648 mmap2(NULL, 2513317888, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
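To put the sizes in those calls into perspective, here is a rough arithmetic sketch. The two byte counts are taken from the mremap line above; the 3 GiB figure is an assumption (the common 3 GiB user / 1 GiB kernel address-space split on 32-bit x86 Linux), not something confirmed for this machine. The idea it illustrates: with MREMAP_MAYMOVE, a mapping that cannot grow in place needs a new contiguous range of the full new size while the old mapping still exists, so both sizes count against the process's virtual address space.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

old_size = 1675546624  # current mapping size in the mremap call
new_size = 2513317888  # requested size in the same call

# Assumption: 3 GiB of user address space per process (typical 32-bit
# x86 Linux split); the real limit on this machine may differ.
ASSUMED_USER_SPACE = 3 * GIB

growth = new_size / old_size
needed_if_moved = old_size + new_size

print(f"old mapping : {old_size / MIB:8.1f} MiB")
print(f"new request : {new_size / MIB:8.1f} MiB (x{growth:.2f})")
print(f"old + new   : {needed_if_moved / MIB:8.1f} MiB")
print(f"assumed cap : {ASSUMED_USER_SPACE / MIB:8.1f} MiB")
print("fits under assumed cap:", needed_if_moved <= ASSUMED_USER_SPACE)
```

Under that assumption the request cannot be satisfied regardless of how much swap is free, since the limit being hit would be address space rather than RAM plus swap. This is one possible reading of the trace, not a confirmed diagnosis.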
There doesn't seem to have been anything odd going on with reads and writes:
22648 _llseek(3, 755934822400, [755934822400], SEEK_SET) = 0
22648 write(3, "/\276\0\0|}\33xA\236\0d/\243\0\0A\236\0\\\0\0\0\0\177\303\363x8\200\0\4"..., 4096) = 4096
22648 lseek(3, 260845568, SEEK_SET) = 260845568
22648 read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
22648 _llseek(3, 755934793728, [755934793728], SEEK_SET) = 0
22648 write(3, "\376\355\372\317\1\0\0\22\0\0\0\0\0\0\0\6\0\0\0\23\0\0\0\0\0\21\0\205\0\0\0\0"..., 4096) = 4096
22648 write(1, "I", 1) = 1
22648 write(1, "node", 4) = 4
22648 write(1, " ", 1) = 1
22648 write(1, "46217250", 8) = 8
22648 write(1, " is too big. ", 14) = 14
22648 write(1, "Truncate? yes\n\n", 15) = 15