MongoDB secondary crashes on initial sync because of too many journal files on RAID 10

Question

My secondary DB server went down, so I'm booting up a replacement secondary and trying to perform the initial sync. I've been following the tutorials and advice out there to use RAID 10 on Amazon EBS.

So I used 4x4GB EBS volumes in a RAID 10 with the following setup (which MongoDB suggested at the time):

sudo lvcreate -l 90%vg -n data vg0
sudo lvcreate -l 5%vg -n log vg0
sudo lvcreate -l 5%vg -n journal vg0

Since my primary's version is getting old (v3.2), I'm also trying to upgrade to 3.4 at the same time, so I booted the new secondary on 3.4 (in case this is relevant to the problem).

The problem is that during the initial sync, MongoDB creates too many journal files in /journal. A total of 4x100MB journal files are allocated:

ec2-user@secondary$ ll /journal/
total 369105
drwx------ 2 root   root       12288 Apr  3 14:47 lost+found
-rw-r--r-- 1 mongod mongod 104644096 Apr  3 19:00 WiredTigerLog.0000000001
-rw-r--r-- 1 mongod mongod 104685568 Apr  3 19:00 WiredTigerLog.0000000002
-rw-r--r-- 1 mongod mongod 104857600 Apr  3 19:00 WiredTigerLog.0000000003
-rw-r--r-- 1 mongod mongod 104857600 Apr  3 19:00 WiredTigerLog.0000000004
-rw-r--r-- 1 mongod mongod         0 Apr  3 19:00 WiredTigerTmplog.0000000005

This exceeds the disk capacity allocated for journaling and causes a brutal crash during the initial sync:

2018-04-03T19:00:18.821+0000 E STORAGE  [thread2] WiredTiger error (28) [1522782018:821142][6176:0x7efc0cd3d700], log-server: /data/journal/WiredTigerTmplog.0000000005: handle-write: pwrite: failed to write 128 bytes at offset 0: No space left on device
2018-04-03T19:00:18.821+0000 E STORAGE  [thread2] WiredTiger error (28) [1522782018:821213][6176:0x7efc0cd3d700], log-server: journal/WiredTigerTmplog.0000000005: fatal log failure: No space left on device
2018-04-03T19:00:18.821+0000 E STORAGE  [thread2] WiredTiger error (-31804) [1522782018:821228][6176:0x7efc0cd3d700], log-server: the process must exit and restart: WT_PANIC: WiredTiger library panic
2018-04-03T19:00:18.821+0000 I -        [InitialSyncInserters-my_job_glasses_production.ahoy_events0] Fatal Assertion 28559 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 64
2018-04-03T19:00:18.821+0000 I -        [InitialSyncInserters-my_job_glasses_production.ahoy_events0]

***aborting after fassert() failure


2018-04-03T19:00:18.821+0000 I -        [thread2] Fatal Assertion 28558 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 365
2018-04-03T19:00:18.821+0000 I -        [thread2]

***aborting after fassert() failure

I'm not really sure why this happens. On my primary I only have 2 journal files of 100MB each, so I assumed everything would be okay:

ec2-user@primary$ ll /data/journal/ -h
total 205M
-rw-r--r-- 1 mongod mongod 4.1M Apr  3 18:49 WiredTigerLog.0000000059
-rw-r--r-- 1 mongod mongod 100M Apr  3 16:43 WiredTigerPreplog.0000000001
-rw-r--r-- 1 mongod mongod 100M Apr  3 16:43 WiredTigerPreplog.0000000002

Did I miss something, or is something wrong? Here is my mongod.conf:

systemLog:
  destination: file
  logAppend: true
  path: /log/mongod.log
  logRotate: reopen

storage:
  dbPath: /data
  journal:
    enabled: true

processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongod.pid

net:
  port: 27017
  #bindIp added accordingly

security:
  authorization: enabled
  keyFile: /xxx.key

replication:
  replSetName: XXX

EDIT: It seems that during the initial sync, MongoDB creates up to a dozen files of 100MB each before going back to 4x100MB files. Where is this documented? Is there a way to put a limit on this?

What are your specific versions of MongoDB server (3.2.x on the source and 3.4.y on the member you are trying to initial sync)? It is expected that the MongoDB server will shutdown if there is no free space to write journal files. Your 5% allocation with 8GB of RAID10 only allows for ~400MB of journal data which is clearly insufficient to keep up with data written during initial sync. You could increase the size of your journal volume or keep the journal on the data volume (as per the default config). Are you using any different filesystem options in your LVM volumes? — Stennie, Apr 04 '18 at 20:23
Hmm I found little information regarding what the initial sync actually does, https://docs.mongodb.com/manual/core/replica-set-sync/ does not say a lot apart from temp storage in the local database... So apparently it's filling the database through the journal files first ? And there's no way to limit the journal size so the DB can fill up without blowing up the space allocated to journal files ? — Cyril Duchon-Doris, Apr 04 '18 at 22:25
All writes land in the journal before being periodically sync'd to the data files. The normal goal is to complete initial sync as quickly as possible so a new replica set member is ready to resume normal operation. In this instance your journal files are accumulating beyond the storage limit that you planned for. The storage limitation may be more apparent due to the throughput requirements for initial sync, but you could encounter the same issue with significant workload. Is there a reason you want to limit your journal directory size to 400MB? — Stennie, Apr 09 '18 at 02:43
As mentioned in an earlier comment, straightforward workarounds would be increasing the journal volume size or keeping the journal on the data volume. I'm not aware of a configuration option to limit total journal size, but you could influence how much data accumulates by temporarily decreasing the [`syncDelay`](https://docs.mongodb.com/manual/reference/parameters/#param.syncdelay) (which defaults to 60 seconds) during your initial sync. — Stennie, Apr 09 '18 at 02:44
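As a rough check of the "~400MB" figure mentioned in the comments, here is a minimal sketch of the arithmetic. It assumes sizes in MiB, that the volume group spans the whole usable RAID 10 capacity, and that filesystem and LVM overhead can be ignored; none of these assumptions come from the original post, and the function names are purely illustrative.

```python
# Rough capacity check for a dedicated journal volume.
# Assumptions (not from the original post): sizes are in MiB, the volume group
# covers the full usable RAID 10 capacity, and filesystem overhead is ignored.

def raid10_usable_mib(disk_count: int, disk_size_mib: int) -> int:
    """RAID 10 mirrors pairs of disks, so half of the raw capacity is usable."""
    return disk_count * disk_size_mib // 2


def journal_files_that_fit(volume_mib: float, file_size_mib: int = 100) -> int:
    """Number of whole journal files of the given size that fit on the volume."""
    return int(volume_mib // file_size_mib)


usable = raid10_usable_mib(disk_count=4, disk_size_mib=4 * 1024)
journal_volume = usable * 5 / 100  # the 5% "journal" logical volume

print(f"usable RAID 10 capacity: {usable} MiB")
print(f"journal volume: {journal_volume:.1f} MiB")
print(f"100MB journal files that fit: {journal_files_that_fit(journal_volume)}")
```

Under these assumptions the journal volume holds only four 100MB files, which is consistent with the failure occurring when the fifth file (WiredTigerTmplog.0000000005) was written.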
