
We run a small MongoDB replica set on three bare-metal servers (no virtualization, no Docker/Kubernetes) running Debian 11 and MongoDB 5.0.6:

machineA: 128GB RAM, 1TB disk, PRIMARY
machineB: 128GB RAM, 1TB disk, SECONDARY
machineC: 8GB RAM, 20GB disk, ARBITER

Suddenly we are experiencing outages, with errors such as "NotWritablePrimary"/"MongoNotPrimaryException" in our application log. We had assumed that our connection string would make sure that no outage occurs:

mongodb://machineA:27017,machineB:27017/?replicaSet=MyRepl&waitQueueMultiple=10&readPreference=primaryPreferred

It turned out that the PRIMARY mongod instance was killed by the Linux kernel because it was consuming too much RAM. The replica set had been running for 3 months without any problem, but then RAM usage by mongodb suddenly became massive:

[image: RAM consumption graphs]

Right after the kernel killed the mongod process, it was restarted by systemd, as it runs as a service. But right after the restart it again consumed the maximum amount of RAM until it died again.

This behaviour stopped just as suddenly this morning. We did not change anything in our application, so the question now is: what eats so much RAM in the mongod process?

As far as I know, the WiredTiger engine uses ~50% of the available RAM, but that wouldn't explain usage of the machine's entire RAM. I also have some metrics from Percona mongodb_exporter, which show that the RAM is used by mongodb and by no other process on the system:

[image: mongodb_exporter memory metrics]
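To put a number on the "~50%" figure: MongoDB documents the default WiredTiger internal cache size as the larger of 50% of (RAM - 1 GB) and 256 MB. The sketch below just does that arithmetic. It assumes "128GB" means 128 GiB and that `storage.wiredTiger.engineConfig.cacheSizeGB` was left at its default; the function name is made up for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def default_wiredtiger_cache_bytes(total_ram_bytes: int) -> int:
    """Documented default: the larger of 50% of (RAM - 1 GiB) and 256 MiB."""
    half_of_ram_minus_1g = (total_ram_bytes - GIB) // 2
    return max(half_of_ram_minus_1g, 256 * MIB)


# Assumption: machineA and machineB each have 128 GiB of RAM.
cache_bytes = default_wiredtiger_cache_bytes(128 * GIB)
print(f"default WiredTiger cache: {cache_bytes / GIB:.1f} GiB")  # 63.5 GiB
```

The cache limit only covers WiredTiger's own cache. Memory used for connections, in-memory sorts and aggregations, and other allocations is counted on top of it, so a mongod process can grow well past this value.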

Interestingly, the memory usage of the SECONDARY didn't move at all during that time:

[image: SECONDARY memory usage graph]

Does anyone have an idea or a hint about what is going on here?

mr.simonski

1 Answer


We found out that one of our app services was running wild under certain circumstances, which was a bit hard for us to spot.

When MongoDB is constantly hammered, its memory usage seems to climb higher and higher, rather than CPU usage going up as I would have expected. At a certain point the mongod process was killed by the Linux kernel.

After we fixed the issue in our application, the problem went away.
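For anyone trying to spot something similar earlier, here is a minimal sketch of how the output of the `serverStatus` command could be condensed into a few numbers worth watching. The function name and the sample values are made up for illustration; the field names (`mem.resident` in MiB, `connections.current`, and the two `wiredTiger.cache` counters) are the ones `serverStatus` reports. Treating "resident minus cache in use" as memory outside the cache is only a rough estimate.

```python
from typing import Any, Mapping

MIB = 1024 ** 2


def summarize_memory(status: Mapping[str, Any]) -> dict:
    """Pull a few memory-related numbers out of a serverStatus document."""
    cache = status["wiredTiger"]["cache"]
    resident_mib = status["mem"]["resident"]  # reported in MiB
    cache_used_mib = cache["bytes currently in the cache"] / MIB
    return {
        "connections": status["connections"]["current"],
        "resident_mib": resident_mib,
        "cache_used_mib": cache_used_mib,
        "cache_max_mib": cache["maximum bytes configured"] / MIB,
        # Rough estimate of memory held outside the WiredTiger cache.
        "outside_cache_mib": resident_mib - cache_used_mib,
    }


# Hypothetical sample shaped like a serverStatus reply (values are invented).
sample = {
    "mem": {"resident": 120000},
    "connections": {"current": 5000},
    "wiredTiger": {
        "cache": {
            "bytes currently in the cache": 61440 * MIB,
            "maximum bytes configured": 65024 * MIB,
        }
    },
}

summary = summarize_memory(sample)
print(summary)

# Against a live server (requires pymongo), something like:
#   from pymongo import MongoClient
#   status = MongoClient("mongodb://machineA:27017").admin.command("serverStatus")
#   print(summarize_memory(status))
```

If the connection count and the "outside cache" figure climb together while the cache stays near its limit, that would point at client load rather than at the storage engine.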

mr.simonski