This can be a long and involved process, but let me first say this as a starting point: I (and many others I have worked with) have managed to get far closer to maximum resident memory usage. Exactly what that maximum is will vary from system to system, and a lot of variables come into play, but I would generally shoot for 60-80% of total RAM; anything higher is a bonus.
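As a quick sanity check against that target, you can compare the resident memory mongod reports against total RAM. This is a minimal sketch, assuming a local mongod on the default port and an MMAP-era build where `db.serverStatus().mem` reports values in megabytes:

```bash
# Compare mongod resident memory to total RAM to see how close you are
# to the 60-80% target. Assumes a local mongod on the default port.
resident_mb=$(mongo --quiet --eval 'print(db.serverStatus().mem.resident)')
total_mb=$(free -m | awk '/^Mem:/{print $2}')
awk -v r="$resident_mb" -v t="$total_mb" \
    'BEGIN { printf "resident: %d MB of %d MB (%.1f%%)\n", r, t, (r / t) * 100 }'
```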
The next thing to do is some reading. There has been plenty written about this topic, often from the other perspective (better memory efficiency, fitting more into RAM when it is full, etc.).
With all that out of the way, you hopefully have a decent idea of how to tune your system to get the most out of the available memory (usually, but not always, that means knocking readahead down and making sure NUMA is properly disabled), and are able to see where else memory pressure may be coming from. The next piece to understand is a little trickier, and involves how the MongoDB journal works and how that in turn interacts with how the kernel tracks the memory usage of individual processes.
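For reference, here is a rough sketch of those two tunings on Linux; `/dev/sda1` is a placeholder for whatever device actually backs your dbpath:

```bash
# Readahead is measured in 512-byte sectors; large values drag in pages
# you never asked for. Check it, then set it to something small like 32.
sudo blockdev --getra /dev/sda1
sudo blockdev --setra 32 /dev/sda1

# For NUMA: turn off zone reclaim and start mongod with its memory
# interleaved across nodes rather than pinned to one.
echo 0 | sudo tee /proc/sys/vm/zone_reclaim_mode
numactl --interleave=all mongod --config /etc/mongod.conf
```

Note that `blockdev --setra` does not persist across reboots, so it needs to go into your startup scripts to stick.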
This is covered in detail as part of a lengthy MongoDB Jira issue - SERVER-9415. What we discovered when investigating that issue is that the behavior of the journal when doing a mix of reads and writes could (not always, but it was reproducible) drastically reduce the reported resident memory for the MongoDB process. The mechanics of this have been described in detail by Kristina Chodorow here, and there are more details in the Jira issue as well.
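You can see the double mapping that underlies this for yourself: with journaling on, each data file is mapped twice (a shared view and a private view), and serverStatus reflects that. A quick illustration, again assuming a local MMAP-era mongod:

```bash
# With journaling enabled, expect mappedWithJournal to be roughly twice
# mapped; resident can swing as the private view is remapped (SERVER-9415).
mongo --quiet --eval 'var m = db.serverStatus().mem;
print("mapped (MB):            " + m.mapped);
print("mappedWithJournal (MB): " + m.mappedWithJournal);
print("resident (MB):          " + m.resident);'
```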
So, what does all that mean?
It means that the reporting and interpretation of resident memory statistics is complex, particularly on a system that is also doing writes, and especially if that system has memory pressure outside of the mongod process. In general, I recommend the following methodology:
- Read in (touch or manual pre-heating with a large query/explain) a large, known amount of data that should fit into memory (see the sketch after this list)
- Run some queries, aggregations, etc. on that data set and verify that page faulting is minimal
- If page faults are low, then the data is fitting into memory and you have a reporting problem; you can repeat the tests with larger data sets until you find your actual limit
- If page faults are high, then the data has been evicted, was never fully loaded in, etc., and you have something to investigate (readahead, memory pressure, whether NUMA is disabled, and so on)
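As promised, here is a minimal sketch of that methodology; `foo.bar` is a hypothetical database and collection, so substitute a data set of known size that should fit in RAM:

```bash
# 1. Pre-heat: pull the collection's data and indexes into memory.
mongo foo --quiet --eval 'printjson(db.runCommand({ touch: "bar", data: true, index: true }))'

# 2. Snapshot page faults, run a representative query, snapshot again.
mongo foo --quiet --eval 'print(db.serverStatus().extra_info.page_faults)'
mongo foo --quiet --eval 'print(db.bar.find().itcount())'
mongo foo --quiet --eval 'print(db.serverStatus().extra_info.page_faults)'

# 3. A small delta between the two snapshots means the data stayed
#    resident; a large delta points at eviction or incomplete loading.
```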
I generally recommend running MMS Monitoring (free) while testing, as it lets you track memory stats (including non-mapped memory) over time, page faults and more; I also recommend mongostat (for sub-one-minute resolution) to get a decent picture of what is going on.
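For example, a trivial invocation against a local instance while the tests above are running:

```bash
# One sample per second; watch the "faults" and "res" columns.
mongostat 1
```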