We want to run MapReduces on our live Mongo database, mainly so that we can extract metrics. However, we've had some bad outages caused by these MRs bogging down the Mongo server (in particular 100% disk IO). We think it's due to missing indexes.

Is it possible to run batch processes like these at low priority, so that they don't make the database inaccessible to our live app?


2 Answers


There is no way to "nice" the MR jobs you are running. They will yield periodically, but in the end (especially if you have poorly chosen indexes) you are going to impact the primary by evicting its working set from RAM, causing disk IO contention, and so on. Hence I would definitely recommend optimizing your indexes to avoid that as much as possible.

To ease the burden, you can run inline (in-memory) MR jobs on secondaries instead of your primary. Any job that does not need to write its output to a collection can be run this way, since a secondary cannot accept writes.
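As a rough illustration of the inline approach, here is a minimal Python sketch that builds a `mapReduce` command with `out: {inline: 1}` and sends it with a secondary read preference. It assumes the pymongo driver; the collection name, the `type` field, and the map/reduce functions are hypothetical placeholders, not anything from the question.

```python
from typing import Any, Dict, Optional

# Hypothetical map/reduce pair: counts documents per "type" field.
MAP_JS = "function () { emit(this.type, 1); }"
REDUCE_JS = "function (key, values) { return Array.sum(values); }"


def build_inline_mr_command(collection: str, map_js: str, reduce_js: str,
                            query: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a mapReduce command document that returns its results inline."""
    # The command name must be the first key; dicts keep insertion order.
    command: Dict[str, Any] = {
        "mapReduce": collection,
        "map": map_js,
        "reduce": reduce_js,
        "out": {"inline": 1},  # no output collection, so no write is needed
    }
    if query is not None:
        # An indexed filter limits how many documents the job has to read.
        command["query"] = query
    return command


def run_on_secondary(db: Any, command: Dict[str, Any]) -> list:
    """Run the command with a secondary read preference (requires pymongo)."""
    from pymongo import ReadPreference  # imported lazily; only needed here
    reply = db.command(command, read_preference=ReadPreference.SECONDARY)
    return reply["results"]
```

Something like `run_on_secondary(client["mydb"], build_inline_mr_command("events", MAP_JS, REDUCE_JS))` would then return the reduced values directly. Keep in mind that inline results come back in a single reply document, so they have to fit within the BSON document size limit.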

If that is not an option, another approach I have seen is to make the data available to an "analytics" cluster dedicated to running the MR jobs, leaving the production DBs untouched. There are several ways to keep the second cluster up to date, from filesystem snapshots and other batch-type techniques to replicating with mongooplog or with a custom application that uses tailable cursors.

The other approach you could take is to shard in order to increase the capacity of your primaries. If you go down that road, make sure you are running at least 2.2 (2.2.2 as of this writing); support for sharded MR improved greatly with the 2.2 release.

Adam C

As Adam stated, there is no way to run jobs at a lower priority in MongoDB. We had the same issue, with expensive jobs making other queries extremely slow. We solved it by copying the data that the MR jobs needed to a dedicated crunching database on a different host.

Besides the mongooplog and tailable cursor methods for copying data between databases that Adam mentioned, you can also do this directly in JavaScript, copying only the data you need (possibly incrementally). See this blog post for more info: Quality of Service in MongoDB. You may also want to use mongodump and mongorestore.
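To make the incremental copy idea a bit more concrete, here is a minimal sketch in Python rather than shell JavaScript. It assumes pymongo-style collection objects and a hypothetical `updated_at` field that is set on every write; neither the field name nor the function comes from the blog post.

```python
from typing import Any, Optional


def copy_incrementally(source: Any, target: Any, last_seen: Optional[Any] = None,
                       field: str = "updated_at") -> Optional[Any]:
    """Copy documents changed after last_seen; return the new high-water mark."""
    # With no previous mark, copy everything; otherwise only newer documents.
    query = {} if last_seen is None else {field: {"$gt": last_seen}}
    newest = last_seen
    # Sort ascending so the high-water mark only ever moves forward.
    for doc in source.find(query).sort(field, 1):
        # Upsert by _id so that re-running the copy is idempotent.
        target.replace_one({"_id": doc["_id"]}, doc, upsert=True)
        newest = doc[field]
    return newest
```

A scheduled job could store the returned value and pass it back as `last_seen` on the next run. An index on the chosen field keeps the source query cheap. This sketch does not propagate deletions, and documents written with exactly the same timestamp as the stored mark would be skipped.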