
I have a single-instance MongoDB deployment with a collection with 3 fields: _id, block_id, payload.

  • Payloads are always 4096-byte binaries
  • _id is an ever-incrementing unique integer

There is a secondary index on the collection:

{ "v" : 1, "key" : { "block_id" : 1, "_id" : -1 }, 
  "ns" : "testdb.testdev", "name" : "_block_id_id" }

I'm doing many queries like:

query: { query: { block_id: 868413 }, orderby: { _id: -1 } } ntoreturn:1 nscanned:1 nreturned:1 reslen:4166 163ms

There's no other query during these. When I read sequentially by block_id, it's 10 times faster than when I query with random block_id. I have low cpu usage, low storage utilization. The collection is 2-3 times bigger than the memory size.

What might be the bottleneck here?

asked by user131411
1 Answer


A couple of things to clarify here:

  1. Only slow queries (>100ms by default) are logged, so you could have millions of queries executing under that threshold that never appear in the log
  2. The way to figure out the cause of the slow operations is to look at the server stats at the time the slow ops are being logged
  3. You should re-run the queries with .explain() to make sure they are using the index you think they are

In terms of the stats, there are two basic ways to get them. First, and quickest, is mongostat and mongotop. These two utilities ship with MongoDB and can be used to figure out what your database is up to.

The other option is MMS (the MongoDB Monitoring Service) - it's free and it lets you graph all of the relevant stats over time, so you can determine what is spiking/dipping when you see slowness. I recommend installing munin-node if you go down this route (see MMS docs) because it will give you a view of IO stats as well as MongoDB stats.

You will usually be looking for one of the following:

  1. Page Faults - if this is spiking, your queries are causing paging to disk - this is an order of magnitude slower than in-memory operations and needs to be minimized.
  2. Resident Memory - closely related to page faults, this represents your working set in memory. You mention that your data set is 2-3x the size of RAM, but did you include the indexes in that estimate? (See the db.collection.stats() command for data and index sizes.)
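As a rough sanity check on the working set, here is a back-of-the-envelope sketch. All of the figures below are hypothetical placeholders - substitute the real document count, index size, and RAM from db.collection.stats() and your server:

```python
# Hypothetical figures - replace with real numbers from db.collection.stats()
ndocs = 10_000_000                  # document count (assumed)
payload = 4096                      # bytes per payload, from the question
doc_overhead = 70                   # rough per-document BSON/header overhead (assumption)
index_bytes = 600 * 1024**2         # combined index size, e.g. from stats() (assumed)
ram = 16 * 1024**3                  # server RAM (assumed)

data_bytes = ndocs * (payload + doc_overhead)
working_set = data_bytes + index_bytes

print(round(working_set / ram, 2))  # ~2.5x RAM: random reads will page-fault
```

If the ratio comes out well above 1, random block_id lookups are almost guaranteed to hit disk, which matches the symptoms in the question.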

There are plenty of other things to look at, but that is a good start given your description. Remember, if you have memory contention, the newer something is, the more likely it is to already be in memory. Since you are using a sequential ID, I would expect the older IDs (unless recently updated or touched) to show up in the slow query log more often than newer IDs - that is how the OS generally manages memory (see LRU for more).
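That recency effect can be sketched with a tiny hand-rolled LRU model - nothing MongoDB-specific, and the capacity and insert counts are made up for illustration:

```python
from collections import OrderedDict

CAPACITY = 1_000                   # how many 4 KB payloads fit in "RAM" (made up)
cache = OrderedDict()

# Ever-incrementing _id inserts, like the collection in the question
for _id in range(10_000):
    cache[_id] = b"\x00" * 4096    # stand-in for the 4096-byte payload
    if len(cache) > CAPACITY:
        cache.popitem(last=False)  # LRU: the oldest _id is evicted first

print(9_999 in cache)  # True  - recent _id, served from memory
print(42 in cache)     # False - old _id, would page-fault to disk
```

With sequential _id generation and LRU eviction, only the most recent slice of the collection stays resident, so queries for old block_id values are the ones most likely to cross the 100ms slow-query threshold.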

In terms of tackling this type of performance issue, besides adding more RAM, you should look at:

  1. Removing un-needed indexes that might take up space
  2. Using covered index queries where possible (no need to page in the data, just the index)
  3. Checking your readahead settings - a long and complex topic - see here and here for more information
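To see why readahead rewards sequential access but does little for random access - the 10x gap described in the question - here is a toy cache simulation. The block counts, cache size, and readahead window are all invented; real readahead operates in sectors at the block-device layer:

```python
import random
from collections import OrderedDict

def misses(accesses, n_blocks, cache_blocks, readahead):
    """Count LRU-cache misses when every miss also prefetches
    the next `readahead` blocks, the way OS readahead does."""
    cache = OrderedDict()
    count = 0
    for b in accesses:
        if b in cache:
            cache.move_to_end(b)   # refresh recency on a hit
            continue
        count += 1
        # A miss reads the block plus its readahead window
        for nb in range(b, min(b + readahead + 1, n_blocks)):
            cache[nb] = True
            cache.move_to_end(nb)
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict least recently used
    return count

sequential = list(range(1_000))
rng = random.Random(0)
scattered = [rng.randrange(1_000) for _ in range(1_000)]

print(misses(sequential, 1_000, 128, readahead=7))  # 125: every prefetch pays off
print(misses(scattered, 1_000, 128, readahead=7))   # ~870: prefetched neighbours are wasted
```

The sequential scan touches "disk" once per 8 blocks because the prefetched neighbours are used before they are evicted; the random workload evicts its prefetched neighbours unused, so nearly every query pays a full disk read.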
Adam C