
In our setup, we currently run a 3-shard sharded cluster, each shard being a 3-member replica set. Our write volume is about to go up significantly as we implement a new feature, and we know the extra data will be necessary. Our writes are essentially all upserts (which will usually turn out to be updates) and updates that increment a particular field by 1.
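
To make the write pattern concrete, here is a minimal sketch in pymongo (the database, collection, field, and id names are placeholders, not our real schema):

```python
# Minimal sketch of the write pattern above (pymongo). The database,
# collection, field, and id values here are placeholders.
from pymongo import MongoClient

counters = MongoClient("mongodb://localhost:27017").appdb.counters

# Upsert: create the document if it doesn't exist, otherwise increment.
counters.update_one(
    {"_id": "doc-123"},      # lookup / shard-key value
    {"$inc": {"hits": 1}},   # increment the counter field by 1
    upsert=True,
)
```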

Our updates always increment a field by 1, and the way our data is distributed, not all documents are treated equally: some get their fields incremented far more often than others. An alternative solution I thought could be effective is to introduce some kind of middle layer, such as a few Redis databases (or some smaller mongods), that receives the updates first; after about 5 minutes (or via some queueing system), a pool of workers would consume the accumulated data and apply the documents to the live cluster. This would let update-heavy documents batch up their increments and could save the main cluster a ton of writes (exact numbers I will post shortly in an edit).
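To sketch the buffering idea (this is illustrative only: Redis as the middle layer, with hypothetical key names, field name, and a 5-minute flush interval):

```python
# Rough sketch of the buffering idea, assuming Redis as the middle layer.
# Key names, the "hits" field, and the flush interval are all illustrative.
import time
import redis
from pymongo import MongoClient

r = redis.Redis()
counters = MongoClient().appdb.counters

def record_hit(doc_id):
    # Hot path: accumulate increments in Redis instead of writing to Mongo.
    r.hincrby("pending_increments", doc_id, 1)

def flush_to_mongo():
    # Worker: atomically move the buffer aside so increments arriving
    # mid-flush aren't lost, then apply one $inc per document. N buffered
    # hits on a hot document collapse into a single MongoDB write.
    try:
        r.rename("pending_increments", "flushing_increments")
    except redis.ResponseError:
        return  # nothing buffered yet
    for doc_id, count in r.hgetall("flushing_increments").items():
        counters.update_one(
            {"_id": doc_id.decode()},
            {"$inc": {"hits": int(count)}},
            upsert=True,
        )
    r.delete("flushing_increments")

if __name__ == "__main__":
    while True:
        time.sleep(300)  # drain the buffer roughly every 5 minutes
        flush_to_mongo()
```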

So bottom line, when is adding another shard not the right solution?

tonyl7126
  • Definitely an interesting question, but I think it might get a lot more love at [dba.se]. – MDMarra May 17 '13 at 17:15
  • Honestly, this is a capacity-planning question. Please see http://serverfault.com/questions/384686/can-you-help-me-with-my-capacity-planning and http://serverfault.com/questions/350458/how-do-you-do-load-testing-and-capacity-planning-for-databases – gWaldo May 17 '13 at 17:51
  • @gWaldo I don't think it's capacity planning. It's "when is sharding a better choice than _____" – MDMarra May 17 '13 at 18:01
  • @MDMarra, I will also post the question on Database Administrators, thanks. – tonyl7126 May 17 '13 at 19:53

1 Answer


Redis can certainly be used as a cache / write-back layer in front of MongoDB.

However, adding shards is the main way to add write capacity to your cluster once you have exhausted the options of adding memory, using faster disks, and so on.

Also, be aware of MongoDB's write-lock behavior. Mongo lets the kernel manage what is kept in RAM, so a best practice when performing an upsert is to read the document first (so that it is pulled into RAM) and then write to it. If you skip the read and the document is not in the working set, the write lock is held much longer: Mongo takes the write lock, reads the document from disk, writes to it (now in RAM), and only then releases the lock. All of this is far less intrusive if the read that brings the document into RAM happens before the write lock is taken.
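
As a rough illustration, the pre-read pattern might look like this in pymongo (collection and field names are hypothetical):

```python
# Sketch of the pre-read pattern described above. The find_one faults the
# document into RAM outside the write lock, so the lock during the $inc
# is held only briefly. Names here are illustrative.
from pymongo import MongoClient

counters = MongoClient().appdb.counters

def warm_then_increment(doc_id):
    counters.find_one({"_id": doc_id})  # pull the document into memory first
    counters.update_one({"_id": doc_id}, {"$inc": {"hits": 1}}, upsert=True)
```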

gWaldo
  • thanks for the tip on doing the read first, but I think most of the data we update is already being read elsewhere in our application, because we usually care most about the most recent data (which is also the only data ever being updated/upserted) – tonyl7126 May 17 '13 at 19:42