We have a Node.js instance inside an AWS VPC that makes requests to a mongodb instance running in another availability zone.

The node instance is hit with a particular request a lot. That request is used to pull a lot of information from the mongodb instance. After the first query that entry is cached for a period of time.

Yesterday something happened with the amount of data that it was getting back. With code like:

 console.log('before retrieve');
 Model.find({}).exec(function (err, docs) {
   console.log('after retrieve');
 });

It would hit 'before retrieve' 10 times, and then just stall, DoS-ing itself. I removed some of the data that it was pulling as a temporary fix.

On the mongoDB side I would sometimes see:

 SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR]

How can I avoid this happening again?


1 Answer


The problem you are describing has much less to do with mongoose, mongodb and node and is more a manifestation of a problem commonly referred to as a "cache stampede" or "dog pile".

As the name implies, a cache stampede happens when a whole bunch of things try to refresh the cache at once. In your case, this happens when the cache expires or during the initial load of data into the cache. Suddenly, lots of requests come in and put a lot of read load on your database, which 1) makes the cache slow to refresh and 2) causes even more requests to stack up waiting for the cache. This basically leads to the behavior you saw, where things just stall.

This Wikipedia page describes the problem fairly clearly, along with how one could solve it using a separate process or locks. Since node doesn't have locks or threads, that's probably not a solution. Also, while a separate process would work, it is a lot more complex.

One technique that I have used in the past is to use two expiring cache keys: one key is used only to indicate when the cache should be refreshed, while the other holds the actual data.

To illustrate, let's assume I have an object foo that I want to cache and have expire every hour. I can create another key foo_refresh that I expire 1 minute before the foo key.

When the foo_refresh key expires, the first worker/request to notice immediately re-creates the foo_refresh key, then ignores the cache and pulls the data from the database, refreshing the foo key when finished (also resetting its expiry time). Every other request still sees both keys and keeps serving the cached foo. Using a mechanism like this, we obtain a sort of "lock" on refreshing the cache, meaning no more than one worker will ever be doing the expensive read.

Assuming that the cache refresh takes less than 1 minute, the foo object never expires. Instead, it gets refreshed via the expiration of the foo_refresh key.
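To make the two-key idea concrete, here is a minimal in-memory sketch. Everything in it is an assumption for illustration: `TtlCache`, `getFoo` and `loadFromDb` are hypothetical names, and a real deployment with several node processes would need a shared store (memcached, Redis, etc.) with an atomic "set if absent" operation in place of the plain `Map`. It also does not protect the very first (cold) load, when there is no cached copy to serve yet.

```javascript
// Sketch of the two-key refresh scheme: 'foo' holds the data,
// 'foo_refresh' only signals when a refresh is due.
// All names here are hypothetical, not part of mongoose or mongodb.

class TtlCache {
  constructor(now = () => Date.now()) {
    this.now = now;           // injectable clock, handy for testing
    this.entries = new Map(); // key -> { value, expiresAt }
  }
  set(key, value, ttlMs) {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }
  get(key) {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= this.now()) return undefined;
    return entry.value;
  }
  delete(key) {
    this.entries.delete(key);
  }
}

const DATA_TTL_MS = 60 * 60 * 1000;             // foo lives for one hour
const REFRESH_TTL_MS = DATA_TTL_MS - 60 * 1000; // foo_refresh expires 1 minute earlier

// loadFromDb is any function returning a promise for the expensive query result.
async function getFoo(cache, loadFromDb) {
  const cached = cache.get('foo');

  if (cache.get('foo_refresh') === undefined) {
    // Re-arm the marker synchronously, before any await, so that concurrent
    // requests in this process see it and keep serving the cached copy.
    cache.set('foo_refresh', true, REFRESH_TTL_MS);
    try {
      const fresh = await loadFromDb();
      cache.set('foo', fresh, DATA_TTL_MS);
      return fresh;
    } catch (err) {
      // Refresh failed: drop the marker so a later request can retry.
      cache.delete('foo_refresh');
      throw err;
    }
  }

  if (cached !== undefined) return cached;

  // Marker is set but there is no data (cold start, or the refresh outlived
  // foo). This sketch simply falls back to the database.
  return loadFromDb();
}
```

With mongoose, `loadFromDb` could be something like `() => Model.find({}).exec()`, assuming a mongoose version whose `exec()` returns a promise when called without a callback.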

Hopefully that helps!

  • I get it. However, it happened again the second day due to latency. I need a mongoose/mongodb way around it. – MB. Apr 13 '14 at 00:51
  • Assuming pulling the information from the database is the bottleneck, there really isn't a way for mongoose or mongodb to take care of it. They have no mechanism for controlling access to the database outside its context. One thing that might help is to use `lean()` on the mongoose query to have mongoose return plain objects instead of the rich document objects that take a while to generate. – addisonj Apr 13 '14 at 02:11