
I have a small MongoDB replica set (2.4.6), and about once a week the primary gets bogged down and the load average spikes. This app doesn't see a huge amount of traffic. It resides in EC2 on an m1.medium (1 CPU, 3.7 GB RAM).

  • We have MMS installed and opcounters never go higher than 25-30.
  • Page faults are tiny (0.002-0.003).
  • There is zero queueing.
  • Load average typically sits under 1, but during these spikes it averages 2-4.
  • Queries are taking 500ms-1700ms.
  • There is no replication lag.

When I tail the log, I see this:

Mon Mar 17 17:16:48.342 [conn62561] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: db_name.jobs top: { opid: 2507609, active: true, secs_running: 0, op: "query", ns: "db_name", query: { findandmodify: "jobs", query: { status: "queued", queue: "upload_hostname" }, sort: { enqueued: 1 }, new: 1, remove: 0, upsert: 0, update: { $set: { status: "dequeued", dequeued: new Date(1395076608340) } } }, client: "10.50.101.10:38766", desc: "conn62561", threadId: "0x7ff2d3772700", connectionId: 62561, locks: { ^: "w", ^db_name: "W" }, waitingForLock: false, numYields: 0, lockStats: { timeLockedMicros: {}, timeAcquiringMicros: { r: 0, w: 4 } } } 

I came across this thread: https://groups.google.com/forum/#!topic/mongodb-user/s62QnfT8Vbc. However, I'm new to Mongo.

This issue only started occurring recently, and only a limited number of users would be connecting. I'm looking for guidance on finding the cause. I also ran db.collection.find().explain(), which returned:

{
    "cursor" : "BasicCursor",
    "isMultiKey" : false,
    "n" : 505,
    "nscannedObjects" : 505,
    "nscanned" : 505,
    "nscannedObjectsAllPlans" : 505,
    "nscannedAllPlans" : 505,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 0,
    "indexBounds" : {

    },
    "server" : "hostname:27017"
}
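The "cursor" : "BasicCursor" and empty "indexBounds" above mean that query ran as a full collection scan with no index. The findAndModify in the log filters on status and queue and sorts on enqueued, so a compound index covering those fields might help. A minimal mongo shell sketch, assuming the jobs collection and the field names shown in the log line (verify against your actual schema):

```javascript
// Compound index matching the findAndModify filter ({ status, queue })
// and sort ({ enqueued: 1 }) seen in the log line above.
// ensureIndex is the 2.4-era method name (createIndex in later versions).
db.jobs.ensureIndex({ status: 1, queue: 1, enqueued: 1 })
```

After building it, re-running the query with .explain() should show a BtreeCursor instead of BasicCursor if the index is being used.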

We may just need to bump up the server size, but I'd like to figure out what is causing the weekly spike first.

  • Hello! I work for MongoHQ, so I help customers all the time with this type of question. Since it is happening intermittently, I am betting it is a backup process you are running or a CRON job that kicks off slow queries. Use this operation to quickly examine some log files: > grep '[0-9]\+ms' | less – Chris Winslett Mar 17 '14 at 17:38
  • I installed the MMS backup agent, but this was after the first occurrence. I have a suspicion that it may be from the app as we updated the monq node package but I need to conjure some evidence pointing to it. Which logs would I be looking at specifically? – nocode Mar 17 '14 at 17:59

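Expanding on the grep suggestion in the comments: the database profiler can also surface slow operations directly from the shell, without grepping log files. A sketch for the 2.4 shell, assuming the profiling overhead is acceptable on this node:

```javascript
// Record any operation slower than 100 ms into db.system.profile.
// Level 1 = profile slow ops only; the threshold (100) is an assumption.
db.setProfilingLevel(1, 100)

// Later, when a spike happens, list the slowest captured operations.
db.system.profile.find().sort({ millis: -1 }).limit(10).pretty()
```

If the spikes line up with a weekly cron or backup window, the timestamps in system.profile should make that correlation obvious.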