
This is the output from db.currentOp():

> db.currentOp()
{
    "inprog" : [
        {
            "opid" : 2153,
            "active" : false,
            "op" : "update",
            "ns" : "",
            "query" : {
                "name" : "Run_KPIS",
                "profile" : "totals"
            },
            "client" : ":34140",
            "desc" : "conn127",
            "threadId" : "0x7f1d0f03c700",
            "connectionId" : 127,
            "locks" : {
                "^cached_data" : "W"
            },
            "waitingForLock" : true,
            "numYields" : 0,
            "lockStats" : {
                "timeLockedMicros" : {

                },
                "timeAcquiringMicros" : {

                }
            }
        },
        {
            "opid" : 2154,
            "active" : false,
            "op" : "getmore",
            "ns" : "",
            "query" : {

            },
            "client" : ":34129",
            "desc" : "conn118",
            "threadId" : "0x7f1e32785700",
            "connectionId" : 118,
            "locks" : {
                "^cached_data" : "R"
            },
            "waitingForLock" : true,
            "numYields" : 0,
            "lockStats" : {
                "timeLockedMicros" : {

                },
                "timeAcquiringMicros" : {

                }
            }
        },
        {
            "opid" : 1751,
            "active" : true,
            "secs_running" : 98,
            "op" : "query",
            "ns" : "cached_data.webtraffic",
            "query" : {
                "mapreduce" : "webtraffic",
                "map" : function () {
        if (this.Pages)
            for (var i in this.Pages)
                if (i.match(/(\/blogs\/|\/news\/)/))
                    emit({
                        'page':i,
                        'profile':this.Profile
                    },this.Pages[i]);
    },
                "reduce" : function (k,vals) {
        for(var i=0,sum=0;i<vals.length;sum+=vals[i++]);
        return sum;
    },
                "out" : {
                    "inline" : 1
                },
                "query" : {
                    "$or" : [
                        {
                            "Profile" : "MEMBER"
                        },
                        {
                            "Profile" : "WEB"
                        }
                    ]
                }
            },
            "client" : ":34111",
            "desc" : "conn112",
            "threadId" : "0x7f1d1768d700",
            "connectionId" : 112,
            "locks" : {
                "^" : "r",
                "^cached_data" : "R"
            },
            "waitingForLock" : false,
            "msg" : "m/r: (1/3) emit phase M/R: (1/3) Emit Progress: 801/830 96%",
            "progress" : {
                "done" : 801,
                "total" : 830
            },
            "numYields" : 148,
            "lockStats" : {
                "timeLockedMicros" : {
                    "r" : NumberLong(183690739),
                    "w" : NumberLong(0)
                },
                "timeAcquiringMicros" : {
                    "r" : NumberLong(92296403),
                    "w" : NumberLong(0)
                }
            }
        }
    ]
}

I have indexes on all the relevant collections, yet there is still a huge delay in reading from our MongoDB when the above operations are running.

It can take ~5 minutes before the database is readable again.

Would the above map reduce function be causing this read lock? And if so how can I run a non-locking map reduce on the collection?

What is strange is that MongoDB still accepts connections, it just won't allow us to query whilst the above operations are running.

Edited to say this is Mongo version 2.4.1.

StuR

1 Answer


First of all, there is the query being run here: it uses the $or operator over two values of the same field. If that is typical, change it to the $in operator (as recommended here). That should help significantly: with $or used like that you are effectively running two queries in parallel and merging the results, whereas with $in you run a single query.
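As a sketch of the rewrite (the field name comes from the question; the sample documents are invented for illustration), the two filters select the same documents, but $in expresses it as one membership test:

```javascript
// Invented sample documents shaped like the webtraffic collection in the question.
const docs = [
  { Profile: "MEMBER", Pages: { "/blogs/a": 3 } },
  { Profile: "WEB",    Pages: { "/news/b": 1 } },
  { Profile: "ADMIN",  Pages: { "/home": 9 } },
];

// $or form: { "$or" : [ { "Profile" : "MEMBER" }, { "Profile" : "WEB" } ] }
// conceptually runs one clause per value and merges the results.
const orMatches = docs.filter(d => d.Profile === "MEMBER" || d.Profile === "WEB");

// $in form: { "Profile" : { "$in" : [ "MEMBER", "WEB" ] } }
// is a single membership test over both values, i.e. a single query.
const inMatches = docs.filter(d => ["MEMBER", "WEB"].includes(d.Profile));
```

In the shell the actual change is just replacing the $or array in the map/reduce "query" document with { "Profile" : { "$in" : [ "MEMBER", "WEB" ] } }.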

Next, since this is an inline map/reduce job, I would recommend running it on a secondary (if you are not already) and pointing any applications with more real-time requirements elsewhere. You can do this in a variety of ways, but the most flexible is tag-based read preferences.
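A rough mongo shell sketch of that setup, assuming a replica set whose reporting secondary carries a hypothetical { "use" : "reporting" } tag (the tag name, and the mapFn/reduceFn variables standing in for the map and reduce functions from the question, are placeholders):

```javascript
// mongo shell session, not standalone JavaScript.
// Route this connection's reads to a secondary tagged { use: "reporting" }.
db.getMongo().setReadPref("secondary", [ { "use" : "reporting" } ]);

// The inline map/reduce then runs on that tagged member instead of
// competing for the primary's database read lock.
db.webtraffic.mapReduce(mapFn, reduceFn, { out: { inline: 1 } });
```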

In terms of interpreting the currentOp() output, the capital letters represent full (rather than intent) locks; here that is the "R" read lock on the cached_data database, and it is likely what is holding things up (though the job will try to yield). You can also see that the operation spent a lot of time trying to acquire the lock in the first place. I assume this represents a large table scan of data that does not all fit in RAM and is being paged in from disk, hence the high number of yields for that query (MongoDB will try to yield whenever it sees a fault to disk).
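Those lockStats figures are in microseconds; converting the numbers reported for op 1751 above makes the scale clearer:

```javascript
// lockStats copied from op 1751 in the currentOp() output above (microseconds).
const lockStats = {
  timeLockedMicros:    { r: 183690739, w: 0 },
  timeAcquiringMicros: { r: 92296403,  w: 0 },
};

const toSeconds = micros => micros / 1e6;

// Roughly 183.7 s holding the read lock (accumulated across yields)
// and 92.3 s spent just waiting to acquire it.
const heldSec   = toSeconds(lockStats.timeLockedMicros.r);
const waitedSec = toSeconds(lockStats.timeAcquiringMicros.r);
```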

Take a look at the page faults metric in MMS or mongostat to see the trend there; in fact, MMS would be a good place to get a picture of what is going on over time on this instance overall.

The $in change above should help somewhat, but may only kick the can down the road. If you are going to run aggregations across large amounts of data, that data either needs to be in RAM, where this kind of thing is fast, or the job needs to run on a secondary so that the slow disk access does not drag everything else down with it.
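As a side note, the emit/reduce pair in the currentOp() output boils down to summing page hits per (page, profile) key; a plain-JavaScript simulation of that logic (with invented sample documents) lets you sanity-check it outside the server:

```javascript
// Invented sample documents shaped like the webtraffic collection.
const docs = [
  { Profile: "MEMBER", Pages: { "/blogs/a": 3, "/home": 9 } },
  { Profile: "WEB",    Pages: { "/blogs/a": 2, "/news/b": 5 } },
];

// Map phase: emit (page, profile) -> hits for blog/news pages only,
// mirroring the map function shown in the question.
const emitted = new Map();
for (const doc of docs) {
  if (!doc.Pages) continue;
  for (const i in doc.Pages) {
    if (i.match(/(\/blogs\/|\/news\/)/)) {
      const key = JSON.stringify({ page: i, profile: doc.Profile });
      if (!emitted.has(key)) emitted.set(key, []);
      emitted.get(key).push(doc.Pages[i]);
    }
  }
}

// Reduce phase: sum the emitted values per key, as the reduce function does.
const results = {};
for (const [key, vals] of emitted) {
  let sum = 0;
  for (let j = 0; j < vals.length; j++) sum += vals[j];
  results[key] = sum;
}
```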

Adam C