Get a list of records from a collection sorted by count and uniqueness of a field in MongoDB

0

So I have a bunch of documents in a MongoDB collection and it seems that the collection is growing a little faster than we thought.

Is there a way to get a list from a collection that will count the number of documents that have X as a value in a field.

I’ll just make some data up for example’s sake:

There are 4 values possible for the field reference:

  1. /content/public
  2. /content/private
  3. /resource/something
  4. /much/wow

Is there a way to get a list from mongo that says:

  • 1231 Records have /content/public as the value for reference.
  • 21312312 have /content/private.
  • 34 have /resource/something.
  • 34242 have /much/wow.

Sempus

Posted 2014-12-01T15:51:30.007

Reputation: 119

Answers

0

There are a couple of different ways to do this, the brute force approach would be to do a query for each distinct value you are aware of, and just call .count() like so:

db.collection.find({field : '/content/public'}).count()
db.collection.find({field : '/content/private'}).count()
etc.

But, that means you need to iterate through a list of known fields and you have to run a bunch of different queries. Still, it might be nice for a spot check.

The more general option is the aggregation framework. Something like this would work, where field, as above is the one you are interested in counting values for:

db.collection.aggregate( { $group : { _id : "$field", count : { $sum : 1 } } })

Here's a generic example with made up data. First the data itself:

> db.foo.find()
{ "_id" : ObjectId("547ccf922bd17c324ae52b6e"), "countField" : "/content/public" }
{ "_id" : ObjectId("547ccf932bd17c324ae52b6f"), "countField" : "/content/public" }
{ "_id" : ObjectId("547ccf942bd17c324ae52b70"), "countField" : "/content/public" }
{ "_id" : ObjectId("547ccf942bd17c324ae52b71"), "countField" : "/content/public" }
{ "_id" : ObjectId("547ccf952bd17c324ae52b72"), "countField" : "/content/public" }
{ "_id" : ObjectId("547ccf962bd17c324ae52b73"), "countField" : "/content/public" }
{ "_id" : ObjectId("547ccf9b2bd17c324ae52b74"), "countField" : "/content/private" }
{ "_id" : ObjectId("547ccf9d2bd17c324ae52b75"), "countField" : "/content/private" }
{ "_id" : ObjectId("547ccfa42bd17c324ae52b76"), "countField" : "randomvalue" }
{ "_id" : ObjectId("547ccfa52bd17c324ae52b77"), "countField" : "randomvalue" }
{ "_id" : ObjectId("547ccfa62bd17c324ae52b78"), "countField" : "randomvalue" }
{ "_id" : ObjectId("547ccfa72bd17c324ae52b79"), "countField" : "randomvalue" }
{ "_id" : ObjectId("547ccfa72bd17c324ae52b7a"), "countField" : "randomvalue" }
{ "_id" : ObjectId("547ccfa82bd17c324ae52b7b"), "countField" : "randomvalue" }
{ "_id" : ObjectId("547ccfb52bd17c324ae52b7c"), "countField" : "/much/wow" }

Now the sample aggregation:

> db.foo.aggregate({ $group : { _id : "$countField", count : { $sum : 1 } } })
{ "_id" : "/much/wow", "count" : 1 }
{ "_id" : "randomvalue", "count" : 6 }
{ "_id" : "/content/private", "count" : 2 }
{ "_id" : "/content/public", "count" : 6 }

Adam C

Posted 2014-12-01T15:51:30.007

Reputation: 2 475