13

Suppose you're on dell.com right now, buying a server to run the MongoDB database for your small startup. You will have to handle literally tens of thousands of writes and reads per minute (but small objects). Would you go for 2 processors? Invest more in RAM?

I've heard (correct me if I'm wrong) that MongoDB keeps as much as it can in RAM and then flushes everything to disk; in that case I should invest in a CPU with a large L2 cache, probably >40GB of RAM, and a solid-state drive... right?

Would I be better off with one high-end server (~$11,309, 2 expensive processors, 96GB of RAM) or 2x (~$6,419, 2 expensive processors, 12GB of RAM) servers?

Is Dell ok, or do you have better suggestions? (I'm outside the US, in Portugal)

masegaloeh
  • Why are you purchasing hardware instead of going with something like EC2 for your startup? At least initially, until you know what your requirements will be. –  Feb 16 '11 at 21:25
  • Agree with Tom. Why not take some instances on the cloud? –  Feb 17 '11 at 13:34
    @mixdev, you're wrong: "Linux, NUMA and MongoDB tend not to work well together." source: http://www.mongodb.org/display/DOCS/NUMA – Shadok Jun 06 '12 at 09:52

8 Answers

19

Initially, you'll want to beef up on the RAM. The RAM you'll need is dependent on the amount of data you're storing, number of collections, indexes on those collections, data access patterns, etc. Lots of factors.

The most important thing is to have enough RAM to keep your indexes in RAM. Otherwise your performance will suffer dramatically, as your server(s) will page constantly while Mongo moves memory-mapped files in and out of RAM. We haven't seen raw write speed affected, but everything else is: processing writes off the queue, flushing, dumps, etc. all take a dramatic hit once your indexes no longer fit in RAM.
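As a rough sketch of "do my indexes fit in RAM?": you can ballpark an index's footprint from entry count and key size (in a live deployment, `db.collection.totalIndexSize()` in the mongo shell reports the real figure). The per-entry overhead below is an assumed illustrative constant, not a MongoDB-documented value.

```python
# Rough index-RAM estimate. The 18-byte per-entry overhead is an
# assumption for illustration, not a MongoDB-documented constant.
def estimated_index_bytes(num_docs, avg_key_bytes, per_entry_overhead=18):
    """Approximate size of one index over num_docs documents."""
    return num_docs * (avg_key_bytes + per_entry_overhead)

# Example: 50 million docs with 12-byte keys -> ~1.4 GiB for this
# one index alone; every extra index multiplies the RAM you need.
size = estimated_index_bytes(50_000_000, 12)
print(size / 2**30)  # size in GiB
```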

So there is no real short answer. Basically, be smart about your indexes: only index what you need, and keep collections small if you can (i.e., break data into multiple collections where possible). Capped collections are also interesting to look into.

9

It is very important to use a 64-bit machine, not a 32-bit one. http://blog.mongodb.org/post/137788967/32-bit-limitations
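The 32-bit limitation is an address-space problem: memory-mapped data files must fit in the process's virtual address space alongside everything else, which is why 32-bit MongoDB caps out around 2 GB of data plus indexes. A quick check of the arithmetic (the "half usable" split is an illustrative approximation, not an exact figure):

```python
# Total virtual address space on a 32-bit machine.
address_space = 2 ** 32            # bytes
print(address_space // 2 ** 30)    # -> 4 (GiB)

# After OS reservations and the process's own code, heap and stack,
# roughly half is left for memory-mapped data files -- hence the
# commonly cited ~2 GB data + index ceiling for 32-bit MongoDB.
usable_estimate = address_space // 2   # illustrative split, not exact
print(usable_estimate // 2 ** 30)  # -> 2 (GiB)
```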

fullstacklife
6

With MongoDB what you want is RAM. And then some more RAM. Buying RAM can't hurt.

chx
3

If you're at the stage of buying production hardware then the application you're running must already be written, right? So run the app on hardware you have and take metrics. Gradually change some components and take more metrics. When you're done, you'll know which points of focus are most important for your application and scenario.
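A minimal harness for that "run, measure, change one component, measure again" loop might look like the sketch below. The workload lambda is a stand-in; in practice you would point it at your app's real hot path (an insert or query against your actual dataset).

```python
import time

def measure(workload, iterations=1000):
    """Time `workload` and report operations per second."""
    start = time.perf_counter()
    for _ in range(iterations):
        workload()
    elapsed = time.perf_counter() - start
    return iterations / elapsed

# Stand-in workload; replace with a real insert/query from your app,
# then re-run after each hardware or configuration change.
ops_per_sec = measure(lambda: sum(range(100)))
print(f"{ops_per_sec:.0f} ops/sec")
```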

Sam
3

First, buy as much RAM as you can. The second limiting factor is disk speed: RAID helps, SSDs help, and more shards help. Measure your throughput against disk efficiency and required response times, then decide what to do within the budget you have.
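To make "measure throughput against disk" concrete, here is a back-of-the-envelope sketch; all of the numbers (writes/sec target, IOPS per device) are assumed for illustration and ignore RAM caching, which in practice absorbs much of the load.

```python
import math

def shards_needed(target_writes_per_sec, iops_per_device, ios_per_write=1.0):
    """Minimum number of devices/shards to absorb a write load,
    assuming every write eventually costs ios_per_write disk I/Os."""
    required_iops = target_writes_per_sec * ios_per_write
    return math.ceil(required_iops / iops_per_device)

# 20,000 writes/sec on ~100-IOPS spinning disks vs a ~10,000-IOPS SSD.
print(shards_needed(20_000, 100))     # -> 200 spinning disks
print(shards_needed(20_000, 10_000))  # -> 2 SSDs
```

The point of the exercise is the ratio, not the absolute numbers: SSDs and sharding attack the same bottleneck from different directions.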

1

I would wonder if a Linux clustered solution would be a better, cheaper alternative.

MongoDB lets you distribute data over many servers. That is impossible with one honking server.

I thought MongoDB was one of the next steps taken after finding out that deploying a relational database on a honking server didn't scale well enough.

duffymo
1

Tens of thousands of writes per minute is nothing. You can get 50,000 or more writes per second on decent hardware. Hardware specs really depend on what you are trying to do. In general, enough RAM for large databases and a fast I/O system are important, besides a decent CPU...
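The scale point is simple arithmetic: converting the question's per-minute figure to per-second shows how far it sits below the quoted 50,000 writes/sec (the 30,000/min midpoint is an assumed reading of "tens of thousands").

```python
writes_per_minute = 30_000          # assumed midpoint of "tens of thousands"
writes_per_second = writes_per_minute / 60
print(writes_per_second)            # -> 500.0

# Headroom versus the quoted 50,000 writes/sec on decent hardware:
print(50_000 / writes_per_second)   # -> 100.0 (i.e. ~100x headroom)
```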

0

It is important to establish a solid baseline prior to designing your hardware. Generally, expect these kinds of questions to be asked by experienced MongoDB folks before anyone can even consider answering yours.

Current Application Stats (if any)

  • Total Records to date?
  • Starting storage estimate?
  • Expected % growth/month?
  • Average Document Size?

Data Ingestion Work Load

  • New insertions/day, peak & average per second?
  • Updates/day, peak & average per second?
  • Reads/day, peak & average per second?
  • Average number of documents returned per query?
  • Deletes/day, peak & average per second?
  • Will there be bulk loads/bulk updates? If so, how large and how often?
  • How many different types of documents will there be?
  • How many of each?
  • What do you expect your documents to look like (sample doc)?

Query Patterns & Performance Expectations

  • Read Response SLA?
  • Write Response SLA?
  • Are reads range-based or random?

Anticipated Access Patterns

  • Number of secondary indexes required?
  • Number of Attributes?
  • Sort conditions?
  • Single or Compound?
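Answers to the checklist above feed directly into a sizing estimate. A sketch with entirely assumed inputs (starting storage, monthly growth rate, planning horizon):

```python
def projected_storage_gb(start_gb, monthly_growth_pct, months):
    """Compound the starting storage estimate by the expected
    monthly growth rate over the planning horizon."""
    return start_gb * (1 + monthly_growth_pct / 100) ** months

# Assumed answers: 50 GB today, 10%/month growth, 12-month horizon.
print(round(projected_storage_gb(50, 10, 12), 1))  # -> 156.9 GB
```

That projected figure, together with the index estimates from the access-pattern questions, is what determines how much RAM and disk to buy, not today's numbers.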
Ostati