1

I don't know an awful lot about servers, but I do know that there are many ways you can improve the performance of running a web application, in terms of hardware.

For example, starting with a single server, you can:

  • Move the database onto another server.
  • Move static files onto a CDN
  • Use multiple web servers behind a load balancer.

But how can you scale a database server? Can you have multiple servers running the same database? If so, how does it normally work?

I get how a load balancer / multiple web servers work, but can the same approach be applied to database servers?

Alex Coplan
  • 575
  • 1
  • 10
  • 18

3 Answers3

4

This is a very broad question.

The general answer is that there is a number of ways this can be done; two of the main ones are these:

  • Replication
  • Clustering

Replication is where the database is copied from one machine to another. Using this method, you should in theory at least have two databases that are identical. With replicated databases, you can query any of them for your data; you only have to be careful with writes so that they don't conflict between servers.

Clustering uses a shared data store and a database server that recognizes the data store is shared and handles it properly. It also suggests a clustered file system such as GFS.

Mei
  • 4,560
  • 8
  • 44
  • 53
3

You can do some load balancing; but it quickly gets tricky. I've found the most useful thing to know is what kind of operations your database does normally. If you've got a website; usually its reads. For tracking apps (as an example) you'll have lots of writes.

If you have lots of reads you can make lots of read-only mirrors, chuck a load balancer in front and you're done. Each request is balanced so each request gets the maximum amount of resourcing.

For write heavy systems it's a bit harder because as you add a new server, they need to be aware of what the other is doing - this is where locking comes in. If you have servers A and B; A gets an update or insert request, before it can modify the table it has to say "hey B, make sure you don't change this table, I'm about to change it" (locking) and then commits the change across. This all happens over the network (usually) which means it can be slow (ms not usec) which can hurt performance. As you add servers the performance degrades. With A,B and C; A must talk to both B and C before committing.

Hope this serves as an introduction to db scaling. I'm currently looking into Citrix Netscalers DB load balancing offerings that can help with these questions; I'd recommend looking at least at the docs they have on the topic :)

Preflightsiren
  • 457
  • 2
  • 8
2

It really depends on the database technology, but generally this is accomplished by sharding, master-master replication, or master-slave replication with all writes going to the master(s) and all reads coming from the slave(s).

There are plenty of other ways, but these are the most common.

MDMarra
  • 100,183
  • 32
  • 195
  • 326