
How do Twitter tools like Buffer scale to millions of users where they have to write thousands of tweets every second?

I have a similar service that runs from a cron script, but just going through 50 users takes about 10 seconds, so I'm wondering how this sort of scale is possible.

user9517
Chamilyan

1 Answer


Tasks like the ones Buffer performs are easily parallelized: you don't have to wait for one user's posts to be processed before posting another user's. Because of this, it's straightforward to scale horizontally, with multiple threads and/or servers processing multiple users at a time.
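As a rough illustration of why this helps a cron script that is mostly waiting on the network, here is a minimal sketch using a thread pool. `post_for_user` is a hypothetical stand-in that only sleeps to simulate API latency; it is not Buffer's code or a real Twitter call, and the worker count is an arbitrary assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def post_for_user(user_id: int) -> int:
    """Hypothetical stand-in for posting one user's queued tweet."""
    time.sleep(0.05)  # simulated network latency, not a real API call
    return user_id


def process_sequential(user_ids):
    # One user at a time: total time is roughly len(user_ids) * latency.
    return [post_for_user(u) for u in user_ids]


def process_parallel(user_ids, workers=10):
    # Many users at once: threads overlap the time spent waiting on I/O.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input.
        return list(pool.map(post_for_user, user_ids))


if __name__ == "__main__":
    users = list(range(20))

    start = time.perf_counter()
    process_sequential(users)
    sequential = time.perf_counter() - start

    start = time.perf_counter()
    process_parallel(users)
    parallel = time.perf_counter() - start

    print(f"sequential: {sequential:.2f}s, parallel: {parallel:.2f}s")
```

Threads suit this case because the work is I/O-bound; CPU-heavy work would more likely call for multiple processes or servers.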

Some basic tricks they might use are:

  • Multiple database servers - the post content doesn't have to be available to all servers, just the ones that need to process it. So you could have one main database that holds account and login info, and a set of separate databases that queued posts are added to, so the load is balanced across them. There is no need to set up a database cluster or replication between these servers, since each queued post only needs to live on the one database that handles it.

  • Multiple posting servers - each server would look through one of the databases for things that are ready to post and deal with them. Each server would probably run multiple processes/threads, with some way to track which thread is handling which database record, so duplicates don't get posted.

  • Optimization. If you are handling thousands of posts a second, shaving 1/100th of a second off each post adds up to a large saving, so a lot of work has likely gone into speeding up the parts of their code that are called most often. Careful profiling shows which parts of the code are worth optimizing, and which are called too rarely to bother with.
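To make the first two items more concrete, here is a minimal sketch of routing a user to one of several queue databases and of a worker claiming a queued post so that no other worker posts it too. The table layout, the `shard_for` and `claim_next` names, and the use of SQLite are all assumptions for illustration; a real service would likely use a networked database and its own schema.

```python
import sqlite3


def shard_for(user_id: int, shard_count: int) -> int:
    """Pick which queue database holds this user's posts (simple modulo)."""
    return user_id % shard_count


def make_queue(db_path: str = ":memory:") -> sqlite3.Connection:
    """Create one hypothetical queue database."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE posts ("
        " id INTEGER PRIMARY KEY,"
        " user_id INTEGER,"
        " body TEXT,"
        " status TEXT DEFAULT 'pending',"
        " worker TEXT)"
    )
    return conn


def claim_next(conn: sqlite3.Connection, worker: str):
    """Claim one pending post for this worker, or return None."""
    with conn:  # commits on success, rolls back on error
        row = conn.execute(
            "SELECT id, body FROM posts"
            " WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status check in the WHERE clause is the guard: if another
        # worker claimed the row first, this update matches zero rows.
        cur = conn.execute(
            "UPDATE posts SET status = 'claimed', worker = ?"
            " WHERE id = ? AND status = 'pending'",
            (worker, row[0]),
        )
        if cur.rowcount != 1:
            return None
        return row


if __name__ == "__main__":
    queue = make_queue()
    queue.execute("INSERT INTO posts (user_id, body) VALUES (1, 'hello')")
    queue.commit()
    print(claim_next(queue, "worker-a"))
    print(claim_next(queue, "worker-b"))
```

The second call finds nothing left to claim, which is the property that keeps a post from going out twice.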

Basically, instead of processing one post from one user at a time, as your script likely does, process as many of them in parallel as possible. Using these techniques you can just add more servers as you grow. You can also automate cloud server scaling: when the backlog gets too large, a new server is started automatically to handle the extra load, and when things die down, the extra servers are shut down again.
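The scaling decision itself can be quite simple. The sketch below works out how many posting servers would be needed to clear the current backlog within a target time; the function name, the throughput figure, and the server limits are made-up assumptions, and actually starting or stopping servers would go through whatever API your cloud provider offers.

```python
import math


def desired_servers(
    backlog: int,
    posts_per_server_per_sec: float,
    drain_target_sec: float,
    min_servers: int = 1,
    max_servers: int = 20,
) -> int:
    """Servers needed to clear `backlog` posts within `drain_target_sec`."""
    # How many posts one server can handle within the target window.
    capacity_per_server = posts_per_server_per_sec * drain_target_sec
    needed = math.ceil(backlog / capacity_per_server)
    # Never drop below the minimum or exceed the budgeted maximum.
    return max(min_servers, min(max_servers, needed))


if __name__ == "__main__":
    # Assumed numbers: each server posts 50/s, backlog should clear in 60 s.
    for backlog in (0, 9_000, 9_001, 1_000_000):
        print(backlog, desired_servers(backlog, 50, 60))
```

A periodic job could compare this number with the count of running servers and start or stop the difference.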

Grant