
I asked this question on Stack Overflow and it was suggested that I try it here:

I'm building a website where users can upload photos, and I'd also like to convert the uploaded photos into thumbnails.

Planning ahead, if the website gets popular, how do I scale it out so that the images (both originals and thumbnails) are stored on and served from multiple servers? Maybe a cluster? Is there any open source software that would help me with this?

Thanks.

Continuation

1 Answer


The easiest thing you can do to have good options later is to write your software in such a way that images can never keep the same filename when modified -- a change to an image must always change the filename. This means that you can set very long cache lifetimes, either through your own caches or through a content delivery network, which will greatly reduce the number of disk reads that you need. (You may need to have some mechanism to immediately flush a specific file from the cache or CDN, if there might be circumstances where an image has to be deleted completely.)
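One way to guarantee that a changed image gets a changed filename is to derive the name from the image bytes. The following is only a minimal sketch of that idea, not the only way to do it: the function name `versioned_filename`, the 16-character digest length, and the one-year `Cache-Control` value are assumptions made for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_filename(original_name: str, data: bytes) -> str:
    """Return a filename derived from the image content.

    Any change to the bytes produces a different name, so a given
    URL never has to be revalidated once it has been cached.
    (Hypothetical helper; the digest length is an arbitrary choice.)
    """
    digest = hashlib.sha256(data).hexdigest()[:16]
    suffix = PurePosixPath(original_name).suffix.lower()
    return f"{digest}{suffix}"


# Assumed response headers for such files: cache for one year and
# tell clients that the content at this URL will never change.
LONG_CACHE_HEADERS = {
    "Cache-Control": "public, max-age=31536000, immutable",
}
```

With names like these, an edited photo is simply stored under a new name and the page is updated to point at it; the old name can stay cached until it expires or is flushed explicitly.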

To allow horizontal scaling, break the images up into a large number of groups (100 or more), and prefix the path to each image with the group it's in. You may serve all the groups off the same server now, but at a later date it will be fairly easy to use a load balancer to direct traffic to different servers based on the image group. I wouldn't use a clustered filesystem for images, because it adds an extra layer of complexity, and it's probably easier just to use multiple servers and some load balancer rules to spread the load.
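The grouping scheme could look roughly like the sketch below. It assumes 256 groups, a `/images/` URL prefix, and a hash of the filename as the grouping rule; none of these come from the answer itself, and the names `image_group`, `image_path`, and `GROUP_TO_SERVER` are hypothetical.

```python
import hashlib

NUM_GROUPS = 256  # assumed; the answer only says "100 or more"


def image_group(filename: str, num_groups: int = NUM_GROUPS) -> int:
    """Map a filename to a stable group number in [0, num_groups)."""
    digest = hashlib.sha256(filename.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_groups


def image_path(filename: str) -> str:
    """Prefix the image path with its group, e.g. /images/042/<name>."""
    return f"/images/{image_group(filename):03d}/{filename}"


# Hypothetical routing table of the kind a load balancer rule would
# encode. Today every group lives on one server; later, individual
# groups can be reassigned without changing any image URL.
GROUP_TO_SERVER = {group: "img1.example.com" for group in range(NUM_GROUPS)}


def server_for(filename: str) -> str:
    """Look up which server currently holds the file's group."""
    return GROUP_TO_SERVER[image_group(filename)]
```

Because the group is part of the path, moving a group to a new machine only means copying that directory and changing one routing entry; the URLs that pages and caches already hold stay valid.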

Mike Scott