We have a growing set of data files (.wav files, image files, etc.) that are data rather than application code: they are uploaded and modified by users. There are thousands of files, and the total size runs to gigabytes.

We have several server clusters in different locations around the world (US, EU, ME). In each cluster it is important that the data is served locally and not from S3 (the data files are not served directly to clients, but are processed by the servers). We want to designate a file server in each location which will serve the files via NFS to the other nodes in the same cluster.

So the bottom line is:

  • Files uploaded via the application should end up on S3.
  • Each file server node should replicate those files.

We see several options:

  • Using an origin file server that replicates to S3 for backup/versioning and to the nodes via rsync (or similar).
  • Same as above, but the slaves replicate from S3 using an S3 sync tool or similar.
  • Not using an origin: the app code uploads directly to S3, and the slaves replicate from S3 as above.

We are wondering which of these is the recommended solution, and what tools are available for the replication part (i.e. in the filesystem-to-filesystem category and in the filesystem/S3 category).
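
To make the filesystem/S3 case concrete, here is a rough sketch of what we imagine each file server doing on every replication pass. It is only an illustration under our own assumptions: `scan_local` and `plan_sync` are hypothetical names, the remote listing is assumed to have been fetched separately as a mapping of object key to size, and files are compared by size only (a real tool would presumably also compare checksums or modification times).

```python
import os
from typing import Dict, List, Tuple


def scan_local(root: str) -> Dict[str, int]:
    """Map each file under root to its size, keyed by '/'-separated relative path."""
    sizes: Dict[str, int] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            sizes[rel] = os.path.getsize(full)
    return sizes


def plan_sync(remote: Dict[str, int], local: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Decide what a replica must fetch and what it must remove.

    remote: key -> size as listed in the bucket (assumed fetched elsewhere).
    local:  key -> size as found on the file server, e.g. from scan_local().
    """
    # Fetch anything missing locally or whose size differs from the remote copy.
    to_download = sorted(k for k, size in remote.items() if local.get(k) != size)
    # Remove anything that no longer exists remotely.
    to_delete = sorted(k for k in local if k not in remote)
    return to_download, to_delete
```

Each file server would then download the first list and delete the second, so that the NFS export converges on whatever is in the bucket.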

Amir Abiri
