Can I host a static website on Amazon EC2 coupled with EFS?


My website is currently hosted on EC2 with an EBS volume. Quite often, though, I run a Java program on the same instance (which outputs to EBS) to generate or update files for the website. CPU and memory peak at 100% while it runs, which makes my website either very slow or unreachable until the process ends. Moreover, I'm paying for a large instance for the whole month just to run this Java program a few times a month.

Why not run the Java program locally and upload the output files to EBS? Internet speeds in my area are ridiculously slow.

Why not output to S3? My files are small in size but large in number, and the transfer speed from EC2 to S3 is unacceptable. I even tried s4cmd, but the transfer speed was still unacceptable.

Why not mount the same EBS volume on two EC2 instances? I think it's a hack.

So I want to switch to EFS and connect two EC2 instances: one dedicated to running the website, and another on-demand instance to run the Java process only when needed.

My questions about EFS: 1) Will my Java program be able to write large files to EFS at the same speed as to EBS? 2) Amazon does not charge extra for EFS outbound bandwidth. Is it limited?


Posted 2016-12-31T05:58:28.823

Reputation: 41

The throughput available from EFS varies based on how much data you have stored, using a credit-based baseline+burst algorithm. To give you a meaningful answer, please mention approximately how much data is to be stored, in GiB. – Michael - sqlbot – 2016-12-31T14:31:33.980

I want to store only about 2gb - 5gb. But – Avinash – 2017-01-01T07:51:26.810

Thank you. I just checked, and if I understood correctly, EFS can provide only 0.5 MB/s if I store less than 10 GB of data. I want to store only about 2 GB to 5 GB of data but require very high outbound throughput, so I think EFS is not suitable for my requirements. It's strange that Amazon links the size of the data to traffic. I might store only a little data, but that doesn't mean my outbound traffic will be very low too. – Avinash – 2017-01-01T07:59:44.897

Your interpretation is a little bit incomplete. I'll try to clarify this with an answer, shortly. – Michael - sqlbot – 2017-01-01T17:49:28.647

Can you tell us more about your use case? In your situation I would reduce the instance size and use something else for batch processing - a spot instance, AWS Batch (new), Lambda, etc. Making the files on your instance available using NFS seems quite practical. I've read that EFS has a lot of latency when a read starts, so it's not ideal for small files - it's better with large files. – Tim – 2017-01-06T19:04:04.033



Important note: As of 2018-07-12, EFS allows you to purchase provisioned throughput. The answer below reflects the behavior of the service before this feature was available. Previously, small EFS filesystems were easy to overwhelm with traffic, because performance scaled up linearly with the size of the data stored... so with only a few GB stored, the effective limit was too small for some use cases that did not take this into account.

Why not mount the same EBS volume on two EC2 instances? I think it's a hack.

You can't mount the same EBS volume to multiple instances. You can, however, create an NFS export from the machine with the EBS volume and mount that across the network. NFS is established technology, not a hack. In fact, this would be almost identical to using EFS from your perspective, since EFS does in fact use the same protocol -- NFS.

Amazon does not charge extra for EFS outbound bandwidth. Is it limited?

"Outbound bandwidth" is not exactly a valid concept with EFS, because the traffic is strictly between EFS and the instance that is accessing it. If you mount it correctly, using the availability zone-specific endpoint, the traffic between EFS and the EC2 instances never leaves the availability zone.

If a web browser downloads a file that is on your EFS filesystem, it must necessarily be downloading it through one of your instances. So the outbound bandwidth is actually EC2 outbound, not EFS outbound.

The available throughput ("bandwidth") between EFS and EC2 scales up with the total size of the files stored in an EFS filesystem.

Amazon EFS uses a credit system to determine when file systems can burst. Each file system earns credits over time at a baseline rate that is determined by the size of the file system, and uses credits whenever it reads or writes data. The baseline rate is 50 MiB/s per TiB of storage (equivalently, 50 KiB/s per GiB of storage).

Accumulated burst credits give the file system permission to drive throughput above its baseline rate. A file system can drive throughput continuously at its baseline rate, and whenever it's inactive or driving throughput below its baseline rate, the file system accumulates burst credits.
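
The credit rules quoted above can be sketched as a toy simulation. This is a simplified model, not how EFS is actually implemented: it assumes one-second time steps, treats one credit as one MiB of transfer, ignores any cap on the credit balance, and uses a 100 MiB/s burst ceiling as a default parameter. The function name `simulate_credits` and its parameters are made up for illustration.

```python
def simulate_credits(baseline_mib_s, demand_mib_s, burst_limit_mib_s=100.0,
                     initial_credits_mib=0.0):
    """Toy model of a credit-based baseline+burst throughput limiter.

    demand_mib_s is a sequence of requested throughput values, one per second.
    Returns a list of (achieved_mib_s, credits_after_mib) tuples.
    """
    credits = initial_credits_mib
    history = []
    for wanted in demand_mib_s:
        # Credits are earned every second at the baseline rate.
        credits += baseline_mib_s
        # Transfers are limited by demand, the burst ceiling, and the balance.
        achieved = min(wanted, burst_limit_mib_s, credits)
        credits -= achieved
        history.append((achieved, credits))
    return history


# A filesystem with a 0.5 MiB/s baseline idles for 100 s, then is asked for 100 MiB/s.
trace = simulate_credits(0.5, [0.0] * 100 + [100.0] * 3)
for second, (achieved, credits) in enumerate(trace[99:], start=100):
    print(f"t={second}s achieved={achieved} MiB/s credits={credits} MiB")
```

In this model a filesystem with an empty balance can still sustain its baseline rate, and idle time is what pays for later bursts.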

There is, however, a 100 MiB/s burst capability regardless of how small the filesystem is. A 10 GiB filesystem, for example, can burst to 100 MiB/s for 7.2 minutes per day, or to 25 MiB/s for 28.8 minutes per day, and so on.
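
The daily burst figures follow from simple arithmetic: the credits earned in a day at the baseline rate, divided by the burst rate. Here is a minimal sketch, assuming the baseline is rounded to 0.05 MiB/s per GiB (the exact 50 KiB/s per GiB gives slightly smaller numbers) and ignoring any cap on accumulated credits. The function name is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def burst_minutes_per_day(size_gib, burst_mib_s, baseline_mib_s_per_gib=0.05):
    """Minutes per day a filesystem can run at burst_mib_s on one day's credits."""
    # Baseline throughput grows linearly with the amount of data stored.
    baseline_mib_s = size_gib * baseline_mib_s_per_gib
    # One day of baseline credits, spent at the burst rate.
    return baseline_mib_s * MINUTES_PER_DAY / burst_mib_s


for rate in (100, 25):
    minutes = burst_minutes_per_day(10, rate)
    print(f"10 GiB filesystem at {rate} MiB/s: {minutes:.1f} minutes per day")
```

For the 2 GiB to 5 GiB range mentioned in the comments, the same function can be called with `size_gib=2` or `size_gib=5` to see how the burst window shrinks.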

The thing you're overlooking with your conclusion that this is insufficient is the OS cache. On your web server, files read from EFS may remain in the OS cache on that machine, which means that once a file has been served to a browser, the web server may not need to read the file from EFS on the next download, but may instead only check that it has not changed and then serve it to the browser from memory. This behavior should be automatic unless you disable it.

It's strange that Amazon links the size of the data to traffic. I might store only a little data, but that doesn't mean my outbound traffic will be very low too.

Not really strange, since the size of the stored data is the only dimension that impacts pricing. EBS volumes are generally similar -- the larger the volume, the more throughput in MiB/s and/or IOPS is available from that volume.

Here, again, don't confuse your application's outbound traffic with the backing store's throughput. The two values are not tightly correlated.

For smaller instances, the instance's characteristics are actually more likely to be the limiting factor. For example, a t2.small instance only has 31.25 MB/s (250 Mbps, 0.25 Gbps) of bandwidth available at all, so the upper performance limit won't be the filesystem.
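
A quick sketch of the unit conversion for the 250 Mbps figure follows. Strictly speaking, dividing megabits by 8 yields decimal megabytes (MB), and the binary MiB value is a little lower. The 250 Mbps value is taken from the paragraph above rather than from a published specification, and the helper names are made up.

```python
def mbps_to_mb_per_s(megabits_per_s):
    """Convert megabits per second to decimal megabytes per second."""
    return megabits_per_s / 8


def mbps_to_mib_per_s(megabits_per_s):
    """Convert megabits per second to binary mebibytes per second."""
    return megabits_per_s * 1_000_000 / 8 / 2**20


link_mbps = 250  # instance bandwidth quoted in the answer
print(f"{link_mbps} Mbps = {mbps_to_mb_per_s(link_mbps)} MB/s")
print(f"{link_mbps} Mbps = {mbps_to_mib_per_s(link_mbps):.2f} MiB/s")
```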

Try your application with EFS and observe the CloudWatch metrics for the filesystem. Every workload is different and that's really the only way to know if it will work as expected.
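
As a starting point for watching those metrics, here is a hedged sketch that builds a CloudWatch query for an EFS filesystem. It assumes the `AWS/EFS` namespace, the `BurstCreditBalance` metric, and the `FileSystemId` dimension as documented by AWS, plus the boto3 `get_metric_statistics` call. The helper names and the filesystem ID are placeholders, and actually running the query requires boto3 and AWS credentials.

```python
from datetime import datetime, timedelta, timezone


def efs_metric_request(file_system_id, metric_name="BurstCreditBalance", hours=24):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EFS",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "FileSystemId", "Value": file_system_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 3600,  # one datapoint per hour
        "Statistics": ["Minimum"],
    }


def fetch_efs_metric(file_system_id, **kwargs):
    """Run the query; needs boto3 installed and AWS credentials configured."""
    import boto3  # imported lazily so the request builder works without boto3

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        **efs_metric_request(file_system_id, **kwargs)
    )
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])


request = efs_metric_request("fs-12345678")  # placeholder filesystem ID
print(request["Namespace"], request["MetricName"])
```

A steadily falling minimum credit balance during normal traffic would suggest the workload is outrunning the baseline rate.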

Michael - sqlbot

Posted 2016-12-31T05:58:28.823

Reputation: 1 103

Thank you. It makes complete sense now. I now realize that even if there are 1000 simultaneous requests for a file on EFS, the instance makes only a single request to the EFS server. I also understood that throughput depends on average I/O size rather than file size, from this sentence on the AWS page: "Overall throughput generally increases as the average I/O size increases, because the overhead is amortized over a larger amount of data". But mounting in the same availability zone (or even region) is not possible for me, as almost all of my traffic is from the Asia Pacific region, where EFS is not available. – Avinash – 2017-01-02T03:55:48.403

Officially, EFS is only supported when the EFS filesystem and the instances mounting it are in the same region. I developed a workaround for this, which works great, but limitations of the speed of light make this poorly-suited for many applications because of the increase in latency and the data transfer charges involved when sending traffic across regions.

– Michael - sqlbot – 2017-01-02T04:40:37.010