CloudFront download speed in Europe: expected vs. reportable

0

As a user in Europe, what download speed should I expect from a service powered by AWS/CloudFront? At what point should I report slowness, and to whom?

For WeTransfer, from an example link I get to an example download URL (found in network console, F12). I then use iftop to see what host is serving the file to me and mtr to see if any obvious problem stands out (though the traceroute from their host to my machine can be different from the other way).

Yesterday, the file was served from CloudFront's Madrid edge, something like server-54-192-61-242.mad50.r.cloudfront.net, and my download speed didn't go beyond 300 KiB/s, staying at 150-200 KiB/s most of the time. That's terribly slow.* I did not save the traceroute but there was no obvious packet loss or latency; IIRC packets went through Telia.

Today, the file is served from server-54-240-166-250.lhr5.r.cloudfront.net (London) and I get 1.1 MiB/s at home, 13 MiB/s average (and 25 MiB/s peak) on a Northern Europe server. This is what I expect.

Given that Amazon/AWS changed the host since yesterday and things now work, it seems even more likely that the problem was on their side. However, the AWS client, in "The download speed is slow", says they won't do anything. The CloudFront docs and AWS forums have no information on how to report networking/routing/peering issues. What should one do in such cases, then? I guess only the AWS client is in a position to get something done, but only if the person who receives the report understands networking.

My traceroute to CloudFront Madrid is something like this:

10.|-- 62-101-124-129.fastres.net                   0.0%    50    4.6  13.8   3.5 101.1  20.3
11.|-- 89.96.200.21                                 0.0%    50   17.6  16.6   2.6  92.9  22.0
12.|-- mno-b2-link.telia.net                        4.0%    50   52.6  26.3  13.1  69.2  13.7
13.|-- mei-b1-link.telia.net                        0.0%    50   23.7  30.3  20.4  87.7  11.3
14.|-- bcn-b2-link.telia.net                        0.0%    50   47.5  53.7  30.2  92.9  16.4
15.|-- mad-b2-link.telia.net                        0.0%    50   62.7  57.7  36.1 102.2  14.4
16.|-- mad-b1-link.telia.net                        0.0%    50   37.7  42.1  34.3  59.8   5.6
17.|-- a100-ic-314004-mad-b1.c.telia.net            0.0%    50   70.2  58.5  39.7  87.2  12.5
18.|-- ???                                         100.0    50    0.0   0.0   0.0   0.0   0.0
19.|-- ???                                         100.0    50    0.0   0.0   0.0   0.0   0.0
20.|-- ???                                         100.0    50    0.0   0.0   0.0   0.0   0.0
21.|-- server-54-192-61-242.mad50.r.cloudfront.net  2.0%    50   71.1  83.5  56.4 156.2  19.5

The traceroute is now something like this:

10.|-- 62-101-124-94.fastres.net                    0.0%    50   68.6  79.5  36.1 108.8  15.4
11.|-- 89.96.200.110                                0.0%    50   75.9  94.8  46.0 141.8  17.6
12.|-- ???                                         100.0    50    0.0   0.0   0.0   0.0   0.0
13.|-- 54.239.4.248                                 2.0%    50  107.2 112.9  71.6 146.7  18.2
14.|-- 54.239.41.135                                0.0%    50  112.8 108.7  72.8 147.6  15.0
15.|-- 178.236.3.22                                 0.0%    50  115.8 102.3  58.4 127.9  16.9
16.|-- 176.32.106.11                                4.0%    50   95.8 103.2  73.7 130.7  14.2
17.|-- 176.32.106.11                               40.0%    50  110.6 108.6  80.4 136.1  14.7
18.|-- ???                                         100.0    50    0.0   0.0   0.0   0.0   0.0
19.|-- ???                                         100.0    50    0.0   0.0   0.0   0.0   0.0
20.|-- server-54-240-166-250.lhr5.r.cloudfront.net 60.0%    50   88.7 100.0  57.6 131.9  18.0
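
To compare the two traces without eyeballing the columns, the mtr report lines can be parsed. The following is a minimal sketch, assuming the standard mtr report column order (Loss%, Snt, Last, Avg, Best, Wrst, StDev); the names `parse_mtr` and `responding_hops` are made up for this illustration. Keep in mind that the loss a hop reports only reflects its ICMP replies, which routers and hosts may rate-limit, so by itself it does not prove loss on forwarded traffic.

```python
import re

# One hop line of an mtr report. Column order assumed:
# Loss%, Snt, Last, Avg, Best, Wrst, StDev.
# The "%" is optional because silent hops are printed as "100.0" here.
HOP_RE = re.compile(
    r"^\s*(?P<hop>\d+)\.\|--\s+(?P<host>\S+)\s+(?P<loss>[\d.]+)%?\s+(?P<sent>\d+)"
    r"\s+(?P<last>[\d.]+)\s+(?P<avg>[\d.]+)\s+(?P<best>[\d.]+)"
    r"\s+(?P<worst>[\d.]+)\s+(?P<stdev>[\d.]+)\s*$"
)


def parse_mtr(report):
    """Return one dict per hop line; lines that do not match are skipped."""
    hops = []
    for line in report.splitlines():
        m = HOP_RE.match(line)
        if m is None:
            continue
        hops.append({
            "hop": int(m["hop"]),
            "host": m["host"],
            "loss": float(m["loss"]),
            "avg_ms": float(m["avg"]),
        })
    return hops


def responding_hops(hops):
    """Drop hops that never answered (100% loss, shown as ???)."""
    return [h for h in hops if h["loss"] < 100.0]


# Three lines copied from the second trace above.
SAMPLE = """\
17.|-- 176.32.106.11                               40.0%    50  110.6 108.6  80.4 136.1  14.7
18.|-- ???                                         100.0    50    0.0   0.0   0.0   0.0   0.0
20.|-- server-54-240-166-250.lhr5.r.cloudfront.net 60.0%    50   88.7 100.0  57.6 131.9  18.0
"""

hops = parse_mtr(SAMPLE)
last = responding_hops(hops)[-1]
print(last["host"], last["loss"], last["avg_ms"])
```

The final hop is the one worth watching: in the London trace it shows 60% loss to probes while the download itself was fast, which is consistent with probe replies being deprioritized rather than with real packet loss.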

As the first answer notes, it matters a lot whether the file was already cached on the CloudFront edge or not. Here is an example of a cache miss (which right now manages to saturate my bandwidth):

$ LANG='en' wget -S 'https://download.wetransfer.com/wetransfer-eu1/f7a2031249f56fdeeda9040adda5a26f20160224143804/wetransfer-f7a203.zip?expiration=1456605646&escaped=false&signature=3d916716d49e415f637b4f824c7709f7483b67a8f02588caece30d6c2a3ed0ea&filename=wetransfer-f7a203.zip'
--2016-02-27 21:34:39--  https://download.wetransfer.com/wetransfer-eu1/f7a2031249f56fdeeda9040adda5a26f20160224143804/wetransfer-f7a203.zip?expiration=1456605646&escaped=false&signature=3d916716d49e415f637b4f824c7709f7483b67a8f02588caece30d6c2a3ed0ea&filename=wetransfer-f7a203.zip
Resolving download.wetransfer.com (download.wetransfer.com)... 54.192.61.62, 54.192.61.196, 54.192.61.80, ...
Connecting to download.wetransfer.com (download.wetransfer.com)|54.192.61.62|:443... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 1449534395
Connection: keep-alive
Server: nginx
Date: Sat, 27 Feb 2016 20:34:39 GMT
Content-Transfer-Encoding: binary
Content-Encoding: none
Cache-Control: private, no-transform, no-store
Allow: GET, HEAD
Accept-Ranges: bytes
Content-Disposition: attachment; filename="wetransfer-f7a203.zip"
X-Transfer-Id: f7a2031249f56fdeeda9040adda5a26f20160224143804
X-Cache: Miss from cloudfront
Via: 1.1 943ab292a0096b706fe263560805857e.cloudfront.net (CloudFront)
X-Amz-Cf-Id: 4hEZcZL56GWMBn8z1T2txF-O3TTdrAC6OxCtqVDZUoJREUd9_EBo6A==
Length: 1449534395 (1.3G) [application/octet-stream]

On further testing, I always get X-Cache: Miss from cloudfront, even the 6th time I request the same resource, so it seems that WeTransfer isn't caching anything at CloudFront (or not files of this size). Interestingly, the X-Transfer-Id: f7a2031249f56fdeeda9040adda5a26f20160224143804 header is always the same although the actual download URL I get from clicking the download button varies; the Via and X-Amz-Cf-Id headers also vary. As of this update, the first time I request a given download URL is very fast, the second very slow, the third 404s. I tried and I can have two simultaneous downloads, one at the second attempt and one at the first attempt: the former will be very slow and the latter very fast, although the networking conditions are clearly the same.

See https://paste.debian.net/408552/ for a test from my Northern Europe server: download A* are one URL, B* another; A-2 is after A-1 and B-2 is after B-1, but B* started while A-2 was running. Yet A-1 and B-1 were very fast, A-2 and B-2 very slow.

This is increasingly looking like an issue with quality of service/QoS aka throttling. Can CloudFront throttle me with cache misses, or should we only blame their client?

(*) Note: I have a 10/10 Mb/s FTTH connection with Fastweb. The available bandwidth never goes under this guaranteed speed. The ISP is not known to apply QoS throttling, but does sometimes have some routing issues outside Italy. When I observed the problem, I didn't have any problem saturating my bandwidth with other services.
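
Since the figures above mix Mb/s (line rate) and KiB/s or MiB/s (download tools), here is a small sketch of the conversion. It assumes "10 Mb/s" means 10 * 10^6 bits per second and ignores protocol overhead; the function names are made up for this illustration.

```python
def mbit_to_mib_per_s(mbit_per_s):
    """Convert megabits per second (10**6 bits) to MiB/s (2**20 bytes)."""
    return mbit_per_s * 1_000_000 / 8 / 2**20


def kib_to_mib_per_s(kib_per_s):
    """Convert KiB/s to MiB/s."""
    return kib_per_s / 1024


line_rate = mbit_to_mib_per_s(10)  # about 1.19 MiB/s for a 10 Mb/s line

# Speeds observed on the slow day, as a share of the line rate.
for observed_kib in (150, 200, 300):
    share = kib_to_mib_per_s(observed_kib) / line_rate
    print(f"{observed_kib} KiB/s is {share:.0%} of the line rate")
```

Under these assumptions, the 1.1 MiB/s seen at home on the good day is close to the full line rate, while 150-200 KiB/s is only a small fraction of it.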

Nemo

Posted 2016-02-27T10:46:18.343

Reputation: 1 050

How can you download at 25MiB/s if you have a 10Mb/s line? – MariusMatutiae – 2016-02-27T16:33:02.553

@MariusMatutiae that was the server – Nemo – 2016-02-27T20:26:26.677

Answers

1

If you are not a representative of the AWS account that owns the CloudFront distribution -- and the phrasing used in the question makes it seem as if you are not -- then your appropriate point of contact is the web site where you are experiencing poor performance.

It would then be their responsibility to open a support incident with AWS support, if they deem it appropriate, since they are CloudFront's customer.

CloudFront is designed to route your request to the optimal CloudFront edge location (where "optimal" often but not always means geographically proximate) within the pricing tier selected by the distribution's owner. Owners may choose not to pay for more expensive edges, where Amazon's costs are higher; requests for that site will then avoid those edges, or will at least avoid premium pricing, since CloudFront may, at its option, use a higher-cost location but bill at the lower rate.

The optimal edge for a particular downloader's location will shift over time, due to a multitude of factors, including latency, congestion, hops and AS path sizes, link bandwidth, and any number of other factors... which are not public information, but are taken into account by the CloudFront routing algorithms that determine what DNS response you receive when you connect to a site powered by CloudFront. The DNS response varies by the requesting client IP address.

From a single source IP address in Southern Ohio (US) I see my CloudFront test site route through an edge location that changes between South Bend (IN, US), Chicago (IL, US), and Ashburn (VA, US) on a fairly regular basis -- without the actual IP address I am requesting the page from even changing. From a similar setup less than 5 miles away, but with a different static source IP address using a different ISP, I get similarly varying but often different responses.

This can most easily be explained by CloudFront's algorithms trying to select the most appropriate edge, based on factors that are not obvious from the outside looking in.
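
To track which edge you are being sent to over time, the edge code can be pulled out of the reverse DNS name of the serving IP. This is a rough sketch; the hostname pattern is inferred only from the two names quoted in the question (mad50 and lhr5), so treat it as an assumption rather than a documented format. The function names are hypothetical.

```python
import re
import socket

# Pattern inferred from names such as
# server-54-192-61-242.mad50.r.cloudfront.net (an assumption, not a spec).
EDGE_RE = re.compile(
    r"^server-(\d+)-(\d+)-(\d+)-(\d+)\.([a-z]{3}\d+)\.r\.cloudfront\.net\.?$"
)


def edge_from_hostname(hostname):
    """Return (ip, edge_code) for a matching name, or None otherwise."""
    m = EDGE_RE.match(hostname.lower())
    if m is None:
        return None
    ip = ".".join(m.group(1, 2, 3, 4))
    return ip, m.group(5)


def reverse_lookup(ip):
    """Reverse-resolve an IP to a hostname (needs network access)."""
    return socket.gethostbyaddr(ip)[0]


print(edge_from_hostname("server-54-192-61-242.mad50.r.cloudfront.net"))
print(edge_from_hostname("server-54-240-166-250.lhr5.r.cloudfront.net"))
```

Running `edge_from_hostname(reverse_lookup(ip))` on the address a download is actually served from, at different times of day, would show how often the selected edge shifts for a given source IP.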

For all you know, the slow connections to CloudFront yesterday may have been detected and may have triggered the selection algorithm to choose a different strategy, thereby causing the performance issue to "fix itself." It's also possible that the Madrid edge was being used as a suboptimal choice due to availability issues at a better location.

There could also have been an issue between CloudFront and the origin server. The response headers from CloudFront would have given you a little more information... If X-Cache: Hit from Cloudfront is present, you're being served from the edge cache, and the Age: header will tell you how long the object has been cached at the edge. If X-Cache: Miss from Cloudfront, then your download was not cached at the edge, and the file you're currently receiving is being fetched from the origin server and simultaneously being staged for the cache and being streamed back to you. Allowing that download to fully finish, then downloading again with an identical request should get you the cached copy, assuming your next request hits the same edge, and the speed difference, if any, is approximately indicative of the connection speed back to the origin server. CloudFront is a pull-through CDN; objects are not replicated to the edges, they're only stored in places where they have been requested, after the initial request.
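
The header check described above can be sketched as follows. The helper names are made up for this illustration, and only the two X-Cache values mentioned here (Hit and Miss) are recognized; anything else is reported as "unknown".

```python
def parse_headers(raw):
    """Parse 'Name: value' lines into a dict with lowercased names."""
    headers = {}
    for line in raw.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # skip lines without a colon, e.g. the status line
            headers[name.strip().lower()] = value.strip()
    return headers


def cache_status(headers):
    """Return (status, age_seconds) based on the X-Cache and Age headers."""
    x_cache = headers.get("x-cache", "").lower()
    if x_cache.startswith("hit"):
        age = headers.get("age", "")
        return "hit", (int(age) if age.isdigit() else None)
    if x_cache.startswith("miss"):
        return "miss", None
    return "unknown", None


# Headers of the kind printed by `wget -S` or `curl -I`.
RAW = """\
HTTP/1.1 200 OK
Content-Length: 1449534395
X-Cache: Miss from cloudfront
Via: 1.1 943ab292a0096b706fe263560805857e.cloudfront.net (CloudFront)
"""

status = cache_status(parse_headers(RAW))
print(status)
```

Comparing the speed of a "miss" download with a later "hit" for the identical request gives a rough idea of how much of the delay comes from the edge-to-origin leg.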

As a CloudFront client, I have never had a need to report slow downloads. That doesn't mean it's perfect, but the service does appear to be very solidly engineered for performance and resiliency.

Michael - sqlbot

Posted 2016-02-27T10:46:18.343

Reputation: 1 103

TL;DR: CloudFront should saturate my bandwidth, unless I'm the first person requesting that resource from that edge, in which case it will depend on the bandwidth between the website and CloudFront. – Nemo – 2016-02-27T20:30:24.803

Your https://cloudfront.sqlbot.net/ is very nice, I bookmarked it. Right now it says "mad50" as my current optimum, and that's indeed what I get if I download from WeTransfer. I see a similar traceroute as yesterday but the download speed is good, I saturate my home bandwidth.

– Nemo – 2016-02-27T20:45:06.180

I added some information. This doesn't seem to be related to cache misses. They are throttling the requests at some level, I'm not sure which. – Nemo – 2016-02-27T21:01:11.320

If you are not the CloudFront customer, I don't think it matters at this point -- your issue is with the service whose customer you are, not with whatever infrastructure provider they are using. As an end-user, you don't have the standing to report an apparent issue to an intermediate provider -- that is a rule of standard business practice. It is up to the service you have a relationship with to validate your trouble report and relay it upstream if they consider it to be an issue. You lack the visibility to determine upon whose side the limitation is being applied, if there is one. – Michael - sqlbot – 2016-02-28T01:02:41.443

Yes, the problem is the service's stonewalling. Their support says AWS/CloudFront is perfect, hence any problem can only be with my ISP. At this point I can't prove, for instance, that Telia isn't doing some throttling on me. (And go figure how I could explain to a help desk that Telia would still be their problem, not mine.) – Nemo – 2016-02-28T08:05:02.917