23

This is a pretty long question, so bear with me.

I wanted to stress-test my Akamai server while logged in to an AWS instance, so I started running the ab benchmark. However, the downloads of ~3 MB video files seemed ridiculously fast. Naturally, I wanted to see what was going on. This is what I ran to fetch the file:

curl -v -o /dev/null

The above completed in ~5 seconds.

Next, I ran the same command again. This time, it completed in ~200ms! Naturally, my intuition says the file is being cached somewhere.

My questions:

  1. Does curl cache files? If so, is there a way to ignore it?
  2. If curl doesn't, does Ubuntu keep a cache beneath curl? If so, is there a way to ignore it?
  3. Given the requirements, do you think there could be a benchmarking tool apart from ab that could serve the purpose?

Thank you, Akshay

Akshaya Shanbhogue

5 Answers

21

The curl client isn't caching files, but the remote server network might well be. Try adding an arbitrary query string variable to the URL to see if you can reproduce it.

Josip Rodin
  • Thank you for your answer. I couldn't add arbitrary query string as the Akamai server that I use doesn't accept any query params! (forcing error as it relies on salted token digest of timestamp and URL). However I was able to generate multiple tokens for the same path (essentially multiple URLs) and you are absolutely right. curl wasn't caching any file - the remote server was. Go CDN! :) – Akshaya Shanbhogue Jun 11 '15 at 19:04
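
One way to check whether a shared cache answered the second request is to look at the response headers rather than only the elapsed time. The sketch below is an assumption-laden helper, not something from the thread: `cache_headers` is a hypothetical function name, `Age` and `Cache-Control` are standard HTTP headers, and `X-Cache` and `Via` are headers that some CDNs and proxies add but that are not guaranteed to be present.

```shell
# Keep only the response headers that usually hint at cache behaviour.
# Reads raw HTTP response headers on stdin and strips the trailing CR
# that HTTP puts at the end of every header line.
cache_headers() {
  tr -d '\r' | grep -i -E '^(age|x-cache|via|cache-control):'
}

# Usage (placeholder URL): dump the headers to stdout, discard the body.
#   curl -s -o /dev/null -D - "https://example.com/video.mp4" | cache_headers
```

A non-zero `Age` on the fast request would suggest the object came from a cache on the server side, which is consistent with the conclusion in the comment above.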
7

Belatedly, try:

curl -v -H "Cache-Control: no-cache"

That asks the web server, and any cache in front of it, not to answer from a cached copy. It doesn't stop the layers in between from caching unless they are coded to obey the header.

user171959
2

You can add a random query string using the shell's $RANDOM variable:

curl --location --silent "https://git.io/lsf-e2e?$RANDOM"

This worked for me on GitHub raw files.

Édouard Lopez
  • `$RANDOM` is a good idea, but `?$RANDOM` might result in a bad request. For a valid query string, `?foo=$RANDOM` would work. – Dario Seidl Nov 15 '21 at 12:51
1

I've used this curl command with a cache buster parameter.

curl "http://example.com/static/changing_file?_=$(date +%s)"

date +%s prints the seconds since the epoch. If you call the URL more than once a second, use date +%s.%N (GNU date) to add nanoseconds.

Martlark
    Or you could use $RANDOM instead of appending the nanoseconds. Sure, it's not the prettiest (or most concise) thing, and it kind of merges two solutions, but it does not require nanosecond precision. – Gustavo6046 Jun 05 '20 at 22:27
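
The timestamp and $RANDOM suggestions can be combined into one small helper. This is a minimal sketch under stated assumptions: `cache_bust` is a hypothetical function name, the parameter name `_` follows the example above, and the helper does not handle URLs that contain a `#fragment`. It emits a `key=value` pair and picks `?` or `&` depending on whether the URL already has a query string.

```shell
# Append a throwaway key=value pair so a shared cache sees a new URL.
# Usage: cache_bust URL [TOKEN]
# TOKEN defaults to "<epoch seconds>-<$RANDOM>"; $RANDOM is a bash/zsh
# feature and simply expands to nothing in shells that lack it.
# The body runs in a subshell ( ... ) so url/token stay local.
cache_bust() (
  url=$1
  token=${2:-"$(date +%s)-$RANDOM"}
  case $url in
    *\?*) printf '%s&_=%s\n' "$url" "$token" ;;  # query string already present
    *)    printf '%s?_=%s\n' "$url" "$token" ;;  # no query string yet
  esac
)

# Usage:
#   curl -s -o /dev/null "$(cache_bust 'http://example.com/static/changing_file')"
```

This only helps when the server tolerates unknown query parameters. As the question's author notes in a comment on the accepted answer, a server that validates a signed token over the URL will reject the modified request.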
-2

Maybe your DNS resolver is caching the name resolution, and that is the reason for the difference in response time.

It's only a theory.
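
The theory can be tested with curl's own timing output. The following is a rough sketch: `TIMING_FMT` and `time_fetch` are hypothetical names, while the `%{time_...}` variables are standard `--write-out` variables in curl. If the `dns=` value differs by only a few milliseconds between the slow and the fast run, name resolution cannot account for a drop from ~5 seconds to ~200 ms.

```shell
# Format string for curl -w / --write-out; curl expands the %{...}
# variables and the trailing \n itself.
TIMING_FMT='dns=%{time_namelookup}s connect=%{time_connect}s ttfb=%{time_starttransfer}s total=%{time_total}s\n'

# Fetch a URL, discard the body, and print only the timing line.
time_fetch() {
  curl -s -o /dev/null -w "$TIMING_FMT" "$1"
}

# Usage (placeholder URL): run it twice and compare the dns= field.
#   time_fetch "https://example.com/video.mp4"
#   time_fetch "https://example.com/video.mp4"
```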

Falcon Momot