54

I have seen increased 'HEAD' requests in my webserver access.log. What are these requests for? Should I disable this method in my webserver configs?

Bruno Rohée
hnn

5 Answers

73

No.

Relevant quote from the link:

HEAD

Asks for the response identical to the one that would correspond to a GET request, but without the response body. This is useful for retrieving meta-information written in response headers, without having to transport the entire content.

If you disabled it, you'd just increase your bandwidth cost. Anyone can get the same information with a GET, so a client with malicious intent could simply switch to GET. By using HEAD, they're being nice and not forcing you to send the response body.
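
To make the difference concrete, here is a minimal sketch using only the Python standard library. It starts a throwaway local server and sends it one GET and one HEAD. The 10 kB body, the handler class, and the helper names are assumptions made for illustration and do not come from any real site.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"x" * 10_000  # hypothetical 10 kB page


class Handler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(BODY)

    def do_HEAD(self):
        # Same status line and headers as GET, but no body is written.
        self._send_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(port, method):
    """Return (status, Content-Length header, bytes of body actually received)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, "/")
        resp = conn.getresponse()
        body = resp.read()
        return resp.status, resp.getheader("Content-Length"), len(body)
    finally:
        conn.close()


def demo():
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: let the OS pick one
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return fetch(port, "GET"), fetch(port, "HEAD")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    get_result, head_result = demo()
    print("GET :", get_result)
    print("HEAD:", head_result)
```

Both responses advertise the same Content-Length, but only the GET actually transfers the body.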

EDIT: I don't know what the requests would be from, although I can certainly think of uses. Anyone else who knows or wants to chip in, please do so. I'm kinda curious, myself. Hence, community wiki.

Parthian Shot
  • Requests are for random directory names and come from "DirBuster-0.12 (http://www.owasp.org/index.php/Category:OWASP_DirBuster_Project)". So I guess some reconnaissance is going on – hnn Jul 09 '14 at 19:28
  • Hmm... You could blacklist their IP. Or just heavily log and watch traffic from that IP without letting them know you're doing it; maybe with a transparent proxy. Watch for any change (because if there is any change... Or you could just contact their ISP. If they have a domain (which you can use dig -x to discover, or just normal dig with .in-addr.arpa and a reversed address), I'd ask them directly first, then contact their ISP. – Parthian Shot Jul 09 '14 at 19:48
  • 10
    Good answer! HEAD requests are commonly used by proxies or CDNs to efficiently determine whether a page has changed without downloading the entire body (especially for large media files). – Barett Jul 10 '14 at 01:00
  • 1
    Someone is probably running `wget -m http://yoursite` as a cron job. If you disable `HEAD`, you'll have no fun. – Damon Jul 10 '14 at 10:40
  • 2
    HEAD is also useful to tell if a webpage exists before fetching it - e.g. if you've got some custom web-scraping code and need to see if the page exists then branch your logic from there. – Monica Apologists Get Out May 16 '18 at 20:15
  • `HEAD` is used to get the headers for a given URI without downloading the body. Reasons you might do this: (1) checking whether a resource has changed (e.g. comparing the HEAD response against a saved ETag) before downloading; (2) checking server capabilities before downloading (e.g. whether the server sends an Accept-Ranges header indicating we can download/resume in chunks); (3) checking authentication status/requirements. In one example use case I periodically download very large feeds (in some cases 3GB of uncompressed XML). I prefer to make sure a feed has changed via HEAD before redownloading. – mattpr Sep 12 '19 at 14:43
  • For routine automated external link testing for a site to uncover dead links. Unfortunately, a lot of the free image downloading sites block head requests with a 403 code, which is strange because the pages are freely publicly viewable, and their images are not being scraped with a HEAD request. – Patanjali Aug 17 '22 at 01:16
  • I get a significant amount of referrer spam done with HEAD requests. Spammers send a "HEAD / HTTP/1.1" with a referrer over non-secure HTTP, not followed by any requests for the CSS and images in the index.html file. It's a lightweight way to carpet-bomb URLs into logs, in the hope that some of them turn up in sites' Webalizer referrer sections. As noted above, HEAD is easier on the server than GET, so why encourage them to use GET? – Uri Raz Sep 12 '22 at 13:29
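
As the DirBuster comment above shows, the User-Agent field often reveals who is sending the HEAD requests. The following is a rough sketch for tallying HEAD requests per user agent. It assumes the log uses the common "combined" format; the regular expression, the function name, and the sample lines (with documentation-range IP addresses) are all made up for illustration, so adjust them to your own log layout.

```python
import re
from collections import Counter

# Assumed layout: ip ident user [time] "METHOD path proto" status size "referer" "agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def head_requests_by_agent(lines):
    """Count HEAD requests per User-Agent string; unparseable lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("method") == "HEAD":
            counts[match.group("agent")] += 1
    return counts


# Hypothetical sample lines, not taken from any real log.
SAMPLE = [
    '203.0.113.7 - - [09/Jul/2014:19:20:01 +0000] "HEAD /admin/ HTTP/1.1" 404 - "-" '
    '"DirBuster-0.12 (http://www.owasp.org/index.php/Category:OWASP_DirBuster_Project)"',
    '203.0.113.7 - - [09/Jul/2014:19:20:02 +0000] "HEAD /backup/ HTTP/1.1" 404 - "-" '
    '"DirBuster-0.12 (http://www.owasp.org/index.php/Category:OWASP_DirBuster_Project)"',
    '198.51.100.4 - - [09/Jul/2014:19:21:10 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0"',
    '198.51.100.9 - - [09/Jul/2014:19:22:30 +0000] "HEAD /feed.xml HTTP/1.1" 200 - "-" '
    '"Wget/1.15"',
]

if __name__ == "__main__":
    for agent, count in head_requests_by_agent(SAMPLE).most_common():
        print(count, agent)
```

To run it against a real log, pass an open file object instead of `SAMPLE`.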
19

Everything Parthian said was spot on. HEAD requests are like a 'short' GET request that avoids the extra network traffic and potentially the rendering overhead of a GET request.

There are a variety of reasons you, your browser, or your search engine may want to do a HEAD request. Some websites may just be pulling meta information from you, and the smaller response is to your benefit. More likely, your browser or a search engine is using HEAD requests to see if its cached versions of your pages are still up to date.

The response's Date and Expires headers should be used by clients that cache your page to determine when they should next visit your site for an update. The response headers may also include a Last-Modified date, which can likewise be used to tell whether the cached copy of your page needs to be updated.
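
The cache check described above can be sketched roughly as follows. This is a simplified illustration, not how any particular browser or crawler works: it compares only ETag and Last-Modified and ignores Cache-Control and Expires. The function names are hypothetical.

```python
from email.utils import parsedate_to_datetime
from urllib.request import Request, urlopen


def head_headers(url):
    """Issue a HEAD request and return the response headers with lowercased names."""
    with urlopen(Request(url, method="HEAD"), timeout=10) as resp:
        return {name.lower(): value for name, value in resp.headers.items()}


def needs_refresh(cached, current):
    """Decide whether a cached copy looks stale, given two header dicts.

    Prefers ETag, falls back to Last-Modified, and refetches if neither is usable.
    """
    cached = {k.lower(): v for k, v in cached.items()}
    current = {k.lower(): v for k, v in current.items()}
    if "etag" in cached and "etag" in current:
        return cached["etag"] != current["etag"]
    if "last-modified" in cached and "last-modified" in current:
        old = parsedate_to_datetime(cached["last-modified"])
        new = parsedate_to_datetime(current["last-modified"])
        return new > old
    return True  # nothing to compare, so play it safe and download again
```

A client would store the headers from its last full download, call `head_headers(url)` later, and only issue a GET when `needs_refresh` returns `True`.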

trevorgrayson
0

NMAP Uses HEAD Requests in Script: 'http-security-headers'

A common technique in penetration testing is to confirm that HSTS is configured on an HTTPS server by using the Nmap command below, which sends HEAD rather than GET requests. Blocking HEAD requests can cause this test to falsely report that HSTS is not configured properly.

$ nmap -p 443 --script http-security-headers your.domain.com
Starting Nmap 7.80 ( https://nmap.org ) at 2022-05-16 22:33 UTC
Nmap scan report for your.domain.com (xxx.xxx.xxx.xxx)
Host is up (0.026s latency).
rDNS record for XXX

PORT    STATE SERVICE
443/tcp open  https
| http-security-headers:
|   Strict_Transport_Security:
|     HSTS not configured in HTTPS Server
|   Cookie:
|     Cookies are secured with Secure Flag in HTTPS Connection
|   Cache_Control:
|_    Header: Cache-Control: private
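
A similar check can be sketched in Python to see why a blocked HEAD leads to a misleading result. This is not Nmap's logic; the function names, the report wording, and the list of status codes treated as "HEAD refused" are assumptions for illustration.

```python
import http.client


def fetch_headers(host, method="HEAD", port=443):
    """Return (status, headers) for a request to / over HTTPS, header names lowercased."""
    conn = http.client.HTTPSConnection(host, port, timeout=10)
    try:
        conn.request(method, "/")
        resp = conn.getresponse()
        resp.read()  # drain any body so the connection closes cleanly
        return resp.status, {name.lower(): value for name, value in resp.getheaders()}
    finally:
        conn.close()


def hsts_report(status, headers):
    """Summarise the HSTS state, flagging responses that suggest HEAD was refused."""
    value = headers.get("strict-transport-security")
    if value:
        return f"HSTS configured: {value}"
    if status in (403, 405, 501):  # assumed set of codes a server may use to reject HEAD
        return f"inconclusive: server answered {status}, retry with GET"
    return "HSTS header not present"
```

For example, `hsts_report(*fetch_headers("your.domain.com"))` reports on the HEAD response, and passing `method="GET"` repeats the check with GET when the first result is inconclusive.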
schroeder
-3

Yes.

In information security circles, the HEAD method, while admittedly useful in some situations, allows requests to bypass security constraints.

It should be disabled.

Nessus comments on the security issues with HEAD.

OWASP reports how it can be used to create new users on a system remotely.
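
One way such a bypass can arise is when access rules are written per method and HEAD is simply not listed. The sketch below is a hypothetical illustration of that pattern (often called HTTP verb tampering), not code from Nessus, OWASP, or any real framework. It also shows that the fix can be made in the access rule itself.

```python
# Hypothetical rule: authentication is only checked for the methods listed here.
PROTECTED_METHODS = {"GET", "POST"}


def allowed_flawed(method, authenticated):
    """Flawed check: any method missing from the list skips authentication."""
    if method in PROTECTED_METHODS:
        return authenticated
    return True


def allowed_fixed(method, authenticated):
    """Safer check: authentication is required regardless of the method."""
    return authenticated


if __name__ == "__main__":
    for method in ("GET", "HEAD"):
        print(method, allowed_flawed(method, False), allowed_fixed(method, False))
```

With the flawed rule an unauthenticated HEAD is let through while an unauthenticated GET is rejected. With the fixed rule both are rejected.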

schroeder
Rick
  • 2
    I might suggest that it's a vulnerability if you are vulnerable to that type of tampering. The articles suggest to disable it if you cannot confirm a secure state when allowing it or if you cannot fix it. Your advice is missing context. – schroeder Oct 19 '20 at 09:15
  • I might suggest that every exploit is successful against targets that are vulnerable to its specific type of tampering. – Rick Oct 26 '20 at 11:56
  • Which makes your broad answer of "yes" missing the required context ... none of your links suggest a blanket answer of "yes, in all contexts it should be disabled". – schroeder Oct 26 '20 at 12:35
-4

You block the HEAD request and you watch for increases in GET or HEAD requests from the scummy scrapers... THEN YOU BLOCK THEIR IPS. Funny thing is that their BOTS are so STUPID that they don't take the hint and keep coming back for more attempts. At which point you send them off on a 301 redirect to somewhere else (i.e. someone else's web site and bandwidth). Consider that 98% of all INTERNET search traffic comes from just Google (86% market share), with BING and YAHOO taking up the remaining 12%: there is ABSOLUTELY NO REASON TO ALLOW ALL THE SCUMMY HEAD testers to have any access. You don't need them! They are expendable. And there is no reason to encourage their traffic.

My site handles close to 1 million visitors per day... Of that 1 million, 500,000 are from pointless content scrapers and barely used search engines. By blocking those clowns I free up 50% of my bandwidth and server cycles to handle the LEGIT traffic. It's too bad that the clever campers behind Apache and Linus never gave us a NULL that we could send the useless traffic off to.

  • 1
    I take it you've never heard of [DaaS](https://devnull-as-a-service.com/). – forest May 18 '18 at 01:27
  • 4
    You still do not address if HEAD is ever useful. You are outlining a process for blocking some kinds of requests to your site but never if HEAD is useful. Are you aware that some legitimate client-side caching programs use HEAD? – schroeder May 18 '18 at 07:27
  • 1
    This reads like Holden Caulfield got a job in web development.. – Alkanshel Apr 10 '19 at 00:26