Most other methods mentioned here are easily defeated and/or significantly reduce performance.
That is why web application firewalls (WAFs) don't rely on them.
Two reliable methods are:
- Analyze the Identification (IP ID) field in the IP header: most TCP/IP stacks increment this counter sequentially with each packet they send, so if the IDs arriving from one public IP jump around instead of incrementing, that address is probably a NAT gateway (a shared Internet connection with other clients). There is some RAM overhead to remember the previous packet ID per source, but it is comparable to storing a session cookie. This is harder to spoof, since the attacker must rewrite packets at a low level, but not impossible, so it should be combined with rate limiting: allow NATted public IPs more hits/second. (A sketch of this heuristic follows the list.)
- Session cookies with cryptographic authentication:
Normal session cookies are not enough. Over plain HTTP they travel in cleartext, and even with HTTPS the cookie sits unprotected on the client, where nothing stops an attacker from tampering with its value; the server cannot tell a forged cookie from a genuine one unless the cookie is cryptographically authenticated. http://blog.teamtreehouse.com/how-to-create-totally-secure-cookies Do it properly: make sure the cookie is not just encrypted, but also authenticated ("signed"). http://spring.io/blog/2014/01/20/exploiting-encrypted-cookies-for-fun-and-profit If you only have access to the web app code, this is the better option. (A minimal signing sketch follows this list.)
If cookies are disabled, run a client-side JavaScript probe that distinguishes browsers from bots (query the screen resolution and similar properties a bot would not support, and report bot status if the script fails).
Combine this with rate limiting keyed by the authenticated session cookie or, when cookies are disabled, with IP-based rate limiting ("throttling") or blocking. Accepting or rejecting the request early in the IP/HTTP parser's routine conserves future CPU cycles, since a brute-force attacker may retry thousands of times. Because of DHCP/PPPoE address reassignment and fresh session cookies, bans should expire after a few minutes. Also keep in mind that you should whitelist the non-spoofed IPs of search engine crawlers like Google and Baidu; otherwise your search rankings may suffer.
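To illustrate the first method, here is a minimal sketch assuming Scapy is installed (pip install scapy) and you can sniff as root; the 500 threshold is a made-up starting point, not a recommendation, so tune it against your own traffic:

    # Hypothetical sketch: flag source IPs whose IP ID values are not roughly
    # sequential, which (per the heuristic above) suggests NAT or randomization.
    from scapy.all import IP, sniff

    last_id = {}           # source IP -> last observed IP ID
    suspected_nat = set()  # sources that should get a higher rate limit

    def inspect(pkt):
        if IP not in pkt:
            return
        src, ip_id = pkt[IP].src, pkt[IP].id
        prev = last_id.get(src)
        last_id[src] = ip_id
        if prev is None:
            return
        # A sequential stack increments the 16-bit ID by small steps (mod 65536).
        delta = (ip_id - prev) % 65536
        if delta > 500:  # assumed threshold; tune for your traffic
            suspected_nat.add(src)  # allow this IP more hits/second, don't ban it

    sniff(filter="tcp port 80", prn=inspect, store=False)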
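And for the second method, a minimal sketch of an authenticated ("signed") session cookie using only Python's standard library; the cookie layout and key handling are assumptions for illustration, and a real framework's hardened implementation should be preferred:

    # Hypothetical sketch of a signed session cookie. Frameworks such as
    # Django, Flask, and Spring ship production-grade versions of this.
    import hashlib
    import hmac
    import secrets

    SECRET_KEY = secrets.token_bytes(32)  # assumption: kept server-side only

    def issue_cookie(session_id: str) -> str:
        """Return 'session_id.signature' suitable for a Set-Cookie value."""
        sig = hmac.new(SECRET_KEY, session_id.encode(), hashlib.sha256).hexdigest()
        return f"{session_id}.{sig}"

    def verify_cookie(cookie: str) -> str | None:
        """Return the session id if the signature checks out, else None."""
        try:
            session_id, sig = cookie.rsplit(".", 1)
        except ValueError:
            return None
        expected = hmac.new(SECRET_KEY, session_id.encode(), hashlib.sha256).hexdigest()
        # compare_digest avoids leaking the signature through timing differences
        return session_id if hmac.compare_digest(sig, expected) else None

A forged or tampered cookie fails verify_cookie(), so the server can reject it before doing any further work.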
If you combine your solution with rate limiting, the correct limit is roughly 1-2 times the number of URLs a single page load requests. For example, if login.php includes 2 CSS files, 4 JavaScript files, and 10 images plus the 1 page itself, one legitimate page load generates 17 hits, so you must allow at least 17 hits per client session cookie, per second. Otherwise you may be blocking or throttling normal requests. A token-bucket sketch of such a limiter follows.
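A minimal sketch, assuming a token bucket keyed by the verified session id (or the client IP when cookies are disabled); the capacity of 17 comes from the login.php example above and is otherwise arbitrary:

    # Hypothetical token-bucket limiter keyed by session id or client IP.
    # Capacity 17 matches the example above; tune it per page weight.
    import time

    class TokenBucket:
        def __init__(self, capacity: float = 17, refill_per_sec: float = 17):
            self.capacity = capacity
            self.refill_per_sec = refill_per_sec
            self.buckets = {}  # key -> (tokens, last_timestamp)

        def allow(self, key: str) -> bool:
            now = time.monotonic()
            tokens, last = self.buckets.get(key, (self.capacity, now))
            # Refill proportionally to elapsed time, capped at capacity.
            tokens = min(self.capacity, tokens + (now - last) * self.refill_per_sec)
            if tokens < 1:
                self.buckets[key] = (tokens, now)
                return False  # reject early, before heavier processing
            self.buckets[key] = (tokens - 1, now)
            return True

    limiter = TokenBucket()
    # e.g. key = verify_cookie(request_cookie) or the client IP
    if not limiter.allow("client-key"):
        pass  # respond 429 Too Many Requests and stop parsing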
For persistent attacks, you should ask your ISP to block/blackhole the route upstream.
Why not use the other solutions?
The 'User-Agent:' header is trivial to spoof:
    wget -O index.html --user-agent="My Fake Browser" http://example.com/
Session cookies, the 'X-Forwarded-For:' HTTP header, and other headers are also trivial to steal or spoof. Google 'Firesheep', 'Cookies Manager+', 'Modify Headers plugin', or 'LiveHeaders plugin' for proof.
Rate limiting alone is not enough either, because a stealthy attacker will randomize or increase the wait time between requests:
    wget --wait=10 --random-wait http://example.com/index.php
Brute force is usually not your only problem (https://www.owasp.org/index.php/Top_10_2013-Top_10), and coding and testing effective protection takes time, too. To save that time and the CPU cycles on your web servers (the waste is multiplied if you have a server farm), your web host should offer a front-end WAF with all of this configured for you. That is the best solution: don't do it server-side, do it upstream.