
I used to believe that captcha-like solutions are websites' primary way to combat spambots. I used to believe that any site that doesn't implement such a solution is likely to be flooded with ads for shady weight-loss pills.

But this is not the case. There is a counterexample. The Pokemon Showdown website allows unregistered users, who have never attempted to pass a captcha, to talk freely in its Help chatroom as long as they choose a nickname. In my experience, this chatroom is not flooded with ads for shady weight-loss pills.

According to people familiar with the site:

Most ways for bots to connect to PS are blocked automatically

(source)

Ah. So, if I understand correctly, bots do try to spam, but they are blocked.

OK then. What are other, non-captcha ways to block bots?

Note that I bring up the PS example only to back my point that this is possible. I'm asking how it can be done in general, not how PS specifically does it, so please don't close this question as off-topic and answerable only by PS staff.

gaazkam
  • Search for "captcha alternatives" and you'll find many different suggestions for combating bots. You'll also find that there is no perfect defense if your site is of high-enough value (like a ticket sales site) because scalpers will use services like Mechanical Turk to hire lots of real people (aka "smurfs") to pass whatever tests you throw at them. These smurfs quickly corner the market on high priced, limited availability, high demand items, making the scalpers a tidy profit even after paying the smurfs. – John Deters May 17 '18 at 22:18

1 Answer


The primary method of identifying a bot is behavior-based. Fail2ban is an example of behavior-based software: it blacklists an IP for bad behavior, in this case brute-forcing SSH logins. Other telltale behaviors include excessive commenting, posting obvious XSS payloads in comments, and making many directory/URL requests within a short period (e.g. running DirBuster). Usually such behavior is identified by an algorithm or tool that detects when an IP/user/role is doing something far faster than a human could, whether posting comments or scanning IPs.
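The "faster than a human could" check above usually boils down to rate limiting per IP. Here is a minimal sketch of a sliding-window rate limiter in Python; the class name, limits, and API are my own illustration, not anything from fail2ban:

```python
import time
from collections import defaultdict, deque


class RateLimiter:
    """Flags an IP as bot-like when it performs more than `limit`
    actions (comments, page requests, login attempts...) within
    `window` seconds -- a pace no human plausibly sustains."""

    def __init__(self, limit=10, window=60.0):
        self.limit = limit
        self.window = window
        self.events = defaultdict(deque)  # ip -> timestamps of recent actions

    def allow(self, ip, now=None):
        now = time.monotonic() if now is None else now
        q = self.events[ip]
        # Drop timestamps that have fallen out of the sliding window.
        while q and now - q[0] > self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False  # too many actions in the window: treat as a bot
        q.append(now)
        return True
```

A real deployment would couple this with a ban list (as fail2ban does), so that an IP that trips the limit repeatedly is blocked at the firewall rather than merely throttled.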

Moreover, many firewalls, bot filters and spam filters ship with massive lists of domains and IPs already known for bad behavior. These RBLs and similar blocklists are updated constantly, and better-known operators such as Spamhaus work to keep compromised IPs at bay.
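Checking an RBL is just a DNS query: you reverse the octets of the client's IPv4 address and look that name up under the blocklist zone; an answer means the IP is listed, NXDOMAIN means it is not. A sketch of building the query name (the zone name here is Spamhaus's public zen zone; the live lookup itself is left out so this stays self-contained):

```python
import ipaddress


def dnsbl_query_name(ip, zone="zen.spamhaus.org"):
    """Build the DNS name to query against a blocklist zone:
    the IPv4 octets in reverse order, prefixed to the zone.
    e.g. 203.0.113.7 -> 7.113.0.203.zen.spamhaus.org

    To actually test an IP you would resolve this name, e.g. with
    socket.gethostbyname(name); an A record answer (127.0.0.x)
    means "listed", a resolution failure means "not listed".
    """
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # validates the IP too
    return ".".join(reversed(octets)) + "." + zone
```

Mail servers and comment filters perform exactly this lookup on every connecting IP before accepting traffic.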

Also noteworthy is that the vast majority of bots come from certain geographic IP blocks. I can expect, with high regularity, daily scans on my work network from Eastern European, African and Asian IPs. This is where greylisting comes in handy: plenty of reputable people in those regions want to consume the content we serve, but the same regions also produce a lot of bots.
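Greylisting, in its classic SMTP form, temporarily rejects the first delivery attempt from an unknown source and accepts a retry after a delay. Legitimate mail servers retry; most spam bots fire once and move on. A minimal sketch of that logic (the class, the 300-second delay, and the return strings are illustrative assumptions, not any particular MTA's API):

```python
import time


class Greylist:
    """Temporarily rejects the first attempt from an unknown
    (ip, sender) pair; a retry after `delay` seconds is accepted.
    Real mail servers retry on a 4xx response; most bots do not."""

    def __init__(self, delay=300.0):
        self.delay = delay
        self.first_seen = {}  # (ip, sender) -> time of first attempt

    def check(self, ip, sender, now=None):
        now = time.monotonic() if now is None else now
        key = (ip, sender)
        if key not in self.first_seen:
            self.first_seen[key] = now
            return "TEMPFAIL"  # 4xx: "try again later"
        if now - self.first_seen[key] >= self.delay:
            return "ACCEPT"    # the sender retried like a real server
        return "TEMPFAIL"      # retried too soon; keep waiting
```

The appeal for the geographic case above is that nobody legitimate is permanently blocked; well-behaved clients just pay a one-time delay.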

SomeGuy