2

I know some websites say in their terms of service that the website can only be accessed by humans and not by bots or scripts. How do they tell who is accessing it? Especially with tools such as AutoScript that just record and replay mouse movements and keyboard input, wouldn't the activity appear fairly natural? This is assuming there are no additional security features like CAPTCHAs.

I'm guessing that if the log files show a user repeating the exact same actions with the exact same timing, that would be a red flag. But wouldn't this be hard to check for?
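
For what it's worth, a check along those lines doesn't seem that hard in principle. Below is a rough sketch in Node.js, assuming a made-up tab-separated log format ("timestamp, session ID, action"), that flags sessions whose inter-request timing is almost perfectly regular:

```javascript
// Hypothetical sketch: flag sessions whose inter-request timing is suspiciously uniform.
// The log format "timestamp<TAB>sessionId<TAB>action" is an assumption, not any real server's format.
const fs = require('fs');

const bySession = {};
for (const line of fs.readFileSync('access.log', 'utf8').split('\n').filter(Boolean)) {
  const [ts, session] = line.split('\t');
  (bySession[session] ||= []).push(Number(ts));
}

for (const [session, times] of Object.entries(bySession)) {
  if (times.length < 5) continue; // too few events to judge
  const gaps = times.slice(1).map((t, i) => t - times[i]);
  const mean = gaps.reduce((a, b) => a + b, 0) / gaps.length;
  const variance = gaps.reduce((a, b) => a + (b - mean) ** 2, 0) / gaps.length;
  // Human timing is noisy; near-zero variance across many requests suggests replayed input.
  if (variance < 1) console.log(`Suspiciously regular timing for session ${session}`);
}
```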

Celeritas
  • 10,039
  • 22
  • 77
  • 144
  • Usually user profiling is done based on the kind of interactions they have with a website. The interactions can be the kind of links clicked, the time spent on the site, etc. That could be one way to find out. But if you develop a system that simply mimics your actions, then it is difficult – Limit Jan 09 '17 at 00:20
  • 5
    Stating such a restriction in the ToS doesn't necessarily mean that the site employs extensive technical measures to ban bot activity, though there may be simple mechanisms in place such as rate limiting. The benefit of a blanket ban is that it gives the site administrators explicit leeway to take action against any bot activity that negatively impacts the site. – tlng05 Jan 09 '17 at 01:23

3 Answers

2

The most common method is probably a hidden form field (a "honeypot"): bots tend to fill these in, but humans don't.
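
A minimal sketch of such a honeypot check, assuming an Express app and an illustrative hidden field named `website` on a `/contact` form (those names are made up for the example):

```javascript
// Honeypot sketch. The form would include a field real users never see, e.g.
//   <input type="text" name="website" style="display:none" tabindex="-1" autocomplete="off">
// Field name "website" and the "/contact" route are illustrative assumptions.
const express = require('express');
const app = express();
app.use(express.urlencoded({ extended: false }));

app.post('/contact', (req, res) => {
  if (req.body.website) {
    // The hidden field was filled in -- almost certainly an automated submission.
    return res.status(400).send('Rejected');
  }
  // ...handle the genuine submission here...
  res.send('Thanks!');
});

app.listen(3000);
```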

Using a CAPTCHA is also common, of course.

Then some people use JavaScript to detect page events that typically only happen in a real browser, such as a click or focus event in a form field.
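
A rough sketch of that idea, assuming a form with id `signup` and a hidden `human_token` field (both names are illustrative): if no real interaction event ever fires, the token stays empty and the server can treat the submission as suspect.

```javascript
// Event-based detection sketch (assumed markup: <form id="signup"> containing
// <input type="hidden" name="human_token">). Names are illustrative.
const form = document.getElementById('signup');
const token = form.querySelector('input[name="human_token"]');

['focusin', 'click', 'keydown'].forEach((type) =>
  form.addEventListener(type, () => {
    // In practice this would be a server-issued nonce rather than a constant.
    token.value = 'interacted';
  }, { once: true })
);
```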

It might also be possible to get clever with timing in the browser, again using JavaScript. Humans will type at a much slower pace than bots, for example. They will also take much longer to fill in and submit a form than a bot will.
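
Something along these lines could work, reusing the same illustrative `signup` form plus a hidden `timing` field; the thresholds a server would apply to the reported numbers are a judgment call and aren't shown here.

```javascript
// Timing sketch: record time-to-submit and keystroke gaps, then send them with
// the form so the server can flag implausibly fast, machine-like submissions.
const loadedAt = Date.now();
let lastKey = 0;
const gaps = [];

document.addEventListener('keydown', () => {
  const now = Date.now();
  if (lastKey) gaps.push(now - lastKey);
  lastKey = now;
});

document.getElementById('signup').addEventListener('submit', (event) => {
  const fillTime = Date.now() - loadedAt;
  const avgGap = gaps.length ? gaps.reduce((a, b) => a + b, 0) / gaps.length : 0;
  // A form completed in well under a second, or keystrokes with near-zero,
  // perfectly uniform gaps, points to a script rather than a person typing.
  event.target.elements['timing'].value = JSON.stringify({ fillTime, avgGap });
});
```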

Julian Knight
  • 7,092
  • 17
  • 23
  • Chrome's browser auto-fill will also fill in hidden fields. – mootmoot Jan 09 '17 at 11:21
  • @mootmoot - ??? - only if you have set it. That's the point, a human won't fill it in, a bot will (typically). So autofill wouldn't kick in. – Julian Knight Jan 12 '17 at 14:54
  • One can still write custom behavior with tools like Selenium. – mootmoot Jan 12 '17 at 15:31
  • 1
    I didn't say the method is perfect, did I? But it is effective and cheap. Anyone could get round it if they wanted to, but the effort isn't worth it. Bots work by scanning millions of forms automatically, not by someone hand-crafting Selenium jobs. – Julian Knight Jan 13 '17 at 15:09
1
  • To detect bots, site owners can deploy a number of mechanisms, such as: timing of requests, content of the requests (the values of the fields), the User-Agent string, etc.
  • To protect against bots, a CAPTCHA (which stands for Completely Automated Public Turing test to tell Computers and Humans Apart) is a relatively good measure, as is request-rate control (see the sketch below).
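
A minimal sketch of two of those server-side checks, assuming an Express app; the User-Agent patterns, time window, and request threshold are illustrative choices, not recommendations:

```javascript
// Crude server-side checks: naive User-Agent filtering plus per-IP rate limiting.
const express = require('express');
const app = express();

const hits = new Map(); // ip -> timestamps of recent requests

app.use((req, res, next) => {
  // Many naive scripts announce themselves in the User-Agent string.
  const ua = req.get('User-Agent') || '';
  if (/curl|python-requests|wget/i.test(ua)) {
    return res.status(403).send('Automated clients are not allowed');
  }

  // Rate limit: more than 30 requests from one IP within 10 seconds gets rejected.
  const now = Date.now();
  const recent = (hits.get(req.ip) || []).filter((t) => now - t < 10_000);
  recent.push(now);
  hits.set(req.ip, recent);
  if (recent.length > 30) {
    return res.status(429).send('Too many requests');
  }
  next();
});

app.listen(3000);
```
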
Opaida
  • 323
  • 1
  • 3
0

These are rather vague terms. They can mean anything:

  1. There is no robots.txt for web-portal search engines to crawl.
  2. They may block repeated web requests from an IP that appears to be a web scraper.
  3. The landing page's JavaScript may track user activity, such as mouse movement (see the sketch below).
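
As an illustration of point 3, a landing page might do something like the following; the `/telemetry` endpoint and the 5-second reporting interval are assumptions for the sketch:

```javascript
// Mouse-activity tracking sketch. A session that never moves the mouse yet
// still navigates and submits forms looks more like a script than a person.
let moves = 0;
document.addEventListener('mousemove', () => { moves += 1; });

setInterval(() => {
  // sendBeacon posts the sample without blocking the page.
  navigator.sendBeacon('/telemetry', JSON.stringify({ moves, ts: Date.now() }));
  moves = 0;
}, 5000);
```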

There is no guaranteed technology for detecting automation scripts, since there is always a way around the various bot-detection mechanisms.

The ToS is there just in case:

  1. The website is attacked by a DDoS, and they need to block particular clients' requests.
  2. They need to block possible content-scraping activities, e.g. content mills, price-comparison scraping bots.
mootmoot
  • 2,387
  • 10
  • 16