42

I noticed that Google's "I am not a robot" reCAPTCHA forces me to select the correct images on my computer. I installed a virtual machine and tried there: same thing. I used a proxy: same thing again. Then I used another computer on the same network (same public IP), and this time reCAPTCHA didn't force me to solve anything. It just checks itself when I click it.

Very curious behaviour. I repeated the process a couple of times with a few days in between: some computers never need to solve the reCAPTCHA, while others (including brand new virtual machines behind a proxy) do. I even tried a new browser in a fresh VM. I'm on a home network, not an enterprise network. What makes reCAPTCHA think it needs to double-check me even when I'm using a new virtual machine behind a proxy?

On computers where it isn't suspicious, I can delete all cookies, history and caches, visit a website, and reCAPTCHA just lets me through. So it can't be based solely on my past activity. On the other hand, if I do solve the reCAPTCHA and register for an account on a website, the account is missing all the functionality for registered users.


Also, whenever I'm presented with a CAPTCHA, even on brand new VMs, the functionality available to registered users is limited. This leads me to think that reCAPTCHA passes its assessment of a specific user on to the website owner. Is this documented behaviour?

D.W.
sanjihan
  • 14
    I'm pretty sure what triggers the image solving is "secret", probably including analyzing the headers and public IP address server-side and possibly some client-side detection (in the minified JavaScript). Things like operating system, browser, and being in an authenticated Chrome/Google session seem to play a factor. – Alexander O'Mara May 29 '16 at 18:31
  • @sanjihan Logged into a Google account. Possibly things like how long the account has existed, how frequently it's used, and whether it has payment methods associated with it matter. – SomeoneSomewhereSupportsMonica May 29 '16 at 19:21
  • 3
    I noticed the same. Two different Macbooks, both Chrome with session logged in, same network. In one of them I need to choose the relevant pictures but in the other I just need to click on "I'm not a robot" – The Illusive Man May 29 '16 at 19:46

2 Answers

47

Google tries to figure out whether you are a bot. If it's in doubt, it serves you a CAPTCHA to check. Exactly how this is done is part of Google's secret sauce, and I don't think they will tell you. But here are some ingredients that I guess they mix together:

  • Your IP: Has it been identified as a bot already? Is it a Tor exit node?
  • The resources you load: A simple bot does not load styles or images, since it does not need them. That is a telltale sign that someone is not human (or, as JDługosz points out in comments, blind).
  • Sign in: Are you signed in to a Google account? Does that account appear to belong to a real person?
  • Your behaviour: A human scrolls down the page, moves the mouse around, and takes some time between pushing down the mouse button and releasing it. A human does not click the dead center of the check box every time. All this could be mimicked by a good bot, but it is not easy.
  • Your history: Google knows a lot of your browsing history. Bots usually don't have a browsing history.
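To make the idea of "mixing ingredients" concrete, here is a minimal sketch of how signals like these could be combined into a single suspicion score. Everything in it is an assumption made up for illustration: the `Signals` fields, the weights, and the threshold are hypothetical and say nothing about how reCAPTCHA actually works.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Hypothetical per-visitor signals, mirroring the list above."""
    ip_flagged: bool       # IP already seen behaving like a bot, or a Tor exit node
    loaded_assets: bool    # client fetched styles and images
    signed_in: bool        # signed in to an account that looks like a real person's
    humanlike_input: bool  # scrolling, mouse movement and click timing look human
    has_history: bool      # some prior browsing history is known


def suspicion(s: Signals) -> int:
    """Add up invented integer weights; a higher total means more bot-like."""
    score = 0
    if s.ip_flagged:
        score += 4
    if not s.loaded_assets:
        score += 2
    if not s.signed_in:
        score += 1
    if not s.humanlike_input:
        score += 2
    if not s.has_history:
        score += 1
    return score


def needs_challenge(s: Signals, threshold: int = 3) -> bool:
    """Serve the image challenge once the score reaches the (made-up) threshold."""
    return suspicion(s) >= threshold


if __name__ == "__main__":
    visitor = Signals(ip_flagged=True, loaded_assets=True, signed_in=False,
                      humanlike_input=True, has_history=False)
    print(suspicion(visitor), needs_challenge(visitor))
```

A real system would more likely learn such a decision from data than use fixed weights, which would also explain why the outcome is hard to predict from the outside.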

Figuring out exactly why you need to solve the CAPTCHA sometimes, but not others, is not easy. I could imagine that a fresh virtual machine has a browser fingerprint (installed fonts, plugins, etc.) that is very common and therefore fishy enough for Google to flag you for a CAPTCHA. If you are behind a proxy, perhaps others have used it for illegitimate activities as well.
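As a rough illustration of the fingerprint idea, the sketch below hashes a set of browser attributes into one identifier and measures what share of previously seen clients had the same identifier. The attribute names and the notion of a `seen` counter are assumptions for the example, not a description of what Google collects or stores.

```python
import hashlib
from collections import Counter


def fingerprint(attrs: dict) -> str:
    """Hash browser attributes into a stable identifier.

    Keys are sorted so the same attributes always give the same hash,
    regardless of the order in which they were collected.
    """
    canonical = "\n".join(f"{key}={attrs[key]}" for key in sorted(attrs))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def share_of_traffic(fp: str, seen: Counter) -> float:
    """Fraction of previously seen clients that had this fingerprint."""
    total = sum(seen.values())
    return seen[fp] / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical attributes of a freshly installed VM with default settings.
    fresh_vm = fingerprint({
        "user_agent": "ExampleBrowser/1.0",
        "fonts": "Arial,Times New Roman",
        "plugins": "",
    })
    seen = Counter({fresh_vm: 900, "some-other-fingerprint": 100})
    print(share_of_traffic(fresh_vm, seen))
```

Under this assumption, a default install stands out precisely because many automated clients look identical to it.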

That you don't get a CAPTCHA when you clean your cookies is surprising. I don't understand why - then Google knows very little about you and should assume you need a CAPTCHA to be on the safe side. Perhaps they do some advanced browser fingerprinting so they still know who you are?

Do note that all of this is speculation. If you want more speculation, have a look at How does new Google reCAPTCHA work?.

Anders
  • 4
    I also suspect that Google, at least sometimes, uses the content of your searches to make a decision. For example, multiple searches for Google dorks, even at a relatively low speed, might trigger a CAPTCHA. – A. Darwin May 29 '16 at 19:25
  • It probably also checks the plugins or the fonts you have installed. – Ángel May 29 '16 at 22:10
  • "Not loading images" is really a telltale sign of the use of a nonstandard browser. Given the domain of "nonstandard browser", quite a high population of that is undesirable clients. – Riking May 30 '16 at 04:35
  • "... moves the mouse around, takes some time between pushing down the mouse button and releasing it ..." - touch screens? I guess they could use touchstart/touchend, but what if the browser doesn't provide these / the OS eats these? – John Dvorak May 30 '16 at 05:32
  • About fingerprinting fonts and plugins. I have never come across the need to install a plugin for my browser beyond the default ones. Same goes for the fonts. – sanjihan May 30 '16 at 07:31
  • 1
    "That you don't get a CAPTCHA when you clean your cookies is surprising." This indeed, when I implemented reCaptcha not too long ago it would *always* trigger in incognito mode the first time after reopening the window (effectively deleting cookies and stuff). – David Mulder May 30 '16 at 10:49
  • *"That you don't get a CAPTCHA when you clean your cookies is surprising."* I believe the recent activity history of your public IP address is the most critical component. If you clean your cookies and then immediately hit Google, they probably have a reasonably good idea who you are, unless you're behind a proxy server. On the other hand, if you've been using the address for less-than-legitimate purposes, they might be a little more suspicious and trigger a CAPTCHA in those circumstances. – Jules May 30 '16 at 13:42
  • 3
    The interesting part is that this bot detection is most likely implemented by a machine learning algorithm. Google would know which inputs the algorithm computes over, but it's very likely that Google themselves don't know exactly how the algorithm reaches its decision. – Alexander May 30 '16 at 16:22
  • Google could also use the local data storage for its detection, or magic cached pages with certain etags – Ferrybig Jun 02 '16 at 05:56
13

I read somewhere that reCAPTCHA uses the movement of the mouse (only within its own area) to determine whether you are a bot. Try this:

  • Use Mouse Keys on your computer (on Windows, enable it with Left Alt + Left Shift + Num Lock) to move the mouse straight up.

This should trigger the image selection test.
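To show why a perfectly straight pointer path might stand out, here is a minimal sketch that checks whether a series of sampled cursor positions all lie on one line. It is only a guess at the kind of check that could be involved; the function name, the sample format, and the `tolerance` parameter are hypothetical.

```python
def is_perfectly_straight(points, tolerance=0.0):
    """Return True if every sampled (x, y) position lies on one straight line.

    `tolerance` bounds the cross product of each point against the
    start-to-end vector, so 0.0 means exact collinearity.
    """
    if len(points) < 3:
        return False  # too few samples to say anything about the path

    (x0, y0), (x1, y1) = points[0], points[-1]
    dx, dy = x1 - x0, y1 - y0

    if dx == 0 and dy == 0:
        # Start and end coincide: only "straight" if the cursor never moved.
        return all(p == points[0] for p in points)

    for x, y in points[1:-1]:
        # Cross product is zero exactly when the point is on the line.
        cross = dx * (y - y0) - dy * (x - x0)
        if abs(cross) > tolerance:
            return False
    return True


if __name__ == "__main__":
    mouse_keys_path = [(100, 300), (100, 250), (100, 200), (100, 150)]
    hand_moved_path = [(100, 300), (103, 251), (99, 198), (101, 150)]
    print(is_perfectly_straight(mouse_keys_path))
    print(is_perfectly_straight(hand_moved_path))
```

A hand-moved mouse almost always wobbles by a pixel or two, so a path with no deviation at all is a cheap hint that the input was generated by the keyboard or a script.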

Rory Alsop
Guest A