10

I noticed recently that ReCAPTCHA is using house numbers and street numbers as images for humans to decode.

315

586

How ethical is it for Google to do this? Does it hamper the privacy of an individual?

TechCrunch reported on this back in 2012, and Google issued a statement:

We’re currently running an experiment in which characters from Street View images are appearing in CAPTCHAs. We often extract data such as street names and traffic signs from Street View imagery to improve Google Maps with useful information like business addresses and locations. Based on the data and results of these reCaptcha tests, we’ll determine if using imagery might also be an effective way to further refine our tools for fighting machine and bot-related abuse online.

kinokijuf
Vineet Menon
  • Do you have any thoughts on how this would hamper someone's privacy? A number in isolation is just a number. – paj28 Jun 12 '14 at 09:59
  • The number is in isolation only for us as a user, Google will have everything in place. IDK, it would be like Google is playing NSA. – Vineet Menon Jun 12 '14 at 10:14
  • Are your concerns specific to recaptcha, or are you asking "is it ethical to decode house addresses"? – paj28 Jun 12 '14 at 10:20
  • Okay, so let me rephrase my concerns. Google knows the exact location (coordinates) of the home. Google can get the details of the name-plates and house number from reCAPTCHA. Either of them in isolation is not so useful, but Google will know both. – Vineet Menon Jun 12 '14 at 12:20
  • Google could get all this without reCAPTCHA by having their own employees (or software) decode the numbers. Which is it that you're asking about: Google knowing coordinates + house number, or Google using reCAPTCHA to decode the number? – paj28 Jun 12 '14 at 12:49
  • To me, this CAPTCHA seems self defeating. They use a program to decode streetview images and make CAPTCHAs, but if their program can decode them, so can another program? – Cruncher Jun 12 '14 at 13:29
  • @Cruncher Their program does not decode the numbers. ReCaptcha works by showing the user two images, only one of which is known to the system. The user is not aware of which image is the known control and which is the genuinely unknown, so has to solve both. If the control image is solved correctly, it is assumed that the submitted solution for the unknown one is correct too, so the system updates its database with this value. This is just a very basic description of the ReCaptcha system, for more information look here: http://www.google.com/recaptcha/intro/index.html – zovits Jun 12 '14 at 14:10
  • @zovits that's interesting! Thanks for that. I'm sure it has some false captchas too though. Do they potentially only consider it known after several people have given it the same value? – Cruncher Jun 12 '14 at 14:12
  • @Cruncher AFAIK it doesn't decide solely based on the submitted text. Instead it analyzes a broad range of possible clues, including but not limited to typing speed, mouse movements, cookies, known history of the submitting IP address, reported OS and browser, etc. I'd say it is quite a sure bet that Google stores possible captcha solutions as probability-based values that increase with successful solvings. They might employ further logic too, like checking the solutions for the geographically adjacent images. – zovits Jun 12 '14 at 14:26
  • If it's ethical to decode it with your brain (i.e. read it yourself) then it logically can't be unethical to do it with a computer. The only difference is the tool you're using. – flarn2006 Jun 15 '18 at 18:05
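
To make the control/unknown pairing from zovits's comment concrete, here is a minimal sketch in Python. It also includes the "several people agree" idea Cruncher asks about. Everything in it is an assumption for illustration: the class name `PairedCaptcha`, the `min_votes` threshold, and the image ids are hypothetical, and this is not Google's actual implementation (which, per the comments above, likely weighs many other signals too).

```python
from collections import Counter, defaultdict


class PairedCaptcha:
    """Toy model of a two-image challenge: one known control, one unknown."""

    def __init__(self, known_answers, min_votes=3):
        # known_answers maps image id -> verified text (the control set).
        self.known = dict(known_answers)
        # votes maps unknown image id -> tally of texts users submitted.
        self.votes = defaultdict(Counter)
        # min_votes is an assumed consensus threshold, not a documented value.
        self.min_votes = min_votes

    def submit(self, control_id, control_text, unknown_id, unknown_text):
        """Return True if the user passes; record a vote for the unknown image."""
        if control_text.strip() != self.known[control_id]:
            # Control failed, so the answer for the unknown image is discarded.
            return False
        self.votes[unknown_id][unknown_text.strip()] += 1
        self._maybe_promote(unknown_id)
        return True

    def _maybe_promote(self, unknown_id):
        """Treat an unknown image as known once enough users agree on its text."""
        text, count = self.votes[unknown_id].most_common(1)[0]
        if count >= self.min_votes:
            self.known[unknown_id] = text
            del self.votes[unknown_id]


# Usage: two users agree on the unknown image, so it joins the control set.
captcha = PairedCaptcha({"img_a": "315"}, min_votes=2)
captcha.submit("img_a", "315", "img_b", "586")
captcha.submit("img_a", "315", "img_b", "586")
print(captcha.known.get("img_b"))
```

Note that the user is never told which image is the control, which is why both have to be solved.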

5 Answers

15

For this to be unethical there would need to be the potential for the information displayed to leak personal information. The numbers shown, while they have a small amount of background, do not show enough context for anyone to glean any extra information.

The privacy issue with the source of these images, i.e. Street View, is much more relevant, as it allows anyone in the world to look in your window. Of course, that's public anyway, since anyone who walks up your street can see the same thing, although the legality of it is still disputed in some places.

So no, it does not appear to violate anyone's privacy, and therefore is ethical.

GdD
  • I'd feel shocked if I went to enter a reCAPTCHA and saw a small picture of my house. – Cruncher Jun 12 '14 at 13:26
  • More shocked than seeing your car on your drive in street view @Cruncher? – GdD Jun 12 '14 at 13:47
  • In that case I go looking for it. This just randomly appears during regular browsing – Cruncher Jun 12 '14 at 13:49
  • I think the chances of that happening are pretty low, if it does happen take a screenshot and be a Facebook hit. – GdD Jun 12 '14 at 13:57
  • Well, I don't think they want to use 2-digit CAPTCHAs, so I doubt I'd see my house. – Cruncher Jun 12 '14 at 14:04
  • -1 because I _strongly_ disagree with your last sentence. Just because it does not violate anyone's privacy does not mean it is ethical, especially given the fact that Google is selling the machine learning results to the US military to assist in automated drone strikes in third world countries. – forest Jul 14 '19 at 10:22
4

I live at 103. Good luck finding me. There are only millions of streets to look through. Even if you knew I lived in a particular region, there would be thousands of possibilities to check. And that's assuming you know that it is me at the particular house pictured.

All that the images really give away is that somewhere a house with a particular number has a particular color and maybe a small portion of the style. You know nothing about who lives there or where it actually is. There is no privacy or security concern, as there is absolutely no way at all to make use of the information, and even if there were, it is nothing that driving past would not reveal.

AJ Henderson
1

Regarding privacy, the purpose of fixing a number to your house is to inform people of the number. Granted, not necessarily to inform Google in particular. I don't think there's anything inherently wrong with working out the number of a house from a picture of that house.

I think the ethical issue of Google using humans to identify these numbers, or for that matter OCR-ing them, is subsumed by the ethical issue of whether it was OK in the first place for Google to compile all these images that can be obtained by passers-by in public streets.

Now, I'm not going to rule on whether it's ethically OK for Google to aggregate millions of photos of people's houses. Some folks aren't happy about it. If Google agrees for whatever reason to remove your house from StreetView, then I'd hope that they also don't use it in StreetView-based CAPTCHAs.

There's maybe an entirely separate ethical issue, what effort it's OK to make CAPTCHA-completers do on your behalf, and to what productive purpose. One could imagine a Mechanical-Turk based CAPTCHA system that just assigns any old kind of HIT to people -- one that has been done before to verify they're human, and one that hasn't for your personal profit. This might well be regarded as unethical, or at least chiselling. So I suppose one could argue on that basis that it's not OK for Google to use CAPTCHAs in order to locate houses on StreetView for the benefit of their mapping products. reCAPTCHA also helps scan books, so the same ethical issue applies there. I think most people just figure that if you want to log into somebody's website, or use their CAPTCHA product, then they can ask you to do some small amount of work for them in return.

Steve Jessop
0

I don't see how it would violate someone's privacy unless the image can be used to isolate and identify an actual residential address. It's like picking a random four- or five-digit number and framing it with a background color. That's all it is to me. I look at it this way: if Google had not told us these were actual street addresses, would we even raise the possibility that they could be a privacy concern? Probably not. Although they are real, there are probably thousands of possible matches for that address, and even if we did isolate it, it's still public information available on Google Maps.

eof0100
0

It would be a concern if they were publishing identifiable information about a person who lived in the house. But it is no more information than would be available if I were to simply drive past the house. I'm not sure how this is an ethical issue.

Anthony Shaw