Why is breach-detection site "Have I Been Pwned" considered safe?

Question

Whether it be due to technology the site is using, or any manual behind-the-scenes work with the data, why does this breach detection site seem to be unquestioningly safe?

Wouldn't your data as a user (breached/pwned or not) be used against you if this tool is not secured properly (see examples below)? What is this website/project doing or using to prevent this?

If you go to this site, enter your information, you are at least providing the potential Boolean checkbox of: "Visitor [YourUsernameProvided] cares to check."

Is this not valuable data? If black hats out there have something like a list of 2,000,000, and they take/intercept the data provided by this site, couldn't they get a smaller list of 12,000? A curated list of targets that "care"? "Targets that care" can mean "Targets that have value". It can mean "Targets that are active". Perhaps even "Targets who are actually humans, not bots".

On another note, if you use the site to check several of your usernames, wouldn't you potentially be crafting a list of "All these usernames have been accessed from this location" and therefore be tying all your online personas together?

This all sounds like free work given to black hats. So, what technologies or methods are in place to prevent such a thing?

The tin-foil-hattery is strong with this one. Good question! — Mike Ounsworth, Jan 18 '19 at 01:56

score 8 · Accepted Answer · answered Jan 18 '19 at 10:09

This article by the service's creator may answer some of your questions.

https://www.troyhunt.com/here-are-all-the-reasons-i-dont-make-passwords-available-via-have-i-been-pwned/

Specific details that might be of interest:

Passwords are not stored along with user details because there is no such thing as "secure enough" storage for this kind of thing
Have I Been Pwned? won't tell people their own passwords anyway, even if the account ownership could be verified
Some more sensitive breaches (Ashley Madison being the first) are kept more discreet: the site discloses that an email is in the breach corpus only after confirming you control the address

Here's an additional article covering the Pwned Passwords feature:

https://www.troyhunt.com/ive-just-launched-pwned-passwords-version-2/#cloudflareprivacyandkanonymity

Of note, Pwned Passwords as a downloadable list provides only hashed passwords. There is some question as to whether this constitutes a password dictionary that can be exploited, but given that it doesn't associate the passwords with who used them or where, reversing them just wouldn't be that valuable. And while some may not consider this a satisfactory answer, these passwords are already out there.

The most recent "Collection #1" breach, with over 12,000 sources, is evidence enough that Have I Been Pwned is not the only one aggregating this type of information. And the competition does not have your best interest at heart.

Pwned Passwords as a lookup service uses k-anonymity to provide some safety. It works basically like this:

You hash the password with the same algorithm Pwned Passwords uses (SHA-1)
Submit just the first 5 characters of the hash, which, given the size of the database, will return many results for any given 5-character prefix
You search the returned list locally to see if any of the results match the rest of your hash from the first step
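
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the service's own client: the function names are hypothetical, the range endpoint URL and the "SUFFIX:COUNT" response format are my understanding of the public Pwned Passwords API, and both should be checked against the current documentation before you rely on them.

```python
import hashlib
import urllib.request

# Assumed endpoint of the public range API; verify against the current docs.
RANGE_URL = "https://api.pwnedpasswords.com/range/{prefix}"


def split_hash(password: str) -> tuple[str, str]:
    """Return (first 5 hex chars, remaining 35 hex chars) of the SHA-1 digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(suffix: str, body: str) -> int:
    """Search a response body of 'SUFFIX:COUNT' lines for our own suffix."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0  # suffix not listed for this prefix


def pwned_count(password: str) -> int:
    """Only the 5-character prefix is sent; the matching happens locally."""
    prefix, suffix = split_hash(password)
    request = urllib.request.Request(
        RANGE_URL.format(prefix=prefix),
        headers={"User-Agent": "k-anonymity-sketch"},  # hypothetical client name
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read().decode("utf-8")
    return count_in_range(suffix, body)
```

Calling pwned_count("password") would send only "5BAA6" over the network. Since there are 16^5 = 1,048,576 possible prefixes, a corpus of N hashes returns roughly N / 1,048,576 candidates per request, so the server cannot tell which of them (if any) you were checking.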

I can't see the future, so I don't know if this collection of information will ever become exploitable in any meaningful way, but as far as I can tell, Have I Been Pwned is a useful service offered for virtually no gain, in the interest of public safety.

score 3 · Answer 2 · answered Jan 18 '19 at 10:10

I'm not sure where you are getting the "unquestioningly safe" claim. They ask this question of themselves and provide clear explanations of what they claim they are trying to do to limit risks to the people involved. Believe them or not, but the question is actively raised.

Second, could users be profiled as being "those who care"? Sure. How is that a risk? How is that a higher risk than any check anyone does on the Internet for Internet safety? I'm not sure where you are getting "care == value" logic. It does not follow. There are so many other methods to get this information with so much more enrichment than just seeing what usernames are accessed.

Note that companies check on their own email addresses using automated methods, so I'm not sure that there would be clear value in gathering this usage data.

Third, remember that the data they are processing is already public. So blackhats do not get an easier tool.

schroeder

1) Definitely an observation, sorry! I've just been noticing that, for example, in Discord chats (programming-related ones primarily), the suggestion to go to this site and check for your safety is not just an instant reaction from seemingly all members, but is often posted in the official server's announcement area. Feels unquestioning to me! Perhaps I didn't look hard enough, but I didn't see a clear link/area to this self-questioning activity or discussion. – Nohbdy Ahtall Feb 01 '19 at 18:39
2) I still entrench my stance, [assert(care == value);]! I do think the idea of a boolean variable, which is attached to a hypothetical email/user profile, could exist and be utilized for adding efficiency to black(/any?) hats. A [bool User::CheckedBreachStatus = TRUE;] update per intercepted site-visit [&& (email || username)] check would set this variable. There may be a [vector targets] which could be a massive list, and narrowing that list down to "Those Who Checked" sounds like a huge efficiency boost. They may statistically determine those people tend to hold greater value. – Nohbdy Ahtall Feb 01 '19 at 18:52
3) Indeed, but the data I speak of is ongoing. The data-flow can stop by simply "not checking"(and/or not visiting the site, maybe not even Google searching it). A preferred method would be to scramble this data to some degree, but that would first require the consideration that this data matters. If the website creator/manager agrees with your stance - that it's not unsafe information to worry about - then I can safely assume there isn't protection in place! – Nohbdy Ahtall Feb 01 '19 at 18:55