41

Some services (for instance ProtonMail) claim to store hashes of phone numbers instead of the phone numbers themselves (though they don't say how they hash them). Now, given that the number of potentially valid phone numbers is very small (about 26 bits' worth of information in an 8-digit phone number), it should be quite easy to recover a phone number from its hash.
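
To make the concern concrete, here is a minimal sketch of the brute-force attack. It assumes the worst case of a fast, unsalted SHA-256 hash, which is an assumption for illustration only and not ProtonMail's actual scheme; the helper names `keyspace_bits` and `recover_number` are hypothetical.

```python
import hashlib
import math
from typing import Optional


def keyspace_bits(digits: int) -> float:
    """Bits of information in a uniformly random number with `digits` decimal digits."""
    return digits * math.log2(10)


def recover_number(target_hex: str, digits: int) -> Optional[str]:
    """Hash every `digits`-long number until one matches `target_hex`."""
    for n in range(10 ** digits):
        candidate = f"{n:0{digits}d}"  # zero-padded, e.g. "00042"
        # Assumption: plain unsalted SHA-256; a real service may use something slower.
        if hashlib.sha256(candidate.encode()).hexdigest() == target_hex:
            return candidate
    return None


if __name__ == "__main__":
    print(f"8 digits is about {keyspace_bits(8):.1f} bits")
    # Demo on a 5-digit space so it finishes quickly; 8 digits only needs 1000x more work.
    secret = "04217"
    digest = hashlib.sha256(secret.encode()).hexdigest()
    print(recover_number(digest, digits=5))
```

The loop visits at most 10^digits candidates, so the whole search is bounded by the size of the phone number space rather than by the strength of the hash function.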

So what's the point?

BlenderBender
  • 539
  • 1
  • 4
  • 7
  • 20
    did you take into account the likelihood of salt and pepper? – TheHidden Mar 15 '18 at 14:44
  • 7
    I think the hashing should be seen more as obfuscation in this context. Not irreversible, but still better than nothing. – Anders Mar 15 '18 at 15:06
  • 27
    What gives you the idea it's 8 digits? Phone numbers can have between 3 and 15 digits [source E.164](https://www.itu.int/ITU-T/recommendations/rec.aspx?rec=E.164), and that's even without the country code and international prefix ('+'). Having 18 digits would yield a lot more than 26 bits of information – LvB Mar 15 '18 at 15:44
  • 19
    @TheHidden All the salts and peppers in the world won't change the fact that a 26-bit key space is easily brute forced (in about 67mln hashes). Using a slow hash function helps against brute force; salts and peppers do not (and are not designed to). – marcelm Mar 15 '18 at 19:12
  • 1
    Other than preventing employees from taking a quick look at phone numbers, the purpose here is rather *marketing* than *security*. But you can’t really blame them, there is no easy solution to this problem. Given what they want to do, this is still (a little bit) better than storing in plaintext. – caw Mar 15 '18 at 23:08
  • The real question here is: do you trust ProtonMail with your phone number? There is no way to verify that they do as they say, and if your (free) life depends on it, you should just use a burner phone with a non-registered SIM. (And use a computer, Live CD, internet uplink and physical location that cannot be linked to you in any way.) – user258572 Mar 17 '18 at 09:35
  • 1
    Salts and peppers are just to prevent using pre-computed hashes ("rainbow tables"). They don't stop brute forcing *at all* and that's clearly what OP is referring to in asking "what's the point". – Kat Mar 28 '18 at 17:27

4 Answers

61

ProtonMail may request your phone number to perform a human check:

  • ProtonMail detects that you're attempting to create several accounts.
  • It asks you for a phone number so that it can send you a token via SMS.
  • You must send that token back to ProtonMail to prove that you own the phone number.

After that, ProtonMail no longer needs your phone number itself, but it still needs some record of it to prevent spammers from creating multiple accounts.

Hashing the phone number allows it to avoid storing the original number while still preventing someone from using the same number twice.

From their FAQ:

However, using the same phone number will result in obtaining the same cryptographic hash, so by comparing hashes, we can detect re-use of phone number or email addresses for human verification.

Since the same number must always produce the same hash, ProtonMail doesn't seem to use unique (per-entry) salts.

We also know thanks to a tweet from Bart Butler (ProtonMail CTO) that:

  • ProtonMail regularly flushes stored hashes.
  • Stored hashes aren't linked to any account.

Bart Butler also tweeted:

We use a slow password hash (With a salt) and flush the list and rotate the salt at irregular intervals.
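
The scheme described in these quotes could be sketched roughly as follows. This is only an illustration under stated assumptions: ProtonMail has not published its hash function or parameters, so PBKDF2-HMAC-SHA256, the iteration count, the 16-byte salt, and the `VerificationLog` class are all hypothetical stand-ins.

```python
import hashlib
import secrets


class VerificationLog:
    """Remembers which phone numbers were used recently without storing them."""

    def __init__(self, iterations: int = 100_000) -> None:
        self.iterations = iterations         # assumed cost parameter
        self.salt = secrets.token_bytes(16)  # one salt shared by all entries
        self.seen = set()                    # digests only, not linked to accounts

    def _digest(self, phone: str) -> bytes:
        # Assumption: PBKDF2 stands in for the unspecified "slow password hash".
        return hashlib.pbkdf2_hmac(
            "sha256", phone.encode(), self.salt, self.iterations
        )

    def register(self, phone: str) -> bool:
        """Return True if the number is new, False if it was already used."""
        digest = self._digest(phone)
        if digest in self.seen:
            return False
        self.seen.add(digest)
        return True

    def flush_and_rotate(self) -> None:
        """Drop every stored hash and switch to a fresh salt."""
        self.seen.clear()
        self.salt = secrets.token_bytes(16)
```

Because the salt is shared, the same number always maps to the same digest until the next rotation, which is what makes re-use detection possible. It also means that anyone holding both the salt and the stored digests can still enumerate the phone number space; the slow hash only raises the cost per guess.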

In conclusion: brute-forcing them is possible, but it's neither practical nor useful.

Benoit Esnard
  • 13,942
  • 7
  • 65
  • 65
  • This quote doesn't imply ProtonMail is *not* using any salt. However, your conclusion doesn't change if the salts were to be stored with the hashes. – Yuriko Mar 15 '18 at 15:06
  • 2
    @Yuriko: indeed, changed "salts" to "unique salts". – Benoit Esnard Mar 15 '18 at 15:12
  • 9
    I believe my comment still works with unique salts. (Unique) Salting would obviously deter bruteforcing attempts, but the search space is small enough to consider bruteforcing each entry. – Yuriko Mar 15 '18 at 15:18
  • I still don't see how this answers the question. What security property does their current system have, that it wouldn't have if they stored the unhashed phone numbers? I can't see a single one. – D.W. Mar 15 '18 at 21:20
  • 11
    I think you should remove your claim that brute-forcing them isn't practical, or provide a technical justification for that claim. Based on the description here, it seems likely to me that it *is* practical. If each hash takes 100ms to compute (which might be an overestimate), you can still enumerate a space of 2^{26} possibilities in 77 CPU-days, which seems practical to me, given that it is trivially parallelizable. That doesn't even take into account possible speedups from GPUs. – D.W. Mar 15 '18 at 21:22
  • 4
    @D.W. : Who is your adversary? A nation-state or a random ProtonMail employee who happens to have read access to the phone (hash) field? – Eric Towers Mar 15 '18 at 22:58
  • 2
    So in essence, the main point is that hashes are only stored temporarily. The secondary point is that they increase the cost of brute forcing by using a slow hash and salts, which hides phone numbers behind a sufficient amount of computational complexity so that a random ProtonMail admin can't simply see your phone number. – BlenderBender Mar 15 '18 at 23:32
  • 9
    @EricTowers, you pick! I can't think of any reasonable threat model where hashing adds much strength. Even a random employee can easily do 77 CPU-days of computation; that's probably a few hundred bucks on Amazon EC2 (or possibly free on their home machine if they have a nice GPU, maybe). Can you identify a non-trivial adversary (a reasonable threat model) and a security property that is achieved when ProtonMail hashes, but isn't achieved when they don't hash? Looks to me like hashing provides security only against very weak adversaries with little computation power. – D.W. Mar 15 '18 at 23:35
  • 7
    Seems to me that's exactly the point raised in the question, and I don't see how this answer is responsive to that point. Seems to me like any security that is gained is mostly attributable to periodically flushing data and not linking the data to accounts, rather than by hashing; hashing doesn't seem to add all that much. – D.W. Mar 15 '18 at 23:36
  • 12
    @D.W. the hash is still useful for social reasons. Putting up any barriers to viewing the real phone number will help __keep honest people honest__. Sure this hashing setup may not help much against a determined attacker, but it still removes immediate temptation for a random employee to abuse the system. Hashing (even a weak hash) draws a clear line in the sand for an employee for what is acceptable to view. – ryanyuyu Mar 16 '18 at 13:13
  • 1
    _I'm going to restate to see if I understand correctly:_ PM uses a single "salt" (more like pepper) for all the phone hashes. They do this to prevent a single phone number from making infinity accounts. They delete all the hashes and change the pepper on random intervals. Upon wipe, new accounts can be made with existing phone numbers, but PM is intentionally vague about when/how-often the wipe happens, so you can't sign up new accounts on a set schedule knowing that the list is gone. _How'd I do?_ – Michael Mar 16 '18 at 14:31
  • @ryanyuyu If your comment was an answer, I'd upvote it: the threat model here is "a tempted employee who stumbles on the data", whereas the OP assumes "a determined employee, with access to the pepper as well as the hashes, willing to invest time and energy reversing the hashes". – IMSoP Mar 16 '18 at 15:30
  • @ryanyuyu, that makes sense to me -- but it's a far cry from this answer, which claims that "brute-forcing is neither practical nor useful". Want to write an answer saying what you said? – D.W. Mar 16 '18 at 15:35
  • @forest, sorry, I don't follow you. (Did you notice that the answer claims they use a slow hash? It doesn't say how slow, so I'm making a wild-eyed guess that maybe it's 100ms per hash computation. That could be totally wrong; we'd need information from ProtonMail to know what they're actually doing. However, given that the hashes need to be computed on the client side, and some clients might be mobile clients, I would be surprised if the hash takes much longer than that to compute on a modern desktop.) – D.W. Mar 16 '18 at 15:38
  • @D.W. good idea. I've added such an answer now that I've had the time to actually write one out. – ryanyuyu Mar 16 '18 at 18:07
  • @D.W. I missed that it was a slow hash! My bad. – forest Mar 17 '18 at 00:27
  • @D.W. I don't see how the hashing would happen on client side. For one the provider *needs* the real number to send a SMS and even if it didn't, allowing the client to generate the hash would rather defeat the security idea. – Voo Mar 17 '18 at 18:42
  • @Voo, OK, maybe I misunderstood then. In any case, I guess we'd need information from ProtonMail to know how slow their slow hash is; my 100ms is pure speculation. – D.W. Mar 17 '18 at 19:50
  • If they simply store the hash to prevent spam, being able to crack one entry seems pointless. Arguably this hash (thanks to the salt) prevents anybody from getting the _complete_ list of numbers, which would be bad for the brand. – TonioElGringo Mar 20 '18 at 12:53
  • Sorry for the downvote, I don't know how it happened, I must've accidentally pressed it on mobile or something; but it's locked now :/ (thanks weird voting mechanics..) – SWdV Mar 22 '18 at 17:36
  • @SWdV: well, I've just fixed a minor typo. It should be unlocked now. And no problem about the downvote. :) – Benoit Esnard Mar 22 '18 at 17:45
10

The hash is useful as an indirect map, even if it's not as secure as a typical hashing setup. One of the biggest benefits is purely social. Hashing (even a weak hash) draws a clear line in the sand for an employee about what is acceptable to view. Putting up any barriers to viewing the real phone number will help keep honest people honest.

it should be quite easy to recover a phone number from its hash

Easy is a relative term. True, this hashing setup may not help much against a determined attacker who is willing to crack hashes. But you also have to think of the other 99% of employees with access to the data, who don't even know what a hash really is, let alone how to crack one.

ryanyuyu
  • 211
  • 1
  • 5
  • 2
    That's a good point: even if it only adds a minor technical barrier for someone knowledgeable, it is enough of a social deterrent to be useful. Kind of like a (regular) fence. – BlenderBender Mar 17 '18 at 00:09
  • 3
    @BlenderBender: Indeed! And that's a really important point. We hear here over and over again how security by obfuscation is useless to the point of being better omitted, yet everybody I know has a thin slice of wood around their garden. It's trivial to see over it, and provides no real privacy, and certainly no real security. But it does send a message: "this is _my_ space and if you enter it you must have willfully violated my terms". Hashing a phone number, while maybe cryptographically useless does the same thing in the context of employees glancing at the DB. That is not entirely useless. – Lightness Races in Orbit Mar 18 '18 at 04:25
6

The point is to not store them in plaintext.

That is probably pretty much it. As D.W. pointed out in his comments, Benoit's answer tells you why they store phone numbers and that they hash them. ProtonMail does not tell you why they hash them. We can only speculate about this until an employee of ProtonMail tells us the exact reason.

The most probable reason (in my opinion) is the following:

ProtonMail is a company whose whole business model is founded on secure products and protecting its customers' privacy. If they told you that they saved phone numbers in plaintext, that would be pretty weird. Hashing them makes much more sense in that regard, don't you think?
On the other hand, ProtonMail doesn't link phone number hashes to user profiles, they flush the hashes regularly, and, as you stated yourself, there's not much to gain from a phone number.

Hashing phone numbers if they have to store them is better than not hashing them. That's why they do it.
Does it strengthen security much? No.
Is it better than storing them in plaintext? Yes.

Tom K.
  • 7,913
  • 3
  • 30
  • 53
1

There are two reasons for storing hashed phone numbers; one is useful, the other is not:

1) Allowing the user to be verified. Here a salted slow hash is useful. While brute-forcing a phone number is faster than brute-forcing a password, the hash still provides added security.

2) Pretending to provide a safer lookup (e.g. in several WhatsApp competitors). Here you cannot salt the hash, because you would not be able to search for the hash when you only know the phone number. This means a rainbow table is easy to create, as the search space of unsalted hashes is really small.

Note that 1) still provides an easy proof of existence when you have the database. Hash your phone number with all the salts used in the database and look it up. If it is stored in there, you will find it.
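
The existence check described in that note can be sketched as below. The storage layout (a list of `(salt, digest)` pairs), the use of PBKDF2, the low iteration count, and the function names are assumptions chosen for illustration, not any particular service's design.

```python
import hashlib
import secrets


def slow_hash(phone: str, salt: bytes, iterations: int = 1_000) -> bytes:
    # Assumption: PBKDF2 with a demo-sized iteration count; real systems use far more.
    return hashlib.pbkdf2_hmac("sha256", phone.encode(), salt, iterations)


def store(db: list, phone: str) -> None:
    """Append a (salt, digest) pair using a fresh random salt per entry."""
    salt = secrets.token_bytes(16)
    db.append((salt, slow_hash(phone, salt)))


def is_present(db: list, phone: str) -> bool:
    """Re-hash `phone` with every stored salt and look for a matching digest."""
    return any(
        secrets.compare_digest(slow_hash(phone, salt), digest)
        for salt, digest in db
    )
```

Checking one known number costs one hash per database row, so per-entry salts slow down bulk recovery of all numbers but do not hide whether a specific number is in the database.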

allo
  • 3,173
  • 11
  • 24