Let's say I have an API call that is supposed to check whether an email address exists in the DB without transmitting the actual email address. For that purpose, I've decided to store the SHA256 hash of the email address in the DB, and I've updated the client to send the SHA256 hash of the email in the API call (instead of the plaintext email address).

How can I make this more secure using salt when calculating the hash?

As per my understanding, having a fixed salt is pretty much the same as having no salt at all. Is there any way I can have a unique salt per email address, but in such a way that both the server and the client can calculate it?

Would something like this make sense? If not, can you please explain why?

salt = sha256(email)
email_hash = salt + "|" + sha256(email + salt)

Consider the following quote:

The salt value is not secret and may be generated at random and stored with the password hash. A large salt value prevents precomputation attacks, including rainbow tables, by ensuring that each user's password is hashed uniquely. This means that two users with the same password will have different password hashes (assuming different salts are used). In order to succeed, an attacker needs to precompute tables for each possible salt value. The salt must be large enough, otherwise an attacker can make a table for each salt value.

I'm not sure if I'm missing anything, but does this mean that even if the salt is neither secret nor random, and everyone knows how to calculate the salt for an email address, it would still prevent the attacker from recovering email addresses from hashes using rainbow tables, because each email would be hashed uniquely? Is that correct?

xx77aBs
  • Well, if you really want to keep an email secure, treat it as a password. See [how to securely hash passwords](https://security.stackexchange.com/questions/211/how-to-securely-hash-passwords) to understand more about how and why we salt. – Kepotx Jun 29 '18 at 10:01
  • Why would you do such a thing? Your client knows the email, so why would they hash it? A salt must be random to be a proper salt, so `sha256(email)` is not a real salt. – Xenos Jun 29 '18 at 10:02
  • However, just out of curiosity, why so much protection for an email? – Kepotx Jun 29 '18 at 10:02
  • I agree with @Kepotx, you can read the post that he has pointed to. And you can choose one of the globally accepted password hashing standards and follow it rather than trying to create your own. – Pilfility Jun 29 '18 at 13:04
  • What does the server do with the email hashes? Why does it need to store them, i.e. know that an email address matching the hash is "stored"? Why does the client need to be able to check if the hash is on the server? What's your threat scenario? – ilkkachu Jun 29 '18 at 15:21
  • Hey guys, the example isn't really the best, but let's still try to work with it. Let's say I need to fetch an attribute associated with email, without sending the actual email to the server. I would like to use salt to make it harder to recover all of the emails from the hashes (in case someone gets the DB). My understanding was that using a unique salt (even if it's known) would greatly increase the difficulty of recovering emails. I've updated my question to add one more question. – xx77aBs Jul 03 '18 at 11:25

2 Answers


Would something like this make sense? If not, can you please explain why?

The process would probably call for an HMAC instead of plain sha256. I would also use your language's dedicated password-hashing functions rather than simple hash functions (e.g. password_hash in PHP instead of md5).

Last, you didn't actually salt your hash in your example: your salt = sha256(email) is exactly the unsalted hash that an attacker would like to know.
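To make that point concrete, here is a minimal Python sketch of the scheme from the question. The helper names (`sha256_hex`, `proposed_email_hash`, `recover_email`) and the candidate list are illustrative assumptions, not part of the original post. It shows that an attacker holding the stored value can ignore everything after the "|" and test guesses against the first half with plain SHA-256.

```python
import hashlib
from typing import Iterable, Optional


def sha256_hex(text: str) -> str:
    """Hex-encoded SHA-256 of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def proposed_email_hash(email: str) -> str:
    """The scheme from the question: salt = sha256(email)."""
    salt = sha256_hex(email)
    return salt + "|" + sha256_hex(email + salt)


def recover_email(stored_value: str, candidates: Iterable[str]) -> Optional[str]:
    """Hypothetical attacker: only the part before '|' is needed,
    and that part is an unsalted SHA-256 of the email."""
    unsalted = stored_value.split("|", 1)[0]
    for candidate in candidates:
        if sha256_hex(candidate) == unsalted:
            return candidate
    return None
```

Because the first half is an ordinary unsalted hash, any existing list of email addresses can be tested against it directly, and the second half adds nothing.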

Does this mean that even if salt isn't a secret nor random

By definition, a salt is random, so the question is contradictory.

it would still prevent the attacker from recovering email addresses from hashes using rainbow tables

No, because the salt must be random (and unique for each field). If the salt is not random and can be predicted from your original data, or if the salt is not unique, then a rainbow table can still be used (the attacker may need to build that table first, which is harder than with no salt, but still doable).

Because each email would be hashed uniquely.

But two calls to the hashing method with the same data yield the same result, and that is the threat. If your "salt" is not random, you'll still have unique hashes (otherwise, write down your email, because you've found a collision), but a rainbow table is meant to map one hash to its corresponding plain value, so you'll still be vulnerable (even if the attacker will probably have to build that rainbow table first instead of reusing one from the web).
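As a rough illustration of why a predictable salt does not help, the Python sketch below uses a hypothetical `derive_salt` function standing in for any rule a client could compute from the email alone. The function names and the `"salt:"` prefix are assumptions made for this example. Since the whole computation is deterministic and public, an attacker can precompute a lookup table over candidate addresses.

```python
import hashlib
from typing import Dict, Iterable


def derive_salt(email: str) -> str:
    """Hypothetical deterministic 'salt': computable by anyone from the email."""
    return hashlib.sha256(("salt:" + email).encode("utf-8")).hexdigest()[:16]


def deterministic_hash(email: str) -> str:
    """Hash with the derived salt; same email always gives the same output."""
    salted = derive_salt(email) + email
    return hashlib.sha256(salted.encode("utf-8")).hexdigest()


def build_lookup_table(candidate_emails: Iterable[str]) -> Dict[str, str]:
    """Attacker-side precomputation: map each hash back to its email."""
    return {deterministic_hash(email): email for email in candidate_emails}
```

The table has to be built for this particular scheme rather than downloaded, but once built it reverses every leaked hash whose email is in the candidate list.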

So to fall back on your case, what you want is:

An API that can tell if an email "exists" (is on a server's whitelist) without sending the email itself

If the server knows the plain email, then you can let the client generate a random salt, do a password_hash of that email+salt, and send this hash and the salt to the server. The server will then take each email it has, run the same hash process, and if a match is found, return "email exists" to the client.
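The exchange described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: PBKDF2 from the standard library stands in for PHP's password_hash, the iteration count is kept low so the example runs quickly, and the function names are invented for this sketch.

```python
import hashlib
import hmac
import os
from typing import Iterable, Tuple

# Assumed, illustrative work factor; a real deployment would tune this.
ITERATIONS = 10_000


def slow_hash(email: str, salt: bytes) -> bytes:
    """PBKDF2-HMAC-SHA256 of the email with the given salt."""
    return hashlib.pbkdf2_hmac("sha256", email.encode("utf-8"), salt, ITERATIONS)


def client_build_request(email: str) -> Tuple[bytes, bytes]:
    """Client side: pick a fresh random salt and hash the email with it."""
    salt = os.urandom(16)
    return salt, slow_hash(email, salt)


def server_email_exists(known_emails: Iterable[str], salt: bytes, digest: bytes) -> bool:
    """Server side: re-hash every known email with the client's salt."""
    for known in known_emails:
        if hmac.compare_digest(slow_hash(known, salt), digest):
            return True
    return False
```

Note the cost of this design: the server performs one slow hash per stored email for every request, so the work grows linearly with the size of the list.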

If the server does not know the plain email, but only a salted hash of it, then you can treat this email like a password, which is sent in clear over a secure connection.

If you still don't want to send the email in clear over a secure transport, then you cannot send a randomly salted hash (that is the point of randomly salted hashes: each call gives a unique result that reveals no information, so the server won't be able to learn "this email is on the whitelist"). Instead, you will need an id for every hash, which acts as a unique public identifier of that hash (the same way a login is the public identifier of the server's hashed password). The server would then return the hash for this id, and the client can password_verify it using the email it knows (so the email is not transferred, but the hash is).
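A minimal Python sketch of this id-based variant is shown below. Several things here are assumptions for illustration: PBKDF2 again stands in for password_hash/password_verify, the in-memory dict stands in for the server's database, and the one-time `server_enroll` step (how a record gets created in the first place) is not described in the answer.

```python
import hashlib
import hmac
import os
from typing import Dict, Tuple

# Assumed, illustrative work factor.
ITERATIONS = 10_000

Record = Tuple[bytes, bytes]  # (salt, digest)


def slow_hash(email: str, salt: bytes) -> bytes:
    """PBKDF2-HMAC-SHA256 of the email with the given salt."""
    return hashlib.pbkdf2_hmac("sha256", email.encode("utf-8"), salt, ITERATIONS)


def server_enroll(store: Dict[str, Record], record_id: str, email: str) -> None:
    """Hypothetical one-time setup: store a randomly salted hash under a public id."""
    salt = os.urandom(16)
    store[record_id] = (salt, slow_hash(email, salt))


def server_fetch(store: Dict[str, Record], record_id: str) -> Record:
    """Server side: return the salt and hash for an id; no email is received."""
    return store[record_id]


def client_verify(email: str, salt: bytes, digest: bytes) -> bool:
    """Client side: check locally whether its email matches the fetched hash."""
    return hmac.compare_digest(slow_hash(email, salt), digest)
```

In this flow only the id travels to the server and only the salt and hash travel back, so the comparison happens entirely on the client.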

Xenos
  • Thanks, your answer has cleared up a lot of my confusion. I'm baffled that I hadn't noticed that my proposed "salt" is the hash of the email address and would be saved in the DB in clear text, rendering the other calculated value (sha256(email + "salt")) completely unnecessary and useless. – xx77aBs Jul 04 '18 at 11:19

The premise is wrong

You are asking how to generate a common salt, but not why you want to salt the email hashes. You salt passwords so that two users who chose "Password1" get different hashes in the DB. Here, however, you are seemingly hashing values that are already unique, in which case I see no need to store a salted hash. At least not when the client must be able to recreate that same 'salt'.
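The password case mentioned above can be illustrated with a short Python sketch. The function names and the low iteration count are assumptions for this example, and PBKDF2 is used only as a convenient standard-library stand-in for a password-hashing function.

```python
import hashlib
import os
from typing import Tuple


def unsalted_hash(password: str) -> str:
    """Plain SHA-256: identical passwords give identical hashes."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def salted_password_hash(password: str) -> Tuple[bytes, bytes]:
    """Random per-user salt: identical passwords give different hashes."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 10_000)
    return salt, digest
```

Two users who both pick "Password1" end up with the same `unsalted_hash` value but different salted digests, which is the property a salt is meant to provide and the one that is lost once the client has to be able to recompute the salt.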

Ángel