3

I originally asked this on Stack Overflow, but due to a lack of traction and a recommendation from a user there, I am asking it here too.

Imagine a scenario where a client application sends a password to a backend server so that the server can validate that the user entered the correct password by comparing it to a stored variation of the password.

The transport mechanism is HTTPS, with the server providing HSTS and HPKP to the user agent and preferring strong cryptographic ciphers, scoring A+ on the SSL Labs test. Nonetheless, we may wish to avoid sending the original user-provided password from the user agent to the server. Instead, perhaps we'd send a hash produced by a number of rounds of SHA-256 on the client.
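To make the client-side step concrete, here is a minimal sketch of what I mean, written in Python for readability (a real client would more likely be JavaScript). The function name `client_prehash` and the round count of 1000 are assumptions for illustration only, not a recommended construction.

```python
import hashlib

def client_prehash(password: str, rounds: int = 1000) -> str:
    """Hypothetical client-side step: iterate SHA-256 and return a hex digest."""
    digest = password.encode("utf-8")
    for _ in range(rounds):
        # Each round hashes the previous round's raw 32-byte digest.
        digest = hashlib.sha256(digest).digest()
    return digest.hex()

token = client_prehash("correct horse battery staple")
print(len(token))  # 64 hex characters, whatever the password length
```

The value sent over HTTPS would be `token` rather than the password itself, and the server would then feed that value to bcrypt.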

On the server side, we store passwords using bcrypt with a large number of rounds.

From a cryptographic point of view, is there any disadvantage to performing bcrypt on the already SHA-256 hashed value as opposed to directly on the plain text password? Does the fixed-length nature of the input text when using hashes somehow undermine the strengths of the algorithm?

I'm not asking about performance such as the memory, CPU, storage requirements or wall clock time required to calculate, store, send or compare values. I'm purely interested in whether applying a hash prior to applying bcrypt could weaken the strength of bcrypt in the case of a disclosure of the full list of stored values.

I've read posts like this (which I find interesting and useful), but I'm not specifically asking whether it's a good idea to hash on the client side. I'm more interested in whether doing so could somehow weaken the password storage system with bcrypt, given that an attacker armed with this knowledge would know that all stored values are derived from fixed-length inputs consisting of a much smaller range of possible characters (SHA-256 output).

David

2 Answers

4

Well, in theory, explicitly using SHA-256 outputs as input limits the number of possible inputs, but the huge(!) number of SHA-256 outputs makes this a negligible factor. Knowing this provides no real advantage (as far as we know today), as the hash output is pretty much pseudo-random. There shouldn't be much difference between feeding all possible SHA-256 outputs and feeding 2^256 random strings into the function. You can't really derive a pattern from the outputs (if you could, the hash function would be bad). I can't see any major implications from a cryptographic perspective.
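To put "huge" into numbers, here is a rough back-of-the-envelope comparison. The password model (12 characters drawn from the 95 printable ASCII characters) is purely an assumption picked for illustration; substitute whatever policy you like.

```python
# Number of distinct SHA-256 outputs (256-bit digests).
digest_space = 2 ** 256

# Assumed password model for illustration: 12 printable-ASCII characters.
password_space = 95 ** 12

print(f"SHA-256 outputs:     {digest_space:.3e}")
print(f"12-char passwords:   {password_space:.3e}")

# The digest space exceeds this password space by more than 50 orders of
# magnitude, so restricting bcrypt's input to digests costs nothing in practice.
print(digest_space > password_space * 10 ** 50)
```

In other words, an attacker still has to guess passwords; enumerating digests directly is far less feasible than enumerating any realistic set of passwords.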

A comment on your scenario, or why client-side hashing imho is not an improvement over sending plain text:

Using client-side hashing prevents an attacker from getting a clear-text password, IF (huge if, as you described a lot of high security standards) they can compromise the connection SOMEWHERE. However, hashing the password before sending it effectively makes the hash the new password, as your server doesn't care whether the provided password is a hash or "topSecretK3y".

If an attacker gets this hash, its value is just as high as that of the original password, as they can now impersonate the victim.

P.S.: The only "drawback" is that the attacker cannot use the username/password for other websites, so you at least prevent a "password reuse attack" on other websites.

GxTruth
  • Agreed, the main purpose was not to compromise the original password so as to hopefully protect a user this way. – David Jun 11 '18 at 09:24
  • I disagree on client side hashing. Either an attacker can see plaintext network traffic or they can't. If they see a hashed password sent to the server they can send that hash to impersonate the user. If they see the password itself sent to the server then they can send that password to impersonate the user. If they see the value of a session cookie then they can send that cookie value to impersonate the user. Use HTTPS if you don't want them to see any of these values. – Future Security Jun 11 '18 at 19:29
  • @FutureSecurity I agree with all of your statements, but how is this contrary to my answer? There is no improved protection by client-side hashing, as the hash can be used to impersonate the user, just like the password itself. The only tiny advantage is, that an attacker, who sniffs "victim@website.com" and password "a489b491ff37ebac" can steal this specific account, but certainly NOT use these credentials on other websites to steal other accounts (password reuse is the problem). Certainly TLS is the way to go here and is considered safe, as of current knowledge. – GxTruth Jun 14 '18 at 12:47
  • @GxTruth In the context of bcrypt and iterated SHA-256 we are talking about password stretching. The threat password stretching defends against is offline cracking, not traffic being intercepted. TLS does not replace password stretching and password stretching does not replace TLS. Client side hashing works fine for stretching purposes if the implementation is efficient. People don't read unqualified statements like "client side hashing is ineffective" as "it is not an improvement over x". It's like saying "seat belts aren't effective" ... "because they don't decrease braking distance." – Future Security Jun 14 '18 at 17:38
  • @FutureSecurity Fair point. I rephrased the answer accordingly, because there is in fact a huge difference I did not consider at the time of writing. Thanks for pointing that out. – GxTruth Jun 15 '18 at 07:28
4

As long as the first hash isn't horribly broken (and I mean broken so badly that it's significantly worse than MD5), this doesn't weaken the bcrypt hash in any notable way. In fact, performing a round of SHA-256 before bcrypt is actually recommended sometimes, because bcrypt silently truncates its input at 72 bytes. While 72 bytes should be more than enough for a good passphrase, silent truncation is hardly ideal.
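A small sketch of why the truncation matters and how a pre-hash sidesteps it. Note that `truncated` below is only a stand-in that mimics cutting the input at the 72-byte limit; it is not bcrypt, and the two passphrases are made up for illustration.

```python
import hashlib

BCRYPT_MAX_BYTES = 72  # input limit discussed above

def truncated(password: bytes) -> bytes:
    """Stand-in for silent truncation; this is NOT bcrypt itself."""
    return password[:BCRYPT_MAX_BYTES]

# Two different passphrases that share their first 72 bytes.
prefix = b"x" * 72
a = prefix + b"-first ending"
b = prefix + b"-second ending"

# After truncation the two inputs are indistinguishable.
print(truncated(a) == truncated(b))  # True

# SHA-256 covers the whole input, so the digests differ.
print(hashlib.sha256(a).digest() == hashlib.sha256(b).digest())  # False
```

With a pre-hash, everything past the limit still influences the value that reaches bcrypt.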

However, there is a significant implementation detail that you must be aware of if you want to do this. bcrypt was designed to hash strings, not arbitrary binary data, and as such, it uses null-terminated strings. What you actually want to do is bcrypt(base64(sha256(password))) rather than bcrypt(sha256(password)); otherwise a 0 byte near the front of the raw SHA-256 digest would be catastrophic, as bcrypt would only hash the few bytes before it.
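Here is a minimal sketch of the problem and the fix, using only the standard library. `c_string_view` is a hypothetical helper that mimics what a null-terminated string API would see, and the `password{n}` candidates are made up purely to find a digest that starts with a zero byte.

```python
import base64
import hashlib

def c_string_view(data: bytes) -> bytes:
    """What a null-terminated-string API would see: bytes before the first 0x00."""
    return data.split(b"\x00", 1)[0]

def prehash(password: str) -> bytes:
    """base64(sha256(password)): 44 ASCII bytes, never containing a null byte."""
    return base64.b64encode(hashlib.sha256(password.encode("utf-8")).digest())

# Search made-up passwords for one whose raw digest starts with a 0x00 byte
# (about 1 in 256 do, so this finishes almost immediately).
n = 0
while hashlib.sha256(f"password{n}".encode("utf-8")).digest()[0] != 0:
    n += 1
raw = hashlib.sha256(f"password{n}".encode("utf-8")).digest()

print(len(c_string_view(raw)))             # 0: nothing survives before the null
print(len(prehash(f"password{n}")))        # 44, comfortably under 72 bytes
print(b"\x00" in prehash(f"password{n}"))  # False
```

With the third-party `bcrypt` package for Python, for example, the value to store would then be something like `bcrypt.hashpw(prehash(password), bcrypt.gensalt())`; treat that as an illustration rather than a drop-in implementation.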

If your goal is to prevent a passive MitM from obtaining the original password, then doing SHA-256 client-side is helpful. However, it doesn't protect you as much as you might think if your worry is a database leak. Adding SHA-256 before bcrypt doesn't really change the time it will take to crack passwords; it just means that the attacker has to run their guesses through SHA-256 as well, which costs them next to nothing compared to a high bcrypt cost.

AndrolGenhald
  • +1 for mentioning that detail. Didn't know that until now, despite the catastrophic impact it can have. Very interesting :) – GxTruth Jun 14 '18 at 12:51