56

I'm working on improving a CMS where passwords are currently stored as just sha1(password). I explained to my boss that doing it that way is incredibly insecure and told him that we should switch to bcrypt, and he agreed.

My plan was to just run all the existing hashes through bcrypt and store those in the password field, and then use the following pseudo-code to check the password: correctPassword = bcrypt_verify(password, storedHash) or bcrypt_verify(sha1(password), storedHash).

This way, new users, or users who change their passwords will get "real" bcrypt hashes, while existing users won't all have to change their passwords. Are there any disadvantages to doing this? While it would probably be ideal to ask all users to choose a new password, do we lose much in the way of security by doing this?

I was thinking that even if an attacker got access to both the database and the code, cracking wouldn't be substantially faster even if most of the "input" to bcrypt were 40-character hex strings, since the slow part (bcrypt_verify()) still has to be invoked for each password attempt on each user.
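A rough Python sketch of that plan, under some loud assumptions: bcrypt is not in the Python standard library, so `slow_hash` / `slow_verify` below are stand-ins built on PBKDF2 (with the third-party `bcrypt` package, `bcrypt.hashpw` and `bcrypt.checkpw` would take their place), and the helper names `wrap_legacy_hash` and `check_password` are made up for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 10_000  # deliberately low so the demo runs quickly


def slow_hash(data: bytes) -> str:
    """Stand-in for bcrypt (assumption): salted PBKDF2, stored as 'salthex$digesthex'."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", data, salt, ITERATIONS)
    return salt.hex() + "$" + digest.hex()


def slow_verify(data: bytes, stored: str) -> bool:
    """Stand-in for bcrypt_verify: recompute with the stored salt and compare."""
    salt_hex, digest_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac("sha256", data, bytes.fromhex(salt_hex), ITERATIONS)
    return hmac.compare_digest(digest.hex(), digest_hex)


def sha1_hex(password: str) -> bytes:
    """The legacy scheme: 40-character hex SHA-1 of the password."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().encode("ascii")


def wrap_legacy_hash(legacy_sha1_hex: str) -> str:
    """One-off conversion: feed the stored SHA-1 hex string into the slow hash."""
    return slow_hash(legacy_sha1_hex.encode("ascii"))


def check_password(password: str, stored: str) -> bool:
    """The either/or check from the pseudo-code above."""
    return (slow_verify(password.encode("utf-8"), stored)
            or slow_verify(sha1_hex(password), stored))


legacy = hashlib.sha1(b"hunter2").hexdigest()  # what the old CMS stored
stored = wrap_legacy_hash(legacy)              # what the password field holds now
print(check_password("hunter2", stored))       # the existing user can still log in
```

One property of this either/or check worth noticing: the legacy hex string itself also verifies, because it is exactly what was fed to the slow hash.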

Alex
  • 38
    As a side thought: the system could automatically change the hash the next time the user logs in, so if bcrypt_verify(sha1(password), storedHash) matches, you store the bcrypt hash of the plain (not sha1'ed) password as the new hash. Over time this will move all users to the new hash, so should some weakness be found in the usage of bcrypt and sha1 together (unlikely but always possible), there is only a relatively short window of attack. – Selenog Oct 22 '15 at 09:40
  • 4
    @Selenog That's a brilliant technique for doing transparent migration! I'm definitely going to do that. Thanks for the idea! – Alex Oct 22 '15 at 12:08
  • 32
    From what little I know of him, I like your boss. ;) – jpmc26 Oct 22 '15 at 13:21
  • 3
    FYI, you should probably add a salt while you're at it. – Ajedi32 Oct 22 '15 at 20:46
  • As a side note, I asked [this question on SO](http://stackoverflow.com/q/23987734/404623) about 16 months ago regarding the implementation of a solution for this issue. – rink.attendant.6 Oct 23 '15 at 01:12
  • 2
    @Ajedi32, many bcrypt implementations supply a salt by default, and I think the algorithm actually requires a salt. Of course, it is worthwhile verifying that the implementation doesn't use a _static_ salt. – Soron Oct 23 '15 at 04:40
  • 1
    bcrypt is based on blowfish. The original blowfish algorithm is over 20 years old, so it should be no surprise that it is known to have weaknesses. As long as 7 years ago, various experts were saying it should no longer be used. Personally, I would recommend something like what @Selenog recommended above, but using the newly adopted sha3 algorithm. That approach would be compliant with NIST guidance and probably be viable for a long time. – JaimeCastells Oct 26 '15 at 21:34

5 Answers

78

Actually this is a good way to protect the otherwise insecurely stored passwords. There is one weak point in this scheme though, which can easily be overcome by marking old hashes, so I would prefer this solution:

if (checkIfDoubleHash(storedHash))
  correctPassword = bcrypt_verify(sha1(password), storedHash)
else
  correctPassword = bcrypt_verify(password, storedHash)

Imagine an attacker getting hold of an old backup. He would see the SHA-1 hashes and could use them directly as passwords if you test with bcrypt_verify(...) or bcrypt_verify(sha1(...)), because the first check succeeds whenever the submitted "password" is the old hash itself.

Most bcrypt libraries already prefix the hash with a marker for the algorithm used, so adding your own "double hash" marker is not a problem, but of course you can also use a separate database field for this:

$2y$10$nOUIs5kJ7naTuTFkBy1veuK0kSxUFXfuaOKdOKf9xYT0KKIGSJwFa
 |
 hash-algorithm = 2y = BCrypt
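A minimal Python sketch of the marked variant, not a definitive implementation. Assumptions: the `sha1+` prefix (`LEGACY_MARK`) is a made-up marker, the helper names are hypothetical, and `slow_hash` / `slow_verify` are PBKDF2 stand-ins for bcrypt because bcrypt is not in the standard library (with the third-party `bcrypt` package, `bcrypt.hashpw` and `bcrypt.checkpw` would be used instead).

```python
import hashlib
import hmac
import os

LEGACY_MARK = "sha1+"  # hypothetical "double hash" marker stored in front of the hash
ITERATIONS = 10_000    # deliberately low so the demo runs quickly


def slow_hash(data: bytes) -> str:
    """Stand-in for bcrypt (assumption): salted PBKDF2, stored as 'salthex$digesthex'."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", data, salt, ITERATIONS)
    return salt.hex() + "$" + digest.hex()


def slow_verify(data: bytes, stored: str) -> bool:
    """Stand-in for bcrypt_verify: recompute with the stored salt and compare."""
    salt_hex, digest_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac("sha256", data, bytes.fromhex(salt_hex), ITERATIONS)
    return hmac.compare_digest(digest.hex(), digest_hex)


def hash_new_password(password: str) -> str:
    """New or changed passwords get a plain slow hash, without the marker."""
    return slow_hash(password.encode("utf-8"))


def wrap_legacy_hash(legacy_sha1_hex: str) -> str:
    """Existing SHA-1 hashes are wrapped and explicitly marked as double hashes."""
    return LEGACY_MARK + slow_hash(legacy_sha1_hex.encode("ascii"))


def check_if_double_hash(stored: str) -> bool:
    return stored.startswith(LEGACY_MARK)


def check_password(password: str, stored: str) -> bool:
    """Exactly one verification path per stored hash, chosen by the marker."""
    if check_if_double_hash(stored):
        inner = hashlib.sha1(password.encode("utf-8")).hexdigest().encode("ascii")
        return slow_verify(inner, stored[len(LEGACY_MARK):])
    return slow_verify(password.encode("utf-8"), stored)


legacy = hashlib.sha1(b"hunter2").hexdigest()  # value an old backup would reveal
stored = wrap_legacy_hash(legacy)
print(check_password("hunter2", stored))  # the real password is accepted
print(check_password(legacy, stored))     # the leaked SHA-1 string is not
```

Because a marked hash is only ever checked against sha1(input), submitting the leaked hex string gets hashed once more and no longer matches.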
Michael Mior
martinstoeckli
  • 11
    That's a good point. I hadn't thought about it being possible to use the SHA1 hashes from backups as passwords. Thanks. – Alex Oct 22 '15 at 12:13
  • 1
    Nice catch. Somebody remind me to come back and award a bounty to this answer, in a couple of days. – kasperd Oct 24 '15 at 09:33
  • 1
    @kasperd Here's your reminder. :) – Alex Oct 28 '15 at 12:50
  • Couldn't the attacker just decrypt the SHA1 hashes from the backups? – mbomb007 Apr 11 '18 at 16:08
  • @mbomb007 - Hashes cannot be decrypted, but of course one can try to brute-force the SHA1 hashes. Whether a hash can be cracked depends mainly on how strong the password was; strong passwords are well protected even with SHA1. In any case, just using the hash to log in is much faster than cracking the hashes. – martinstoeckli Apr 11 '18 at 17:46
  • @martinstoeckli Of course, but the point this question is trying to prove is that re-hashing these weak hashes makes them stronger. This is all moot if the old, weak hashes are still obtainable. – mbomb007 Apr 11 '18 at 18:46
  • @mbomb007 - Not really, double hashing is protecting the password-hashes from _future_ theft, it is not possible to prevent a damage which is already done. My answer just points out a side problem which can be circumvented with very little effort. The old hashes are not obtainable anymore in the database, but just in case they leaked earlier or another way, one can protect at least the accounts with strong passwords. – martinstoeckli Apr 11 '18 at 19:12
  • @martinstoeckli So would having a database table with (hashed) password history be a vulnerability, then? Especially if it includes weaker-hashed previous passwords? – mbomb007 Apr 11 '18 at 19:45
  • 1
    @mbomb007 - If you maintain a history of previous passwords, you should protect them equally to the active password. Knowing the old passwords can be a huge advantage to guess the active password and passwords are often reused on other sites. – martinstoeckli Apr 12 '18 at 06:36
8

Why not simply use bcrypt(sha1(password)) for all passwords both old and new? This avoids the problem of people using your old hashes as passwords and is also simpler than your proposal.

Peter Green
  • 6
    sha1 and bcrypt might potentially have a vulnerability when used in conjunction that's found in the future. It's better to *only* use the more secure algorithm if possible. – d0nut Oct 22 '15 at 22:15
  • 6
    @iismathwizard: I find this extremely hard to believe. – Joshua Oct 23 '15 at 17:28
  • 1
    @Joshua how unfortunate http://stackoverflow.com/questions/120131/combination-of-more-than-one-crypto-algorithm http://security.stackexchange.com/questions/58781/does-using-the-same-encryption-algorithm-multiple-times-make-a-difference http://blog.cryptographyengineering.com/2012/02/multiple-encryption.html – d0nut Oct 23 '15 at 17:34
  • Additionally, this one is good in reference to hashing algorithms: http://stackoverflow.com/a/17396367/1974671 – d0nut Oct 23 '15 at 17:43
  • 4
    @iismathwizard: If you had read your own paper, you would understand the stack is at least as strong as the weaker of the two. – Joshua Oct 23 '15 at 17:44
  • 3
    @joshua From the paper: *"In this case, the resulting ciphertext ought to be at least as vulnerable as a single-encrypted ciphertext. Hence double-encrypting gives you no additional security at all"*. – d0nut Oct 23 '15 at 17:50
  • 2
    @Joshua http://security.stackexchange.com/questions/19866/scrypt-bcrypt-cascade-hashing – d0nut Oct 23 '15 at 17:53
  • 1
    @iismathwizard If the security of a hash depends on what kind of content it's hashing, it's a terrible hash. – jnm2 Aug 10 '16 at 12:54
4

It's a good strategy. You won't lose any security unless a user chose a truly random password with more than 160 bits of entropy, since SHA-1 reduces every input to a 160-bit digest. So the difference is minimal (and even in that case it would still take a significant amount of time to brute-force the original text).

You might opt to implement some logic to migrate each password the next time the user changes it, but I don't see any risk that would require an immediate change of passwords, unless you believe the hashes have leaked.
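Such migration logic could look roughly like the Python sketch below. It hooks into a successful login (as suggested in the comments on the question) rather than a password change, since that is the moment the plaintext is available anyway. Everything here is an assumption for illustration: the in-memory `users` dict stands in for the real table, the `is_double_hash` flag is a hypothetical column, and `slow_hash` / `slow_verify` are PBKDF2 stand-ins for bcrypt.

```python
import hashlib
import hmac
import os

ITERATIONS = 10_000  # deliberately low so the demo runs quickly


def slow_hash(data: bytes) -> str:
    """Stand-in for bcrypt (assumption): salted PBKDF2, stored as 'salthex$digesthex'."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", data, salt, ITERATIONS)
    return salt.hex() + "$" + digest.hex()


def slow_verify(data: bytes, stored: str) -> bool:
    """Stand-in for bcrypt_verify: recompute with the stored salt and compare."""
    salt_hex, digest_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac("sha256", data, bytes.fromhex(salt_hex), ITERATIONS)
    return hmac.compare_digest(digest.hex(), digest_hex)


# Hypothetical stand-in for the users table: name -> (is_double_hash, stored_hash)
users = {}


def login(name: str, password: str) -> bool:
    """Verify the password and transparently upgrade a wrapped legacy hash."""
    if name not in users:
        return False
    is_double_hash, stored = users[name]
    if is_double_hash:
        candidate = hashlib.sha1(password.encode("utf-8")).hexdigest().encode("ascii")
    else:
        candidate = password.encode("utf-8")
    if not slow_verify(candidate, stored):
        return False
    if is_double_hash:
        # The plaintext is known at this point, so store a hash of it directly.
        users[name] = (False, slow_hash(password.encode("utf-8")))
    return True


# A migrated legacy account: slow hash over the old SHA-1 hex string.
users["alex"] = (True, slow_hash(hashlib.sha1(b"hunter2").hexdigest().encode("ascii")))
print(login("alex", "hunter2"), users["alex"][0])  # login result, then the cleared flag
```

Accounts that never log in again keep their wrapped hash, which is why the wrapping step is still needed up front.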

Lucas Kauffman
2

I recently implemented a similar system for migrating passwords across to bcrypt. However instead of SHA1 we were using SHA256(password + salt) hashes originally.

This salt is regenerated when we switch the user to bcrypt as they log in (optional) or upon changing their password, so the hash wouldn't be based on the original. We then primarily use this nonce as an IV to encrypt the bcrypt hash in the database with a key stored outside the database.

Doing this really only prevents injection attacks, and data pulled solely from the database, from yielding any useful password information. But the overhead wasn't a concern for us and we can change this external key whenever we need to.

Using the SHA256 hash as input to bcrypt also keeps the length of the input password below the maximum for bcrypt (related articles)
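To illustrate the length point: bcrypt only uses the first 72 bytes of its input, while a SHA-256 pre-hash has a fixed size no matter how long the password is. The snippet below is only a sketch; the `prehash` helper and the choice of base64 encoding for the digest are assumptions, not part of the system described above.

```python
import base64
import hashlib

BCRYPT_MAX_INPUT_BYTES = 72  # bcrypt ignores input beyond this length


def prehash(password: str) -> bytes:
    """Hypothetical helper: SHA-256 the password, then base64-encode the digest."""
    digest = hashlib.sha256(password.encode("utf-8")).digest()
    return base64.b64encode(digest)


long_password = "correct horse battery staple " * 10
print(len(long_password.encode("utf-8")) > BCRYPT_MAX_INPUT_BYTES)  # raw input is too long
print(len(prehash(long_password)))                                  # fixed-size pre-hash
```

A 32-byte digest always encodes to 44 base64 characters, comfortably under the limit.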

The only concern I saw any reference to with using SHA1 in this manner, when I was looking around for advice on and problems with what I was doing, was from Thomas Pornin in the second of those two links:

Using a secure hash function to preprocess the password is secure; it can be shown that if bcrypt(SHA-256(password)) is broken, then either the password was guessed, or some security characteristic of SHA-256 has been proven false. There is no need to fiddle with the salt at that level; just hash the password, then use bcrypt on the result (with the salt, as bcrypt mandates). SHA-256 is considered to be a secure hash function.

So it's possible that holding onto SHA1 may not be a good choice; why else migrate from using it solely in the first place? That being said, it's improbable that there would be any kind of attack of practical value against bcrypt(SHA1(password)) in the near future that wouldn't involve a compromise of some sort of bcrypt itself.

John Pettit
  • "why else migrate from using it solely in the first place" well the main reason for migrating from conventional hash functions to deliberately slow ones like bcrypt is to slow down brute force/dictionary attacks if someone compromises your password DB. – Peter Green Oct 23 '15 at 13:40
  • It's not just that, though, of course. Bcrypt uses cryptographic primitives for the purpose of storing password hashes, whereas message digests (or random oracles) are cryptographic primitives designed for very specific tasks. SHA1 has known indirect compression function weaknesses resulting in [this](https://www.schneier.com/blog/archives/2015/10/sha-1_freestart.html). Using it to transform input to bcrypt by itself is not really worth the CPU time if it's not to keep the length of the input within the bounds of what bcrypt will process, and for that you might as well use SHA256. – John Pettit Oct 23 '15 at 13:59
0

Initial idea

bcrypt(sha1(password));

Problem #1 - Null Termination Problem

The first reason you don't want to do that is because SHA, or any hashing algorithm, puts out bytes. And many programming languages do not have proper String types; and instead simulate strings with a series of characters followed by a null (i.e. \0) terminator. If your hash digest contains a null, the bcrypt algorithm might see the \0 character, and assume that's the end of the string:

  • bcrypt(sha1("fsdf3hgfh2faff32f"))
  • bcrypt(96 87 0f 9e 71 ff 62 57 55 00 b6 5c 91 07 64 6f b5 81 13 a9)

And with C and PHP, if you blindly treated the digest as a "string", then your "string" would look like:

  • bcrypt("–‡žqÿbWU\0¶\‘doµ©")

causing some bcrypt implementations to cut off at the \0 null terminator:

  • bcrypt("–‡žqÿbWU")

This is known as the null termination problem.

Solution

Your implementation may be immune to this; or it may not. So let's not tempt fate. You can pre-hash the password, but be sure to base-64 encode the digest first:

  • bcrypt(base64(sha1("fsdf3hgfh2faff32f")))
  • bcrypt(base64(96 87 0f 9e 71 ff 62 57 55 00 b6 5c 91 07 64 6f b5 81 13 a9))
  • bcrypt("locPnnH/YldVALZckQdkb7WBE6k=")
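The effect can be reproduced in a few lines of Python using the digest bytes from the example above. This is only an illustration: `c_string_view` is a made-up helper that mimics what an API expecting a NUL-terminated string would see, and no real bcrypt library is involved.

```python
import base64

# Digest bytes copied from the example above; note the 00 byte in position 10.
digest = bytes.fromhex("96870f9e71ff62575500b65c9107646fb58113a9")


def c_string_view(data: bytes) -> bytes:
    """Hypothetical helper: everything up to (not including) the first NUL byte."""
    return data.split(b"\x00", 1)[0]


encoded = base64.b64encode(digest)

print(len(digest), len(c_string_view(digest)))    # raw digest loses everything after the NUL
print(encoded.decode("ascii"))                    # base64 text contains no NUL byte
print(len(encoded), len(c_string_view(encoded)))  # so nothing is cut off
```

Only 9 of the 20 digest bytes survive in the raw case, which leaves far fewer than 160 bits for bcrypt to work with; the base64 form passes through intact.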

Problem 2 - Hash Shucking

The next issue has to do with dictionary attacks.

An attacker isn't going to bruteforce every possible password:

  • aaaaaaaa
  • aaaaaaab
  • aaaaaaac
  • ...

Instead they're going to use dictionaries, previous password breaches, and passwords that follow the rules that certain stupid corporations insist upon (e.g. password complexity policies).

  • hunter2
  • password
  • Tr0ub4dor&3
  • 12345
  • qazxsw
  • zxcvbn

The whole point of bcrypt is that it is still hard to brute-force all these dictionary words. But the fact remains that there are still these lists, and it can dramatically shorten the search space.

But imagine there was a password database breach, and fortunately (for the attacker) the web-site used SHA-1 to store all their passwords, and one of the breached SHA-1 hashes was:

  • 96 87 0f 9e 71 ff 62 57 55 00 b6 5c 91 07 64 6f b5 81 13 a9

They don't know what the original password is, but at least it's something they can add to their dictionary list. And if your web-site does pre-hash with SHA-1, then suddenly they can try:

  • bcrypt(base64(96 87 0f 9e 71 ff 62 57 55 00 b6 5c 91 07 64 6f b5 81 13 a9))

If it matches, it means that they have the SHA-1 hash of someone's password. And since SHA-1 is so easy to compute in hardware, they now have an SHA-1 hash they can try to bruteforce.

This problem is known as 'Hash Shucking'.

Solution

What you want to do is be sure to salt the password hash:

  • bcrypt(base64(sha1(password+salt)))

This way the "password hash" will never appear in any other global database.
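As a sketch of that salted pre-hash in Python (the `prehash` helper, the 16-byte salt size, and the fixed demo salts are assumptions for illustration; the result is what would then be handed to bcrypt):

```python
import base64
import hashlib


def prehash(password: str, salt: bytes) -> bytes:
    """base64(sha1(password + salt)): the value that would be passed on to bcrypt."""
    digest = hashlib.sha1(password.encode("utf-8") + salt).digest()
    return base64.b64encode(digest)


# In practice each user would get a random salt (e.g. os.urandom(16));
# fixed values are used here only to keep the demo deterministic.
salt_a = b"\x01" * 16
salt_b = b"\x02" * 16

unsalted = base64.b64encode(hashlib.sha1(b"hunter2").digest())

print(prehash("hunter2", salt_a) != unsalted)                    # differs from a breached bare SHA-1
print(prehash("hunter2", salt_a) != prehash("hunter2", salt_b))  # and differs per user
```

A bare SHA-1 hash taken from some other site's breach is therefore no longer a useful candidate input.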

Ian Boyd