1

I am not a security expert but I am trying to get my head around the current best practices for adopting a security policy for a website that will contain sensitive information.

In terms of someone trying to access user accounts through the front door/login screen I am happy that I am familiar with enough information to make this a futile exercise.

However, I also want to guard against the possibility that the database could be hacked, leaving a third party in possession of all the encrypted passwords.

It is at this point that I can't see any real advantage in using any of the MD5, MD5+Salt, or AES encryption methods: if the database is hacked, all the keys needed to test a password against any stored hash are retained in that same database.

Would I not be more secure if I took an encryption method such as MD5 and used it to encrypt the user's password, then took the encrypted string and, in code, selected say the first n characters of the hash, passed those through MD5 again, and stored that second hash in the database?

i.e. take the password "password"

The MD5 generated hash would be 5f4dcc3b5aa765d61d8327deb882cf99

I take for example the first 14 characters 5f4dcc3b5aa765 and run that through the MD5 encryption to get a new Hash 316e881b8519579949391d6f8424decb which I store in the database.
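The scheme described above can be sketched roughly as follows. This is only an illustration of the idea in the question, not a recommendation; the function names `md5_hex` and `truncated_double_md5` and the `cutoff` parameter are made up for this sketch.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of a UTF-8 string as 32 hex characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def truncated_double_md5(password: str, cutoff: int = 14) -> str:
    """Hash the password, keep the first `cutoff` hex characters, hash again."""
    first_hash = md5_hex(password)
    return md5_hex(first_hash[:cutoff])


first = md5_hex("password")
print(first)       # 5f4dcc3b5aa765d61d8327deb882cf99
print(first[:14])  # 5f4dcc3b5aa765
print(truncated_double_md5("password"))  # the value that would be stored
```

Note that the only "secret" here is the integer `cutoff`, which can take at most 32 values for an MD5 hex digest.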

Only the code would know how many characters were being used to re-encrypt the first hash code. Any hacker who is not in possession of both the database data and the source code would not be able to compare hash codes using a dictionary rules set even though they may have correctly identified that MD5 encryption is being used.

All the encryption methods I have researched so far seem to store the encryption keys in the database which doesn't seem to give much protection from a database dump hack.

It is quite possible that I am being naive here and don't understand the first thing about security, but any guidance from people who do would be appreciated.

S.L. Barth
Martin C
  • 2
    See also: http://security.stackexchange.com/questions/211/how-to-securely-hash-passwords – ilkkachu Aug 10 '16 at 11:50
  • 1
    Also consider that even if the system you use isn't fully known, if someone even gets a hint at what you are doing, they can easily go through all the possible cutoff values (there are only a couple of dozen). – ilkkachu Aug 10 '16 at 11:54
  • 1
    I really hope they don't store "encryption keys" in the database - they should be using a one-way hash, where the protection comes from it being hard to reverse, and slow to calculate. It doesn't matter if the salt is there, since it still takes a long time to try large numbers of passwords. – Matthew Aug 10 '16 at 12:43
  • 4
    You go back and forth between *encryption* and *hashing*. To be clear they are different things. Encryption assumes you can decrypt your string with a given key, whereas hashing (which is what you're talking about) is intended to never be decrypted. – Novocaine Aug 10 '16 at 16:18
  • You're right that plain MD5 is a poor choice for password hashing these days - but it wasn't designed for passwords in the first place! The recommended practice these days is to use bcrypt, or another algorithm which is specifically designed for passwords. – Soron Aug 10 '16 at 17:28
  • Also, if you don't know what you're doing, **do not** use AES to encrypt passwords - with passwords specifically, it's very easy to accidentally make something that's only marginally more secure than plaintext. There are ways to use AES securely with passwords, but they are **not** the most obvious way. I've seen this at a former workplace, and HR freaked out when I demonstrated that I could trivially decrypt all passwords. – Soron Aug 10 '16 at 17:33
  • If you are storing personal data be sure to store that data **encrypted** into the database... – Bakuriu Aug 10 '16 at 18:43
  • This comes up like every other day on security.SE. The answer is always the same: **Never create your own password hashing algorithm.** You are **not** smarter than the experts who created existing, proper algorithms for this, and you'll end up with **far worse** security if you think you are. Use PBKDF2, bcrypt, or scrypt. With a significant work factor / iteration count. And a salt. – marcelm Aug 10 '16 at 19:17
  • @EthanKaminski that's interesting regarding AES for passwords; do you happen to have a reference for further reading? – Blorgbeard Aug 10 '16 at 22:26
  • @Blorgbeard - this answer goes into more detail about why naively storing passwords with AES is a Bad Idea: http://security.stackexchange.com/a/10496/80588 The basic issue is: if an attacker gets your DB + code, then if you're using MD5, weak passwords become known (because hashes are cracked / broken); if you're using AES, *all* passwords become known, even very strong ones (because ciphertext is decrypted). In order to safely use AES with passwords, you'd need to use a wrapper algorithm similar to PBKDF2 or bcrypt (also look up "key derivation function"). – Soron Aug 10 '16 at 22:49
  • @EthanKaminski ah, I see. I know just encrypting passwords is a bad idea in general, I thought maybe there was a trick with short cleartext and certain AES modes or something. – Blorgbeard Aug 10 '16 at 22:56
  • @Blorgbeard - nah, it's nothing specific to AES (as opposed to other symmetric ciphers), nor is it specific to the length of the password. It's just that "reversible" is exactly what you don't want when storing passwords. – Soron Aug 10 '16 at 22:59

3 Answers

15

Only the code would know how many characters were being used to re-encrypt the first hash code.

And this is where the problem is. Right at the start, only someone with access to the code knows this. But how long can you keep it a secret?
You may inadvertently publish your code on GitHub, or talk about it with friends and be overheard... as soon as the secret is out, the extra value goes away. You then need to switch to a different method, which implies asking all users to enter their passwords once again.

It's called Kerckhoffs's principle:

A cryptosystem should be secure even if everything about the system, except the key, is public knowledge.

The only things that should need secrecy are the things you can change quickly, so that you can respond adequately when the secret leaks.

That being said, MD5 is outdated; consider using bcrypt for hashing instead.

S.L. Barth
  • 5
    +1 for pointing out that MD5 is outdated (i.e. collisions exist) – Verbal Kint Aug 10 '16 at 12:02
  • 8
    @VerbalKint Collisions exist with *all* hash algorithms, that's their nature. (But yes, MD5 is not good anymore) – deviantfan Aug 10 '16 at 14:55
  • 1
    That's on top of the fact that look-up tables readily exist for MD5 too, further reducing its effectiveness (even when rehashed like suggested). – Jeremy Kato Aug 10 '16 at 17:07
  • 4
    @VerbalKint MD5 is bad because it's *fast*, not because of collisions. Collisions don't matter for password hashing; they matter for integrity. (I can create multiple messages that have the same hash, allowing me to claim I sent one when I really sent the other.) It's pre-image vulnerabilities that matter for passwords (the ability to determine what the original message was). See [this post](http://security.stackexchange.com/a/31846/46979) (especially the "Collisions and MD5" section). – jpmc26 Aug 10 '16 at 17:56
  • 2
    @JeremyKato look-up tables can readily be created for _any_ hash, be it MD5, PBKDF2, or bcrypt. The existence of these tables has nothing to do with MD5; the way to make those tables useless is by using a salt. Which you should do whatever your hashing algorithm is. – marcelm Aug 10 '16 at 19:12
  • Very true @jpmc26! I actually read a few articles discussing the disadvantages/advantages for using hash algorithms and realized my mistake. Now, even though an algorithm is fast, I would think this could be remedied by more iterations. Obviously this is incorrect, as every source I have found points to MD5 being outdated, but what is the relationship between the number of iterations and the security of the hash? – Verbal Kint Aug 10 '16 at 19:17
  • 3
    @VerbalKint Actually, an iterated MD5-based hash, with salt, could potentially be quite secure. But still, using a hash that is severely broken for many uses is horrible security hygiene, especially when there are excellent alternatives you could use at no cost :) – marcelm Aug 10 '16 at 19:28
  • 1
    @marcelm That's a good point too, though at least bcrypt has salting built in; it'd almost certainly be pointless to bother tabling it. – Jeremy Kato Aug 10 '16 at 19:31
  • 1
    @VerbalKint Even barring the issue of "security hygiene," MD5 is still going to be faster on common hardware like GPUs than algorithms designed to resist them better from the get go. bcrypt has issues there as well (hence why people were interested in scrypt), but they're not as bad, from what I understand. – jpmc26 Aug 10 '16 at 21:20
6

Note that algorithms that are used for stored password validation against a typed password should be hashing algorithms, not encryption algorithms.

The former is one way.

i.e. foo -> acbd18db4cc2f85cedef654fccc4a4d8 cannot be reversed, because it is possible for acbd18db4cc2f85cedef654fccc4a4d8 to be the hash of other values too.

The latter is two way.

foo -password-> cC0rVBKPXwDeUuXjyrZjLQ== -password-> foo

There is only one plaintext, provided the same key is used.

Now back to hashing...

The point of a password hash is not to obscure the password. The point is to make lookups one way. However, the hash is still vulnerable to password guessing: an attacker could run the password foo through the same algorithm offline to find out whether it hashes to acbd18db4cc2f85cedef654fccc4a4d8. If so, they can try an online login.
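As a rough illustration of that offline guessing, here is a minimal sketch against an unsalted MD5 hash. The word list and the function name `crack_unsalted_md5` are assumptions for the example; a real attacker would use a far larger dictionary and much faster tooling.

```python
import hashlib


def crack_unsalted_md5(target_hash, candidates):
    """Return the first candidate whose MD5 hex digest equals target_hash."""
    for guess in candidates:
        if hashlib.md5(guess.encode("utf-8")).hexdigest() == target_hash:
            return guess
    return None  # no candidate matched


# Hypothetical tiny word list; the leaked hash is the one from the text above.
wordlist = ["password", "letmein", "foo", "qwerty"]
leaked = "acbd18db4cc2f85cedef654fccc4a4d8"
print(crack_unsalted_md5(leaked, wordlist))  # foo
```

Every guess costs a single MD5 computation, which is why a fast hash offers so little resistance here.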

Because an attacker can do this reasonably fast, you use bcrypt so that it takes, say, 10,000 times longer to test each password guess.
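A minimal sketch of such a deliberately slow, salted hash is shown below. It uses PBKDF2 rather than bcrypt, purely because PBKDF2 ships in Python's standard library; the iteration count, salt length, and function names are assumptions for illustration, not tuned recommendations.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 200_000  # assumed work factor; raise or lower to suit your hardware


def hash_password(password: str, iterations: int = ITERATIONS) -> Tuple[bytes, bytes, int]:
    """Return (salt, digest, iterations) for storage next to the user record."""
    salt = os.urandom(16)  # fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest, iterations


def verify_password(password: str, salt: bytes, expected: bytes, iterations: int) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)


salt, digest, rounds = hash_password("correct horse")
print(verify_password("correct horse", salt, digest, rounds))  # True
print(verify_password("wrong guess", salt, digest, rounds))    # False
```

The salt and iteration count sit in the database in plain view. The protection comes from each guess costing many hash computations, not from hiding anything.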

This is what stops an attacker, not security through obscurity by obfuscating the hash calculation. Chances are, a skilled attacker would be able to gain the password algorithm somehow - either by breaking into your code base, or finding out if it has leaked anywhere (e.g. in an open source system you have no chance). At worst, an attacker could register for multiple accounts on your system and observe how you hash their password. Sooner or later they will figure it out, and chances are this will be much sooner than they can run their password guesses through a secure algorithm such as bcrypt.
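To make the point about figuring out the scheme concrete, here is a sketch of how an attacker who knows one password (their own account) and sees its stored hash could recover the secret cutoff. It assumes the attacker has already guessed the general shape of the scheme (MD5, truncate, MD5 again); the function names are hypothetical.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of a UTF-8 string as 32 hex characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def recover_cutoff(known_password: str, stored_hash: str):
    """Try every possible truncation length until the stored hash is reproduced."""
    first_hash = md5_hex(known_password)
    for cutoff in range(1, len(first_hash) + 1):  # at most 32 attempts
        if md5_hex(first_hash[:cutoff]) == stored_hash:
            return cutoff
    return None  # the scheme has some other shape


# Simulate what the site would store for the attacker's own account.
stored = md5_hex(md5_hex("password")[:14])
print(recover_cutoff("password", stored))  # 14
```

At most 32 hash computations recover the value that the whole scheme depends on keeping secret.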

SilverlightFox
  • 1
    "Sooner or later they will figure it out, and chances are this will be much sooner than they can run their password guesses through a secure algorithm such as bcrypt." Should be bolded and italicized. =) – jpmc26 Aug 10 '16 at 18:52
3

Using a standard method of the form, say, CrypticHash(password, salt, rounds/cost) is a much safer approach for numerous reasons, even if your source code and database were leaked.

In your approach it would be quite easy to attack. All I would have to do, once your code/database is leaked, is brute-force over the hexadecimal character set (16 characters) at length 14 until I find "5f4dcc3b5aa765" as a match. That might take some time, but mind you, MD5 is quite fast to compute, which makes this possible. Then all I have to do is find ANY input whose MD5 hash starts with "5f4dcc3b5aa765" (that is, 5f4dcc3b5aa765xxxxxxxxxxxxxxxxxx), and this would be even easier to find with rainbow tables.
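The first step of that attack can be sketched on a scaled-down case. The example below assumes a cutoff of 4 hex characters instead of 14 so that it finishes instantly; the function names and the reduced cutoff are assumptions made for the demonstration.

```python
import hashlib
from itertools import product

HEX_DIGITS = "0123456789abcdef"


def md5_hex(text: str) -> str:
    """Return the MD5 digest of a UTF-8 string as 32 hex characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def brute_force_prefix(stored_hash: str, cutoff: int):
    """Enumerate every hex string of length `cutoff` until one hashes to stored_hash."""
    for combo in product(HEX_DIGITS, repeat=cutoff):
        candidate = "".join(combo)
        if md5_hex(candidate) == stored_hash:
            return candidate
    return None


# Scaled-down demo: 16**4 = 65,536 candidates instead of 16**14.
demo_cutoff = 4
stored = md5_hex(md5_hex("password")[:demo_cutoff])
print(brute_force_prefix(stored, demo_cutoff))  # 5f4d

# Size of the search space for the 14-character case from the question.
print(16 ** 14)  # 72057594037927936
```

The full 14-character case is 16^14 = 2^56 candidates, which is large but bounded and identical for every user, regardless of how strong the original password was.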

James