31

This isn't really a practical question, but more a question of curiosity. I heard a CS professor recommend stepping up from md5ing passwords not to SHA1, but to AES encrypting the password using itself as the key. Does anybody have any insight as to whether this is actually more or less secure?

Some thoughts on the matter:

  • SHA1 is one-way, which should make it more secure than a function that preserves the data.
  • On the other hand, the hacker can't possibly take advantage of AES' decryptable nature because he doesn't yet know the password.
  • Unless he guesses it. But then he would be in no matter how I encrypted it.
  • Since you know the AES'd password is decrypted when the decryption key matches the output, it could be bruteforced on the hacker's computer.
  • But the hacker doesn't know how it's encrypted, so he wouldn't be trying that.
  • But the hacker could have got the encryption code as well, in which case it's simply a question of which data takes longer to brute force.
  • Most websites use md5 or sha1 (right?), so lookup tables for these hashes would be far more widespread than for the AES method. Thus making the AES method more secure.
  • But if we salt both methods, they would become equally immune to lookup tables.

Some common points in the answers, and counter-points:

AES encryption is not designed to be collision resistant, and thus will probably have more collisions for the same hash string length.

If we're using a longer salt, then the encrypted password will be longer than a SHA1 hash. This may be enough to bring the collisions down to a comparable level - or maybe not.

AES encryption tells you how long the password was, within 16 bytes.

We can add to the AES method: after encrypting it, we pad or trim to a reasonable length (which is longer than SHA1's so as to avoid collisions).

Thomas Pornin
  • 320,799
  • 57
  • 780
  • 949
  • 3
    I don't think if the function is one-way it is more secure than a non-one-way function. They are two different things.. –  Jan 04 '12 at 08:38
  • Well, I figured it was only less secure if there was some way of guessing or brute forcing the decryption key, but bullet 5 eliminates that ... –  Jan 04 '12 at 08:41
  • 6
    I wouldn't consider things being 'unexpected' to carry any weight in a 'how secure is this' question. I also much prefer to leave crypto to the people who understand it - it's way too easy to make mistakes –  Jan 04 '12 at 08:47
  • Heh ... I hope there are at least a few people who don't think that way about crypto. Also, I think I was unclear in the last two bullets - question edited. –  Jan 04 '12 at 08:55
  • 1
    The hash function itself doesn't matter much as long as it's decent. It's much more important to use it correctly(a salt, and a slow scheme such as PBKDF2) – CodesInChaos Jan 08 '12 at 20:42
  • How will you pad it to be a min size? If with 0s then it's obvious to the hacker. How could you pad it in a way that it's not reversably apparent what is padding and what is encryption? – Alexander Bird Jan 12 '12 at 01:33
  • @Thr4wn, since you're not trying to reverse it, you could pad it with the password also. – Marcus Adams May 07 '12 at 18:32
  • "you know the AES'd password is decrypted when the decryption key matches the output" -- this seems to assume that, in such a setup, the true password is the only fixed-point of AES. Why shouldn't there be another key candidate x that, when used, causes AES to spit out x? Unlikely, probably, but impossible? – Raphael Jun 26 '17 at 09:23

7 Answers7

33

Short answer: bad idea, don't do it.


Longer answer: the point of the exercise is to store something which allows the server to verify a given password, but does not allow it to rebuild the password; the latter property is desirable, so that consequences of an illegitimate read access to the server database by an attacker remain limited.

So we want a one-way deterministic transform which converts a given password into the verification value. The transform shall be:

  • configurably slow, so as to thwart dictionary attacks;
  • distinct for every instance, to prevent parallel dictionary attacks, including precomputed tables (that's what salts are about).

A single invocation of MD5 or SHA-1 fails on both accounts, since these functions are very fast, and not salted. Slowness can be done through nesting many invocations (thousands, possibly millions), and the salt can be injected "somewhere", although there are good and bad ways to do it. PBKDF2 is a standard transform which does just that, and it is not bad at it (although bcrypt should be somewhat preferable).

However, MD5 and SHA-1 do at least one thing right: they were designed to be one-way. That's hard to do; it is not even theoretically proven that one-way functions can really exist at all. Cryptographers around the world are currently involved in a competition to design a new, better one-way hash function.

So what your professor seems to recommend is to replace a function designed for one-wayness, with a function which was not designed for one-wayness. It does not correct anything about slowness and salting, but it removes the one good thing about MD5 and SHA-1. Moreover, AES is known to be relatively weak with regards to related-key attacks -- that's not a problem as long as AES is used for what it was meant to, i.e. encryption, but it becomes an important issue when it is subverted into a building block for a one-way hash function. It seems possible to build a secure hash function by reusing parts of the AES design, but it requires substantial reworking (see for instance Whirlpool and ECHO).

So do not use a homemade AES-based password hashing scheme; for that matter, do not use anything which is "homemade": that's a recipe for disaster. Security cannot be tested; the only known way to make a secure algorithm is to have hundreds of trained cryptographers look at it closely for a few years. You cannot do that by yourself. Even a lone cryptographer will not risk that.

Instead, use bcrypt. PBKDF2 is not bad either.

Thomas Pornin
  • 320,799
  • 57
  • 780
  • 949
  • 1
    Using AES to encrypt a password with the password as the key seems as much a one way function as using SHA1 to hash it. The only difference is that it was not designed for it like you said. – Gudradain Dec 14 '15 at 14:41
  • I agree with Gudradain. More so, bcrypt, for example, *was* designed to be one-way and is based on encrypting a known text with the password (in multiple rounds). – Volker Dec 14 '15 at 15:57
18

Most websites used md5 or sha1 (right?), so this is what a hacker would be expecting. Thus making the AES method more secure.

Even if most websites use MD5 or SHA1, switching to AES just for the reason that it's usually not used there wouldn't make it any more secure - that's just security by obscurity, which usually doesn't effectively contribute to increase security; it will be far more secure to use a good algorithm (which is well-tested for the given purpose) and document it publicly, than use an algorithm of which you can't be sure if it's good, and not say that you're using it.

At the moment I can think of the following possible disadvantages when using AES or any other encryption algorithm like this:

  1. Encryption algorithms are usually not designed for avoiding collisions: As pointed out in the comments below, it might not be a problem because of the expected pseudo-random behavior of block cipher. But still, you are using an encryption algorithm for something it wasn't designed for.
  2. It is imaginable that there are attacks on encryption algorithms which are far easier to do if one knows that clear text and password are actually one and the same (should the fact about the algorithm you're using somehow leak, which is not that unlikely given that it's e.g. discussed here).
  3. Encryption algorithms produce data which in many cases varies with the length of the cleartext. That means, the ciphertext holds information on how long the password is. In the case of AES this is a multiple of 16 bytes as far as I can tell; hash values, on the other hand, have a fixed size (e.g. 160 bit / 20 bytes for SHA1); meaning that a potential hacker has a chance of getting more information out of a encrypted text than out of a hash value.
  4. In cryptographics it's usually always less secure too cook up something by yourself (which is subsequently probably also only used by you) than using something which is well tested and out in the community for a long time already (meaning it had time to be thoroughly inspected by a number of cryptanalysts).
  5. You want to make sure your password hashing algorithm is slow, so that potential attackers are hindered to brute-force their way into your system. One way to achieve this is for example to iteratively perform the hashing in multiple rounds, as for example PKBDF2 does. For more details, see http://security.blogoverflow.com/2013/09/about-secure-password-hashing/
codeling
  • 297
  • 1
  • 6
  • You make very good points. However, this makes me wonder - if the average password is >20 characters then the AES method would, on average, produce longer encrypted passwords. Would this be enough to counteract your first point? (the average user password is certainly less than 20 characters, but let's assume we're salting it, or something else, for a moment) –  Jan 04 '12 at 09:10
  • As for your edit, see bullet 6. Even if the hacker knows exactly how the password was encoded, I'm pretty sure they would still have to brute force it. –  Jan 04 '12 at 09:15
  • @skier88: as for the length: might be a theoretical point since it's rather hard to brute-force passwords of lengths over 16 bytes anyway, but it's certainly an advantage to know that the password is between 16 and 32 bytes long than to know nothing about the password at all –  Jan 04 '12 at 09:21
  • Actually, I just realized this. You could easily pad or trim the resulting AES - they would know what you've done if they got your code, but they still wouldn't know how long the password was. But I'm guessing it would still be more prone to collisions. –  Jan 04 '12 at 09:47
  • Collisions are only a problem if the attacker can choose what to hash. For password, they should be a non-issue. However, all the other reasons still stand :) –  Jan 04 '12 at 09:52
  • @Kosta: true, an attacker can't chose what he wants to hash. But the possibility of more collisions than with a hash algorithm still exists - which could make a brute-force attack much easier, since the first password producing the desired "encrypted hash" is just fine for the attacker! –  Jan 04 '12 at 09:57
  • 4
    I think point 1. is not a problem here. If such collisions would appear (more often than the theoretical bound), it would mean the block cipher doesn't have the expected pseudo-random behavior, rendering it effectively broken by modern standards. So this shouldn't happen to any currently used block cipher, in particular AES. –  Jan 04 '12 at 10:34
  • -1 for recommending keccak for password hashing. You are also forgetting that password need to be hashed several rounds over and over again and you also fail to mention salts should be unique. – Lucas Kauffman Aug 17 '13 at 12:58
  • 1
    @LucasKauffman right, recommending a just-devised hashing method actually goes against my other advice of using well-tested algorithms - or are there specific other concerns against using keccak? – codeling Aug 17 '13 at 13:33
  • 1
    Speed, keccak is designed to be very fast, which you want to avoid for password hashing. – Lucas Kauffman Aug 17 '13 at 13:37
  • What @LucasKauffman is really saying is that neither your suggestions (SHA2 or SHA3 (Keccack)) are correct. Both are fast, and password hashing should be slow. See [Thomas Pornin's answer](http://security.stackexchange.com/a/10496/12) for the correct options. – Xander Dec 14 '15 at 14:07
7

One immediate problem I see with this approach is the fixed key length of AES. Using AES-128, one would be restricted to 16 bytes of the password. What about longer passwords, or, long passphrases?

To accommodate for any password length you would need a hash function, so back to the initial point.

Note that there are well-studied and secure* ways of turning a block-cipher into a hash function, so called PGV modes like Davies-Meyer, Matyas-Meyer-Oseas or Miyaguchi-Preneel.

(*) For an appropriate definition of "secure".

Kris
  • 264
  • 1
  • 2
  • I hadn't thought of that, but would it really make it less secure? And would you really need to hash it and not just truncate it? –  Jan 04 '12 at 10:02
  • 1
    Yes, if you truncate, you loose part of the strength of the password. E.g. brute force dictionary search attacker *knows* your password is at most 16 characters. –  Jan 04 '12 at 10:15
  • Right, but 16 hashed characters won't be any more secure ... I think. (and we're talking trunctating just for the key, not what it hashes, which is the original password + a salt) –  Jan 04 '12 at 10:19
  • Yes, the difference is only for passwords longer than 16 characters. Yes, hashing shorter passwords doesn't make it any more secure. –  Jan 04 '12 at 10:25
5

However, consider this:

I break into your site and get enough access to notice that you have AES keys configured for your system, only thing that looks "encrypted" is yours passwords... guess what I'm going to do.

I would keep them hashed, leaves a lot less room for easily scraping your entire password database.

On top of that, as a site owner, I DON'T want the ability to decrypt my user's passwords in the first place, otherwise the option to "recover passwords" becomes a possible feature that higher ups can ask for, and technically, we can do it because the underlying system is built using an encryption system instead of a hashing system (a lame reason but software developers deal with requests like these).

Mainly: When security is concerned, don't try to do something wacky and out of the box, you'll likely write yourself a huge security hole because you wont be able to put the amount of research a team of people working on an algorithm for a specific purpose would have to.

StrangeWill
  • 1,603
  • 8
  • 13
4

The correct answer is: It does not really matter at all.

In the case of passwords, the only practical attack is to try out a list of passwords until you find the right one by brute force. No one will try to attack SHA1 nor AES directly to do that. Even in the current point of research, attacking SHA1 would be ten thousand times easier than AES, both are still completely impossible (for reversing password hashes because collisions do not matter here). And as a researcher finds another way to attack the other algorithm, the odds might be reversed next year.

What might matter is how fast you derive your hash / encrypt your password. But if e.g. SHA1 is faster and you want it to be slower (to fend off brute-force attacks), you can just hash multiple times.

What a hacker "expects" is not that relevant, making something "unexpected" is just obscurity. It's like saying "I reverse all the password letters, now it's more secure". It won't be. If the hacker can access your password database, how can you be sure he doesn't have access to your password hash derivation function?

One flaw when "encrypting a password with itself": The length of the ciphertext might give the attacker a clue about the length of the password. That's awful.

In any case, you should salt the password, preferably with something unique to the user.

  • 1
    I can see you put a lot of effort into that answer because you haven't seen my edits yet ;) . But seriously, thanks, you make good points. See the comments on nyarlathotep's answer for a sort of counterpoint, though I think that the combination of collisions and common lengths will settle it. –  Jan 04 '12 at 09:28
  • Don't know why I typed "common lengths". Make that indicative lengths. –  Jan 04 '12 at 09:34
  • While I typed my answer, the internet went down and it took a couple of minutes to get up again. I should have re-read the question maybe :) –  Jan 04 '12 at 09:50
4

There are MACs (Message Authentication Codes) constructed from block ciphers; the best known is probably CBC-MAC. It has issues compared to a hash-mac, notably:

Given a secure block cipher, CBC-MAC is secure for fixed-length messages. However, by itself, it is not secure for variable-length messages. An attacker who knows the correct message-tag (i.e. CBC-MAC) pairs (m, t) and (m', t') can generate a third message m'' whose CBC-MAC will also be t'.

[...]

This problem cannot be solved by adding a message-size block (e.g., with Merkle-Damgård strengthening) and thus it is recommended to use a different mode of operation, for example, CMAC to protect integrity of variable-length messages.

The shorter answer: The security of a MAC built from a block cipher depends on how you construct it. A naive construction will almost certainly have significant flaws.

Nick Johnson
  • 201
  • 1
  • 4
  • Thank you for the reply, but I'm a little lost (as you can probably tell, I don't know much about cryptography). Why are MACs generated? And how do they affect the process of comparing an input password to stored cyphertext? –  Jan 05 '12 at 01:37
  • @skier88 A MAC is a Message Authentication Code, and it's a common way to use a hash. Hashing passwords is slightly different, but similar considerations apply. My main point was to illustrate that building your own cryptosystems, even when seemingly safe, can have surprising undesired side-effects - so you should stick to primitives and systems that have been shown to be secure. – Nick Johnson Jan 05 '12 at 02:50
  • 1
    Thanks, I do understand your main point, but I also want to learn a little terminology. So I understand now that MACs are an application of hashes, but, in the quotes text, what are the "tags" in the "message-tag pairs"? –  Jan 05 '12 at 03:07
  • @skier88 In this context, the 'tag' is the output of the CBC-MAC - see the Wikipedia article for more details. If it were a hashing function, it'd be the 'hash' instead of the 'tag'. – Nick Johnson Jan 05 '12 at 03:12
  • Ah, ok, thanks. So it seems CBC-MAC is not directly applicable to the question, but rather an example of why you have to be careful. Then again, that's why I posted this question - in the hopes that somebody could conclusively analyze this method. –  Jan 05 '12 at 03:28
  • @skier88 Well, CBC-MAC is one approach you might try to use a cipher as you would a hash function, and as you can see it has flaws. I would analyze your construction, but you haven't actually specified it completely: what's the mode of operation? what's the IV, if applicable? What portion of the output to you use as the hash? – Nick Johnson Jan 05 '12 at 03:44
  • Sorry, but I really don't know enough to answer that. I would try to figure it out, but I think you and others have already pretty much determined that this would be a bad idea, besides being much more complicated than simply replacing function calls. But thanks for the offer. –  Jan 05 '12 at 04:10
1

The fact that you have designed an unusual algorithm does not make it secure. "self made" algorithms are dangerous, because nobody has carried out cryptographic analysis of them.

As @Thomas Pornin wrote above, AES was not designed to be collision resistant. Besides, AES is relatively fast, which simplifies attacks.

SHA1 is designed be collision resistant, but SHA1 is also designed to be very fast, because its purpose is to generate hashes of big volumes of data or files in a short time. Its purpose is not to prevent attacks on hashed passwords.

If you want to generate a password hash that is resistant to different kinds of attacks, consider password stretching. Depending on your technology stack and available libraries consider bcrypt, scrypt, argon2.

mentallurg
  • 8,536
  • 4
  • 26
  • 41