Password Storage - Self Encryption vs Hashing?

Question

Self Encryption: Encrypting a password using the password itself (as a symmetric key). Basically, by doing this, I'll get random-looking data as output. Now, in order to retrieve the password from this encrypted data, I must know the key. That is, I must know the password itself. Doesn't this property make it a kind of one-way function?

I know that Hashing is the recommended and preferred way for storing passwords in a database (since it is one way). I also know that we should NEVER come up with our own crypto. However, I'm just curious to know how effective (in terms of security) will this Self Encryption be (if used instead of hashing) for storing passwords in a database?

By the way, I was not able to find much detail regarding this on the internet. I came up with this concept while brainstorming, and while searching for it I came across this answer by Tom Leek (where he calls this technique Self Encryption).

The answer to your question is in Tom's answer (the one you linked). A secure password storage scheme must be **slow** and **salted**. Your encryption scheme is neither. Therefore, it's not secure. — Adi, Nov 27 '14 at 11:28
As linked to by [Nefrubyr](http://security.stackexchange.com/users/5401/nefrubyr), it sounds like you've just [basically reinvented crypt(3)](https://en.wikipedia.org/wiki/Crypt_%28C%29#Traditional_DES-based_scheme). — user, Nov 27 '14 at 13:42
@Adnan: So basically, if I just add a SALT and some ITERATIONS to my scheme, will it be secure? — Rahil Arora, Nov 27 '14 at 18:21
Try encrypting the password with a simple XOR algorithm and the same password as the key. It turns out to be a magnificent one-way function! You can even apply multiple iterations to make it extra safe. =] — Marcks Thomas, Nov 27 '14 at 22:54

score 14 · Answer 1 · answered Nov 27 '14 at 07:54

Very interesting thought.

But we have one problem here: regular hashes always have the same size, whereas your hash will have a different size depending on the input.

So it can be considered less secure than a regular hash.

Hash function definition:

A hash function is any function that can be used to map digital data of arbitrary size to digital data of fixed size (...)

Lucas NN

The original Unix crypt implementation got around the problem of variable length by using the password as a key to encrypt a fixed block of data; see http://en.wikipedia.org/wiki/Crypt_(C) – Nefrubyr Nov 27 '14 at 09:52

score 14 · Answer 2 · answered Nov 27 '14 at 09:47

The property people look for when storing passwords is that guessing the original text should be incredibly tedious and slow, while a single verification in software should still be relatively fast.

You also don't want two users using the same password to generate the same ciphertext. To cope with this you will need either a salt in case of hashing or an IV in case of self encryption.

As Herr K has already mentioned, to encrypt the password with itself you first need to derive a key from it. What you normally use for that is PBKDF2, which is also used for password hashing. So by doing the self encryption you basically build another layer on top of PBKDF2.

Does that add any security? Not really. There are better ways, like increasing rounds in the PBKDF2 function rather than building a self encrypting layer on top.
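To make the "salt plus more rounds" point concrete, here is a minimal sketch using `hashlib.pbkdf2_hmac` from Python's standard library. The helper names `hash_password` and `verify_password` and the iteration count are assumptions chosen for illustration, not values taken from this answer.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # assumed work factor; raise it as far as your hardware allows

def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key); a fresh random salt is drawn when none is given."""
    if salt is None:
        salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, derived

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the derived key with the stored salt and compare in constant time."""
    _, derived = hash_password(password, salt)
    return hmac.compare_digest(derived, expected)
```

Two users with the same password get different stored values because their salts differ, and raising `ITERATIONS` makes every guess more expensive without any extra encryption layer.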

There are better functions which actually are built on top of symmetric crypto algorithms such as bcrypt. bcrypt is built on top of Blowfish. So maybe you should just go with bcrypt since it's been vetted over the years.

Note that the famous cryptographer Tom Leek once said:

Hashing is the proper framework for password storage (where, in fact, the password is not stored, only a password verification token). However, growing home-made hash functions is known to be hard; when cryptographers want to build a hash function, they take a lot of time because they need to be sure that the function is secure, and you cannot know that just by looking at it.

score 3 · Accepted Answer · edited Mar 17 '17 at 10:46

What you are doing here is in fact defining a hash function: given an encryption function Encrypt(K, M) where K is the key and M is the plaintext to encrypt, you define Hash(P) = Encrypt(P, P). So you are inventing a new cryptographic function.

First, for password hashing, not just any old hash function will do, because it needs to resist brute-force cracking attempts, where the attacker guesses what the password might be, calculates Hash(guess) and compares the result with the stored hash. Whatever hash algorithm you use, you need to beef it up to include a salt (so that attackers have to crack each hash individually rather than going for all accounts at once and making the weaker passwords fall) and make the hash function slow (because that hurts the attacker, who needs to make a lot of wrong guesses, more than the defender, who'll be mostly verifying correct attempts). See How to securely hash passwords? for a more detailed explanation.

You can build a slow, salted hash function from an ordinary hash function, with a construction like SSH(P, S) = Hash(Hash(…Hash(P + S)…)) (where P + S is concatenation). This is not necessarily the best way to do it — for example, a good password hashing function for typical uses should require a lot of memory, because servers have a lot more memory than specialized password cracking hardware. But it's a good start.
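The SSH(P, S) construction above can be sketched as follows, assuming SHA-256 as the underlying Hash. The function name `slow_salted_hash` and the default round count are hypothetical.

```python
import hashlib

def slow_salted_hash(password: bytes, salt: bytes, rounds: int = 100_000) -> bytes:
    """SSH(P, S) = Hash(Hash(...Hash(P + S)...)) with SHA-256 applied `rounds` times."""
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    digest = hashlib.sha256(password + salt).digest()  # innermost Hash(P + S)
    for _ in range(rounds - 1):
        digest = hashlib.sha256(digest).digest()       # each outer Hash(...)
    return digest
```

This is slow and salted but uses almost no memory, which is exactly the limitation mentioned above.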

The problem remains whether Hash(P) = Encrypt(P, P) is a good hash function. This is not automatic; beware that some analyses of the security of cryptographic algorithms rely on the key and the plaintext being independent.

A major difficulty is that the key size in most encryption algorithms is heavily constrained, often constant. For example, the standard encryption algorithm AES only accepts three key sizes (16, 24 or 32 bytes). If the password (plus salt) is too short, you can pad it with an invalid character, but what can you do if it's too long? The usual way to use a key based on material that's longer than the key size is… to apply a hash function (accepting an arbitrary input length) to the material.
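As a small illustration of that last point, the sketch below maps password material of any length to a fixed 16-byte value (a valid AES key length) by hashing it first. The name `key_from_password` is hypothetical, and the single fast SHA-256 call only addresses the length problem, not slowness.

```python
import hashlib

KEY_SIZE = 16  # bytes; one of the key lengths AES accepts

def key_from_password(password: str, salt: bytes) -> bytes:
    """Hash salt + password and truncate, so any input length yields a 16-byte key."""
    material = salt + password.encode("utf-8")
    return hashlib.sha256(material).digest()[:KEY_SIZE]
```

So even the self-encryption scheme ends up needing a hash function before the cipher can be keyed.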

If you want to derive a hash function from an encryption algorithm, there's actually a simpler way: instead of Hash(P) = Encrypt(P, P), define Hash(P) = Encrypt(P, 0), i.e. encrypt an all-zeros plaintext block (or some other well-known plaintext block). That's what the original Unix password hashing function did. The advantage of this approach over Encrypt(P, P) is that you can benefit from existing analyses of the algorithm: encryption algorithms are designed to be resistant to known-plaintext attacks. The limitation with the password and salt size remains.
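The two constructions can be compared side by side in a rough sketch. Python's standard library has no AES, so this uses a small XTEA-style block cipher (16-byte key, 8-byte block) as a stand-in; it is written from the published algorithm description and has not been checked against test vectors. The helper names `pad_key`, `hash_zero_block` and `hash_self` are hypothetical, and no salt or slowness is included.

```python
import struct

def xtea_encrypt_block(key: bytes, block: bytes, rounds: int = 32) -> bytes:
    """Encrypt one 8-byte block under a 16-byte key (XTEA-style toy cipher)."""
    v0, v1 = struct.unpack(">2I", block)
    k = struct.unpack(">4I", key)
    delta, mask, total = 0x9E3779B9, 0xFFFFFFFF, 0
    for _ in range(rounds):
        v0 = (v0 + ((((v1 << 4) ^ (v1 >> 5)) + v1) ^ (total + k[total & 3]))) & mask
        total = (total + delta) & mask
        v1 = (v1 + ((((v0 << 4) ^ (v0 >> 5)) + v0) ^ (total + k[(total >> 11) & 3]))) & mask
    return struct.pack(">2I", v0, v1)

def pad_key(password: bytes) -> bytes:
    """Zero-pad a password to the 16-byte key size; longer input is rejected."""
    if len(password) > 16:
        raise ValueError("password longer than the cipher's key size")
    return password.ljust(16, b"\x00")

def hash_zero_block(password: bytes) -> bytes:
    """Hash(P) = Encrypt(P, 0): encrypt a fixed all-zero block under the password."""
    return xtea_encrypt_block(pad_key(password), bytes(8))

def hash_self(password: bytes) -> bytes:
    """Hash(P) = Encrypt(P, P): encrypt the padded password under itself, block by block."""
    key = pad_key(password)
    return b"".join(xtea_encrypt_block(key, key[i:i + 8]) for i in range(0, 16, 8))
```

Both variants hit the key-size limit as soon as the password exceeds 16 bytes, and zero padding makes `b"a"` and `b"a\x00"` collide, which shows how many details a real design would still have to settle.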

I'm accepting this answer as it has very well summarized the problems with *Self Encryption*. — Rahil Arora, Dec 04 '14 at 00:03

score 2 · Answer 4 · answered Nov 27 '14 at 21:32

The idea of using the password as key in an encryption is actually quite similar to how passwords are stored.

One of the early approaches to password storage was based on using DES with the password being used as key. The salt served as plaintext and the ciphertext served as hash value.

For multiple reasons that approach is considered obsolete and insecure these days, but the good parts of that approach live on in newer password hashing designs.

Problems with the DES based approach:

Only 12 bits of salt. I have no idea why, since there doesn't seem to be any technical reason why it couldn't have used the full 64-bit block size as salt.
Only 8 characters of password. The way DES is structured, the key consists of 8 words of 7 bits each. Using an 8-byte password as DES key is simply going to ignore the most significant bit of each byte. This means 56 bits of key, and any characters beyond the first 8 were silently ignored.
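The password truncation described above can be illustrated with a short sketch. It only models which bits of the password survive (first 8 bytes, low 7 bits each); the function name `des_key_material` is hypothetical and the bit ordering is not claimed to match any real crypt(3) implementation.

```python
def des_key_material(password: bytes) -> int:
    """Pack the low 7 bits of the first 8 password bytes into a 56-bit integer."""
    data = password[:8].ljust(8, b"\x00")  # bytes beyond the eighth are dropped
    key = 0
    for byte in data:
        key = (key << 7) | (byte & 0x7F)   # the top bit of each byte is discarded
    return key
```

Under this model `b"password1"` and `b"password2"` produce the same key material, since they only differ in the ninth character.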

If we look at how MD5, SHA1, and SHA2 work, they are quite similar to a block cipher. Each block of data to be hashed is expanded in ways very similar to how an encryption key is expanded into a key schedule for a block cipher. What happens then is as follows:

A constant (called the IV) is encrypted with the first block of data used as key, then the cipher text and plain text are added together. The result of the addition is passed to the next round. The next block of data is then used as key to encrypt the output of the previous step, and again plain text and cipher text are added. This proceeds all the way until the end of data+padding.

As such both approaches will use the password as key in a block cipher, but the input to that block cipher is not the password.
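The chaining just described (encrypt the running state under each data block as key, then add plaintext and ciphertext) can be sketched with a toy cipher. This assumes an XTEA-style block cipher written from the published description and not checked against test vectors, a made-up IV constant, and a simple length-suffix padding; `toy_hash` is a hypothetical name, and its 64-bit output is far too short for real use.

```python
import struct

MASK = 0xFFFFFFFF

def xtea_encrypt_block(key: bytes, block: bytes, rounds: int = 32) -> bytes:
    """Encrypt one 8-byte block under a 16-byte key (XTEA-style toy cipher)."""
    v0, v1 = struct.unpack(">2I", block)
    k = struct.unpack(">4I", key)
    delta, total = 0x9E3779B9, 0
    for _ in range(rounds):
        v0 = (v0 + ((((v1 << 4) ^ (v1 >> 5)) + v1) ^ (total + k[total & 3]))) & MASK
        total = (total + delta) & MASK
        v1 = (v1 + ((((v0 << 4) ^ (v0 >> 5)) + v0) ^ (total + k[(total >> 11) & 3]))) & MASK
    return struct.pack(">2I", v0, v1)

def toy_hash(message: bytes, iv: bytes = bytes.fromhex("0123456789abcdef")) -> bytes:
    """Chain the cipher over 16-byte message blocks, each block acting as the key."""
    # Pad with 0x80, zeros, and the 64-bit message length up to a multiple of 16 bytes.
    padded = message + b"\x80"
    padded += bytes(-(len(padded) + 8) % 16)
    padded += struct.pack(">Q", len(message) * 8)
    state = iv
    for i in range(0, len(padded), 16):
        cipher = xtea_encrypt_block(padded[i:i + 16], state)
        s0, s1 = struct.unpack(">2I", state)
        c0, c1 = struct.unpack(">2I", cipher)
        # Add plaintext (old state) and ciphertext word by word, modulo 2**32.
        state = struct.pack(">2I", (s0 + c0) & MASK, (s1 + c1) & MASK)
    return state
```

Because the data only ever enters as key material, input of any length is accepted, which is the property the plain Encrypt(P, P) idea lacks.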

All in all there is nothing fundamentally broken about encrypting a password using itself as key in order to obtain a password hash. But the devil is in the detail, and until the details are clearly specified, no-one can say if there is a weakness in it. In particular having a salt with sufficient entropy and supporting arbitrary length passwords are two important aspects that must be supported in order to have any chance of matching the security of modern password hashes.

score 0 · Answer 5 · answered Nov 27 '14 at 07:38

When you are encrypting, you would need to generate a KEY from a passphrase, which would be your password in this case. You would be running your password through a key derivation function like PBKDF2 which would effectively come close to a hash.

So basically IMHO it's interesting but I don't think necessary in terms of security.

Herr

What the OP is saying is using the password both as the thing you're encrypting AND the KEY you use for encryption. – Nzall Nov 27 '14 at 11:06
