How to check that the user entered the correct password when decrypting a file?

Question

I'm trying to create a program for encrypting and decrypting files. I have no problems with the code; I just don't know how to save the password for each file that was encrypted with my program.

A user can encrypt a file with a password. When they later try to decrypt it with the wrong password, the program should tell them that the password is incorrect. As it stands, a wrong password just produces a "decrypted" file that is still unreadable.

Hi Dimas, did you try to google "store passwords" or in the forum here? I found many posts that will help you — camp0, Feb 08 '19 at 08:13
Welcome, Dimas! I changed the title to reflect your goal as I understand it. You mentioned "how to *save* passwords" but what you really want is to *check* passwords. The *saving* would be just one way to solve the problem. You don't need to save passwords to solve the problem, see my answer :) — Luc, Feb 08 '19 at 09:53
Please do not roll your own crypto if you do not know about padding and how to detect it. — Tobi Nary, Feb 08 '19 at 11:55

Luc · Answer 1 · 2019-02-08T14:45:48.367

You are doing it all wrong! But that's okay, amateur cryptography is where we all started and learned. Just be aware that your encryption software will not be secure unless you have it audited by an independent third party, and even then they make no guarantees. Do not use your own hobby crypto for anything important! And do not let others use it without making sure that they know it's just a toy!

Imagine you have a file that says "attack at 10:00". Encrypted, it is something like aWJiaWtzIGliIDEzOjAw. Your enemy might know that you are sending the attack message, but does not know the password. What they can do, though, is change a byte. If the message is changed to aWJiaWtzIGliIDazOjAw (the E is replaced by a), it suddenly says: "attack at 03:00".

As you see, modifying the ciphertext, even blindly, can be an advantage to the attacker (the enemy). You want to have authenticated encryption, where a modification is detected. This also solves your password problem.
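
To make the tampering concrete, here is a minimal Python sketch. It assumes a toy stream cipher built from SHA-256 in counter mode. The `keystream` and `xor_cipher` helpers are hypothetical and exist only for illustration; this is not the cipher behind the example above, and not something to use for real data. The attacker never learns the key. They only guess that bytes 10 and 11 of the message hold the hour "10".

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256(key || counter), concatenated. Illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


key = b"secret the attacker does not know"
ciphertext = xor_cipher(key, b"attack at 10:00")

# The attacker XORs the difference between the guessed text ("10")
# and the desired text ("03") into the ciphertext, without the key.
tampered = bytearray(ciphertext)
for offset, (old, new) in enumerate(zip(b"10", b"03"), start=10):
    tampered[offset] ^= old ^ new

tampered_plaintext = xor_cipher(key, bytes(tampered))
print(tampered_plaintext)
```

Real stream modes such as AES-CTR are malleable in the same way, which is why the ciphertext needs an authentication code.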

You do not need to store the password in the file: if the user enters the wrong password, the authentication check fails. The software sees that the authentication code does not match the ciphertext under that password, so you know that either the password is wrong or the file was modified.

There are also some more details on why you should use authenticated encryption:

There are a lot of resources online on how to do authenticated encryption. Some good methods to look for are AES+HMAC, or AES in an authenticated mode such as GCM or OCB. This Wikipedia article is a good introduction and contains pointers to algorithms that you might be able to use: https://en.wikipedia.org/wiki/Authenticated_encryption

As an example, let's consider the previous ciphertext: aWJiaWtzIGliIDEzOjAw. Now we HMAC it with SHA-256, which gives us 3be635.... I used a random online site to compute the HMAC and used the same password as I used for the encryption. (As a challenge to the reader, try to break it :). It's just one character.) We can store the two together: aWJiaWtzIGliIDEzOjAw|3be635. When a user tries to decrypt the contents, they enter the password, you can compute the HMAC again, and if it does not match, you know that either the message was tampered with, or their password was incorrect.

The code for this is basically:

# Encrypt
password = user_input()
plaintext = "attack at 10:00"
ciphertext = encrypt(method = "AES-256-CTR", key = password, data = plaintext)
auth_code = hmac(method = "SHA-256", key = password, data = ciphertext)
write_file(name = "encrypted", data = base64(ciphertext) + "|" + auth_code)

# Decrypt
password = user_input()
encoded, auth_code = read_file(name = "encrypted").split("|")
ciphertext = base64_decode(encoded)  # undo the base64 applied when writing the file
computed_auth_code = hmac(method = "SHA-256", key = password, data = ciphertext)
if computed_auth_code == auth_code:
    plaintext = decrypt(method = "AES-256-CTR", key = password, data = ciphertext)
    print(plaintext)
else:
    print("Wrong password, or the file was modified")
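
The authentication half of this pseudocode can be tried with Python's standard library alone. The sketch below is only an illustration: `seal` and `open_sealed` are hypothetical helper names, the ciphertext is a stand-in byte string because AES itself is not in the standard library, and the password is still used directly as the HMAC key, exactly as in the pseudocode.

```python
import base64
import hashlib
import hmac
from typing import Optional


def seal(password: str, ciphertext: bytes) -> str:
    """Return 'base64(ciphertext)|hex HMAC-SHA-256 of the raw ciphertext'."""
    tag = hmac.new(password.encode(), ciphertext, hashlib.sha256).hexdigest()
    return base64.b64encode(ciphertext).decode() + "|" + tag


def open_sealed(password: str, stored: str) -> Optional[bytes]:
    """Return the raw ciphertext if the tag verifies, otherwise None."""
    encoded, tag = stored.split("|")
    ciphertext = base64.b64decode(encoded)
    expected = hmac.new(password.encode(), ciphertext, hashlib.sha256).hexdigest()
    # compare_digest compares in constant time, avoiding a timing side channel.
    if not hmac.compare_digest(expected, tag):
        return None
    return ciphertext


stored = seal("correct horse", b"stand-in ciphertext bytes")
print(open_sealed("correct horse", stored))  # b'stand-in ciphertext bytes'
print(open_sealed("wrong guess", stored))    # None
```

A `None` result is the point at which the program can tell the user that the password is wrong or the file was modified, before attempting any decryption.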

So far so good, but...

This method lets an attacker guess passwords very quickly. An HMAC takes only a few microseconds to compute, so an attacker can try hundreds of thousands of guesses per second on a single core of a standard computer. The code is very simple:

encoded, auth_code = read_file(name = "encrypted").split("|")
ciphertext = base64_decode(encoded)
for password in load_file("password_database.txt"):
    if auth_code == hmac(method = "SHA-256", key = password, data = ciphertext):
        print("The password is " + password)

Because the only operation in the for loop is hmac(...) and a quick comparison, it is very fast.
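
A runnable version of that loop, as a sketch with a tiny in-memory word list standing in for `password_database.txt`. The victim's password, the word list and the `crack` helper are all made up for the example.

```python
import hashlib
import hmac
from typing import Iterable, Optional


def crack(ciphertext: bytes, auth_code: str, candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate whose HMAC matches the stored code, else None."""
    for candidate in candidates:
        guess = hmac.new(candidate.encode(), ciphertext, hashlib.sha256).hexdigest()
        if hmac.compare_digest(guess, auth_code):
            return candidate
    return None


# What the attacker finds in the file (ciphertext is a stand-in byte string).
ciphertext = b"stand-in ciphertext bytes"
auth_code = hmac.new(b"letmein", ciphertext, hashlib.sha256).hexdigest()

found = crack(ciphertext, auth_code, ["123456", "password", "letmein", "qwerty"])
print(found)  # letmein
```

Each guess costs a single HMAC, so the size of the word list is the only real limit.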

Before using the user's password for anything, you should apply a slow key derivation function (KDF). How best to do this is covered in the question "How to securely hash passwords?".

Your code should now look something like this:

password = user_input()
password = argon2(iterations = 1_000_000, memory = 100_000_000, data = password)
# Below here, the normal encryption/decryption code

Now, an attacker's code has to look like this:

encoded, auth_code = read_file(name = "encrypted").split("|")
ciphertext = base64_decode(encoded)
for password in load_file("password_database.txt"):
    # Keep the guess and the derived key in separate variables,
    # so the password itself (not the derived key) gets printed.
    key = argon2(iterations = 1_000_000, memory = 100_000_000, data = password)
    if auth_code == hmac(method = "SHA-256", key = key, data = ciphertext):
        print("The password is " + password)

Now the attacker has to do this argon2 thing, which is super slow. If it takes 1 second on your computer, then it will also take roughly 1 second on their computer (maybe a little faster, maybe a little slower). This means an attacker can only do one password guess per second. That's much safer! Of course, an attacker can use 100 computers, but that's a serious investment, and they would still have only 100 guesses per second instead of millions or even billions per second.
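
For something that runs without extra packages, the sketch below swaps in scrypt from Python's `hashlib` as the slow KDF. That is a substitution for the argon2 call in the pseudocode, not the same algorithm, and the cost parameters are illustrative assumptions rather than tuned recommendations. It also adds two things the pseudocode leaves out: a random salt that is stored next to the ciphertext, and separate keys for encryption and for the HMAC. `derive_keys` is a hypothetical helper name.

```python
import hashlib
import secrets
from typing import Tuple


def derive_keys(password: str, salt: bytes) -> Tuple[bytes, bytes]:
    """Derive a 32-byte encryption key and a 32-byte HMAC key from a password."""
    material = hashlib.scrypt(
        password.encode(),
        salt=salt,
        n=2**14,  # CPU/memory cost (assumed value for the example)
        r=8,
        p=1,
        maxmem=64 * 1024 * 1024,
        dklen=64,
    )
    return material[:32], material[32:]


salt = secrets.token_bytes(16)  # random per file, stored in the clear
enc_key, mac_key = derive_keys("correct horse", salt)
print(len(enc_key), len(mac_key))  # 32 32
```

The same salt has to be read back from the file at decryption time so that the same password yields the same keys.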

Euphrasius von der Hummelwiese · Answer 2 · 2019-02-10T08:09:36.140

First of all, you should consider this disclaimer.

Go ahead with your implementation if:

your tool is only for educational purposes
your tool is not required to be used for really sensitive data
you really know what you are doing

That said, you could consider storing a password hash with the file. If you do so, please consider this question as well. Your tool could then check if the password hashes correctly before decrypting and therefore provide valid feedback to the user.

Keep in mind that password hashes can (possibly) be cracked. Storing the password, or a hash of the password, in the file that you encrypted therefore weakens the "security" of your tool.
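
As a rough sketch of the stored-hash approach, the following keeps a salted, deliberately slow hash in front of the encrypted data. PBKDF2 from Python's standard library, the iteration count, the `salt|hash` header layout and both function names are assumptions made for the example.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # illustrative cost, not a tuned recommendation


def make_header(password: str) -> str:
    """Return 'hex(salt)|hex(PBKDF2-HMAC-SHA-256 hash)' to store with the file."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt.hex() + "|" + digest.hex()


def password_matches(password: str, header: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    salt_hex, digest_hex = header.split("|")
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), ITERATIONS
    )
    return hmac.compare_digest(digest.hex(), digest_hex)


header = make_header("correct horse")
print(password_matches("correct horse", header))  # True
print(password_matches("wrong guess", header))    # False
```

If the encryption key is derived from the same password, it has to be derived differently (for example with a different salt), otherwise the stored hash would reveal the key. This check also says nothing about whether the ciphertext was modified.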

TL;DR

Your best bet is to use algorithms (and implementations) that solve your problem and refrain from reinventing the wheel. Have a look at authenticated encryption, as Luc proposed.

edited Feb 10 '19 at 08:09

answered Feb 08 '19 at 08:14


"password hashes can be cracked" needs an explanation since cryptographic hashes are non-reversible. Here is an Argon2i hash with 8 iterations of a 13 character password that might be considered insecure, crack the password—salt: 673603d22cc85e10676715e6ceae4abd hash: c6bf335023c5c19f8596187e4899dd37 – zaph Feb 09 '19 at 17:06
Thanks, @zaph, I added a link to a question that explains hashing and cracking. Of course, cryptographic hashes are non-reversible; that is why I wrote cracking and not reversing. However, you would probably agree that storing a cryptographically secure hash next to the file you want to protect would still leave more attack surface than using authenticated encryption, as Luc proposed? That was at least the reason why I voted in favor of his answer. – Euphrasius von der Hummelwiese Feb 09 '19 at 22:02
The question has completely changed such that authenticated encryption is now the correct answer. – zaph Feb 10 '19 at 13:54
