
I'm working on a simple file encryption tool.

Basically, here's how my program works:

password = get_user_typed_password()
salt = uuid4()
key = bcrypt(password + salt)
cipher = AES(key)

for block in input_file:
    output_file.write(AES.encrypt(block))

My question is: how can I turn a possibly short and weak password into a long and strong one?

I thought I could concatenate SHA-512 hashes, like this:

for i from 1 to 1000:
    password += sha512(password).digest()

At the end, I would have a password about 64000 bytes long, full of 'high ASCII' characters like these: ☼¢♣┌¥↕▼║ÅÕ¼■♫.

Do you think it's a good idea to stretch the given password like this? Remember that it will not be stored in a db.

I know that inventing your own mechanism is bad, but here it's only to make the password longer/stronger... the final key will be key = bcrypt(long_password + salt) anyway.

With this question, I wanted to ask something I didn't find here: "Hash concatenation is bad for generating a key... but what about password stretching?"

Thanks in advance for your answers, explanations and criticism (I'm a newbie in crypto)!

SuperPython
  • What is the purpose of stretching the password? I ask because, if I choose the password "a" and you stretch it, I will still be able to log in using "a" again, so does the stretching give you any benefit? – kiBytes Jan 20 '14 at 10:19
  • Here, the purpose is to ensure that the password used to build the key is strong, whatever the user chooses as his password (including 'a'). It is also to prevent rainbow tables from being used to find the password. There is no "login" here. No password will be stored anywhere. – SuperPython Jan 20 '14 at 10:26

2 Answers


"Long" does not make "strong". What matters is the entropy, a tricky word which means "that which the password could have been". A weak password is weak because it is part of a small set of possible passwords. The attacker is in your head; when you select a raw common English word he knows that he just has to try all common English words; there are about 10000 of these.

Now suppose that you have "stretched" your password. How many distinct stretched passwords can you end up with? Still 10000. The attacker still has only 10000 passwords to try in order to break into your system. The stretching has made the passwords longer but not stronger, because the entropy has not changed.
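To make this concrete, here is a minimal sketch of the question's loop run against a tiny word list. The list `candidates` is a hypothetical stand-in for the attacker's 10000 common words, and the function name `stretch` is mine, not the asker's. The attacker simply stretches every candidate the same way, so the number of things to try is unchanged.

```python
import hashlib


def stretch(password: bytes, rounds: int = 1000) -> bytes:
    """The question's loop: append SHA-512 of the running value each round."""
    for _ in range(rounds):
        password += hashlib.sha512(password).digest()
    return password


# Hypothetical stand-in for the attacker's list of common words.
candidates = [b"apple", b"house", b"river", b"a"]

# The attacker applies the same public stretching to every candidate.
stretched_to_word = {stretch(word): word for word in candidates}

# The victim picked "river"; the stretched form maps straight back to it.
victim_secret = stretch(b"river")
recovered = stretched_to_word[victim_secret]

print(len(victim_secret))      # 5 + 64 * 1000 bytes
print(len(stretched_to_word))  # same count as the candidate list
print(recovered)
```

Each stretched value is tens of thousands of bytes long, yet the lookup table has exactly as many entries as the word list.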

At best, you can try to make each attacker's try as expensive as possible, and that's the role of bcrypt.
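As a rough back-of-the-envelope sketch of what "expensive" buys you: the attack time is the number of candidates multiplied by the cost of one guess. The per-guess timings below are assumptions chosen for illustration, not measurements of any real hash or bcrypt setting.

```python
def worst_case_seconds(candidate_count: int, seconds_per_guess: float) -> float:
    """Time to try every candidate once, on a single core."""
    return candidate_count * seconds_per_guess


CANDIDATES = 10_000  # the common-word list size used above

# Assumed costs per guess (illustrative only).
fast_hash = worst_case_seconds(CANDIDATES, 0.000001)  # one plain hash
slow_kdf = worst_case_seconds(CANDIDATES, 0.25)       # a deliberately slow KDF

print(fast_hash)
print(slow_kdf)
```

The slow function turns the attack from instant into minutes or hours, but a 10000-entry list still falls quickly; only a password from a larger set fixes that.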

Tom Leek

No, I don't think it's a good idea. At the end of the day, the number of possible states that your output could have is limited to the number of possible states that your input password can be in. So, if your password is only 5 characters long, alphanumeric, then the number of possible outputs will always be 36^5, or about 60 million. You can't create information entropy by hashing it - you're just making it look more random, which gives people a false sense of security.
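The arithmetic behind that figure, as a quick sketch (the variable names are mine):

```python
import math

ALPHABET_SIZE = 36   # 26 letters + 10 digits, case-insensitive
PASSWORD_LENGTH = 5

possible_passwords = ALPHABET_SIZE ** PASSWORD_LENGTH
entropy_bits = math.log2(possible_passwords)

print(possible_passwords)      # 60466176
print(round(entropy_bits, 1))  # a bit under 26 bits
```

However long the hashed output is, it still carries only those ~26 bits, compared with the 128 or 256 bits an AES key is supposed to have.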

Moreover, you're inventing your own key-stretching which has not been peer reviewed. There are existing key-stretching mechanisms that have been properly reviewed, and that have been shown to be at least reasonably strong. PBKDF2 is one such algorithm.
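For reference, a minimal sketch of PBKDF2 using Python's standard library. The iteration count, salt length and the helper name `derive_key` are my assumptions for illustration; pick the iteration count by measuring what your own hardware tolerates.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte key (AES-256 sized) with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        salt,
        iterations,
        dklen=32,
    )


# A fresh random salt per file; it is not secret and is stored with the ciphertext.
salt = os.urandom(16)
key = derive_key("a", salt)

print(len(key))
```

The same password and salt always give the same key, which is what lets you decrypt later; a different salt gives an unrelated key, which is what defeats precomputed tables.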

There's also a fault with your encryption code:

for block in input_file:
    output_file.write(AES.encrypt(block))

This is an ECB-mode encryption (each block transformed independently) which is known to be insecure, even if you're using AES or any other strong block cipher. You should look into CBC or another block mode that uses an IV.
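A small sketch of why per-block encryption leaks patterns. To stay within the standard library it does not use AES: `toy_block_encrypt` is a hypothetical stand-in (a truncated HMAC) that is keyed and deterministic like a block cipher but cannot be decrypted, so treat it as an illustration of the structure only. The fixed IV is also just for reproducibility; a real IV must be random per file.

```python
import hashlib
import hmac

BLOCK = 16  # bytes, the AES block size


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for AES on one block: keyed and deterministic, NOT a real cipher."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def ecb_like(key: bytes, plaintext: bytes) -> list:
    """Each block transformed independently, as in the question's loop."""
    return [toy_block_encrypt(key, plaintext[i:i + BLOCK])
            for i in range(0, len(plaintext), BLOCK)]


def cbc_like(key: bytes, iv: bytes, plaintext: bytes) -> list:
    """Each block is XORed with the previous output (or the IV) first."""
    out, previous = [], iv
    for i in range(0, len(plaintext), BLOCK):
        mixed = bytes(x ^ y for x, y in zip(plaintext[i:i + BLOCK], previous))
        previous = toy_block_encrypt(key, mixed)
        out.append(previous)
    return out


key = b"k" * 16
iv = bytes(range(16))        # fixed only so the demo is reproducible
plaintext = b"A" * BLOCK * 3  # three identical plaintext blocks

ecb_blocks = ecb_like(key, plaintext)
cbc_blocks = cbc_like(key, iv, plaintext)

print(len(set(ecb_blocks)))  # identical inputs give identical outputs
print(len(set(cbc_blocks)))  # chaining hides the repetition
```

With independent blocks, anyone looking at the output can see where the plaintext repeats; with chaining, the three identical blocks come out different.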

Finally, you don't seem to be providing any form of authenticity record in your ciphertext. This means that an attacker can re-shuffle blocks around and modify the plaintext without ever knowing the key. Switching to CBC mode does not make this problem go away: there's a malleability issue that allows an attacker to modify one plaintext block by trashing an adjacent one.
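One common way to add such a record is to compute a MAC over the ciphertext (encrypt-then-MAC). Below is a minimal sketch with HMAC-SHA256 from the standard library. The byte string `ciphertext` is a stand-in for real encrypted output, and `mac_key` is a hypothetical key that should be independent of the encryption key; an authenticated mode such as AES-GCM is the other usual option.

```python
import hashlib
import hmac


def make_tag(mac_key: bytes, ciphertext: bytes) -> bytes:
    """Authentication tag computed over the ciphertext."""
    return hmac.new(mac_key, ciphertext, hashlib.sha256).digest()


def is_authentic(mac_key: bytes, ciphertext: bytes, tag: bytes) -> bool:
    """Constant-time check; run this before decrypting anything."""
    return hmac.compare_digest(make_tag(mac_key, ciphertext), tag)


mac_key = b"m" * 32            # hypothetical; derive it separately in practice
ciphertext = bytes(range(48))  # stand-in for three 16-byte ciphertext blocks
tag = make_tag(mac_key, ciphertext)

# The attacker swaps the first two blocks without knowing any key.
shuffled = ciphertext[16:32] + ciphertext[:16] + ciphertext[32:]

print(is_authentic(mac_key, ciphertext, tag))
print(is_authentic(mac_key, shuffled, tag))
```

In a real file format the IV should be covered by the tag as well, so that it cannot be altered unnoticed either.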

My advice? Don't use this in production. You need to study cryptography for a while before trying to write a real implementation.

Polynomial
  • I totally agree @Polynomial – kiBytes Jan 20 '14 at 10:34
  • Thank you for your answer! I simplified my algorithm because the question was only about the password stretching. In my program, I use AES in CBC mode, with an IV. To be honest, I expected that kind of response! I totally agree and understand that using a homebrewed solution is dangerous, but in my case I don't see the problem, since my 'homebrew' only extends the password, AND THEN I use a quite secure and widely approved solution, which is bcrypt(password+salt). The extended password is something that the user could have typed on his keyboard... Do you know what I mean? – SuperPython Jan 20 '14 at 10:45
  • I think I understand! My error was that I didn't assume that the source code was known! I forgot that the security of a system should not be based on its secrecy! If my source code is known, extending the password will add absolutely NO security! Am I right? – SuperPython Jan 20 '14 at 10:58
  • It's a bit of both. Assuming that your code is secret is a bad move (crypto 101 stuff!) and the actual key derivation you're using is weird and may have flaws. Stick to PBKDF2 and other known-good mechanisms. – Polynomial Jan 20 '14 at 11:07
  • Thanks Polynomial! I replaced my weird hashing loop with PBKDF2, using a lot of iterations and a long output (100 chars) that will be the input of my final bcrypt operation. – SuperPython Jan 20 '14 at 11:20