16

I'm currently writing my own little password manager that stores the key in a SHA256 hash, with salt. I create the hash by doing the following:

def sha256_rounds(raw, rounds=100001):
    obj = hashlib.sha256()
    for _ in xrange(rounds):
        obj.update(raw)
        raw = obj.digest()
    return obj.digest()

After it is created it is stored with the following:

    key = base64.urlsafe_b64encode(provided_key)
    length = len(key)
    with open(key_file, "a+") as key_:
        front_salt, back_salt = os.urandom(16), os.urandom(16)
        key_.write("{}{}{}:{}".format(front_salt, key, back_salt, length))

The questions/concerns I have are:

  • Is this an acceptable way to store a hashed key?
  • Should I use more iterations?
  • Is my salting technique sufficient? Or should I use a different salting technique?

If this is not an acceptable take on storing a hashed password/key, what other steps can I take to make this more secure?


UPDATE:

I took a lot of your guy's advice, and if you would like to see the outcome, so far, of my little password manager you can find it here. Everyone's advice is much appreciated and I will continue to strive to make this as excellent as possible. (if that link doesn't work use this one https://github.com/Ekultek/letmein)

CertifcateJunky
  • 481
  • 1
  • 4
  • 13
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackexchange.com/rooms/79597/discussion-on-question-by-certifcatejunky-how-many-rounds-of-hashing-is-enough-f). – Rory Alsop Jun 30 '18 at 09:33

5 Answers5

81

I'm currently writing my own little password manager

That's your first mistake. Something this complex has many subtle pitfalls that even experts sometimes fall into, without plenty of experience in this area you don't have a chance making something even close to secure.

stores the key in an SHA256 hash

Uh oh...

This doesn't necessarily indicate you're doing something wrong, but I have strong doubts that you're going to do it right. I assume you're talking about a master password being hashed here? The master password should be turned into a key using a KDF like PBKDF2, bcrypt, or Argon2, then this key is used to encrypt the stored passwords.

If you want to have a way to verify that the password is correct, storing a hash of the key should be fine, but you MUST NOT store the key itself...if you store the key anyone who gets access to your storage has everything they need to decrypt all the passwords!

If you aren't talking about hashing a master password and you do mean an actual randomly generated key then I have no idea what you're trying to accomplish here, but you shouldn't be using a slow KDF with a large number of iterations.

Alternatively you could be hashing the master password twice, once to store as a hash to later verify that the password the user enters is correct, and again to use as a key for encryption. Depending on how this is done it could range from a design flaw to completely giving away the key.

Edit: After seeing the full code it seems to be a fourth option: you store a hash of the password to later check if the entered password is correct, then you hash this hash to use as the key, which is nearly as bad as just storing the key itself.

I create the hash by doing the following:

def sha256_rounds(raw, rounds=100001):
    obj = hashlib.sha256()
    for _ in xrange(rounds):
        obj.update(raw)
        raw = obj.digest()
    return obj.digest()

It's not really clear what raw is here, but I'm assuming it's the password. What you're doing is an unsalted hash using SHA256. Don't try to create your own KDF!

After it is created it is stored with the following:

key = base64.urlsafe_b64encode(provided_key)
length = len(key)
with open(key_file, "a+") as key_:
    front_salt, back_salt = os.urandom(16), os.urandom(16)
    key_.write("{}{}{}:{}".format(front_salt, key, back_salt, length))

So, you're creating the key by hashing the password, then adding a random salt to the front and back? Not only is concatenating 2 different salts to the front and back non-standard, it's not accomplishing anything here because it's done after the KDF has already finished! You're just adding in some random values for the sake of having them there.


To show just how bad this is (as of commit 609fdb5ce976c7e5aa1832670505da60012b73bc), all it takes to dump all stored passwords without requiring any master password is this:

from encryption.aes_encryption import AESCipher
from lib.settings import store_key, MAIN_DIR, DATABASE_FILE, display_formatted_list_output
from sql.sql import create_connection, select_all_data

conn, cursor = create_connection(DATABASE_FILE)
display_formatted_list_output(select_all_data(cursor, "encrypted_data"), store_key(MAIN_DIR))

While it may be a good learning experience to try creating a password manager, please please don't ever use it for anything remotely important. As @Xenos suggests, it doesn't seem like you have enough experience that creating your own password manager would really be beneficial anyway, it would likely be a better learning opportunity to take a look at an existing open source password manager.

AndrolGenhald
  • 15,436
  • 5
  • 45
  • 50
  • 23
    I'm not even sure "creating your own password manager" is a good experience. I think that diving into an existing one and looking at how they work and how they are coded is a far more interesting and efficient experience, since you won't do mistakes (by trying to do your own PM, you will do mistakes and unless someone is rereading your code, you'll keep doing and believing in these mistakes) – Xenos Jun 28 '18 at 14:54
  • 12
    It's not going to be used for anything important, I'm just creating it in order to learn, I am only storing the master passwords hash value, not the password itself – CertifcateJunky Jun 28 '18 at 15:29
  • 11
    Compared to closed-source 3rd party password managers, creating your own password manager, as long as you use other peoples' proper password encryption functions for the stored password and hashing function for the master password, is, imo, a much more secure and paranoid option than blindly trusting that someone else's will be secure, not have government backends, and not sell the passwords on the black market. Obviously, this logic does not hold for open-source 3rd party ones though. – kloddant Jun 28 '18 at 18:06
  • @kloddant so security by obscurity? First, offline password managers hardly have "government backends" or sell your passwords on the black market. If you get infected with any kind of malware, it may be able to distinguish between a known, secure password manager (and throw its hands up) and a DIY version, which can be more easily attacked. Plus, OP's password manager is now open source (and deleting it now doesn't help) and probably less secure than most open source password managers. – A. Darwin Jun 28 '18 at 18:13
  • @kloddant While I'm sympathetic to the idea of using FOSS software, being FOSS doesn't by itself make it any more secure. I wouldn't recommend creating your own unless you are a security expert (and perhaps not even then). There is a benefit to having a professional audit; not just more eyes on the code, but more importantly more _experienced_ eyes on the code. – AndrolGenhald Jun 28 '18 at 18:23
  • 5
    @CertifcateJunky: For what it's worth, getting the crypto wrong is only a problem if you're defining your own file format. It's kinda hard to mess up the crypto (KDF etc.) if you're working on existing password database formats (like KeePass's KDBX)... it doesn't take a rocket scientist (cryptographer?) to compare the original XML and your new XML and check that your code doesn't add/remove extraneous information. So if you're just trying to improve the UI and learn how it works, it might be worth sticking with an existing file format. As a bonus you get interoperability too. – user541686 Jun 28 '18 at 18:50
  • 2
    @Mehrdad Not 100% true, as you could still use a bad PRNG instead of a good CSPRNG for salt or iv generation, allow secrets in memory to be swapped, etc which would preserve compatibility while still being insecure. A good point though nonetheless. – AndrolGenhald Jun 28 '18 at 19:07
  • @AndrolGenhald: I suppose... but if you know enough to use a secure hash function it's kind of hard to not know enough to use a secure RNG. Though I would agree that things like preventing side-channel attacks and hardening against various attacks from other programs (including JS) is *way* harder, so I wouldn't recommend disseminating the password manager to others unless you really know what you're doing (in case someone finds a hole)... but yeah for personal use I'd give it a pass. (Yeah I guess this is security by obscurity. Sorry...) – user541686 Jun 28 '18 at 19:28
  • 1
    I was not arguing for security by obscurity; I was saying that writing your own password management software, not the cryptographic functions themselves, just the software that uses them, is better than trusting someone else's software. – kloddant Jun 28 '18 at 19:30
  • 1
    @kloddant The problem is that you can still do many things wrong while using good implementations of cryptographic functions (as demonstrated by the question). – AndrolGenhald Jun 28 '18 at 19:40
  • 2
    @AndrolGenhald I don't think kloddant is denying that. He's saying that particular risk needs to be balanced by the risk you take in trusting a third party's closed source software (and let's face it, the Snowden leaks have shown that that risk is very genuine). – Jon Bentley Jun 28 '18 at 20:28
  • @AndrolGenhald If you want to see it and how it works, _so_far_, I added a link to the git repo at the end of my question. Thank you for your advice! – CertifcateJunky Jun 29 '18 at 14:01
  • Come on guys, _it's a private password manager_. It generates passwords that are replacements for other people's `tr0ub4dor`s. For what its purpose is, you might as well use a couple of rounds of RANDU as the “hash function” and the current Dow Jones as the “salt”... — Of course, if somebody has access to the database, you're screwed then, but I daresay if you consider that risk significant then you have your priorities wrong. Avoiding any closed-source software on your machine is a _much_ more sensible move. (Which is not to say that you shouldn't instead use a proven open-source manager.) – leftaroundabout Jun 29 '18 at 14:33
  • @leftaroundabout To me problem isn't that it's insecure if someone gets access to it; if you want to store your passwords in an unencrypted spreadsheet that could be acceptable depending on what you consider a threat. The problem is that it _tries_ to encrypt passwords so keep them secure, and it fails. – AndrolGenhald Jun 29 '18 at 14:49
  • True. And as you said, the OP's implementation indeed makes very little sense. I was just objecting to your up-front statement that writing one's own password manager is a mistake. It may not _be useful_ in itself to do this, but it can't really harm either to dabble with it for oneself (unless, indeed, by actually using it for important stuff and dropping other security layers, which would be a horrible idea). And it _can_ be quite instructive. – leftaroundabout Jun 29 '18 at 15:16
  • @leftaroundabout I guess it depends how you interpret "plenty of experience". Relative to the OP's experience I intended for "plenty of experience" to mean that you can at least get the encryption right and not have any other obvious vulnerabilities or mistakes. That way you _should_ be mostly ok if your encrypted database leaks, but you probably don't want to recommend anyone else use it. The threats that _you_ consider important aren't necessarily the same as the threats _everyone else_ considers important, and a general password manager should protect against all the threats it can. – AndrolGenhald Jun 29 '18 at 15:48
20

First let me disclose that I work for 1Password, a password manager, and although it may seem self-serving, I have to add my voice to those telling you that writing a secure password manager is harder than it may first appear. On the other hand, it is good that you are trying and asking about it in public. This is a good way to learn that it is harder than it may first appear.

Don't authenticate. Encrypt instead

I've taken a quick glance at the source you link to in your update, and am delighted to see that you have taken on some advice that has been offered so far. But unless I'm misreading your code, you still have a fundamental design error making it massively insecure.

You appear to be treating the master password entry as an authentication question instead of as encryption key derivation. That is, you are checking the entered password (after hashing) with something that is stored, and if that check passes you are using that stored key to decrypt the data.

What you should be doing is storing an encrypted encryption key. Let's call it the master key. This should be generated randomly when you first set up a new instance. This master key is what you use to encrypt and decrypt the actual data.

The master key is never stored unencrypted. It should be encrypted with what is sometimes called a Key Encryption Key (KEK). This KEK is derived via your Key Derivation Function (KDF) from the user's master password and a salt.

So the output of your use of PBKDF2 (thank you for using that instead of just repeated hashing) will be your KEK, and you use the KEK to decrypt the master key, and then you use decrypted master key to decrypt your data.

Now perhaps that is what you are doing and I've misread your code. But if you are just comparing what you derive through your KDF to something stored and then making a decision whether to decrypt, then your system is massively and horrendously insecure.

Please make the boldest biggest first line of the project README scream the fact that it is massively insecure. And add that to every source file as well.

Some other points

Here are just a few other things that I noticed when glancing at the source

  • Use an authenticated encryption mode such as GCM instead of CBC.

  • Your approach of just encrypting the whole thing will work for small amounts to data, but it won't scale once you have larger data sets.

    When it comes time to encrypting (parts of) records separately, keep in mind that you will need a unique nonce (for GCM) or unique initializate vectors (if you unwisely stick with CBC) for each record.

  • Your data destruction after 3 failures is dangerous and adds no security.

    A sophisticated attacker will just copy your data file and write their own script for trying to crack the password. They will also make their own copy of the captured data. Your data destruction only makes it easy for someone to accidentally or maliciously destroy your data.

How many rounds of PBKDF2

So to your original question. The answer is that it depends. On one side, it depends on how strong the master password is and what sorts of resources an attacker will throw at the problem. And it also depends on what resources the defender can throw at it. For example, will you be running your KDF on a mobile device with limited battery capacity?

But as it happens we at 1Password are trying to gauge the cost of cracking by offering prizes to people or groups who crack a 42.5 bit password hashed with 100,000 rounds of PBKDF2-HMAC-SHA256.

Diminishing gains from slow hashing

But what's more important than figuring out how to weigh all of those is to understand that slow-hashing, while absolutely essential for a password manager KDF, yields diminishing marginal gains once it is tuned high enough.

Say you are using 100K rounds and you have some master password, P. If you picked a random digit, 0–9 and appended it to P you would get a ten-fold increase in cracking resistence. To get the same increase via increasing the rounds of PBKDF2, you would need to go up to 1 million rounds. The point is that once you have a decent amount of slow hashing, you get far more strength for defender effort by increasing the strength of the password than you do my increasing the number of rounds.

I wrote about this some years back in Bcrypt is great, but is password cracking unfeasible?

Jeffrey Goldberg
  • 5,839
  • 13
  • 18
  • It's defiantly not easy. In the README it tells you to not use it, I can add something saying that it is insecure and a work in progress at the top though. Also, you have not misread the code, what I'm doing is hashing the key, and comparing against that stored hash. I see now that, that probably isn't the best solution. Thank you for your answer. – CertifcateJunky Jun 29 '18 at 21:10
  • @CertifcateJunky: Always consider what someone could do by changing your source code and recompiling. This makes it obvious why you can't just check the entered password and then allow use of a key that was already stored on disk: Change the code to bypass that check and you still have a working decryption key, if they key bits don't actually *depend on* the unlock password. This is the difference between just authenticating the user vs. requiring them to actually supply a secret that turns into the key. If your code works like that, it goes way beyond "probably isn't the best solution". – Peter Cordes Jul 01 '18 at 03:29
  • KEK... that made me laugh. – Dev Jul 01 '18 at 05:32
  • @PeterCordes I've been working on it more, and have came up with an idea (mostly because of everyones advice) of keeping the master password, but only using it as verification (for now) and generating a "key" from that password with random characters and what not. The key is only decrypted and used if the master password hash matches the stored hash, also the key is never stored decrypted (unless you count memory which is a whole other ball game). That "key" is used to encrypt the passwords instead of the master password. – CertifcateJunky Jul 02 '18 at 16:13
  • 1
    @CertifcateJunky: Yes, turning the master password into a decryption key for the main key is the right kind of idea. This answer suggests some *good* ways to turn passwords into keys. – Peter Cordes Jul 02 '18 at 18:46
  • @PeterCordes I'm also looking into watching file changes as another precaution. – CertifcateJunky Jul 02 '18 at 19:13
6

Do not use raw SHA256 - it is able to be accelerated using GPUs and therefore prone to brute force. It isn't broken, by any stretch, it just isn't best practice any longer. PBKDF2 works to counteract that to some extent, but you should use bcrypt or scrypt preferentially.

Reinventing the wheel (which is pretty much what you're doing if you aren't using PBKDF2-SHA256, bcrypt or scrypt) is never a good idea in crypto.

crovers
  • 6,311
  • 1
  • 19
  • 29
  • 2
    @Azxdreuwa The ASICs for scrypt (used for litecoin) are possible because weak parameters were chosen. The real problem with scrypt is that it scales memory and cpu usage together, and it was designed for a disk encryption context where parameter can be higher. In a web context where you need a lower cost, it's easy to end up less secure that bcrypt, even if still hopefully resistant to ASICs. Argon2 fortunately has separate parameters for cpu and memory cost. – AndrolGenhald Jun 28 '18 at 18:10
  • 3
    In case it wasn't obvious from my previous comment, those ASICs couldn't be used for scrypt paramaters other than those chosen for litecoin (which are particularly bad). – AndrolGenhald Jun 28 '18 at 18:31
3

Looking at the code in the link you posted (reproduced below, in parts), I'm a bit confused.

If I read it correctly, main() calls store_key() to load a key from a file on the disk, then uses compare() to compare a user-given key against that one. Both run through sha256_rounds() which runs PBKDF2 on them.

Then you use the same stored_key, the one loaded from the file, to do encryption?

That's completely and utterly backwards. Anyone who gets access to the key file could just load the stored key, and use it to decrypt the stored data, or store new data encrypted under the same key, without having to go through the script to "verify" the password.

At best, I think you're confusing using the user-given password/passphrase to produce a key for encryption, and authenticating the user against a stored password hash. You can't do both with the same hash, the encryption is completely irrelevant if you've stored the encryption key along with the encrypted data.

Sure, passwords are processed with some sort of a hash/KDF both when used for authentication and when used to produce an encryption key. But that's more like a step of pre-processing necessary just because humans can't memorize 256-bit keys, and we have to rely on low-entropy passwords instead. What you do with the hash/key you got does differ between the two use-cases.


Then there's the thing that you have a hardcoded salt in sha256_rounds(), which pretty much defeats the purpose of a salt in general. I could comment on the code more generally, too, but this isn't codereview.SE.

Please tell me I read the code wrong.


The parts of code I looked at:

def main():
    stored_key = store_key(MAIN_DIR)    
    if not compare(stored_key):
        ...
    else:
            ... # later
            password = prompt("enter the password to store: ", hide=True)
            encrypted_password = AESCipher(stored_key).encrypt(password)



def store_key(path):
    key_file = "{}/.key".format(path)
    if not os.path.exists(key_file):
        ...
    else:
        retval = open(key_file).read()
        amount = retval.split(":")[-1]
        edited = retval[16:]
        edited = edited[:int(amount)]
        return edited
ilkkachu
  • 2,086
  • 1
  • 11
  • 15
  • nope that about sums it up. I didn't say it was secure, just that I was making one :P, I see your point though. – CertifcateJunky Jun 29 '18 at 14:44
  • 1
    @CertifcateJunky, well, if you're happy with not saying it was secure, you could just drop the AES/SHA calls for good and concentrate on storing and loading the data. If you do consider the notion of playing around with the crypto primitives, at least start with a high-level description of the algorithms and data the system uses, what parts are secret and what affects what. And go read some descriptions of how existing encryption utilities (etc) work, and consider why they do what they do. – ilkkachu Jun 29 '18 at 15:17
  • I think you're misunderstanding what I said and not taking it the way I meant it. I completely understand what you're saying and am actually trying to figure out a way to edit it. It was a joke, not a serious statement. – CertifcateJunky Jun 29 '18 at 15:45
2

Use a maintained tool that will decide for you and will increase the default count or even algorithm over time. I'd suggest libsodium, but there are others. Libsodium has defaults for "interactive" password hashing and for symmetric encryption. Ensure your software can some day "upgrade" the settings as needed. A tool like libsodium will change their defaults from time to time so you can follow them.

I do development for an open source password manager and use libsodium. Also I found Mitro's security document worth reading.

Bufke
  • 163
  • 5
  • 3
    The point of the tool is to learn how hashing works and how password managers work. I'm not suggesting anyone use this tool, because well, _I WROTE IT_ (lol). But I think it's a good practice in order to determine how things work. If you're interested, you can see it [here](https://github.com/Ekultek/letmein) – CertifcateJunky Jun 29 '18 at 14:07
  • 2
    For sure - I totally get that. Go write it, make mistakes, and learn! I would still suggest looking at other software like libsodium to learn what they use. Typically a password manager is dealing with the concern of storing passwords and presenting a UI. Almost all of them will use other software libraries to do the actual encryption. Separation of concerns. Thanks for the link! I see you are using python - you might be interested in this https://www.crypto101.io/ which uses Python in examples. – Bufke Jun 29 '18 at 14:11
  • I'll look at it. I am curious how they keep their encryption up to standard. Thanks for the link! – CertifcateJunky Jun 29 '18 at 14:24