We all know the XKCD comic on password strength, which suggests (appropriately) that a password based on multiple common words is more secure and more memorable than a password such as Aw3s0m3s4u(3 or something.
I have an application (multi-platform) that I want to generate somewhat secure passwords for, and my password requirements are much less demanding. If the password has no spaces, I expect the usual 'multiple symbols, numbers, mixed-case letters and 6+ characters'. If the password has more than one nonconsecutive space, I relax the symbol/number/mixed-case constraint and instead require at least two words of no fewer than 4 characters each, with a minimum password length of 15 characters.
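As a rough sketch, the two rule sets could be checked along these lines. The function name `is_acceptable` is hypothetical, and several details are assumptions rather than anything stated above: "multiple symbols, numbers, mixed-case" is read as at least one of each class, symbols are taken to be `string.punctuation`, and any password that does not qualify for the passphrase branch (for example one with a single space) falls back to the traditional rules.

```python
import string


def is_acceptable(password: str) -> bool:
    """Check a password against the two rule sets (hypothetical helper)."""
    words = password.split(" ")

    # Passphrase branch: more than one space, and no space that is doubled,
    # leading, or trailing (any of those would leave an empty "word").
    if password.count(" ") > 1 and all(words):
        long_words = [w for w in words if len(w) >= 4]
        return len(password) >= 15 and len(long_words) >= 2

    # Traditional branch. ASSUMPTION: one character from each class is enough.
    return (
        len(password) >= 6
        and any(c.islower() for c in password)
        and any(c.isupper() for c in password)
        and any(c.isdigit() for c in password)
        and any(c in string.punctuation for c in password)
    )
```

The passphrase branch deliberately ignores character classes, so an all-lowercase phrase of ordinary words passes as long as it meets the word and length limits.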
The question isn't about that aspect, but about generation: assuming I want to generate an easy-to-remember, hard-to-guess password for the user, is it cryptographically safe to build it from 5 or so dictionary words drawn from a 10k-word list? (Literally 10k words sit in my database, scraped from various sources, emails, etc.) They're all pretty common words, no fewer than 3 characters long.
I don't want these to be one-time passwords, but I suspect I should at least require the user to change a generated password after first logging in with it. That's fine and I can do it, but I also want users to have the option (when changing a password) to generate a 'secure' password that fits my requirements.
From a cracking standpoint, how easy/difficult would it be to attack a password generated using this scheme? There's no fixed length: words in this database table range from 3 to 11 characters (environment is a word in the database, for example). The programme generating the passwords will not pick two words with 4 or fewer characters (so the shortest password would be one three-character word, four five-character words, and four spaces, for a total of 27 characters), and it will not use the same word twice in a password.
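The generation rules above could be sketched roughly as follows. This is not the actual programme: `generate_passphrase` and its parameters are hypothetical names, the in-memory list stands in for the database table, and rejection sampling is just one assumed way to enforce the "at most one short word, no repeats" rules. It draws from the operating system's CSPRNG via `secrets.SystemRandom`.

```python
import secrets


def generate_passphrase(words: list[str], count: int = 5) -> str:
    """Pick `count` distinct words, at most one of them 4 characters or shorter.

    ASSUMPTION: `words` holds enough distinct longer words to finish;
    otherwise this loop would never terminate.
    """
    rng = secrets.SystemRandom()  # backed by the OS CSPRNG
    chosen: list[str] = []
    have_short = False

    while len(chosen) < count:
        word = rng.choice(words)
        if word in chosen:
            continue  # never use the same word twice
        if len(word) <= 4:
            if have_short:
                continue  # only one ultra-short word is allowed
            have_short = True
        chosen.append(word)

    return " ".join(chosen)
```

Because a rejected draw is simply retried, the short word (if any) can land in any position, and a passphrase with no short word at all remains possible.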
Based on samples I've run against it, the average password length generated by the programme is ~34 characters, which seems acceptable to me. Even if we assume that each non-space character of the 27-character minimum (so 23 characters in the end) can be in one of 26 possible states (a-z), that's 26^23 or 3.50e+32 possibilities.
There are 994 words in the database with 3 to 4 characters in length.
We can also assume that the attacker has the dictionary and the generation parameters/algorithm. Is this still secure? Can I get away with taking one word away from the generated password? That's still 21 characters, 18 of them non-space, for 26^18 possibilities (2.95e+25) based on character entropy alone. The only problem I see is that this isn't based on character entropy but on word entropy, which would mean the 5-word password has 10000*9006*9005*9004*9003 possibilities, or 6.57e+19, and the 4-word password has 10000*9006*9005*9004 possibilities, or 7.30e+15. Compared to a normal 6-character password ((26+26+10+33)^6 or 7.35e+11 possibilities: 26 lower alpha, 26 upper alpha, 10 numbers, 33 symbols), it's significantly stronger.
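For anyone who wants to check the figures, here is a small sketch that recomputes them and also expresses each as bits (log2 of the count). The variable names are hypothetical. Character-based counts are taken as alphabet size raised to the number of characters, and the word-based products are copied from the estimate above rather than derived independently.

```python
import math

# Character-based view: 26 lowercase letters per non-space character.
chars_5 = 26 ** 23  # shortest 5-word password, 23 letters
chars_4 = 26 ** 18  # shortest 4-word password, 18 letters

# Word-based view, using the products from the estimate above.
words_5 = 10000 * 9006 * 9005 * 9004 * 9003
words_4 = 10000 * 9006 * 9005 * 9004

# Traditional 6-character password over 95 printable characters.
trad_6 = (26 + 26 + 10 + 33) ** 6

for label, n in [
    ("5 words, per character", chars_5),
    ("4 words, per character", chars_4),
    ("5 words, per word", words_5),
    ("4 words, per word", words_4),
    ("6 chars, traditional", trad_6),
]:
    print(f"{label:<24} {n:.2e}  ~{math.log2(n):.1f} bits")
```

If the attacker knows the dictionary and the algorithm, the per-word rows are the ones that matter.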
Another assumption I made: users will write this down; they always do. I suspect that five random words on a piece of paper (hopefully not in direct sight, but alas that's the most likely scenario) are less likely to be picked up as a potential password than a, well, complex term that looks like a traditional password.
Lastly, before I get to my actual questions: the passwords are all salted before being stored in the database, then hashed with the SHA-512 algorithm 100 times, with the salt appended again before each round. If the user logs in successfully, the salt is changed and a new password hash is created. (I assume this doesn't help much in a brute-force offline attack, but it should help against active online attacks, I would think.)
DatabasePassword = SHA512(...SHA512(SHA512(SHA512(password + salt) + salt) + salt) + salt)...)
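A minimal Python sketch of that construction, using only `hashlib` from the standard library. The function name `iterated_sha512` is hypothetical, and the UTF-8 encoding, the choice to feed the raw digest bytes (rather than hex) into the next round, and the 16-byte random salt in the usage lines are all assumptions that the formula above does not specify.

```python
import hashlib
import os


def iterated_sha512(password: str, salt: bytes, rounds: int = 100) -> bytes:
    """Apply SHA512(previous + salt) `rounds` times, starting from the password."""
    digest = password.encode("utf-8")  # ASSUMPTION: UTF-8 encoding
    for _ in range(rounds):
        # Each round hashes the previous output with the salt appended.
        digest = hashlib.sha512(digest + salt).digest()
    return digest


salt = os.urandom(16)  # ASSUMPTION: 16 random bytes per user
stored = iterated_sha512("correct horse battery staple", salt)
print(stored.hex())
```

The same password and salt always reproduce the same 64-byte value, so verification at login is a recomputation followed by a comparison.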
So, finally, my actual questions:
- Is my math correct? (You don't necessarily have to answer this, I'm sure it's close enough in principle to demonstrate my concerns.)
- Is this generation secure, or should I stick to 'traditional' password generation? Do note that an attacker doesn't know whether a user's password was generated with this algorithm or selected by the user. The attacker can make an assumption if they know the length, but that may or may not be a safe one.
- Lastly, did I make any assumptions that would significantly alter (increase or decrease) the security of this 'idea'? (By assuming each character of a 6-character password has 95 possible states, for example.)
Apologies for the length, I'm used to over-explaining myself to hopefully alleviate confusion.
It was pointed out that my question is extremely similar to this one. I want to point out the differences in my generation method (though, honestly, it's still similar enough that it could be considered a duplicate; I leave that up to the community to decide):
- Each word is separated by a space, which means that all but the first and last three characters have an additional potential state.
- The password is not selected by a human, it's (mostly) uniform-random generation. No words are preferred over others except to only allow one ultra-short (3 or 4 character) word, once the random generator selects a word of that length no more of those may be selected. (Though the position that word will be in the list of words is random still, and there may not be an ultra-short word selected.)
- This is mixed in with a separate password restriction, which means the attacker has two vectors to attempt to crack. The user could have selected a password meeting the 'traditional' requirements or a password meeting the 'XKCD' requirements.