74

I am designing a service that would, among other things, store sensitive information. To prevent unauthorized access to this information, it would be encrypted with a key derived from the user's password (PBKDF2). The password will be stored in a BCrypt hashed + salted format in the database. It is never stored in plain text.
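A minimal sketch of the key derivation described above, using PBKDF2 from Python's standard `hashlib`. The iteration count, salt size, key length, and the name `derive_key` are assumptions for illustration; none of them are specified in the question.

```python
import hashlib
import os

# Assumed parameters, not taken from the question.
ITERATIONS = 600_000
SALT_BYTES = 16
KEY_BYTES = 32


def derive_key(passphrase: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """Derive a symmetric key from the passphrase with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=KEY_BYTES
    )


# A random per-user salt is stored next to the ciphertext; the key itself is never stored.
salt = os.urandom(SALT_BYTES)
key = derive_key("correct horse battery staple", salt)
```

In this sketch the stored BCrypt hash would serve only for login, while the encryption key is recomputed from the passphrase each time it is needed.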

The nature of the saved information is such that a strong password is necessary. Some websites force their users to make up passwords with large entropy by enforcing strict character guidelines. These complex passwords can lead to password reuse on different websites [1]. This is A Bad Thing™ [2].

I would much rather have my users select a password that is both strong and not very likely to be reused or already in use on another, less secure web service. As such, I was thinking of using an XKCD-like [3] password scheme for my users in their native language.

The user would be presented with 4-5 different words, each longer than 6 characters, drawn from a large word list (no words with special characters included, just ASCII). The password input dialog would be formatted with 4-5 fields instead of the normal single field, to reinforce the passphrase paradigm. Upon registration the password can be regenerated at will, to give the user the ability to select a passphrase consisting of words that they can easily remember. The user cannot enter their own words.

I know from personal experience that CreeperHost [4] already uses this method, albeit with a single password field with four concatenated English words.

My questions are as follows:

  • Would this method be more secure/effective than allowing users to pick their own?
  • Does anyone have any experience with implementing a similar scheme? Was it effective?
  • Is dividing the password field into more than one distinct field beneficial or does it expose too much information?

I am looking specifically for answers related to real-world application of this specific method, if there are any. I am familiar with the theoretical strengths and weaknesses of this method of password generation.


  1. The Tangled Web of Password Reuse - http://www.jbonneau.com/doc/DBCBW14-NDSS-tangled_web.pdf
  2. Password reuse - http://xkcd.com/792/
  3. Password Strength - http://xkcd.com/936/
  4. Creeperhost - http://www.creeperhost.net/
mcgyver5
Stephan Heijl
    Does it make sense to _require_ your users to use a password of this form? It is likely to annoy users who use password managers. (If you know, somehow, that none of your users will ever use a password manager in connection with this site, then I suppose it's okay) – David Z Jan 02 '15 at 05:36
  • You bring up a fair point, I am not familiar with the way most password managers handle this situation. A rudimentary search shows LastPass supports this, but I'm not sure about others. – Stephan Heijl Jan 02 '15 at 06:45
  • This question is so full of win, I finally joined after a year+ of lurking. – Krista K Jan 02 '15 at 11:20
  • 5
    Just as a personal issue, if I had a choice between using a password I made up or one made up for me, I would choose the former every time. Especially since I find the XKCD claims to be false, and the four words are not easy to remember, and I definitely didn't have them memorized as quickly as claimed. I also know there was a TED talk referencing this where the person indicated that testing showed they weren't easy to remember. (If I don't have a choice, I'll probably put the password inside a password locker that does allow me to set my password. So I'd better be able to copy/paste quickly.) – trlkly Jan 02 '15 at 12:07
  • trlkly: The user has some say in this scheme, as passphrases can be influenced to some extent with a series of words that are easier to remember by "spinning the wheel" until a suitable phrase comes up. @cmdqueue referenced a study that shows they are approximately as easy to remember. Looking at this study shows that the more secure passphrases (pp-large) are perceived as less difficult and less annoying when comparing against an equivalent high entropy password. (https://cups.cs.cmu.edu/soups/2012/proceedings/a7_Shay.pdf) – Stephan Heijl Jan 02 '15 at 12:19
  • 3
    I always hate when sites force me to use a specific password format, in the end it's their data, it's up to them how they want to protect it. –  Jan 02 '15 at 12:19
  • @OndrejSvejdar Kindly reread the question. The purpose of the passphrase in this instance is not primarily as authentication. The passphrase is used as the key for a Key Derivation Function used to encrypt personal data. Other systems that I will not discuss in this question are concerned with user authentication. The most important factor in this question is that the user's data is encrypted and the key should not be present in plaintext on my end of the service. – Stephan Heijl Jan 02 '15 at 12:25
  • 1
    @StephanHeijl: I hope you're actually just using it to encrypt the real key? – SamB Jan 03 '15 at 06:06
  • [**Obligatory XKCD post.**](http://xkcd.com/936/) – Qix - MONICA WAS MISTREATED Jan 03 '15 at 15:05
  • 2
    If this was an "optional" service (e.g. not one imposed on me by my bank, for example) then I would regard any attempt to force me to use a password selected by the provider to probably be an indication of cluelessness on their part, and it would almost certainly motivate me to buy whatever service I was considering your site for from another provider. – Rob Moir Jan 04 '15 at 10:04
  • You should make sure you encourage your users to pick very strong passwords such as "correct horse battery staple" (yes, I just referenced xkcd), which is difficult for cracking programs to find out, instead of the traditional "throw in a number and an uppercase". Yes, this will improve the security, and you should inform users it actually works – Sarah O'Connor Jan 02 '15 at 02:52
  • Typically, yes. There are a ton of script kiddies and other random hackers that may target you or your client for fun. It's never good when a company who owns the website needs to put out a pr explaining that they got hacked... and you're part of the reason. – Henry F Jan 05 '15 at 04:35
  • One hidden advantage--you can be sure they aren't reusing a password. – Loren Pechtel Jan 05 '15 at 06:11
  • Your scheme would prevent me from using a stronger password than what you supply (e.g. a 20 char generated Lastpass password). You would receive a strongly worded email if I was one of your users. –  Jan 06 '15 at 09:36
  • @StephanHeijl : From my own experience, forcing staff to choose strong passwords is the best way to get several Post-it notes inside the first drawer of their desk! Or *(when it doesn’t happen)* mail to the sysadmin for a password reset every day. – user2284570 Jan 06 '15 at 16:42

8 Answers

34

Allow any passwords. Just highlight consequences.

Your scheme is weird and alien to users, and I believe many people would rather stop using your service than comply. Instead of letting them choose their own password, you're forcing them to remember one you chose for them. This is unacceptable to most people. You think that you're giving them a choice with "try again", but the difference in user experience is about the same as that between an ATM and a one-armed bandit.

Keep in mind that the biggest issue is the human factor. What you should avoid in the first place is passwords being written on a whiteboard. These guys had a very secure password. And then it aired on TV.

If you want to ensure no one on your side would be able to crack it, then what you need are internal procedures, not monkey-training users. Maybe a system where access to salted passwords is protected by the same retry limits as the user login panel. Besides, it's not important to make it impossible for workers to gain access. Clerks in stores have unlimited access to cash registers. What prevents theft is awareness of unavoidable consequences. Use admin-session recording. Use "minimum 2 persons to access" procedures. This is how banks do it, because strong user passwords can't protect everything.

Deer Hunter
Agent_L
  • 3
    Extra thought: **you** think the information is important. But you never know what users put in there. I have accounts in 6 banks, but 5 of them never have more than a few dozen bucks. Yet those banks still force me to guard pennies and thousands alike. – Agent_L Jan 02 '15 at 14:07
  • If banks had password policies by deposit amount then one day you could withdraw all the money from your thousands account and deposit in your penny account. It would be a shame if hackers unaffiliated with you broke through the penny security that same day. And it would be a big coincidence if you received thousands in mysterious wire transfers that same day - that would in no way relieve the penny bank from making good on your thousands in deposits. Password protection is for the bank's protection, not yours. – emory Jan 02 '15 at 14:39
  • 2
    @emory I wasn't writing about the bank analyzing my balance (when you set up your first pass the balance of your brand new account is obviously 0), I was writing that I am the only one capable of deciding the security level. The bank already has good protection, as it's not responsible if I displayed neglect managing my password. But yeah, they're protecting their image, "Bank X lost my money" headline never looks good even if "money" means "$0.12". – Agent_L Jan 02 '15 at 14:47
  • This. As much as I love security, adding annoyance on top of security does not equal better security – Raestloz Jan 05 '15 at 08:22
  • The "these guys" link is broken (it says "this site does not allow hot-linking"). Can you find another source for it, or link to the whole web page (so we can click their ads and pay for the hosting!)? – Darren Cook Jan 05 '15 at 21:00
  • 2
    [Image with password not hidden](http://www.frostbox.com/wp-content/uploads/2013/04/tumblr_m99m0evw311rsyfz8o1_1280.jpg) – Cole Tobin Jan 06 '15 at 06:34
  • Agreed. I’d probably have to write it down, which I do not normally do with my passwords. Additionally, some of the words may not make sense or be strange to non-native English speakers… – mirabilos Jan 06 '15 at 13:58
  • Sorry for the dead link. Full article (in Polish): http://niebezpiecznik.pl/post/wczoraj-w-wiadomosciach-tvp-ujawniono-hasla-do-systemow-ratowniczych/ . The credentials allowed access to RescueTrack, an international system for managing and dispatching medical helicopters. People who logged in reported they were able to send text messages to air crews. It remained untested whether a valid dispatch order could be issued this way. The TV show was recorded at LPR (Lotnicze Pogotowie Ratunkowe, the Polish air ambulance service). – Agent_L Jan 07 '15 at 14:34
21

The specific questions:

Would this method be more secure/effective than allowing users to pick their own?

Yes. You will reduce password re-use and remove the very common passwords immediately.

Does anyone have any experience with implementing a similar scheme? Was it effective?

I've seen a lot of 'alternative' password schemes in the past. Most just added support problems and complications without substantially improving security.

Is dividing the password field into more than one distinct field beneficial or does it expose too much information?

I think you've already defeated the basic automated no-knowledge attacks simply by being different. That leaves a targeted attack. However if there's public sign-up available then the attacker can discover that you need 5 words or whatever. So I don't think you are providing the attacker with more information by splitting it into fields.

Obviously don't provide advice about which word(s) are wrong though if authentication fails - that will undermine the security!
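One way to follow that rule, sketched under assumptions: join the separate fields into a single string before hashing and return only one pass/fail result, so nothing per-word exists to leak. PBKDF2 from Python's standard library stands in here for the BCrypt hash mentioned in the question, and the function names and iteration count are made up for illustration.

```python
import hashlib
import hmac
import os


def hash_passphrase(words: list[str], salt: bytes, iterations: int = 200_000) -> bytes:
    """Hash all words together so no per-word information exists server-side."""
    joined = " ".join(word.strip().lower() for word in words)
    return hashlib.pbkdf2_hmac("sha256", joined.encode("utf-8"), salt, iterations)


def verify_passphrase(words: list[str], salt: bytes, stored: bytes,
                      iterations: int = 200_000) -> bool:
    """Return a single pass/fail result, compared in constant time."""
    candidate = hash_passphrase(words, salt, iterations)
    return hmac.compare_digest(candidate, stored)


salt = os.urandom(16)
stored = hash_passphrase(["correct", "horse", "battery", "staple"], salt)
print(verify_passphrase(["correct", "horse", "battery", "staple"], salt, stored))  # True
print(verify_passphrase(["correct", "horse", "battery", "stable"], salt, stored))  # False
```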

More advice

There's lots of other good advice in other answers about usability.

If security and usability are important perhaps look at something better than passwords. SMS perhaps?

It's possible to faff around with passwords for ages and still not deal with the risks properly. Trojans on the user's computer, authentication reset via e-mail accounts that get hacked, phishing ... there are things that no amount of messing with fancy password schemes and controls will address*.

If the data is important I'd do something better than a password.

(Sometimes you can be better off just generating a random password for the user and asking them to store it securely in a locked drawer. It depends on which threats and attacks you are worried about, and how the system will be used.)

*There's quite a lot to be said for the risk-based two-factor approach Google offers, where the two-factor authentication is triggered only occasionally. It controls cost and gives a great security improvement. But I'd outsource this unless you have a lot of time on your hands :)

JCx
  • This is more like the answer I am looking for. To clarify, authentication would indeed be performed with an extra factor, like SMS. This question was mainly related to the data encryption-password link and the extra user experience benefits or hurdles associated with exposing the nature of the password generation method. – Stephan Heijl Jan 02 '15 at 14:55
  • Stephan - do you mean that you've already got a second factor, e.g. SMS, on all logins anyway? If so you can rather reduce the security on the password without affecting the risk much. Unless you are in a very high-threat environment. – JCx Jan 02 '15 at 15:19
  • All logins are performed with a second factor. However, due to the sensitive nature of the data stored, I fear that lowering the security of the password (from which the encryption key is derived) exposes too much of a risk should the database containing the encrypted files be seized, be it by malicious hackers or government intervention. – Stephan Heijl Jan 02 '15 at 15:35
  • Ah - got you. The crypto key makes the password entropy rather more important. – JCx Jan 02 '15 at 16:12
  • Okay - I'm pondering an update to the answer. Are you worried about a local attack on the users where their home is searched, or just electronic attack? Are you doing the decryption on the client side or the server side? – JCx Jan 02 '15 at 16:17
  • 1
    Decryption is performed client side, all the data is transferred over HTTPS. Since this is an enterprise application my intent is to secure against break ins in an office environment or server seizures on my end. – Stephan Heijl Jan 02 '15 at 16:22
  • Let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/19927/discussion-between-jcx-and-stephan-heijl). – JCx Jan 02 '15 at 16:46
  • 2
    First claim is unjustified, and skirts the question entirely. Many users have secure schemes, like "correct horse battery staple", that don't meet *your* stupid requirements, so they have to revert to using "correctHorse4" instead. Or just "horse4abc" since you also prohibited the same character twice in a row (looking at you, USPS). – djechlin Jan 02 '15 at 22:06
14

You should read about Diceware. You can get 44 bits of entropy with four words from a much smaller list. If your list is 2050 or so words and you select without replacement, each word will be about 11 bits of entropy because log2(2048) = 11. With 5,000 words you get slightly more than 12 bits per word (log2(4096) = 12, and log2(5000) ≈ 12.3).
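The arithmetic above can be checked with a short sketch. The word list here is a generated placeholder, and the function names are invented for illustration; the selection uses the operating system's CSPRNG through Python's `secrets` module and draws without replacement, as described.

```python
import math
import secrets


def passphrase_entropy_bits(list_size: int, n_words: int) -> float:
    """Entropy of n_words drawn in order, without replacement, from list_size words."""
    return sum(math.log2(list_size - i) for i in range(n_words))


def generate_passphrase(wordlist: list[str], n_words: int = 4) -> list[str]:
    """Pick n_words distinct words using the operating system's CSPRNG."""
    return secrets.SystemRandom().sample(wordlist, n_words)


# Placeholder list; a real deployment would load actual words.
wordlist = [f"word{i:04d}" for i in range(2048)]
print(generate_passphrase(wordlist))
print(round(passphrase_entropy_bits(2048, 4), 2))  # 44.0
print(round(passphrase_entropy_bits(5000, 4), 2))
```

Drawing without replacement costs only a few thousandths of a bit at these list sizes, so the "11 bits per word" rule of thumb holds.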

I like the idea of a "try again" link that the user can click until presented with a combination they think they can remember.

One thing about your question bothers me. If the data are encrypted with a particular user's password, then only that user will be able to decrypt the data. If your data is stored on a per-user basis, that may be OK. Do note that if a user forgets his or her password, then the data are toast if the password is the crypto key. You really might want to consider an asymmetric key arrangement in light of this.

For access, do not separate the input into four fields. That gives an attacker information the attacker should not have. Just decide whether the words shall be separated by spaces or not and tell your users.
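If a single field is used, the separator question can be softened by canonicalising the input before it is hashed or fed to the key derivation. A minimal sketch, assuming that lowercasing and collapsing whitespace are acceptable for the chosen word list (the function name is hypothetical):

```python
import unicodedata


def canonical_passphrase(raw: str) -> str:
    """Normalize Unicode form, lowercase, and collapse whitespace to single spaces."""
    normalized = unicodedata.normalize("NFKC", raw)
    return " ".join(normalized.lower().split())


print(canonical_passphrase("  Correct   horse\tBattery staple "))  # correct horse battery staple
```

The same canonical form has to be applied at registration and at every later login, otherwise the derived key will not match.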

Bob Brown
  • 1
    While it's important to realize that using the password as the key means only someone who knows the password can decrypt it (i.e. no recovery of data is possible if the user forgets their password), that's quite possibly a _good_ thing, especially as one criterion is "no one on our side can decrypt it." If it's possible to recover your data when you forget your password (and any recovery key provided to you while you still knew your password), it means someone on their end can decrypt your data, which can be bad. – cpast Jan 01 '15 at 22:51
  • I agree completely, except for the last paragraph. It doesn't really give the attacker any important information that she wouldn't have otherwise - not only is the number of words in the passphrase not important (the length is not part of the entropy), but it is known and revealed to any potential user, which the attacker can easily pretend to be. On the other hand, the upside is very strong user education - which is sorely needed if you want to push them to get used to a four-word passphrase... – AviD Jan 01 '15 at 23:02
  • 1
    This is intentional, the data should not be recoverable if the key is lost. This is why I'm looking into passphrase generation methods that yield easily memorized results. I like the idea of Diceware, but the native language of my users is mainly Dutch, so I'm not sure if the wordlist they propose is suitable. As for separating the fields; shouldn't we always assume that the method of password generation is known to the attacker? It seems like a trivial amount of information to expose, while significantly enhancing user experience. – Stephan Heijl Jan 01 '15 at 23:03
  • I upvoted your answer, but I was looking for answers specifically related to real-world effectiveness of application of this paradigm, not just theoretical password strength. I will clarify this in my original question. – Stephan Heijl Jan 01 '15 at 23:06
  • @StephanHeijl: You can use Diceware's mechanism without using English words. What are the most common 5,000 or so Dutch words of five or more letters? – Bob Brown Jan 01 '15 at 23:11
  • OK... I've been convinced that separate password fields is "mostly harmless." I've given up mainly because of the problem of making sure the user understands what to use for a separator character. – Bob Brown Jan 01 '15 at 23:13
  • The mechanism itself is quite irrelevant, picking the words will be performed by a sufficiently secure RNG. Of course, knowing that 5k words with 4-6 characters is sufficiently entropic is useful. – Stephan Heijl Jan 01 '15 at 23:15
  • 5
    A small study https://cups.cs.cmu.edu/soups/2012/proceedings/a7_Shay.pdf covers usability aspects of passphrases, including data on sentiment, recall, speed, dictionary choice, length, effectiveness and user tendencies if the system generates the passphrases. Based on what I've learned, security is more effective if it's more usable, remembered and not written down. People will reuse credentials anyway. I don't have enough experience to know for BCrypt, but I've seen demos that known plaintext (especially spaces, which reveal text) and its approximate position should be avoided. – ǝɲǝɲbρɯͽ Jan 02 '15 at 03:59
  • 1
    That's a great find, cmdqueue2. I read the study and though there are some limits to its scope (not being able to receive a new passphrase, 30bit entropy on the passphrase.) it does offer more insight into the issue. If you'd like you can reformulate this into an answer, as this is an important resource in this question. – Stephan Heijl Jan 02 '15 at 07:09
  • Personally, I'd think separate fields could be painful if the user was using some sort of password manager ... of course, maybe that's okay. – SamB Jan 03 '15 at 06:11
  • `I like the idea of a "try again"` it should be mentioned that this robs you of at least a couple of bits of entropy though. – Cthulhu Jan 03 '15 at 08:00
8

The problem I see with this approach is simply the non-conventional nature of it, which will reduce the likelihood of people even using it.

Personally, I would have to have a very compelling reason to use the service in order to put up with this, as opposed to traditional password entry. I've commented in the past that I think application developers have some responsibility to try to ensure that users use strong passwords. However, ultimately this is the user's responsibility.

Also consider that you're forcing the user to employ one particular method of ensuring password entropy. The XKCD approach is fine, but it isn't the only one and there isn't absolute, empirical evidence that it is the best, nor is there any assurance that if it is the best today, it will remain the best six months from now.

Do consider the use of password managers, which can generate very large, completely random passwords and enter those automatically for the user. You would likely break those applications with this approach.

Also, consider things like the grc.com password haystacks article, which claims that this:

D0g.....................

...is a stronger password than this:

PrXyc.N(n4k77#L!eVdAfp9

...simply because there is one extra character, despite the overall entropy (randomness) being fairly low, and it would take a computer 95 times longer to brute-force guess the password that is easier for you to remember.

The problem is that brute-force attacks are only really used up to a certain length of password (6 to 8 characters) because of the ridiculous computational expense as the passwords get longer. But even very complex short passwords are highly vulnerable to brute force. Newer attacks are hybrid pattern-based attacks that use compromised passwords as patterns within larger passwords. Powerful GPUs make this approach workable, and apparently it is highly effective. Simply padding a weak password with .'s doesn't appear to be a good solution.
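To put rough numbers on the difference, here is a back-of-the-envelope sketch. The exhaustive figure follows from the 95 printable ASCII characters; the pattern-attack figures (dictionary size, padding symbols, maximum length) are purely hypothetical assumptions chosen for illustration.

```python
import math

PRINTABLE_ASCII = 95

# Exhaustive search over every 24-character string.
brute_force_bits = 24 * math.log2(PRINTABLE_ASCII)

# Hypothetical pattern attack: dictionary word, one repeated padding symbol, total length.
base_words = 1_000_000   # assumed attacker dictionary, including leetspeak variants
padding_symbols = 33     # ASCII punctuation characters plus space
max_length = 32          # assumed upper bound on total length
pattern_bits = math.log2(base_words * padding_symbols * max_length)

print(f"brute force: {brute_force_bits:.1f} bits")
print(f"pattern:     {pattern_bits:.1f} bits")
```

Under these assumptions the padded password sits in a search space of roughly 30 bits for an attacker who guesses the pattern, against roughly 158 bits for blind brute force.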

Also see this article, Password complexity rules more annoying, less effective than lengthy ones.

So, at least some recent research indicates that the most relevant factor in password strength is password length, at least in part because passwords with a lot of complexity are just too much for most users, even if they're short. The XKCD approach helps ensure password length by making it easy to remember very long passwords, without those passwords being simple well-known patterns or phrases. That is (probably) its ultimate strength--just the length of the resulting passwords. Having said that, I'm starting to wonder how long "correct horse battery staple" will hold up to newer hybrid pattern-based attacks.

Since there are other approaches to achieving the same result, forcing users into that one may actually be making your service more difficult to use without a commensurate payoff in security. Also, if some weakness is found in the XKCD approach six months from now, you are hard-wired into it and must expend substantial effort to change your service.

Craig Tullis
  • I agree that the approach is somewhat unconventional. I can only think of two services that use an enforced, non-user password/phrase. I should add that the software this solution is designed for is quite specialized. People looking for this product would do so with security in mind, but wouldn't necessarily be security experts. While I agree that the password haystacks method works for personal password selection, it would break down when generated for every user. Padding a simple word for every user reduces the search to the simple phrase + the character * padding length. – Stephan Heijl Jan 02 '15 at 12:45
  • With regards to security weaknesses being found in a particular password generation scheme: I think the same argument could be made for any scheme. Purely mathematically the approach is sound. – Stephan Heijl Jan 02 '15 at 12:47
5

It sounds like a great way of preventing password reuse, although as you may be nudging users out of their comfort zone it might cause them to save the password into a text file so they can remember it. You should stress that the user must remember their password, or use a password manager in order to store it securely (if that is something acceptable to you and your system). I would also make sure the multiple field solution works with common password managers, although I cannot see any other issues with splitting the words this way.

As password manager functionality with several fields designated as the password may be difficult to implement, I would also give users the option of using their own passwords. However, you should state that users must only use securely generated random sequences, for example those produced by the password generators included in Keepass or LastPass. To prevent a user from just entering the same password they always use, you could enforce a silly restriction that applies only to user-entered passwords, to ensure they are actually unique. For example: 50 characters minimum, using at least 20 numbers, and it must include lower/uppercase and at least one symbol. This will make expert users happy, while preventing weak passwords from others (although you may get some users padding their weak passwords with 45 "2"s and one "!"). You could try to detect cases such as these and then reinforce the message that this option is for use with password generators only, however there would be diminishing returns in putting so much effort into edge cases. The better choice would be to try to explain to non-expert users that the generated-words option would be more beneficial.

I am designing a service that would, among other things, store sensitive information. To ensure that no one on our side is able to retrieve this information, it would be encrypted with a key derived from their password (PBKDF2). The password will be stored in a BCrypt hashed + salted format in the database. It is never stored in plain text.

For password-based encryption to be secure against offline attacks (e.g. by the people on your side, or by an attacker who gained access another way), you would need an entropy of 128 bits or more, which means you would need ten random words rather than four or five (assuming roughly 13 bits per word, as with a Diceware-sized list of about 7,776 words). However, as you are using PBKDF2 to strengthen the key, the added work per guess may be worth the equivalent of 16 bits or so of additional entropy (depending on iterations), so only nine words would be needed. My point is to make sure that you do the maths to ensure your encryption is adequate.
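The word counts can be reproduced with a few lines. This sketch assumes a Diceware-sized list of 7,776 words for the ten/nine-word figures and treats 65,536 PBKDF2 iterations as worth 16 bits of stretching; both are illustrative assumptions, as is the function name.

```python
import math


def words_needed(target_bits: float, list_size: int, stretch_bits: float = 0.0) -> int:
    """Smallest number of random words whose entropy plus stretching meets the target."""
    bits_per_word = math.log2(list_size)
    return math.ceil((target_bits - stretch_bits) / bits_per_word)


stretch = math.log2(65_536)  # 16 bits, assuming 2**16 PBKDF2 iterations

print(words_needed(128, 7776))           # 10
print(words_needed(128, 7776, stretch))  # 9
print(words_needed(128, 2048))           # 12
print(words_needed(128, 2048, stretch))  # 11
```

With a smaller list such as 2,048 words the requirement grows to eleven or twelve words, which shows how sensitive the result is to the list size.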

This would make it even more difficult for your average user to remember the passphrase, and if they used a password manager you may as well go for the 50 character option and let them log in that way. As others have pointed out, it seems this could potentially generate a lot of support requests and it is a lot of effort to develop only one part of a secure authentication and session management system.

SilverlightFox
  • Accommodating edge case users who somehow mentally keep track of 40+ bit entropy passwords seems like a waste of time compared to the security that is sacrificed by allowing custom passwords, even when requiring absurd standards of potential entropy. I will look into the password manager suggestions, as these are becoming more popular and have the potential of providing more secure keys. – Stephan Heijl Jan 02 '15 at 12:49
  • @StephanHeijl: Agreed. I added that option in there as implementing the five fields option to make it compatible with multiple browser-based password managers may be difficult. It also reinforces the "use a password manager" message, as I believe that many users will simply save the five words in a text file given only the first option. – SilverlightFox Jan 02 '15 at 13:04
  • @StephanHeijl The only password entropy factor that is truly relevant is the length of the password. Require a 25-character password, don't worry about other "complexity" factors because they just don't mean that much, and you're well on your way to a solid solution. – Craig Tullis Jan 02 '15 at 18:18
  • 2
    @StephanHeijl many users will just write the 5 words on a post-it note and stick it to their monitor or under their keyboard. Trust me--that's what happens. ;-) – Craig Tullis Jan 02 '15 at 18:19
  • Agreed, users who don't already have a password manager will be faced with a choice between writing the password down insecurely (many will take this), trying to remember it then forgetting it and using password reset (many will take this accidentally at least once), or installing some software they don't understand to remember it for them (almost none will do this). So decide whether that scenario is better or worse than them using and re-using their own weak passwords and away you go. Just don't imagine they'll follow simple instructions like "memorise this phrase". – Steve Jessop Jan 03 '15 at 02:09
2

Your method is interesting, but I think I don't like it.

  1. You allow the users to "respin the wheel" until they find a password they like. If you did this with four-digit numbers, I'm afraid some people might spin again and again until they end up with 1234 or 0000. This would make the attacker happy.

  2. Even though there are a lot of combinations, an attacker can restrict to trying passwords only of your prescribed structure (and in fact they can use your very service to determine what word list you use).
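The cost of the first point can be bounded for random word passphrases. If a user looks at k candidates and keeps a favourite, any particular passphrase ends up chosen with probability at most k divided by the number of possible passphrases, so the worst case loses log2(k) bits. A small sketch of that bound (the function name is hypothetical, and words are assumed to be drawn independently):

```python
import math


def worst_case_bits_after_rerolls(list_size: int, n_words: int, rerolls: int) -> float:
    """Lower bound on remaining entropy when the user keeps one of rerolls + 1 candidates."""
    full_bits = n_words * math.log2(list_size)
    candidates_seen = rerolls + 1
    return full_bits - math.log2(candidates_seen)


print(worst_case_bits_after_rerolls(2048, 4, 0))   # 44.0
print(worst_case_bits_after_rerolls(2048, 4, 15))  # 40.0
```

Capping the number of rerolls therefore caps the loss, whereas an unlimited "spin again" button does not.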

Hagen von Eitzen
1

Before trying to implement any password policy you really should watch this great presentation by Rick Redman at AppSecUSA 2014 titled "Your Password Complexity Requirements are Worthless":

https://www.youtube.com/watch?v=zUM7i8fsf0g

He discusses cracking passwords using the similar patterns that all humans use to create passwords.

Rory Alsop
0

The user would be presented with 4-5 different words, each longer than 6 characters, drawn from a large word list (no words with special characters included, just ASCII). The password input dialog would be formatted with 4-5 fields instead of the normal single field, to reinforce the passphrase paradigm. Upon registration the password can be regenerated at will, to give the user the ability to select a passphrase consisting of words that they can easily remember. The user cannot enter their own words.

This is going to be an issue.

This means that you have to have a list of the words that are being used in the password phrase, and that there is no deviation from that list.

If a hacker is able to get the list somehow, whether it be from brute force attempts or social hacking or whatever other way, then they have the list of words and it is that much easier for them to figure out the password.

If you instead make it so that a user can enter their own words and allow numbers and/or special characters, the password phrase will be much more secure and you will still keep them from using the same password as {everywhere else} because you are still forcing 4-5 words for the password phrase.

Couple of things (some of which were already mentioned)

  1. Make sure the words don't repeat more than once
  2. Don't allow them to repeat a word more than once (pass, word, pass, word)
  3. Don't use the password as the key for encrypting the data
    • This is a very bad idea, especially if the user forgets their password and needs a password reset.
  4. Don't use a password scheme
    • This leaves clues for a hacker to eliminate words and phrases, narrowing the field.
Malachi