186

An NHS doctor I know recently had to do their online mandatory training questionnaire, which asks a bunch of questions about clinical practice, safety and security. This same questionnaire will have been sent to all the doctors in this NHS trust.

The questionnaire included the following question:

Which of the following would make the most secure password? Select one:

a. 6 letters including lower and upper case.
b. 10 letters a mixture of upper and lower case.
c. 7 characters that include a mixture of numbers, letters and special characters.
d. 10 letters all upper case.
e. 5 letters all in lower case.

They answered "b", and they lost a mark, as the "correct answer" was apparently "c".

It is my understanding that as a rule, extending password length adds more entropy than expanding the alphabet. I suppose the NHS might argue that people normally form long passwords out of very predictable words, making them easy to guess. But if you force people to introduce "special characters" they also tend to use them in very predictable ways that password guessing algorithms have no trouble with.

Although full disclosure, I'm not a password expert - I mostly got this impression from Randall Munroe (click for discussion):

[Image: xkcd comic "Password Strength"]

Am I wrong?

Robin Winslow
  • 1,738
  • 2
  • 11
  • 10
  • 204
    I like that they test people on the concepts of passwords, but that is a horrible set of possible answers. – schroeder Oct 06 '16 at 20:45
  • 30
    Ultimately it is a very poorly worded question with no clear answer. You are correct that generally length is more important than the character set when it comes to preventing brute-force cracking of a password, but without defining what characters are included in "special characters" it is impossible to tell which is better. Option c also doesn't specify upper and lower case letters, so it's entirely possible that it is a smaller character set and shorter password. 26 lower case letters + 10 numbers + half a dozen common special characters is fewer characters than just upper and lower case letters – Evan Steinbrenner Oct 06 '16 at 20:58
  • 57
    Did you have training, or was this quiz out of the blue? The right answer will be the one that the training provided, not the "true" answer. This is a sad state of affairs, but I expect it isn't limited to infosec and you hit similar issues with medical stuff. Props on being an NHS doctor; I'm British and you do amazing work. – paj28 Oct 06 '16 at 22:11
  • 26
    Considering how many /banks/ are wrong about password security I can't say I'm surprised… – StarWeaver Oct 06 '16 at 23:22
  • 3
    I suspect the authors are looking at this from the practical standpoint rather than theoretical. Fact is, most cracking attempts start with dictionary based searches, possibly with a few trivial substitutions. So while more letters may be less guessable than fewer characters including punctuation, crackers are fairly likely to try all letters first before widening the character set much. So _practical_ time-to-break may be very different from theoretical. – keshlam Oct 07 '16 at 01:21
  • 6
    This is one of those Internet questions that gets me all riled up and wanting to start an angry letter writing campaign to someone. – Michael Oct 07 '16 at 01:46
  • 22
    @Michael we need a "i found out on the internet that someone is wrong IN REAL LIFE" xkcd for these occasions. – StarWeaver Oct 07 '16 at 02:06
  • 1
    The "c" answer is also bad because it mentions a 7-char password, which is *way* too short for all standards. – dr_ Oct 07 '16 at 06:43
  • 3
    @keshlam This is kinda my question. I'm actually more interested in the practical implications than the absolute total number of random possibilities. The problem with that logic about special characters though is that anyone *trying* to break into somewhere with the special characters requirement will know about the requirement. I am guessing that people tend to use special characters in very predictable ways, and so the benefit of increased length (ahem) is still way superior. But I'd welcome anyone who knows of research on this. – Robin Winslow Oct 07 '16 at 06:44
  • 1
    The only problem I have with that comic is that I won't be able to use that password almost anywhere because everyone has requirement for number, case (and sometimes even special char). – Zikato Oct 07 '16 at 11:25
  • And secure from what threat? If the server stores the password in plain text then they are all equally secure from that. You might argue the shortest one is most secure then, as you are least likely to write it down and leave it on your desk. – JohnB Oct 07 '16 at 13:59
  • 1
    This is the same NHS which still funds homeopathy, remember! Don't get me wrong, I think the NHS is one of the best things we have in Britain - but it has chronic management problems, and its track record with IT is best summarised by the Despair.com quote: "Sometimes your purpose in life is to serve as a warning to others". So surprised at the fact they can't get some basic IT training right? Not very. – Graham Oct 07 '16 at 14:48
  • There is the point that the C password will (if appropriately constrained) contain some "special" characters, preventing the use of simple names or common words/phrases. This doesn't increase the statistical strength of the password but does make it less susceptible to a dictionary cracking scheme. – Hot Licks Oct 07 '16 at 21:07
  • 1
    No, it doesn't. The point the OP is asking/making about common substitutions is dead on the money, and current password cracking algorithms use heuristic analysis, not simple brute force dictionary attacks. No matter what characters are in a 7 character password, it's going to be broken in a relatively shorter period of time. There just aren't enough permutations and the computer can try every single possibility regardless of complexity in a short timespan. 8 chars is an order of magnitude better but still too short. 9 is vastly better than 8, and 10 vastly better than 9, by *huge* margins. – Craig Tullis Oct 08 '16 at 02:46
  • What is NHS?.... – Celeritas Oct 08 '16 at 03:20
  • 1
    10^26 >> 26^10. That is, twenty six digits has much greater combinations than ten letters. Length is nearly always more important than width. – Chloe Oct 08 '16 at 04:56
  • Entropy depends upon the alphabet and number of symbols used. In case of English words (which number around a million) : a pass word of four such randomly chosen words gives an entropy of 79 ie (log2(10^[6*4]). Clearly when we are choosing English words-- our alphabet is not one , of some 52 odd elements...as some seem to confuse here. – ARi Oct 08 '16 at 08:19
  • Of course even a 26 symbol set can describe a string of words ...In this regards one need only ensure adequate length so that the password entropy as 'seen' with 26 symbol alphabet view is comparable; for an entropy of more than 72 here we require a total of around 17 characters . i.e.72/Log2(26) – ARi Oct 08 '16 at 08:50
  • 6
    @Celeritas [National Health Service](https://en.wikipedia.org/wiki/National_Health_Service) in the United Kingdom. – gerrit Oct 08 '16 at 13:16
  • 1
    Reminds me of the IT security compliance quiz administered by a healthcare provider in Texas, where one of the quiz questions tried to see if people would project their knowledge of biological viruses onto computer malware. Unfortunately for the supposed experts writing the quiz, computer viruses are highly likely to cause an increase in system temperature and sluggishness, as a result of the CPU load caused by the malware trying to spread. – Ben Voigt Oct 10 '16 at 16:57
  • 1
    Note that a **human factor** may be involved in their decision. Pushing them away from letter-only passwords (even if that means they don't use quite as long a one because it's hard to remember) could, on average, create more secure passwords simply because you get fewer people using things like `secretpass` (which is 10 chars...yet would probably be broken quite quickly) or other extremely simplistic word-style passwords. ...still a crummy question though. – Jimbo Jonny Oct 10 '16 at 18:03
  • 4
    Possible duplicate of [XKCD #936: Short complex password, or long dictionary passphrase?](http://security.stackexchange.com/questions/6095/xkcd-936-short-complex-password-or-long-dictionary-passphrase) – Auzias Oct 11 '16 at 06:48
  • I think most of the answers here are missing the point. The question is not asking a bunch of network admins "what password _requirements_ are best to enforce?", it's asking a bunch of users "which password (assuming all of them meet the minimum requirements) is better to pick?". In other words, it's trying to get the user to think about how they choose a password, to think about the very factors which many answers identify as key (randomness is not a valid real-world assumption unless we enforce the use of password managers, most of which use a non-random password to gain access). – Adam Oct 11 '16 at 15:32
  • Not for any specific randomly generated password, but for a policy, or, as Adam mentions, requirements, they are correct that the special characters make it more likely that passwords which are more difficult to guess/reason will be generated, since users generally don't create completely random ones. – PoloHoleSet Oct 12 '16 at 19:18
  • 1
    this question asks "why" http://security.stackexchange.com/questions/139594/why-do-the-large-majority-of-big-organizations-have-known-bad-password-policie – Trevor Boyd Smith Oct 12 '16 at 20:03
  • 1
    @StarWeaver there is an XKCD for that one: https://xkcd.com/386/ – alroc Oct 13 '16 at 12:53
  • @Evan Steinbrenner, Although you are correct in principle, "Numbers and special characters" have a simple intuitive set (I hesitate to call it "definition"): it's the symbols on the 21 non-alphabetic buttons of the left-keyboard. They are all of the following: `1234567890-=~!@#$%^&*()_+[];'\,./{}:"|<>? – mr23ceec Oct 13 '16 at 15:46
  • (contd.) it's easy to see that 52^10 ~= 1e17 and 94^7 <= 1e14. What is harder to prove is that "10 characters in upper and lower cases" is almost always 1 or 2 words (I'd say ~20 bits of entropy, since all the long words will be single) , while "7 character in upper and lower cases" is one word, but with weird shit tacked on (28 bits according to the chart, but let's agree on >25, shall we?) – mr23ceec Oct 13 '16 at 15:52
  • You know what? I'm fine with saying 2 words are <=24 bits, as per chart. – mr23ceec Oct 13 '16 at 16:09
  • 1
    It is interesting that no-one has compared this against the various publications that NHS organizations put out or point to on the subject of password security. Also, [here is a similar, but different, quiz by the Royal Wolverhampton Hospitals NHS Trust](http://www.royalwolverhamptonhospitals.nhs.uk/jdoi/downloads/Intro_to_Information_Governance_Booklet_070411.pdf#page=33). And enjoy [this guidance on how to write passwords down from another part of the NHS](http://www.northamptonshire.nhs.uk/resources/uploads/files/IM&T_14.pdf#page=16). – JdeBP Oct 13 '16 at 18:26
  • 1
    Too bad password [173467321476C32789777643T732V73117888732476789764376Lock](https://www.youtube.com/watch?v=oNrWgjh9tnU) is publicly known – chux - Reinstate Monica Oct 15 '16 at 14:17
  • Doggone doctors. They missed the day in their first-year medical school class in computer science where the topic was information entropy. Oh, wait, they take physiology and genetics in medical school, not computer science. – O. Jones Oct 16 '16 at 17:26

9 Answers

229

By any measure, they're wrong:

Seven random printable ASCII: 95^7 = 69 833 729 609 375 possible passwords.

Ten random alphabetics: 52^10 = 144 555 105 949 057 024 possible passwords, or over 2000 times as many.

Length counts. If you're generating your passwords randomly, it counts for far more than any other method of making them hard to guess.
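The two counts above can be reproduced with a few lines of Python. This is only a sketch under the same assumptions as the answer (uniformly random passwords, 95 printable ASCII characters for option c, 52 letters for option b); the helper name `count_passwords` is made up for illustration.

```python
def count_passwords(alphabet_size: int, length: int) -> int:
    """Number of distinct strings of `length` symbols drawn from `alphabet_size` symbols."""
    return alphabet_size ** length


# Option c, read generously as 7 characters from all 95 printable ASCII characters.
seven_printable_ascii = count_passwords(95, 7)

# Option b: 10 letters, upper and lower case (52 symbols).
ten_mixed_case = count_passwords(52, 10)

print(seven_printable_ascii)  # 69833729609375
print(ten_mixed_case)         # 144555105949057024

# How many times larger the search space of option b is (a little over 2000).
print(ten_mixed_case // seven_printable_ascii)
```

Python integers are arbitrary precision, so the counts are exact rather than floating-point approximations.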

Pablo A
  • 123
  • 5
Mark
  • 34,390
  • 9
  • 85
  • 134
  • 26
    They didn't say ASCII, and many (most?) systems allow Unicode passwords. There are 128,172 Unicode characters, so "c" would be correct on that basis. – paj28 Oct 06 '16 at 22:06
  • 2
    In fact for "b" they didn't say they were upper and lower case LATIN letters. I expect Unicode has a job load of non-latin upper and lower case. – paj28 Oct 06 '16 at 22:19
  • 8
    @paj28, the concept of capitalization is really only a feature of the Latin and Greek alphabets, and some of the alphabets derived from them. – Mark Oct 06 '16 at 22:27
  • 17
    With a [quickscript](http://dpaste.com/3HJGDJK) I count 1984 lower and 1631 upper case characters. 3615 ** 10 = 381138671891365331133862350087890625 (for b) which is still less than 128172 ** 7 = 568266595760666481405057656211161088 (for c). Maybe we could trim some unprintable characters out of (c) to lower the number - but it's pretty clear the people who wrote the question did none of this maths! – paj28 Oct 06 '16 at 22:31
  • 89
    @paj28 Broadly speaking this is a good point, but in practice I think it's not realistic to expect that the average doctor's system in an English-as-primary-language country will have anything but ASCII key inputs readily available (if they have to figure out how to configure multiple keyboard layouts on possibly multiple machines that might not be "theirs" to configure, and work input method toggle hotkeys into their password-entry flow, and possibly get locked out when they have to login into a workstation that isn't already thus configured, it might as well not exist). – mtraceur Oct 07 '16 at 01:06
  • @mtraceur If you're taking security seriously enough to do questionnaires on them, you're probably beyond the point of typing them in manually, they're bound to have physical access keys (like an RFID chip or a thumb drive with the key). – Kevin Oct 07 '16 at 07:17
  • 8
    @Kevin if only... With plenty of mutually incompatible systems the NHS is the type of institution that runs on written down passwords. About the best you can hope for is that it's not on a post-it under the keyboard. Some *more sophisticated* hospital staff I've come across have had a plaintext document on a password-protected phone. Others a book. – Chris H Oct 07 '16 at 07:24
  • 1
    @ChrisH Seriously? I'm an app developer who makes apps for Dutch healthcare institutions, the NHS is going to be our first international customer next year. Almost all Dutch healthcare organisations use RFID chips to authenticate, the healthcare inspection basically forces them to do so. If NHS is really that insecure I'm not looking forward to next year lol :P – Kevin Oct 07 '16 at 07:53
  • 1
    @Kevin I only have my own eyes to go on, and don't spend a lot of time in healthcare places so things may have moved on. But there are still a lot of legacy systems in use, and the classic problems of multiple passwords with different rules, all of which must be changed at different frequencies. The NHS were one of the biggest bodies to pay for extended XP support after end-of-life. I believe they've stopped now but that doesn't mean everything has moved on – Chris H Oct 07 '16 at 08:29
  • Any american keyboard has diacritics and can be configured in us-intl to print è, é, ò, ç, ë, õ.. 5 diacritics, usually on the 5 vowels, that's 50 more characters, or `145^7 ≈ 1.3 × 10^15` or just 100 times less. – njzk2 Oct 07 '16 at 15:17
  • 10
    [This answer](http://security.stackexchange.com/a/137322/112339) to a different question lays out the case against using non-ASCII characters in passwords. It's got nothing to do with the theory (larger character sets are better) and all to do with the practice (you cannot trust third party system implementers to process non-ASCII text reliably). – Luis Casillas Oct 07 '16 at 18:03
  • And moreover, in the case of c, if the system requires that all passwords contain some mixture of upper/lower symbol and numbers, then the number of possible passwords decreases a good bit, depending on the rules, of course. – JimmyJames Oct 07 '16 at 20:26
  • @JimmyJames, just as increasing the alphabet size doesn't help things much, decreasing it doesn't harm things much -- in theory. In practice, everyone puts the uppercase letter at the start of the password, and the digit and symbol at the end. – Mark Oct 07 '16 at 20:39
  • @Mark Please help me understand this. Let's say the rules require at least one uppercase, one numeric and one symbol. The number of possible passwords is then 23*10*26*95^4 = 487,074,737,500 or 143 times less than the 95^7 figure. 143 times less is a lot in my book e.g. 143 months to brute force all hashes is a lot more than 1 month. Either my math is wrong or we have different ideas about what 'much' means. – JimmyJames Oct 07 '16 at 21:00
  • 1
    @JimmyJames, Two errors in your math. First, there are 33 symbols, not 23, and second, the restricted characters can appear anywhere in the password. The correct formula is (33*10*26*95^4*7!)/(3!*4!) = 24,459,622,687,500, or about a third as large as 95^7. – Mark Oct 07 '16 at 21:41
  • 1
    @Mark Yeah, I realized the second error as I was enjoying my dinner beer. Clearly I was B12 deficient. So OK not as big a difference as I had thought but still, 1/3 is not an insignificant difference. – JimmyJames Oct 08 '16 at 02:11
  • 1
    @njzk2: it is just silly to assert that typical users are going to put diacritics in their passwords. It just isn't going to happen, end of story. The length of the password is paramount. And the variety of the input **in practice** is smaller than the full ASCII character set. – Craig Tullis Oct 08 '16 at 02:50
  • @Craig right, because typical users are of course Americans. How silly of me to consider the rest of the world. – njzk2 Oct 08 '16 at 22:01
  • @njzk2: nice ignition. They use a lot of umlauts in Britain, do they? English is a language that isn't used much for, say, business in the rest of the world? I'm just speaking to practicality. Bottom line, the actual size, in practice, of the alphabet from which passwords are chosen is much smaller than the full Unicode character set. Theory is nice, but reality is what it is. Besides, **you're** the one who was talking about American keyboards and diacritics. – Craig Tullis Oct 08 '16 at 22:13
  • @Craig yes, because that's the keyboard I happened to be writing that comment on. But that should only emphasise the fact that even people who don't usually use diacritics can still put some in their password. (but like you said, that's theory. In practice, we all know that the password is always "123456") (I should point out, though, that my initial point was that *even* by considering various common diacritics, we would be shifting the order of magnitude, but not the outcome of the comparison) – njzk2 Oct 09 '16 at 00:31
  • 1
    The problem with this answer is that people don't choose passwords randomly *at all*. So it's better to look at a large database of actual passwords and compare for each of these constraints how easy the average password meeting the constraint is to crack. – reinierpost Oct 11 '16 at 10:24
  • Flipside is that if you have only letters the chances are that the password contains a dictionary word.... Probably only a dictionary word/s – undefined Oct 11 '16 at 22:25
  • @Kevin The NHS does indeed have a smartcard-based authentication solution. However, as with any SSO solution, integration is a pain, and many vendors don't do it, and of course it's not integrated into solutions that predate its introduction. – James_pic Oct 12 '16 at 09:13
  • Disagree with the "by any measure" - who creates a password that is completely random? Almost no one, because it is difficult to remember. So, since we're already imposing some kind of pattern or order on the selection process, which of the options is more likely to re-introduce a level of randomness that would be difficult to guess or reason for someone trying to crack the password? Regardless of the number of characters, if I pick something like MyNameIsAndy, that might be easily guessed, where requiring that I include an odd character, somewhere, will make it more random - MyName#IsAndy – PoloHoleSet Oct 12 '16 at 19:16
  • @AndrewMattson Any decent password cracking tool would try likely passwords and permute through numbers and symbol additions/substitutions. Adding a symbol like that doesn't make an easily guessable password much stronger. – JimmyJames Oct 14 '16 at 13:58
  • @JimmyJames - and having a slightly longer string of letters only, upper and lowercase would not offer ANY protection against an "easily guessable" hack. I'm pointing out that the premise of the longer string being stronger is only applicable if the user is randomly generating a password combination, which pretty much never happens. There are a number of ways to use non-letter characters, so it can make it stronger, because you'd have to test multiple versions of the same guessable password. MyN@meIsAndy, MyName!sAndy, MyNameIs@ndy, MyN@me!s@ndy, etc. – PoloHoleSet Oct 14 '16 at 14:03
  • 2
    @AndrewMattson Yes but in the case the hashes are exposed, trying these combinations takes a very small amount of time. – JimmyJames Oct 14 '16 at 14:44
  • Downvoted because it is entirely incorrect that "by any measure" they are wrong. There are easily available rainbow tables of upper-lower out to >10 characters. These should be considered *non-passwords* in the first place. Just because in theory you might use all of the ascii character set doesn't mean hackers are dumb enough to assume you did. –  Oct 14 '16 at 17:17
  • @njzk2 Regarding international users bothering to use diacritics - most of them will not, even if they're used to doing so for regular typing. For example, all the Chinese people I know use the Latin alphabet plus numbers for their passwords; I haven't had as much exposure to users whose primary language uses a lot of accented Latin characters, but what I have suggests that they also will simply skip the accents in password fields. – Logan Pickup Oct 14 '16 at 17:55
  • 1
    @LoganPickup As someone who has a couple of diacritics in my native language, my experience tells me that barely anyone uses them, especially with the advent of mobile devices where typing them is often annoying. It also makes your account inaccessible from all computers without Polish keyboard installed, and then there is the factor of not knowing how the target site will handle those characters. I once lost access to my account because change password form treated them differently than the login form. – Maurycy Oct 16 '16 at 17:16
  • 1
    @WilliamKappler: rainbow tables are only really useful with unsalted hashes, which unfortunately still describes too many credentials databases. But pbkdf2, scrypt and bcrypt are safe from rainbow tables, as is a straight SHA1 hash with salt. The problem with straight salted SHA1 is that it's too fast, enabling the attacker to try many more guesses. More modern attacks use heuristics and dictionary guesses. Even stringing three unrelated dictionary words together into a long password dramatically increases the strength of the password. The length of the password is critical. – Craig Tullis Oct 23 '16 at 06:18
90

The theoretical perspective

Let's do the math here. There are 26 letters (52 if upper and lower case are counted separately), 10 digits and let's say about 10 special characters. To begin with, we assume that the password is completely random (and that a character in one group is not more likely to be used than a character in another group).

The number of possible passwords can then be written as C = s^n where s is the size of the alphabet, and n the number of characters. The entropy of the password is defined as:

log2(C) = log2(s^n) = log2(s)*n

Let's plug the numbers from the question into this:

     s    n   Entropy (bits)
A   52    6   34.2
B   52   10   57.0
C   72    7   43.2
D   26   10   47.0
E   26    5   23.5

So in this scenario, C is only the third-best option, after B and D.
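As a rough check, the table above can be recomputed with a short Python sketch. It assumes the same alphabet sizes as the table (52 for mixed-case letters, 26 for single-case letters, and 72 for letters, digits and about 10 special characters) and fully random passwords; the names `entropy_bits` and `options` are illustrative only.

```python
import math


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a uniformly random password: log2(s^n) = n * log2(s)."""
    return length * math.log2(alphabet_size)


# (alphabet size s, length n) for each option, as assumed in the table above.
options = {
    "A": (52, 6),
    "B": (52, 10),
    "C": (72, 7),
    "D": (26, 10),
    "E": (26, 5),
}

for label, (s, n) in options.items():
    print(f"{label}  s={s:<3} n={n:<3} {entropy_bits(s, n):.1f} bits")

# Options ordered from strongest to weakest under the randomness assumption.
ranking = sorted(options, key=lambda k: entropy_bits(*options[k]), reverse=True)
print(ranking)  # ['B', 'D', 'C', 'A', 'E']
```

Changing the assumed alphabet size for C (for example to 95 for all printable ASCII) raises its score but does not move it past B.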

The practical perspective

But this is all under the assumption of randomness. That is not a reasonable assumption for how people generate passwords. Humans just don't do it that way. So we would have to pick some other assumptions for how the passwords are generated, and what order the attacker tries them in her dictionary.

A not unreasonable guess would be that many dictionaries begin with words, and only later move on to making substitutions and adding special characters. In that case, a single special character in a short password would be better than a really long and common word. But on the other hand if the attacker knows that a special character is always used, she will try those passwords first. And on the third hand maybe the dictionary is centered around completely different principles (like occurrences in leaked databases).

I could go on speculating about this forever.

Why it is the question, not the answers, that is wrong

The problem is that there are many principles for how the password is generated to choose from, and I could arbitrarily pick one to make almost any answer the correct one. So the whole question is pointless, and only serves to obscure an important point that no password policy in the world can enforce: It is not what characters a password contains that makes it strong - it is how it is generated.

For instance, Password1! contains upper case, lower case, a number, and a special character. But it is not very random. ewdvjjbok, on the other hand, contains only lower case letters but is much better since it is randomly generated.
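The answer does not say how a password like ewdvjjbok was produced, only that it was random. One possible way, sketched here with Python's standard `secrets` module (the function name `random_password` and its defaults are assumptions made for illustration), is to draw each character independently from a cryptographically secure random source:

```python
import secrets
import string


def random_password(length: int = 9, alphabet: str = string.ascii_lowercase) -> str:
    """Return `length` characters chosen independently and uniformly from `alphabet`.

    `secrets.choice` uses the operating system's secure random source,
    unlike the `random` module, which is not meant for secrets.
    """
    return "".join(secrets.choice(alphabet) for _ in range(length))


# Nine random lower case letters, similar in shape to the example above.
print(random_password())

# A longer password over a larger alphabet costs nothing extra to generate.
print(random_password(16, string.ascii_letters + string.digits))
```

The point of the sketch is that the strength comes from the generation process: every character is equally likely at every position, which is exactly what a human-chosen password such as Password1! lacks.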

What they should have done

If you just stop relying on the very fallible and limited human memory, the character set and the length stop being limiting factors that you have to weigh against each other. You can have both in abundance.

One way to do this is to use a password manager. As Dan Lowe pointed out in comments, that might not be a workable option in a hospital. A second alternative is to use some kind of two-factor authentication (e.g. a hardware token or keycard) that makes the security of the first factor (the password) less important.

This is the responsibility of the system managers, and not the end users, to implement. They must provide the tools that allow the end users to perform their work in a practical and safe way. No amount of user education can change that.

Anders
  • 64,406
  • 24
  • 178
  • 215
  • 3
    Looking at an English-language keyboard near me (one that actually has engravings!) I found just over thirty available non-alphanumerics (using only Shift as modifier - not Compose/Super/etc); same on a medical device touchscreen. So you can probably increase the score for C a bit (it's still never going to win!). – Toby Speight Oct 07 '16 at 08:41
  • 3
    @TobySpeight Thanks for the input - did not expect there to be so many! I guess it is a question about whether it is "available special chars" or "special chars people on average actually use". And then we are back to the fact that the original NHS question is too vague to be answered. – Anders Oct 07 '16 at 08:52
  • 11
    Password manager is not realistic in a medical setting. Doctors and other staff are logging into shared terminals in office spaces, hallways, patient rooms, etc. Not just on their own computer or device. – Dan Lowe Oct 07 '16 at 14:19
  • Your "practical perspective" bit assumes the dictionary attack would exhaust all letter-only variations first. If the attacker knows a number of special characters and numbers is required, then they would likely be trying variations of words as they go. Even if not, users are still likely to use words and replace letters with visually similar characters (like E to 3 or T to 7). This would add a limited set of numbers or special characters against allowing for longer words or phrases, which I'm not convinced would push it into "more secure" territory. – David Starkey Oct 07 '16 at 14:48
  • 1
    @DavidStarkey The point you make in the comment is sort of the same I make in the answer. I do not say you can take for granted that the attacker will use letters only first - in fact, I suggest the opposite. – Anders Oct 07 '16 at 14:50
  • 1
    Also password manager may be illegal (I don't know) as I don't think LastPass was created with whatever HIPA storage requirements there are in UK. – Maciej Piechotka Oct 07 '16 at 16:50
  • @MaciejPiechotka Good point. I will rewrite the last part of the answer to be more nuanced later, but no time right now. Thanks for the input. – Anders Oct 07 '16 at 17:16
  • 17
    Excellent answer, but I subtly disagree with the conclusion in your last section ("What they should have asked"). Not because "Generate your passwords randomly with a password manager" is bad advice, but rather because it's good advice for **end users**, and less so for **institutions** like NHS, who, if they're really serious about password security, ought to be taking a hard look at two-factor authentication instead. – Luis Casillas Oct 07 '16 at 18:12
  • @DanLowe That would be an argument for 2 factor auth. But then we're talking about mitigating the risk of a weak password by using an additional mechanism instead of just "password strength," which again makes the question bad. +1 to this excellent answer. – jpmc26 Oct 07 '16 at 22:27
  • 2
    @MaciejPiechotka - as far as I can tell, the UK does not have any specific legislation similar to HIPA. That would mean that there is only a general duty under the Data Protection Act that controllers of confidential information should follow best practices in order to ensure that the information is not incorrectly disclosed, but which does not have any specific legislative requirements on actual technologies used. – Periata Breatta Oct 08 '16 at 02:11
  • @TobySpeight Are you sure all keyboards are equal? That all kind of desktop/gamer/notepad/OSD/&c keyboards have the very same specials? – Koshinae Oct 08 '16 at 09:33
  • Edited in a higher amount of special characters, of course this only strengthens the conclusion. -- For those interested, you need more than 282 unique characters in a random password of length 7 to beat a random password of length 10 which is made up of 52 unique characters. – Dennis Jaheruddin Oct 08 '16 at 13:41
  • Sorry: I anticipated wrongly what you wanted to write :). [return] I don't like password *managers* because what you trust in the password model is the memory of the physical owner of the password (authentication based on what **you** know). By introducing a software within this trust path, you add many risks. [return] I made the error to think that most security specialists shared this risk analysis. – dan Oct 09 '16 at 17:01
  • Passwords are far more secure than password managers or they require passwords themselves. A doctor may use more than 1 computer and giving unauthorized personnel who find a stick or such full access is really really bad. – HopefullyHelpful Oct 12 '16 at 17:18
  • I have to say, it's delightful the number of systems that I've logged into where `*light|` is not a secure password, but `Password1!` is. – Wayne Werner Oct 16 '16 at 02:21
20

I realize there are already a number of good answers, but I want to clarify one point.

The question is unanswerable as it does not specify a character set, nor the password selection method.

First off, to address the second point, we shall pretend the passwords are generated truly randomly within the permitted domain, otherwise we cannot even start reasoning on the matter.

For our other point, to give extreme examples, let us say b implies letters only in the English alphabet, so let's say 52 possible symbols. This gives about 5.7 bits of entropy per character and thus about 57 bits of entropy overall.

On the other hand let us say (perhaps somewhat unreasonably) that answer c implies any completely random Unicode code point which is considered to be a character (as opposed to a BOM etc). There are currently roughly 109,000 of these as of Unicode 6. This means about 16.7 bits of entropy per character and a total of 117 bits of entropy.

On the other hand if the answer c was limited to only ASCII or perhaps ISO 8859-15 or some subset of these, the opposite conclusion could easily be drawn.
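
The two readings can be checked with a few lines of Python. This is only a sketch under the same assumption of uniformly random passwords; the function name `entropy_bits` is made up for illustration, 109,000 is the rough code point count quoted above, and taking "only ASCII" to mean the 94 printable non-space characters is my own assumption.

```python
import math

def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a password drawn uniformly at random."""
    return length * math.log2(alphabet_size)

# Option b: 10 characters from the 52 English letters (upper and lower case).
bits_b = entropy_bits(52, 10)

# Option c, extreme reading: 7 characters from roughly 109,000 Unicode code points.
bits_c_unicode = entropy_bits(109_000, 7)

# Option c, narrow reading (assumption): 7 printable non-space ASCII characters.
bits_c_ascii = entropy_bits(94, 7)

print(f"b:           {bits_b:.1f} bits")
print(f"c (Unicode): {bits_c_unicode:.1f} bits")
print(f"c (ASCII):   {bits_c_ascii:.1f} bits")
```

Depending on which reading of "special characters" you pick, c lands either well above or well below b.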

This is of course completely unreasonable but highlights the brokenness of the question and how one can reasonably justify either answer. To be a sensible test question it would have to be worded much more rigorously which would make it much harder for users with limited technical or mathematical knowledge to work out.

In the end I would suggest that this test is probably fairly pointless as an organisation would ideally not require users to memorize password requirements but would instead enforce them technologically (the only requirement I can think that learning by heart is useful is not reusing the same password in multiple places).

Vality
  • 399
  • 2
  • 7
  • 3
    From a practical standpoint the keyspace has to be restricted to what the user can reasonably type. Since this is the NHS the only language they can be sure of having available is English. Foreign speakers might have other things configured on their systems but that doesn't mean every computer they may need to use will thus be configured. – Loren Pechtel Oct 07 '16 at 05:53
  • @LorenPechtel I understand that, and as I tried to imply I was not saying that the example is truly realistic, but merely that ambiguities like that are enough to make the question impossible to answer with any certainty. – Vality Oct 07 '16 at 05:55
  • 3
    I object to this "unanswerable" standpoint. Yes there are certain unknowns, meaning it's not possible to calculate absolute definitely correct numbers about possibilities. This does *not* mean it's unanswerable, as all we're looking for here are generalities. We know rough sizes of alphabets people tend to use. And also, is it not true that the number of possibilities added by extending length from 7 to 10 far outweighs a difference of, say, 40% or so in your alphabet size? Regardless of unknowns, there are definitely better and worse rules of thumb. – Robin Winslow Oct 07 '16 at 06:50
  • I agree with everything but the last paragraph. I think it is useful to teach people about the *reasoning* behind password complexity requirements in general (though not, I'd concur, the details of any one particular set of rules), because if they understand that reasoning many of them (in theory, hopefully) might be more inclined & able to choose stronger passwords on their own volition. Versus just doing the bare minimum to satisfy the requirements of whatever technical enforcement mechanism is in place. – mostlyinformed Oct 16 '16 at 20:46
13

Is the NHS wrong about which passwords are most secure in the ideal case? Yes, absolutely -- and the other answers have covered that ground pretty thoroughly.

Is the NHS wrong about which passwords are most secure in an NHS environment? Maybe not.

How could a long password be worse tha--?

There are legacy systems that artificially limit the length of a password -- for instance, the old Windows LANMAN/NTLMv1 password hash limits the length to 14 symbols, and the old DES-based UNIX password hash limits it to 8. Worse, the password entry on such a system will often let you enter a password as long as you like, and ignore everything after the first n symbols.

In fact, it seems likely that NTLMv1 is the particular legacy scheme they're running. As @MarchHo points out, NTLMv1 splits your password into two halves of up to 7 characters each, and each half can be cracked separately. So if you're using NTLM with a 10-character alphanumeric password, what you really have is a 7-character alphanumeric password and a 3-character alphanumeric password. The former is clearly worse than 7 characters from the full symbol set, and the latter can be broken in milliseconds on a 10-year-old PC.
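
A rough sketch of why the split matters, assuming 62 alphanumeric symbols (upper case, lower case, digits) and that each chunk of up to 7 characters can be attacked independently, as described above. The function names are hypothetical and the numbers are candidate counts, not timings for any real cracking tool.

```python
ALPHANUMERIC = 62  # assumption: 26 lower + 26 upper + 10 digits

def guesses_whole(alphabet: int, length: int) -> int:
    """Worst-case guesses when the password must be cracked as one unit."""
    return alphabet ** length

def guesses_split(alphabet: int, length: int, chunk: int = 7) -> int:
    """Worst-case guesses when each chunk is cracked separately, so costs add up."""
    total = 0
    remaining = length
    while remaining > 0:
        part = min(chunk, remaining)
        total += alphabet ** part
        remaining -= part
    return total

whole = guesses_whole(ALPHANUMERIC, 10)
split = guesses_split(ALPHANUMERIC, 10)
print(f"as one unit:    {whole:.3e} candidates")
print(f"as 7 + 3 split: {split:.3e} candidates")
```

The split case is dominated by the 7-character chunk; the 3-character tail adds almost nothing.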

Why would something so old still be in common use?

Basically, because it works and it would be expensive to upgrade.

Now, this is me speculating, but: I propose that healthcare environments in particular are likely to be running legacy systems, because of the sensitive nature of healthcare. New systems are likely to need very thorough scrutiny before being accepted as a solution, which means healthcare systems upgrades tend to happen slowly and at great expense.

So if you know there are systems in common use that behave this way, and you can't fix them, then the best you can do is to tell your users to choose a length-n password using the largest possible symbol pool.

In general: are you sure your passwords aren't truncated?

Unfortunately, this has implications for the general case too, especially for us who like our passwords long. How sure are we that we can't log into our account on https://example.com with just the first word or two of our passphrase? As bad as using the well-known "correcthorsebatterystaple" is, accidentally using "correct" would be even worse. To be secure in your passwords it's not enough to make sure you generate enough entropy. You also have to be sure that the system on the other end isn't throwing most of it away.

Jander
  • 981
  • 8
  • 12
  • 1
    NTLM also splits the 14-character password into two 7-character hashes, so the effective password length is 7. – March Ho Oct 08 '16 at 13:00
  • @MarchHo: I think you might have just solved the puzzle! Updated my answer. – Jander Oct 08 '16 at 19:22
  • The NTLM hashing process does **not** split the password into two 7-character segments before hashing. You're thinking of the LM hashing process, which does indeed do this. LM also converts all alphabetic characters to uppercase before hashing. – PwdRsch Oct 09 '16 at 19:38
  • NTLM version 1 does use the process I'm talking about, which it inherited from LANMAN. See e.g. [Wikipedia](https://en.wikipedia.org/wiki/NT_LAN_Manager) and [MSDN](https://msdn.microsoft.com/en-us/library/cc236699.aspx). NTLMv2 is the one that fixes this issue by switching to an MD4-based schema. Good point on the conversion to uppercase. – Jander Oct 10 '16 at 17:05
  • @Jander I think we're mixing discussions of the authentication protocols and the hashing methods. – PwdRsch Oct 10 '16 at 18:01
  • That's very interesting. But one crappy Microsoft security protocol that most people aren't using is not enough to justify the 7 characters as a general rule. – Robin Winslow Oct 10 '16 at 20:02
  • Edited again. I think I've made the answer relate better to the core question. – Jander Oct 11 '16 at 02:00
  • 1
    It's certainly true that healthcare environments can be *very* slow to upgrade. In 2014 I was still seeing IE6 in web server logs for a healthcare-related web app. – Peter Taylor Oct 14 '16 at 10:39
  • Isolated sightings of IE6, rare though it may be, are hardly surprising in any context. – Robin Winslow Oct 17 '16 at 08:25
8

There are some problems with that question. One of them is that it doesn't state how the passwords are chosen but I think the most logical approach is to assume the passwords are chosen randomly but satisfying the respective conditions so I'll use that convention for my answer. Note that Randall's comic clearly doesn't share this assumption but the question didn't specify which way a password is chosen so I reckon we can go for the best which is possible and that's choosing a password randomly. Furthermore, the test probably isn't based on Randall's comic.

The key space of option b is quite easy to calculate if we assume the English alphabet is used. Yeah, more assumptions, I know. But since the test appears to be in English and not very tricky, I think we can make that assumption.

There are 26 lower-case letters in the English alphabet and just as many upper-case letters, making 52 in total. So there are 52^10 ≈ 1.45*10^17 elements in the key space of option b.

Option c is way less specific than option b. However, since we assumed that the English alphabet is used – which is in favor of option c – we may also assume that only ASCII is used for the special characters – which is in favor of option b. Really, if we assumed more special characters than ASCII has, we would also have to assume more letters than are in ASCII, since ä arguably is a letter in German. That makes the key space of option b even bigger compared to the one of option c.*

The best we can do for option c if we restrict ourselves to the ascii alphabet is to use every printable character (excluding the blank) in our alphabet (note: different, more general use of the word "alphabet"). That's 94 characters, giving option c a key space of 94^7 ≈ 6.48*10^13 elements.

Since one of our assumptions to tackle the question is that the password is chosen randomly with the respective restrictions, and that rule is equivalent to choosing a password randomly from the respective key space, a password chosen using option b is arguably harder to guess since there are several orders of magnitude more options to try when cracking the password.

In fact, if we assume the costs of cracking a password via brute force to be approximately linear to the size of the key space, cracking a password chosen via option b is 52^10/(94^7) ≈ 2'229 times as hard as cracking one chosen via option c, clearly showing that the allegedly correct answer to this question is wrong.
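
For anyone who wants to reproduce the numbers, a minimal Python sketch using the same assumptions as above (52 letters for b, 94 printable non-blank ASCII characters for c, uniformly random choice):

```python
# Key space sizes under the assumptions stated in the answer.
keyspace_b = 52 ** 10  # 10 letters, upper and lower case
keyspace_c = 94 ** 7   # 7 printable non-blank ASCII characters

# How many times larger b's key space is than c's.
ratio = keyspace_b / keyspace_c

print(f"option b: {keyspace_b:.3e}")
print(f"option c: {keyspace_c:.3e}")
print(f"b / c:    {ratio:.0f}")
```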


 * This is quite easy to prove mathematically but this StackExchange lacks LaTeX support and you probably will understand it better through a textual description anyways.

The only advantage option c has over option b is its bigger alphabet (again, more general use of the word "alphabet"). Option b, however, more than makes up for this by having a longer password. If we add more and more characters (like ü, à, Ø, Æ, etc.) to both alphabets, we're making the alphabets more equal in size, causing the advantage of c over b to diminish, whereas the advantage of b over c is unaffected.
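
A quick numerical check of this footnote, as a sketch: add the same number of extra characters to both alphabets and watch the ratio of key spaces. The function name `advantage_of_b` and the sample values of `extra` are arbitrary choices for illustration.

```python
def advantage_of_b(extra: int) -> float:
    """Key space of b divided by key space of c after adding
    `extra` characters to both alphabets (52 and 94 to start with)."""
    return (52 + extra) ** 10 / (94 + extra) ** 7

for extra in (0, 10, 50, 200):
    print(f"{extra:>3} extra characters: b is {advantage_of_b(extra):.3g} times larger")
```

The ratio keeps growing as characters are added, which is the point of the footnote.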

UTF-8
  • 2,300
  • 1
  • 9
  • 24
  • 6
    In order for "c" to be the correct answer, you need at least 221 special characters to choose from, in addition to the alphanumerics. Good luck finding a keyboard that will let you type your password! – Mark Oct 06 '16 at 21:36
  • 4
    These are training questionnaires for doctors, not a Comp Sec exam. While you can mathematically prove entropy in favor of another answer, their goal is to get people to think 'Pa$$w0rd' instead of 'password' – Shane Andrie Oct 06 '16 at 21:44
  • 1
    @Mark Yeah. And you better hope that keyboard doesn't offer more letters. Oh, wait. Basically any keyboard offers characters like `à` already, further increasing that number dramatically. – UTF-8 Oct 06 '16 at 21:46
  • @ShaneAndrie It said "letters", not "words". – UTF-8 Oct 06 '16 at 21:46
  • @UTF-8, I've got three keyboards sitting on my desk, and not one of them offers "à" as a typeable character. I know how to type it on the Linux system (`compose`-`backtick`-`a`), and I can make a good guess at the Mac (`option+backtick` as a dead key to create the accent mark, then `a` to combine with it), but there's no way I'll remember the `alt+keypad` code to type it on Windows. – Mark Oct 06 '16 at 21:50
  • You can't do the backtick thing on Windows? – UTF-8 Oct 06 '16 at 21:53
  • @UTF-8 I get that, but being grossly specific works for the maths of this; however, OP's friend got it wrong because the goal of the test wasn't to be pedantic, but to push a policy requirement. That policy wasn't around the theory of entropy password strength, it was to get people to not choose '123456' and think that's good. – Shane Andrie Oct 06 '16 at 22:00
  • 3
    @ShaneAndrie The question was "Which of the following would make the most secure password?", not "Which password policy will make the users choose more secure passwords on average?". Furthermore, you can easily accidentally create passwords in accordance with option `c` which aren't part of the most popular million passwords. Users are far less prone to accidentally creating a password which isn't part of the most popular million passwords if they only have to comply with option `b`. It should always be checked whether a password is one of the million most popular ones by the service. – UTF-8 Oct 06 '16 at 22:04
  • @UTF-8 And I agree. And if we are going to read it specifically, we can't answer that question, because we don't know if A) the Character Set is limited in scope to the person creating the password, or to the system, and B) People creating password. I'm just arguing that HR isn't thinking that way, and neither did the person who wrote the question. – Shane Andrie Oct 06 '16 at 22:15
  • @UTF-8 I can't make that character at all on my U.S. keyboard except for the Windows Alt+... for CP439 characters and Linux Ctrl+Shift+U+... for Unicode codepoints. – LegionMammal978 Oct 06 '16 at 22:47
  • 11
    @ShaneAndrie Their goal is wrong, then. They shouldn't be getting people to think "Pa$$w0rd" instead of "password"; they should be getting people to think "this is a passphrase and you can't guess it" instead of "password", which would work towards their _real_ goal much more effectively. – mtraceur Oct 07 '16 at 01:12
  • Do you actually think when people attempt to crack a password, they run every character in ASCII? That's really what this is concerned about. Anything other than <4 numbers or an easily guessed word is already more secure than your bank account in terms of *user input passwords*. In terms of cracking, diverging from alpha-num is a huge security benefit. –  Oct 14 '16 at 17:31
7

I love entropy questions:

The Short Answer:

Yes, You are "technically" correct about having more entropy (best kind of correct).

The Long Answer

Entropy is factored largely by two things. Number of symbols a password can use, and length. In the NHS's scenario, it would be logical that "special characters" are available symbols to use in the 10 Character answer and therefore, the longer a password is, the higher the entropy, and theoretically more secure.

HOWEVER, we have to deal with people, and we are lazy. The question is trying to get people to include special characters in their password because it forces entropy to happen.

Without it, Randall's comic is mathematically correct, while being cheeky, but any SysAdmin that thinks correcthorsebatterystapler is a good password because it is long needs to be slapped in the face, because that's been in my rainbow tables for a while.

To be fair, I think taking four dictionary words and stringing them together is a good concept (which is what we call a passphrase); however, people, as I said, are lazy and will likely fall for common patterns.

Shane Andrie
  • 3,780
  • 1
  • 13
  • 16
  • 4
    Thanks for this. Is there any evidence that people being forced to use special characters actually leads to more unpredictable passwords? Because I would posit that someone who's forced to come up with a 10 character password, with no other rules, might well produce one that is harder to guess than someone who's forced to come up with a 7 character one which includes special characters. This is the real question here. – Robin Winslow Oct 06 '16 at 20:58
  • 1
    And for the record, in my team we use "correct horse battery staple" in a couple of places ;). Places where we don't think there needs to be a password in the first place... – Robin Winslow Oct 06 '16 at 21:00
  • 2
    Quite the opposite, allowing people to select passwords is horrible. We tend to use mnemonics, and not a very large list of patterns. This makes us easy to profile and therefore easy to guess. Password Physiology is a thing, and it is one of the reasons certificates are heavily pushed. – Shane Andrie Oct 06 '16 at 22:07
  • 2
    @RobinWinslow I would rather think that 10 letter English words would make up a large proportion of chosen passwords as users strive to make them memorable rather than unpredictable – JamesRyan Oct 07 '16 at 10:15
6

Both the quoted test and your counterarguments are wrong, fundamentally because entropy is a measure of the randomness of a password, not of its length or its alphabet size. The XKCD comic scheme that you cite is secure to the claimed 44 bit security level if and only if the 44 little gray boxes below "correct horse battery staple" represent the outcomes of coin flips (or similar uniform, independent random events) that were used to select the passwords. If a human picked the words all bets are off.

Since neither the NHS nor you talk about this critical factor, it's impossible to say anything concrete about the security of the passwords, other than if they're not chosen uniformly at random they're likely to be weak.
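
To make the "coin flips" condition concrete, here is a minimal sketch of picking the words with a cryptographically secure random source. The word list is a made-up stand-in (2048 placeholder entries, matching 11 bits per word); a real list of 2048 distinct words would be used in practice, and the function names are my own.

```python
import math
import secrets
from typing import Sequence

# Hypothetical stand-in for a real list of 2048 distinct words.
WORDS = [f"word{i:04d}" for i in range(2048)]

def random_passphrase(words: Sequence[str], count: int = 4) -> str:
    """Pick `count` words independently and uniformly at random."""
    return " ".join(secrets.choice(words) for _ in range(count))

def passphrase_entropy_bits(list_size: int, count: int) -> float:
    """Entropy in bits, valid only if every word is a uniform random pick."""
    return count * math.log2(list_size)

print(random_passphrase(WORDS))
print(passphrase_entropy_bits(len(WORDS), 4), "bits")
```

The 44-bit figure holds for the output of `random_passphrase`, not for four words a person thought up.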

It is my understanding that as a rule, extending password length adds more entropy than expanding the alphabet.

If d is the alphabet size and n is the password length, then a password chosen uniformly at random has log2(d) * n bits of entropy. Doubling the size of the alphabet therefore adds n bits of entropy; adding an extra symbol to the password adds log2(d) bits. So it all comes down to the concrete values of d and n; there really is very little point in having a rule of thumb like you're proposing there since we can just calculate the increases straightforwardly.
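
As a sketch of "just calculate it", the formula above in Python, with `d = 52` and `n = 10` taken from option b as an example (the function and variable names are my own):

```python
import math

def entropy_bits(d: int, n: int) -> float:
    """Entropy of a uniformly random password: log2(d) * n bits."""
    return math.log2(d) * n

d, n = 52, 10

# Doubling the alphabet adds n bits in total (one bit per symbol).
gain_from_doubling_alphabet = entropy_bits(2 * d, n) - entropy_bits(d, n)

# Adding one more symbol adds log2(d) bits.
gain_from_extra_symbol = entropy_bits(d, n + 1) - entropy_bits(d, n)

print(f"double the alphabet: +{gain_from_doubling_alphabet:.2f} bits")
print(f"one extra symbol:    +{gain_from_extra_symbol:.2f} bits")
```

Which gain is larger depends on whether `n` or `log2(d)` is bigger for the concrete policy at hand.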

Luis Casillas
  • 10,181
  • 2
  • 27
  • 42
  • 3
    This seems unnecessarily pedantic. I am aware that entropy is about randomness, but I'm asking which is a better policy. Are you really saying there is no point having any password policy at all apart from "use a random generator"? – Robin Winslow Oct 06 '16 at 20:44
  • Also, if any equation includes `x` *to the power of* `y`, it is pretty clear that increases in `y` will have a greater influence over the magnitude. – Robin Winslow Oct 06 '16 at 20:46
  • 2
    @RobinWinslow: My point is that unless we introduce randomness into the equation we don't really have any reason to believe either of these policies will actually be secure. Debating between those alternatives when humans choose the passwords is missing the forest for the trees. – Luis Casillas Oct 06 '16 at 20:53
  • 1
    @RobinWinslow: if we take 2^2 as the starting point, it is trivial to observe that 3^2 > 2^3. And my larger point is that there is no need to generalize when you have concrete examples that you can just calculate from. – Luis Casillas Oct 06 '16 at 20:55
  • I knew you'd come back at me with an extremely cherry-picked case where power made less difference. Thanks for making your pedantic point. It's not useful, and doesn't answer the question. – Robin Winslow Oct 06 '16 at 21:06
  • 2
    There is nothing pedantic with this answer, and it does answer your question. If there is something else you want to know, I think you need to be more clear about it in your question. – Anders Oct 06 '16 at 21:08
  • 1
    @RobinWinslow: But the NHS examples that you quoted show an apparent *linear* growth of password lengths (5, 6, 7, 10) vs. an *exponential* increase of alphabet size (26, 52, 95). So the rule of thumb that `x^y` grows faster on `y` than on `x` risks leading you astray on this one example. And again, just doing the math is simple enough that we don't need to resort to rules of thumb. – Luis Casillas Oct 06 '16 at 22:38
  • This is just so obviously wrong! To take even the worst case of your provided examples - `95^7 = 6.983373e+13`, whereas `26^10 = 1.411671e+14`. Even with a *vastly* larger character set (which is unlikely to actually be the case in practical terms), the greater length wins. – Robin Winslow Oct 07 '16 at 07:02
  • @RobinWinslow A six letter mixed case password is "better" than an seven letter lower case password. Just an example. – Anders Oct 07 '16 at 08:25
  • @RobinWinslow It is pedantic, but really, it should be. The point the comic you embedded was making isn't just that length is king (which it usually is), but also that intuition can be misleading. Passwords that look good might not be and passwords that look easy to guess may in fact be quite secure. All rules of thumb break down at some point. (In fact, in the xkcd, "correct horse battery staple" is actually a length-4 password from a very large vocabulary.) The only foolproof way to determine password quality (or more precisely, password generation method quality) is to *do the math*. – Ray Oct 10 '16 at 16:19
  • @Ray: Your observation that "correct horse battery staple" is a length-4 password from a large symbol set is spot on. But it undermines your earlier statement that the comic's point is (partially) that "length is king" (or usually so). – Luis Casillas Oct 10 '16 at 18:30
  • @LuisCasillas Hence the "or usually so" bit. A 2048 element vocabulary is (barely) enough to make the vocabulary size relevant. But merely going from a vocabulary of size 52 (in option B) to one of ~80 (in option C) isn't even close to the amount needed before the length ceases to be the dominant factor. My key point is "Always do the math when possible." In support of that, I pointed out the fact that even the (overall quite good) xkcd advice has non-intuitive aspects. – Ray Oct 10 '16 at 18:42
2

Here's the thing, like it or not this question is not about laboratory, or mathematically more secure passwords. It's about getting people to "think" about their passwords when choosing them.

a. Is incorrect because it only has letters.
b. is wrong because it only has letters
c. is correct because it is long enough and includes "special characters"
d. is wrong because it has only letters.

Or in other words, passwords using only letters are bad.

Now, it's true that you can create a more secure password by using only letters if it's long enough, or random enough. Surely "asefhesesnh" is better than "p4ssw0rd!", but to be honest that is an understanding beyond most people in the target audience of this test.

Instead it's "better" to get users to understand to pick a password that is "longer" and has letters, numbers, and special characters.

In other words C is correct when your talking about a wide range of users with different levels of technical skills, creating their own passwords. Sure the math might be off, but it doesn't matter. No provider, is going to sit there and figure out password entropy, but they can count the number of $ in a password.

coteyr
  • 1,506
  • 8
  • 12
  • "Through 20 years of effort, we've successfully trained everyone to use passwords that are hard for humans to remember, but easy for computers to guess" – Robin Winslow Oct 10 '16 at 19:15
  • "C is correct when your (Sic.) talking about a wide range of users with different levels of technical skills, creating their own passwords" - most experts here seem to disagree. Do you have any evidence? – Robin Winslow Oct 10 '16 at 19:17
  • @RobinWinslow I've yet to see one expert disagree. The argument proposed is entropy, which is not something a normal user is even aware of or can even define. You can't make a password rule like "Your password must be at least 40 bits of Entropy" and expect most end users to "get it". You can set some basic guidelines though, and that's what this test is about: complying with guidelines. – coteyr Oct 10 '16 at 19:21
  • 1
    practically everyone else has said C is a bad answer (as well as saying the question in general is bad, and randomness is the best solution etc.). I believe that even when dealing with real people in the real world, if you tell them to come up with a 10 character password with only letters they will produce one that is harder for a machine to guess than if you tell them to come up with a 7 character one including special characters. "lazypsycho" is better than "Daphne!". – Robin Winslow Oct 10 '16 at 19:24
  • As I said, if you have any evidence to contradict (or, indeed, confirm) my belief here, I'm extremely interested to see it. – Robin Winslow Oct 10 '16 at 19:26
  • My point is that it's not about a single strong password. It's about getting a group of dis-interested people to choose better passwords. How would you explain to a group of 20 non-technical people how to pick a password? Now how would you do it in less than 15 seconds? – coteyr Oct 10 '16 at 19:31
  • There is no misunderstanding here - we have the exact same goal, we just disagree. My point is that each one of your group of 20 non-technical people would come up with a stronger 10-character letters-only password than a 7-character password including special characters. Mostly because they'd all put exactly 1 special character, at the end, which is probably an exclamation point, a question mark or a full stop. As I said, if you have evidence to refute this assumption, I'm all ears. – Robin Winslow Oct 10 '16 at 19:35
  • lazypsycho! is better than lazypsycho (notice the 3 wrong answers don't include any punctuation) even if not by very much. – coteyr Oct 10 '16 at 19:37
  • https://www.betterbuys.com/estimating-password-cracking-times/ try both options. I think their estimator is way wrong, because of the fact that it ignores dictionaries and patterns but... Again, disinterested people, – coteyr Oct 10 '16 at 19:39
  • That's a good resource. Thanks. And sure enough, "Daphne!" = 1 month 3 weeks, whereas "lazypsycho" = 4 months, 3 weeks. Obviously "lazypsycho!" is better than "lazypsycho", that's a no-brainer. – Robin Winslow Oct 10 '16 at 19:42
  • 5
    In the real world, requirement "c" will produce a password consisting of, in order, an uppercase letter, four lowercase letters, a digit, and a punctuation symbol. Further, the digit will usually be a "1" and the punctuation will be a period, exclamation point, or question mark. The overall result will be a complexity on the order of 26^5, or about twelve million possible passwords. – Mark Oct 10 '16 at 21:05
  • 1
    @Mark Clearly the solution to this problem is that we must make a rule that the number must not be a 1, and must not be at the beginning or the end of the password.That will solve the problem, you betcha. – barbecue Oct 10 '16 at 22:06
0

Option b gives you 52 possibilities per character.

For c to be better, each of the 7 characters must have more than 52^(10/7) ≈ 282.8 possibilities, i.e. at least 283.

This means ASCII or western ANSI character sets won't suffice. They'd have to allow the Unicode character set (or some very arcane Asian ANSI codepages) in order for option c to be better.

It's obviously an ill-phrased question. There are 62 numbers and letters (upper + lower case) so the correct answer would be:

c if 'special characters' means I can use Unicode characters or any other character set that contains at least 221 non-alphanumeric (i.e. 'special') characters, otherwise b.
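
The threshold can be reproduced with a short Python sketch (variable names are my own; the 62 is the count of digits plus upper and lower case letters used above):

```python
import math

# Per-character possibilities that option c must exceed to beat 52^10.
threshold = 52 ** (10 / 7)

# Smallest whole alphabet size strictly greater than the threshold.
min_alphabet_c = math.floor(threshold) + 1

# Subtract the 62 alphanumerics to get the number of special characters needed.
min_special = min_alphabet_c - 62

print(f"threshold:        {threshold:.2f}")
print(f"alphabet needed:  {min_alphabet_c}")
print(f"specials needed:  {min_special}")
```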

RocketNuts
  • 223
  • 1
  • 6
  • Except that anyone trying to crack a password will run it through a fairly extended alpha-num before bothering with special characters. –  Oct 14 '16 at 17:34