The actual "dictionary creation"
i.e. figuring out if your base dictionaries plus common rules results in the password they want
First and foremost, realize that when we're talking about password cracking, "dictionary" means a list of base password candidates, which can be used both by themselves, or - much more fruitfully - as part of a rules based attack.
The Hashcat forum has a list of wordlist locations. To prevent link rot, a few of them are:
http://www.skullsecurity.org/wiki/index.php/Passwords
Start here
the phpbb list is a really excellent, very small list of 184,389 words
the rockyou list is a good next step, at 14,344,391 words
- there are also filtered rockyou variants in various places with different sizes and other criteria; look around if you want.
http://hashcrack.blogspot.de/p/wordlist-downloads_29.html
- includes links to the larger Purehate and Xploitz lists plus many sites
http://packetstormsecurity.org/Crackers/wordlists/
- lots of the old standby lists; toss them all together if you like, these have been around a long, long time.
http://www.md5this.com/tools/wordlists.html
- wordlists in several languages, plus a huge english wordlist in pieces.
For some guidance on what a rules back dictionary attack can do, take a look at Hashcat for the types of rules it supports, and note that when you have the plaintext user password, it's easy and very, very fast to apply many rules in "reverse"; i.e. given an actual password, could it have easily come from a base cracking dictionary word plus some number of rules?
For instance:
- You can handle all the uppercase/lowercase rules with a simple UPPER() equivalent and an all-uppercase dictionary - if you find it, it's weak. (JacQueLinE)
- Appending/prepending numbers purely to meet length minimums is a simple pattern match - if the last/first N characters are numbers, and the remaining length isn't enough, it's weak. (Riddick123)
- Remove N numbers from the beginning/end, uppercase it, and check the dictionary for the remainder (JacQueLine12)
- The above, but N-1 numbers and/or symbols (#1JacQueLine)
- The above, but date formats. (JacQueLine02121995)
- If the last/first N-1 characters are numbers and the last/first is a symbol, and the remaining length isn't enough, it's weak. (!JacQueLine1)
- Take out one character at a time, see if it matches the dictionary. (jacqu$eline)
- Combine some of these.
- Reverse all of these.
Do a pattern-match for very common dictionary words as subsets, pull that substring out, then search again; figure out your limit for keyspace on combinatorial attacks
correcthorsebatterystaple
- correct: 1813th most common English word, row 16828 on phpbb, row
9871 on Ubuntu american english small.
- horse: 1291st most common English word, row 14820 on phpbb (horses is
at row 1723!), row 21607 on Ubuntu american english small.
- battery: 3226th most common English word, row 7775 on phpbb, row 3644
on Ubuntu american english small.
staple: 6 characters, all lower case, not in the top 5000 most common
words. row 40524 on phpbb (staples is at row 3852!), row 42634 on
Ubuntu american english small.
Therefore, correcthorsebattery would get all three words pulled out by pulling the top 4000 English words out, and thus fits into a keyspace of 4000^3. correcthorsebatterystaple passes that test, or even top 4000^4 (solely because of staple).
Many crackers start with brute force for tiny passwords, then small wordlists and large rulesets, then large wordlists - the largest I'm aware of is over 30GB, and includes almost every password found to have been cracked by anyone on a given popular forum, plus many, many other large wordlists.
Find yourself a happy medium - fast enough to be performant, large enough and with enough "rules" to cut out the first few fast passes of cracking software - if you are using PBKDF2 with enough (hundreds of thousands) of iterations, or BCrypt or SCrypt with a high enough work factor, then only small dictionaries + large rulesets and large dictionaries + small rulesets will be practical attacks for a few years.
Reducing the user revolt
You are going to anger the userbase to some degree, just the same as you did when you said they need to meet your complexity rules.
When you say "password", "Password", "P@$$w0rd", "P@$$w0rd1", "P@$$w0rd123", and even "P@$$w0rd123!" are bad passwords, you're going to annoy them. When you say "Jennifer2007" is a bad password, they're going to be frustrated (and perhaps Jennifer will be upset, too!). Manage their frustration as best you can, and simply accept some. Personally, I would recommend actually being explicit - tell them their password is a word in known cracking dictionaries plus two numbers, which is a normal cracking rule!
Note that one major purpose of adding this is to educate users on what a weak password is, so they have some understanding to mitigate their frustration.
As part of educating, perhaps show them some alternatives you generate that pass your own tests, if you flunk their password, so definitely let the users know which particular rule they hit, whether it was "password is a word in password guessing wordlists used by attackers" or "password is a word in password guessing wordlists used by attackers plus a toggle case rule plus a number on the end".
1) Fully random passwords
2) Fully random passwords translated into bubblebabble or another pronounceable subset
3) correcthorsebatterystaple type passwords, but with longer and uncommon words. For instance, take the Ubuntu american english insane dictionary, subtract out all the words in the american english small dictionary, and select N words of at least 7 characters in length. This leaves you without any really short words, and without the most common words.
4) a mix of 1, 2, and/or 3.
Then your users can, if they choose, simply pick something you showed them (over HTTPS with the best cipher suites you can get away with, of course).
Personally, I would also strongly suggest raising your length limit; about 14 is what I would recommend, but for most userbases that's just too long. Try a minimum of 12 or even 10, enough so a fully random password might have a slight amount of value at the minimum length and character set.
Much of this was taken from my answer to Should I reject obviously poor passwords?, though that answer has signficantly more math and explanations of why involved.