There are three primary types of attack that can be mounted against password hashes: brute-force attacks, dictionary attacks, and pre-computation attacks.
Brute-force attacks
A brute-force attack involves selecting a range of characters (e.g. lowercase letters and numbers) and computing the hash for every possible string of those characters, for a range of password lengths. Each hash is compared against your target hash, and if it matches, the password has been found. For example, we might choose a-z A-Z 0-9 as our alphabet for passwords between 5 and 8 characters. Defending against such attacks relies on the computational cost of each hash operation, the size of the alphabet needed to successfully attack the password, and the length of the password. Since modern technology allows for GPU-based acceleration of hashing, it is important to use a slow key-derivation function (e.g. PBKDF2 or bcrypt) instead of a single hash.
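To make the search concrete, here is a minimal sketch of a brute-force loop. It assumes the target was hashed with a single unsalted SHA-256 (chosen only as a stand-in for "a fast hash"), and the function names brute_force and keyspace are made up for illustration. The demo uses a tiny keyspace so that it finishes instantly.

```python
import hashlib
import itertools
import string


def brute_force(target_hex, alphabet, min_len, max_len):
    """Try every string over `alphabet` whose length is in [min_len, max_len]."""
    for length in range(min_len, max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            candidate = "".join(chars)
            # Assumption: the target is a single unsalted SHA-256 hash.
            if hashlib.sha256(candidate.encode()).hexdigest() == target_hex:
                return candidate
    return None


def keyspace(alphabet_size, min_len, max_len):
    """Number of candidates the loop above has to try in the worst case."""
    return sum(alphabet_size ** n for n in range(min_len, max_len + 1))


# Tiny demo: lowercase letters and digits, lengths 1 to 3.
target = hashlib.sha256(b"ab1").hexdigest()
found = brute_force(target, string.ascii_lowercase + string.digits, 1, 3)
print(found)

# Size of the a-z A-Z 0-9 example, for passwords of 5 to 8 characters.
print(keyspace(62, 5, 8))
```

The second number is roughly 2.2 x 10^14 candidates, which is why the cost of each individual hash operation matters so much.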
Dictionary attacks
Dictionary attacks involve running through a large list of pre-chosen words that are likely to be used as passwords. It is important to note that most dictionaries don't just include real dictionary words - they also include various pseudo-words and other values that are found in various database leaks and common password lists. These attacks are more efficient than brute-force attacks in general, because they focus on the kinds of passwords that humans choose rather than completely random values. Defending against such attacks almost entirely relies on not picking a common password or dictionary word.
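A dictionary attack can be sketched the same way, with the exhaustive loop swapped for a word list. The word list, the handful of variations tried per word, and the use of unsalted SHA-256 are all assumptions made for the example; real cracking tools use far larger lists and rule sets.

```python
import hashlib


def dictionary_attack(target_hex, wordlist):
    """Hash each word (plus a few simple variations) and compare to the target."""
    for word in wordlist:
        # Hypothetical variations; real tools apply much richer rules.
        for candidate in (word, word.capitalize(), word + "1", word + "123"):
            # Assumption: the target is a single unsalted SHA-256 hash.
            if hashlib.sha256(candidate.encode()).hexdigest() == target_hex:
                return candidate
    return None


# A stand-in for a leaked or common password list.
wordlist = ["password", "letmein", "dragon", "qwerty"]
target = hashlib.sha256(b"Dragon").hexdigest()
print(dictionary_attack(target, wordlist))
```

Only 16 hashes are computed here, compared with the enormous number a brute-force search over the same length would need.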
Pre-computation attacks
Instead of computing hashes repeatedly and comparing them to the target hash, pre-computation attacks involve computing hashes for a set of chosen values (like a dictionary attack) and storing them in a file or database. Hash databases and rainbow tables are two common methods of doing this. This provides a very fast lookup of plaintext for any known hash, since it's just a case of looking up the hash in the index and returning the associated plaintext. This can be defended against by using a salt, i.e. a random value, unique to each password, that is combined with the password before hashing and stored alongside the hash. An attacker would then need a separate table for every possible salt value, which is completely infeasible.
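As a rough illustration, the sketch below builds the simplest kind of pre-computed table (a plain hash-to-plaintext dictionary, not a real rainbow table) and shows that a salted hash of the same password no longer appears in it. The word list, the 16-byte salt length, and the use of SHA-256 are assumptions for the example.

```python
import hashlib
import os

# Pre-computation: hash every word once and index the results by hash.
words = ["password", "letmein", "dragon"]
lookup = {hashlib.sha256(w.encode()).hexdigest(): w for w in words}

# An unsalted hash is cracked with a single dictionary lookup.
unsalted = hashlib.sha256(b"dragon").hexdigest()
print(lookup.get(unsalted))

# With a random per-password salt, the stored hash is not in the table,
# even though the underlying password is.
salt = os.urandom(16)
salted = hashlib.sha256(salt + b"dragon").hexdigest()
print(lookup.get(salted))
```

The salt only defeats the pre-computed table. Because it is stored next to the hash, an attacker can still run a brute-force or dictionary attack against that one password, which is why a slow function is still needed.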
So, why are complicated passwords important? It depends, really. If you're doing password hashing properly, using PBKDF2 or bcrypt with a reasonable cost factor, complexity beyond not using common passwords isn't actually that important. It's more important to avoid dictionary words and common passwords, and complex passwords do usually offer that kind of protection. However, choosing a long and unusual non-dictionary password that is memorable (e.g. PolynomialLovesBacon) works just as well. If you do password hashing incorrectly (e.g. salted SHA1) you need a much stronger password to remain safe, because GPUs can compute tens of billions of hashes per second.
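For reference, "doing password hashing properly" with PBKDF2 might look something like the sketch below, using only the Python standard library. The iteration count, the 16-byte salt, and the function names hash_password and verify_password are assumptions for illustration; pick the cost factor based on current guidance and your own hardware.

```python
import hashlib
import hmac
import os

# Assumption: an illustrative cost factor, not a recommendation for every system.
ITERATIONS = 600_000


def hash_password(password, iterations=ITERATIONS):
    """Return (salt, iterations, derived_key) for storage alongside the user."""
    salt = os.urandom(16)  # fresh random salt per password
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, iterations, key


def verify_password(password, salt, iterations, expected_key):
    """Recompute the key with the stored salt and compare in constant time."""
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(key, expected_key)


salt, iterations, key = hash_password("PolynomialLovesBacon")
print(verify_password("PolynomialLovesBacon", salt, iterations, key))
print(verify_password("password", salt, iterations, key))
```

Each guess now costs the attacker hundreds of thousands of hash computations instead of one, which cuts the guess rate by the same factor.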
Of course, you're going to have to deal with the human aspects. I think one of the best things you can do is warn users if they use a common password, by storing a list of the ~2000 most common ones (you can get lists of these online) and checking against them. As long as you're properly hashing passwords, most users should be reasonably safe even in the case of a database leak.
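The common-password warning could be as simple as the sketch below. The four-entry list stands in for a real list of roughly 2000 entries, the function names are hypothetical, and comparing case-insensitively is a design choice assumed here rather than something the check requires.

```python
def load_common_passwords(lines):
    """Build a lookup set from an iterable of lines, one password per line."""
    return {line.strip().lower() for line in lines if line.strip()}


def is_common(password, common):
    """True if the password (ignoring case) is on the common-password list."""
    return password.lower() in common


# Stand-in for a real list; in practice this would be read from a file.
common = load_common_passwords(["123456\n", "password\n", "qwerty\n", "letmein\n"])

print(is_common("Password", common))
print(is_common("PolynomialLovesBacon", common))
```

A check like this would run at registration or password change, while the plaintext is still available, and would only warn the user rather than store anything extra.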
Most of these attacks are based on the model of your site being hacked and your passwords stolen, e.g. via SQL injection, so it's important to adhere to secure coding practices and be aware of common vulnerabilities.
Further reading: