27

I've read that every good web application should hash passwords. I found many articles about hashing. So I started implementing hashing on my website and then I asked myself why should I do it? When a user sends his or her login (name+password) to the server, the server loads the password of the given user name from database and then compares passwords. There is no way how the user could get password from the database.

Most likely I'm misunderstanding the concept, so can anyone please tell me why anyone should be using hashing?

  • Before doing any password hashing, please PLEASE go read [this answer](http://security.stackexchange.com/a/31846/9377) first. – lynks Jun 03 '13 at 09:11
  • [Mu](http://en.wikipedia.org/wiki/Mu_(negative)). [Because of brute-force attacks, general hash functions like MD5 and SHA are broken for password storage](http://youtu.be/3rmCGsCYJF8?t=26m28s); use KDFs like [Bcrypt](http://de.wikipedia.org/wiki/Bcrypt) or [Scrypt](http://en.wikipedia.org/wiki/Scrypt). – Bengt Jun 10 '13 at 12:34

8 Answers

29

Your database may be compromised for any number of reasons, and its data may be obtained by someone else. If the passwords are in what we call plain text, you will have leaked a piece of sensitive information that your users have trusted you with: their password (which is very likely to be reused across multiple services). This is a very serious issue.

Instead of plain text, passwords are typically hashed with a one-way hash function. If the cryptographic function used for the hashing is strong, then the passwords are much safer, because even if someone gets their hands on your database, it's computationally infeasible to calculate the passwords (given only the hashes).

On the other hand, the hashed information remains useful to you: Because the same hash function will always yield the same hash for the same input, you can still hash any attempted password and compare the result to your saved hash to verify a user's authentication attempt (without knowing the correct password beforehand).

Of course, just hashing is not enough nowadays. There are lookup attacks that often make it very feasible to recover the passwords behind hashes (when those hashes were created from predictable passwords). To counter these attacks, each password is hashed together with a unique, randomly generated piece of input (called a salt). The salt is stored in plain text in the database and it doesn't have to be secret, because its main purpose is to render precomputed hash dictionaries useless.
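To make the salt-and-hash idea concrete, here is a minimal sketch in Python. It assumes PBKDF2 from the standard library as the hash function; the iteration count, the salt size and the function names `hash_password` and `verify_password` are illustrative choices of mine, not something prescribed above.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Assumed parameters for illustration; tune them for your own system.
ITERATIONS = 200_000
SALT_BYTES = 16


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for a new password; both are stored in the database."""
    salt = os.urandom(SALT_BYTES)  # unique random salt for this password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(attempt: str, salt: bytes, stored_digest: bytes) -> bool:
    """Hash the attempt with the stored salt and compare it to the stored digest."""
    candidate = hashlib.pbkdf2_hmac("sha256", attempt.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored_digest)


salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))  # True
print(verify_password("wrong guess", salt, digest))  # False
```

The server never needs the original password again: it only keeps the salt and the digest, and repeats the computation on every login attempt.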

Almost all serious platforms use hashing and salting to store their users' passwords (unless they make use of something different but comparably secure that I'm not aware of). Incidentally, this is exactly why you can reset your password in various services but almost never recover it: The hash can be overwritten by the system, but it can't be reversed.

You should by all means do the sane thing for the sake of your users' security and salt-and-hash your passwords. There are plenty of resources online that explain the process and the pitfalls in detail. One of them is "Secure Salted Password Hashing: How to do it Properly".

T. C.
7

Your database could be compromised for any number of reasons. It happens all the time, even to some of the biggest sites on the web; just watch the news. Hashing passwords is a safety precaution for your users in case your DB ever gets compromised.

Kevin DiTraglia
6

The most basic reason why you hash a password is so that if the stored version of it is stolen it cannot be openly read.

Note that I said the stored version - if the password is never stored in any form (which is unlikely) then there is no point hashing it ever. But it is unlikely (and impractical) that you will never store it, so it should be hashed. You may think that your database is secure but there are some very smart people out there who are way smarter than you.

But you should also note that hashing alone is largely ineffective - you should also add a salt before hashing. This means you combine a value that is stored alongside the hash (preferably random and different for each user) with the password before hashing it - this defeats precomputed rainbow tables, because a table would have to be built separately for every salt.

Transmission of the password should be across a connection secured by TLS, i.e. it should be across HTTPS. This way you can send the entered password in clear text (you should not try to encrypt or hash it at the web page). The server-side code then salts and hashes the received password before checking the resulting hash against the salted hash stored in the database.

slugster
6

Hashing passwords provides defense against your passwords being compromised when a database has been compromised. It does this in two ways: 1. it hides the users' passwords by making it computationally infeasible to get the password from the hash, and 2. it can slow down the generation of rainbow tables for lookup against known passwords.

Hash functions are one-way functions, meaning that if you know the original data you can produce the hash, but having the hash won't give you the original string (in theory anyway). The easiest way to think of this is to think of the mod function in programming (%). If I were to ask you what values give you n % 6 = 5, there are lots of answers: 5%6 = 5, 11%6 = 5, 17%6 = 5 and so on... but you can't figure out which n I'm using just by knowing that n%6 = 5. Cryptographic hash functions provide this sort of data hiding, except that it's extremely difficult to find collisions (two values that produce the same hash). For example, SHA-256 (a member of the SHA-2 family) provides 128 bits of collision resistance, meaning that it should take about 2^128 tries to find a collision (although SHA-256 has a 256-bit output, it, like all hash functions, is susceptible to the birthday paradox).
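The mod analogy is easy to try out. A tiny sketch (the range of 30 and the variable names are arbitrary choices of mine) that lists inputs which all "hash" to 5, and computes the birthday bound for a 256-bit output:

```python
# Every n below 30 with n % 6 == 5: knowing the result 5 does not tell you which n was used.
preimages = [n for n in range(30) if n % 6 == 5]
print(preimages)  # [5, 11, 17, 23, 29]

# Birthday bound: for a 256-bit output, a collision is expected after roughly 2^(256/2) tries.
output_bits = 256
collision_work = 2 ** (output_bits // 2)
print(collision_work == 2 ** 128)  # True
```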

Because cryptographic hash functions have this property of data hiding, when someone looks at the passwords in your database they typically won't be able to tell what those passwords are just by looking at them.

Rainbow Tables

Rainbow tables are tables of pre-hashed passwords. Because most people use easy-to-remember passwords with low entropy, a bad guy could just have a table of known passwords and check your hashed value against what is in the table. To mitigate that you use a salt, which is just a random string that you concatenate to the password. It's suggested to use a different random salt for each user's password.

The salt should be generated with a cryptographically secure random number generator (for C# you can use RNGCryptoServiceProvider) any time a password is created or changed. Lots of programmers think that a salt has to be a string of characters. It does not: you can use bytes that are Base64 encoded in your db and just decode them before you do your check.

The salt changes the original string enough that the hash doesn't match what would be in a rainbow table, which makes a simple lookup useless. Good cryptographic hash functions have a property called the "avalanche effect", which basically means that a 1-bit change in the input changes about half of the output bits on average. So hashing with a salt increases your overall security by making it harder to figure out what those passwords are by a simple lookup.
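The avalanche effect can be observed directly. The following sketch uses SHA-256 from Python's standard library purely as a demonstration hash; the sample input and the helper name `differing_bits` are my own. It flips a single input bit and counts how many output bits change.

```python
import hashlib


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


original = b"myDarnPassword"
# Flip the lowest bit of the first byte: a 1-bit change in the input.
flipped = bytes([original[0] ^ 0x01]) + original[1:]

d1 = hashlib.sha256(original).digest()
d2 = hashlib.sha256(flipped).digest()
changed = differing_bits(d1, d2)
print(f"{changed} of {len(d1) * 8} output bits changed")
```

The exact count varies from input to input, but it should land near half of the 256 output bits, which is why a salted password produces a hash that looks unrelated to the unsalted one.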

Fast Computation

These days computers are extremely fast, making it feasible to compute a rainbow table on the fly. Because the salt is stored in plaintext, usually in another column of the same table as the hashed password, a bad guy who has access to your database can take a known dictionary and the salt from your database and hash every candidate on the fly, which amounts to building a rainbow table for that one salt.
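A rough sketch of that on-the-fly attack, assuming the site stored a single salted SHA-256 per password. The word list, the stolen values and the function name `fast_salted_hash` are made up for illustration.

```python
import hashlib
import os


def fast_salted_hash(password: str, salt: bytes) -> bytes:
    # A single SHA-256 call: fast for the defender, and just as fast for the attacker.
    return hashlib.sha256(salt + password.encode("utf-8")).digest()


# What the attacker finds in the stolen row: the salt and the hash.
stolen_salt = os.urandom(16)
stolen_hash = fast_salted_hash("swordfish", stolen_salt)

# Hypothetical word list; real attackers use lists with millions of entries.
dictionary = ["password", "123456", "letmein", "swordfish", "qwerty"]

# Hash each candidate with the known salt until one matches.
cracked = next(
    (word for word in dictionary if fast_salted_hash(word, stolen_salt) == stolen_hash),
    None,
)
print(cracked)  # swordfish
```

The salt forces the attacker to repeat this loop for every user, but with a fast hash each pass through the dictionary is still cheap.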

For this case we want a way to make it harder for them to do it. And while many of us programmers toil over making our applications faster, this is where we actually want things to go a little slower. Hash functions such as the SHA family are made to hash large amounts of data very fast, so they are not ideal for password hashing. Hash functions such as BCrypt introduce a work factor: a higher work factor makes the algorithm slower, so it's extremely hard to generate a rainbow table. As computers get faster you simply increase your work factor and rehash each password with the new work factor the next time its user logs in (rehashing needs the plaintext password, which you only see at login).

A nice side effect of BCrypt is that it will generate the salt for you, so you don't have to mess with RNGCryptoServiceProvider etc., and if you use BCrypt.NET you won't need an additional column to store the salt, as it is part of the resulting string you get back. The idea is to have a work factor that's tuned so a user doesn't notice the delay when logging in, but a bad guy is really slowed down when they are trying to brute force a password or trying to generate a rainbow table on the fly.
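Python's standard library has no BCrypt, so the sketch below swaps in PBKDF2, with its iteration count playing the role of the work factor. It imitates two things described above: the salt and work factor travel inside the stored string, and an outdated work factor is upgraded at login, when the plaintext password is available. The record format and the names `encode_hash`, `check_and_upgrade` and `CURRENT_ITERATIONS` are my own assumptions, not the BCrypt format.

```python
import base64
import hashlib
import hmac
import os

CURRENT_ITERATIONS = 200_000  # assumed work factor; raise it as hardware gets faster


def encode_hash(password: str, iterations: int = CURRENT_ITERATIONS) -> str:
    """Return 'iterations$salt$digest' so a single column holds everything needed to verify."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return "$".join([
        str(iterations),
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(digest).decode("ascii"),
    ])


def check_and_upgrade(password: str, stored: str):
    """Return (ok, new_record); new_record is set only when the work factor was outdated."""
    iterations_text, salt_b64, digest_b64 = stored.split("$")
    iterations = int(iterations_text)
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    if not hmac.compare_digest(candidate, expected):
        return False, None
    if iterations < CURRENT_ITERATIONS:
        # Login succeeded with an old work factor: rehash now, while we have the password.
        return True, encode_hash(password)
    return True, None


old_record = encode_hash("hunter2", iterations=1_000)
ok, upgraded = check_and_upgrade("hunter2", old_record)
print(ok, upgraded is not None)  # True True
```

On a successful login the caller would write `upgraded` back to the database whenever it is not `None`.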

nerdybeardo
  • This is a good presentation, except for the first paragraph. Hashing passwords only defends against one thing: brute force attacks after the database is compromised. It's not two lines of defense, just the one. (Or you could call salting and slowness two lines of defense, but the important point is that the only use of hashing is to mitigate a database compromise, it does nothing about online brute-force attacks.) – Gilles 'SO- stop being evil' Jun 03 '13 at 11:46
  • @Gilles Thanks for the clarification, I see your point and agree I'll make an edit. – nerdybeardo Jun 03 '13 at 13:16
4

There is no way how the user could get password from the database.

And indeed as long as this assumption is true there is no reason to hash the password — it's literally a waste of time.

The fly in the ointment is that this assumption does not hold in practice. There are two main ways a password database could be compromised:

  • Attackers manage to gain unauthorized access to the live server. A typical example is SQL injection in some buggy web application.
  • A backup of the database is accidentally exposed to the internet or physically lost.

If the password database is leaked in such a manner, and the passwords are in plain text, then anyone who can access the leaked database gets to read the passwords. You might think then of encrypting the password, but that doesn't really help, because the application needs access to the key in order to decrypt, and most attacks that leak the database can also leak the key.

So instead of storing the password directly, you should store a hash of it. The general idea of a hash is that you can compute the hash from the password but not the other way round. Cryptographers have come up with algorithms to do this. An added twist is that not just any cryptographic hash will do, because the attacker can try to compute hashes by brute force (“Is it password? No. swordfish? No. Tr0b4dor&3? Yes!”). So a good password hash has two additional qualities:

  • It must be slow, so that it'll take a long time for the attacker to go through all likely possibilities.
  • It must incorporate a unique element (called a salt), so that the attacker can't do a lot of precomputation that makes it efficient to crack lots of passwords at once.

For all the details of password hashing, read "How to securely hash passwords?". In practice, call scrypt, bcrypt or PBKDF2 from your programming language's or framework's library, with a random salt.
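As one example of calling a library function with a random salt, here is a short sketch using `hashlib.scrypt` from Python's standard library (it is only available when Python is built against an OpenSSL version that supports scrypt). The cost parameters and the helper name `scrypt_hash` are assumptions for illustration; check current guidance before picking values.

```python
import hashlib
import hmac
import os

# Assumed cost parameters for illustration only.
N, R, P = 2**14, 8, 1


def scrypt_hash(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte hash from the password and salt with scrypt."""
    return hashlib.scrypt(password.encode("utf-8"), salt=salt, n=N, r=R, p=P, dklen=32)


salt = os.urandom(16)  # random salt, stored next to the hash
stored = scrypt_hash("Tr0b4dor&3", salt)

matches = hmac.compare_digest(scrypt_hash("Tr0b4dor&3", salt), stored)
rejects = not hmac.compare_digest(scrypt_hash("swordfish", salt), stored)
print(matches, rejects)  # True True
```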

Gilles 'SO- stop being evil'
2

(copy and pasting my answer from another thread, with minor edits)

I've read that every good web application should hash passwords. I found many articles about hashing. So I started implementing hashing on my website and then I asked myself why should I do it?

Good question, and I'm glad you asked it. I want people to find this thread when they Google it so they -- hopefully -- won't make the same mistakes that many other companies make.

You shouldn't just hash passwords, you should salt them, and add a SlowEquals.
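By SlowEquals I mean a comparison that takes the same time no matter where the two values first differ, so that response times leak nothing about the stored hash. A minimal sketch in Python: `hmac.compare_digest` is the standard library call for this, and `slow_equals_manual` is a hand-written version shown only to illustrate the idea (both function names are mine).

```python
import hmac


def slow_equals(a: bytes, b: bytes) -> bool:
    """Compare two digests in time that does not depend on where they first differ."""
    return hmac.compare_digest(a, b)


def slow_equals_manual(a: bytes, b: bytes) -> bool:
    """Illustrative version: accumulate all differences instead of returning early."""
    diff = len(a) ^ len(b)  # a length mismatch already counts as a difference
    for x, y in zip(a, b):
        diff |= x ^ y
    return diff == 0


print(slow_equals(b"abc123", b"abc123"))         # True
print(slow_equals_manual(b"abc123", b"abc124"))  # False
```

In real code, prefer the library function over the hand-written loop.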

Why salt?

Let's imagine you just hash your passwords without a salt. You would end up producing the same output for the same password every single time.

For example, "myDarnPassword" would end up being converted to "aca6716b8b6e7f0afa47e283053e08d9" in md5. At this point, you could mount a dictionary attack by yourself. You could automatically generate a database that maps as many random character strings and dictionary words as possible to their hashes. You'd create a table that looks like this:

+-------------------+----------------------------------+
| PASSWORD          |           UNSALTED_HASH          |
+-------------------+----------------------------------+
| myDarnPassword    | aca6716b8b6e7f0afa47e283053e08d9 |
+-------------------+----------------------------------+
| pleaseDontSueMe11 | 0dd395d0ec612905bed27020fb29f8d3 |
+-------------------+----------------------------------+

Then you would select from the database like this:

SELECT [PASSWORD] FROM [TABLE] WHERE [UNSALTED_HASH] = 'aca6716b8b6e7f0afa47e283053e08d9'

And it would return myDarnPassword.

With enough processing power and time, you could create trillions of combinations, and quite easily crack a large number of passwords. At that point, all you really have to do is look it up. And if you've stolen other people's passwords in the past from a database, you can add those, and convert them to md5 hashes.

Salting the hash defeats this attack.
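Here is a small sketch of that lookup attack and of how a salt breaks it. MD5 is used only to mirror the example above, the password list is hypothetical, and a Python dict stands in for the database table.

```python
import hashlib
import os

# Hypothetical list of leaked or common passwords the attacker has collected.
known_passwords = ["myDarnPassword", "pleaseDontSueMe11", "password", "123456"]

# Precomputed once, then reused against every stolen database of unsalted MD5 hashes.
lookup = {hashlib.md5(p.encode("utf-8")).hexdigest(): p for p in known_passwords}

# Unsalted: the stolen hash is found with a single lookup.
unsalted = hashlib.md5(b"myDarnPassword").hexdigest()
print(lookup.get(unsalted))  # myDarnPassword

# Salted: the same password now hashes to a value the table has never seen.
salt = os.urandom(16)
salted = hashlib.md5(salt + b"myDarnPassword").hexdigest()
print(lookup.get(salted))  # None
```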

When a user sends his or her login (name+password) to the server, the server loads the password of the given user name from database and then compares passwords. There is no way how the user could get password from the database.

Right. You hash the submitted password with the stored salt and compare the result to the salted hash in the database; if they match, it's considered a valid password. You may then allow the user to log in.

Below is something people can do with unhashed and unsalted plaintext passwords. It may not necessarily be used to target you directly, but let's say Hacker wants to target Person A. Let's deduce how you can target Person A.

  1. You are Hacker. Your job is to hack websites and develop a database to aggregate this information.
  2. Person A is a person of interest. Person A shows up in the database of one of your hacked sites. You now know their email address, and the password they're using for that website.
  3. Now you can try to log in to their email account with the password you've stolen from that website. Sweet, it works!
  4. Now that you have access to their email, you download all of their emails through IMAP, or through their web-mail. At this point, you find lots of interesting things. They're communicating with Person B.
  5. You can actually google some people's usernames and email addresses, and it could show websites they post on. This will bring up other websites that the user uses. Maybe you can try to hack those websites, or maybe you can just deduce what they're into. Now you can pretend to be like them, or find additional information. Information/activities could include:
    • Usernames. Person A posts online as Mark Hulkalo. That's a relatively unique name, a combination of Mark Ruffalo the actor, and the monster he portrays, The Incredible Hulk. You can then google, Mark Hulkalo, and look for websites that he posts on. Maybe he reveals more of his personality on other sites?
    • Passwords. Maybe Mark Hulkalo has the same password on that website. Maybe you can log in to that website and view his private communications with others?
    • Personal Information. Because you know the identity of Mark Hulkalo, what if he shares personal information on a certain website? What if he posts on craigslist searching for male or female escorts, and he's left his phone number there? You now have his phone number, so you can find a way to set him up and blackmail him for money/information/power. This doesn't have much to do with salting passwords, unless your database doesn't include the phone number but the attacker finds it on another website thanks to your leak. It's one of the many very powerful ways that information can be collected and used against you. This is, after all, an Information Security forum, so I want to use this example.
    • Family Information. Now it's getting creepy. We've got Mark Hulkalo's personal information. Let's look into his social networking. Oh, he has a Facebook account. Can we access this with the same password? If Mark Hulkalo is using the same password/email combination, then probably. And you can probably deduce this from his email that you accessed earlier, where you found a lot of interesting things. We can now log in and read his Facebook messages. Now we know who his family members are. We can then coordinate the blackmail attack more easily.
    • Other Login Information. Since we got access to his email earlier, we see he also has a Skype account. One of them is secret. We log in, and see he's flirting with his escorts on Skype. We now have more blackmail material.
    • Impersonation. You can now log in and impersonate Mark Hulkalo on a variety of websites. Maybe he's actually a straight-shooter and never went after any escorts, or anything of the sort? Well, now you can turn him into an escort-seeking reprobate by using his credentials to impersonate him online. Imagine the damage that could cause to a politician who was wrongly accused and forced to resign.
    • Things that make it easier to hack other people. You can then send emails to Person B with infected attachments, and pretend you know him. You've read enough emails, so you're able to imitate Mark Hulkalo to the point where you seem just like him. You craft the email in a way that leaves Person B unsuspecting of what's really going on, and now you can do the same thing to Person B, or worse.

And those are just a few scenarios. There are a lot of different uses for someone else's credentials. Salt and hash your passwords, and prevent SQL injection attacks. Please don't turn me into an escort-seeking reprobate! Save Mark Hulkalo!

(I'm aware some websites can block your attempt to access their services when using a different IP, but there are many ways around this, and not all websites do this).

By the way, congratulations on your class action lawsuit if you get hacked.

Mark Buffalo
2

If someone cracks your database, they will get all the passwords, but only in hashed form. Since the hash function is not reversible, they won't be able to do much with them.

A more secure way is to compute hash = H(password + salt), i.e. to hash the password together with a salt.

2

It is for added security. You have to assume the worst case, which is that your database will be hacked. If you save your users' passwords in plain text along with other data, the hacker will know that this user uses that password. Since most people reuse passwords, the hacker could potentially use the data stolen from you to access other sites.

That's why it is considered good practice to hash your passwords, and preferably to salt them as well, making it harder for a potential hacker to recover the plain text from the hash.

PureW