13

Goal: have token/cookie-based authentication that doesn't require keeping sessions on the server

TL;DR: What, if any, is the accepted mechanism to work around the 72-character limitation of BCrypt?

Long version:

After reading this answer I attempted to implement authentication on a REST-based server using BCrypt as a hashing algorithm and a rotating server-side secret. The basic idea was that upon successful authentication with a username and password, the server would set a cookie containing a hash based on concatenating the following:

  • The server-side secret
  • The user ID and groups to which the user belongs
  • The time at which point the token was given out

This would then be hashed with a salt, and the result would go in the cookie, to be read and verified by the server. Verification is possible because the user ID, groups and date are transmitted in the clear by the client, and the server-side secret is in memory; the secret serves as a guarantee that the client didn't make up all the other values themselves.

This all worked well until I realized that BCrypt has a 72-character limit on the plaintext being hashed (anything after those 72 characters does not influence the output hash).

This is a problem because the above data amounted to more than 72 characters, and allowing either the time, user ID, groups or secret key to be 'last' in the input and not influence the output opens holes in the system's security (either old tokens are infinitely valid, or people can modify their user ID / role arbitrarily).
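To make the failure concrete, here is a minimal sketch that only simulates the truncation with a byte slice; it does not call a real BCrypt library. The 48-byte placeholder secret, the `|`-separated layout and the helper names are assumptions made up for illustration, not the format used in the original system.

```python
BCRYPT_INPUT_LIMIT = 72  # bytes of input that influence the hash, per the question


def effective_bcrypt_input(data: bytes) -> bytes:
    """Return the part of the input that a truncating BCrypt would actually hash."""
    return data[:BCRYPT_INPUT_LIMIT]


# Placeholder secret (assumption): 48 bytes, so it already uses most of the budget.
SECRET = b"s" * 48


def token_input(user_id: str, groups: str, issued_at: int) -> bytes:
    """Concatenate secret, user ID, groups and issue time (hypothetical layout)."""
    return SECRET + f"|{user_id}|{groups}|{issued_at}".encode()


old = token_input("1001", "staff,editors,reviewers", 1350000000)
new = token_input("1001", "staff,editors,reviewers", 1350086400)

# The full inputs differ, but the timestamp sits entirely past byte 72,
# so both tokens would feed identical bytes to the hash.
print(len(old), old != new)
print(effective_bcrypt_input(old) == effective_bcrypt_input(new))
```

With the time last, a token never expires; reordering the fields just moves the hole to whichever value ends up past the limit.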

Ironically, when I finally figured out why my system was producing identical hashes based on differing input and googled around, it seems this kind of situation was foreseen by some.

Anyhow, the question: What, if any, is the accepted mechanism to work around the 72-character limitation of BCrypt?

Some pre-emptive thoughts: Making up your own variations on standard crypto algorithms is something I've been warned about for as long as I can remember, so I am not 100% sure what would be the best course of action.

Obvious options include:

  • chunking the input and applying BCrypt to each in turn and/or in cascaded fashion;
  • using another (non-cryptographic?) hash on the input to reduce its size
  • using plain compression (but that might be ineffective with such a small bit of plaintext)
  • switching to another algorithm (SHA-3 was recently finalized, for instance).

(there may be other things wrong with the above idea, in which case I would of course be happy to be corrected!)

Gijs

3 Answers

18

The accepted mechanism is "don't do it".

What is bcrypt good at? It is good at being slow. Why would you want a cryptographic function, or just any function, to be slow? This makes sense only when the input to the function is a low-entropy secret, which means "some value which the adversary could conceivably, and realistically, explore exhaustively". Passwords are low-entropy secrets: it is well known that when human beings are allowed to choose secret strings in the privacy of their brain, most will select "witty" passwords which are utterly weak against dictionary attacks. The only remaining defense is to enforce the use of an inherently slow hashing process, which is equivalent to sending all involved computers (and, in particular, the attacker's computers) back to the early 1980s, when 1 MHz was a very high CPU frequency.

In your case, the "secret value" is the server's secret; it should not be of low entropy (the server is a machine, not a human; it can remember long keys). If the server's secret is a low-entropy value, then you are doing it wrong.

Therefore, no need for bcrypt or any other deliberately slowed function here. What you actually want is a Message Authentication Code: you want to store some values on the client side (as a cookie), but you also want to check that you are not fed back forged values; this is a matter of integrity, and a MAC is the right tool for that. Use HMAC/SHA-256, with the "server's secret" as the MAC key (your homemade hashing of several values, including the server secret, is just a homemade MAC in disguise, so you could just do it right by reusing the standard, robust, expert-vetted solution, and that's HMAC).
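As a rough illustration of that advice, the following sketch signs the cookie contents with HMAC/SHA-256 from the Python standard library. The JSON payload layout, the field names (`uid`, `groups`, `iat`), the `payload.tag` cookie format and the function names are all assumptions chosen for the example; only `hmac`, `hashlib`, `secrets`, `base64` and `json` are real library modules.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time
from typing import List, Optional

# High-entropy server key (assumption: kept in memory and rotated elsewhere).
SERVER_SECRET = secrets.token_bytes(32)


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def issue_token(user_id: str, groups: List[str], issued_at: int,
                key: bytes = SERVER_SECRET) -> str:
    """Serialize the claims and append an HMAC/SHA-256 tag over them."""
    payload = json.dumps(
        {"uid": user_id, "groups": groups, "iat": issued_at},
        separators=(",", ":"), sort_keys=True,
    ).encode("utf-8")
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return _b64(payload) + "." + _b64(tag)


def verify_token(token: str, max_age_seconds: int, now: Optional[float] = None,
                 key: bytes = SERVER_SECRET) -> Optional[dict]:
    """Return the claims if the tag is valid and the token is fresh, else None."""
    try:
        payload_b64, tag_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many tag bytes matched.
    if not hmac.compare_digest(tag, expected):
        return None
    claims = json.loads(payload)
    now = time.time() if now is None else now
    if now - claims["iat"] > max_age_seconds:
        return None
    return claims
```

The whole payload is covered by the tag regardless of its length, so the ordering problem from the question does not arise. The claims are still readable by the client; this gives integrity only, not confidentiality.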


Of course, once we begin to use client-side storage, it becomes tempting to store values which the client himself should not learn. To the integrity requirement, we add confidentiality. For that, we need encryption. Doing encryption properly is difficult, and combining encryption and a MAC even more. In that case, do yourself a favour: use an authenticated encryption mode, which will do it properly with minimum hassle (I recommend EAX).


By suggesting Keccak (SHA-3), you demonstrate that you are afflicted with a very common disease, which can be called the "iPhone syndrome": you crave the newest gadget. Stop it. When dealing with security in general and cryptography in particular, this will lead only to massive suffering. In cryptography, "new" is another name for "bad".

The best kind of cryptographic algorithms is old algorithms which are still alive. There is no better criterion for security. In that sense, SHA-2 is better than SHA-3, and will remain so in the future, until a problem is found with these functions. The NIST people themselves explicitly recommend SHA-2 and deny any suggestion that SHA-3 would be better than SHA-2.

Thomas Pornin
  • Thanks for weighing in with the clear pointers. Some questions: I opted for BCrypt not so much for 'slow' as much as for 'can be made slow enough for bruteforce not to be viable', even as technology progresses. If I understand, you're saying: as the secret is controlled, increase the entropy of the secret rather than the 'slowness' of the algorithm, in order to achieve the same resistance, correct? Regarding the confidentiality: I don't really understand the link to the question; what am I missing? Do you think some of the plaintext used for the MAC should be confidential? – Gijs Oct 12 '12 at 19:23
  • As for "iPhone syndrome": ouch. I disagree, but I see your point. SHA-3 was an example; surely NIST isn't themselves saying that it's worse than SHA-2, either? The reason I didn't go with HMAC+SHA-2 to begin with (I'm aware rolling your own is a bad idea) was that in my limited searching around, there were concerns about it after the SHA-1 issues that surfaced, hence looking further. Searching some more it seems those concerns have faded somewhat now that quite some time has passed and nothing horrible has reared its head. Would you say the latter is an accurate summary? – Gijs Oct 12 '12 at 19:32
  • @Gijs Brute force only works if you have a small enough number of possibilities. Even a "small" key like 128 bits is [unlikely to be broken by brute force](https://www.wolframalpha.com/input/?i=2^128+nanoseconds). The reason we need bcrypt is that most people have passwords which are **extremely** short. Your server doesn't have that limitation. – Brendan Long Oct 12 '12 at 20:19
  • Security of HMAC relies on some properties of the underlying hash function. These properties may suffer from issues similar to those which allow for building collisions, although it seems that the problem is not as dire (e.g. HMAC/MD4 is broken, but with effort 2^58, which is high, whereas MD4 collisions are "instantaneous": generating an MD4 collision costs less than computing MD4 over two messages). Therefore, it is _possible_ that the known theoretical weaknesses on SHA-1 _might_ impact HMAC/SHA-1. However, as far as we know, HMAC/SHA-256 is fine (and HMAC/Keccak is equally fine). – Thomas Pornin Oct 12 '12 at 20:35
2

You could use another key derivation function such as scrypt or pbkdf2+sha2 (or sha3).

However, I think this approach to authentication is very wasteful of bandwidth, memory and CPU, and to top it off it's less secure. For key stretching to be effective, it must be heavy, and you would need to make this calculation for every request. I think you are better off issuing a cryptographic nonce and backing the server state with a fast non-relational store like memcached. Sure, it's not "RESTful", but it's a more efficient and more secure design.
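A minimal sketch of that alternative might look like the following. An in-process dictionary stands in for memcached here, and the `SessionStore` class, its method names and the TTL handling are assumptions for illustration rather than any particular library's API.

```python
import secrets
import time
from typing import Dict, Optional, Tuple


class SessionStore:
    """Maps random session tokens to user IDs; a dict stands in for memcached."""

    def __init__(self, ttl_seconds: float) -> None:
        self._ttl = ttl_seconds
        self._sessions: Dict[str, Tuple[str, float]] = {}

    def create(self, user_id: str, now: Optional[float] = None) -> str:
        """Issue an unguessable token and remember who it belongs to."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)  # 32 random bytes from the OS CSPRNG
        self._sessions[token] = (user_id, now + self._ttl)
        return token

    def lookup(self, token: str, now: Optional[float] = None) -> Optional[str]:
        """Return the user ID for a live token, or None if unknown or expired."""
        now = time.time() if now is None else now
        entry = self._sessions.get(token)
        if entry is None:
            return None
        user_id, expires_at = entry
        if now >= expires_at:
            del self._sessions[token]  # drop expired sessions on access
            return None
        return user_id
```

The token itself carries no information, so there is nothing for the client to forge; the cost is one lookup per request and the server-side state the question wanted to avoid.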

The best cryptographers use cryptography when there is no other choice. Plan on failure.

rook
0

It sounds like you are trying to maintain session state without the usual approach of storing the session data on the server end.

Here's a better cookbook:

  1. Authenticate the username and password (Here you can use bcrypt for password validation)
  2. Put all the session information you want to store in a serialized blob
  3. Encrypt the blob with a strong, secret server key, using authenticated encryption
  4. Send the blob in a cookie/hidden form field/... (base64 encoded)

Upon the next call:

  1. Decrypt/verify the blob
  2. Unserialize the session data
  3. Profit!

All these steps have been battle-tested.

Jan Hertsens
  • Your first sentence makes it sound like you want to suggest storing the session info server-side, which violates the assumption in sentence 1 of the question I asked. Then your "cookbook" basically says "use encryption [instead of hmac as suggested in the accepted answer]" to be able to verify session info, without explaining why. The answer would be a lot more appropriate if it provided more reasoning for either suggestion. – Gijs Sep 22 '15 at 10:40
  • Re "you want to suggest storing the session info server-side, which violates the assumption": please re-read. It says "_without_ the usual approach of storing the session data on the server". In other words, "usually we store session information on the server, but it looks like you don't want to do that". Re "basically says 'use encryption [instead of hmac as suggested in the accepted answer]'": it says "using _authenticated_ encryption": https://en.wikipedia.org/wiki/Authenticated_encryption This gives you the benefits of the HMAC (integrity) plus encryption (confidentiality). – Jan Hertsens Sep 23 '15 at 16:21
  • Regarding "without" etc. - you then say "here's a better cookbook" which implies that you will go do "something else". As for confidentiality and AE, well, that should be part of the answer, then... ideally in more detail than just a link to a wiki page. – Gijs Sep 23 '15 at 17:05
  • Yes, _I_ and most web applications that you see will store session information on the server, but that is not what you asked. My "cookbook" was intended to show clear implementation steps. It's "better" because it is actionable. My answer securely and reliably answers your stated "goal" instead of your "question". It is unclear to me whether you are asking for the best practices to _implement_ a solution to a problem, or wanting to _debate_ the best practices. – Jan Hertsens Sep 23 '15 at 17:29
  • I was (this question is pretty old now) asking for implementation steps, and in that sense there is nothing wrong with the answer - but you depart very significantly from the question and give no reasons for doing so. Giving those reasons (as, for instance, the accepted answer did) would help make it a more useful answer, even without my wanting to "debate" the best practices. – Gijs Sep 25 '15 at 07:11