3

I'm looking for a reliable solution to compare two strings without leaking their content through time differences. The length of the strings is not secret.

The background is this: I'm implementing password-based authentication for a web application. The passwords are hashed with bcrypt and stored in a database. Rather than giving the application direct access to the hashes so that it check the password, I'd like to delegate this to the database system itself. The application only hashes the password and then passes the hash to a database procedure. This procedure has special privileges to access the stored hashes and compare them with the provided hash. The goal is to protect the hashes against SQL injection attacks. A similar idea is described in this presentation.

Obviously, this scheme is only effective if the procedure does not leak any information about the stored hashes. So the string comparison must be timing-safe.

I'm aware of the following method (pseudo-code):

string_comp(str_1, str_2):
    if str_1.length != str_2.length:
        return false
    else:
        result := 0
        for i := 0 to str_1.length - 1:
            result := result | (str_1[i] ^ str_2[i])

        return result == 0

However, this is rather cumbersome, and I'm not sure if it works as expected in high-level languages like SQL.

Another common suggestion is to hash the strings and then compare the hashes. This would be much simpler than the code above.

Which method is preferable? If I were to use the hash solution, which algorithm would I choose to not degrade the strength of bcrypt (not even theoretically)? SHA-256? SHA-384? SHA-512?

Fleche
  • 4,024
  • 1
  • 17
  • 20

2 Answers2

2

Note that you can't compute the bcrypt hash in the application and pass it to the database unless you store the salts separately from the hashes. If you do that, you'll incur the overhead of performing a SELECT first, then the bcrypt, and then the password verify step. Doing as the presentation suggests (having the DB calculate the hash) might make the most sense. Postgres 9's crypt module does support bcrypt as a built-in hash type.

If you want to do the hashing in the application anyway, comparing with a hash is probably the simplest (least likely to have implementation issues). Since bcrypt is based on blowfish, it only considers up to 448 bits of input, so you can safely hash the output with SHA-512 and mathematically will not lose any strength. In reality, SHA-256 is fine as well, as users passwords essentially never contain more than 256 bits of entropy. Even 256 bits of perfectly random printable characters would take 56 character long passwords. Comparing using a hash is likely to have the most straightforward implementation, and since you already have the overhead of bcrypt, the cost of 2 additional hashes is trivial.

(This part of the answer is based on a misinterpretation of the question as being concerned about timing attacks against the application, not the database, but I'm leaving it here for reference.)

If you're using bcrypt, you've already hashed the password, so you don't need to be concerned about the timing results. Even if someone could deduce the hash of the password, they'd still need to find a plaintext to provide to the application, which should be computationally infeasible (or bcrypt is broken).

Timing attacks are only useful if the attacker can iterate over the values being compared to find a useful comparison. (i.e., A, B, C...) With a hashed value, this is not possible, as they cannot iterate over the output of the hash (for cryptographically strong hashes). Small changes in the input to bcrypt result in large changes to the output. A user providing the input to bcrypt has no way to try out prefixes and then extend the prefix. An attacker would not know anything about either side of the comparison, and thus timing attacks do not leak any information about the value of the hash.

David
  • 15,814
  • 3
  • 48
  • 73
  • bcrypt does not make the passwords immune against brute-force attacks. The strength of a particular hash still depends on the strength of the original password, so it's probably a good idea to expect the worst and do everything we can to not leak the hashes in the first place. I'm not sure what you mean by the second paragraph. The stored procedure takes an arbitrary string and compares it with the expected hash. If I used naive string comparison, then it would be possible to try out different prefixes and see if they match, then extend the prefix etc. – Fleche Jun 11 '14 at 04:34
  • I just expanded on the 2nd paragraph, but basically, since the attacker doesn't know either side of the comparison, the timing information is not useful to them. The paper you linked recommends using a hash, bcrypt is just such a hash, hashing again is unnecessary. – David Jun 11 '14 at 04:43
  • I'm talking about timing attacks against the stored procedure, not the overall password check. The stored procedure is a simple string comparison with no relation to bcrypt (it just happens to compare bcrypt hashes in my case). Without additional measures, this is vulnerable to timing attacks like any other string comparison. – Fleche Jun 11 '14 at 04:52
  • To be clear, you want to protect the hashes against an attacker who can run arbitrary SQL? – David Jun 11 '14 at 04:56
  • Yes. In the classical model, it's possible to fetch all hashes with something like `SELECT password_hash FROM users`. I want to disallow direct access to the hashes and only let the application ask whether a particular string matches the (unknown) stored hash. That's really all it has to do, there's never any need to load all hashes into the application. The presentation linked in the question explains the idea in detail. However, it takes a slightly different approach, becauses it even delegates the bcrypt calculation to the database system. – Fleche Jun 11 '14 at 05:15
  • Updated to reflect how I now understand your question. – David Jun 11 '14 at 05:36
  • Thanks for the update. Letting the database system hash the password is something I'd like to avoid, because this is pretty much limited to PostgreSQL and creates new problems. For example, I'd have to be very careful that the plaintext password doesn't show up in any query log. Thanks also for the comment on hash lengths. I wasn't sure which bit length to match: The 192 bit output or the 448 bit input. – Fleche Jun 11 '14 at 12:53
1

Rather than giving the application direct access to the hashes so that it check the password, I'd like to delegate this to the database system itself.

Why do you want to delegate it? It's simpler to just fetch the hash for the username given, and do a constant side string comparison (using pseudo-code above) with the fetched hash.

Every successful login will give the web application running on your server both the entered password as well as the computed hash of the password.

Granted what you want is fairly straightforward to do within for a given database using its procedural langauge (PL). For example, in postgres you could write:

CREATE OR REPLACE FUNCTION user_with_correct_hash(username TEXT, app_hash TEXT) RETURNS BOOLEAN AS
$$
DECLARE
    user_hash TEXT;
BEGIN
    user_hash := SELECT hash FROM users WHERE "name" = username;
    IF (user_hash IS NULL) THEN
        RETURN false;
    END IF;
    RETURN constant_time_string_compare(user_hash, app_hash);
END
$$
LANGUAGE plpgsql;


CREATE OR REPLACE FUNCTION constant_time_string_compare(user_hash TEXT, app_hash TEXT) RETURNS BOOLEAN AS
$$
DECLARE
    result INTEGER;
    index INTEGER;
    hash_length INTEGER;
BEGIN 
    hash_length := CHAR_LENGTH(app_hash);
    IF CHAR_LENGTH(user_hash) != hash_length THEN
        RETURN false;
    END IF;
    index := 1;
    result := 0;
    WHILE index <= hash_length LOOP
        result := result | (ascii(substr(app_hash, index, 1)) # ascii(substr(user_hash, index, 1)));
        index := index + 1;
    END LOOP;
    RETURN (result = 0);
END
$$
LANGUAGE plpgsql;

Granted be very careful to give your constant time string comparison several test cases to make sure it is behaving properly.

dr jimbob
  • 38,768
  • 8
  • 92
  • 161
  • This is actually what I currently have. Any reason why I can't replace the bit operations with a simple `digest(user_hash, 'sha256') = digest(app_hash, 'sha256')`? That's one line of code instead of 15. – Fleche Jun 11 '14 at 04:53
  • @Fleche - With a suitably long salt unknown to the attacker (e.g., bcrypt's 128-bit salt), AFAIK you do not have to worry about timing attacks. That said, constant-time string comparison strikes me as the right thing to do for defense-in-depth and a good habit. I don't like take a hash of the your hash and do non-constant time comparison. All you've done is effectively change the hash function from bcrypt to sha256 of bcrypt and any timing attack that could have been done on bcrypt can now be done on sha256 of bcrypt. – dr jimbob Jun 12 '14 at 17:16
  • @Fleche - As for using constant-time functions for defense-in-depth, attacks aren't always intuitively obvious; e.g., the [recent paper](http://www.tau.ac.il/~tromer/papers/acoustic-20131218.pdf) (note Shamir, the S of RSA is a co-author) where RSA in GnuPG (which used non-constant time modular exponentiation) was broken by using a mobile telephone's microphone to listen to a laptop decrypting things with RSA from across a room. – dr jimbob Jun 12 '14 at 17:21