What are best practices for finding an account in a SQL database during authentication? Is using `LIMIT 1` vulnerable to timing attacks?

Question

I have a application where users can log in by providing a username or email address (both case insensitive) and a password. In the users table in the database, the relevant account information is stored in three columns lowercase_username, lowercase_email, and password_hash_and_salt. There are indices on both lowercase_username and lowercase_email. The authentication logic uses an ORM to perform a SQL query of the form

SELECT … FROM `Users` AS `User` WHERE (`User`.`name` = … OR `User`.`email` = …) LIMIT 1;

and then uses a library-provided constant-time function to hash the submitted password with the record's salt and compare the result to the record's password hash (the computation instead runs on a fake password_hash_and_salt if no record was returned). (I am also using a pepper, but I don't believe that that is relevant to the question.) Then an authentication decision is made based on whether there were both a record found and a password hash match.

My concern is that I might potentially be leaking information about which emails are in the database through variations in the query time. If the query returns faster for email addresses in the database, then an attacker could attempt to log in with the target email address and observe the application's response time. Is that the case, and if so, would the threat be mitigated by the DB index and/or by removing the LIMIT 1 clause from the query?

I admit that in this example I maybe shouldn't be so worried because an attacker could also find out whether an email is in the database by trying to create an account with that email address—despite the registration page's generic error message, there are not that many other reasons that account creation could fail—and the only downside of approach is that the application will send an email to the target if the account creation actually succeeds. But please ignore this alternative attack, since I would still like to know what information my DB query might be leaking through timing and if there are best practices to avoid such leaks.

Does this answer your question? [How to preventing timing attacks against values used for authentication, like emails, stored in Databases?](https://security.stackexchange.com/questions/224981/how-to-preventing-timing-attacks-against-values-used-for-authentication-like-em) — kenlukas, May 24 '22 at 18:45
@kenlukas The discussion there sounds like it's about user enumeration, the question being whether an attacker can exploit variations in the string comparisons to learn about the email addresses in the database. The threat I'm imagining is different (and maybe not a worthwhile threat to consider the more I think about it): if the attacker already has an email address and wants to know if that user uses the site, can they answer that yes/no question with a timing attack on the login page? — limitone, May 24 '22 at 19:05
With a properly optimized index the lookup time does not differ measurable between item found and no item found. The majority of items will take the same or almost the same time to find compared to no item found. Exact details depend on the type of index and how well it is optimized - i.e. in the worst case of a completely unbalanced tree the time might differ significantly. — Steffen Ullrich, May 24 '22 at 20:34
@SteffenUllrich Could you make that an answer so that I can accept it? — limitone, May 24 '22 at 21:12

score 1 · Answer 1 · answered May 24 '22 at 21:35

The timing part of the authentication should not be done in SQL, so you need not worry about it, I think. Just ensure that the function as a whole runs in constant time.

I usually read a microsecond timer before the authentication starts, and take the time when the authentication returns. There is a mandated time (usually 500 milliseconds) that the authentication call will take every time, so basically the logic is,

int authenticate(const char *username, const char *password) {
    int ret;
    // us since server start
    micro_t elapsed = getMicroseconds();

    ret = real_authenticate(username, password);
    // real authentication always returns in less than 78 ms
    // So, not only does the authentication function
    // always require *about* half a second, but there is an
    // uncertainty on the elapsed time that is of the same order
    // of the real authenticator.

    // Realistically, microsleep(500000 - elapsed) would be enough.

    microsleep(450000 - elapsed + my_random(100000));
    return ret;
}

I don't think there is a realistic way of checking the existence of a user with any kind of timing analysis. Maybe some sophisticated power analysis that could tell between "doing something" and sleeping, but that would have to be run on top of the timing analysis. And would need to not run afoul of the denial-of-service mitigations that are usually in place.

Thanks, I would upvote if I had the reputation for it. – limitone May 24 '22 at 22:22 — limitone, May 24 '22 at 22:22

score 1 · Accepted Answer · answered May 24 '22 at 21:39

Given that lookup patterns are not known in advance the data structures and algorithms used in database indices are usually optimized for getting a consistent performance, no matter if a lookup has a positive (item found) or negative (item not found) outcome. This means that there will likely be no significant time difference between a positive and a negative lookup, which is large enough to let an attacker reliably differentiate between a positive and negative lookup.

But, database indices are usually not explicitly designed for constant lookup time, only for best lookup time on average and consistent performance. Depending on the actual type and implementation of the index and its current content, some positive lookups might be done in a measurable shorter time than the majority of negative lookups (and the majority of positive lookups) so there might be some information leaks. This is especially true if an index gets significantly unbalanced. Just imagine a binary tree which in the worst unbalanced case is essentially a sequential list where negative lookup always need to traverse the whole list while positive lookups end much earlier.

What are best practices for finding an account in a SQL database during authentication? Is using `LIMIT 1` vulnerable to timing attacks?

2 Answers2