The application takes several inputs (notably, private personal information) to produce a fixed but non-guessable hash.

Simplified example:

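// Concatenate the public site name with the private answers.
$data = $websiteDomain . $myChildSchool . $myPetName . $etc;

// Key stretching: one million rounds of raw (binary) SHA-512.
for ($i = 0; $i < 1000000; ++$i) {
    $data = hash('sha512', $data, true);
}

// Base64-encode, then drop the non-alphanumeric characters.
$data = str_replace(['/', '+', '='], '', base64_encode($data));

// Truncate to the length the target website accepts.
$result = substr($data, 0, $passwordLength);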

The goal is to publish code that generates hashes, which are then used as passwords. The important part: the passwords can be recovered using only the public code and the private personal information.

Because of this, I can't use password_hash(): the result is automatically salted, so it produces a different hash each time. I can't provide a salt manually either, as that option is deprecated (it produces a warning and is likely to be removed in future PHP versions).

My main concern is that some of the passwords could be used on crappy websites that have small maxlength restrictions and store user passwords as plaintext. I'm therefore assuming attackers could steal the hashed data (the generated password) from such a website and use brute force to work back to the private data.

In the above example I'm using SHA-512 with key stretching. However, SHA isn't really designed for this: GPUs are incredibly fast at it, and techniques may exist, now or later, to circumvent all or part of the iterations.

So, how acceptable is the current technique? Could it be improved, and if so, how? Ideally, the solution shouldn't be PHP-specific, and shouldn't require external libraries.

Gras Double
  • How are your users supposed to log in? Do they have to use their personal information instead of a password? – simon Nov 15 '16 at 13:42
  • It would be for personal use. The purpose is to be able to recover passwords, even twenty years from now, naked and from the other side of the planet, using only my brain and an internet connection. I can't remember "l33tpa$sw0rd" more than one hour, though I can surely remember my parents' names and so on. The real application would use *a lot* of personal information. – Gras Double Nov 15 '16 at 13:48
  • Follow-up question, about including the input in each iteration, as PBKDF2 does: [Include input data in key stretching?](http://security.stackexchange.com/questions/146441/include-input-data-in-key-stretching) – Gras Double Dec 28 '16 at 13:25

1 Answer

... attackers could steal the hashed data ... and use brute force to deduce back the private data.

There may be some practical issues regarding how much entropy there is in the input data:

$websiteDomain . $myChildSchool . $myPetName . $etc

If some of this info is known to the attacker, then those parts contribute no entropy. The remaining unknown portions could be brute-forced by the attacker if the remaining entropy becomes too low. You should consider:

  • Bits of entropy of the input data after the attacker has acquired the 'low-hanging fruit'.
  • How sensitive is the remaining input data, and what are the potential damages of this being guessed?
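The first point can be made concrete with a rough estimate. The sketch below assumes each answer the attacker does not already know is picked uniformly from a candidate list of a given size, which real answers rarely are, so treat the result as an upper bound. The function name and the candidate counts are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper: sums log2(candidates) over the fields the attacker
// still has to guess. Assumes uniform, independent choices (optimistic).
function remainingEntropyBits(array $candidateCounts): float
{
    $bits = 0.0;
    foreach ($candidateCounts as $count) {
        $bits += log($count, 2);
    }
    return $bits;
}

// Illustrative numbers only (assumptions, not measurements).
// The website domain is public, so it is left out entirely.
$unknownFields = [
    'pet name'     => 10000, // plausible pet names
    'child school' => 500,   // schools in the attacker's search area
];

printf("%.1f bits remaining\n", remainingEntropyBits($unknownFields));
```

With these made-up numbers the total is about 22 bits, roughly five million guesses, which is why many independent answers are needed before a slow hash can carry the rest of the load.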

The purpose is to be able to recover passwords ... using only my brain and an internet connection. I can't remember l33tpa$sw0rd more than one hour, though I can surely remember my parents' names and so on.

Note that every 'secret question' must be answered precisely character-for-character in order for the resulting hash to be consistent. Also add up the bits of entropy to make sure they are sufficient.
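One way to reduce the risk of a character-for-character mismatch is to normalise each answer before hashing. This is a minimal sketch: the helper names, the normalisation rules, and the separator are all assumptions, and whatever rules are chosen become part of the published scheme and must never change afterwards.

```php
<?php
// Hypothetical helper: canonicalise one answer so that stray spaces or
// capitalisation do not change the derived password.
function normalizeAnswer(string $answer): string
{
    $answer = trim($answer);
    $answer = preg_replace('/\s+/', ' ', $answer); // collapse whitespace runs
    return strtolower($answer);                    // lowercases ASCII letters; non-ASCII handling varies by PHP version
}

// Hypothetical helper: join the normalised answers with a separator so
// that ['ab', 'c'] and ['a', 'bc'] do not produce the same input.
function buildInput(array $answers): string
{
    return implode("\n", array_map('normalizeAnswer', $answers));
}
```

Accented or otherwise non-ASCII answers would need extra care (for example, a fixed Unicode normalisation form), which this sketch does not attempt.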

Please read this description of the pros and cons of each popular Slow Password Hash with implementation notes. The best options are BCrypt, SCrypt and Argon2, with the first being the best vetted and the last being the newest one on the market. Personally I would use BCrypt for this project.

I can't use password_hash() because the result is automatically salted so it produces different hashes each time. I can't manually provide a salt as it is deprecated (produces warning, and likely to be removed in future PHP versions).

This function uses BCrypt by default, which is one of the good options. You would of course need to specify a hard-coded (unchanging) Salt (which in this situation is considered to be Pepper), and it is best if you can set a custom Work Factor. StackOverflow would be able to help you with those two specific requirements to be implemented in PHP.
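As a rough sketch of the fixed-salt idea: since the `salt` option of password_hash() is deprecated, the older crypt() function can be given a complete bcrypt setting string instead. The constant `FIXED_SALT`, the function `derivePassword()`, the salt value, and the default cost of 12 are all hypothetical choices for illustration, not a vetted design. The input is prehashed because bcrypt only reads the first 72 bytes of its input.

```php
<?php
// Hypothetical public salt: exactly 22 characters from ./A-Za-z0-9.
const FIXED_SALT = 'PublicFixedSalt.123456';

// Hypothetical helper: deterministic bcrypt-based password derivation.
function derivePassword(string $input, int $length, int $cost = 12): string
{
    // Prehash: raw SHA-384 is 48 bytes, so its base64 form is 64 characters
    // with no NUL bytes, safely below bcrypt's 72-byte limit.
    $prehash = base64_encode(hash('sha384', $input, true));

    // Bcrypt setting string: "$2y$", two-digit cost, "$", 22-character salt.
    $setting = sprintf('$2y$%02d$%s', $cost, FIXED_SALT);
    $hash = crypt($prehash, $setting);
    if (strlen($hash) !== 60) {
        throw new RuntimeException('bcrypt failed; check cost and salt');
    }

    // Skip the 29-character prefix (setting and salt) and keep only the
    // alphanumeric characters of the 31-character hash part.
    $encoded = str_replace(['.', '/'], '', substr($hash, 29));
    return substr($encoded, 0, $length);
}
```

The hash part of a bcrypt string is only 31 characters long, so this sketch cannot produce passwords longer than that; a call such as `derivePassword($input, 16)` stays well within the limit.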

700 Software
  • You understood the case very well, particularly the remaining entropy after the low-hanging fruit has been figured out. Although not recommended, I'll consider bcrypt with a fixed salt, which would be public (and not forgetting to prehash the input because of the 72-byte truncation). Probably still better than simply looping SHA. There is a lot of entropy in the input, but it is critical this input is not guessed. – Gras Double Nov 15 '16 at 16:22