51

I'm currently working on a "helper function" for PHP's core to make password hashing more secure and easier for the majority of developers. Basically, the goal is to make it so easy, that it's harder to invent your own implementation than to use the core secure one. It's also designed to be updated and extended in the future as hashing methods improve.

I've written an RFC for the addition, and have an implementation of the functions as well.

The basic premise is that crypt is too difficult to directly use correctly for the majority of developers. The weird error returns, the base64-like (but using a different alphabet) salts, etc. So these functions are designed to take that guesswork out, and provide a dirt simple API.

string password_hash(string $password, string $algo = PASSWORD_DEFAULT, array $options = array())
bool password_verify(string $password, string $hash)
string password_make_salt(int $length, bool $raw_output = false)

Password_Hash takes in a password, an optional algorithm specifier (right now, only the improved CRYPT_BCRYPT implementation is supported, but would like to add scrypt as an option later), and a options array. The options array can specify the cost parameter to bcrypt, as well as a predefined salt value.

Password_Verify accepts a password, and an existing hash. It then re-hashes the password (identical to $tmp = crypt($password, $hash)). Then, it uses a constant-time comparison function to determine if the two hashes are indeed equal.

Password_Make_Salt exists to generate a random string of the given length. If raw_output is set to false (default), the outputted "salt" will be base64 encoded in a way that's directly compatible with crypt(). If raw_output is true, it will return the same length string using random full-bytes (0-255).

Example:

$hash = password_hash("foo");
if (password_verify("foo", $hash)) {
    // always should be true
} else {
}

A note on reading PHP's source code: PHP functions (those exposed to php code) are surrounded by the PHP_FUNCTION() macro. Additionally, php variables (zval's) are used in a few places. The macros that access parts of them are

  • Z_TYPE_P() (finding the type of a pointer to a zval)
  • Z_STRVAL_P() (get the pointer to the string value)
  • Z_STRLEN_P() (get the int length of the string type)
  • Z_LVAL_P() (get the long value of the integer type)

Additionally, zval_ptr_dtor() is a refcount mechanism to decrease the refcount on a zval, and clean it up if it hits 0.

I'm looking for a review of the implementation by at least a few security experts prior to officially proposing the change. It's reasonably short (only about 300 lines of code)...

Update

The API has been approved, so I've set up a Pull Request for the functionality. Please review it if you have time. Thanks!: https://github.com/php/php-src/pull/191

ircmaxell
  • 1,416
  • 12
  • 16
  • 11
    Your work is very much appreciated! If PHP has built-in support for bcrypt, the discussions about "_how much security is necessary for storing passwords_" will hopefully come to an end. – martinstoeckli Jun 27 '12 at 11:22
  • 3
    @martinstoeckli: Thanks! PHP already supports bcrypt. It did from 5.0 (if complied against a recent version of crypt), and from 5.3 (includes its own implementation). So it's already available, just not really easy to work with... – ircmaxell Jun 27 '12 at 11:26
  • 2
    Yes i know it did, i have published example code myself, to explain the problem and to show that there is no reason against using it. But there was still the problem, that a developer had to make an effort to do it right. – martinstoeckli Jun 27 '12 at 11:37
  • Though I'm not an expert and can't help, you should probably contact the author of phpass - I'm pretty sure there is a lot of potential for synergies, and he seems to be very proficient in this area. I also noticed you didn't reference him in the RFC (or maybe I just missed it) - if you didn't know phpass, you should definitely have a look at his implementation: http://www.openwall.com/phpass/ **Edit:** Sorry - you did reference PHPASS - just not in the links. – Arne Jul 24 '12 at 13:49
  • 2
    Sir, allow me to thank you and I'd really love to shake your hand. Very excellent work. Your work will be appreciated by the whole PHP community. – Adi Apr 29 '13 at 16:49
  • 1
    As every single developer will thank you for your work, my case would not differ. But two questions might come here, why are we using an encryption algorithm over hashing one like SHA-512, even though I read the references, but the only lack there is the rounding time which you referred to as cost which should be easily handeled. Second, the possibility of knowing the user password would also violate data integrity even from administrator side, or at least I think so. – mamdouh alramadan Nov 29 '13 at 10:01

3 Answers3

6

In C, I usually recommend that you should be consistently using size_t to store all length values, not int or long. Using int and long exposes you to integer signed/unsigned bugs. Reference: INT01-C from the CERT C Secure Coding Standard.

However, there is a complication in this case, due to the way the PHP native code interface works:

  • When you read an integer using zend_parse_parameters(), PHP expects you to store it in a long. Therefore, you should pass zend_parse_parameters() a pointer to a long, then convert the long to a size_t carefully: validate that the number is non-negative and falls within the range for size_t (l >= 0 && l < SIZE_MAX), then cast it to a size_t and use a size_t thereafter.

  • When you read a string using zend_parse_parameters(), PHP expects you to pass a buffer to store the string and a pointer to an int to store the string length. Therefore, you should pass zend_parse_parameters() what it is expecting. Then, carefully convert the int to a size_t: validate that the length is non-negative and falls within the range for size_t (len >= 0 && len < SIZE_MAX), then cast it to a size_t and use a size_t thereafter.

    Similarly, Z_STRLEN_PP() returns an int (I think) so you'll probably want to do the same for it.

(Beware of passing a long as a length to memcpy(); I don't know for sure, but that code smells pretty dubious to me and of high risk for an integer casting/truncation bug.)

buffer = php_base64_encode((unsigned char*) str, str_len, NULL);
for (pos = 0; pos < out_len; pos++) {
    if (buffer[pos] == '+') {
         ...
    } else {
         ret[pos] = buffer[pos];

That looks like it could read past the end of buffer to me. I think you need an additional check to make sure that pos is smaller than the length of buffer.

D.W.
  • 98,420
  • 30
  • 267
  • 572
  • It's not an off-by-one error. The malloc call that makes it (`safe_emalloc`) already appends 1 to the allocated length to ensure that the null byte won't overflow the length. As far as the sizes, that's a very good point, I'll add those changes to the list... Thanks! – ircmaxell Aug 21 '12 at 16:21
  • Thanks, @ircmaxell! I've deleted the erroneous statement about off-by-one error. – D.W. Aug 21 '12 at 18:15
  • I also edited it to provide a bit more information about how to use `size_t` safely (based upon my understanding of PHP internals). – D.W. Aug 21 '12 at 18:36
1

Very nice initiative!


Quick comments
(I will read the RFC in more detail later.)

'salts only need to be unique in a system' is a faulty assumption; please also read: this answer on salting

I do not see what the need is to manually provide a salt for the BCrypt algorithm. As far as I can see, it only opens up an option for people to make unnecessary mistakes.

password_make_salt seems to be more related to the crypt_ family of functions; maybe rename it crypt_make_salt instead, as to avoid confusion.

the first Basic usage example promotes the use of a constant value 'usesomesillystringfor', many developers will make the mistake of using some sort of constant value instead of a proper (random) salt.

The password_make_salt should use the best possible random source, cryptographically safe if at all possible (I'm not sure about the php_win32_get_random_bytes() quality, the /dev/urandom is good)

the user_needs_rehash.php example is not very readable; could use a cleanup.

the specify_salt.php example suggests that you can supply random bytes as a valid salt value for the BCrypt algorithm. This contradicts the requirement of the custom base64 alphabet for the BCrypt.

Jacco
  • 7,402
  • 4
  • 32
  • 53
  • Thanks for the feedback! At that point, why not using a UUID or GUID for a salt (generated at hash time)...? As far as the option to add a salt, it was requested (and can be useful for other algorithms). As far as _make_salt, it's going to be changed to something like random_string() with a mode for CRYPT formatting. `php_win32_get_random_bytes()` is crypto safe (wraps the windows API). And you can supply a full random string for salt, it will automatically serialize it correctly... – ircmaxell Aug 21 '12 at 16:17
  • *"I'm not sure about the php_win32_get_random_bytes() quality"* - `php_win32_get_random_bytes()` looks fine. It calls `CryptGenRandom()`, which is a high-quality source of crypto-strength pseudorandom numbers. See [the code](http://lxr.sweon.net/php/http/source/win32/winutil.c#L52). – D.W. Aug 21 '12 at 18:39
  • 1) I agree the salt-value argument could be usefull for other alogithms, but for BCrypt, I think it is a bad idea. 2) I'm not familiar enough with UUID/GUID to be able to comment on the implications, but from the wikipedia info it looks like it could be usefull. 3) if the full random string is automatically serialized correctly, then this should be noted explicitly in the argument description. (maybe it is there, but I didn't see it) – Jacco Aug 21 '12 at 20:32
  • @ircmaxell Most GUIDS and UUID's do not have very much entropy in them, a lot of the source bit-fields are fixed based on the machine that generated them and sequential a time stamps. See section 4.1 of the [RFC for UUID](http://www.ietf.org/rfc/rfc4122.txt) – Scott Chamberlain Sep 15 '12 at 07:10
0

From an hashing perspective, using Bcrypt is one of the most recommended techniques these days, so it’s good.

Letting the user choose to use another algorithm (such as Scrypt, and maybe their own implementation later?) would also be a great option: By default (and for non-expert PHP developers), they’ll just have to use the default BCrypt, and if they want to, they’ll be able to adapt for their needs.

TRiG
  • 609
  • 5
  • 14
Cyril N.
  • 2,649
  • 2
  • 18
  • 28
  • 2
    -1. Have you read RFC and code before submitting answer? 1. scrypt is planned to be added in the future. 2. `password_verify` aims to be resistant to timing attacks. It's not just `password_hash("foo") === $hash` – Andrei Botalov Jun 27 '12 at 08:54
  • What does scrypt being planned in the future have any link to my answer? Regarding your #2, you're right, I updated my answer. – Cyril N. Jun 27 '12 at 09:05
  • I don't see a reason in letting users choose another algorithm now. bcrypt is quiet secure. Adding less secure algorithm isn't good. scrypt is quiet new so it's not recommended e.g. Thomas Pornin – Andrei Botalov Jun 27 '12 at 09:12
  • 1
    The idea behind allowing some other algorithm is the "future" case. Maybe BCrypt will be broken tomorrow, and a new algorithm will be defined as better/stronger. In that case, user will still be able to implement it without waiting for an update from PHP to fix this security breach. (quiet means silence, did you meant "quite"?) – Cyril N. Jun 27 '12 at 09:19
  • 4
    By the way, I appreciate you commented your -1. I hate it when people just hit -1 without any explanation :) – Cyril N. Jun 27 '12 at 09:19