Shortest way to generate version 3, 4 and 5 UUIDs in PHP

-3

I have this function to generate UUIDs:

function uuid($v=4,$d=null,$s=false)//$v-> version|$d-> data for version 3 and 5|$s-> add salt and pepper
{
    switch($v.($x=''))
    {
        case 3:
            $x=md5($d.($s?md5(microtime(true).uniqid($d,true)):''));break;
        case 4:default:
            $v=4;for($i=0;$i<=30;++$i)$x.=substr('1234567890abcdef',mt_rand(0,15),1);break;
        case 5:
            $x=sha1($d.($s?sha1(microtime(true).uniqid($d,true)):''));break;
    }
    return preg_replace('@^(.{8})(.{4})(.{3})(.{3})(.{12}).*@','$1-$2-'.$v.'$3-'.substr('89ab',rand(0,3),1).'$4-$5',$x);
}

This is far from being short!

The idea is to reduce it as much as possible!

Criteria to meet:

  • It MUST have the format xxxxxxxx-xxxx-vxxx-yxxx-xxxxxxxxxxxx, where x is a hexadecimal digit, y MUST be one of 8, 9, A or B, and v is the version number (required)

  • Only v can be generated randomly for all versions (non-standard, optional)

  • Version 3 and 5 have to generate ALWAYS the same UUID (except for the rule above, required)

  • You must provide a method of making the UUID somewhat random (required, except for version 4)

  • Version 3 uses md5 to "pack" the data, while version 5 uses sha1 (leaving a few chars behind, required)

  • Function name MUST be uuid (required)
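The format criterion above can be checked mechanically. The following is a minimal sketch, not part of the challenge: the helper name is_valid_uuid is hypothetical, and it only verifies the layout, the version digit and the allowed values of y. It does not check whether versions 3 and 5 are deterministic.

```php
<?php
// Hypothetical checker (an assumption for illustration, not from the challenge).
// Returns true when $uuid has the layout xxxxxxxx-xxxx-vxxx-yxxx-xxxxxxxxxxxx,
// with v equal to $version and y one of 8, 9, a or b (case-insensitive).
function is_valid_uuid(string $uuid, int $version): bool
{
    $hex = '[0-9a-f]';
    $pattern = '/^' . $hex . '{8}-' . $hex . '{4}-'
             . $version . $hex . '{3}-'
             . '[89ab]' . $hex . '{3}-'
             . $hex . '{12}$/iD'; // D: "$" must match the very end of the string
    return preg_match($pattern, $uuid) === 1;
}

// Example: the sample output from the answer below is a well-formed version 3 UUID.
var_dump(is_valid_uuid('c478211b-224d-30b1-9116-c06048999ce2', 3));
```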

Scoring:

  • Lower number of chars wins
  • The score is calculated using (number chars)*0.75
  • Readable code is calculated using (number chars)*0.50
  • If one of the required criteria from above isn't met, the multiplier is increased by 0.5 for each criterion, except for the last, which adds 1.25 (the maximum will be (number chars)*4.00, which means that 1 char counts as 4)
  • Comments don't count but anything else between function uuid(...){ and } counts!

For example:

My function would have a crappy result:

It has 451 chars on linux.

Since it is somewhat hard to read, it is *0.75.

Since I fulfilled all the criteria, it stays *0.75.

Result: 451*0.75 = 338.25!

Ismael Miguel

Posted 2014-02-08T08:31:56.307

Reputation: 6 797

Your code doesn't seem to meet your own specification; for $v of 3 and 5, the y half-byte is chosen at random. Also, the criteria are somewhat hard to understand as written. For example, do you mean that versions 3 and 5 must always return the same UUID, given the same data, or do you mean that they actually always return the same value (fairly useless)? – primo – 2014-02-08T08:49:09.033

If I run uuid(3,'this') 4 times, depending on the implementation you choose, you must have the same UUID, except the only char that can be random. – Ismael Miguel – 2014-02-08T08:51:38.107

So this is a specification that you've invented? According to the OSF UUID specification, a version 3 UUID must always return the same value for any given data. – primo – 2014-02-08T08:57:26.453

I know, but I'm letting that one be a little "loose". And it is clearly identified as being optional and NOT standard. All I want is a UUID generator for all 3 of those versions. That one was my example. It works, but it's quite a chunk of frankencode. The idea is to keep it standard. All non-standard "features" are optional, except the "random" part, which can be a simple salt passed to the function. – Ismael Miguel – 2014-02-08T09:00:50.623

Answers

3

PHP - 189 × 0.75 = 141.75

function uuid($v,$d,$s=''){
  $u=hash($v^3?sha1:md5,$v^4?$d.$s:gmp_strval(gmp_random(4)));
  $u[12]=$v;$u[16]=dechex(+"0x$u[16]"&3|8);
  return substr(preg_replace('/^.{8}|.{4}/','\0-',$u,4),0,36);
}

This implementation should be fully compliant with RFC 4122. If $s is provided, it is expected to be the byte string representation of the UUID for the applicable namespace. Otherwise, the default ("NULL") namespace is used.
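For readers who want an ungolfed reference for the name-based versions, here is a minimal sketch. It is not the golfed answer: the function name uuid_named and its parameter order are hypothetical, and it assumes the namespace is passed as a textual UUID and hashed in RFC 4122's order (namespace bytes first, then the name).

```php
<?php
// Hypothetical readable helper (an assumption, not the answer's code).
// $version: 3 (MD5) or 5 (SHA-1); $namespace: a textual UUID; $name: the data.
function uuid_named(int $version, string $namespace, string $name): string
{
    // Turn the textual namespace UUID into its 16-byte binary form.
    $nsBytes = hex2bin(str_replace('-', '', $namespace));

    // Hash namespace bytes followed by the name; keep the first 32 hex digits.
    $algo = $version === 3 ? 'md5' : 'sha1';
    $hex = substr(hash($algo, $nsBytes . $name), 0, 32);

    // Overwrite the version digit (the "v" position).
    $hex[12] = (string) $version;

    // Force the top two bits of the "y" digit to binary 10, giving 8, 9, a or b.
    $hex[16] = dechex((hexdec($hex[16]) & 0x3) | 0x8);

    return sprintf(
        '%s-%s-%s-%s-%s',
        substr($hex, 0, 8),
        substr($hex, 8, 4),
        substr($hex, 12, 4),
        substr($hex, 16, 4),
        substr($hex, 20, 12)
    );
}

// The DNS namespace UUID defined in RFC 4122, Appendix C.
$dns = '6ba7b810-9dad-11d1-80b4-00c04fd430c8';
echo uuid_named(5, $dns, 'python.org'), "\n";
```

Because nothing here is random, repeated calls with the same namespace and name return the same string, which is the behaviour the challenge requires for versions 3 and 5.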

gmp_random(4) is used to generate the 128 bits of entropy, which is just about the best PHP has. If the gmp module isn't available, you could also use this:

openssl_random_pseudo_bytes(16) (requires openssl module to be enabled)

or, as a last resort:

for(;$i++<4;)$r.=mt_rand();
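On PHP 7 or later, which postdates this answer, the built-in random_bytes() is another possible entropy source that needs no extension. The sketch below is an ungolfed illustration under that assumption; the function name uuid_v4 is hypothetical and is not part of the golfed solution.

```php
<?php
// Hypothetical readable version 4 generator (an assumption, not the answer's code).
// Requires PHP 7.0+ for random_bytes().
function uuid_v4(): string
{
    // 16 random bytes = 128 bits; 6 of them are overwritten below.
    $bytes = random_bytes(16);

    // Set the version: high nibble of byte 6 becomes 4.
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);

    // Set the variant: top two bits of byte 8 become binary 10 (digit 8, 9, a or b).
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);

    // Split the 32 hex digits into eight groups of 4 and join them as 8-4-4-4-12.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

echo uuid_v4(), "\n";
```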

Sample usage:

echo uuid(3,'MyCoolNewApp');

Sample output:

c478211b-224d-30b1-9116-c06048999ce2

primo

Posted 2014-02-08T08:31:56.307

Reputation: 30 891

I only see one problem with your function: The version 4 MUST be random. Other than that, it's a really nice implementation. I'm actually pleased with the result. Let's just wait a little longer to see more answers. – Ismael Miguel – 2014-02-08T22:45:14.507

But that is not a completely random string. microtime() and getmypid() aren't good ways of making random values. There is a risk that running that function twice at the exact time with the same pid will generate the same UUID, and that is not so random. But still, it's a good one. I wouldn't do better. – Ismael Miguel – 2014-02-09T02:24:19.797

I don't want true entropy (and i didn't even mention that anywhere). I'm just saying it is not a good way. That is just my opinion. – Ismael Miguel – 2014-02-09T02:29:29.843

@IsmaelMiguel fixed. – primo – 2014-02-09T03:28:15.643

Your code won't work everywhere. It relies on the GMP extension. And using openssl_random_pseudo_bytes(16) on Windows is a bad idea... It simply times out your script. A small idea would be to use md5(mt_rand(0,1e9)). It's always random, between 0 and 1,000,000,000. – Ismael Miguel – 2014-02-09T04:58:31.330

mt_rand(0,1e9) only provides ~30 bits of entropy. But, as mentioned, the concatenation of 4 or 5 mt_rand values could be used, if no other method is available. If openssl hangs on your machine, there's something wrong with the installation. – primo – 2014-02-09T05:18:39.197

openssl_random_pseudo_bytes(16) doesn't work on Windows. Not on mine, not on any Windows! It's a known issue. – Ismael Miguel – 2014-02-09T06:06:56.540

@IsmaelMiguel Is that so? http://i.stack.imgur.com/h9BDl.png – primo – 2014-02-09T06:15:21.273

Well, I change my comment from "any" to "some of". – Ismael Miguel – 2014-02-09T06:16:51.813

@IsmaelMiguel from PHP 5 Changelog, update 5.3.4 dated Dec. 9th, 2010: "Fixed possible blocking behavior in openssl_random_pseudo_bytes on Windows." – primo – 2014-02-09T06:27:54.653

Not everyone has updated to PHP 5.3.4. Most likely, most IIS servers are running PHP 5.3 or 5.4. – Ismael Miguel – 2014-02-09T06:31:18.857

Firstly, the code has a syntax error: the last closing parenthesis is missing from the end. Second, this is impressive, but it does not meet the RFC 4122 spec for either scenario (grouping 3 and 5 as essentially the same). For 3 & 5, there is no call (or need, imo) for a salt, as the expectation is to create the same UUID for a given name and namespace. There is also no namespace being used, which is a requirement, as the hash is of the packed binary string of the namespace UUID concatenated with the name string. For version 4 this doesn't quite work because it should be more random. – Anthony – 2014-06-27T16:52:13.573

Also +1 for using bitwise operation to properly flip the 8th octet – Anthony – 2014-06-27T17:42:17.340

@Anthony thanks for the feedback. Addressing your points in order: 1) Added the missing parenthesis. Not sure how that got dropped. 2) For 3 & 5, it is stated: "If [the salt] is provided, it is expected to be the byte string representation of the UUID for the applicable namespace." A fairly large caveat, to be sure, but given the proper input, the output is entirely compliant with RFC 4122. 3) gmp_random(4) produces a 128-bit random value. Only 122 bits are necessary to produce a v4 UUID. – primo – 2014-06-28T04:26:01.397