Convert Salesforce 15-character ID to 18-character

20

3

In Salesforce CRM, every object has a 15-character alphanumeric ID, which is case-sensitive. (If anyone's curious, it's actually a base-62 number.) However, tools used for data migration and integration may or may not support case sensitivity. To overcome that, IDs can be safely converted to 18-character case-insensitive alphanumeric IDs. In that process, a 3-character alphanumeric checksum is appended to the ID. The conversion algorithm is:

Example:

a0RE000000IJmcN
  1. Split the ID into three 5-character chunks.

    a0RE0  00000  IJmcN
    
  2. Reverse each chunk.

    0ER0a  00000  NcmJI
    
  3. Replace each character in every chunk with 1 if it's an uppercase letter, or with 0 otherwise (digits and lowercase letters both become 0).

    01100  00000  10011
    
  4. For each 5-digit binary number i, get the character at position i in the concatenation of the uppercase alphabet and the digits 0-5 (ABCDEFGHIJKLMNOPQRSTUVWXYZ012345).

    00000 -> A,
    00001 -> B,
    00010 -> C, ...,
    11001 -> Z,
    11010 -> 0, ...,
    11111 -> 5
    

    Yielding:

    M  A  T
    
  5. Append these characters, the checksum, to the original ID.

Output:

a0RE000000IJmcNMAT
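
For readers who want something ungolfed, the five steps above can be sketched in Python roughly as follows. This is only an illustration: the names ALPHABET and to_18_char_id are made up here and are not part of the challenge, and no input validation is attempted.

```python
# Lookup table from step 4: uppercase alphabet followed by the digits 0-5.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ012345"


def to_18_char_id(id15: str) -> str:
    """Append the 3-character checksum to a 15-character ID (hypothetical helper)."""
    checksum = ""
    for start in range(0, 15, 5):
        chunk = id15[start:start + 5]   # step 1: take a 5-character chunk
        reversed_chunk = chunk[::-1]    # step 2: reverse it
        # step 3: 1 for an uppercase letter, 0 for anything else (digits included)
        bits = "".join("1" if "A" <= c <= "Z" else "0" for c in reversed_chunk)
        checksum += ALPHABET[int(bits, 2)]  # step 4: look up the character
    return id15 + checksum              # step 5: append the checksum


print(to_18_char_id("a0RE000000IJmcN"))  # a0RE000000IJmcNMAT
```

Note that digits are simply "not uppercase", so a chunk such as 62mPg becomes gPm26 when reversed and then 01000, which is 8 and maps to I.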

Write a program or function which takes a 15-character alphanumeric (ASCII) string as input and returns the 18-character ID.

Input validation is out of scope of this question. Programs may return any value or crash on invalid input.

Please don't use Salesforce proprietary languages' features that make this challenge trivial (such as the formula CASESAFEID(), converting an Id to a String in APEX, etc.).

Test Cases

a01M00000062mPg    -> a01M00000062mPgIAI
001M000000qfPyS    -> 001M000000qfPySIAU
a0FE000000D6r3F    -> a0FE000000D6r3FMAR
0F9E000000092w2    -> 0F9E000000092w2KAA
aaaaaaaaaaaaaaa    -> aaaaaaaaaaaaaaaAAA
AbCdEfGhIjKlMnO    -> AbCdEfGhIjKlMnOVKV
aBcDEfgHIJKLMNO    -> aBcDEfgHIJKLMNO025

Trang Oul

Posted 2016-01-11T09:06:31.570

Reputation: 656

3 Sadly, converting a string to an Id in Apex Code still wouldn't be shorter than some of the answers provided here, especially if the code must be self-contained. Apex Code is not well-suited for golfing. – phyrfox – 2016-01-12T06:34:10.173

2 @phyrfox as a former Salesforce dev. Apex isn't suited for much... – Mike McMahon – 2016-01-12T09:08:30.760

2 APEX, 56 bytes: public class X{public X(Id i){System.debug((String)i);}}. Works only with valid Salesforce IDs, though. – Trang Oul – 2016-01-12T11:49:45.693

I came here looking to actually do this for work (https://success.jitterbit.com/display/DOC/Formula+Builder+String+Function), not golf, but I'm a little confused by the description of the algorithm. You say each reversed-and-sanitized chunk in step 4 will be a "binary number," but you never replace digits 2-8 with 0's and 1's. What exactly am I supposed to do for step 4 when steps 1-3 on a chunk like "62mPg" have resulted in a number like "01026"? – k.. – 2017-10-27T18:04:37.157

Answers

6

Pyth, 23 22 bytes

1 byte saved by FryAmTheEggman.

sm@s+JrG1U6i}RJ_d2c3pz

Try it online. Test suite.

This might be the first time I've used the print instruction in golfing.

Explanation

     JrG1                   save uppercase alphabet in J
                     z      input string
                    p       print it without newline
                  c3        split into 3 parts
 m              d           for each part:
               _              reverse
            }R                map characters to being in
              J                 uppercase alphabet (saved in J)
           i     2            parse list of bools as binary
  @                           get correct item of
     J                          uppercase alphabet (saved in J)
   s+    U6                     add nums 0-5 to it
s                           concatenate and print

PurkkaKoodari

Posted 2016-01-11T09:06:31.570

Reputation: 16 699

6

Ruby, 97 bytes

->s{s+s.scan(/.{5}/).map{|x|[*?A..?Z,*?0..?5][x.reverse.gsub(/./){|y|y=~/[^A-Z]/||1}.to_i 2]}*''}
->s{               # define an anonymous lambda
s+                 # the original string plus...
s.scan(/.{5}/)     # get every group of 5 chars
.map{|x|           # map over each group of 5 chars...
[*?A..?Z,*?0..?5]  # build the array of A-Z0-5
[                  # index over it with...
x.reverse          # the 5-char group, reversed...
.gsub(/./){|y|     # ... with each character replaced with...
y=~/[^A-Z]/||1     # ... whether it's uppercase (0/1)...
}.to_i 2           # ... parsed as a base-2 number
]                  # (end index)
}*''               # end map, join into a string
}                  # end lambda

This one's got some really neat tricks.

My original instinct for splitting the string into groups of 5 chars was each_slice:

irb(main):001:0> [*1..20].each_slice(5).to_a
=> [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15], [16, 17, 18, 19, 20]]

Turns out that's waaay too long compared to a simple regex (x.chars.each_slice(5) vs. x.scan(/.{5}/)). This seems obvious in hindsight, but I never really thought about it... perhaps I can optimize some of my old Ruby answers here.

The thing I'm most proud of in this answer, though, is this piece of code:

y=~/[^A-Z]/||1

Alright, so here's some background for the non-Rubyists. Ruby completely separates booleans (TrueClass, FalseClass) from integers/numbers (Numeric)—that means there's no automatic conversion from true to 1 and false to 0 either. This is annoying while golfing (but a good thing... for all other purposes).

The naïve approach to checking whether a single character is uppercase (and returning 1 or 0) is

y.upcase==y?1:0

We can get this down a little further (again, with a regex):

y=~/[A-Z]/?1:0

But then I really started thinking. Hmm... =~ returns the index of a match (so, for our single character, always 0 if there's a match) or nil on failure to match, a falsy value (everything else except FalseClass is truthy in Ruby). The || operator takes its first operand if it's truthy, and its second operand otherwise. Therefore, we can golf this down to

y=~/[^A-Z]/||1

Alright, let's look at what's happening here. If y is an uppercase letter, it will fail to match [^A-Z], so the regex part will return nil. nil || 1 is 1, so uppercase letters become 1. If y is anything but an uppercase letter, the regex part will return 0 (because there's a match at index 0), and since 0 is truthy, 0 || 1 is 0.

... and only after writing all of this out do I realize that this is actually the same length as y=~/[A-Z]/?1:0. Haha, oh well.

Doorknob

Posted 2016-01-11T09:06:31.570

Reputation: 68 138

4

MATL, 24 bytes

j1Y24Y2hG5IePtk=~!XB1+)h

Uses current version (9.1.0) of the language/compiler.

Examples

>> matl
 > j1Y24Y2hG5IePtk=~!XB1+)h
 >
> a0RE000000IJmcN
a0RE000000IJmcNMAT

>> matl
 > j1Y24Y2hG5IePtk=~!XB1+)h
 >
> a01M00000062mPg
a01M00000062mPgIAI

Explanation

j            % input string
1Y2          % predefined literal: 'ABC...Z'
4Y2          % predefined literal: '012...9'
h            % concatenate into string 'ABC...Z012...9'
G            % push input string
5Ie          % reshape into 5x3 matrix, column-major order
P            % flip vertically
tk=~         % 1 if uppercase, 0 if lowercase
!XB1+        % convert each column to binary number and add 1
)            % index 'ABC...Z012...9' with resulting numbers
h            % concatenate result with original string

Luis Mendo

Posted 2016-01-11T09:06:31.570

Reputation: 87 464

3

JavaScript (ES6), 108

x=>x.replace(/[A-Z]/g,(x,i)=>t|=1<<i,t=0)+[0,5,10].map(n=>x+='ABCDEFGHIJKLMNOPQRSTUVWXYZ012345'[t>>n&31])&&x

Test

f=x=>x.replace(/[A-Z]/g,(x,i)=>t|=1<<i,t=0)+[0,5,10].map(n=>x+='ABCDEFGHIJKLMNOPQRSTUVWXYZ012345'[t>>n&31])&&x

// Less golfed

U=x=>{
  x.replace(/[A-Z]/g,(x,i)=>t|=1<<i,t=0); // build a 15-bit number (no need to reverse explicitly)
  // convert 't' to 3 numbers of 5 bits each, then to the right char A..Z 0..5
  [0,5,10].forEach(n=> // the 3 shift amounts
    x += 'ABCDEFGHIJKLMNOPQRSTUVWXYZ012345' // to convert value to char
     [ t>>n&31 ] // shift and mask
  );
  return x
}

console.log=x=>O.innerHTML+=x+'\n';

;[
  ['a01M00000062mPg','a01M00000062mPgIAI']
, ['001M000000qfPyS','001M000000qfPySIAU']
, ['a0FE000000D6r3F','a0FE000000D6r3FMAR']
, ['0F9E000000092w2','0F9E000000092w2KAA']
, ['aaaaaaaaaaaaaaa','aaaaaaaaaaaaaaaAAA']
, ['AbCdEfGhIjKlMnO','AbCdEfGhIjKlMnOVKV']
, ['aBcDEfgHIJKLMNO','aBcDEfgHIJKLMNO025']
].forEach(t=>{
  var i=t[0],x=t[1],r=f(i);
  console.log(i+'->'+r+(r==x?' OK':' Fail (expected '+x+')'));
})
<pre id=O></pre>
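
For anyone who doesn't read JavaScript, the bitmask idea can be sketched in Python like this. It is a rough translation under my own naming (checksum_via_bitmask and ALPHABET are hypothetical, not taken from the answer): bit i of the mask is set when character i is an uppercase letter, and each group of 5 bits is then read off with a shift and a mask. No explicit reversal is needed because the first character of a chunk already lands in the least significant bit, which is exactly what "reverse, then parse as binary" produces.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ012345"


def checksum_via_bitmask(id15: str) -> str:
    """Return only the 3 checksum characters (hypothetical helper)."""
    mask = 0
    for i, c in enumerate(id15):
        if "A" <= c <= "Z":
            mask |= 1 << i  # bit i is set when character i is uppercase
    # Bits 0-4, 5-9 and 10-14 each form one 5-bit index into ALPHABET.
    return "".join(ALPHABET[(mask >> shift) & 31] for shift in (0, 5, 10))


print(checksum_via_bitmask("a0RE000000IJmcN"))  # MAT
```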

edc65

Posted 2016-01-11T09:06:31.570

Reputation: 31 086

2

JavaScript (ES6), 137 132 bytes

s=>s+s.replace(/./g,c=>c>"9"&c<"a").match(/.{5}/g).map(n=>"ABCDEFGHIJKLMNOPQRSTUVWXYZ012345"[0|"0b"+[...n].reverse().join``]).join``

4 bytes saved thanks to @ՊՓԼՃՐՊՃՈԲՍԼ!

Explanation

This challenge is not suited for JavaScript at all. There's no short way to reverse a string and it looks like the shortest way to convert the number to a character is to hard-code each possible character.

s=>
  s+                                   // prepend the original ID
  s.replace(/./g,c=>c>"9"&c<"a")       // convert each upper-case character to 1
  .match(/.{5}/g).map(n=>              // for each group of 5 digits
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ012345"
    [0|"0b"+                            // convert from binary
      [...n].reverse().join``]          // reverse the string
  ).join``

If the digits in the checksum were allowed to be lower-case it could be done in 124 bytes like this:

s=>s+s.replace(/./g,c=>c>"9"&c<"a").match(/.{5}/g).map(n=>((parseInt([...n].reverse().join``,2)+10)%36).toString(36)).join``

Test

var solution = s=>s+s.replace(/./g,c=>c>"9"&c<"a").match(/.{5}/g).map(n=>"ABCDEFGHIJKLMNOPQRSTUVWXYZ012345"[0|"0b"+[...n].reverse().join``]).join``
<input type="text" id="input" value="AbCdEfGhIjKlMnO" />
<button onclick="result.textContent=solution(input.value)">Go</button>
<pre id="result"></pre>

user81655

Posted 2016-01-11T09:06:31.570

Reputation: 10 181

If I'm not mistaken, parseInt([...n].reverse().join``,2) could be changed to +`0b${[...n].reverse().join``}`. – Mama Fun Roll – 2016-01-12T02:07:40.057

@ՊՓԼՃՐՊՃՈԲՍԼ Right you are! I saved another byte on top of that too, thanks. – user81655 – 2016-01-12T06:54:00.483

Save 10 whole bytes by using .replace(/.{5}/g,n=>/*stuff*/). – Neil – 2016-01-13T17:50:17.627

2

MATLAB, 100 98 bytes

s=input('');a=flip(reshape(s,5,3))';e=['A':'Z',48:53];disp([s,e(bin2dec(num2str(a~=lower(a)))+1)])

A string will be requested as input and the output will be displayed on the screen.

Explanation

I'm probably using the most straight-forward approach here:

  • Request input
  • Reshape to 5 (rows) x 3 (columns)
  • Flip the row order
  • Transpose the matrix to prepare it for being read as binary
  • Allocate the ABC...XYZ012345 array
  • Compare the character indices of the transposed matrix to its lower-case equivalent and convert the booleans to strings, which are then read as binary and converted to decimal.
  • Interpret these decimals (incremented by 1) as indices of the allocated array.
  • Display the input with the additional 3 characters
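
The same reshape / flip / transpose sequence can be imitated in plain Python with zip, which may help readers without MATLAB follow the list above. This is only a sketch: checksum_via_matrix and ALPHABET are names invented for this illustration, and tuples of characters stand in for the MATLAB character matrix.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ012345"


def checksum_via_matrix(id15: str) -> str:
    """Return the 3 checksum characters using the matrix steps (hypothetical helper)."""
    # MATLAB's reshape fills column by column, so each 5-character chunk is a column.
    columns = [id15[i:i + 5] for i in range(0, 15, 5)]
    matrix = list(zip(*columns))      # 5 rows x 3 columns
    flipped = matrix[::-1]            # flip the row order
    transposed = list(zip(*flipped))  # 3 rows x 5 columns: each row is a reversed chunk
    result = ""
    for row in transposed:
        # A character differs from its lowercase form only if it is an uppercase letter.
        bits = "".join("1" if c != c.lower() else "0" for c in row)
        result += ALPHABET[int(bits, 2)]  # 0-based here, hence no "+1" as in MATLAB
    return result


print(checksum_via_matrix("a0RE000000IJmcN"))  # MAT
```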

Now below 100 bytes thanks to Luis Mendo!

slvrbld

Posted 2016-01-11T09:06:31.570

Reputation: 619

1 You can save a little using e=['A':'Z',48:53] – Luis Mendo – 2016-01-11T22:15:01.813

I see my approach is almost the same as yours :-) – Luis Mendo – 2016-01-11T22:16:02.537

2

CJam, 27 bytes

l_5/{W%{_el=!}%2bH+43%'0+}%

Run all test cases.

A fairly straightforward implementation of the spec. The most interesting part is the conversion to characters in the checksum. We add 17 to the result of each chunk. Take that modulo 43 and add the result of that to the character '0.
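
The add-17-modulo-43 trick is easy to check outside CJam. Here is a small Python sketch (the function name checksum_char is made up for this illustration). It relies on the ASCII codes: '0' is 48 and 'A' is 65, which is 48 + 17. Values 0 to 25 become 17 to 42 and land on A to Z, while values 26 to 31 become 43 to 48, wrap around to 0 to 5, and land on the digits.

```python
def checksum_char(n: int) -> str:
    """Map a 5-bit value (0-31) to its checksum character without a lookup table."""
    # ord('0') == 48 and ord('A') == 65 == 48 + 17
    return chr(48 + (n + 17) % 43)


# Prints the same table as in the challenge: ABCDEFGHIJKLMNOPQRSTUVWXYZ012345
print("".join(checksum_char(n) for n in range(32)))
```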

Martin Ender

Posted 2016-01-11T09:06:31.570

Reputation: 184 808

2

Japt, 46 bytes

U+U®f"[A-Z]" ?1:0} f'.p5)®w n2 +A %36 s36 u} q

Not too happy with the length, but I can't find a way to golf it down. Try it online!

ETHproductions

Posted 2016-01-11T09:06:31.570

Reputation: 47 880

2

PHP, 186 181 bytes

<?$z=$argv[1];$x=str_split($z,5);$l="ABCDEFGHIJKLMNOPQRSTUVWXYZ012345";foreach($x as$y){foreach(str_split(strrev($y))as$a=>$w)$y[$a]=ctype_upper($w)?1:0;$z.=$l[bindec($y)];}echo $z;

Ungolfed

<?php
$z = $argv[1];
$x = str_split($z,5);
$l = "ABCDEFGHIJKLMNOPQRSTUVWXYZ012345";
foreach($x as $y) {
    foreach( str_split( strrev($y) ) as $a => $w) {
        $y[$a] = ctype_upper($w) ? 1 : 0;
    }
    $z .= $l[bindec($y)];
}
echo $z;

I started out thinking I could make it much shorter than this, but I ran out of ideas to make it shorter.

Samsquanch

Posted 2016-01-11T09:06:31.570

Reputation: 271

1

Python 2, 97 bytes

lambda i:i+''.join(chr(48+(17+sum((2**j)*i[x+j].isupper()for j in range(5)))%43)for x in[0,5,10])

TFeld

Posted 2016-01-11T09:06:31.570

Reputation: 19 246

1

C, 120 118 bytes

n,j;main(c,v,s)char**v,*s;{for(printf(s=v[1]);*s;s+=5){for(n=0,j=5;j--;)n=n*2+!!isupper(s[j]);putchar(n+65-n/26*17);}}

Works for any input whose length is a multiple of 5 :)

Ungolfed

n,j;

main(c,v,s) char **v, *s;
{
    for(printf(s = v[1]); *s; s+=5)
    {
        for(n=0, j=5; j--;)
            n=n*2+!!isupper(s[j]);

        putchar(n+65-n/26*17);
    }
}

Cole Cameron

Posted 2016-01-11T09:06:31.570

Reputation: 1 013

To save a few bytes you can remove n, from the global namespace if you use main(n,v,s) for your signature since you're not otherwise using argc. – cleblanc – 2016-01-11T21:49:27.050

Also replace 26*17 with plain old 442 saves another byte – cleblanc – 2016-01-11T21:52:25.733

With a few more edits I got your version down to 110 bytes. I don't understand why you had !!isupper() when isupper() seems to work fine for me. Also I refactored your for loops to remove some unnecessary {}

j;main(n,v,s)char**v,*s;{for(printf(s=v[1]);*s;s+=5,putchar(n+65-n/442))for(n=0,j=5;j--;n=n*2+isupper(s[j]));} – cleblanc – 2016-01-11T22:05:27.607

@cleblanc Excellent suggestions, thanks very much. The order of operations is very important on the n/26*17 expression so replacing with 442 is not an option. As far as !!isupper, that function doesn't return 1 for true on my system, it returns 256. The !! is a short way to convert it to a 0/1 return value no matter what. YMMV. – Cole Cameron – 2016-01-12T14:28:47.893

1

PowerShell, 162 bytes

function f{param($f)-join([char[]](65..90)+(0..5))[[convert]::ToInt32(-join($f|%{+($_-cmatch'[A-Z]')}),2)]}
($a=$args[0])+(f $a[4..0])+(f $a[9..5])+(f $a[14..10])

OK, a lot of neat stuff happening in this one. I'll start with the second line.

We take input as a string via $args[0] and set it to $a for use later. This is encapsulated in () so it's executed and the result returned (i.e., $a) so we can immediately string-concatenate it with the results of three function calls (f ...). Each function call passes as an argument the input string indexed in reverse order chunks as a char-array -- meaning, for the example input, $a[4..0] will equal @('0','E','R','0','a') with each entry as a char, not a string.

Now to the function, where the real meat of the program is. We take input as $f, but it's only used way toward the end, so let's focus there, first. Since it's passed as a char-array (thanks to our previous indexing), we can immediately pipe it into a loop with $f|%{...}. Inside the loop, we take each character and perform a case-sensitive regex match with -cmatch which will result in true/false if it's uppercase/otherwise. We cast that as an integer with the encapsulating +(), then that array of 1's and 0's is -joined to form a string. That is then passed as the first parameter in the .NET [convert]::ToInt32() call to change the binary (base 2) into decimal. We use that resultant decimal number to index into a string (-join(...)[...]). The string is first formulated as a range (65..90) that's cast as a char-array, then concatenated with the range (0..5) (i.e., the string is "ABCDEFGHIJKLMNOPQRSTUVWXYZ012345"). All of that is to return the appropriate character from the string.

AdmBorkBork

Posted 2016-01-11T09:06:31.570

Reputation: 41 581

1

C# 334

string g(string c){string[]b=new string[]{c.Substring(0,5),c.Substring(5, 5),c.Substring(10)};string o="",w="";for(int i=0,j=0;i<3;i++){char[]t=b[i].ToCharArray();Array.Reverse(t);b[i]=new string(t);o="";for(j=0;j<5;j++){o+=Char.IsUpper(b[i][j])?1:0;}int R=Convert.ToInt32(o,2);char U=R>25?(char)(R+22):(char)(R+65);w+=U;}return c+w;}

If requested, I'll reverse my code back to readable and post it.

Yytsi

Posted 2016-01-11T09:06:31.570

Reputation: 3 582

1

C#, 171 bytes

I'm not really well-practiced in golfing C#, but here's a shot.

s=>{for(var u=s;u.Length>0;u=u.Substring(5)){int p=0,n=u.Substring(0,5).Select(t=>char.IsUpper(t)?1:0).Sum(i=>(int)(i*Math.Pow(2,p++)));s+=(char)(n+65-n/26*17);}return s;}

Cole Cameron

Posted 2016-01-11T09:06:31.570

Reputation: 1 013

Suggestions: char.IsUpper(t) can be replaced with t>=65&t<=90 (& on bool in C# is basically a golf-shorter && without short-circuiting). 442 is shorter than 26*17. You don't need to do a separate Select: you can include the ternary directly within the Sum. Consider replacing all those usages of Substring with a loop based on Take instead, e.g. for(int i=0;i<3;i++)s.Skip(i*5).Take(5). For future reference, u!="" would be shorter than u.Length>0 (but that's no longer necessary if you're using Take). – Bob – 2016-01-11T23:07:55.427

The expression n/26*17 is not equivalent to n/442, but other than that, thanks for the suggestions. As stated, I'm not very experienced in golfing in C# so this is all great stuff for me to consider in the future. – Cole Cameron – 2016-01-12T14:34:00.543

Ah, sorry - I misread that. – Bob – 2016-01-12T15:06:10.593

1

Jolf, 30 bytes

Finally, a probably still-jolfable! Try it here!

+i mZci5d.p1CρA_Hpu1"[^1]'0"2
    Zci5                      split input into groups of 5
  _m                          map it
        d                      with this function
               _H              reverse H
              A  pu1            and replace in it all uppercase letters with 1
             ρ      "[^1]'0"    replace all non-ones with zeroes
            C               2   parse as binary integer
         .p1                    get the (^)th member of "A...Z0...9"

Conor O'Brien

Posted 2016-01-11T09:06:31.570

Reputation: 36 228

1

Python 3, 201 174 138 bytes

Big thanks to Trang Oul for pointing out a function declaration that no longer needed to exist. And Python ternary operators. And some incorrect output. Just...just give him the upvotes.

i=input();n='';c=l=15;
while c:c-=1;n+=('0','1')[i[c].isupper()]
while l:v=int(n[l-5:l],2);l-=5;i+=(chr(v+65),str(v-26))[v>25]
print(i)

Steve Eckert

Posted 2016-01-11T09:06:31.570

Reputation: 216

You use function z() once, you can replace its call and save 25 bytes. Also, your code incorrectly assigns [ instead of 0. – Trang Oul – 2016-01-12T07:22:17.947

Well, that was an embarrassing oversight on my part. Thanks. – Steve Eckert – 2016-01-12T14:16:10.117

1 You can save even more by replacing the first if/else with this construction and the second one with a ternary operator. – Trang Oul – 2016-01-12T14:30:33.730

1

J, 36 bytes

,_5(u:@+22+43*<&26)@#.@|.\]~:tolower

Usage:

   (,_5(u:@+22+43*<&26)@#.@|.\]~:tolower) 'a0RE000000IJmcN'
a0RE000000IJmcNMAT

Try it online here.

randomra

Posted 2016-01-11T09:06:31.570

Reputation: 19 909

1

Python 3, 87 bytes

lambda s:s+bytes(48+(17+sum((~s[i+j]&32)>>(5-i)for i in range(5)))%43 for j in(0,5,10))
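
The answer comes without an explanation, so here is a less-golfed reading of it, as far as I can tell. It assumes the input is a bytes object (so indexing yields integers) and that it is alphanumeric ASCII. Among those characters, bit 5 (value 32) is clear only for uppercase letters, so ~c&32 is 32 for uppercase and 0 otherwise. The function name to_18_char_id is invented for this sketch.

```python
def to_18_char_id(id15: bytes) -> bytes:
    """Less-golfed sketch of the bytes-based approach (hypothetical helper)."""
    checksum = []
    for start in (0, 5, 10):
        value = 0
        for i in range(5):
            # For alphanumeric ASCII, bit 5 is clear only for uppercase letters.
            is_upper = (~id15[start + i] & 32) >> 5  # 1 if uppercase, else 0
            value += is_upper << i  # character i of the chunk contributes 2**i
        # Same add-17-modulo-43 mapping onto 'A'-'Z', '0'-'5' as in other answers.
        checksum.append(48 + (17 + value) % 43)
    return id15 + bytes(checksum)


print(to_18_char_id(b"a0RE000000IJmcN"))  # b'a0RE000000IJmcNMAT'
```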

Aleksi Torhamo

Posted 2016-01-11T09:06:31.570

Reputation: 1 871