Implementing the SHA-1 hash algorithm

15

2

The goal of this code-golf is to create a program that takes a string as input, and you have to output the SHA-1 hash value as a hexadecimal number. You can find the pseudocode for SHA-1 here

Other rules:

  1. No network access
  2. You're not allowed to run external programs
  3. You're not allowed to use built-in methods to hash the input
  4. The shortest code wins
  5. It's only necessary to handle ASCII input
  6. Output can be either lowercase or uppercase
  7. The input can be provided using:

    1. Prompting for input
    2. Using command-line arguments
    3. Using STDIN

Test cases:

Input: The quick brown fox jumps over the lazy dog
Output: 2fd4e1c67a2d28fced849ee1bb76e7391b93eb12
----------------------------------------------------------
Input: The quick brown fox jumps right over the lazy dog
Output: 1c3aff41d97ada6a25ae62f9522e4abd358d741f
------------------------------------------------------------
Input: This is a code golf challenge
Output: f52ff9edd95d98e707bd16a7dead459cb8db8693

ProgramFOX

Posted 2013-12-31T19:02:55.833

Reputation: 8 017

Answers

5

GolfScript, 374 322 characters

[128]+.,.~55+64%1,*\(8*2
32?:?.*+256{base}:B~1>++"!Vi9y BRQ+@phoKD5Vj=]30z0"{96@32-*+}*?B\64/{4/{256B}%{0'=820'{64-
2$=^}/2*.?/+?%+}64*1$'&4$?(^3$&|1518500249{++[]@+@+@?*4/.?/+?%+2$+\@32*.?/++@(@+?%@-1%+}:Y~
^2$^1859775393Y
&4$3$&|3$3$&|2400959708Y
^2$^3395469782Y'n/{'~3$3$'\+20*~}/+]zip{~+?%}%}/{?+16B{.9>7*+48+}%1>}%''+

This is based on an exact implementation of the pseudo-code in GolfScript plus some minor encodings to shorten the code (try it online). Input will be read from STDIN.

Howard

Posted 2013-12-31T19:02:55.833

Reputation: 23 109

This is the longest GolfScript I've ever seen :P – Doorknob – 2014-01-16T22:09:51.867

5

Python 3 - 645 chars

h=0x67452301,0xEFCDAB89,0x98BADCFE,0x10325476,0xC3D2E1F0
L=lambda v,s:(v<<s|v>>B-s)%2**B
B=32
R=range
m=input().encode()
l=len(m)
m+=b'\x80'+bytes((55-l)%64)+m.fromhex(hex(2**64+l*8)[3:])
for C in [m[i:i+64]for i in R(0,len(m),64)]:
 w=[sum(C[i+j]<<8*(3-j)for j in R(4))for i in R(0,64,4)];a,b,c,d,e=h
 for i in R(16,80):w+=[L(w[i-3]^w[i-8]^w[i-14]^w[i-16],1)]
 for i in R(80):f=[b&c|~b&d,b^c^d,b&c|b&d|c&d,b^c^d][i//20];k=[0x5A827999,0x6ED9EBA1,0x8F1BBCDC,0xCA62C1D6][i//20];t=L(a,5)+f+e+k+w[i];e=d;d=c;c=L(b,30);b=a;a=t%2**32
 g=a,b,c,d,e;h=[(h[i]+g[i])%2**32for i in R(5)]
B=160
print('%x'%(L(h[0],128)|L(h[1],96)|L(h[2],64)|L(h[3],32)|h[4]))

Just a golfed version of the pseudocode.

grc

Posted 2013-12-31T19:02:55.833

Reputation: 18 565

does utf = u8 in python3? – YOU – 2014-01-03T11:53:34.213

1@YOU Yes, that works too. I just checked and it turns out that .encode() uses UTF-8 by default, which is even shorter. – grc – 2014-01-03T11:59:31.290

2

D (759 chars)

Online executable version: http://dpaste.dzfl.pl/f0c8508f

import std.range,std.algorithm,std.bitmanip,std.stdio:g=writef;
void main(char[][]x){
    auto m=cast(ubyte[])x[1];
    uint a=0x67452301,b=0xEFCDAB89,i,t,f,k;
    uint[5]h=[a,b,~a,~b,0xC3D2E1F0],s;
    uint[80]w;
    auto r=(uint u,uint b)=>u<<b|u>>32-b;
    auto u=(uint i)=>[r(s[0],5)+f+s[4]+k+w[i],s[0],r(s[1],30),s[2],s[3]];
    ubyte[64]p;
    p[0]=128;
    m~=p[0..64-(m.length+8)%64]~nativeToBigEndian(8*m.length);
    foreach(ch;m.chunks(64)){
        16.iota.map!(i=>w[i]=ch[i*4..$][0..4].bigEndianToNative!uint).array;
        iota(16,80).map!(i=>w[i]=r(w[i-3]^w[i-8]^w[i-14]^w[i-16],1)).array;
        s=h;
        80.iota.map!(i=>(i<20?f=s[3]^s[1]&(s[2]^s[3]),k=0x5A827999:i<40?f=s[1]^s[2]^s[3],k=0x6ED9EBA1:i<60?f=s[1]&s[2]|s[3]&(s[1]|s[2]),k=0x8F1BBCDC:(f=s[1]^s[2]^s[3],k=0xCA62C1D6),s[]=u(i)[])).array;
        h[]+=s[];
    }
    g("%(%08x%)",h);
}

mleise

Posted 2013-12-31T19:02:55.833

Reputation: 111

2

C, 546 characters

The program calculates the SHA-1 of the contents of its standard input.

unsigned b[16],k[]={1732584193,0xEFCDAB89,0,271733878,0xC3D2E1F0},i,j,n,p,t,u,v,w,x;
char*d=b;a(c){for(d[p++^3]=c,c=0,t=*k,u=k[1],v=k[2],w=k[3],x=k[4];
c>79?*k+=t,k[1]+=u,k[2]+=v,k[3]+=w,k[4]+=x,p=0:p>63;x=w,w=v,v=u<<30|u/4,u=t,t=i)
i=b[c-3&15]^b[c+8&15]^b[c+2&15]^b[j=c&15],c++>15?b[j]=i*2|i>>31:0,i=u^v^w,
i=(t<<5|t>>27)+x+b[j]+(c<21?(w^u&(v^w))+1518500249:c<41?i+1859775393:
c<61?(u&v|w&(u|v))-1894007588:i-899497514);}
main(){for(k[2]=~*k;i=~getchar();++n)a(~i);for(a(128);p-56;a(0));
for(;p<20?printf("%02x",(d=k)[p^3]&255):p>55;a(n*8L>>504-p*8));}

A couple of notes:

  1. This program assumes that int is exactly 32 bits. For platforms where this is not the case, the unsigned declaration at the very beginning should be replaced by the platform's unsigned 32-bit type. (uint32_t would be the obvious choice, if it didn't require #include <stdint.h>.)
  2. This program assumes a little-endian platform. For a big-endian platform, simply remove the two occurrences of ^3 in the program, and replace the initialization of k[] with the following block: {19088743,0x89ABCDEF,0,1985229328,0xF0E1D2C3}, for a size of 541 characters.
  3. For implementations where char is unsigned by default, one can delete the &255 that appears on the final line to save four more characters.

breadbox

Posted 2013-12-31T19:02:55.833

Reputation: 6 893

This returns be417768b5c3c5c1d9bcb2e7c119196dd76b5570 for The quick brown fox jumps over the lazy dog but it has to return 2fd4e1c67a2d28fced849ee1bb76e7391b93eb12 – ProgramFOX – 2014-01-06T17:43:46.180

@ProgramFOX That hash value is for the string plus a terminating new line. The test SHA-1 values quoted in the description are for strings without terminating newlines, so don't append a newline to the input if you want to test those strings. – breadbox – 2014-01-07T05:27:20.727

1

My python code is a little longer but it is fully working.

data="hello world"
bytes = ""
h0,h1,h2,h3,h4=0x67452301,0xEFCDAB89,0x98BADCFE,0x10325476,0xC3D2E1F0
for n in range(len(data)):bytes+='{0:08b}'.format(ord(data[n]))
bits=bytes+"1"
pBits=bits
while len(pBits)%512!=448:pBits+="0"
pBits+='{0:064b}'.format(len(bits)-1)
def chunks(l,n):return[l[i:i+n]for i in range(0,len(l),n)]
def rol(n,b):return((n<<b)|(n>>(32-b)))&0xffffffff 
for c in chunks(pBits,4512):
    words=chunks(c,32)
    w=[0]*80
    for n in range(0,16):w[n]=int(words[n],2)
    for i in range(16,80):w[i]=rol((w[i-3]^w[i-8]^w[i-14]^w[i-16]),1)
    a,b,c,d,e=h0,h1,h2,h3,h4
    for i in range(0,80):
        if 0<=i<=19:f=(b&c)|((~b)&d);k=0x5A827999
        elif 20<=i<=39:f=b^c^d;k=0x6ED9EBA1
        elif 40<=i<=59:f=(b&c)|(b&d)|(c&d);k=0x8F1BBCDC
        elif 60<=i<=79:f=b^c^d;k=0xCA62C1D6
        temp=rol(a,5)+f+e+k+w[i]&0xffffffff
        e,d,c,b,a=d,c,rol(b,30),a,temp 
    h0+=a
    h1+=b
    h2+=c
    h3+=d
    h4+=e 
print '%08x%08x%08x%08x%08x'%(h0,h1,h2,h3,h4)

kyle k

Posted 2013-12-31T19:02:55.833

Reputation: 409

Please specify the language in the answer. Also, since this is code-golf, you should at least attempt minification. One-char variable names, removing unneccesary whitespace, stashing expensive constants (0xffffffff) into variables (also, would -1 suffice?)... – John Dvorak – 2014-01-03T05:48:36.120

looks like phython to me :) – Owais Qureshi – 2014-01-03T05:53:16.837

@JanDvorak I modified my code. – kyle k – 2014-01-03T18:17:28.797

h0..h4 are still two-letter ;-) – John Dvorak – 2014-01-03T18:20:13.660

I get a syntax error at line 29. – ProgramFOX – 2014-01-04T08:16:07.677