Cryptographic hash golf (robbers)

12

This contest is over.

There are no remaining crackable answer in the cops challenge.

Companion thread of Cryptographic hash golf

As a reminder, here are the rules for robbers from the main challenge:

Task

Crack any of the cops' submissions by posting the following in the robbers' thread: two messages M and N in I such that H(M) = H(N) and M ≠ N.

Scoring

Cracking each cop submissions gains you one point. The robber with the most points wins.

In the case of a tie, the tied robber that cracked the longest submission wins.

Additional rules

  • Every cop submission can only be cracked once.

  • If a cop submission relies on implementation-defined or undefined behavior, you only have to find a crack that works (verifiably) on your machine.

  • Each crack belongs to a separate answer in the robbers' thread.

  • Posting an invalid cracking attempt bans you from cracking that particular submission for 30 minutes.

  • You may not crack your own submission.

Example

Python 2.7, 22 bytes by user8675309

1

and

18

Leaderboard

  1. eBusiness: 3 cracks, 393 bytes
  2. Martin Büttner: 3 cracks, 299 bytes
  3. jimmy23013: 3 cracks, 161 bytes
  4. Sp3000: 3 cracks, 44 bytes
  5. tucuxi: 2 cracks, 239 bytes
  6. Vi.: 2 cracks, 87 bytes
  7. feersum: 1 crack, 216 bytes
  8. mathmandan: 1 crack, 139 bytes
  9. squeamish ossifrage: 1 crack, 134 bytes

Dennis

Posted 2015-05-31T14:45:51.143

Reputation: 196 637

Answers

5

C, 122 bytes - by: Mr. Llama

bmaj8PCosFLAJjeHaevvvchnJedmg2iujpePOPivI2x2asw0yKa2eA15xvFJMFe82RGIcdlvxyaAPRuDuJhFjbh78BFsnCufJkarwEyKa0azHxccw5qegpcP9yaO0FKoohanxgiAfK1Lqwba51bKtjacbvdjMmcBkiv8kd62sBd98c4twa98sgj3iPh7nkP4
rlaejTPrua1DhBdg0jrIoDBi8fc1GIJAigivIGaxs1OmfPcctNadK3HErvzPLCeDPD8fkMNPCBcIwuoGfEHegOfk9k9pwktslqaBenaati1uNthMiyk9ndpy7gdIz88iot6A09cbNeIMheyjBvbeegL7aGp7mCb91hCxnvgV5abfImrPfLbrbraAsN6loJgh

Both strings hash to bb66000000000000d698000000000000

Just like "C, 128 bytes - by: squeamish ossifrage" the higher order bits never influence the lower order bits, this can be exploited.

Code

Visual C++, uses "unsafe" string operations

#include "stdafx.h"
#include <string>
#include <iostream>
#include <fstream>

long long x, y;

//Original hash function (not used for cracking).
void h(char inp[]){
    long long c;
    int index = 0;
    int len = strlen(inp);
    x = 0;
    y = 0;
    long long p = 0;
    for (c = 9; c ; c = (index<len?inp[index++]:-1) + 1) {
        for (++p; c--;) {
            x = x*'[3QQ' + p;
            y ^= c*x;
            y ^= x ^= y;
        }
    }
    printf("%016llx%016llx\n", x, y);
}

//Partial hash, takes a string and a starting point in the stream.
//The byte 0x08 must be prepended to a string in order to produce a full legal hash.
void hp(char inp[],long long p){
    long long c;
    int index = 0;
    int len = strlen(inp);
    x = 0;
    y = 0;
    for (index = 0; index<len; index++) {
        c = inp[index] + 1;
        for (++p; c--;) {
            x = x*'[3QQ' + p;
            y ^= c*x;
            y ^= x ^= y;
        }
    }
}

//Reverse partial hash, backtracks the inner state.
void hprev(char inp[], long long p){
    long long c;
    long long clim;
    int index = 0;
    int len = strlen(inp);
    p += len + 1;
    x = 0;
    y = 0;
    for (index = len-1; index>=0; index--) {
        clim = inp[index] + 1;
        c = 0;
        for (--p; c<clim;c++) {
            y ^= x;
            x ^= y;
            y ^= c*x;
            x -= p;
            x = x * 17372755581419296689;
            //The multiplicative inverse of 1530089809 mod 2^64.
        }
    }
}
const int rows = 163840;
const int maprows = 524288;

//Store for intermediate input strings, row 0 contains 64 columns with 3-char strings,
//row 1 contain 32 columns with 6-char strings and so forth, the final strings will
//contain one string from each column, in order.
char store[7][rows][512];

//Storage for a hashmap, used for matching n strings with n string in O(n) time.
char map[maprows][512];

int _tmain(int argc, _TCHAR* argv[])
{
    char alpha[] = "abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    int row;
    int col;
    int layer;
    int a=0, b=0, c=0;
    int colzero;
    //Produce some starting strings.
    for (row = 0; row < rows; row++){
        //All column 0 strings begin with 0x08 in order to imitate the hash.
        store[0][row][0] = 8;
        colzero = 1;
        for (col = 0; col < 64; col++){
            store[0][row][col * 8 + colzero] = alpha[a];
            store[0][row][col * 8 + colzero + 1] = alpha[b];
            store[0][row][col * 8 + colzero + 2] = alpha[c];
            store[0][row][col * 8 + colzero + 3] = 0;
            colzero = 0;
        }
        a++;
        if (a >= 52){
            b++;
            a = 0;
            if (b >= 52){
                c++;
                b = 0;
            }
        }
    }
    //Layer for layer, column for column, build strings that preserve successively
    //more zero bits. Forward calculated partial hashes are matched with backwards
    //calculated partial hashes.
    for (layer = 1; layer < 7; layer++){
        int slayer = layer - 1;
        int swidth = 1 << (slayer + 3);
        int width = 1 << (layer + 3);
        int slen = 3 << slayer;
        int len = 3 << layer;
        int colnum;
        int layershift=slayer*8;
        for (col = 0,colnum=0; col < 512; col+=width,colnum++){
            printf("Layer: %i, column: %i\n",layer,colnum);
            memset(map, 0, sizeof map);
            int col2 = col + swidth;
            for (row = 0; row < rows; row++){
                hprev(store[slayer][row] + col2, 1 + slen*(1 + colnum * 2));
                x = (x >> layershift) & 255;
                y = (y >> layershift) & 255;
                int index = (x << 3) | (y << 11);
                for (a = 0; a < 8; a++){
                    if (map[index + a][0] == 0){
                        strcpy_s(map[index + a], store[slayer][row] + col2);
                        break;
                    }
                }
            }
            int destrow = 0;
            for (row = 0; row < rows && destrow < rows; row++){
                hp(store[slayer][row] + col, !!colnum + slen*(colnum * 2));
                x = (x >> layershift) & 255;
                y = (y >> layershift) & 255;
                int index = (x << 3) | (y << 11);
                for (a = 0; a < 8 && destrow < rows; a++){
                    if (map[index + a][0]){
                        strcpy(store[layer][destrow] + col, store[slayer][row] + col);
                        strcat(store[layer][destrow] + col, map[index + a]);
                        destrow++;
                    }
                }
            }
        }
    }
    memset(map, 0, sizeof map);
    char temp[1000];
    std::ofstream myfile;
    myfile.open("hashout.txt");
    for (row = 0; row < rows; row++){
        hp(store[6][row], 0);
        sprintf(temp, "%016llx%016llx", x, y);
        myfile << store[6][row] <<" " << temp << "\n";
    }
    myfile << "\n";
    //The final hash set has 96 of 128 output bits set to 0, I could have gone all
    //the way, but this is enough to find a collision via the birthday paradox.
    for (row = 0; row < rows; row++){
        hp(store[6][row], 0);
        long long xc = x;
        long long yc = y;
        int pos = (xc >> 45 | ((yc >> 48) & 7)) & (maprows-1);
        while (map[pos][0]!=0){
            hp(map[pos], 0);
            if (x == xc && y == yc){
                myfile << store[6][row] << "\n" << map[pos] << "\n";
                sprintf(temp,"%016llx%016llx", x, y);
                myfile << temp << "\n\n";
            }
            pos = (pos + 1) % maprows;
        }
        strcpy_s(map[pos], store[6][row]);
    }
    myfile.close();
    printf("done");
    getchar();
    return 0;
}

aaaaaaaaaaaa

Posted 2015-05-31T14:45:51.143

Reputation: 4 365

Awesome! I'm actually flattered in an odd sort of way! :D – Mr. Llama – 2015-06-05T20:06:58.003

Also, for my own personal education, when you say the higher order bits never affect the lower ones, what do you mean? The higher order bits of the input string, or of the hash state? – Mr. Llama – 2015-06-05T20:13:16.333

@Mr.Llama In the hash state, the upper bits of x and y will never influence the lower bits, so for instance if you flip on of the middle bits during computation the low part of the hash will still come out right. This lets me start out ignoring everything but the lowest bits of the hash state, then when I have those under complete control I move on to the next layer of bits, and so forth. – aaaaaaaaaaaa – 2015-06-05T20:36:39.440

Cool! Thanks for the explanation! – Mr. Llama – 2015-06-05T20:44:32.857

Congratulations on winning the robbers challenge! – Dennis – 2015-06-10T20:19:00.357

12

Python, 109 bytes by Sp3000

Note that Martin cracked first, so I'm not sure if this deserves points. On the other hand, I did make a preimage attack rather than a simple collision - a much stronger result. This means that you can give it an arbitrary hash value, and it will construct an input that generates that hash value.

M = 2**128

# The hash to crack.
def jenkins(n):
    h = 42
    while n:
        h += n & (M - 1)
        n >>= 128
        h *= 1025
        h ^= h >> 6
        h %= M

    h *= 9
    h ^= h >> 11
    h *= 32769

    return h % M

def egcd(a, b):
    if a == 0:
        return (b, 0, 1)
    else:
        g, y, x = egcd(b % a, a)
        return (g, x - (b // a) * y, y)

def modinv(a, m):
    g, x, y = egcd(a, m)
    if g != 1:
        raise Exception('modular inverse does not exist')
    else:
        return x % m

def invxorshift(h, s):
    r = h >> s
    while r:
        h ^= r
        r >>= s
    return h

def moddiv(a, b):
    return (a * modinv(b, M)) % M

def jenkins_crack(h):
    h = moddiv(h, 32769)
    h = invxorshift(h, 11)
    h = moddiv(h, 9)
    h = invxorshift(h, 6)
    h = moddiv(h, 1025)
    h -= 42
    return h

And to show that it works:

>>> from crack import *
>>> n = 2**128 + 1337
>>> h = jenkins(n)
>>> n2 = jenkins_crack(h)
>>> h2 = jenkins(n2)
>>> n != n2
True
>>> h == h2
True

And to give a particular set of numbers that collide:

N: 2**128
M: 43617

orlp

Posted 2015-05-31T14:45:51.143

Reputation: 37 067

3My initial proposal in the sandbox awarded points for collison, preimage and (slightly oversimplifying things) length extension attacks, but I decided to keep the scoring simple. When I edited those parts out, the fact that every submission can only be cracked once (which is how cops-and-robbers usually works) somehow got lost. Seeing your answer makes me wish I had kept preimage attacks... – Dennis – 2015-05-31T20:02:01.167

9

Python, 109 bytes by Sp3000

340282366920938463463374607431768211414

and

113982837842983129870077688367927159293402923522160868689804433865221255200726

both yield

132946164914354994014709093261515948032

The algorithm splits the input into chunks of 128 bits, and repeatedly modifies the hash (seeded to 42) with each chunk, before doing some extra hashing at the end. To find a collision, our goal is to find two numbers that yield the same result after running the following pseudo code on each chunk:

hash += chunk
hash += (hash << 10)
hash ^= (hash >> 6)
hash %= 2**128

Since the hash is taken mod 2128 we want to look for numbers that shift all the interesting stuff outside of this bit range. But the hash is seeded to 42 so it has some not-so-significant bits set to begin with:

000000000000000000000000 ... 000000000000000000101010

My idea was to get rid of those bits when adding the first chunk. So let's try 2128-42:

           000000000000000000000000 ... 000000000000000000101010     hash = 42
           111111111111111111111111 ... 111111111111111111010110     chunk = 2**128 - 42
          1000000000000000000000000 ... 000000000000000000000000     hash += chunk
10000000001000000000000000000000000 ... 000000000000000000000000     hash += hash << 10
10000010001000001000000000000000000 ... 000000000000000000000000     hash ^= hash >> 6
           000001000000000000000000 ... 000000000000000000000000     hash %= 2**128

That's fairly simple so let's try to use that as one of the two numbers. (Indeed, the first number of the collision I used is 2128-42.

Now how do we find another number with the same result? Well after one iteration the hash isn't 42 any more, but 2**122 as we've just shown. Now by adding a second chunk to our input number, we can run another iteration. We can choose the second chunk by the same argument as this one, i.e. we want 2128-2122. Then the intermediate result after hash += chunk will be identical and we end up with the same result at the end.

So we can compute the two numbers of the collision:

>>> 2**128-42
340282366920938463463374607431768211414L
>>> 2**128-42 + ((2**128-2**122)<<128)
113982837842983129870077688367927159293402923522160868689804433865221255200726L

We can easily generate many more collisions like this.

Martin Ender

Posted 2015-05-31T14:45:51.143

Reputation: 184 808

I was also cracking this - almost done. Is this a fastest gun in the west contest or can I still get points for posting it? – orlp – 2015-05-31T18:37:37.527

@orlp Normally only the first robber gets a point. Otherwise, people could just generate millions of additional cracks once the first crack has been posted. – Martin Ender – 2015-05-31T18:54:32.387

1Lame =/ Think I'll stop doing this challenge then. I don't enjoy racing against others - I just want to puzzle. Can't there be some time window for cracks after the first, with only 1 crack per person? – orlp – 2015-05-31T18:55:22.430

@orlp The original version in the sandbox had three different methods of cracking a cop, and all three could be posted independently. I guess that's an interesting model to investigate at some point. But so far in previous CnRs allowing multiple cracks would always have broken the challenge more than it would have improved it. – Martin Ender – 2015-05-31T18:57:11.930

1See my answer for a preimage attack, rather than a collision :) – orlp – 2015-05-31T19:41:53.240

8

Mathematica, 89 bytes by LegionMammal978

0

and

16

Both yield 0.

The principle of this cop is to evolve a "random" binary 1-D cellular automaton from a "random" initial condition for a "random" number of steps, and then interpret the first 128 cells of the result as an integer.

The problem is that the rule is determined simply by Mod[#^2,256], such that any multiple of 16 gives rule 0, which is the trivial rule where all cells are always zero. If the input is not divisible by 99 then we will evolve at least 1 step, so the output is always zero. So any two multiples that aren't multiples of 99 definitely collide. However, input 0 also gives 0 (despite never using the rule), because the initial condition is just the binary representation of the input (which is all zeroes in this case).

As an aside, we can find other collisions that are completely independent of the rule. As noted above, any multiple of 99 means that the cellular automaton isn't evolved at all, so the result is simply the first (most significant) 128 bits of the initial condition... which is itself just the input number. So if we take two multiples that don't differ in the first 128 bits (padded to the right with zeroes), we also get a collision. The simplest example of this is M = 99, N = 99*2 = 198.

Martin Ender

Posted 2015-05-31T14:45:51.143

Reputation: 184 808

8

J, 39 bytes

The first number is:

10000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000

That is, 10000000 repeated 64 times. The second number is that plus one, i.e.

10000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000000100000001000000010000001

Both yield

322124197561777885386414076216564234267

Explanation

Let's start with x := H 10000000 = 146018215378200688979555343618839610915, and y := 2^128. Rather than finding a, b such that a == b mod y, we'll look for a, b such that x^a == x^b mod y, making use of the power towers in the algorithm.

But there must be some k such that x^k == 1 mod y, since x, y are coprime, and for that k we must have a == b mod k. So we can find the discrete logarithm of 1 mod y, and for the first step we get

x ^ (k = 85070591730234615865843651857942052864) == 1 mod 2^128

So now we want to find two numbers a, b such that a == b mod k. To do this, we set y to be k and try to find a, b such that x^a == x^b mod y again. Using the same logic, we take the discrete logarithm again and get

x ^ (k = 21267647932558653966460912964485513216) == 1 mod 85070591730234615865843651857942052864

We repeat this until we get to a small y, at which point it is trivial to find two numbers which hash to the same thing modulo y. After 63 iterations y = 4, at which point basically any two numbers work.

Here's the Mathematica code to generate the discrete log chain:

k = 0; x = 146018215378200688979555343618839610915; y = 2^128; While[y > 10, y = MultiplicativeOrder[x, y]; k++; Print[k, " ", y]];

This gives the following output.

Sp3000

Posted 2015-05-31T14:45:51.143

Reputation: 58 729

The slightly shorter version is that if the first few hundreds of digits are the same it is improbable that the rest of the input matter in any way at all. Actually doing maths to break this is overkill. – aaaaaaaaaaaa – 2015-06-06T08:25:14.933

@eBusiness That's true. It turned out that it didn't matter much here, but initially I was worried about going over the 2^(2^30) limit, hence the check. – Sp3000 – 2015-06-06T08:38:58.560

Actually, I suspect that it might be impossible to fabricate a string where anything beyond the 512th digit matters. You managed to produce the worst-case scenario. Easiest crack must be to take advantage of internal leading zeroes: "100000001" "1000000001". – aaaaaaaaaaaa – 2015-06-06T10:00:42.763

7

Pyth, 8 bytes by FryAmTheEggman

99999999999999999999999999

and

99999999999999999999999998

Floating point precision isn't big enough for this.

Sp3000

Posted 2015-05-31T14:45:51.143

Reputation: 58 729

I'm actually getting different results for this, but I do believe that this strategy would work anyway, so I'll mark it as cracked. Fast work :P – FryAmTheEggman – 2015-05-31T16:22:11.383

Hmm weird. I get 437409784163148 for both. I wonder why there's a difference... – Sp3000 – 2015-05-31T16:23:58.713

You're probably using python 3.5 right? I haven't updated yet, still on 3.4 maybe that's it? – FryAmTheEggman – 2015-05-31T16:24:42.843

@FryAmTheEggman I'm using the online interpreter, actually... – Sp3000 – 2015-05-31T16:25:42.610

Actually yeah I get 437409784163148 and 37409784163148 so I guess it's just lost the last digit for some reason, but 99...997 gives the same answer as 999...98. – FryAmTheEggman – 2015-05-31T16:26:30.200

@FryAmTheEggman There, I added more 9s. How's that? :P – Sp3000 – 2015-05-31T16:30:02.353

Yep that's the same. Are you sure you aren't Randall's cat? :P – FryAmTheEggman – 2015-05-31T16:33:37.463

6

CJam, 44 bytes, user jimmy23013

The numbers are too big to post, so here they are on Pastebin: num 1, num 2.

The first number is 600^2 = 360000 ones. The second number is the same, except for the following changes:

Positions to change to "2": 605, 1811, 3001, 6603
Positions to change to "4": 1805, 3003, 57348, 208895
Positions to change to "5": 602, 1201, 2405, 3004
Positions to change to "6": 1203, 1802
Positions to change to "7": 12, 609, 5401, 7200
Positions to change to "8": 1, 2, 4, 6, 600, 1200, 1808, 2400, 3600, 4803

Both hash to 271088937720654725553339294593617693056.

Explanation

Let's take a look at the first half of the code:

lW%                e#  Read input number as string, and reverse
600/               e#  Split every 600 digits, forming a 2D array
_z                 e#  Duplicate and zip, swapping rows and columns
{           }%     e#  For both arrays...
 JfbDb             e#  Find sum of S[i][j]*13^i*19^j, where S are the character values
                   e#  and the indices are from right to left, starting at 0.
      GK#          e#  Take modulo 16^20

         ...  ...  e#  (Rest of code irrelevant)

So if we can find two input numbers so that the sums of S[i][j]*13^i*19^j are the same modulo 16^20 for both the initial 600-wide array and the zipped array, then we're done.

To make things a little easier, we'll consider only 600^2 = 360000-digit input numbers, so that the 600-wide array is just a 600 by 600 square of digits. This makes things easier to visualise, and is valid since 10^360000 ~ 2^(2^20.19) < 2^(2^30). To simplify things even further, we'll only consider such input strings whose digit square is symmetric along the main diagonal, so that the original array and the zipped array are the same. This also allows us to ignore the initial string reversal and the right-to-left index numbering, which cancel each other out.

To start us off, we can take the first number to be 360000 ones. To get the second number, we want to modify this by changing some of the digits so that the sums are the same modulo 16^20, while preserving the digit square's symmetry. We accomplish this by finding a list of triples (i, j, k) so that

sum of k*(13^i 19^j + 19^i 13^j) == 0 mod 16^20

where 1 <= k <= 8 is the amount to increase the digit 1 by (i.e. changing to a digit from 2 to 9 – we could have included 0 but we didn't need it) and 0 <= i < j < 600 are index pairs.

Once we have the (i, j, k) triplets, we change the digits at (i, j) and (j, i) to 1+k to get the second number. The triplets were found using a greedy backtracking algorithm, and for the second number above the digit square looks like:

188181811111711 ...
815112111711111 ...
851611111111111 ...
116114118112111 ...
811115111111111 ...
121451111111111 ...
811111111111111 ...
111111111111111 ...
111811111111111 ...
171111111111111 ...
111111111111111 ...
111211111111111 ...
711111111111111 ...
111111111111111 ...
111111111111111 ...

............... .
...............  .
...............   .

For example, (i, j, k) = (0, 1, 7) corresponds to changing the digits (0, 1) (position 600*0 + 1 = 1) and (1, 0) (position 600*1 + 0 = 600) to 1 + 7 = 8.


Here's the backtracker in Python 3, although closer inspection revealed that we were pretty lucky, as no backtracking actually happened:

n = 16**20
L = [(k *(pow(13,i,n)*pow(19,j,n) + pow(19,i,n)*pow(13,j,n)) % n, i, j, k)
     for i in range(600) for j in range(600) for k in range(1, 9) if i < j]

L.sort(reverse=True)
stack = [(n, 0, [])]

while stack:
    k, index, result = stack.pop()

    if k == 0:
        print(result)
        break

    if index == len(L):
        continue

    stack.append((k, index+1, result)) # Don't include triplet

    if L[index][0] <= k:
        stack.append((k - L[index][0], index+1, result + [L[index][1:]])) # Include

For a bonus, here's a not-so-efficient port of the hash in Python 3. It was useless.

Sp3000

Posted 2015-05-31T14:45:51.143

Reputation: 58 729

5

PHP 4.1, 66 bytes by Ismael Miguel

$ A=0111129112911291111111111111111111111111 php hash.php 2> /dev/null ; echo
0100038003800381129111111111111111111111
$ A=0111129112911291129111111111111111111111 php hash.php 2> /dev/null ; echo
0100038003800381129111111111111111111111
$ cat hash.php 
<? $a = getenv("A"); for($l=strlen($b.=$a*1);$i<40;$o.=+$b[+$i]^"$a"/$a,$i++);echo$o;

Found using simple iterated hashing, starting from 1:

$ i=1; while true; do i=$(A=$i php hash.php  2> /dev/null); echo $i; done | head -n 10
0111111111111111111111111111111111111111
0100000000000001129111111111111111111111
0111129111111111111111111111111111111111
0100038000000001129111111111111111111111
0111129112911111111111111111111111111111
0100038003800001129111111111111111111111
0111129112911291111111111111111111111111
0100038003800381129111111111111111111111
0111129112911291129111111111111111111111
0100038003800381129111111111111111111111

Vi.

Posted 2015-05-31T14:45:51.143

Reputation: 2 644

Yup, that one is cracked. How long did it took to reach there? – Ismael Miguel – 2015-06-01T13:40:33.583

The second try. The first try was looking the values of the first 100 hashes, second try was making the hash chains (i.e. hash(hash(hash(...(hash(1)...)))). The first chain converged into a loop almost instantly. I didn't even need to bring up my multithreaded hash cracker. – Vi. – 2015-06-01T13:45:10.940

Translation: pretty weak hash? – Ismael Miguel – 2015-06-01T14:06:37.010

Yes. – Vi. – 2015-06-01T14:08:42.590

5

Python 3 (216) by Sp3000

My messages are

5012053369354645637214643587103597313576086380250249302980438772005248488380915054746146050001036696049972235875591571028140916001206596142280971107479334216535925703480968283657357930602076844092875640359192729378384507238123245417656548512647301639542279794868512420586823155070914644206130805893968511673770843170450832829657206145931885656157628306896903719624729809643572222159364893644113710867223921580178741177106141068298067479650611992859787419779579962211254029169589775046869542029842374359998053713002047081002506346875804341770199884355810931652447801492691887376948615365487982834690942054717077615539311699691010938426302886867891090301248321702485904291177813145565144089044261424329155436660979948932491709511914065619715728353376578192548334780893602675684085757434059540582004872746967999949306946618036846307799677491651967418565531672392468089533111553281620101129322575737949904022139471688252420467041529301533363008476437812216585923822571793353317799365005036029476865
5012053369354645637214643587103103086948976188724715498910865650846170784131001427390927276355140411160919276493388206817700368694224128444524223814513348177926532982330730066315320819293979046126543806115318009892783577432467861426768883700930779409885418980853424256180864755881414774514084197887594253752179391098292488771920695965135791582218083012144604515253506370334133858904659263953147111654656123599460222236152128559750436960308887683690915261431659087040402402092795259541564130228515353133867041828417398395559815392177084002004583988047406317670433664624642858480970640416500369367395538257341309676777745698712896295462462064271676447460293684100001583256400774270688958051470568447233589146620275159126426142305307007744396679875427883384557759778766330566230012377845843842097372663092379922300568052486301863154557664156185573021849420011058607321977550938866119133331529852821217331665195832442542012455132139770813510559894254061471149750738447764616026512400623344132554752

I used this Python 2 code to generate them:

a,b = 14460445391122031029,16815296360833931837 #http://www.numberempire.com/numberfactorizer.php
pr = ~-a * ~-b

m0 = reduce(long.__or__, [long(b) << 26*i for i,b in enumerate(bin(pr)[2:])])
m1 = 1 << 26*i-1
m0 |= m1

#print m0, m1
print f(m0), f(m1)

The big modulus was a product of two primes a and b. I guess the hope was that it would be NP-impossible for us to factor the semiprime, but I guess 128 bits is too small as some webpage gave me the answer right away.

The multiplicative group modulo ab will have order (a - 1)(b - 1) which means if we raise any number to that power it will have to result in 0 or (usually) 1. So I put 1 bits in places that resulted in 2(a-1)(b-1) being multiplied into the hash. Then the other message is basically 0, but I set one other bit in each number to make the lengths the same.

I think it would have been more annoying were the hash squared on every bit, rather than only after using all the primes. Then it wouldn't have been so simple to build arbitrary exponents for them.

feersum

Posted 2015-05-31T14:45:51.143

Reputation: 29 566

Nice work :) Yeah the vulnerability I had in mind was basically that the semiprime is visible in the code, and I realised Mathematica could factor it instantly. – Sp3000 – 2015-06-01T23:50:21.347

+1 Your code is hard to read, but otherwise a nice and instructive crack. – aaaaaaaaaaaa – 2015-06-05T20:16:07.877

5

C, 128 bytes - by: squeamish ossifrage

The following two strings both hash to all zeroes:

dddl|lddH4|@dhxdxXdh0TXPdhhdx(dTxtlpdd@4Lhd|hdDpdhDdXLdXP4(PdddL|ldXdD(lddX4|0hddp4|ddP4Lxdp0dP@dhpTxPdhXdXxdhHDxHdllLttdhPT8pd(pT8Pdd0TL8dlLLTtddPtl8dP@DPPdhhDxhd804(pdh0Txpd@DDpLdhL4xtdXXdHXdd04lXht40dlddh4|@ddPTLXdhhDXHhtPH40dh0t8pd(pt80dhPtX0dhLtXtdhLT8thlLplTdhpt80dh0txpdhHDX(hdX8txdhhdxHdp|d@tdhlTx4dlptdxdh0T8PdT@t|Hdd@tL(ht(8DhdhHD8(hpHHP8dhLtXtdX8dhxdhpt8Pd@(D@Hdd@tLhdtxTLPdd0tlxhhL8X|dd8t|0dT04|Xddxt|phxxxhhdhpt8PhhxX8hdhlTX4dd4l||dd@TLHdXlTHtdhHd8hdX0THPdh(D8(d8xdh8dhp4xPd0HDp(dhl4xTdxlthtdhlTx4d8lT(TdhhdXHdphdP(dhp4x0d0Xd0XddTl||d88DH8dhhdxhdx|tHDdhLT8Thll0lTddPTlXdxXd(xdd0Tlxdhp480dhp4x0dd|LltdhPt80dtll|dddPTlXdxXd(xdd0Tlxdhp480dhp4x0dd|LltdhPt80dtll|dddP4Lxd|ptT8dhddxldH|4xDdhp4x0dDdl|LdhtD8|hhHx88ddpTL8hhphx@dhtd8|dphDP(dh0tx0hhDHx4dhpt8Pd@(D@HddLLLDhh|xxldhl4xTdhL4x4dhPt8Pd(HDx(dh(D8Hd4PT|8ddH4|@hh4H8ddhxd8XdDP4lxdhHd8hdl04d8ddXT|phdh8Thdd@TlHdhXdxxdllL44dD@4lHdhxdxXhd8XtxddLlLddT@T|(dhxdXXd|P44Xdhpt8pdlHDT0dhL4Xtd@ldpDdddl|LdXP4h0dhltXtdX8d(Xdh|tXdhhLXX|dhxd8XdP@D0PdhXDxXhtpHtPdd84|pddtl||dh(dx(d88Dh8ddx4|PhtT0DLdd@tL(hdX8Txdhp480d08d08dlll44d4dLLldhTdX|hh8Xxhdh048pd08d08ddPtL8d4H4l@dhhdxHd|pt4Xddp4lXhp(hPxdh|48DdxhDh(ddLlldd8XdH8dddl|LdLHDT0dhpt8pdlHDT0dh(d8hdTHtl@ddptl8dt84LPdh8dxxdlptD8dd04lxhhH8XxddDl|ldP|D@4ddTl||d|ptT8dh(dXhhd8X48dhPtXpd(8DxXdh@TX@dDP4L8dhpTX0d4@4|hdhHdxHdX8DHxdhPT8PhllplTdh0TXPhlXHLXddp4lXhtHXD(dhP4X0htH8dhdhLTx4hpxHPHdhhd8(dX8DHxdhpt80hhdHxTdlll44d@Hd@(dhhDxhdh0t8Pddh4|@ddh4|@dhptx0dpPD0@ddPtlxdhPT8pdhhdX(htTpDLdd@4L(dLHDtpdhxd8xdt84lPdlpTdxdDPTLXddLLLDdxlThtdlhd4PdXLTh4ddptLxd|@44(dhhd8HdtDLLlddxt|pd|hDd0ddPtLXhl@H|pdhDD8ld8dDhLdhXDXxdDxT|PdhHD8hdp8dpxdhp480d@XD@xddpTLXdHhD8(ddllLDdD|LL4dhpt80d@LdPDdh|4xDdP8dpXddLllddl8d4@dhptXpdd(4|@dhltx4d0Dd@LdhptxphdPHdpdhl4xTdxlthtdhHD8HdTdllldhLtX4dXP4(PdhLTxTd4X4LpddlllDdlpTD8dllltTdL(dtPdhDDxLdhLTx4dhptx0d|0T4Xdhl4xTdHL4XtdhpTXpdLp4dxddHt|@dHL484dhHDXHdHLtxtdhDdXldxL4H4dh|TxDhh8xX(dhLt8td8Lt(TdhHDx(d4DlLlddXT|PdHHD8(dlll44dlP4dxdd@tL(dL@4dhdd0tLxd4X4l0dhhdxhdDlLldddLLlddD04l8ddPtlxd(hd8hdd(T|@hdDp4|ddP4Lxdp0dP@dhptXpd(p4X0dhhd8(d8pT(0dh8d8Xhd(XT(dhddxLd@XD@8dd@tlhd@ld0ddhTD8|hhPH8@ddtl||dH0Tx0ddLlLddhp480dhHdxhd4lL|DdhXD8xdhhDX(dh048pd4Ll|ddddl|LdXP4h0dlll4thhdhxtddP4LXdhXdxXdhpTX0hdXXtxddlLLddx0Th0ddTl||hlhhlHdd|Ll4dHDdXldhhDX(hpxh0HdhDDXLdXDDhLdlhDTpht8Xdxdhpt8phhHXX8dd(t|@dHl4xtddp4LXhxhXH8dhDDxldDXt|PdhTDX|d|0ttxdhdDXLdDLLLddd84|PdT84LpdlhDTphl8hlxdhXD8xdHpt8Pdlhd40ddHT|@dhxdX8dhlT84dh|T8dhlXHLxdhxDxXdT4lL|dlllttd@xd@xdhhDXHhtXXD8dh(d8(d4p4|8dd04lxdxPThpdhHD8Hhdhx4hdhl4xthl|pLDdhltX4dhP4XPdd0Tlxdl@tDhddP4lXd0xD0xdhHD8Hd@8D@xdh0T8Pd0XDpxddPtl8dP@DPPdhhDxhd804(pdd04L8hpxHphdhDdxLdppD0@dd@tl(d88dHXdh0txpdXhDhHdd@Tlhdx8DHXdh0tXPdxxdH8dhPT8Pd484LPdlhD4pdxxdHxdd|Lltdhptx0dhlTx4hp8HPhdhPt8pdt4lL|ddtl||dH0Tx0dhxd8xhl@H|pddLllDhldP||dhdD8ldXLTHTdlhDTpddllLddd04lxhhH8Xxdh|48DdP8d0XddLLldd|@44hdhhd8hd4x4L0dhltXthh4H8Ddh4DX|dD@Tlhdh0tXpd8|T(ddhtDX|dlhdTPdd@tLhdThTl@dh8D8xdT(TL@dd@Tl(d8Hd(hdhXdxxhtHXdhdd0tl8d|HDDPdd8T|PdH04xPdd@Tl(d|@4t(dd(4|@dHp4xpdhpt80dh0txpdhp48phdXxTxdhhDXHhtPH40dh0t8pd(pt80dd8T|pdlxdt@dhp48PdD0TLXdh0t8Pd|lldTdh|t8DhphHp8

ddTl||d4|L|4dhptX0d4dllLddxT|pdxXdH8dlhDtPhlpH|@dd|Lltdhptx0dhlTx4hp8HPhdhPt8pdt4lL|ddtl||dH0Tx0ddLLLDd8tdH|dhDD8LdtxtLpdhxD8Xhd8xtxdhPt8Pd(8DX8dhddxLd0xd08dd0Tlxdxdd(Lddh4|@dXpt(Pdh048pd0xd0xdhhDX(d8p4Hpdh0480d(8DX8dhL4x4d4PT|XddPTLXdPDd@Ldddl|ld(P4X0ddDL|lht88DXdhPtxpd((Dx(dh0tx0dxXd(8dhpT8Pd0xD0XdlhD4pdT0T|8dh04XPht0H40dlhDtpdpHDP(dhlTXtdPHdpHdhXDxXhpPH0pddDl|lhltp|Ldh04x0dtXTL0ddLLLDdLhdtpdhL4xtdHDdXLddxt|0d4X4l0dh(Dxhdx04h0ddllLDd0PD0@dhXDxxhdx848dhDDxldpXDpXdhPt8pdhltxTdd04lxhhH8Xxdh|48DdP8d0XddLLldd|@44hdhhd8hd4x4L0dhltXthh4H8Ddh4DX|dD@Tlhdh0tXpd8|T(ddhtDX|dlhdTPdhlTXtdTX4L0dd@Tlhhh8xXHdhPt80d|XdD@dhp4xphd4Ptldd|LL4dL|ltDdhPTx0d80T(pdhpt8pd|pTtXdhpTX0hhth8Ddhxd8xdphdP(dh8D88dp(DPhdhHD8(htxXdXdh8dXXdXpTH0ddx4|PdpXDPxdhXDXXdxxdhXdhlt8Td@xD@8dhP4XPdhltX4dd@tlHdhXDxxdhPtXPd(8Dxxdh0t8PhdpHd0dh(D8HdX(D(Hdd@tLhht|@4Tdd@4lHdttll|dd0tlXhh|xxldd@TLHdlHdTPdd|LL4dt@T|hddx4|PdlHdtPddTl||d88DH8dlhdTpd40t|xddht|@dpPDP@dhHDxHhxthHDdhddxldxtDH|dhltx4d8Dd(ldd|LLthp0H0Pdhl4x4d|0T4Xdd|ll4dt8tLPdd@4lhd|0TTXddPtLXd(8d8xdhPTxPdHxd8xdhHDX(ddLllddhp48Pd0@d0PdhptxpdX(DhHdd0TlXhtPHTPddh4|@dTDlLldhDDxLhp(hPxdhdD8ldXLTHTddPtLXdTh4L@dhLtxTdlpTd8dhPtXpdhLtX4ddPTlXdxxdhXdhhd8(d404|8dhTd8|dhL4Xtddp4l8d4X4LpdhL4Xtd@ldpDdddl|LdXP4h0dhpTX0htXxDxdhpt8pddLlLddhp4XPhp0H00dh4Dx|dlp4D8dhPtxpd((Dx(dh0tx0dxXd(8dhDDxlhlL0ltdhhDxHd@|d0TdhHdxhdL0tD8dhhD8hhl|pLdddxt|pd|hDd0ddPtLXhl@H|pdhxDXxd8DdhldlhdtphpphppdhpT8PdH8dxXdlhd40dtXtlPdhTd8|dXlthtdhTDX|dx|4HDddxT|pdHDd8ldhL4X4dhP4XpdhtDx|ddXt|Pdh|T8DdHhdxhddLLLDhlxHl8dh0tXPd|(ddPddDL|LdHhdxhdhp4x0dl8dT@ddXT|phdh8Thdh(DXhd0HDP(dddl|lhd(xT(dhXdXxdTxtl0dd|lLtd8|4hddd4l||dXLTh4dd04lxdP8DP8ddxT|0dXXdh8ddP4lxd0@DpPdh8dXxddp4lxdhLt8tdHTdx|dh4Dx|dxLTHtdhhd8hd@DDpldd04LXdXlT(tdhXdXxdhPT8pdh(DXHdP@dp0ddP4LXdXpThPdllL4td((D8(dh0tXpd|(ddpdh(DxhhdL@DDdhHDx(dxL4(tdhLtXtdl@4dHdhxd8xdTh4L@dhXDXXhhdH8Tdd8T|PdH04xPdlllT4hllpLtdhhDXHhxxXhhdhXDxXdPDd@Ldd0TlXdHLtX4ddDL|ldXLT(4dhPtXPdXXd(8dhpt8phdH8thddxT|pd(ptXpddP4LxdLXDT@dhpT80dLptDxddxt|pdP@Dp0dhptx0d|0T4XdlpTdxdxpt(PdhHD8(d4TlL|dhHDx(d@hD@(dd@tl(d88dHXdh(Dx(d4pT|xddPtl8dP@DPPdhhDxhd804(pdhHD8Hhdhx4hddP4lxhdhXt(dhxdX8dp@DppdlllT4dP0dp@dddl|ldH8DXXdllLT4dPXdp8dd@tLHdlPTd8ddtL||d8PtHpddHt|@hd|@d4dh(dX(hdhXT(dhpT80hdHX4(dlpTdxdtDlLlddxT|pd(ptXpddP4LxdLXDT@dhpT80dLptDxddxt|pdP@Dp0dhptx0d|0T4XdlpTdxdxpt(PdhHD8(d4TlL|dhHDx(d@hD@(dddL|lhtph40dhpTxPdlp4dXdhDDxldpxD08dh(dX(dHlTxTdd|ll4d40t|Xdh0480ht@hT@dhptXphdHxT(dh(D8Hd4PT|8dhpt8pd88dhXddDl|LhxdHHtddPtlXd|pt4Xdd0Tl8d0(D0hdhhd8hdppd0@ddPTlXd8P4hpdhlTx4d8dDhLdd@TLhhllplTddXT|0dH|4XDdh|4xDht8XD8ddptl8dH8d88dd|LLTdh(DXhddHt|@hpXhp(dhdDxLdDhT|@dhP4X0dXhDHhdh0T8Pd((dxhdhhDx(hdx8Txddp4LXd8xDH8dhPTXpdlPtD8dh(DxHd@8D@Xdhl48Td00Dp@dhLT8Tdp(d0(dhhd8(d404|8dhhdx(dx0T(pdd|lL4ddXt|Pdd0TlXhxdH(4ddllLDhhLXX|dhXDx8hl8hLxdhpT80dLPtDXdhptX0dPXd0XddP4lxd0@DpPdlptd8dl(dTPdhxDx8d(ptX0dhpT80htxxdXdhhDxhdXltHtddh4|@d@|dPTdhdDXLhpph0Pdhp48Pdt4lL|dh04xpdLpTD8dd@4lhdl8dt@ddhT|@dPxDp8dd04lXd40t|xdd0TLxdTdlLLddpTLXd|pTT8dd04lxhhH8XxdhddxlhddPT|dd04LXdlhd4pdh8d8xhh|8XLdhxd8xd(8d8xdhp48pd(8DX8dhhDXHd4dllLddx4|0d8PTH0ddPtlxd|P44XdlpTdxd(XDXXddpTlxdHltX4dhLTxtd|HDD0

The hash function is built so that higher order bits never influence lower order bits, therefore I can generate a collection of strings where all of the x lower order bits are zero, then I can try concatenated combinations of these strings to find some where more of the lower bits are zero etc. I'm pretty sure that there are more ways of breaking this, and also ways that produce significantly shorter strings, but this way I avoided doing a lot of maths.

aaaaaaaaaaaa

Posted 2015-05-31T14:45:51.143

Reputation: 4 365

Awesome! They both hash to 0x0000000a0000000a0000000a0000000a on my system, but that's still pretty amazing. (echo -ne '\x0a' |./hash also gives the same result.) – r3mainer – 2015-06-03T12:43:09.800

1@squeamishossifrage You have a rouge newline after each of the strings, without that it is plain zeroes. – aaaaaaaaaaaa – 2015-06-03T12:47:04.437

Oh yes, my mistake :-) – r3mainer – 2015-06-03T12:48:37.867

4

Python 3, 118 bytes

int(H("9"+"0"*400))

and

int(H("9"+"0"*4000))

(that is: 9E400 and 9E4000)

Both produce

83909358607540647658718900164058931893

Digging somewhat deeper, it seems that any integer followed by k repeated digits such that k > 128 and (k%4 == 0) will return the same hash. For example, H("1"+"1"*32*4) and H("1"+"1"*33*4) are both 13493430891393332689861502800964084413. Hmmm, 128...

tucuxi

Posted 2015-05-31T14:45:51.143

Reputation: 583

4

Python 2, 161 bytes, by Puzzled

340282366920938463463374607431768211456 (decimal)
100000000000000000000000000000000 (hexadecimal)

and

340282366920938468780317283222139437056 (decimal)
100000000000001203B66F94300000000 (hexadecimal)

Both have the output:

83F172CC3D050D131F64FD04B8181DC2

The numbers are 2^128 and 2^128 + (3*5*7*11*13*17)^2*19*2^32.

jimmy23013

Posted 2015-05-31T14:45:51.143

Reputation: 34 042

3

Java, 299 bytes by SuperJedi224

Pastebin for M. In binary, M has 65535 1s, followed by 2 0s.

Pastebin for N. In binary, N has 21845 1s, followed by 174766 0s.

Both yield 0.

Note that the basis of the algorithm is i.bitCount()*i.bitLength()+1 and ultimately, we take the result to the power of i and take it mod 2128. So the idea was just to find two i that are divisible by four, but where the first expression gives 232. That was easily done by factoring 232-1 and picking two factors for the count of 1s and total bit width of the number.

Edit: Actually, there's a bit more to why M yields zero, but we can easily find more numbers that yield zero because of my explanation by using other factors of 232-1 such that there are at least 64 zeroes at the end.

Martin Ender

Posted 2015-05-31T14:45:51.143

Reputation: 184 808

3

C, 134 bytes (by Barteks2x)

10

and

20

both hash to

0xa609779942cb9ef1

because the algorithm only hashes the last digit!

r3mainer

Posted 2015-05-31T14:45:51.143

Reputation: 19 135

That was a stupid mistake... – barteks2x – 2015-05-31T23:47:43.803

3

C, 87 bytes

$ echo B075343F9832CD60 | ./hash6_ ; echo
fc2e9f02bd284bd1
$ echo 5914BD1B71164C77 | ./hash6_ ; echo
fc2e9f02bd284bd1

Found using my collision bruteforcer.

Vi.

Posted 2015-05-31T14:45:51.143

Reputation: 2 644

Maybe it's because of only being 64-bit. – Vi. – 2015-06-02T14:54:56.593

Well done :-) How long did it take? – r3mainer – 2015-06-02T14:55:36.043

About 7 minutes of running the program. Now started again with measurements. – Vi. – 2015-06-02T15:01:34.050

1Found another collision: 473E0B6ED5AF2B92 7EC2BC9B5E9F5645 -> 0000000000000000 0EAC34C8A9F94389 after 3525078917 hash function calls and real 14m24.970s user 48m42.410s time. – Vi. – 2015-06-02T15:33:48.740

3

Python 2, 115 bytes, by squeamish ossifrage

1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111122222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222

and

2222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222211111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

The hash value had nothing to do with the order of blocks.

jimmy23013

Posted 2015-05-31T14:45:51.143

Reputation: 34 042

2

Python 2.x, 139 bytes by Puzzled

H(2)

and

H(128)

both return

16645614427504350476847004633262883518.

mathmandan

Posted 2015-05-31T14:45:51.143

Reputation: 943

2

C++, 239 bytes by SpelingMistake

Using the provided "main" program, the following two inputs produce the same hash:

echo -n "dog" | ./h
481c27f26cba06cf

and

echo -n "fog" | ./h
481c27f26cba06cf

The first 8 bytes of input are never processed, due to this bug in the code:

 for(I i=n;--i;) // and then we use q[i] for this iteration

because --i evaluates to false when i==1, q[0] (the first 8 bytes: I is an int64). Replacing the loop condition with for(I i=n;i--;) would have fixed this.

tucuxi

Posted 2015-05-31T14:45:51.143

Reputation: 583

Looks like the first 8 bytes of input are just ignored. – Vi. – 2015-06-03T12:15:05.943

Seems like a bug in the code; the fix is going suffix instead of prefix. – tucuxi – 2015-06-03T12:17:38.770

1There are non-bug collisions as well (see comments to the original question). – Vi. – 2015-06-03T12:26:58.927

2

Ruby, 90 Bytes, by MegaTom

4271974071841820164790043412339104229205409044713305539894083215644439451561281100045924173873152

and

23495857395130010906345238767865073260629749745923180469417457686044416983587046050252582956302336

which are 2 and 11 followed by 40 zero bytes. So they both have 41 bytes. The hash value is added by the input length for each byte, and then it is reversed in decimal. A input length ending with 1 can make sure the hash value will end with a 0 pretty quickly. Then reversing it reduces the length of the hash value by 1.

Both have the hash value 259.

jimmy23013

Posted 2015-05-31T14:45:51.143

Reputation: 34 042

2

C# - 393 bytes - by: Logan Dam

70776e65642062792031333337206861786f72 and 70776e65642062792031333337206861786f7200 both hash to 18E1C8E645F1BBD1.

aaaaaaaaaaaa

Posted 2015-05-31T14:45:51.143

Reputation: 4 365

Cool! Could you explain how you cracked it? And maybe the "faulty padding"? – ldam – 2015-06-05T06:52:38.017

@LoganDam It is all that code that switch around the input, you end up processing a multiple of 8 characters, and if the input length isn't a multiple of 8, you substitute zeroes. If I add in some zeroes in the right place they simply take the place of the padding, which was zeroes in the first place. – aaaaaaaaaaaa – 2015-06-05T08:11:49.857