Add two vectors modulo 5

4

Interpret a triple of 32-bit integers as a vector of 32 integers modulo 5. Write an addition routine for these vectors.

Example:

def add32((a0,a1,a2),(b0,b1,b2)):
    c0 = c1 = c2 = 0
    m = 1
    for i in range(32):
        a  = ((a0//m)&1) + 2*((a1//m)&1) + 4*((a2//m)&1)
        b  = ((b0//m)&1) + 2*((b1//m)&1) + 4*((b2//m)&1)
        c  = (a+b)%5
        c0+= m*(c&1)
        c1+= m*((c&2)//2)
        c2+= m*((c&4)//4)
        m += m
    return c0,c1,c2

Note:

You only need to be internally consistent, not consistent with my sample code. If you use a different representation of the integers mod 5, that's fine as long as you provide a translation table of the form:

0 = 000 or 101
1 = 001 or 110
2 = 010 or 111
3 = 011
4 = 100

Scoring:

Characters + number of arithmetic operations required. The sample code takes 331 characters and 34 operations per bit (note that I'm not counting the function call or iteration, nor interpreter overhead); hence gets a score of 331+1056=1686.

Clarification:

This is a trick called bitslicing. It's natural to pack integers mod 5 directly into words (spaces added for clarity:

0 1 2 3 4 -> 000 001 010 001 101

but you end up sacrificing a few bits at the end of each word. To keep your data aligned but pack the most in, store the data in slices

                 0 0 0 0 1 = a2
a = 0 1 2 3 4 -> 0 0 1 1 0 = a1
                 0 1 0 1 0 = a0

Example 2

A language-agnostic example was requested, so here it is. Suppose we use the representation

0 = 100
1 = 001
2 = 010 or 101
3 = 110
4 = 011

and 111 and 000 are meaningless. Then, the following sum is valid according to our representation:

                  1 0 0 1 1 0 1 0
0 1 2 2 3 4 0 2 = 0 0 1 0 1 1 0 1
                  0 1 0 1 0 1 0 0
+
                  0 1 0 0 1 0 1 1
1 2 1 2 0 1 0 3 = 0 0 0 1 0 0 0 1
                  1 1 1 0 0 1 0 0
---------------------------------
                  0 1 1 0 1 1 1 1 
1 3 3 4 3 0 0 0 = 0 1 1 1 1 0 0 0
                  1 0 0 1 0 0 0 0

Remark

As Keith Randall has demonstrated, this can be done with all bits in parallel using logical operators. We can view such a solution as a circuit with a certain number of gates. Perhaps I should call this "circuit golf". I chose 5 instead of 3 because it's too easy to brute-force (optimal is 6 gates/trit).

boothby

Posted 2011-07-04T04:59:47.083

Reputation: 9 038

5BORING!!! No more plain math, please. – Oleh Prypin – 2011-07-04T07:43:00.110

1Can you give some small samples, maybe using bytes instead of 32-bit ints. – user unknown – 2011-07-04T08:56:32.200

I count 34 ops inside the loop, not 32. By lines: 10 + 10 + 2 + 3 + 4 + 4 + 1 = 34. – Matthew Read – 2011-07-04T17:51:41.617

Can you please give a language-agnostic description of what the code is supposed to do, and some sample input/output, or do you only want answers in Python ? – Paul R – 2011-07-04T21:42:36.937

It is difficult to count the number of "ops" in different languages. If you take eg. Brainfuck, you need a lot of ops while in another language it might be only a few. That said, some languages provide more powerful ops than others, so the rating is a bit biased. – FUZxxl – 2011-07-04T21:50:27.170

@Matthew Read: Thanks, fixed. @FUZxxl: Nothing's perfect, nothing's fair. I'm leaving that up to the audience -- but in the case of brainfuck, I'd say every increment/decrement visited in the worst case is the total cost. – boothby – 2011-07-05T05:47:33.143

Answers

5

Python, score=215 166 (167 118 chars, at most 48 ops)

def A(x,y,z,p,q,r):
 u,c=x^p,x&p;v,d=y^q^c,y&q|y&c|q&c;w,e=z^r^d,z&r|z&d|r&d
 if e:u,v,w=A(e,e,0,u,v,w)
 return u,v,w

Does addition in bit-parallel. The c,d,e are the 32 carries from one position to the next. e represents multiples of 8 that carry out of the vector, so if e has any bits set in it, we add 8%5=3 back in using a recursive call.

The translation table is the same as the one given in the question.

Keith Randall

Posted 2011-07-04T04:59:47.083

Reputation: 19 865

4You can save 43 chars by using single-digit names. And another 3 by concatenating assignments to d0,c0 and d1,c1 and d2,c2 by commas. – Howard – 2011-07-04T17:59:26.653