Implement bit-wise matrix multiplication

7

The MMIX architecture is a fairly simple big-endian RISC design. It has a couple of interesting features, one of them is the MXOR instruction, which is what you should implement.

The MXOR instruction interprets its arguments (two 64-bit registers) as two 8×8 matrices of bits and performs a matrix multiplication where exclusive-or is used for addition and logical and is used for multiplication.

An argument in binary representation where the leftmost digit is most significant

a0 a1 a2 … a63

is interpreted as a matrix like this:

a0 a1 … a7
a8 a9 … a15

a56 a57 … a63

Your task is to write a function, subroutine or similar in a language of choice which can be evoked from a single-letter name (i.e. f(x,y)) that implements this function. The two arguments of the function shall be of equal implementation-defined unsigned 64 bit integer type, the function shall return a value of the same type.

he winner of this competition is the shortest submission (measured in bytes or characters if the language you use uses a meaningful character encoding). You are encouraged to post a readable commented version of your code in addition to the golfed solution so other's can understand how it works.

FUZxxl

Posted 2014-08-20T19:05:11.567

Reputation: 9 656

3Some test cases would be nice. – Peter Taylor – 2014-08-20T19:27:25.653

1Do the input and outputs need to be numbers that correspond to the bitstrings, or can we use lists of bits? – xnor – 2014-08-20T19:44:12.823

3Am I allowed to use the language's matrix multiplication (if present in the standard lib)? – Howard – 2014-08-20T20:19:05.360

2Put another way, matrix multiplication where the elements of the matrix are in GF(2). – hobbs – 2014-08-20T21:53:51.330

@Howard: why not? – FUZxxl – 2014-08-20T22:13:12.443

@FireFly Please do not alter the specification without asking and without understanding what it's intent is. – FUZxxl – 2014-08-20T22:43:02.650

Answers

7

APL, 28 characters

f←{2⊥,↑≠.×/{8 8⍴(64⍴2)⊤⍵}¨⍵}

Usage:

> f 12345678 87654321
3977539

You can read it almost literally if you go right-to-left. (64⍴2)⊤ transforms each argument into 64 bit representation and 8 8⍴ makes an 8x8 matrix out of the results. ≠.×/ then applies the inner product to both arguments, where multiplication × and xor is used. The matrix is then transformed back to vector form ,↑, before 2⊥ undoes the basis transformation.

Howard

Posted 2014-08-20T19:05:11.567

Reputation: 23 109

Save 6 bytes by converting to tradfn, swapping 's arguments using , and flattening with . Also, you don't need to count the naming of your function in your code length. Try it online!

– Adám – 2017-08-09T12:48:21.580

7

x86 machine code, 45 44 40 bytes

  40072c:       b1 08                   mov    $0x8,%cl
  40072e:       48 c1 c7 08             rol    $0x8,%rdi
  400732:       b5 08                   mov    $0x8,%ch
  400734:       48 0f 6e c6             movq   %rsi,%mm0
  400738:       0f d7 d0                pmovmskb %mm0,%edx
  40073b:       40 20 fa                and    %dil,%dl
  40073e:       7a 01                   jp     400741 <amul+0x15>
  400740:       f9                      stc    
  400741:       48 11 c0                adc    %rax,%rax
  400744:       48 d1 c6                rol    %rsi
  400747:       fe cd                   dec    %ch
  400749:       75 e9                   jne    400734 <amul+0x8>
  40074b:       48 c1 ce 08             ror    $0x8,%rsi
  40074f:       fe c9                   dec    %cl
  400751:       75 db                   jne    40072e <amul+0x2>
  400753:       c3                      retq   

The code uses two tricky moves. The first is the pmovmskb intsruction which takes every 8th bit of an mmx register and packs them together into an integer register. Very useful for extracting columns. The second is the parity flag which is used to accumulate the dot product (the and/jp/stc combo).

Here's the expanded code with comments:

    # rowloop
    mov $8, %cl
1:
    rol $8, %rdi    # get next row of A into low 8 bits

    # columnloop
    mov $8, %ch
2:
    # extract first column
    movq    %rsi, %mm0
    pmovmskb    %mm0, %edx

    # dot with current row
    testb   %dil, %dl

    # shift result and add parity bit
    jp  3f
    stc
3:
    adc %rax, %rax

    # shift over columns of B
    rol %rsi

    dec %ch
    jne 2b

    # restore B
    ror $8, %rsi

    dec %cl
    jne 1b
    ret

Keith Randall

Posted 2014-08-20T19:05:11.567

Reputation: 19 865

Looks cool! Very neat solution. – FUZxxl – 2014-08-21T21:29:52.277

The use of the parity bit is really ingenious though. Makes the solution so much easier. – FUZxxl – 2014-08-22T04:15:22.770

BTW, you can replace xor %rax,%rax by xor %eax,%eax for one byte. The high byte is zeroed out automatically. sub %eax,%eax might be even shorter. – FUZxxl – 2014-08-22T04:16:33.633

You might also want to replace setnp %dl; lea (%rdx,%rax,2),%rax by jnp 1f; stc; 1:adc %rax,%rax if that improves the size (should bring one byte). – FUZxxl – 2014-08-22T04:21:59.480

Oh yes, you can also replace dec %cl;jne 1b by loop 1b since %ch will always be zero when encountering this instruction. – FUZxxl – 2014-08-22T04:33:05.603

And finally, you don't need to initialize %rax at all—we left shift it's contents 64 times anyway so all the old contents are cleared out. – FUZxxl – 2014-08-22T04:35:29.063

Here is an updated version of the code: http://fuz.su/~fuz/src/golf/dotproduct.s

– FUZxxl – 2014-08-22T04:43:34.780

@FUZxxl: I incorporated your suggestions. The only one I didn't do is the loop suggestion - it doesn't quite work. If rcx contains high bits on entry, we loop way too many times. It turns out the result is always correct because we recompute it over and over, but if bit 63 of rcx is set, termination takes a really long time. – Keith Randall – 2014-08-22T18:11:06.057

Shouldn't mov $8,%cl clear the upper 32 bits of %rcx? Looping for up to about 2³² extra cycles isn't too bad—should be an extra 20 seconds top. – FUZxxl – 2014-08-22T19:03:20.373

@FUZxxl: no, it doesn't. – Keith Randall – 2014-08-23T03:20:27.847

5

GolfScript (49 48 45 chars)

This takes a completely different approach to my earlier GolfScript entry, so I think it wants a separate answer.

{0{1$M&16.?M/:K*3$K&M*&^@2/@M)/@}255:M*++}:f;

It's pure bit-bashing.

If we consider a as a series of matrix elements we get

a_00 a_01 a_02 a_03 a_04 a_05 a_06 a_07 a_10 a_11 ...

Similarly b. Now, c_ij = a_i0 & b_0j ^ a_i1 & b1j ^ ... ^ a_i7 & b7j. So by grouping over the common index we get

c =   (a_00 a_00 ... a_00 a_00 a_10 a_10 ...) & (b_00 b_01 b_02 ... b_07 b_00 b_01 ...)
    ^ (a_01 a_01 ... a_01 a_01 a_11 a_11 ...) & (b_10 b_11 b_12 ... b_17 b_10 b_11 ...)
    ^ ...
  =   (((a >> 7) & 0x0101010101010101) * 0xff) & (((b >> 56) & 0xff) * 0x0101010101010101)
    ^ (((a >> 6) & 0x0101010101010101) * 0xff) & (((b >> 48) & 0xff) * 0x0101010101010101)
    ^ ...

That's enough background to be able to explain the code:

{                 # Function declaration boilerplate; stack is a b
  0               # Accumulator for the result
  {               # Loop: stack holds a>>k b>>8k partial-sum
    1$M&          #   M holds 255 (see below); get (b>>8k) & 0xff
    16.?M/:K      #   Derive 0x0101010101010101 and store in K
    *             #   Multiply: stack is a>>k b>>8k partial-sum rhs-of-and
    3$K&M*        #   Similarly for a: add lhs-of-and to the stack
    &^            #   Do the pointwise multiplication and addition
    @2/@M)/@      #   Restore loop invariant with a>>=1 and b>>=8
  }255:M*         # Store 255 in M and execute the loop that many times
                  # 8 times suffices to get the correct result; after that b>>8k == 0 and the
                  # partial sum doesn't change. After 64 iterations, a>>k == 0
  ++              # So we can ditch their remnants by adding to the partial sum
}:f;              # Function declaration boilerplate

Thanks to Dennis for the observation that 16.? is equal to 2 64? (or 2.6??); since it's one char shorter, I've stolen it. Thanks to Howard for suggesting a 2-char improvement which became 48->45.

Peter Taylor

Posted 2014-08-20T19:05:11.567

Reputation: 41 901

1Now that's an interesting approach. Nice bit magic! – Dennis – 2014-08-21T15:35:55.217

1You can replace ;; with * since b>>64 is always zero. – Howard – 2014-08-21T15:38:53.197

If you directly calculate the result on the stack (instead of collecting terms in an array) you may even save one more char: {0{1$255:M&16.?M/:K*3$K&M*&^@2/@M)/@}8*+\;}:f;. – Howard – 2014-08-21T15:57:19.017

@Howard, good points both. It's even possible to save one more by shifting a far enough right to become zero. – Peter Taylor – 2014-08-22T07:29:07.790

4

CJam, 35 bytes

{2{GG#+2b1>8/\}*zm*{z{:&}%:^}%2b}:F

Try it online by pasting the following Code:

{2{GG#+2b1>8/\}*zm*{z{:&}%:^}%2b}:F;  " Define function and clear stack.              ";
12345678 87654321 F                   " Call function.                                ";

How it works

2{      " Do the following twice:                                                     ";
  GG#+  " Add 16 ** 16 = 2 ** 64 to the topmost integer on the stack.                 ";
  2b    " Convert it into an array of its digits in base two.                         ";
  1>    " Discard the first digit.                                                    ";
  8/    " Split into arrays of 8 digits.                                              ";
  \     " Swap the resulting array with the second topmost stack element.             ";
}*      " This converts two 64-bit integers into 8×8 matrices of its base 2 digits.   ";
zm*     " Transpose the second matrix and compute the cartesian product of both.      ";
{       " For each pair (one row of the first and one column of the second matrix):   ";  
  z     " Transpose the array.                                                        ";
  {:&}% " For each array of the array, compute the logical AND of its elements.       ";
  :^    " Compute the logical XOR of the results.                                     ";
}%      " Note that ^ and & are addition and multiplication in GF(2).                 ";
2b      " Convert the array of base 2 digits into and integer.                        ";

Dennis

Posted 2014-08-20T19:05:11.567

Reputation: 196 637

8 ** 8 = 2 ** 24, surely? – Peter Taylor – 2014-08-21T15:14:54.773

Of course... G is 16, so it's a wrong comment for correct code. Thanks! – Dennis – 2014-08-21T15:15:49.127

Nice trick, though. If I steal it, I can save 1 char. – Peter Taylor – 2014-08-21T15:17:29.223

3

GolfScript (65 chars)

{[{2.6??+2base(;8/\}2*zip 8*\{{.}7*}%]zip{zip{~&}%{^}*}%2base}:f;

This turns out to be similar to BeetDemGuise's Python solution, although it uses more zips and less eval.

Dissection:

{
  # Stack: a b (as integers)
  [                  # Start gathering into array for subsequent zip
    {                # Repeat twice:
      2.6??+2base(;  # Convert to 64 bits
      8/             # Break into an 8x8 matrix
      \              # Flip top of stack
    }2*
    # Stack: a b (as matrices)
    # Each cell of the matrix product needs a row of a and a column of b
    zip 8*           # Get 8 copies of columns of b, ordered
                     #   [col_0 col_1 ... col_7 col_0 col_1 ...]
    \{{.}7*}%        # Get 8 copies of rows of a, ordered
                     #   [row_0 row_0 row_0 ... row_7 row_7]
  ]                  # Gather into array
  zip                # Zip: each array entry is now an array [col_j row_i]
  {                  # Map each array item
    zip              # Pair up each bit from col_j with corresponding bit of row_i
    {~&}%            # Multiply the bits in each pair
    {^}*             # Add the products
  }%

  2base              # Convert back from an array of 64 bits to an integer
}:f;                 # Assign to f and don't leave a copy lying about on the stack

Peter Taylor

Posted 2014-08-20T19:05:11.567

Reputation: 41 901

3

J, 50 chars

Here is a version that does not use J's built-in general matrix product conjunction ..

   f=:#.@,@:(|."1)@(~:/@:*"2 1"1 2|.)&(8 8$]#:~64#2:)

Testing (with 4×4 matrices)

   g=:#.@,@:(|."1)@(~:/@:*"2 1"1 2|.)&(4 4$]#:~16#2:)
   44584 g 28371
46550

And the matrices visualized, for manual verification.

   (4 4$]#:~16#2:)&.> 44584;28371;46550
┌───────┬───────┬───────┐
│1 0 1 0│0 1 1 0│1 0 1 1│
│1 1 1 0│1 1 1 0│0 1 0 1│
│0 0 1 0│1 1 0 1│1 1 0 1│
│1 0 0 0│0 0 1 1│0 1 1 0│
└───────┴───────┴───────┘

Exploded view

                                  &(             ) to each of the arguments,
                                            64#2:    generate 64 2's (radix for below)
                                        ]#:~         decode from radix
                                    8 8$             reshape as 8×8 array
                (              |.)                 reverse the right argument
                       "2 1"1 2                    for each pair of rows,
                      *                              multiply
                    @:                               composed with
                 ~:/                                 xor folded
         (|."1)@                                   reverse each row
      ,@:                                          ravel (flatten array)
   #.@                                             back to binary again
f=:                                                assign function to name 'f'

FireFly

Posted 2014-08-20T19:05:11.567

Reputation: 7 107

Instead of 64#2:, isn't there a shorter construct with 64$2? – FUZxxl – 2014-08-21T11:08:48.530

@FUZxxl no, we can substitute # with $ (they do the same in this case), but we need the 2 to be a verb for the whole paren to be a train. 2: is a constant verb returning 2. – FireFly – 2014-08-21T11:40:05.117

A, right. That makes sense. – FUZxxl – 2014-08-21T14:01:42.803

2

Python 2 - 183

def m(a,b):a,b=map(lambda x:'{:064b}'.format(x),[a,b]);r=range(8);return int(''.join(`eval('^'.join(o+'&'+t for o,t in zip(a[8*i:8+8*i],v)))`for i in r for v in[b[j::8]for j in r]),2)

This is the first thing that came to my mind and so it can mostly likely be golfed down quite a bit more. In any case, its pretty straight-forward string manipulation:

def m(a,b):
    # Convert the integer inputs to their 64bit binary versions.
    a,b=map(lambda x:'{:064b}'.format(x),[a,b])

    n=''
    # For all rows in matrix a...
    for row in [a[8*i:8+8*i] for i in range(8)]:
        # For all columns in matrix b...
        for col in [b[j::8] for j in range(8)]:
            # Get each pair.
            pairs = zip(row, col)

            # Multiply each pair.
            mul_pairs = ['&'.join(z) for z in pairs]

            # XOR them all together.
            result = '^'.join(mul_pairs) 

            # Append the result to n.
            n+=`eval(result)`

    return int(n,2) 

BeetDemGuise

Posted 2014-08-20T19:05:11.567

Reputation: 442

2

J (34)

This should work I think. It would be nice if you gave some test cases...

f=:~:/ .*&.(8 8&$@(_64&{.) :.,@#:)

This works with the test cases supplied in the comments (when changing 8 8 to 4 4 and changing 64 to 16)

Explanation

The whole function can be simplified to the following:

F&.G

Visual representation:

Under

What this means: G is applied to both arguments. Then F is applied with G x as its left argument, and G y as its right argument. Then, the inverse of G is applied to the result of F. "Inverse?", you might ask. With inverse I mean a function applied _1 times, so for example the inverse of #: (dec -> bin vector) is #:^:_1 (or #.) (bin vector -> dec).

So, applying this theory, this would be G:

  • #: takes the binary form of the argument
  • _64&{. takes the last 64 items of the list. When there aren't 64 or more items, it will be padded with 0's
  • 8 8&$ shapes it to an 8*8 matrix
  • :., sets the reverse of (4 4&$@(_16&{.) to be just ,: ravel

and this would be F:

  • ~:/ .* is matrix multiplication

ɐɔıʇǝɥʇuʎs

Posted 2014-08-20T19:05:11.567

Reputation: 4 449

I think you can shorten this using 8 8&$ :.,@(64#2)&#: for the right-hand side. – FUZxxl – 2015-02-24T14:10:55.300

Nope, I have a longer version using . that seems to work though. You need to return the corresponding number of the matrix, and #: will have issues with leading zero bits. Also, the space before the . is significant (to disambiguate from /.). Oh, and you want ~: (xor) in place of + (or take the result modulo 2 manually). – FireFly – 2014-08-20T21:22:40.197

@FireFly Made some changes, would that work? – ɐɔıʇǝɥʇuʎs – 2014-08-20T21:26:20.577

1Try these for testcases (4×4 arrays, so change the 8 8 correspondingly): 44584 × 33825 = 44584 (33825 is identity); 44584 × 28371 = 46550. – FireFly – 2014-08-20T21:27:39.410

What I have is currently f=:#.@,@(~:/ .*)&(8 8$]#:~64#2:), but it's so similar to yours that it'd be weird to post it as its own answer. Maybe you can work off of that. – FireFly – 2014-08-20T21:53:21.777

It would be nice if you could add some text that explains how it works. – FUZxxl – 2014-08-20T22:43:28.313

@FUZxxl Done!​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​ – ɐɔıʇǝɥʇuʎs – 2014-08-21T09:41:16.227