Convolve integers in subquadratic time


A linear discrete convolution is an operation that turns two vectors of numbers into a third vector of numbers: each output element is the sum of the products of all pairs of input elements whose indices add up to the output index. Formally, for two vectors a and b of length n, indexed from 0 to n - 1, the discrete linear convolution of a and b can be computed with this loop:

for i = 0 to 2*n - 2
    c[i] = 0;
    for j = 0 to n - 1
        if i - j >= 0 and i - j < n
            c[i] = c[i] + a[j] * b[i - j];

As an example, the convolution of a = { 1, 2, 3, 4 } and b = { 5, 6, 7, 8 } is c = { 5, 16, 34, 60, 61, 52, 32 }. These convolutions appear when doing long multiplication, for example:

1234 * 5678 =
         20 24 28 32
      15 18 21 24
   10 12 14 16
 5  6  7  8
--------------------
 5 16 34 60 61 52 32
--------------------
      32
     52
    61
   60
  34
 16
 5
--------------------
 7006652
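For reference, here is a direct quadratic-time Python sketch of the pseudocode above (my illustration, useful for checking results, but not a valid entry since it runs in Θ(n²)):

def convolve_reference(a, b):
    # Direct translation of the pseudocode: quadratic time.
    n = len(a)
    c = [0] * (2 * n - 1)
    for i in range(2 * n - 1):
        for j in range(n):
            if 0 <= i - j < n:
                c[i] += a[j] * b[i - j]
    return c

print(convolve_reference([1, 2, 3, 4], [5, 6, 7, 8]))  # [5, 16, 34, 60, 61, 52, 32]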

Your task is to write a program or function that, given two arrays (or similar) a and b of non-negative integers of equal length n, and optionally also n and an output array c, computes the linear discrete convolution of a and b and either returns it, assigns it to the parameter c, or prints it. You may also take input from the user while or before your code is running. The following constraints apply:

  • Your program must run in subquadratic, i.e. o(n²), time. An algorithm like the pseudocode above, which runs in quadratic time Θ(n²), is invalid.
  • You may assume that all integers in the input and output are in the range from 0 to 65535; this also applies to n.
  • You may not claim that your algorithm runs in subquadratic time because n has an upper bound.
  • The results must be exact.
  • This is code golf; the shortest code in octets wins.
  • You may not use existing library routines or similar to compute a Fourier transform or a number theoretic transform or the respective inverse transformations.

FUZxxl

Posted 2015-12-21T20:05:12.480

Reputation: 9 656

@LegionMammal978 No, O(n²) time allows for quadratic-time algorithms. o(n²) doesn't, that's the difference between big-O and little-O (think of big-O being ≤ and little-o being <). – FUZxxl – 2015-12-21T20:14:07.567

@LegionMammal978 No, that's Ω vs. Θ vs. O. There is also ω and o. – FUZxxl – 2015-12-21T20:17:31.547

@LegionMammal978 Why do you delete your comments? They are useful for future readers. – FUZxxl – 2015-12-21T20:17:59.103

Comments are not for discussions. For the record, I was confused about the differences between big- and little-O complexity. – LegionMammal978 – 2015-12-21T20:20:05.777

@LegionMammal978 Maybe other users are confused, too. Your comments help them and answer questions they might have asked again. – FUZxxl – 2015-12-21T20:22:37.867

@PeterTaylor Reworded. – FUZxxl – 2015-12-21T21:23:48.633

@FUZxxl If you think other users might stumble over the same confusion, then it's something you should address in the challenge. Comments aren't meant to be permanent, and people shouldn't be expected to read them. – Martin Ender – 2015-12-21T22:51:09.727

So, let me get this straight: looping through each digit in each of the two integers qualifies for Θ(n²), but not o(n²), which makes it invalid? – ETHproductions – 2015-12-21T23:15:29.663

@ETHproductions yes. – FUZxxl – 2015-12-21T23:33:44.010

@FUZxxl I think you should say "discrete convolution", not "linear discrete convolution". A convolution is always a linear operation, AFAIK – Luis Mendo – 2015-12-22T00:47:47.310

@LuisMendo There are also cyclic and negacyclic convolutions, so I felt the need to clarify that the convolution isn't cyclic. Perhaps I should have said “acyclic?” – FUZxxl – 2015-12-22T00:48:35.170

@FUZxxl Oh, you mean "linear" as opposed to "circular". I always call that just "convolution". But you're right, now I see in which sense you said linear – Luis Mendo – 2015-12-22T00:49:29.817

I'll give a 200-point bounty for any TI-BASIC solution, with no time limit. – lirtosiast – 2016-01-03T04:39:49.813

Answers


J, 79 bytes, O(n log n)

f=:_2&(0((+,-)]%_1^i.@#%#)&f/@|:]\)~1<#
<:@+&#$9(o.f%#)[:+@*/;f@{.&>~2^#@#:@+&#

Uses the convolution theorem, which states that the Fourier transform of the convolution of two sequences equals the elementwise product of the Fourier transforms of the individual sequences.

Fourier( Convolve(a, b) ) = Fourier(a) * Fourier(b)
Convolve(a, b) = InverseFourier( Fourier(a) * Fourier(b) )

Also uses the fact that the inverse Fourier transform can be applied by using the Fourier transform of the conjugate.

InverseFourier(a) = Conjugate( Fourier( Conjugate(a) ) ) / Length(a)

I used the FFT implementation from a previous solution. That implementation uses the Cooley-Tukey algorithm for sequences whose length is a power of 2. Therefore, I have to zero-pad the input sequences to a power of 2 (the smallest valid one) that is greater than or equal to the sum of the lengths of the input sequences.
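For readers unfamiliar with J, roughly the same plan can be sketched in Python (this sketch and the names fft and convolve are mine, not part of the answer): zero-pad to a power of two, run a radix-2 Cooley-Tukey FFT, multiply pointwise, and invert with the conjugate trick.

import cmath

def fft(x):
    # Radix-2 Cooley-Tukey; len(x) must be a power of 2.
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out

def convolve(a, b):
    # Pad to the smallest power of 2 that can hold the full result.
    n = 1
    while n < len(a) + len(b):
        n *= 2
    fa = fft(a + [0] * (n - len(a)))
    fb = fft(b + [0] * (n - len(b)))
    prod = [x * y for x, y in zip(fa, fb)]
    # Inverse FFT via the conjugate trick: IFFT(v) = conj(FFT(conj(v))) / len(v).
    inv = [z.conjugate() / n for z in fft([z.conjugate() for z in prod])]
    return [int(round(z.real)) for z in inv][:len(a) + len(b) - 1]

print(convolve([1, 2, 3, 4], [5, 6, 7, 8]))  # [5, 16, 34, 60, 61, 52, 32]

With double-precision floats this only stays exact while the rounding error is below 0.5, so it is a sketch of the idea rather than a robust entry.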

Usage

   f =: _2&(0((+,-)]%_1^i.@#%#)&f/@|:]\)~1<#
   g =: <:@+&#$9(o.f%#)[:+@*/;f@{.&>~2^#@#:@+&#
   1 2 3 4 g 5 6 7 8
5 16 34 60 61 52 32
   2 4 5 6 1 g 1 2 4 7
2 8 21 46 61 61 46 7

Try it online!

Explanation

An explanation for the FFT portion in J is included in my previous solution to a different challenge.

<:@+&#$9(o.f%#)[:+@*/;f@{.&>~2^#@#:@+&#  Input: A on LHS, B on RHS
                                      #  Get the lengths of A and B
                                    +&   Sum them
                                 #:@     Get list of binary digits
                               #@        Length
                             2^           Raise 2 to that power, call it N
                     ;                    Link A and B as a pair of boxed arrays
                          &>~             For each box
                        {.                  Take N values, extending with zeros
                      f@                    Apply FFT to it
               [:  */                    Reduce using multiplication
                 +@                      Conjugate
           f                             Apply FFT
            %                            Divide by
             #                           Length
       9 o.                              Take the real part
     #                                   Get the lengths of A and B
   +&                                    Sum them
<:@                                      Decrement
      $                                  Shape to that length and return

miles

Posted 2015-12-21T20:05:12.480

Reputation: 15 654

Maybe you can avoid having a name for f using &. or such? Not sure though. – FUZxxl – 2016-06-19T08:58:58.933

@FUZxxl It'd be amazing if J could figure out the Fourier inverse from that. It might be possible by reorganizing the tacit FFT but it'd probably cost more than just using conjugate and such. But this is barely golfed, other than the FFT part. Using f&.+ doesn't work since + is rank 0 which means that f gets applied to each value individually. But there's probably a way. – miles – 2016-06-19T09:37:08.607

@FUZxxl I've got (9 o.[:(f&.(+"1)%#)*&f) as a possible version but it ends up being 1 byte longer. – miles – 2016-06-19T09:51:07.680

f&.:+ would probably work. – FUZxxl – 2016-06-19T10:02:37.320

@FUZxxl I just had a couple ideas from that, and didn't even know that : (whole) extends not only @ but also & and &.. Well, I probably read it and have long forgotten. – miles – 2016-06-19T10:10:14.667


Python 2, 250 bytes, O(n^log2(3))

s=lambda a,b,S=1:[x+y*S for x,y in zip(a,b)]
def f(a,A):
 n=len(a);m=n/2
 if m<1:return[a[0]*A[0]]
 if n%2:return f([0]+a,[0]+A)[2:]
 b,c=a[:m],a[m:];B,C=A[:m],A[m:];p=f(b,B);r=f(c,C);z=m*[0];return s(z+z+r,s(z+s(f(s(b,c),s(B,C)),s(p,r),-1)+z,p+z+z))

This isn't well golfed; it's more a proof of concept of the algorithm. Let's look at a more readable version.

s=lambda a,b,S=1:[x+y*S for x,y in zip(a,b)]

def f(a,A):
 n=len(a)
 if n==1:return[a[0]*A[0]]
 if n%2:return f([0]+a,[0]+A)[2:]
 m=n/2
 b,c=a[:m],a[m:]
 B,C=A[:m],A[m:]
 p=f(b,B)
 r=f(c,C)
 q=s(f(s(b,c),s(B,C)),s(p,r),-1)
 z=m*[0]
 return s(z+z+r,s(z+q+z,p+z+z))

The idea is to divide and conquer, adapting the Karatsuba multiplication algorithm to do convolution. As with Karatsuba, it runs in time O(n^log2(3)): it makes 3 recursive calls on inputs of half the length, so the recurrence T(n) = 3T(n/2) + O(n) solves to Θ(n^log2(3)) ≈ Θ(n^1.585). This isn't as good as the quasilinear time of the DFT approach, but still better than quadratic.

We can think of convolution as multiplication of formal polynomials. Alternatively, if long multiplication were done without carrying, you'd get convolution. So, adapting the algorithm only requires doing vector addition and subtraction in place of the numerical additions and subtractions. These operations are still linear-time.

Unfortunately, Python doesn't have built-in vector addition and subtraction, so we hack together a function s that does both: addition by default, and subtraction when passed -1 as the optional third argument. Whenever we add, we make sure to pad with zeroes to avoid truncation.
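For illustration (not part of the original answer), here is how s behaves; note that zip silently truncates to the shorter operand, which is why the zero-padding matters:

s = lambda a, b, S=1: [x + y * S for x, y in zip(a, b)]

print(s([1, 2], [3, 4]))      # [4, 6]   elementwise addition
print(s([1, 2], [3, 4], -1))  # [-2, -2] elementwise subtraction
print(s([1, 2, 3], [4, 5]))   # [5, 7]   truncated to the shorter list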

We handle odd list lengths by padding with a zero to an even length, removing the extraneous zeroes that result.
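As a quick check (again my addition; run it under Python 2, since the code relies on n/2 being integer division), the readable version reproduces the challenge's worked example:

print(f([1, 2, 3, 4], [5, 6, 7, 8]))  # [5, 16, 34, 60, 61, 52, 32]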

xnor

Posted 2015-12-21T20:05:12.480

Reputation: 115 687


Mathematica, 214 bytes, O(n log n)

f@{a_}:={a}
f@a_:=#~Join~#+Join[#2,#2]Exp[2.Most@Range[0,1,1/Length@a]Pi I]&@@f/@Thread@Partition[a,2]
(l=Length@#;RotateLeft[Round[#/Length@#]][[-1;;(1-2l);;-1]]&@f[Times@@(f@PadRight[#,2^Floor[2+Log2@l]]&/@{##})])&

Function f performs the FFT and is called several times. No built-in FFT is used here.

%[{1,2,3,4},{5,6,7,8}]
(* {5,16,34,60,61,52,32} *)

njpipeorgan

Posted 2015-12-21T20:05:12.480

Reputation: 2 992

You are not allowed to use builtin FFT functionality. – FUZxxl – 2016-01-03T02:31:50.817