Implement the Max-Pooling operation from Convolutional Neural Networks

27

5

In convolutional neural networks, one of the main types of layers usually implemented is called the Pooling Layer. Sometimes, the input image is big (and therefore time consuming especially if you have a big input set) or there is sparse data. In these cases, the objective of the Pooling Layers is to reduce the spatial dimension of the input matrix (even though you would be sacrificing some information that could be gathered from it).

A Max-Pooling Layer slides a window of a given size \$k\$ over the input matrix with a given stride \$s\$ and get the max value in the scanned submatrix. An example of a max-pooling operation is shown below:

enter image description here

In the example above, we have an input matrix of dimension 4 x 4, a window of size \$k=2\$ and a stride of \$s=2\$.

Task

Given the stride and the input matrix, output the resulting matrix.

Specs

  • Both the input matrix and the window will always be square.
  • The stride and the window size will always be equal, so \$s=k\$.
  • The stride is the same for the horizontal and the vertical direction
  • The stride \$s\$ will always be a nonzero natural number that divides the dimension of the input matrix. This guarantees that all values are scanned by the window exactly once.
  • Input is flexible, read it however you see fit for you.
  • Standard loopholes are not allowed.

Test Cases

Format: 
s , input, output

2, [[2, 9, 3, 8], [0, 1, 5, 5], [5, 7, 2, 6], [8, 8, 3, 6]] --> [[9,8], [8,6]]
1, [[1, 2, 3], [4, 5, 6], [7, 8, 9]] --> [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
4, [[12, 20, 30, 0], [8, 12, 2, 0], [34, 70, 37, 4], [112, 100, 25, 12]] --> [[112]]
3, [[0, 1, 2, 3, 4, 5, 6, 7, 8], [9, 10, 11, 12, 13, 14, 15, 16, 17], [18, 19, 20, 21, 22, 23, 24, 25, 26], [27, 28, 29, 30, 31, 32, 33, 34, 35], [36, 37, 38, 39, 40, 41, 42, 43, 44], [45, 46, 47, 48, 49, 50, 51, 52, 53], [54, 55, 56, 57, 58, 59, 60, 61, 62], [63, 64, 65, 66, 67, 68, 69, 70, 71], [72, 73, 74, 75, 76, 77, 78, 79, 80]] --> [[20, 23, 26], [47, 50, 53], [74, 77, 80]]

This is , so shortest answers in bytes wins!

ihavenoidea

Posted 2019-11-05T15:06:20.280

Reputation: 521

Sandbox link (now deleted).

– ihavenoidea – 2019-11-05T15:08:16.063

2would have been a more interesting challenge without s==k imho – Sparr – 2019-11-05T20:09:56.203

Can we do my homework next? :P – jaaq – 2019-11-08T08:56:59.317

Answers

9

Python 3, 101 108 bytes

def f(m,w):r=range(0,len(m),w);return[[max(sum([x[i:i+w]for x in m[j:j+w]],[]))for i in r]for j in r]

Thanks to Jo King and Jonathan Allan for some more bytes off!

Try it online!

Alternate (longer) version as a single Lambda expression:

Try it online!

JPeroutek

Posted 2019-11-05T15:06:20.280

Reputation: 734

Very nice. -4 bytes - no need for list of unpacked range, also one extra white-space: TIO

– Jonathan Allan – 2019-11-05T20:00:19.277

You can also put it all one one line for a further -3 bytes

– Jo King – 2019-11-05T22:30:13.363

Note: in the lambda version it is better to just use range twice

– Jo King – 2019-11-06T00:49:40.800

1

You could also get to 100 bytes with Python 3.8

– Jo King – 2019-11-06T00:52:14.247

Yeah, repeating Range eliminates the internal lambda. Still longer than this solution iirc – JPeroutek – 2019-11-06T00:52:23.133

8

Jelly, 6 bytes

»⁹/Z$⁺

A dyadic Link accepting the matrix (list of lists) on the left and the window size on the right which yields a matrix (list of lists).

Try it online!
Or see all windows of a 24*24 grid pooled at each possible granularity

How?

»⁹/Z$⁺ - Link: M, k
    $  - last two links as a monad - i.e. f(X) where X is initially M
  /    -   reduce (X)...
 ⁹     -     ...in chunks of: chain's right argument (k)
»      -     ...by: (left) maximum (right) (this vectorises across the k rows)
   Z   -   transpose the resulting matrix
     ⁺ - repeat the last link - i.e. f(that result)

Jonathan Allan

Posted 2019-11-05T15:06:20.280

Reputation: 67 804

7

Octave with Image Package, 38 bytes

@(M,s)blockproc(M,[s s],@(b)max(b(:)))

Try it online!

Luis Mendo

Posted 2019-11-05T15:06:20.280

Reputation: 87 464

2Didn't know blockproc, very nice:) – flawr – 2019-11-05T15:40:23.303

7

J, 15 14 bytes

-1 byte, thanks @Bubbler

The left argument is the matrix and the right argument is the stride.

>./@,;.3~2 2&$

Try it online!

Traws

Posted 2019-11-05T15:06:20.280

Reputation: 331

1That's clean J, much better than my contrived solution that uses an inappropriate cutting verb. I started forgetting J even before I have learned it :) – Galen Ivanov – 2019-11-05T18:58:40.863

4+1 for using the right tool for the job. Btw, you can use ;.3 instead of ;._3 because OP specifies that the stride is always a divisor of the matrix size. – Bubbler – 2019-11-05T23:21:04.200

6

J, 13 12 bytes

>./\&|:^:2~-

Try it online!

... When I said ";.3 is the right tool for the job", I was wrong. This challenge is so simple that an alternative method can get to fewer bytes.

How it works

>./\&|:^:2~-  Left argument: matrix, Right argument: stride (s)
           -  Negate the right argument
       ^:2~   Run the thing twice, with flipped arguments:
              (now the left is -s, right is matrix)
    &|:         Transpose both the matrix and -s
                (transpose on -s is no-op)
>./\            Run "reduce by max" on each of non-overlapping intervals of
                length s, in the primary dimension
              By running this twice, we achieve max pooling in two dimensions

Bubbler

Posted 2019-11-05T15:06:20.280

Reputation: 16 616

5

Python 2, 68 bytes

lambda M,k:eval("map(lambda*l:map(max,zip(*[iter(l)]*k)),*"*2+"M))")

Try it online!


74 bytes

lambda M,k:eval("zip(*map(lambda*N:map(max,*N*2),*k*[iter("*2+"M)])))]))")

Try it online!

I can't even ...

xnor

Posted 2019-11-05T15:06:20.280

Reputation: 115 687

4

APL (Dyalog Unicode), 20 bytes

{⌈⌿⍤2⌈/⍵}⊢⍴⍨4⍴÷⍨∘≢,⊣

Try it online!

I'm pretty sure converting the dfn part into an atop (⌈⌿⍤2⌈/) should work (and save a byte), but there seems to be an anomaly that prevents me from running multiple test cases with that version.

How it works

{⌈⌿⍤2⌈/⍵}⊢⍴⍨4⍴÷⍨∘≢,⊣  left: stride (s), right: matrix
              ÷⍨∘≢    a = size of matrix divided by stride
                  ,⊣  Concatenate a with s
            4⍴        Create a length-4 array (a s a s)
         ⊢⍴⍨          Reshape the matrix into a 4-dimensional array with shape a,s,a,s
{       }             Pass the result into the dfn...
     ⌈/⍵              Take the max through the 4th axis (shape: a,s,a)
 ⌈⌿⍤2                 Take the max through the 2nd axis (shape: a,a)

APL also has an operator called "Stencil" that is similar to J's "Subarrays" ;.3, but it has a weird starting location, making it clunky to use for this challenge.

Bubbler

Posted 2019-11-05T15:06:20.280

Reputation: 16 616

{⌈/↑⍵⊂⍨0=⍺|⍳≢⍉⍵}⍣2 is a bit shorter (⎕io←0) – ngn – 2019-11-12T11:23:18.623

3

J, 23 bytes

(#@];~@$1{.~[)>./@,;.1]

Try it online!

Galen Ivanov

Posted 2019-11-05T15:06:20.280

Reputation: 13 815

3

05AB1E, 8 7 bytes

ôεø¹ôεZ

-1 byte thanks to @Grimy.

Stride as first input and matrix as second input.

Try it online or verify all test cases.

Explanation:

ô        # Split the rows of the (implicit) input-matrix into
         # the (implicit) input-integer amount of groups
         #  i.e. [[12,20,30,0],[8,12,2,0],[34,70,37,4],[112,100,25,12]] and 2
         #   → [[[12,20,30,0],[8,12,2,0]],[[34,70,37,4],[112,100,25,12]]]
 ε       # Map each group of rows to:
  ø      #  Zip/transpose; swapping rows/columns
         #   i.e. [[12,20,30,0],[8,12,2,0]] → [[12,8],[20,12],[30,2],[0,0]]
   ¹ô    #  Split the columns into the input-integer amount of groups as well
         #   i.e. [[12,8],[20,12],[30,2],[0,0]] → [[[12,8],[20,12]],[[30,2],[0,0]]]
     ε   #  Inner map over each block:
      Z  #   And get the flattened maximum of the block
         #    i.e. [[12,8],[20,12]] → 20
         # (after which the result is output implicitly)

Kevin Cruijssen

Posted 2019-11-05T15:06:20.280

Reputation: 67 575

1I think ø¹ô instead of ¹δôø works. – Grimmy – 2019-11-05T16:10:26.183

@Grimmy Ah, you're indeed right. How did I miss that.. Thanks (as always xD) – Kevin Cruijssen – 2019-11-05T16:17:34.373

3

Wolfram Language (Mathematica), 23 bytes

BlockMap[Max,#2,{#,#}]&

Try it online!

Lukas Lang

Posted 2019-11-05T15:06:20.280

Reputation: 231

2

Python 3, 275 196 184 163 159 111 bytes

def f(s,m):r=range(len(m)//s);return[[max(b for k in range(s)for b in m[i*s+k][j*s:][:s])for j in r]for i in r]

Try it online!

Thanks to:
- @randomdude999 for saving me 12 bytes
- @Jitse for saving 48 bytes

Delta

Posted 2019-11-05T15:06:20.280

Reputation: 543

1You don't need a newline after for k in range(l):, since there's only a single simple statement following it. Same for if l<2:. Also, **0.5 == **.5. Since the input matrix is a square, len(m[0]) == len(m). And i'm pretty sure at least the innermost loop can be done with a list comprehension. – randomdude999 – 2019-11-05T18:01:22.347

1

Same approach, 118 bytes: Try it online!

– Jitse – 2019-11-18T07:58:14.103

2

JavaScript (ES6), 85 bytes

Takes input as (s)(matrix).

s=>m=>m.map((r,y)=>r.map((v,x)=>(r=M[Y=y/s|0]=M[Y]||[])[x=x/s|0]>v?0:r[x]=v),M=[])&&M

Try it online!

Arnauld

Posted 2019-11-05T15:06:20.280

Reputation: 111 334

2

Charcoal, 24 bytes

NθIE⪪EAE⪪ιθ⌈λθE§ι⁰⌈Eι§νμ

Try it online! Link is to verbose version of code. Outputs each maximum on its own line, with matrix rows double-spaced. Explanation:

Nθ                          Input the stride
      A                     Input matrix
     E                      Map over rows
         ι                  Current row
        ⪪ θ                 Split (horizontally) into windows
       E                    Map over windows
            λ               Current window
           ⌈                Take the maximum
    ⪪        θ              Split vertically into windows
   E                        Map over windows
              E§ι⁰          Map over horizontal maxima
                   Eι       Map over windowed rows
                     §νμ    Get windowed cell
                  ⌈         Take the maximum
  I                         Cast to string for implicit print

E§ι⁰Eι§νμ is effectively the nearest Charcoal has to a transpose operation, although obviously I can at least take the maximum of the transposed column in situ.

Neil

Posted 2019-11-05T15:06:20.280

Reputation: 95 035

2

dzaima/APL, 14 12 bytes

⊣t¨t←⌈/⍬∘⍮⍛⍴

Try it online!

dzaima

Posted 2019-11-05T15:06:20.280

Reputation: 19 048

1

Stax, 11 bytes

ó╚←XF«≡ûÖçx

Run and debug it at staxlang.xyz!

Unpacked (13 bytes) and explanation

X/{Mx/{:f|Mmm
X                Save window size to register X
 /               Take groups of X rows
  {         m    Map block over groups:
   M               Transpose (gets list of column segments)
    x/             Take groups of X column segments (our windows)
      {    m       Map block over windows:
       :f|M          Flatten and take maximum

Leaves the result on the stack, since arrays print as strings. That makes the output hard to determine. Here's a pretty version.

Khuldraeseth na'Barya

Posted 2019-11-05T15:06:20.280

Reputation: 2 608

1

Ruby, 57 bytes

->m,w{g=->r{r.transpose.each_slice(w).map &:max};g[g[m]]}

Try it online!

G B

Posted 2019-11-05T15:06:20.280

Reputation: 11 099

1

Japt, 24 bytes

mòY=UÊzV)Õc òYp)®rwÃòV y

Try it

U.m("ò", // split each row by..
 Y = U.l().z(V)) // input size / s
.y() // transpose
.c() // flatten
.ò(Y.p()) // split in Y*Y chunks
.m(function(Z) { return Z.r("w") }) // max of each
.ò(V) // rematrix
.y() // retranspose

AZTECCO

Posted 2019-11-05T15:06:20.280

Reputation: 2 441