
All About Pooling Layers for Convolutional Neural Networks

Updated: Dec 30, 2020

Before diving into the different pooling methods, we will explain three concepts involved in building a pooling layer:

  1. Stride: the stride indicates how far the kernel moves at each step as it slides across the image. A stride of 1 means that the kernel moves through the image pixel by pixel. A stride of 2 means that the kernel moves two pixels per step (so it skips every other position). Any stride greater than 1 downsamples the image.

  2. Kernel size: the size of the pooling window (the filter mask), given as width times height (w × h).

  3. Padding: extra values added around the border of the input. Padding is needed to process inputs whose shape does not fit evenly with the pooling layer's kernel size and stride.
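To see how the three settings interact, here is a minimal sketch of the usual output-size calculation for one dimension. The function name `pooled_size` is made up for this illustration, and it assumes the common "round down" convention; some libraries also offer a round-up variant, which would give different results when the window does not fit evenly.

```python
def pooled_size(input_size: int, kernel_size: int, stride: int, padding: int = 0) -> int:
    """Length of one output dimension after pooling (round-down convention).

    Hypothetical helper for illustration; `padding` is the number of
    values added on EACH side of the input.
    """
    return (input_size + 2 * padding - kernel_size) // stride + 1


# A 4-pixel-wide input with a 2-wide kernel and stride 2 gives 2 outputs.
print(pooled_size(4, kernel_size=2, stride=2))             # 2

# A 5-pixel-wide input does not fit evenly: without padding the last
# column is dropped, with one pixel of padding it is covered.
print(pooled_size(5, kernel_size=2, stride=2))             # 2
print(pooled_size(5, kernel_size=2, stride=2, padding=1))  # 3
```

With a stride of 1 and the same 2-wide kernel, the 4-pixel input would shrink only to 3 outputs, which is why larger strides are the main tool for downsampling.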

 

Max Pooling


Max pooling is one of the most widely used pooling layers. It helps reduce overfitting by providing a less complex version of the original representation.

Max pooling outputs the highest value in each cluster of neurons in the prior layer.


Max pooling example for convolutional neural networks

In our example:

  • Top left cluster that contains [15,24,14,63] will output max( 15, 24, 14, 63 ) = 63

  • Top right cluster that contains [13,19,85,33] will output max( 13, 19, 85, 33 ) = 85

  • Bottom left cluster that contains [81,74,55,93] will output max( 81, 74, 55, 93 ) = 93

  • Bottom right cluster that contains [77,12,15, 69] will output max( 77, 12, 15, 69) = 77
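The four clusters above can be reproduced with a short, dependency-free sketch. The 4×4 `feature_map` below is an assumption: it places each cluster's four values row by row inside its 2×2 block, which may not match the exact layout in the original image, although the pooled results do not depend on the order within a block. The function name `max_pool2d` is also just illustrative.

```python
def max_pool2d(feature_map, kernel=2, stride=2):
    """Max pooling over a 2D list, with no padding."""
    rows = (len(feature_map) - kernel) // stride + 1
    cols = (len(feature_map[0]) - kernel) // stride + 1
    pooled = []
    for i in range(rows):
        pooled_row = []
        for j in range(cols):
            # Collect every value inside the current kernel window.
            window = [
                feature_map[i * stride + di][j * stride + dj]
                for di in range(kernel)
                for dj in range(kernel)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Assumed layout of the example: each 2x2 block holds one cluster.
feature_map = [
    [15, 24, 13, 19],
    [14, 63, 85, 33],
    [81, 74, 77, 12],
    [55, 93, 15, 69],
]

print(max_pool2d(feature_map))  # [[63, 85], [93, 77]]
```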

 

Average Pooling


Average pooling involves finding the average value of each 2×2 block of the feature map. This means that each 2×2 block is down-sampled to the average.


Average pooling example for convolutional neural networks

In our example:

  • Top left cluster that contains [15,24,14,63] will output ( 15 + 24 + 14 + 63 ) / 4 = 29

  • Top right cluster that contains [13,19,85,33] will output ( 13 + 19 + 85 + 33 ) / 4 = 37.5

  • Bottom left cluster that contains [81,74,55,96] will output ( 81 + 74 + 55 + 96 ) / 4 = 76.5

  • Bottom right cluster that contains [77,12,15, 69] will output ( 77 + 12 + 15 + 69 ) / 4 = 43.25
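As a rough check of the averages, the sketch below computes the exact (unrounded) mean of each 2×2 block. It uses the cluster values exactly as listed in this section, including 96 in the bottom-left cluster, and the arrangement of values inside each block is an assumption. The name `avg_pool2d` is illustrative only.

```python
def avg_pool2d(feature_map, kernel=2, stride=2):
    """Average pooling over a 2D list, with no padding."""
    rows = (len(feature_map) - kernel) // stride + 1
    cols = (len(feature_map[0]) - kernel) // stride + 1
    pooled = []
    for i in range(rows):
        pooled_row = []
        for j in range(cols):
            window = [
                feature_map[i * stride + di][j * stride + dj]
                for di in range(kernel)
                for dj in range(kernel)
            ]
            # Exact mean of the window; no rounding is applied.
            pooled_row.append(sum(window) / len(window))
        pooled.append(pooled_row)
    return pooled


# Assumed layout, built from the four clusters listed in this section.
feature_map = [
    [15, 24, 13, 19],
    [14, 63, 85, 33],
    [81, 74, 77, 12],
    [55, 96, 15, 69],
]

print(avg_pool2d(feature_map))  # [[29.0, 37.5], [76.5, 43.25]]
```

If integer outputs are wanted, the averages have to be rounded or truncated explicitly, and the choice changes some of the results (for example, 37.5 truncates to 37).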

 

Min Pooling


The minimum pixel value of each cluster is selected for the new layer.

Min pooling example for convolutional neural networks

In our example:

  • Top left cluster that contains [15,24,14,63] will output min( 15, 24, 14, 63 ) = 14

  • Top right cluster that contains [13,19,85,33] will output min( 13, 19, 85, 33 ) = 13

  • Bottom left cluster that contains [81,74,55,93] will output min( 81, 74, 55, 93 ) = 55

  • Bottom right cluster that contains [77,12,15, 69] will output min( 77, 12, 15, 69) = 12
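Max, average and min pooling differ only in how each window is reduced to a single value. The sketch below makes that explicit with a hypothetical helper, `pool2d`, that takes the reduction function as an argument. It also checks a handy identity: min pooling gives the same result as negating the input, max pooling it, and negating again, which can be useful when a library provides max pooling but no min pooling. The layout of `feature_map` is again an assumption about how the clusters are arranged.

```python
def pool2d(feature_map, reduce_fn, kernel=2, stride=2):
    """Generic pooling over a 2D list; `reduce_fn` maps a window to one value."""
    rows = (len(feature_map) - kernel) // stride + 1
    cols = (len(feature_map[0]) - kernel) // stride + 1
    return [
        [
            reduce_fn([
                feature_map[i * stride + di][j * stride + dj]
                for di in range(kernel)
                for dj in range(kernel)
            ])
            for j in range(cols)
        ]
        for i in range(rows)
    ]


# Assumed layout of the example: each 2x2 block holds one cluster.
feature_map = [
    [15, 24, 13, 19],
    [14, 63, 85, 33],
    [81, 74, 77, 12],
    [55, 93, 15, 69],
]

# Min pooling: pass the built-in min as the reduction.
min_pooled = pool2d(feature_map, min)
print(min_pooled)  # [[14, 13], [55, 12]]

# Same result via max pooling on the negated input.
negated = [[-value for value in row] for row in feature_map]
via_max = [[-value for value in row] for row in pool2d(negated, max)]
print(via_max == min_pooled)  # True
```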
