I – Basic Masks

As mentioned at the beginning of this section, the use of spatial convolution masks is a straightforward extension of discrete convolution to two (2) dimensions. A simple example is demonstrated below. First, consider the following three objects:

  • An input image, whose dimensions must be equal to or greater than those of the mask (described below)
  • A convolution “mask” (usually a square matrix with odd dimensions, e.g. 3, 5, 7, etc.)
  • An output image buffer which is the same size as the input image

Basic Objects

The actual mathematics behind this operation is rather simple. First, let's imagine that we “stack” the mask on top of the input image so that the upper-left corners of both the mask and the input image are touching. Now, for each pixel where the mask “overlaps” the input image, calculate the product of the value in the “mask pixel” and the value in the “image pixel”, and sum these up (in the above example, assuming we are going from left to right, then top to bottom, the calculation would be: SUM = [(1*1)+(1*2)+(1*3)] + [(1*1)+(1*1)+(1*1)] + [(1*1)+(1*2)+(1*3)] = 15). We then take this sum of products, find the “centre” pixel of where the mask overlaps the input image, and put the sum in the output image buffer at the same “centre” pixel location, as shown below.

First convolution sum
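The sum of products for a single mask position can be sketched in a few lines of plain Python. This is only an illustration: the function name `window_sum` is made up here, and the 3×3 patch is reconstructed from the worked sum in the text (the last value of the middle row is assumed to be 1, which is what the total of 15 implies). As in the text, the mask is applied as-is, without flipping it.

```python
def window_sum(image, mask, top, left):
    """Sum of products of the mask and the image patch whose
    upper-left corner sits at (top, left)."""
    total = 0
    for i, mask_row in enumerate(mask):
        for j, weight in enumerate(mask_row):
            total += weight * image[top + i][left + j]
    return total


# Patch values taken from the worked sum in the text
# (middle row's last value assumed to be 1).
patch = [
    [1, 2, 3],
    [1, 1, 1],
    [1, 2, 3],
]
# 3x3 mask of ones, matching the (1*...) factors in the worked sum.
mask = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1],
]

first_sum = window_sum(patch, mask, 0, 0)
print(first_sum)  # 15
```

In a full run, this value would be written to the output buffer at the position of the patch's centre pixel.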

We then “slide” the mask from left to right, then top to bottom, repeating this operation at every position where the mask can overlap the input image without leaving its boundaries. The next few iterations are shown below to demonstrate this in action.

Second convolution sum

Third convolution sum

Fourth convolution sum

Fifth convolution sum

Proceeding in this manner, we eventually arrive at the convolution sum output which we stored in the output image buffer, as shown below.

Output of convolution operation
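The whole sliding procedure can be sketched as a pair of nested loops over the positions where the mask fits entirely inside the image. The function name `convolve_interior` and the 4×4 test image are hypothetical, chosen only for illustration; the output buffer starts out as all zeroes, so any pixel the mask centre never reaches keeps the value 0.

```python
def convolve_interior(image, mask):
    """Slide a square, odd-sized mask over the image and write each
    sum of products at the centre pixel of the current overlap.
    Pixels the mask centre never reaches stay at 0."""
    rows, cols = len(image), len(image[0])
    m = len(mask)
    half = (m - 1) // 2
    # Output buffer with the same size as the input image.
    output = [[0] * cols for _ in range(rows)]
    for top in range(rows - m + 1):        # top to bottom
        for left in range(cols - m + 1):   # left to right
            total = 0
            for i in range(m):
                for j in range(m):
                    total += mask[i][j] * image[top + i][left + j]
            output[top + half][left + half] = total
    return output


# Hypothetical 4x4 test image and a 3x3 mask of ones.
image = [
    [1, 2, 3, 4],
    [1, 1, 1, 1],
    [1, 2, 3, 4],
    [0, 0, 0, 0],
]
mask = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

result = convolve_interior(image, mask)
for row in result:
    print(row)
# [0, 0, 0, 0]
# [0, 15, 21, 0]
# [0, 9, 12, 0]
# [0, 0, 0, 0]
```

Only the four interior pixels receive a value here, because a 3×3 mask fits inside a 4×4 image at just four positions.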

There is still a problem with this operation though: the border pixels are all zeroes! In image processing, this is the equivalent of having a black border in the output image. What’s worse is that if we were to use a larger mask (say, 5×5 or 11×11), or perform multiple convolution operations on an image, the unusable border would grow and the useful part of the output would shrink. We could always “trim” the image to remove these borders, but the output image will become smaller each time we do this. There must be a reasonable workaround, and there is. One simple method is to replicate the border columns of the input image, then the border rows, carry out the convolution, and then “trim” the output image, as shown below.

Replicating rows, then columns…

Performing convolution on the (now larger) input image…

Trimming the appropriate number of rows/columns from the output image buffer
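The replication step on its own might look like the following Python sketch. The name `replicate_border` and the small 2×3 test image are hypothetical. Columns are replicated first and rows second here, but the opposite order gives the same result, since the corners end up holding the corner pixel values either way.

```python
def replicate_border(image, pad):
    """Return a copy of the image with its outermost columns and rows
    each repeated `pad` extra times."""
    # Repeat the first and last pixel of every row (left/right columns).
    widened = [
        [row[0]] * pad + list(row) + [row[-1]] * pad
        for row in image
    ]
    # Repeat the already widened first and last rows (top/bottom rows).
    top = [list(widened[0]) for _ in range(pad)]
    bottom = [list(widened[-1]) for _ in range(pad)]
    return top + widened + bottom


# Hypothetical 2x3 test image, padded by one pixel on every side.
image = [
    [1, 2, 3],
    [4, 5, 6],
]
padded = replicate_border(image, 1)
for row in padded:
    print(row)
# [1, 1, 2, 3, 3]
# [1, 1, 2, 3, 3]
# [4, 4, 5, 6, 6]
# [4, 4, 5, 6, 6]
```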

So, consider an input image with dimensions A by B, and a convolution mask with dimensions M by M, where M \in \mathbb{Z}^{+} is odd (i.e. it is a square matrix with odd dimensions). The number of columns that need to be replicated on each of the left- and right-hand sides of the input image is \dfrac{M-1}{2}. For example, a 3×3 mask needs one replicated column on each side, and a 5×5 mask needs two. Once we have replicated the columns on the left and right, the same number of rows needs to be replicated on the top and bottom of the input image. From there, we simply perform the convolution on the enlarged input image, and then trim \dfrac{M-1}{2} rows/columns from the top, bottom, left, and right of the final output image buffer. The result is an output image without the black border that also has the same dimensions as the original input image. The MATLAB code below demonstrates this concept in action, and can be used to test any arbitrary mask. Please note that it also normalizes the images, but this feature can be removed easily enough. Enjoy!
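Separately from the MATLAB code, the same replicate, convolve, trim sequence can be sketched in plain Python. This is a minimal illustration, not a port of the MATLAB listing: all function names are hypothetical, no normalization is performed, and the mask is applied without flipping, exactly as in the walkthrough above.

```python
def replicate_border(image, pad):
    """Repeat the outermost columns, then rows, `pad` extra times."""
    widened = [[r[0]] * pad + list(r) + [r[-1]] * pad for r in image]
    top = [list(widened[0]) for _ in range(pad)]
    bottom = [list(widened[-1]) for _ in range(pad)]
    return top + widened + bottom


def convolve_interior(image, mask):
    """Sum of products at every position where the mask fits;
    unreachable border pixels of the output stay at 0."""
    rows, cols, m = len(image), len(image[0]), len(mask)
    half = (m - 1) // 2
    output = [[0] * cols for _ in range(rows)]
    for top in range(rows - m + 1):
        for left in range(cols - m + 1):
            output[top + half][left + half] = sum(
                mask[i][j] * image[top + i][left + j]
                for i in range(m)
                for j in range(m)
            )
    return output


def trim(image, pad):
    """Drop `pad` rows/columns from the top, bottom, left, and right."""
    return [row[pad:len(row) - pad] for row in image[pad:len(image) - pad]]


def convolve_same(image, mask):
    """Replicate (M-1)/2 border columns/rows, convolve, then trim."""
    m = len(mask)
    if m % 2 == 0 or any(len(row) != m for row in mask):
        raise ValueError("mask must be square with odd dimensions")
    pad = (m - 1) // 2
    padded = replicate_border(image, pad)
    return trim(convolve_interior(padded, mask), pad)


# Hypothetical 3x3 test image and a 3x3 mask of ones.
image = [
    [1, 2, 3],
    [1, 1, 1],
    [1, 2, 3],
]
ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

result = convolve_same(image, ones)
for row in result:
    print(row)
# [11, 15, 19]
# [11, 15, 19]
# [11, 15, 19]
```

The output has the same 3×3 size as the input and no zero border. A quick sanity check is to pass a mask that is 1 at its centre and 0 elsewhere, which should return the image unchanged.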

