Compact Gaussian interpolation for small displays

I was working with the MLX90640 thermal imager chip and wanted to do some pixel interpolation to improve the visual image quality. One of the MLX90640 examples from Adafruit used Gaussian blur to smooth out the pixels, and I thought it looked pretty good. Since I wanted to implement the image processing in a microcontroller, I needed a lightweight algorithm to calculate this. I started by looking into Gaussian blur and realized that it was actually pretty heavy to calculate: for each pixel in the resulting image, a number of surrounding source pixels from the original image have to be sampled and their values included in the calculation.

Each surrounding pixel is multiplied by a kernel factor, and the products are added up to form the new pixel. I found a Gaussian kernel calculator online and chose a 3 by 3 kernel since it seemed to be a good compromise for my application.

I made the calculation pretty fast by creating two arrays: one containing the address offsets of each source pixel to be sampled, relative to the pixel being calculated, and one containing the kernel factors. In the case of my 32-pixel-wide sensor image, the nine sampled addresses are n-33, n-32, n-31, n-1, n, n+1, n+31, n+32 and n+33, where 'n' is the address of the resulting pixel.
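As a rough sketch of the two-array approach for a plain 3 by 3 blur (no interpolation yet): the names kWidth, kHeight, kOffsets, kKernel and blur3x3 are made up for illustration, the kernel factors are the first P0/P1/P2 set from the code at the end of this post, and the edge handling is an assumption that simply skips addresses outside the image.

```cpp
// Illustrative sketch only: names and edge handling are assumptions.
constexpr int kWidth = 32;   // source image width
constexpr int kHeight = 24;  // source image height

// Address offsets of the nine sampled source pixels relative to pixel n.
constexpr int kOffsets[9] = {
    -kWidth - 1, -kWidth, -kWidth + 1,
    -1,           0,       1,
     kWidth - 1,  kWidth,  kWidth + 1};

// Matching 3x3 kernel factors (corner, edge and center weights).
constexpr float kKernel[9] = {
    0.077847f, 0.123317f, 0.077847f,
    0.123317f, 0.195346f, 0.123317f,
    0.077847f, 0.123317f, 0.077847f};

// Blur 'source' into 'dest'; both hold kWidth * kHeight floats.
// Addresses outside the array are skipped, so border pixels come out
// darker and the left/right columns wrap into the neighbouring row.
void blur3x3(const float *source, float *dest)
{
    for (int n = 0; n < kWidth * kHeight; n++) {
        float pix = 0.0f;
        for (int k = 0; k < 9; k++) {
            int sa = n + kOffsets[k];
            if (sa >= 0 && sa < kWidth * kHeight)
                pix += kKernel[k] * source[sa];
        }
        dest[n] = pix;
    }
}
```

Because the offsets are precomputed, the inner loop needs no row or column arithmetic, only one add, one bounds check and one multiply-accumulate per sample.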

At some point I realized that the very same algorithm could be used for sub-pixel interpolation, thus actually increasing the number of output pixels. I started by dividing each output pixel into four sub-pixels:

The surrounding pixels could then be equally subdivided and their values multiplied by the kernel factors:

Noting that many of the sampled sub-pixels actually belong to the same larger original pixel, the 9 original calculations could be combined into four. Here are the 4 calculations for each sub-pixel:

Where P is an array containing the source pixels and G is an array containing the combined kernel values for the sub-pixels they represent. Since the kernel values are symmetric around the center, they can be combined (added) into three values:

In the above, G[0] are the yellow values, G[1] the orange and G[2] the blue.
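The combined values and the four calculations can be sketched roughly like this. The helper names (calcA to calcD, kCorner, kEdge, kCenter) are invented for illustration, and the assignment of A, B, C and D to the upper-left, upper-right, lower-left and lower-right sub-pixel is my reading of the geometry, not something spelled out in the text. The weights are the first kernel set from the code at the end of this post.

```cpp
// Illustrative sketch only: helper names and the A..D layout are assumptions.
constexpr int kWidth = 32;  // source image width

// 3x3 kernel weights at sub-pixel resolution.
constexpr float kCorner = 0.077847f;
constexpr float kEdge   = 0.123317f;
constexpr float kCenter = 0.195346f;

// Combined weights: how much of the 3x3 kernel lands on one source pixel.
constexpr float G[3] = {
    kCorner,                            // diagonal neighbour: one corner cell
    kEdge + kCorner,                    // side neighbour: edge + corner cell
    kCenter + kEdge + kEdge + kCorner}; // own pixel: center + two edges + corner

// P is the source image, n the address of the source pixel being subdivided.
// The caller must make sure all eight neighbours of n exist.
float calcA(const float *P, int n)  // upper-left sub-pixel
{
    return G[0] * P[n - kWidth - 1] + G[1] * P[n - kWidth] + G[1] * P[n - 1] + G[2] * P[n];
}
float calcB(const float *P, int n)  // upper-right sub-pixel
{
    return G[1] * P[n - kWidth] + G[0] * P[n - kWidth + 1] + G[2] * P[n] + G[1] * P[n + 1];
}
float calcC(const float *P, int n)  // lower-left sub-pixel
{
    return G[1] * P[n - 1] + G[2] * P[n] + G[0] * P[n + kWidth - 1] + G[1] * P[n + kWidth];
}
float calcD(const float *P, int n)  // lower-right sub-pixel
{
    return G[2] * P[n] + G[1] * P[n + 1] + G[1] * P[n + kWidth] + G[0] * P[n + kWidth + 1];
}
```

Each calculation uses G[0] once, G[1] twice and G[2] once, so the weights still add up to (almost exactly) 1 and a flat image stays flat.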

Looking at which calculation needs to be performed for each sub-pixel, it can be seen that the first line alternates between A and B and the second between C and D.

I looked at the sub-pixel output addresses, the source pixel addresses and the calculations to be performed, to find a logical relationship between them:

When I looked at the binary addresses of the output pixels I observed some useful patterns. Counting the destination (sub-pixel) address up from 0 (upper left sub-pixel) to 3071 (lower right, since there are 32 x 24 x 4 = 3072 sub-pixels):

Using these bit fields, the appropriate action for each output pixel can be extracted:

The above can be thought of as a kind of sequencer that guides the algorithm through the calculation more efficiently than brute force.
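To make the bit fields concrete, here is a small sketch of how a destination address can be split up, assuming a 32-pixel-wide source image (so 64 sub-pixels per output row). The struct and function names are hypothetical. The field layout follows the masks used in the code further down.

```cpp
// Illustrative sketch only: SubPixelStep and decodeAddress are made-up names.
struct SubPixelStep {
    int sourceAddress;  // address of the source pixel containing the sub-pixel
    int calculation;    // 0 = A, 1 = B, 2 = C, 3 = D
};

// Bit layout of the destination address i for a 32-pixel-wide source:
//   bit 0       sub-pixel column inside the source pixel (left/right)
//   bits 1..5   source column (0..31)
//   bit 6       sub-pixel row inside the source pixel (upper/lower)
//   bits 7..    source row
SubPixelStep decodeAddress(int i)
{
    int subX = i & 1;
    int srcX = (i >> 1) & 0x1f;
    int subY = (i >> 6) & 1;
    int srcY = i >> 7;
    return {srcY * 32 + srcX, subX + 2 * subY};
}
```

For example, address 0 decodes to source pixel 0 with calculation A, address 1 to the same source pixel with calculation B, and address 64 (the start of the second output row) to source pixel 0 with calculation C.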

Here is the code to do this:

//  Lightweight 1:2 Gaussian interpolation for
//  small displays
//  (F) Dzl 2019
//  Optimized for simplicity and does not account for
//  screen edge ghosting.

//  Image dimensions
//  (the bit masks in calculate() assume IMAGE_WIDTH is 32)
#define IMAGE_WIDTH 32
#define IMAGE_HEIGHT 24

//  Different kernels (enable exactly one set):
#define P0 (0.077847)
#define P1 (0.123317+0.077847)
#define P2 (0.195346+0.123317+0.123317+0.077847)
/*
  #define P0 (0.024879)
  #define P1 (0.107973+0.024879)
  #define P2 (0.468592+0.107973+0.107973+0.024879)
*/
/*
  #define P0 (0.102059)
  #define P1 (0.115349+0.102059)
  #define P2 (0.130371+0.115349+0.115349+0.102059)
*/

class GBlur
{
    //  Source address offsets for the four sub-pixel calculations
    //  (upper left, upper right, lower left, lower right)
    const int offsets[4][4] =
    {
      { -IMAGE_WIDTH - 1, -IMAGE_WIDTH, -1, 0},
      { -IMAGE_WIDTH, -IMAGE_WIDTH + 1, 0, 1},
      //  Lower left samples the pixel below-left and the pixel below
      { -1, 0, IMAGE_WIDTH - 1, IMAGE_WIDTH},
      {0, 1, IMAGE_WIDTH, IMAGE_WIDTH + 1}
    };
    //  Combined kernel factors matching the offsets above
    const float kernel[4][4] =
    {
      {P0, P1, P1, P2},
      {P1, P0, P2, P1},
      {P1, P2, P0, P1},
      {P2, P1, P1, P0}
    };

  public:
    //  This method takes two pixel arrays:
    //  'source' and 'dest'.
    //  For speed 'IMAGE_WIDTH' and 'IMAGE_HEIGHT'
    //  (original image size) are predefined at the top.
    //  'source' is the original (monochrome) pixels
    //  and 'dest' is the interpolated pixels.
    //  The size of 'source' is IMAGE_WIDTH*IMAGE_HEIGHT
    //  and 'dest' is IMAGE_WIDTH*IMAGE_HEIGHT*4
    void calculate(float *source, float *dest)
    {
      float pix;
      //  For each output pixel:
      for (int i = 0; i < IMAGE_WIDTH * IMAGE_HEIGHT * 4; i++)
      {
        //  Source pixel containing this sub-pixel
        int sourceAddress = ((i >> 1) & 0x1f) + ((i & 0xffffff80) >> 2);
        pix = 0;
        int q = (i & 0x00000001) + ((i & 0x00000040) >> 5);   //Calculation to perform
        for (int z = 0; z < 4; z++)
        {
          int sa = sourceAddress + offsets[q][z];
          //  Skip samples that fall outside the source array
          if (sa >= 0 && sa < IMAGE_WIDTH * IMAGE_HEIGHT)
            pix += kernel[q][z] * source[sa];
        }
        dest[i] = pix;
      }
    }
};



