COMPSCI 180 Project 2: Fun with Filters and Frequencies!

Kaitlyn Chen

Introduction:

This project explores how to find image edges, sharpen images, make hybrid images, and blend images using filters and frequencies. The most important thing I learned from this project is that the frequencies within an image are very important and that Gaussian filters are very powerful! This also means you can manipulate images quite a lot just by manipulating their frequencies!

Part 1.1 Finite Difference Operators:

This part of the project aims to find and highlight the edges of the given cameraman image.

The first approach used here is using finite difference filters in the x, y directions:
\mathbf{D_x} = \begin{bmatrix} 1 & -1 \end{bmatrix}, \quad \mathbf{D_y} = \begin{bmatrix} 1 \\ -1 \end{bmatrix}. These filters are used to create the partial derivatives in the x and y directions. This is done by convolving the original image with each finite difference filter using convolve2d from the scipy.signal library with mode = ‘same’.

The gradient magnitude is then computed as the L2 norm of the gradient vector whose components are the gradient in x and the gradient in y, i.e. \sqrt{(\partial I / \partial x)^2 + (\partial I / \partial y)^2}. This gradient magnitude image essentially combines the gradient-in-x and gradient-in-y images together.

To create a true edge image, the gradient magnitude image is binarized. I found a threshold of 0.2 to work well in balancing noise suppression while still showing all the real edges.
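Putting the steps above together, a minimal sketch of this pipeline might look like the following (the image is assumed to be a grayscale float array in [0, 1]; the 0.2 threshold is the one chosen above):

```python
import numpy as np
from scipy.signal import convolve2d

# Finite difference filters in the x and y directions
D_x = np.array([[1, -1]])
D_y = np.array([[1], [-1]])

def edge_image(im, threshold=0.2):
    """Partial derivatives -> gradient magnitude -> binarized edge image."""
    dx = convolve2d(im, D_x, mode="same")      # partial derivative in x
    dy = convolve2d(im, D_y, mode="same")      # partial derivative in y
    grad_mag = np.sqrt(dx**2 + dy**2)          # L2 norm of the gradient vector
    return (grad_mag > threshold).astype(float)
```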

Original Cameraman Image

Part 1.2:  Derivative of Gaussian (DoG) Filter

Gaussian Filter:
The finite difference operator gave us a fairly noisy edge image, so in this section we will use a Gaussian filter, acting as a smoothing operator. We perform all the same steps as in the previous section, but on a blurred version of the original image instead of the original.

A blurred version of the original image is created by convolving it with a 2D Gaussian filter. A 2D Gaussian filter can be created by first creating a 1D Gaussian filter with cv2.getGaussianKernel( ) and then taking the outer product of it with its transpose. A kernel size of 5 and a sigma of 1 were chosen in this scenario. The code is given below, as it is used throughout the rest of the project to create a 2D Gaussian filter (with different kernel sizes and sigmas).

We can note that using a Gaussian filter significantly reduced the noise in the edge image; however, some details are lost, such as the details within the camera and the background. We can also see that the white edges are much thicker and more prominent.

gaussian_1d = cv2.getGaussianKernel(ksize=5, sigma=1)  # 1D Gaussian kernel
gaussian_2d = np.outer(gaussian_1d, gaussian_1d)       # outer product gives the 2D kernel

Derivative of Gaussian:
Now, instead of convolving twice, we can perform a single convolution by creating derivative of Gaussian (DoG) filters. These are created by convolving the previous Gaussian kernel with each respective finite difference filter. The original image is then convolved with these DoG filters to create the respective partial derivative images. A gradient magnitude image and an edge image are then created with the same steps as in the previous two approaches.

We can verify that the result of this third approach is nearly identical to that of the second approach, especially when compared to the first approach’s results.

Part 2.1: Image “Sharpening”

Unsharp Masking Technique:
The goal of this section is to sharpen images by enhancing high frequencies. The given image is a not very clear image of the Taj Mahal.

To get only the high frequencies of an image, we can subtract the low frequencies from the original image. A Gaussian filter helps with this as it is a low-pass filter, keeping only the low frequencies of an image. So, to get the low frequencies, we convolve the original image with the Gaussian kernel.

Once we have the high frequencies, we can multiply them by some alpha value, which determines how much “sharpening” we want, and add the result to the original image. (Result images are shown below with different alpha values.)

Note: for colored images we need to perform this “unsharp mask filter” on each r, g, b color channel and then stack the result.

Original Taj Image

Applied to other images: The original image of Oski is a little blurry, so sharpening it made it much better. However, the original puppy image was already quite sharp, so sharpening it beyond a certain alpha value gave unnatural results. But we can see that an alpha of 0.1 gave an image close to the original (further refinement of the alpha value could have produced an even closer match to the original puppy image).

Original Oski Image

Puppy Original Image

Blurred Puppy Image

Sharpened Puppy Image with Alpha = 0.7

Sharpened Puppy Image with Alpha = 2

Part 2.2: Hybrid Images

The goal of this section is to create hybrid images with the approach from the SIGGRAPH 2006 paper by Oliva, Torralba, and Schyns. The interpretation of a hybrid image changes with distance to the image. When viewing an image up close, the high frequencies of the image dominate, while the converse occurs with distance. Blending the high frequencies of one image with the low frequencies of another yields such hybrid images.

I first aligned the two images using the provided staff code. I then chose which image to use as the low frequency image and which as the high frequency image. I then extracted only their respective frequencies using the approach from part 2.1 above, and simply added them to create the hybrid image.
Note: for colored images, each color channel must be processed separately and then combined.

Original Derek Image
Original Nutmeg Image

Bells & Whistles:
I tried using color to enhance the effect. I noticed that the cat’s color was not prominent enough against both the colored Derek image and gray Derek image to make a clear difference. Since both Derek and the cat have colored eyes (blue and green), keeping both images gray creates the best hybrid image in my opinion.

Derek - low frequency, colored
Nutmeg - high frequency, colored
Derek - low frequency, colored
Nutmeg - high frequency, gray

Derek - low frequency, gray
Nutmeg - high frequency, gray

Derek - low frequency, gray
Nutmeg - high frequency, colored


More images - Cillian Murphy + Antz:

Many, including myself, believe Cillian Murphy looks like the character from the movie Antz, so I thought this was the perfect opportunity.

When looking at the large hybrid image, you see the high frequency ant more, while the opposite is true when looking at the shrunken image on its right.

The two images are quite different so the alignment looks funny, but nonetheless I think the functionality of the hybrid image using both gray turned out great.

Original Antz Image

+

Cillian Murphy Image

=

Cillian & Antz Hybrid Image
Cillian - low frequency, gray
Antz - high frequency, gray

Same image as to the left but shrunk


More images - Heath Ledger + Joker (Failure):

Before generating the hybrid image I thought a grayscale low frequency image of Heath Ledger with a colored high frequency Joker would be best. However, the color of the Joker did not stand out as much as hoped, and the tilt of the Joker’s head distorts the alignment and the hybrid image.

Below is the 2D frequency analysis of these two input images, their filtered images, and the hybrid image.

Original Heath Ledger Image
Original Joker Image
Heath Ledger - low frequency, color
Joker - high frequency, color

same image as to the right

same image as to the left
Heath Ledger - low frequency, gray
Joker - high frequency, color

Frequency Analysis for Heath Ledger - Joker Images:

2D Fourier Transform of Heath Ledger (low frequency Image)
2D Fourier Transform of Joker (high frequency image)
2D Fourier Transform of Low Pass Filtered Heath Ledger (low frequency Image)

2D Fourier Transform of High Pass Filtered Joker (high frequency image)
2D Fourier Transform of Heath-Joker Hybrid Image
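The Fourier transform plots above were visualized as log-magnitude spectra of the grayscale images. A minimal sketch (the small epsilon inside the log is an assumption to avoid log(0)):

```python
import numpy as np

def log_spectrum(gray):
    """Log magnitude of the centered 2D FFT of a grayscale image."""
    return np.log(np.abs(np.fft.fftshift(np.fft.fft2(gray))) + 1e-8)
```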

Part 2.3: Gaussian and Laplacian Stacks & 2.4: Multiresolution Blending (a.k.a. the oraple!)

This section aims to blend two images together using a multiresolution blending technique, which blends the two images at each frequency band to create a smooth seam.

First we created Gaussian and Laplacian stacks. Unlike pyramids, stacks do not downsample: each subsequent level is produced by blurring the previous level with a Gaussian filter, so every level has the same size. Each level of a Laplacian stack is created by subtracting the Gaussian stack at the next level from that of the current level.

Note: each color channel needs to be processed separately then combined

Oraple Image
Original Apple Image
Original Orange Image

Apple Gaussian Stack

Orange Gaussian Stack

Apple Laplacian Stack

Orange Laplacian Stack

Blended Laplacian Stack

Mask Gaussian Stack

Recreation of Figure 3.42 in Szeliski (2nd ed.)

Oraple Images:
A vertical mask was used. For each level, a blended level is created with this equation:

(\text{laplacian\_stack1}[ \text{level} ] \times \text{gaussian\_stack\_mask}[ \text{level} ]) + (\text{laplacian\_stack2}[ \text{level} ] \times (1 - \text{gaussian\_stack\_mask}[ \text{level} ]))

To create the final blended image, these blended levels are collapsed by adding them together. Using different sigma values and kernel sizes yielded very different results at each level. Ultimately, a sigma of 2 for the input Gaussian stacks, a sigma of 10 for the mask’s Gaussian stack, and large kernel sizes were used.
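The per-level blend and the final collapse can be sketched as one small function; the inputs are assumed to be two Laplacian stacks and a Gaussian stack of the mask, all lists of same-sized float arrays:

```python
import numpy as np

def blend_and_collapse(lap1, lap2, mask_stack):
    """Blend each Laplacian level with the blurred mask, then sum all levels."""
    blended = [l1 * m + l2 * (1 - m)
               for l1, l2, m in zip(lap1, lap2, mask_stack)]
    return np.clip(sum(blended), 0, 1)   # collapse the stack into the final image
```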


Bells & Whistles:
Used color to enhance the output image.

Other Blended Images with Irregular Masks:

Various sigma values and kernel sizes were used for the input Gaussian stacks and mask Gaussian stacks; they were chosen qualitatively. The final blended image of the wiener hot dog turned out a bit blurry. I think tilting and shrinking the labradoodle input image would have made its blended image more believable.

Bells & Whistles:

Used color to enhance the output image.

Hot Wiener Dog Mask
Wiener Dog Original Image
Hot Dog Original Image
Hot Wiener Hybrid Image

Jupiter in SF sky Mask
SF Skyline Original Image
Jupiter Original Image
Jupiter in SF Sky Hybrid Image

Chicken Doodle Mask
Fried Chicken Original Image
Labradoodle Original Image
Chicken Doodle Hybrid Image

Chicken Gaussian Stack

Labradoodle Gaussian Stack

Chicken Laplacian Stack

Labradoodle Laplacian Stack

Blended Laplacian Stack

Mask Gaussian Stack

Recreation of Figure 3.42 in Szeliski (2nd ed.)

Conclusions:

The most important thing I learned from this project is that frequencies within an image are very important! This also means you can manipulate images quite a lot by just manipulating their frequencies!