Computer Vision Tutorial Series M2C1
Last Updated on July 17, 2023 by Editorial Team
Author(s): Sujay Kapadnis
Originally published on Towards AI.
Module 2 - Convolutional Filters and Edge Detection
Chapter 1 - Fourier Transform
Starting here? This article is part of a computer vision Tutorial Series. Here's where you can start.
Learning Objectives:
- What is Fourier Transform?
- How to use Fourier transform with the numpy library?
- Applying Fourier transform on the images.
Pre-Requisites: Previous Tutorials
What is Fourier Transform?
To understand this, let us go step by step.
- The image we regularly use is considered to be in the spatial domain: each pixel value describes the brightness at a position.
- Frequency in an image is the rate at which intensity changes across space. Intensity changes fastest at edges, where one color gives way to another, so understanding the color changes in an image means understanding its frequency content.
- The Fourier Transform is an important image-processing tool that, when applied to the original image, decomposes it into sine and cosine components of different frequencies.
The maths behind the Fourier Transform is beyond the scope of this article, but if you are interested in understanding it, you can read it here.
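To make the idea of sine and cosine components concrete, here is a minimal one-dimensional sketch. The 64-sample length and the 5-cycle sine wave are illustrative assumptions, not values from this article. A pure sine wave should show up as a single peak at its own frequency:

```python
import numpy as np

n_samples = 64                         # assumed signal length
t = np.arange(n_samples) / n_samples   # positions across one window
signal = np.sin(2 * np.pi * 5 * t)     # sine wave with 5 cycles over the window

spectrum = np.fft.fft(signal)          # complex coefficients, one per frequency
magnitude = np.abs(spectrum)

# Only look at the first half; for real input the second half mirrors it.
peak_bin = int(np.argmax(magnitude[: n_samples // 2]))
print(peak_bin)
```

The peak lands in bin 5, the number of cycles in the window. An image works the same way, except that the frequencies run along two axes.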
Fourier Transform with numpy
First, we will see how to find the Fourier Transform using NumPy. NumPy has an FFT package for this. np.fft.fft2() gives us the frequency transform, which is a complex array. Its first argument is the input image, which should be grayscale. The second argument, s, is optional and sets the size of the output array. If it is larger than the input image, the input is padded with zeros before the FFT is calculated. If it is smaller than the input image, the input is cropped. If it is not passed, the output array has the same size as the input.
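The effect of the optional size argument can be checked on a small array. This is a sketch on an assumed 4x4 toy array standing in for a grayscale image:

```python
import numpy as np

image = np.arange(16, dtype=float).reshape(4, 4)  # stand-in for a grayscale image

same = np.fft.fft2(image)                # no size given: output matches the input
padded = np.fft.fft2(image, s=(8, 8))    # larger: input is zero-padded first
cropped = np.fft.fft2(image, s=(2, 2))   # smaller: input is cropped to 2x2 first

print(same.shape, padded.shape, cropped.shape)
print(np.iscomplexobj(same))             # the result is a complex array
```

Cropping keeps the top-left part of the input, so the cropped result is the transform of image[:2, :2] only.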
Once you have the result, the zero-frequency component (DC component) is at the top-left corner. If you want to bring it to the center, you need to shift the result by N/2 in both directions. The function np.fft.fftshift() does exactly this, which makes the result easier to analyze. Once you have the frequency transform, you can find the magnitude spectrum.
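A small sketch of the shift, again on an assumed 4x4 toy array, shows where the DC component sits before and after np.fft.fftshift():

```python
import numpy as np

image = np.arange(16, dtype=float).reshape(4, 4)  # assumed toy image
f = np.fft.fft2(image)
fshift = np.fft.fftshift(f)

# The DC term equals the sum of all pixel values.
dc_before = f[0, 0]        # top-left corner before shifting
dc_after = fshift[2, 2]    # N/2 = 2 steps in both directions after shifting

# The magnitude spectrum is the absolute value of the complex coefficients.
magnitude_spectrum = np.abs(fshift)
print(dc_before, dc_after, image.sum())
```

For this 4x4 input, the value that started at index (0, 0) ends up at index (2, 2), the center of the array.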
1. Imports
import matplotlib.pyplot as plt
import numpy as np
import cv2
%matplotlib inline
2. Load the images and convert them to the RGB colorspace - For this article, I have chosen two images of different kinds:
a. One with stripes
b. One with a solid background
so that we can see the frequency changes properly.
stripes_image = cv2.imread('image 1')
stripes_image = cv2.cvtColor(stripes_image,cv2.COLOR_BGR2RGB)
solid_image = cv2.imread('image 2')
solid_image = cv2.cvtColor(solid_image,cv2.COLOR_BGR2RGB)
f,(ax1,ax2) = plt.subplots(1,2,figsize=(10,5))
ax1.imshow(stripes_image)
ax2.imshow(solid_image)
plt.show()
NOTE: In an image, an area with edges is considered a high-frequency region, and an area of solid color is considered a low-frequency region.
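The note can be illustrated on a single row of pixels. This is a sketch with two assumed 8-pixel rows, one of solid color and one containing a sharp edge:

```python
import numpy as np

flat_row = np.full(8, 0.5)                                      # solid color
edge_row = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])   # one sharp edge

flat_mag = np.abs(np.fft.fft(flat_row))
edge_mag = np.abs(np.fft.fft(edge_row))

# Index 0 is the zero-frequency (average) term; the rest are actual frequencies.
print(np.round(flat_mag[1:], 3))   # no variation, so no frequency content
print(np.round(edge_mag[1:], 3))   # the edge needs non-zero frequency terms
```

The solid row has energy only in the zero-frequency term, while the row with the edge spreads energy into the higher-frequency terms.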
3. Convert to the GRAY colorspace and normalize the images
# convert it to gray scale
gray_stripes = cv2.cvtColor(stripes_image, cv2.COLOR_RGB2GRAY)
gray_solid = cv2.cvtColor(solid_image, cv2.COLOR_RGB2GRAY)
# Normalize image from 0-255 to 0-1
norm_stripes = gray_stripes/255.0
norm_solid = gray_solid/255.0
4. Apply the Fourier Transform
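The plotting step later calls a helper named Fourier_transform. A minimal sketch of such a helper, assuming it follows the recipe described earlier (np.fft.fft2, then np.fft.fftshift, then the log-magnitude expression discussed next), could look like this. The small epsilon is an addition for this sketch, not part of the article's expression:

```python
import numpy as np

def Fourier_transform(norm_image):
    """Return the centered log-magnitude spectrum of a 2-D grayscale image."""
    f = np.fft.fft2(norm_image)      # complex frequency transform
    fshift = np.fft.fftshift(f)      # move the DC component to the center
    # Assumption: a tiny epsilon avoids log(0) when a coefficient is exactly
    # zero, which can happen for very uniform images.
    frequency_tx = 20 * np.log(np.abs(fshift) + 1e-12)
    return frequency_tx

# Usage on an assumed toy image: left half black, right half white.
demo = np.zeros((8, 8))
demo[:, 4:] = 1.0
demo_spectrum = Fourier_transform(demo)
print(demo_spectrum.shape)
```

The output has the same shape as the input and can be passed straight to imshow.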
Why 20*np.log(np.abs(fshift))?
fshift is an array of complex numbers, so it cannot be displayed directly; np.abs() takes the magnitude of each coefficient. Even these magnitudes vary over a very large range, and taking the logarithm compresses that range into something that can be plotted.
20 * log(abs(f)) = 10 * log(abs(f)^2)
The factor 10 is arbitrary for display purposes (it comes from the decibel convention), and the extra factor 2 (2*10) is equivalent to squaring the spectrum before taking the logarithm. If you only want to visualize the FFT, this factor does not matter; only the logarithm is important.
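Both points can be checked numerically. This is a sketch on a few assumed magnitude values spanning nine orders of magnitude:

```python
import numpy as np

magnitudes = np.array([1e-3, 1.0, 1e3, 1e6])   # assumed example magnitudes

left = 20 * np.log(magnitudes)
right = 10 * np.log(magnitudes ** 2)
print(np.allclose(left, right))                 # the two forms agree

# Spread of the values before and after taking the logarithm
ratio_before = magnitudes.max() / magnitudes.min()
spread_after = left.max() - left.min()
print(ratio_before, spread_after)
```

The raw values differ by a factor of about a billion, while the log-scaled values span only a few hundred units, which is why the spectrum becomes displayable.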
5. Apply the Fourier Transform and plot the images.
f_stripes = Fourier_transform(norm_stripes)
f_solid = Fourier_transform(norm_solid)
f,(ax1,ax2,ax3,ax4) = plt.subplots(1,4,figsize = (30,5))
ax1.imshow(stripes_image)
ax1.set_title('Original Stripes Image')
ax2.imshow(f_stripes)
ax2.set_title('Transformed Stripes Image')
ax3.imshow(solid_image)
ax3.set_title('Original Solid Image')
ax4.imshow(f_solid)
ax4.set_title('Transformed Solid Image')
Conclusion:
- Solid Image - It has no edges, so, as stated earlier, it is a low-frequency image; its spectrum is brightest at the center, where the low frequencies are.
- Stripes Image - A region of pure white or pure black can be considered a solid-color area and hence has low frequency, but the transition from white to black (or vice versa) is a high-frequency region because of the edge.
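The same conclusion can be reproduced without the photos. The sketch below uses two synthetic 64x64 images (assumptions standing in for the article's stripes and solid images) and measures how much of the spectral energy lies outside the zero-frequency term:

```python
import numpy as np

size = 64
x = np.arange(size)

# Vertical stripes: 4 black pixels, then 4 white pixels, repeated.
stripes = np.tile(((x // 4) % 2).astype(float), (size, 1))
# Solid image: a single gray value everywhere.
solid = np.full((size, size), 0.5)

def non_dc_fraction(image):
    """Fraction of spectral energy outside the zero-frequency (DC) term."""
    energy = np.abs(np.fft.fft2(image)) ** 2
    return 1.0 - energy[0, 0] / energy.sum()

stripes_fraction = non_dc_fraction(stripes)
solid_fraction = non_dc_fraction(solid)
print(stripes_fraction, solid_fraction)
```

The solid image keeps essentially all of its energy at zero frequency, while the striped image puts half of its energy into higher frequencies because of the repeated edges.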
EXTRA: If you want to learn how the Fourier transform looks on a regular image, follow this Jupyter notebook till the end.
Wrap Up
With this, we have completed our learning objectives for this lesson.
After this, you know what the Fourier transform is and how to use it at a novice level. To understand it more, refer to: Reading Material
Link to GitHub.
Links Catalogue here
Upcoming
That is it for the Fourier Transform. In the next notebook, we'll learn what filters are and how to create one.
This is it for this article. See you at the next one.
Until then, follow for more, and don't forget to connect with me on LinkedIn. ❤❤❤
Published via Towards AI