Understanding Convolution
Author(s): Ayo Akinkugbe
Originally published on Towards AI.
To better understand what convolution is, it helps to know why dense neural networks (DNNs) don't work well for images. If you train both a DNN and a CNN (convolutional neural network) on the same image data, you are bound to get higher accuracy and lower loss from the CNN model. Here are some reasons why:
1. High Dimensionality and Computational Complexity
Images typically have a large number of pixels. For example, a 200×200 image has 40,000 pixels, and a dense neural network would need to treat each pixel as an independent input. A fully connected layer with 40,000 inputs would require an enormous number of connections to the next layer, leading to:
- High memory usage: Storing the weights for every pixel connection in large images becomes impractical.
- Increased computational cost: Processing becomes slow and inefficient because dense layers don't take advantage of the spatial structure of images.
In contrast, convolutional layers in CNNs use small filters that share weights across the image, drastically reducing the number of parameters and making computations more efficient.
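To see the scale of this difference, here is a minimal sketch (using PyTorch, with a hidden-layer size and filter count chosen purely for illustration) comparing the parameter count of a fully connected layer and a convolutional layer operating on a 200×200 grayscale image:

```python
import torch.nn as nn

# Dense layer: the 200x200 image is flattened to 40,000 inputs,
# fully connected to a hidden layer of 128 units.
dense = nn.Linear(200 * 200, 128)

# Convolutional layer: 32 filters of size 3x3 share their weights
# across every position in the image.
conv = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3)

count = lambda layer: sum(p.numel() for p in layer.parameters())
print(f"Dense layer parameters: {count(dense):,}")  # 40,000 * 128 + 128 = 5,120,128
print(f"Conv layer parameters:  {count(conv):,}")   # 32 * (3 * 3 * 1) + 32 = 320
```

The dense layer needs millions of weights for a single modest hidden layer, while the convolutional layer gets by with a few hundred because the same small filters are reused everywhere in the image.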
2. Loss of Spatial Hierarchy
DNNs treat all pixels as independent features, ignoring the fact that neighboring pixels in an image are closely related. This means that in a DNN:
- Spatial relationships are not considered: Dense layers don't account for spatial patterns like edges, textures, or shapes that are present in nearby pixels. Images have local features (e.g., eyes, corners of objects) that need to be preserved.
- No translation invariance: Dense networks struggle to recognize a pattern, such as an object in an image, if it appears in a different position. Convolutional layers, on the other hand, apply the same filters across the entire image, making them good at recognizing objects regardless of their location.
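As a quick illustration of this translation invariance, here is a small sketch (using NumPy and SciPy's `correlate2d`, which performs the same sliding operation CNN libraries call convolution; the image and filter values are made up) showing that one filter produces the same peak response no matter where the pattern sits:

```python
import numpy as np
from scipy.signal import correlate2d

# A tiny vertical-edge filter; the same weights are applied at every position.
edge_filter = np.array([[1, -1],
                        [1, -1]])

# Two 6x6 images containing the same bright vertical stripe at different positions.
img_left = np.zeros((6, 6))
img_left[:, 1] = 1.0
img_right = np.zeros((6, 6))
img_right[:, 4] = 1.0

resp_left = correlate2d(img_left, edge_filter, mode="valid")
resp_right = correlate2d(img_right, edge_filter, mode="valid")

# The strongest response has the same value in both cases; only its location moves.
print(resp_left.max(), resp_right.max())  # 2.0 2.0
```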
3. Inefficient Feature Learning
In DNNs, each layer needs to learn global patterns from scratch. This makes it difficult to detect complex hierarchical features in images, such as edges in earlier layers and entire objects in later layers.
In contrast, CNNs can learn hierarchical features. Early layers in a CNN focus on low-level features (like edges and textures), while deeper layers learn more abstract concepts (like parts of objects or even whole objects). Dense layers do not efficiently capture this hierarchical structure, leading to poor performance on image data.
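To picture what this hierarchy looks like in practice, here is a rough PyTorch sketch of a small CNN (the layer sizes and depths are illustrative choices, not something prescribed here); each convolutional stage builds on the feature maps produced by the one before it:

```python
import torch.nn as nn

# A small illustrative CNN for 3-channel (RGB) images.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # early layer: edges, color blobs
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # middle layer: textures, simple motifs
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=3, padding=1),  # deeper layer: object parts
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),                      # summarize each feature map
    nn.Flatten(),
    nn.Linear(64, 10),                            # classifier over 10 hypothetical classes
)
```

Each pooling step also widens the region of the original image that later filters effectively respond to, which connects to the idea of the receptive field discussed below.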
4. Overfitting
With a large number of parameters in fully connected layers, a dense network is more prone to overfitting, especially with smaller datasets. Images usually contain a lot of redundant information, and fully connected networks have no mechanism to reduce this redundancy. Convolutional layers reduce overfitting through the concept of weight sharing (the same filter slides over different parts of the image). This greatly reduces the number of parameters, leading to more generalizable models with less risk of overfitting.
How Then Does Convolution Work?
Imagine sweeping a magnifying glass across an image to detect specific patterns (like lines or shapes). Convolution in CNNs can be thought of as a way to capture patterns in data by sliding a small magnifying glass (filter) across an image or other data to focus on specific local features. Each filter looks for a different kind of pattern, and the CNN uses many of them to understand the image, layer by layer, from simple features to complex ones.
- Filter as a Pattern Detector: Imagine you have an image of a cat. A filter (or kernel) in a CNN is a small matrix (e.g., 3×3 or 5×5) that scans across this image. Each filter looks for a specific feature like edges, textures, or shapes. For example, one filter might detect horizontal lines, another might detect vertical lines, and yet another could find corners.
- Sliding Across the Image: The filter moves over the image (convolves) in small steps. At each step, it performs a dot product between the values in the filter and the corresponding region of the image. This helps the CNN extract local information about the image (such as edges or texture patterns) without looking at the entire image at once.
- Feature Map: The result of this sliding process is a new matrix called a feature map. The values in the feature map represent how strongly the feature (pattern) the filter is looking for is present in different parts of the image. For example, if the filter is detecting vertical edges, the feature map will have high values where vertical edges appear in the image (a worked sketch follows this list).
- Multiple Filters, Rich Features: A CNN uses many different filters to capture various features. Early layers typically learn simple features like edges, while deeper layers learn more complex patterns (e.g., eyes, faces, or even abstract shapes).
- Receptive Field: The filter's size limits how much of the image it "sees" at once, which is called its receptive field. As you go deeper in the network, the filters "see" larger parts of the image, which allows the network to detect higher-level features, like objects or parts of objects.
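To make the sliding dot product and the resulting feature map concrete, here is a minimal from-scratch sketch in NumPy (the image and filter values are invented for illustration; a real CNN learns its filter values during training rather than using hand-designed ones):

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image, taking a dot product at each position.
    (What deep-learning libraries call 'convolution' is this cross-correlation.)"""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            feature_map[i, j] = np.sum(patch * kernel)  # dot product of patch and filter
    return feature_map

# A 6x6 image: dark on the left half, bright on the right half -> one vertical edge.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# A 3x3 vertical-edge filter: responds strongly to dark-to-bright transitions.
vertical_edge = np.array([[-1, 0, 1],
                          [-1, 0, 1],
                          [-1, 0, 1]])

print(convolve2d(image, vertical_edge))
# The feature map is largest in the columns where the edge sits and zero in flat
# regions -- exactly the "how strongly is the pattern here?" signal described above.
```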
Conclusion
Convolution improves image prediction because its filters drastically reduce the number of parameters to train while preserving the spatial hierarchy of the image. These properties are what allow CNNs to deliver higher accuracy and lower loss than dense networks on image data.