Convolutional Neural Networks (CNNs)

Introduction

Convolutional Neural Networks (CNNs) are designed for processing grid-like data such as images. They use convolution operations to detect local features like edges, textures, and patterns.

Key Concepts

Convolution Operation

  • Slides filters across input to detect features
  • Parameter sharing: the same filter weights are reused at every spatial position, which cuts parameter count and reduces overfitting
  • Translation equivariance: a shifted input produces a correspondingly shifted feature map, so features are detected wherever they appear
  • Creates feature maps highlighting patterns
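The sliding-filter operation above can be sketched directly in NumPy. This is a minimal "valid" 2D convolution (no padding, stride 1) applied to a toy image with a vertical edge; the image contents and kernel values here are illustrative:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D convolution: slide the kernel over the image,
    computing an elementwise product and sum at each position."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# Toy image: left half dark (0), right half bright (1) -> a vertical edge
img = np.zeros((5, 5))
img[:, 3:] = 1.0

vertical_edge = np.array([[-1., 0., 1.],
                          [-1., 0., 1.],
                          [-1., 0., 1.]])

fmap = conv2d(img, vertical_edge)  # 5x5 input, 3x3 kernel -> 3x3 feature map
```

The feature map responds strongly (value 3.0) exactly where the window straddles the dark-to-bright transition, and is zero in flat regions, which is what "detecting a vertical edge" means in practice.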

Pooling Layers

  • Reduces spatial dimensions
  • Max pooling: takes maximum value
  • Average pooling: takes average value
  • Provides translation invariance
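Max pooling can be sketched the same way. This hypothetical 2×2, stride-2 example halves each spatial dimension by keeping only the strongest activation in each window:

```python
import numpy as np

def max_pool(fmap, size=2, stride=2):
    """Max pooling: keep the maximum value in each size x size window."""
    oh = (fmap.shape[0] - size) // stride + 1
    ow = (fmap.shape[1] - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            window = fmap[i*stride:i*stride+size, j*stride:j*stride+size]
            out[i, j] = window.max()
    return out

fmap = np.array([[1., 3., 2., 0.],
                 [4., 6., 1., 2.],
                 [0., 1., 5., 7.],
                 [2., 3., 8., 4.]])

pooled = max_pool(fmap)  # 4x4 -> 2x2
```

Because only the window maximum survives, small shifts of a feature within a window leave the pooled output unchanged; this is the source of the (local) translation invariance mentioned above. Average pooling is identical except `window.max()` becomes `window.mean()`.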

Interactive Convolution Playground

Experiment with different filters to see how convolution works. Draw on the input image and observe the effects:

Input Image (32×32)

Click to draw | Shift+click to erase | Alt+click to analyze pixel | Ctrl+click for precise

Convolution Output

Filter Selection

Detects vertical edges

Current Kernel (3×3)

-1.0   0.0   1.0
-1.0   0.0   1.0
-1.0   0.0   1.0

Parameters

Image Presets

Feature Maps Visualization

See how multiple filters create different feature maps, followed by pooling operations:

Top row: Convolution outputs | Bottom row: After pooling

CNN Architecture

Typical CNN architecture showing the flow from input through convolution, pooling, and fully connected layers:

Convolution: Feature extraction with learnable filters
Pooling: Spatial dimension reduction
Fully Connected: Classification/regression
Output: Final predictions
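The flow above can be traced numerically by following the spatial size through each stage. The layer sizes in this sketch are illustrative (a small 32×32 network), not taken from the demo:

```python
def conv_out(n, k, p=0, s=1):
    """Spatial output size for one conv/pool layer (floor division)."""
    return (n + 2 * p - k) // s + 1

# Trace spatial size through a hypothetical small CNN on a 32x32 input
n = 32
n = conv_out(n, k=3, p=1)        # conv 3x3, pad 1  -> 32
n = conv_out(n, k=2, s=2)        # max pool 2x2     -> 16
n = conv_out(n, k=3, p=1)        # conv 3x3, pad 1  -> 16
n = conv_out(n, k=2, s=2)        # max pool 2x2     -> 8
flat = n * n * 64                # flatten 64 channels for the FC layer
```

The fully connected classifier then operates on the `flat`-dimensional vector; each pooling stage halves the spatial size while convolution (with padding 1) preserves it.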

Receptive Field Analysis

Click on any pixel in the input to see its receptive field: the region of the input that influences that output value after convolution.

Receptive Field Visualization

Blue highlight shows the receptive field of the selected pixel

Receptive Field Details

Alt+click on the input image above to analyze a pixel's receptive field

💡 Understanding Receptive Fields

  • Each output pixel "sees" only a local region of the input
  • Deeper layers have larger receptive fields
  • This locality enables CNNs to detect features hierarchically
  • Small receptive fields detect edges; large ones detect objects
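How the receptive field grows with depth can be computed with a standard recurrence: each layer adds (kernel − 1) times the accumulated stride ("jump") of the layers before it. A small sketch over a hypothetical conv-pool-conv stack:

```python
def receptive_field(layers):
    """Receptive field of one output pixel, in input pixels.
    layers: list of (kernel_size, stride) pairs, input-first."""
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump  # each layer widens the field by (k-1) * jump
        jump *= s             # strides compound the step between outputs
    return rf

# Illustrative stack: 3x3 conv, 2x2/stride-2 pool, 3x3 conv
rf = receptive_field([(3, 1), (2, 2), (3, 1)])
```

Here a single output pixel already depends on an 8×8 input region; stacking more layers (or more pooling) grows the field further, which is why deep layers can respond to whole objects rather than edges.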

Mathematical Foundation

Convolution Operation:

(f * g)[n] = Σ f[i] × g[n - i]

Where f is the input and g is the kernel/filter. (Strictly, this definition flips the kernel; most deep learning frameworks actually implement cross-correlation, without the flip, but still call it convolution.)
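The 1D formula above can be implemented directly. This sketch computes the full discrete convolution of two short sequences chosen for illustration:

```python
def conv1d(f, g):
    """Full discrete convolution: (f * g)[n] = sum_i f[i] * g[n - i]."""
    n_out = len(f) + len(g) - 1
    out = [0.0] * n_out
    for n in range(n_out):
        for i in range(len(f)):
            if 0 <= n - i < len(g):   # only sum terms where g is defined
                out[n] += f[i] * g[n - i]
    return out

signal = [1.0, 2.0, 3.0]
kernel = [0.0, 1.0]                    # a pure one-step delay
result = conv1d(signal, kernel)        # -> [0.0, 1.0, 2.0, 3.0]
```

Convolving with `[0, 1]` simply shifts the signal by one step, a quick sanity check that the index arithmetic `g[n - i]` matches the formula.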

Output Size Formula:

Output = ⌊(Input + 2×Padding − Kernel) / Stride⌋ + 1

(The floor handles strides that do not divide evenly; any leftover input positions are dropped.)
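A few worked instances of the formula, using the 32×32 input from the playground above:

```python
def output_size(n, k, p, s):
    """Output spatial size: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

same    = output_size(32, k=3, p=1, s=1)  # "same" padding keeps 32
valid   = output_size(32, k=3, p=0, s=1)  # no padding shrinks to 30
strided = output_size(32, k=3, p=1, s=2)  # stride 2 halves to 16
```

Padding of (Kernel − 1) / 2 with stride 1 preserves the input size, which is why 3×3 convolutions are almost always paired with padding 1.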

Key Takeaways

  • CNNs are specifically designed for processing spatial data like images
  • Convolution layers detect local features through learnable filters
  • Pooling layers reduce spatial dimensions and provide translation invariance
  • Parameter sharing in convolution makes CNNs efficient and robust
  • Different filters detect different types of features (edges, textures, shapes)
  • CNNs form the backbone of modern computer vision systems
  • Understanding convolution is crucial for working with image-based deep learning models