Interactive Exercises - CNN Architectures

Exercise 1: Convolution Fundamentals

Beginner

Consider a 3x3 convolution filter applied to a 5x5 input image with padding='valid' and stride=1. What will be the spatial dimension of the output?

5x5
3x3
4x4
3x3x3
Remember the formula for the output size along each spatial dimension with padding='valid' (no padding):
output_size = floor((input_size - filter_size) / stride) + 1
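To try the formula on other sizes, here is a minimal sketch in plain Python. The helper name conv_output_size and its optional padding argument are assumptions made for illustration; the floor division covers strides that do not divide the input evenly.

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Output size of a convolution along one spatial dimension.

    padding=0 corresponds to padding='valid'. The helper name and the
    padding argument are illustrative, not part of any library.
    """
    return (input_size + 2 * padding - filter_size) // stride + 1


# A 5x5 filter on a 28x28 input, stride 1, no padding
print(conv_output_size(28, 5))           # 24

# A 3x3 filter on a 7x7 input, stride 2, no padding
print(conv_output_size(7, 3, stride=2))  # 3
```

The same function applies independently to height and width, so a square input with a square filter yields a square output.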

Exercise 2: Classic Architectures

Beginner

Match each CNN architecture with its main feature:

AlexNet
VGG
GoogLeNet/Inception
ResNet

Architecture with residual connections:

Architecture with inception modules:

Architecture with a simple and deep design:

First successful deep CNN architecture:

Exercise 3: Efficient Architectures

Intermediate

Which of the following statements about MobileNet is FALSE?

It uses depthwise separable convolutions to reduce parameters
It introduces the width multiplier to adjust the model size
It uses inverted bottleneck blocks in all of its versions
It is optimized for mobile and embedded devices
Think about the differences between MobileNetV1 and MobileNetV2, and what innovations each version introduced.

Exercise 4: Architectures for Segmentation

Intermediate

Select all features that characterize the original U-Net architecture:

Symmetric encoder-decoder architecture
Skip connections between corresponding levels
Residual blocks at each level
Originally designed for medical image segmentation
Uses spatial attention at each level
U-Net has a characteristic "U" shape due to its design. Think about its original components before considering later variants.

Exercise 5: Architectures for Detection

Advanced

Arrange the following object detection architectures in chronological order of their first publication:

YOLO v3
R-CNN
Faster R-CNN
YOLO v1

Chronological order (oldest on top):

R-CNN was one of the first successful CNN-based detectors, while YOLO introduced the single-stage detection paradigm after Faster R-CNN.

Exercise 1: Convolution Implementation

Intermediate

Complete the following code to implement a 3x3 convolutional layer with 16 filters, stride=1 and 'same' padding using PyTorch:


    import torch
    import torch.nn as nn

    class SimpleCNN(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv = nn.______(______, kernel_size=______, stride=______, padding=______)
            self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
            self.flatten = nn.Flatten()
            self.fc = nn.Linear(16 * 14 * 14, 10)  # Assuming input is 28x28

        def forward(self, x):
            x = torch.relu(self.conv(x))
            x = self.pool(x)
            x = self.flatten(x)
            x = self.fc(x)
            return x
            
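In PyTorch, the 2D convolutional layer is called nn.Conv2d. With a 3x3 kernel and stride 1, padding=1 preserves the spatial size, which is equivalent to 'same' padding (recent PyTorch versions also accept padding='same' when the stride is 1).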
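To see why the fully connected layer expects 16 * 14 * 14 input features, the spatial sizes can be traced by hand. The following is a rough sketch in plain Python, not PyTorch code; the helper names (conv_out, same_padding, trace_simple_cnn) are hypothetical, and it assumes a 28x28 input and an odd kernel size.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output size of a convolution or pooling window along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def same_padding(kernel):
    """Padding that preserves the size for an odd kernel with stride 1."""
    return (kernel - 1) // 2


def trace_simple_cnn(input_size=28, filters=16):
    """Return (size after conv, size after pool, flattened feature count)."""
    after_conv = conv_out(input_size, kernel=3, stride=1,
                          padding=same_padding(3))
    after_pool = conv_out(after_conv, kernel=2, stride=2)
    flat_features = filters * after_pool * after_pool
    return after_conv, after_pool, flat_features


print(same_padding(3))      # 1
print(trace_simple_cnn())   # (28, 14, 3136)
```

The last value, 3136, equals 16 * 14 * 14, which matches the input size of the fully connected layer in the exercise.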

Exercise 2: Residual Block

Advanced

Implement a basic residual block like the one used in ResNet. The block should contain two 3x3 convolutional layers with the same number of filters and a skip connection:


    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ResidualBlock(nn.Module):
        def __init__(self, channels):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            self.bn1 = nn.BatchNorm2d(channels)
            self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            self.bn2 = nn.BatchNorm2d(channels)

        def forward(self, x):
            # Store input for the skip connection
            shortcut = x

            # First convolutional layer

            # Second convolutional layer

            # Add skip connection

            return y
                    
A basic residual block has this structure: Conv → BN → ReLU → Conv → BN → Add(input) → ReLU.
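The final Add(input) → ReLU step can be illustrated without any framework. The sketch below is a toy example on plain Python lists, not a solution to the exercise: the two branch functions are hypothetical stand-ins for the Conv → BN → ReLU → Conv → BN stack.

```python
def relu(values):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, v) for v in values]


def residual_step(x, branch):
    """Compute relu(branch(x) + x): add the input first, then apply ReLU."""
    fx = branch(x)
    return relu([a + b for a, b in zip(fx, x)])


def zero_branch(x):
    """Hypothetical branch that has learned nothing (outputs zeros)."""
    return [0.0 for _ in x]


def negating_branch(x):
    """Hypothetical branch whose output outweighs and cancels the input."""
    return [-2.0 * v for v in x]


x = [1.0, 2.0, 0.5]
print(residual_step(x, zero_branch))      # [1.0, 2.0, 0.5]
print(residual_step(x, negating_branch))  # [0.0, 0.0, 0.0]
```

With a zero branch and non-negative inputs, the block passes its input through unchanged, which is one intuition for why residual connections make very deep networks easier to train.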

Exercise 3: Depthwise Separable Convolution

Advanced

Complete the following code to implement a depthwise separable convolution block like the one used in MobileNet:


    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DepthwiseSeparableConv(nn.Module):
        def __init__(self, in_channels, out_channels, kernel_size=3):
            super().__init__()
            # Depthwise convolution
            self.depthwise = nn.___(___, ___, ___, ___, ___)
            # Pointwise convolution
            self.pointwise = nn.___(___, ___, ___)

        def forward(self, x):
            x = F.relu(self.depthwise(x))
            x = F.relu(self.pointwise(x))
            return x
                        
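A depthwise separable convolution consists of two steps:
1. nn.Conv2d with groups=in_channels (depthwise): one spatial filter per input channel
2. nn.Conv2d with kernel_size=1 (pointwise): mixes the channels
This factorization greatly reduces the number of parameters and computations, usually with only a small loss in accuracy.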
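The parameter saving can be checked with simple arithmetic. The sketch below counts weights only (biases are ignored as a simplifying assumption), and the function names and channel sizes are chosen purely for illustration.

```python
def standard_conv_params(in_channels, out_channels, kernel_size=3):
    """Weights in a standard convolution: k * k * C_in * C_out."""
    return kernel_size * kernel_size * in_channels * out_channels


def separable_conv_params(in_channels, out_channels, kernel_size=3):
    """Weights in a depthwise separable convolution."""
    depthwise = kernel_size * kernel_size * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels               # 1x1 convolution mixing channels
    return depthwise + pointwise


std = standard_conv_params(32, 64)
sep = separable_conv_params(32, 64)
print(std)                   # 18432
print(sep)                   # 2336
print(f"{std / sep:.1f}x")   # 7.9x
```

For a 3x3 kernel the reduction approaches 9x as the number of output channels grows, because the pointwise term dominates the separable count.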

Exercise 1: CNN Design for Classification

Intermediate

Design a CNN for CIFAR-10 image classification (32x32x3) by dragging and ordering the layers:

Input(32,32,3)
Conv2D(32,3,3)
Conv2D(64,3,3)
Conv2D(128,3,3)
MaxPooling2D(2,2)
MaxPooling2D(2,2)
Dropout(0.25)
Dropout(0.5)
Flatten()
Dense(512)
Dense(10, softmax)

CNN Architecture (order top to bottom):

A typical CNN architecture follows this pattern: input layer → convolutional blocks (Conv+Pool) → flatten → dense layers → output layer. Consider where to place dropout to prevent overfitting.
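One reason pooling comes before the flatten step is the size of the first dense layer. The following is a back-of-the-envelope sketch; the feature-map sizes (64 channels at 32x32 versus 8x8) are hypothetical numbers chosen for illustration and are not the answer to the exercise.

```python
def flatten_size(height, width, channels):
    """Number of features produced by flattening a feature map."""
    return height * width * channels


def dense_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


# Hypothetical: flatten a 32x32x64 feature map directly (no pooling)
no_pooling = dense_params(flatten_size(32, 32, 64), 512)

# Hypothetical: two 2x2 poolings first reduce 32x32 to 8x8
two_poolings = dense_params(flatten_size(8, 8, 64), 512)

print(no_pooling)    # 33554944
print(two_poolings)  # 2097664
```

Under these assumptions, pooling twice shrinks the first dense layer by roughly a factor of 16, which also reduces the risk of overfitting in that layer.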

Exercise 2: Architecture Analysis

Advanced

Analyze the following architectures and choose the most suitable one for each use case:

1. Real-time object detection on a mobile device:

VGG16
YOLO
U-Net
ResNet-152

2. Accurate tumor segmentation in medical images:

MobileNet
AlexNet
U-Net
YOLO

3. High-accuracy image classification without resource constraints:

ResNet-50
LeNet-5
MobileNetV2
SqueezeNet
