Interactive Exercises - CNN Architectures

Exercise 1: Convolution Fundamentals

Beginner

Consider a 3x3 convolution filter applied to a 5x5 input image with padding='valid' and stride=1. What will be the spatial dimension of the output?

5x5

3x3

4x4

3x3x3

Remember the formula for the output size with padding='valid' (no padding), rounding the division down:
output_size = floor((input_size - filter_size) / stride) + 1
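The formula in this hint can be checked with a small helper. This is a minimal sketch; the function name `conv_output_size` and the optional `padding` argument (number of pixels added on each side) are illustrative additions, not part of the exercise.

```python
def conv_output_size(input_size: int, filter_size: int,
                     stride: int = 1, padding: int = 0) -> int:
    """Output size of a convolution along one spatial dimension."""
    # padding=0 corresponds to padding='valid'; integer division rounds down.
    return (input_size + 2 * padding - filter_size) // stride + 1


# 5x5 input, 3x3 filter, stride 1, no padding
size_valid = conv_output_size(5, 3)
print(f"{size_valid}x{size_valid}")  # 3x3

# 28x28 input, 3x3 filter, one pixel of padding keeps the size unchanged
size_padded = conv_output_size(28, 3, padding=1)
print(f"{size_padded}x{size_padded}")  # 28x28
```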

Exercise 2: Classic Architectures

Beginner

Match each CNN architecture with its main feature:

AlexNet

VGG

GoogLeNet/Inception

ResNet

Architecture with residual connections:

Architecture with inception modules:

Architecture with a simple and deep design:

First deep CNN to win the ImageNet challenge (ILSVRC 2012):

Exercise 3: Efficient Architectures

Intermediate

Which of the following statements about MobileNet is FALSE?

It uses depthwise separable convolutions to reduce parameters

It introduces the width multiplier to adjust the model size

It uses inverted bottleneck blocks in all of its versions

It is optimized for mobile and embedded devices

Think about the differences between MobileNetV1 and MobileNetV2, and what innovations each version introduced.

Exercise 4: Architectures for Segmentation

Intermediate

Select all features that are specific to the U-Net architecture:

Symmetric encoder-decoder architecture

Skip connections between corresponding levels

Residual blocks at each level

Originally designed for medical image segmentation

Uses spatial attention at each level

U-Net has a characteristic "U" shape due to its design. Think about its original components before considering later variants.
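To make the "U" shape concrete, here is a deliberately small sketch with a single downsampling level. The names `TinyUNet` and `double_conv`, the channel widths, and the use of padded convolutions are assumptions chosen for brevity; the original U-Net is deeper and uses unpadded convolutions with cropping.

```python
import torch
import torch.nn as nn


def double_conv(in_ch: int, out_ch: int) -> nn.Sequential:
    """Two 3x3 convolutions with ReLU, the basic unit at each level."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(),
    )


class TinyUNet(nn.Module):
    def __init__(self, in_ch: int = 1, num_classes: int = 2, base: int = 16):
        super().__init__()
        self.enc = double_conv(in_ch, base)            # encoder level
        self.pool = nn.MaxPool2d(2)                    # downsample by 2
        self.bottleneck = double_conv(base, base * 2)  # bottom of the "U"
        self.up = nn.ConvTranspose2d(base * 2, base, kernel_size=2, stride=2)
        self.dec = double_conv(base * 2, base)         # decoder level
        self.head = nn.Conv2d(base, num_classes, kernel_size=1)

    def forward(self, x):
        e = self.enc(x)
        b = self.bottleneck(self.pool(e))
        u = self.up(b)
        # Skip connection: concatenate encoder features with upsampled features
        d = self.dec(torch.cat([e, u], dim=1))
        return self.head(d)


out = TinyUNet()(torch.randn(1, 1, 64, 64))
print(out.shape)  # torch.Size([1, 2, 64, 64])
```

The skip connection here is a channel-wise concatenation, which is why the decoder block takes `base * 2` input channels.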

Exercise 5: Architectures for Detection

Advanced

Arrange the following object detection architectures in chronological order of publication:

YOLO v3

R-CNN

Faster R-CNN

YOLO v1

Chronological order (oldest on top):

R-CNN was one of the first successful CNN-based detectors. YOLO v1 introduced the single-stage detection paradigm shortly after Faster R-CNN, and YOLO v3 followed a few years later.

Exercise 1: Convolution Implementation

Intermediate

Complete the following code to implement a 3x3 convolutional layer with 16 filters, stride=1 and 'same' padding using PyTorch:


    import torch
    import torch.nn as nn

    class SimpleCNN(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv = nn.______(______, kernel_size=______, stride=______, padding=______)
            self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
            self.flatten = nn.Flatten()
            self.fc = nn.Linear(16 * 14 * 14, 10)  # Assuming input is 28x28

        def forward(self, x):
            x = torch.relu(self.conv(x))
            x = self.pool(x)
            x = self.flatten(x)
            x = self.fc(x)
            return x

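In PyTorch, the 2D convolutional layer is nn.Conv2d. With a 3x3 kernel and stride 1, padding=1 preserves the spatial size, which is equivalent to 'same' padding; recent PyTorch versions also accept padding='same' for stride-1 convolutions.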
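One possible completion is sketched below. The exercise does not state the number of input channels, so `in_channels=1` (a 28x28 grayscale image) is an assumption made here to keep the example runnable.

```python
import torch
import torch.nn as nn


class SimpleCNN(nn.Module):
    def __init__(self, in_channels: int = 1):  # assumed: grayscale input
        super().__init__()
        # 16 filters of size 3x3, stride 1; padding=1 keeps 28x28 -> 28x28
        self.conv = nn.Conv2d(in_channels, 16, kernel_size=3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)  # 28x28 -> 14x14
        self.flatten = nn.Flatten()
        self.fc = nn.Linear(16 * 14 * 14, 10)

    def forward(self, x):
        x = torch.relu(self.conv(x))
        x = self.pool(x)
        x = self.flatten(x)
        return self.fc(x)


model = SimpleCNN()
logits = model(torch.randn(4, 1, 28, 28))
print(logits.shape)  # torch.Size([4, 10])
```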

Exercise 2: Residual Block

Advanced

Implement a basic residual block like the one used in ResNet. The block should contain two 3x3 convolutional layers with the same number of filters and a skip connection:


    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ResidualBlock(nn.Module):
        def __init__(self, channels):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            self.bn1 = nn.BatchNorm2d(channels)
            self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            self.bn2 = nn.BatchNorm2d(channels)

        def forward(self, x):
            # Store input for the skip connection
            shortcut = x

            # First convolutional layer (conv1 -> bn1 -> ReLU)

            # Second convolutional layer (conv2 -> bn2)

            # Add skip connection

            # Final ReLU

            return y

A basic residual block has this structure: Conv → BN → ReLU → Conv → BN → Add(input) → ReLU.
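The structure in this hint can be sketched as follows. This is one possible solution for the identity-shortcut case, where the channel count and spatial size stay the same; it does not cover the downsampling variant that needs a projection on the shortcut.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResidualBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        shortcut = x                         # store input for the skip connection
        y = F.relu(self.bn1(self.conv1(x)))  # Conv -> BN -> ReLU
        y = self.bn2(self.conv2(y))          # Conv -> BN
        y = y + shortcut                     # Add(input)
        return F.relu(y)                     # final ReLU


block = ResidualBlock(8)
res_out = block(torch.randn(2, 8, 16, 16))
print(res_out.shape)  # torch.Size([2, 8, 16, 16])
```

Because the shortcut is added element-wise, the block's output must have the same shape as its input, which is why both convolutions use padding=1 and the same number of channels.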

Exercise 3: Depthwise Separable Convolution

Advanced

Complete the following code to implement a depthwise separable convolution block like the one used in MobileNet:


    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DepthwiseSeparableConv(nn.Module):
        def __init__(self, in_channels, out_channels, kernel_size=3):
            super().__init__()
            # Depthwise convolution
            self.depthwise = nn.___(___, ___, ___, ___, ___)
            # Pointwise convolution
            self.pointwise = nn.___(___, ___, ___)

        def forward(self, x):
            x = F.relu(self.depthwise(x))
            x = F.relu(self.pointwise(x))
            return x

A depthwise separable convolution consists of two steps:
1. nn.Conv2d with groups=in_channels (depthwise)
2. nn.Conv2d with kernel_size=1 (pointwise)
This factorization cuts parameters and computation substantially, usually at a small cost in accuracy.
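A possible completion of the two blanks, together with a parameter count against a standard convolution, is sketched below. The choice of `padding=kernel_size // 2` (to preserve spatial size) and the helper `count_params` are assumptions added for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_channels: int, out_channels: int, kernel_size: int = 3):
        super().__init__()
        # Depthwise: one k x k filter per input channel (groups=in_channels)
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size,
                                   padding=kernel_size // 2, groups=in_channels)
        # Pointwise: 1x1 convolution that mixes channels
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        x = F.relu(self.depthwise(x))
        x = F.relu(self.pointwise(x))
        return x


def count_params(module: nn.Module) -> int:
    """Total number of learnable parameters (weights and biases)."""
    return sum(p.numel() for p in module.parameters())


separable = DepthwiseSeparableConv(32, 64)
standard = nn.Conv2d(32, 64, kernel_size=3, padding=1)

sep_out = separable(torch.randn(1, 32, 20, 20))
print(sep_out.shape)            # torch.Size([1, 64, 20, 20])
print(count_params(separable))  # 2432
print(count_params(standard))   # 18496
```

For 32 input and 64 output channels, the separable block needs roughly 7.6 times fewer parameters than the standard 3x3 convolution.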

Exercise 1: CNN Design for Classification

Intermediate

Design a CNN for CIFAR-10 image classification (32x32x3) by dragging and ordering the layers:

Input(32,32,3)

Conv2D(32,3,3)

Conv2D(64,3,3)

Conv2D(128,3,3)

MaxPooling2D(2,2)

Dropout(0.25)

Dropout(0.5)

Flatten()

Dense(512)

Dense(10, softmax)

CNN Architecture (order top to bottom):

A typical CNN architecture follows this pattern: input layer → convolutional blocks (Conv+Pool) → flatten → dense layers → output layer. Consider where to place dropout to prevent overfitting.
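The pattern in this hint could be realized in PyTorch as sketched below. This is only one plausible ordering, and it assumes that the pooling layer may be reused after each convolution and that the convolutions use one pixel of padding. The function name `build_cifar10_cnn` is illustrative. The model returns raw logits; softmax is applied separately, as is usual in PyTorch.

```python
import torch
import torch.nn as nn


def build_cifar10_cnn() -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),                      # 32x32 -> 16x16
        nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),                      # 16x16 -> 8x8
        nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),                      # 8x8 -> 4x4
        nn.Dropout(0.25),                     # light dropout after conv blocks
        nn.Flatten(),
        nn.Linear(128 * 4 * 4, 512), nn.ReLU(),
        nn.Dropout(0.5),                      # stronger dropout before output
        nn.Linear(512, 10),                   # one logit per CIFAR-10 class
    )


cifar_model = build_cifar10_cnn().eval()      # eval() disables dropout
with torch.no_grad():
    probs = torch.softmax(cifar_model(torch.randn(2, 3, 32, 32)), dim=1)
print(probs.shape)  # torch.Size([2, 10])
```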

Exercise 2: Architecture Analysis

Advanced

Analyze the following architectures and choose the most suitable one for each use case:

1. Real-time object detection on a mobile device:

VGG16

YOLO

U-Net

ResNet-152

2. Accurate tumor segmentation in medical images:

MobileNet

AlexNet

U-Net

YOLO

3. High-accuracy image classification without resource constraints:

ResNet-50

LeNet-5

MobileNetV2

SqueezeNet
