
AlexNet Implementation

Deep Learning Engineer

Impact

Re-implemented the pioneering CNN architecture that started the deep learning revolution.

Overview

The Spark of the Deep Learning Revolution

In 2012, Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton changed the course of AI history. Their submission to the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), AlexNet, achieved a top-5 error rate of 15.3%, more than 10.8 percentage points lower than that of the runner-up. This was the moment deep learning "arrived."
This project is a high-fidelity re-implementation of that pioneering architecture using modern PyTorch, designed for educational depth and performance.

Architecture Breakdown

AlexNet was revolutionary for its depth and its use of techniques that are now standard.
Figure: AlexNet architecture diagram

  1. Deep Convolutional Layers: 5 convolutional layers followed by 3 fully connected layers.
  2. ReLU Activation: One of the first major successes for ReLU over Tanh/Sigmoid, accelerating training significantly.
  3. Local Response Normalization (LRN): A technique used in the original paper to aid generalization (though less common today).
  4. Overlapping Pooling: 3×3 pooling windows with stride 2, so adjacent windows overlap; the paper found this reduced dimensionality while preserving more spatial information and slightly curbing overfitting (see the sketch after this list).
  5. Dropout: A critical regularization technique to prevent overfitting in the large fully-connected layers.
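
To make these techniques concrete, here is a minimal sketch of the first AlexNet stage in PyTorch, wiring ReLU, LRN (with the paper's hyperparameters), and overlapping pooling together. The variable names are ours, and the 64-filter width follows the torchvision-style model class shown below rather than the paper's original 96.

import torch
import torch.nn as nn

# First stage: convolution, then ReLU, then LRN (size=5, alpha=1e-4,
# beta=0.75, k=2 per the paper), then overlapping 3x3 max pooling.
stage1 = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
    nn.ReLU(inplace=True),
    nn.LocalResponseNorm(size=5, alpha=1e-4, beta=0.75, k=2.0),
    nn.MaxPool2d(kernel_size=3, stride=2),  # window (3) > stride (2) => overlap
)

x = torch.randn(1, 3, 224, 224)
print(stage1(x).shape)  # torch.Size([1, 64, 27, 27])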

Implementation Highlights

1. The Model Class

We mirror the conventional split between the "features" block (the convolutional layers) and the "classifier" block (the fully connected layers):
import torch
import torch.nn as nn

class AlexNet(nn.Module):
    def __init__(self, num_classes: int = 1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            # ... additional layers ...
        )
        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            # ...
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)     # (N, 256, 6, 6) once all conv stages are in place
        x = torch.flatten(x, 1)  # flatten to (N, 9216) for the FC head
        return self.classifier(x)
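
A quick shape sanity check, assuming the elided layers are filled in so that the features block outputs 256×6×6:

model = AlexNet(num_classes=1000)
x = torch.randn(2, 3, 224, 224)  # dummy batch of two 224x224 RGB images
print(model(x).shape)            # torch.Size([2, 1000])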

2. Weight Initialization

The original paper specified weights initialized from a zero-mean Gaussian distribution with a standard deviation of 0.01. We implement this explicitly to ensure the model's behavior matches the historical baseline.
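
A minimal sketch of that initialization (the helper name is ours; note the paper additionally set the biases of the 2nd, 4th, and 5th convolutional layers and of the fully connected hidden layers to 1, which this sketch simplifies to a uniform zero):

import torch.nn as nn

def init_weights(model: nn.Module) -> None:
    # Zero-mean Gaussian with std 0.01, as in Krizhevsky et al. (2012).
    for m in model.modules():
        if isinstance(m, (nn.Conv2d, nn.Linear)):
            nn.init.normal_(m.weight, mean=0.0, std=0.01)
            if m.bias is not None:
                # The paper sets some biases to 1 (conv layers 2, 4, 5 and the
                # FC hidden layers) to feed the ReLUs positive inputs; zero is
                # a simplification here.
                nn.init.constant_(m.bias, 0.0)

model = AlexNet()
init_weights(model)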

3. Modern Training CLI: Typer

While the architecture is historic, the developer experience is modern. We integrated Typer to build a robust CLI that allows for easy hyperparameter tuning and dataset selection directly from the terminal.
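
A minimal sketch of what such a CLI looks like with Typer; the command, option names, and defaults here are illustrative (they echo the paper's batch size of 128 and initial learning rate of 0.01), not the project's actual interface:

import typer

app = typer.Typer()

@app.command()
def train(
    epochs: int = 90,
    lr: float = 0.01,
    batch_size: int = 128,
    data_dir: str = "./data",
) -> None:
    """Train AlexNet with the given hyperparameters."""
    typer.echo(f"Training for {epochs} epochs (lr={lr}, batch_size={batch_size})")
    # ... construct the model, data loaders, and optimizer, then run the loop ...

if __name__ == "__main__":
    app()

Typer derives flags such as --epochs and --batch-size from the function signature and type hints, so hyperparameter tuning from the terminal needs no argparse boilerplate.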

Why Re-implement AlexNet?

In an age of Transformers and 100B+ parameter models, why look back at 2012?
  • Fundamental Understanding: Understanding how weight initialization and dropout affected early deep networks is crucial for any AI engineer.
  • Benchmarking: AlexNet remains a gold standard for testing new deep learning hardware or optimization techniques.
  • Purity: It is a purely convolutional architecture, making it an excellent teaching tool for the mechanics of spatial feature extraction.

🔗 Project Resources

PyTorch · Computer Vision · Deep Learning · CNN


Siwarat Laoprom © 2026