AlexNet Implementation
Deep Learning Engineer

Impact
Re-implemented the pioneering CNN architecture that started the deep learning revolution.
Overview
The Spark of the Deep Learning Revolution
In 2012, Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton changed the course of AI history. Their submission to the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), AlexNet, achieved a top-5 error rate of 15.3%, more than 10.8 percentage points lower than that of the runner-up. This was the moment Deep Learning "arrived."
This project is a high-fidelity re-implementation of that pioneering architecture using modern PyTorch, designed for educational depth and performance.

Figure: AlexNet Hero
Architecture Breakdown
AlexNet was revolutionary for its depth and its use of techniques that are now standard.

Figure: AlexNet Architecture
- Deep Convolutional Layers: 5 convolutional layers followed by 3 fully connected layers.
- ReLU Activation: One of the first large-scale successes of ReLU over tanh/sigmoid, accelerating training significantly.
- Local Response Normalization (LRN): A technique used in the original paper to aid generalization (though less common today).
- Overlapping Pooling: Pooling windows larger than their stride (3×3 windows with stride 2), reducing dimensionality while preserving more spatial information.
- Dropout: A critical regularization technique to prevent overfitting in the large fully connected layers. (The last three techniques are sketched in PyTorch just below.)
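As a rough illustration, here is how the last three techniques map onto PyTorch modules. The hyperparameter values are taken from the original paper; nn.LocalResponseNorm is PyTorch's built-in LRN layer.

import torch.nn as nn

# LRN with the paper's hyperparameters: n=5, alpha=1e-4, beta=0.75, k=2
lrn = nn.LocalResponseNorm(size=5, alpha=1e-4, beta=0.75, k=2.0)

# Overlapping pooling: a 3x3 window sliding by 2, so adjacent windows overlap
pool = nn.MaxPool2d(kernel_size=3, stride=2)

# Dropout with p=0.5, applied to the large fully connected layers
dropout = nn.Dropout(p=0.5)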
Implementation Highlights
1. The Model Class
We mirrored the original split between the convolutional feature extractor ("features") and the fully connected classifier ("classifier").
import torch
import torch.nn as nn

class AlexNet(nn.Module):
    def __init__(self, num_classes: int = 1000):
        super().__init__()
        # Convolutional feature extractor
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),  # overlapping pooling
            # ... additional layers ...
        )
        # Fully connected classifier
        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            # ...
        )
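For completeness, the forward pass follows the usual features → flatten → classifier flow. A minimal sketch of the method, assuming the elided layers above produce the standard 256×6×6 feature map for 224×224 inputs:

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)        # -> (N, 256, 6, 6) for 224x224 inputs
        x = torch.flatten(x, 1)     # flatten all dims except the batch dim
        return self.classifier(x)   # -> (N, num_classes) logits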
2. Weight Initialization
The original paper specified weights initialized from a zero-mean Gaussian distribution with a standard deviation of 0.01. We implement this explicitly to ensure the model's behavior matches the historical baseline.
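A minimal sketch of that scheme, applied via Module.apply. Note that the paper also initialized the biases of certain layers to 1; zeroing all biases here is a simplification.

def init_weights(m: nn.Module) -> None:
    """Zero-mean Gaussian with std 0.01, per Krizhevsky et al. (2012)."""
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.normal_(m.weight, mean=0.0, std=0.01)
        if m.bias is not None:
            # The paper set biases to 1 in the 2nd/4th/5th conv layers and
            # the hidden FC layers, and 0 elsewhere; we zero all of them here.
            nn.init.zeros_(m.bias)

model = AlexNet()
model.apply(init_weights)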
3. Modern Training CLI: Typer
While the architecture is historic, the developer experience is modern. We integrated Typer to build a robust CLI that allows for easy hyperparameter tuning and dataset selection directly from the terminal.
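A sketch of what such a CLI can look like. The command, option names, and defaults below are illustrative rather than the repository's exact interface; the defaults follow the original paper's training setup (90 epochs, learning rate 0.01, batch size 128).

import typer

app = typer.Typer(help="Training CLI for the AlexNet re-implementation.")

@app.command()
def train(
    data_dir: str = typer.Option("data/imagenet", help="Path to the dataset root"),
    epochs: int = typer.Option(90, help="Number of training epochs"),
    lr: float = typer.Option(0.01, help="Initial learning rate"),
    batch_size: int = typer.Option(128, help="Mini-batch size"),
) -> None:
    """Train AlexNet with the given hyperparameters."""
    typer.echo(f"Training for {epochs} epochs (lr={lr}, batch size={batch_size})")
    # ... build the model, data loaders, and optimizer here ...

if __name__ == "__main__":
    app()

This would be invoked as, e.g., python train.py --epochs 90 --lr 0.01 (the script name is likewise illustrative).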
Why Re-implement AlexNet?
In an age of Transformers and 100B+ parameter models, why look back at 2012?
- Fundamental Understanding: Understanding how weight initialization and dropout affected early deep networks is crucial for any AI engineer.
- Benchmarking: AlexNet remains a gold standard for testing new deep learning hardware or optimization techniques.
- Purity: It is a purely convolutional architecture, making it an excellent teaching tool for the mechanics of spatial feature extraction.
🔗 Project Resources
- GitHub Repository: Full source code and lineage.
- Original Paper: "ImageNet Classification with Deep Convolutional Neural Networks".
Tags: PyTorch · Computer Vision · Deep Learning · CNN

Siwarat Laoprom © 2026