
Build a PyTorch-like deep learning framework from scratch with NumPy, implement automatic differentiation and neural network abstractions, and train on MNIST with CNNs and RNNs.
Discover how to access course resources, download the source code zip, and use the provided materials to follow along with each lecture.
Set up an Ubuntu-based development environment for building a deep learning framework by installing Python, creating a virtual environment, and installing NumPy; configure PyCharm with the virtual interpreter.
Set up a macOS development environment for building a deep learning framework by installing Python, creating a virtual environment, and installing NumPy.
Explore building a simple deep learning framework from scratch by implementing a base function class, a sine subclass, and using NumPy to perform forward computations.
Implement the variable class to manage values in the deep learning framework, initialize the value as an ndarray, and enforce type checks with explicit error handling.
Implement a to_array helper to ensure forward outputs are ndarrays. Use NumPy's isscalar to detect scalars and convert them to ndarrays, keeping all results consistent for deep learning computations.
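A minimal sketch of such a helper, assuming it takes a single value and returns it unchanged when it is already an ndarray. The name to_array comes from the lecture summary; the exact body used in the course may differ.

```python
import numpy as np

def to_array(x):
    """Wrap NumPy/Python scalars in an ndarray; pass ndarrays through unchanged."""
    if np.isscalar(x):
        return np.array(x)
    return x

x = np.array(2.0)   # 0-dimensional ndarray
y = x ** 2          # arithmetic on 0-dim arrays can hand back a NumPy scalar
y = to_array(y)     # guarantee an ndarray again before wrapping it in a variable
```

Calling the helper on every forward output keeps the variable class's ndarray type check from failing on scalar results.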
Explore computation graphs, forward propagation, and backward propagation to understand how gradients guide deep learning model training.
Implement a gradient check function that uses central difference numerical differentiation to compare backpropagated gradients with numerical gradients, validating with allclose on a sine of sine example.
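The idea can be sketched with plain NumPy functions. Here analytic_grad is a hand-derived stand-in for the gradient that backpropagation would produce, and the function names and tolerances are illustrative assumptions rather than the course's own code.

```python
import numpy as np

def f(x):
    return np.sin(np.sin(x))

def analytic_grad(x):
    # Chain rule: d/dx sin(sin(x)) = cos(sin(x)) * cos(x)
    return np.cos(np.sin(x)) * np.cos(x)

def numerical_grad(func, x, eps=1e-4):
    # Central difference: (f(x + eps) - f(x - eps)) / (2 * eps)
    return (func(x + eps) - func(x - eps)) / (2 * eps)

def gradient_check(func, grad_func, x, rtol=1e-4, atol=1e-5):
    # True when the two gradients agree within tolerance
    return np.allclose(grad_func(x), numerical_grad(func, x), rtol=rtol, atol=atol)

x = np.array(0.5)
ok = gradient_check(f, analytic_grad, x)
```

The central difference has an error on the order of eps squared, which is why it is preferred over a one-sided difference for this kind of check.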
Implement variable arguments support for forward to handle multiple inputs by collecting xvalues and unpacking with *, and test with a two-input add function.
Demonstrate backpropagation when the same input is used twice in y = x0 + x0, showing how gradients are accumulated rather than overwritten, and verify the correct result with a test.
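A minimal sketch of the accumulation rule, assuming a stripped-down Variable and Add that are not the course's full classes. The backward helper here only walks a single node, which is enough to show why the gradient must be added rather than assigned.

```python
import numpy as np

class Variable:
    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)
        self.grad = None      # filled in during the backward pass
        self.creator = None   # function that produced this variable

class Add:
    def __call__(self, x0, x1):
        self.inputs = (x0, x1)
        y = Variable(x0.data + x1.data)
        y.creator = self
        return y

    def backward(self, gy):
        # d(x0 + x1)/dx0 = 1 and d(x0 + x1)/dx1 = 1
        return gy, gy

def backward(y):
    # Single-node backward pass (illustrative only)
    y.grad = np.ones_like(y.data)
    f = y.creator
    for x, gx in zip(f.inputs, f.backward(y.grad)):
        # Accumulate: overwriting here would leave x0.grad at 1 instead of 2
        x.grad = gx if x.grad is None else x.grad + gx

x0 = Variable(3.0)
y = Add()(x0, x0)
backward(y)
print(x0.grad)  # 2.0
```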
Implement the addition operation by overloading the `__add__` method in the variable class, converting inputs to ndarray objects and then to variable objects, and test with sin(x) + x.
Implement the subtraction operator in a deep learning framework, mirroring addition, and update the forward and backward passes to yield gradients 1 and -1 while testing the change.
Implement the multiplication operation in the Python deep learning framework, performing forward computation and back propagation to compute x and y gradients from upstream derivatives.
Implement division by deriving the derivatives with respect to x and y for the backward pass, writing the forward division, and adding `__truediv__` and `__rtruediv__` to the variable class, then test.
Apply the negate operation in backpropagation: derive dy/dx = -1 for y = -x, and implement the backward pass as -gy.
Implement the reshape method to alter tensor shapes without changing values using NumPy reshape in the forward pass, and reshape gradients to the input shape in backpropagation.
Rewrite the gradient check to support tensor inputs by computing per-element derivatives with an elementwise loop, using nditer and in-place updates, summing y0 and y1, and storing results in grads.
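One way to sketch the tensor version, assuming the function under test is reduced to a scalar with a sum so that each element gets one partial derivative. The names numerical_grad and grads are illustrative.

```python
import numpy as np

def numerical_grad(f, x, eps=1e-4):
    """Central-difference gradient of sum(f(x)) with respect to each element of x."""
    grads = np.zeros_like(x)
    it = np.nditer(x, flags=['multi_index'])
    while not it.finished:
        idx = it.multi_index
        original = x[idx]

        x[idx] = original + eps      # in-place nudge upward
        y1 = np.sum(f(x))
        x[idx] = original - eps      # in-place nudge downward
        y0 = np.sum(f(x))
        x[idx] = original            # restore the element

        grads[idx] = (y1 - y0) / (2 * eps)
        it.iternext()
    return grads

x = np.array([[1.0, 2.0], [3.0, 4.0]])
grads = numerical_grad(lambda v: v ** 2, x)   # should be close to 2 * x
```

The input must be a float array, since the in-place updates would be truncated on an integer array.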
Review the sum_to helper function, which compresses input x to a target output shape using sum with keepdims, handling delta dim and delta axis, and validating the final shape.
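A sketch of how such a helper is commonly written, assuming delta_dim counts the extra leading dimensions of x and delta_axis lists them. The variable names follow the summary, but the body is an assumption, not the course's exact code.

```python
import numpy as np

def sum_to(x, shape):
    """Sum x down to `shape`, undoing a NumPy broadcast from `shape` to x.shape."""
    delta_dim = x.ndim - len(shape)            # extra leading dimensions in x
    delta_axis = tuple(range(delta_dim))       # those leading axes
    # Axes that were size 1 in the target and stretched by broadcasting
    ones_axis = tuple(i + delta_dim for i, s in enumerate(shape) if s == 1)

    y = x.sum(delta_axis + ones_axis, keepdims=True)
    if delta_dim > 0:
        y = y.squeeze(delta_axis)              # drop the leading axes
    assert y.shape == tuple(shape), "sum_to produced an unexpected shape"
    return y

x = np.ones((2, 3, 4))
y = sum_to(x, (3, 1))   # sums over axis 0 and axis 2
```

This is the operation the backward pass of a broadcasting add or multiply needs, so that each gradient ends up with the same shape as its input.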
Implement a matmul class to perform matrix multiplication using NumPy's dot function, compute X and W gradients via backpropagation with transposed weights, and verify with a gradient check.
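The two gradients can be sketched with plain functions instead of the course's matmul class. The function names are assumptions; the formulas are the standard ones for y = x · W with upstream gradient gy.

```python
import numpy as np

def matmul_forward(x, W):
    return np.dot(x, W)

def matmul_backward(x, W, gy):
    gx = np.dot(gy, W.T)   # same shape as x
    gW = np.dot(x.T, gy)   # same shape as W
    return gx, gW

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3))
W = rng.standard_normal((3, 4))

y = matmul_forward(x, W)
gy = np.ones_like(y)       # upstream gradient of loss = y.sum()
gx, gW = matmul_backward(x, W, gy)
```

A quick sanity check is that each gradient has the same shape as the array it belongs to, which is what the transposes guarantee.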
Implement the exp method for tensors, using NumPy's exp to compute y = e^x in the forward pass, and perform backpropagation on the exp node using dy/dx = y.
Explore neural networks by building the basic structure with input, hidden, and output layers. Discover how weights and bias drive z computations through matrix multiplication and a sigmoid activation.
Explore mean squared error as the loss function for training a neural network, and apply gradient descent to update weights and biases to minimize loss.
Implement the module class as the base for layers, enabling forward flow from input x to output z, and manage its own parameters rather than manual w1, b1, w2, b2.
Implement a parameter class and integrate it into a module class, using a set to track parameters, __setattr__ overrides, and recursive, yield-based parameter traversal with clear grad support.
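A minimal sketch of this design, assuming a bare Parameter that only holds data and a gradient. The attribute name _params and the methods params and clear_grads are hypothetical names chosen for illustration.

```python
import numpy as np

class Parameter:
    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)
        self.grad = None

class Module:
    def __init__(self):
        self._params = set()   # names of attributes holding parameters or submodules

    def __setattr__(self, name, value):
        # Register parameters and nested modules as they are assigned
        if isinstance(value, (Parameter, Module)):
            self._params.add(name)
        super().__setattr__(name, value)

    def params(self):
        # Recursive, generator-based traversal
        for name in self._params:
            obj = getattr(self, name)
            if isinstance(obj, Module):
                yield from obj.params()
            else:
                yield obj

    def clear_grads(self):
        for p in self.params():
            p.grad = None

class Linear(Module):
    def __init__(self, in_size, out_size):
        super().__init__()
        self.W = Parameter(np.random.randn(in_size, out_size) * 0.01)
        self.b = Parameter(np.zeros(out_size))

class TwoLayerNet(Module):
    def __init__(self):
        super().__init__()
        self.l1 = Linear(4, 3)
        self.l2 = Linear(3, 2)

net = TwoLayerNet()
n_params = len(list(net.params()))   # W and b from each of the two layers
```

Subclasses must call super().__init__() before assigning any parameter, otherwise the tracking set does not exist yet.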
Define a linear layer class that inherits from the module class, initializes weights and bias, handles input size, and uses forward to perform matrix multiplication within a deep learning framework.
Reorganize the framework by placing core components—var class, function class, and forward/backward calculations—into core, move helpers into helper, place module class and linear class into layers, and SGD into optimizer.
Reorganize the NanoTorch project by moving code into dedicated helper, core, layers, functions, and optimizers modules, updating imports while preserving the network's training behavior.
Add a dataset and a dataloader to the NanoTorch framework, and test them with the MNIST handwriting dataset of 28 by 28 grayscale images (60,000 training, 10,000 test).
Implement a dataloader class for the NanoTorch framework to automate batch retrieval, shuffling, and iteration, making training cleaner with batch x and batch labels.
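A sketch of what such a loader might look like, assuming the dataset supports len() and integer indexing and returns (x, label) pairs. The attribute names are illustrative and may not match NanoTorch.

```python
import math
import numpy as np

class DataLoader:
    def __init__(self, dataset, batch_size, shuffle=True):
        self.dataset = dataset
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.data_size = len(dataset)
        self.max_iter = math.ceil(self.data_size / batch_size)
        self.reset()

    def reset(self):
        # Start a new epoch, reshuffling the sample order if requested
        self.iteration = 0
        if self.shuffle:
            self.index = np.random.permutation(self.data_size)
        else:
            self.index = np.arange(self.data_size)

    def __iter__(self):
        return self

    def __next__(self):
        if self.iteration >= self.max_iter:
            self.reset()
            raise StopIteration
        start = self.iteration * self.batch_size
        batch_index = self.index[start:start + self.batch_size]
        batch = [self.dataset[i] for i in batch_index]
        batch_x = np.array([item[0] for item in batch])
        batch_labels = np.array([item[1] for item in batch])
        self.iteration += 1
        return batch_x, batch_labels

# Toy dataset: ten samples of two features each, labelled 0..9
dataset = [(np.full(2, i, dtype=float), i) for i in range(10)]
loader = DataLoader(dataset, batch_size=4, shuffle=False)
batches = list(loader)
```

With ten samples and a batch size of four, the last batch holds the two remaining samples. Dropping or padding it is a design choice this sketch does not make.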
Download and unzip the MNIST dataset from the resources, then use the train images and train labels to train your network and begin implementing the MNIST dataset class.
Implement the MNIST dataset class to load training and test data from gzip files and map data and label paths, then reshape 28x28 grayscale images for use in the framework.
Learn how softmax converts neural outputs into category probabilities and how cross-entropy computes loss for multi-category problems, with examples from MNIST and output slicing.
Implement the log method in a custom deep learning framework, with forward y = log(x) and backward gradient gy / x, which follows from dy/dx = 1/x.
Implement slicing in your deep learning framework, covering forward slicing, backpropagation into sliced positions, and a getitem method for the var object using numpy zeros.
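The backward step for slicing can be sketched as scattering the upstream gradient into a zero array of the input's shape. The function name getitem_backward is an assumption, and np.add.at is used here so that repeated indices accumulate. The course may fill the zeros array differently.

```python
import numpy as np

def getitem_backward(gy, slices, in_shape):
    """Route the upstream gradient gy back to the positions selected by `slices`."""
    gx = np.zeros(in_shape, dtype=gy.dtype)
    np.add.at(gx, slices, gy)   # unbuffered add, so repeated indices sum up
    return gx

x = np.arange(6, dtype=float).reshape(2, 3)
slices = (slice(None), 1)       # equivalent to x[:, 1]
y = x[slices]                   # forward pass: picks 1.0 and 4.0
gx = getitem_backward(np.ones_like(y), slices, x.shape)
```

Positions that were not selected in the forward pass keep a gradient of zero.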
Implement the clip class to truncate values to min and max limits, apply forward clipping with numpy, and perform backpropagation using a mask to zero gradients for clipped elements.
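A minimal sketch of the forward and backward steps as plain functions, assuming x_min and x_max are scalars. The function names are illustrative, not the course's clip class.

```python
import numpy as np

def clip_forward(x, x_min, x_max):
    return np.clip(x, x_min, x_max)

def clip_backward(x, gy, x_min, x_max):
    # Gradient passes through only where the input was inside [x_min, x_max]
    mask = (x >= x_min) & (x <= x_max)
    return gy * mask

x = np.array([-2.0, 0.5, 3.0])
y = clip_forward(x, 0.0, 1.0)
gx = clip_backward(x, np.ones_like(x), 0.0, 1.0)
```

Clipped elements have a constant output, so their derivative with respect to the input is zero.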
Implement the softmax and cross-entropy loss in NanoTorch by defining the softmax function, clipping probabilities, computing the log, selecting the entries for the true labels, and averaging into a scalar loss for multiclass problems.
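The steps above can be sketched in plain NumPy. The max subtraction inside softmax and the eps value used for clipping are assumptions added for numerical stability, not details taken from the course.

```python
import numpy as np

def softmax(x, axis=1):
    # Subtracting the row maximum leaves the result unchanged but avoids overflow
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def softmax_cross_entropy(logits, labels, eps=1e-15):
    n = logits.shape[0]
    p = np.clip(softmax(logits), eps, 1.0)   # keep log() away from zero
    log_p = np.log(p)
    picked = log_p[np.arange(n), labels]     # log-probability of each true label
    return -picked.mean()                    # average into a scalar loss

logits = np.zeros((2, 3))                    # uniform scores over three classes
labels = np.array([0, 2])
loss = softmax_cross_entropy(logits, labels)
```

With uniform scores over three classes, every probability is 1/3, so the loss equals log(3) regardless of the labels.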
Introduce data preprocessing classes for MNIST: convert raw unsigned int images to float, flatten images, and apply min-max scaling. Integrate transforms into the dataset to prepare data for training MyNet.
Implement an accuracy function for the MNIST classifier by extracting final predictions from y_predict and comparing them to true labels. Compute the average with numpy mean to measure training accuracy.
Evaluate your neural network on the MNIST test set using a test dataloader, compute and print the accuracy, and compare training versus test performance in your from-scratch deep learning framework.
Explore the convolutional neural network from input to output, covering conv operations, padding, stride, and multi-channel feature maps to understand output shapes.
Implement the get_conv_outsize and pair helper functions in the helper file to compute the CNN output size from input, kernel, stride, and padding using integer division for NanoTorch.
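A sketch of the two helpers, assuming they are named get_conv_outsize and pair and that pair turns a single integer into a (height, width) tuple. Both names and signatures are assumptions for illustration.

```python
def pair(x):
    """Return (x, x) for an int, or x itself if it is already a 2-tuple."""
    if isinstance(x, int):
        return (x, x)
    if isinstance(x, tuple) and len(x) == 2:
        return x
    raise ValueError("expected an int or a tuple of two ints")

def get_conv_outsize(input_size, kernel_size, stride, pad):
    # Integer division drops any partial window at the border
    return (input_size + 2 * pad - kernel_size) // stride + 1

oh = get_conv_outsize(28, 5, stride=1, pad=0)   # 28x28 MNIST image, 5x5 kernel
print(oh)  # 24
```

For example, a 3x3 kernel with stride 1 and padding 1 keeps a 28-pixel side at 28, while stride 2 with the same kernel and padding halves it to 14.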
Implement the conv2d layer in the cnn class, initializing parameters, weights, and bias. Forward passes use stride and padding, with the conv2d function handling core convolution and backpropagation.
Implement the conv2d function in NanoTorch, preparing inputs and weights, applying stride and padding, reshaping kernels, and performing matrix multiplication to produce output maps with shape (n, oc, oh, ow).
Master image to matrix and matrix to image in a CNN class with im2col, including batch size, padding, and stride, to enable efficient convolution via matrix multiplication.
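A compact sketch of the image-to-matrix step and of a convolution built on it, assuming inputs of shape (n, c, h, w) and weights of shape (oc, c, kh, kw). The function names and argument order are illustrative and may differ from the course's CNN class.

```python
import numpy as np

def im2col(img, kernel_size, stride=1, pad=0):
    """Unfold (n, c, h, w) images into a matrix with one row per output position."""
    n, c, h, w = img.shape
    kh, kw = kernel_size
    oh = (h + 2 * pad - kh) // stride + 1
    ow = (w + 2 * pad - kw) // stride + 1

    padded = np.pad(img, ((0, 0), (0, 0), (pad, pad), (pad, pad)))
    col = np.zeros((n, c, kh, kw, oh, ow), dtype=img.dtype)
    for i in range(kh):
        i_end = i + stride * oh
        for j in range(kw):
            j_end = j + stride * ow
            # Every output position's (i, j) kernel tap, gathered in one slice
            col[:, :, i, j, :, :] = padded[:, :, i:i_end:stride, j:j_end:stride]

    # Rows: (n, oh, ow) positions; columns: flattened (c, kh, kw) patches
    return col.transpose(0, 4, 5, 1, 2, 3).reshape(n * oh * ow, -1)

def conv2d(x, W, stride=1, pad=0):
    """Convolution as one matrix multiplication on the unfolded input."""
    n, c, h, w = x.shape
    oc, _, kh, kw = W.shape
    oh = (h + 2 * pad - kh) // stride + 1
    ow = (w + 2 * pad - kw) // stride + 1
    col = im2col(x, (kh, kw), stride, pad)
    out = col @ W.reshape(oc, -1).T            # shape (n * oh * ow, oc)
    return out.reshape(n, oh, ow, oc).transpose(0, 3, 1, 2)

x = np.arange(16, dtype=float).reshape(1, 1, 4, 4)
col = im2col(x, (3, 3))                        # four 3x3 patches, nine values each
```

The unfolded matrix repeats overlapping pixels, trading extra memory for the speed of a single matrix multiplication.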
Import CNN components such as Conv2D, ImageToMatrix, and MatrixToImage, apply Flatten preprocessing, and evaluate on the MNIST dataset using the CNN instead of the MLP, achieving rising accuracy.
Welcome to Python: Write Your Own Deep Learning Framework From Scratch.
This course teaches you how to build a simple, PyTorch-like deep learning framework from scratch. It covers the core mechanics of automatic differentiation and neural network abstractions. In this course, I will take you through the process of building a modular working system step by step, using only Python and NumPy.
The first part of the course teaches all you need to know (computation graphs, backpropagation logic, gradient checking, etc.) before you can build a functional autograd engine. In this part, we start with scalar-valued variables and move on to handling more complex logic, such as reusing the same input more than once and supporting advanced operators. You will learn how to automate the chain rule and verify your engine’s accuracy.
The second part of the course teaches you how to transition from scalars to tensors. You will learn how to implement broadcasting, matrix multiplication, and shape manipulation. We will then restructure our code into a modular framework called NanoTorch. By the end of this part, you will implement essential framework components like Datasets, DataLoaders, and Optimizers to train models on the real-world MNIST dataset.
The final part of the course focuses on implementing core neural network architectures. We will deep-dive into Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). You will see how to implement the im2col algorithm for efficient convolution and handle sequential data for time-series tasks. Ultimately, we will write fully functional CNN and RNN architectures from the ground up, ensuring an in-depth understanding of these powerful models.
In this course you will learn:
How to write a deep learning framework using pure Python and NumPy code.
How to build a functional Autograd Engine from scratch.
How to implement core classes like Variable, Function, and Module.
How to build a tensor engine that supports broadcasting and matrix operations.
How to implement activation functions like ReLU, Sigmoid, and Softmax.
How to build a Data Pipeline including Dataset and DataLoader for mini-batch training.
How to implement Optimizers like Stochastic Gradient Descent (SGD).
How to train and evaluate models on the MNIST dataset.
How the im2col algorithm works for convolutions.
How to implement Convolutional Neural Networks (CNN) from the ground up.
How to implement Recurrent Neural Networks (RNN) from the ground up.
How to develop Sequential model support for Recurrent Neural Networks (RNN).
At the end of the course, you should be able to develop your own deep learning framework and understand the low-level mechanics of deep learning structures.