
See how machine learning models are functions that map inputs to outputs, shaped by their architecture and parameters, and optimized with gradient descent to improve accuracy.
Explore how loss functions like mean squared error and cross entropy score model predictions, and how their gradients drive parameter updates via gradient descent to reduce error across epochs.
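As a rough illustration of the two losses named above, the sketch below scores a few made-up predictions in plain Python. The function names and sample values are assumptions chosen for illustration, not material taken from the lecture.

```python
import math

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def cross_entropy(true_class, probs):
    """Negative log of the probability the model assigned to the correct class."""
    return -math.log(probs[true_class])

# Hypothetical regression targets and predictions.
mse = mean_squared_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])

# Hypothetical 3-class prediction where class 0 is the correct label.
ce = cross_entropy(0, [0.7, 0.2, 0.1])

print(f"MSE: {mse:.4f}")
print(f"Cross entropy: {ce:.4f}")
```

Both numbers shrink as predictions improve, which is what lets gradient descent use them as a training signal.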
Explore derivatives by applying the power rule to x squared and the chain rule to composite functions, with a note on how both feed into backpropagation and the loss function covered later.
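One way to sanity-check the two rules is to compare the by-hand answer against a numerical slope. In this sketch the composite function (3x + 1)^2 and the helper `numerical_derivative` are assumptions picked for illustration; only x squared comes from the lecture summary.

```python
def numerical_derivative(f, x, h=1e-6):
    """Central-difference estimate of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

def square(x):
    return x ** 2

def composite(x):
    # Outer function u^2 applied to inner function u = 3x + 1 (hypothetical example).
    return (3 * x + 1) ** 2

x = 2.0

# Power rule: d/dx x^2 = 2x.
power_rule = 2 * x

# Chain rule: d/dx (3x + 1)^2 = 2 * (3x + 1) * 3.
chain_rule = 2 * (3 * x + 1) * 3

print(power_rule, numerical_derivative(square, x))
print(chain_rule, numerical_derivative(composite, x))
```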
Explore how to compute the derivative of a mean squared error loss with respect to the model weight W using the chain rule, backpropagation, and gradient updates.
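A sketch of that derivation, assuming the simplest possible model: a single weight (written here as lowercase w) with prediction \hat{y}_i = w x_i over N samples, and a learning rate \eta. The model form and symbols are assumptions for illustration; the lecture may use a different setup.

```latex
L(w) = \frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2,
\qquad \hat{y}_i = w x_i

\frac{\partial L}{\partial w}
  = \frac{1}{N} \sum_{i=1}^{N}
    \frac{\partial (\hat{y}_i - y_i)^2}{\partial \hat{y}_i}
    \cdot \frac{\partial \hat{y}_i}{\partial w}
  = \frac{1}{N} \sum_{i=1}^{N} 2(\hat{y}_i - y_i) \cdot x_i
  = \frac{2}{N} \sum_{i=1}^{N} (w x_i - y_i)\, x_i

w \leftarrow w - \eta \, \frac{\partial L}{\partial w}
```

The first factor in the chain is how the loss reacts to the prediction, and the second is how the prediction reacts to the weight.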
Explore gradient descent, an algorithm that updates model parameters using the derivative of the loss: each step moves against the gradient, scaled by a learning rate, toward lower loss, even across billions of parameters.
Demonstrates updating a single parameter w with gradient descent to fit synthetic data, using a ground-truth w of 2, a mean squared error loss, and a learning rate to converge.
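A minimal sketch of that experiment, assuming a model of the form y = w * x with noise-free synthetic data. The ground-truth value of 2 and the mean squared error loss come from the description; the data range, starting value, learning rate, and step count are assumptions.

```python
import random

random.seed(0)

TRUE_W = 2.0  # ground-truth parameter from the lecture description

# Synthetic data generated from y = TRUE_W * x (noise-free, by assumption).
xs = [random.uniform(-1.0, 1.0) for _ in range(100)]
ys = [TRUE_W * x for x in xs]

w = 0.0              # assumed starting guess
learning_rate = 0.1  # assumed step size

for step in range(500):
    # Gradient of the mean squared error with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    # Step against the gradient, scaled by the learning rate.
    w -= learning_rate * grad

print(f"learned w: {w:.6f}")
```

A larger learning rate would take bigger steps and could overshoot, while a much smaller one would need more iterations to get close to 2.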
Explore how partial derivatives enable computing the gradient of the loss with respect to every trainable parameter in large models, and prepare to compute them by hand on simple equations.
Learn to compute partial derivatives by hand for two-variable functions by holding one variable constant, then apply the product rule to see how changes in x or y affect the output and how this connects to gradient descent.
Learn how the Jacobian extends partial derivatives to vector-valued functions, and compute it from F1 = X^2 + Y and F2 = XY, highlighting its relevance to backpropagation and machine learning.
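For the two functions named above, the by-hand Jacobian can be checked against a numerical estimate. This is a sketch: the helper names and the evaluation point (3, 2) are assumptions for illustration.

```python
def F(x, y):
    """Vector-valued function with outputs F1 = x^2 + y and F2 = x * y."""
    return [x ** 2 + y, x * y]

def jacobian_by_hand(x, y):
    """Rows are outputs (F1, F2); columns are inputs (x, y)."""
    return [
        [2 * x, 1.0],  # dF1/dx, dF1/dy
        [y, x],        # dF2/dx, dF2/dy
    ]

def jacobian_numeric(x, y, h=1e-6):
    """Central-difference estimate of the same 2x2 matrix."""
    fx_plus, fx_minus = F(x + h, y), F(x - h, y)
    fy_plus, fy_minus = F(x, y + h), F(x, y - h)
    return [
        [(fx_plus[i] - fx_minus[i]) / (2 * h),
         (fy_plus[i] - fy_minus[i]) / (2 * h)]
        for i in range(2)
    ]

J = jacobian_by_hand(3.0, 2.0)
J_check = jacobian_numeric(3.0, 2.0)

print(J)
print(J_check)
```

Each row collects the partial derivatives of one output, so the matrix describes how every output responds to every input at once.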
Compute gradients by hand for a two-layer neural network with ReLU activations, applying the chain rule to derive the loss derivatives with respect to y hat, l2, w2, and b2.
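The chain of derivatives listed above can be traced on a scalar toy version of such a network. Everything specific here is an assumption for illustration: one unit per layer, a ReLU on the output, a squared-error loss on a single example, and the particular weights and inputs. The lecture's network may be laid out differently.

```python
def relu(z):
    return max(0.0, z)

def forward(x, y, w1, b1, w2, b2):
    """Scalar two-layer network: x -> l1 -> ReLU -> l2 -> ReLU -> y_hat."""
    l1 = w1 * x + b1
    a1 = relu(l1)
    l2 = w2 * a1 + b2
    y_hat = relu(l2)
    loss = (y_hat - y) ** 2
    return a1, l2, y_hat, loss

# Hypothetical input, target, and parameters.
x, y = 1.5, 2.0
w1, b1, w2, b2 = 0.8, 0.1, 1.2, 0.3

a1, l2, y_hat, loss = forward(x, y, w1, b1, w2, b2)

# Chain rule, one link at a time, working backward from the loss.
dloss_dyhat = 2 * (y_hat - y)
dloss_dl2 = dloss_dyhat * (1.0 if l2 > 0 else 0.0)  # ReLU slope is 1 when l2 > 0
dloss_dw2 = dloss_dl2 * a1                          # l2 = w2 * a1 + b2
dloss_db2 = dloss_dl2                               # d(l2)/d(b2) = 1

# Finite-difference check on w2 and b2.
h = 1e-6
num_dw2 = (forward(x, y, w1, b1, w2 + h, b2)[3]
           - forward(x, y, w1, b1, w2 - h, b2)[3]) / (2 * h)
num_db2 = (forward(x, y, w1, b1, w2, b2 + h)[3]
           - forward(x, y, w1, b1, w2, b2 - h)[3]) / (2 * h)

print(dloss_dw2, num_dw2)
print(dloss_db2, num_db2)
```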
Train a neural network from scratch with NumPy on the MNIST handwritten digits, covering forward and backward passes, ReLU activation, and a training loop using mean squared error.
Learn how automatic differentiation enables scalable backpropagation by breaking networks into primitive operations and using a computation graph to compute derivatives in reverse mode.
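To make the idea concrete, here is a toy reverse-mode engine for scalars. It is a sketch under stated assumptions: the `Value` class and its methods are hypothetical names invented for this example, and it is not how any particular framework is implemented.

```python
class Value:
    """A scalar that remembers which operation and inputs produced it."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents
        self.backward_fn = None  # pushes this node's grad to its parents

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def backward_fn():
            self.grad += out.grad
            other.grad += out.grad
        out.backward_fn = backward_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def backward_fn():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out.backward_fn = backward_fn
        return out

    def relu(self):
        out = Value(max(0.0, self.data), (self,))
        def backward_fn():
            self.grad += (1.0 if self.data > 0 else 0.0) * out.grad
        out.backward_fn = backward_fn
        return out

    def backward(self):
        # Topologically sort the graph so each node runs after its consumers.
        order, seen = [], set()
        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node.parents:
                    visit(parent)
                order.append(node)
        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            if node.backward_fn is not None:
                node.backward_fn()

# Forward pass builds the graph: out = relu(w * x + b).
w, x, b = Value(3.0), Value(2.0), Value(-1.0)
out = (w * x + b).relu()

# Reverse pass walks the graph once, from the output back to the inputs.
out.backward()
print(out.data, w.grad, x.grad, b.grad)
```

Each primitive only needs to know its own local derivative; the graph traversal composes them through the chain rule.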
Breaks a network into primitive operations (matrix multiplication, addition, ReLU) and demonstrates forward and backward passes with a computation graph to compute the gradients of the mean squared error loss.
Learn how autodiff in PyTorch computes gradients, inspect the computation graph, and train a simple linear model with synthetic data to recover parameters w and b.
Stay up to date with weekly AI research via the TensorTeach AI newsletter, delivered every Friday, curating papers on multimodal LLMs, text LLMs, embodied agents, robotics, and quantization.
Calculus is the foundation of modern machine learning and AI—but most courses either stay too theoretical or skip the math entirely. This course bridges that gap.
Calculus for Data Science & AI is designed to help you truly understand how machine learning models learn, using calculus as a practical tool—not just abstract theory.
Instead of memorizing formulas, you’ll learn how calculus directly powers core concepts like loss functions, gradient descent, and neural networks.
We start by reframing machine learning models as mathematical functions and show how learning is simply the process of minimizing error. From there, you’ll build a strong intuition for derivatives, slopes, and sensitivity—then apply them step-by-step to real models.
As the course progresses, you’ll move into multivariable calculus, gradients, and Jacobians—key tools for understanding how modern AI systems operate under the hood.
You’ll then connect theory to practice by:
Deriving backpropagation by hand
Training a neural network from scratch using NumPy
Understanding how gradients flow through deep networks
Finally, you’ll explore automatic differentiation, the engine behind modern ML frameworks, and see how tools like PyTorch handle gradient computation at scale.
By the end of this course, you won’t just use machine learning—you’ll understand how it works at a fundamental level.
This course is ideal for intermediate learners who want to go beyond high-level intuition and gain a deeper, more technical understanding of AI systems.