
Introduces recurrent neural networks in Python, covering RNN units like GRU and LSTM, building models with TensorFlow, and applying them to time series forecasting, text classification, and image recognition.
Engage in this hands-on course by typing code and implementing concepts in real projects, building muscle memory and producing useful outputs.
Discover where to get the course code: Colab notebooks or plain text Python on GitHub. Access the resources through the code link, after logging in.
Discover three guidelines to succeed: ask questions in the Q&A, meet the prerequisites, and engage with both conceptual and coding lectures by taking handwritten notes and coding what you see.
Explore Google Colab as a hosted, Jupyter-like environment for deep learning in Python. Access free GPUs or TPUs, work in cloud notebooks stored in Google Drive, and use preinstalled libraries.
Learn the basics of NumPy, Pandas, Matplotlib, SciPy, and scikit-learn for deep learning, including tensors, matrix operations, data loading with Pandas, and the model lifecycle from training to evaluation.
Resolve temporary 403 errors by downloading the file in a browser and uploading it to Colab with the file explorer (drag and drop). This works around a host blocking public IPs.
Begin with a practical warmup that reviews core machine learning concepts, including classification, regression, and neurons, and shows how learning is implemented in code for upcoming sections.
Explore how machine learning reduces to a geometry problem, using supervised learning, regression, and classification to fit lines, planes, or curves. See that all data share the same geometry.
Explore the theory of binary classification using logistic regression, sigmoid activation, and linear models, implemented in TensorFlow's Keras dense layer with binary cross-entropy and Adam optimization.
Perform linear classification on the breast cancer dataset to predict malignant or benign tumors using TensorFlow 2 and Keras, with standard scaling, train-test split, and a sigmoid output.
Learn to prepare data, normalize inputs, and implement linear regression in TensorFlow with a dense layer and no activation, using MSE and SGD; apply log-transformed counts to explore Moore's law.
Learn regression in TensorFlow using a log transform to linearize exponential data, train a small tf.keras model with SGD and a learning-rate schedule, and derive the transistor doubling time.
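As a rough illustration of the doubling-time derivation, the sketch below fits a line to log-transformed counts and converts the slope into a doubling time. The data is synthetic (constructed to double every 2 years, not real transistor counts), and a closed-form least-squares fit stands in for the trained tf.keras model.

```python
import math

# Synthetic counts that double every 2 years (illustrative only, not real data).
years = list(range(1971, 1991))
counts = [2300 * 2 ** ((y - 1971) / 2) for y in years]

# The log transform turns exponential growth into a line: ln(C) = a * year + b.
log_counts = [math.log(c) for c in counts]

# Closed-form least-squares slope, standing in for a trained linear model.
mean_x = sum(years) / len(years)
mean_y = sum(log_counts) / len(log_counts)
numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, log_counts))
denominator = sum((x - mean_x) ** 2 for x in years)
a = numerator / denominator

# C doubles when a * dt = ln(2), so dt = ln(2) / a.
doubling_time = math.log(2) / a
print(round(doubling_time, 3))
```

If the inputs are normalized before training, the learned slope has to be rescaled back to the original units before applying ln(2) / a.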
Explore how linear regression and logistic regression underpin neuron-like computation in deep learning, detailing inputs, weights, bias, and the sigmoid activation that yields a binary output.
Explore how a model learns from linear regression to gradient descent, minimizing mean squared error by updating weights and bias via a learning rate.
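The update rule can be sketched in a few lines of plain Python. This is a minimal sketch on made-up data (y = 3x + 1); the learning rate and step count are assumptions chosen for this toy problem.

```python
# Toy data generated from y = 3x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x + 1.0 for x in xs]

w, b = 0.0, 0.0   # initial weight and bias
lr = 0.05         # learning rate (assumed value for this toy problem)
n = len(xs)

for _ in range(2000):
    # Prediction errors under the current parameters.
    errors = [(w * x + b) - y for x, y in zip(xs, ys)]
    # Gradients of mean squared error with respect to w and b.
    grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
    grad_b = (2.0 / n) * sum(errors)
    # Step against the gradient, scaled by the learning rate.
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)
```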
Apply model.predict in TensorFlow 2 via the Keras API to obtain probabilities, then round and flatten outputs to create 1D predictions, and verify accuracy against model.evaluate.
Save and load a TensorFlow model with the Keras API, using model.save and tf.keras.models.load_model. Verify identical performance and note the explicit input layer bug; define input shape.
Share feedback via the suggestion box at lazyprogrammer.me/suggestions to improve this course. Tell me about your background, course, difficulty, and missing topics you'd like covered, such as CNNs or transformers.
Introduce artificial neural networks and the feedforward model as the foundation for modern deep learning. Explore activation functions, multi-class classification, and how neural networks process images and regression tasks.
Explore forward propagation in neural networks, where inputs pass through wide and deep layers of neurons with vectorized weights, biases, and sigmoid activations.
Explore how neural networks create nonlinear decision boundaries beyond a single neuron using layered nonlinear features learned automatically, without manual feature engineering, via sigmoid activation and gradient descent.
Explore activation functions in neural networks, from sigmoid and tanh to ReLU and Leaky ReLU, addressing vanishing gradients, dead neurons, and practical choices for deep learning practice.
Explore how images are represented as height, width, and three color channels in a 3D tensor with 8-bit RGB quantization.
Clarify why mixing primary colors may not yield white and explain how RGB represents colors in images, encouraging hands-on exploration of RGB values.
Prepare code for an MNIST digit classifier by loading data from tf.keras.datasets and building a flatten-then-dense network with dropout and softmax for 10 classes; train with sparse categorical cross-entropy and evaluate.
Demonstrates building a feedforward neural network for MNIST image classification in TensorFlow 2.0, including data normalization, flattening, dense layers, dropout, and evaluation with a confusion matrix.
Explore a neural network for regression using synthetic two-dimensional data, two dense layers with ReLU, and mean squared error optimization, visualizing predictions as a three-dimensional surface.
Explore sequence data, from time series to text, and learn why RNNs excel. Understand data shapes, especially N by T by D, and how padding enables fixed-length sequences.
Forecast time series with a horizon of multiple steps using autoregressive linear regression and iterative predictions, then convert to a non-linear neural network forecaster with an ANN.
Implement an autoregressive linear model for time series in Python, using a 10-step window on a sine wave, with TensorFlow training and one-step vs multi-step forecasting.
Show how a linear model can perfectly forecast a sine wave with an AR(2) recurrence on two past values, deriving W1 = 2 cos(omega) and W2 = -1.
Explore how recurrent neural networks model sequences with a looping hidden state and shared weights, contrasting simple flattening with the RNN structure and Elman units.
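The recurrence follows from the identity sin(a + b) + sin(a - b) = 2 sin(a) cos(b) with a = omega (t - 1) and b = omega. The sketch below checks it numerically by rolling the recurrence forward from two seed values; the value of omega and the series length are arbitrary choices for the demonstration.

```python
import math

omega = 0.1   # angular frequency (arbitrary choice for the demo)
w1 = 2.0 * math.cos(omega)
w2 = -1.0

# Ground-truth sine wave.
true_series = [math.sin(omega * t) for t in range(200)]

# Seed with the first two true values, then feed predictions back in
# (a multi-step forecast using only the model's own outputs).
forecast = true_series[:2]
for t in range(2, 200):
    forecast.append(w1 * forecast[-1] + w2 * forecast[-2])

max_error = max(abs(f - s) for f, s in zip(forecast, true_series))
print(max_error)
```

The error stays at floating-point level because the wave here is noise-free; with noisy data a fitted model would only approximate these weights.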
Implement a simple RNN in TensorFlow 2.0 using the functional API, prepare N by T by D inputs, and train, evaluate, and predict with proper reshaping.
Explore time series prediction with an RNN by forecasting a sine wave, compare it to a linear model, and examine activation effects in a Colab notebook.
Track the shapes of N by T by D inputs, hidden states, and outputs in a hands-on RNN calculation, using a Colab notebook to inspect weights and the final predictions.
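One way to follow the shapes by hand is to write the simple RNN recurrence in NumPy. The sizes and the names Wx, Wh, bh, Wo, and bo below are illustrative assumptions (random weights, not values read from a trained model), and tanh is assumed as the hidden activation.

```python
import numpy as np

N, T, D, M = 2, 10, 3, 5   # samples, time steps, features, hidden units (assumed sizes)
rng = np.random.default_rng(0)

X = rng.standard_normal((N, T, D))   # input batch: N x T x D
Wx = rng.standard_normal((D, M))     # input-to-hidden weights: D x M
Wh = rng.standard_normal((M, M))     # hidden-to-hidden weights: M x M
bh = np.zeros(M)                     # hidden bias: M
Wo = rng.standard_normal((M, 1))     # hidden-to-output weights: M x 1
bo = np.zeros(1)                     # output bias: 1

h = np.zeros((N, M))                 # initial hidden state: N x M
for t in range(T):
    # X[:, t, :] is N x D, so both products are N x M.
    h = np.tanh(X[:, t, :] @ Wx + h @ Wh + bh)

y = h @ Wo + bo                      # prediction from the final hidden state: N x 1
print(h.shape, y.shape)
```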
Explore the LSTM and GRU, compare them to simple RNNs, and learn how update and reset gates manage memory to address vanishing gradients in sequence models.
Learn how RNN units evolve from the simple RNN to the GRU and LSTM, focusing on forget, input, and output gates, the cell and hidden states, and long-term dependency preservation.
Explore a more challenging time series with a synthetic signal, comparing autoregressive linear models, RNNs, and LSTM, and evaluate one-step and multi-step forecasts using a Colab notebook.
Demonstrates that LSTMs capture long-distance dependencies in an XOR-based time series, comparing RNN, GRU, and LSTM, and showing how return_sequences and global max pooling improve long-range pattern learning.
Explore applying RNNs to image classification by treating images as a multi-dimensional time series scanned row by row, using an LSTM with a dense 10-output softmax on MNIST.
Demonstrate image classification with an RNN on MNIST using a three-line TensorFlow model (input, LSTM, dense with softmax), then compile and fit to achieve about 99% accuracy.
Explore predicting stock returns with an LSTM, working from Starbucks data, creating windows, scaling, and producing one-step and multi-step forecasts, while undoing marketing mistakes and highlighting model limitations.
Predict stock returns with an autoregressive LSTM by computing returns from close and prev close, shifting data, normalizing, and training a supervised RNN for one- and multi-step forecasts.
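The data preparation can be sketched as follows, assuming the usual definition of a simple return, (close - prev_close) / prev_close. The prices and the window length T are made up for illustration, and normalization is left out to keep the sketch short.

```python
# Made-up closing prices (illustrative only, not real market data).
closes = [100.0, 102.0, 101.0, 105.0, 104.0, 106.0]

# Shift by one step so each close lines up with the previous close.
prev_closes = closes[:-1]
returns = [(c - p) / p for c, p in zip(closes[1:], prev_closes)]

# Supervised windows: T past returns as input, the next return as target.
T = 3   # window length (assumed)
X = [returns[i:i + T] for i in range(len(returns) - T)]
Y = [returns[i + T] for i in range(len(returns) - T)]

print(returns)
print(X, Y)
```

For an RNN, each window would then be reshaped to T by 1 so the batch has shape N by T by D with D = 1.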
By using all price features and volume, the lecture builds a binary up-or-down classifier and highlights overfitting and the limits of predicting stock returns with LSTMs.
Explore multi-step forecasting in neural networks, comparing one-step, iterative, and multi-output approaches, and learn to benchmark against naive baselines for time series.
Map words to integers and use an embedding layer to produce dense word vectors, replacing large one-hot encodings; train embeddings with model.fit for RNNs.
Tokenize text, build a word-to-index dictionary, and convert sentences into integer sequences for embedding with an RNN, then pad or truncate to a fixed length with pad_sequences.
Learn how to preprocess text with TensorFlow and Keras using the TextVectorization layer, adapt the vocabulary, convert words to integer sequences, and manage truncation and ragged padding.
Build a text classification model for spam detection using an RNN with embedding, LSTM, and global max pooling in TensorFlow Keras, covering vectorization and F1 evaluation.
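The pipeline can be sketched in plain Python to show what each step produces. The sentences are made up, pad_or_truncate is a hypothetical helper written for this sketch (it is not the Keras function), and padding or truncating on the left is a choice made here rather than something the lecture summary specifies.

```python
sentences = [
    "I like eggs and ham",
    "I love chocolate and bunnies",
    "I hate onions",
]
MAX_LEN = 4   # fixed sequence length (assumed)

# Build the word-to-index dictionary; index 0 is reserved for padding.
word2idx = {}
sequences = []
for sentence in sentences:
    seq = []
    for token in sentence.lower().split():
        if token not in word2idx:
            word2idx[token] = len(word2idx) + 1
        seq.append(word2idx[token])
    sequences.append(seq)

def pad_or_truncate(seq, max_len):
    """Hypothetical helper: keep the last max_len tokens, left-pad with zeros."""
    seq = seq[-max_len:]
    return [0] * (max_len - len(seq)) + seq

padded = [pad_or_truncate(s, MAX_LEN) for s in sequences]
print(padded)
```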
Explain mean squared error from a probabilistic perspective and its link to maximum likelihood under a Gaussian model. Relate squared error to linear regression and cross entropy loss.
Explore binary cross-entropy loss for binary classification, derived from the Bernoulli distribution via maximum likelihood and negative log-likelihood, with averaging over data points.
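Written out, the averaged negative log-likelihood of Bernoulli targets looks like the sketch below. The targets and predicted probabilities are made up, and the eps clipping is an assumption added to avoid log(0).

```python
import math

def binary_cross_entropy(targets, probs, eps=1e-12):
    """Mean negative log-likelihood of 0/1 targets under predicted probabilities."""
    total = 0.0
    for y, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)   # clip to keep log() finite
        # Bernoulli likelihood p^y * (1 - p)^(1 - y), negated and logged.
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(targets)           # average over data points

loss = binary_cross_entropy([1, 0, 1], [0.9, 0.2, 0.6])
print(loss)
```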
Explain how categorical cross-entropy arises from the categorical distribution and one-hot targets, and show sparse cross-entropy with NumPy double indexing and TensorFlow 2.0.
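The equivalence between the one-hot form and the sparse form can be checked with NumPy double indexing. The probabilities and labels below are made up for illustration.

```python
import numpy as np

# Predicted class probabilities for 3 samples and 3 classes (made-up values).
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.3, 0.5],
])
targets = np.array([0, 1, 2])   # integer class labels (sparse targets)

# Full categorical cross-entropy with one-hot targets.
one_hot = np.eye(3)[targets]
full_loss = -np.mean(np.sum(one_hot * np.log(probs), axis=1))

# Sparse version: double indexing picks out only the target-class probability,
# skipping the multiplications by zero.
sparse_loss = -np.mean(np.log(probs[np.arange(len(targets)), targets]))

print(full_loss, sparse_loss)
```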
Gradient descent trains models by minimizing the loss with respect to W, using the gradient and small updates, while tuning learning rate eta and epochs through numerical approximations.
Explore stochastic gradient descent (SGD) in TensorFlow 2.0, using mini-batches to estimate gradients. Implement two nested loops over epochs and batches, randomizing data for faster convergence and memory efficiency.
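The two nested loops can be sketched in plain Python rather than TensorFlow, to keep the structure visible. The data (y = 2x - 1), learning rate, batch size, and epoch count are assumptions chosen for this toy problem.

```python
import random

random.seed(0)

# Noise-free toy data from y = 2x - 1 (made up for illustration).
xs = [i / 100 for i in range(100)]
data = [(x, 2.0 * x - 1.0) for x in xs]

w, b = 0.0, 0.0
lr, batch_size, epochs = 0.1, 10, 200   # assumed hyperparameters

for epoch in range(epochs):             # outer loop: passes over the data
    random.shuffle(data)                # randomize the order each epoch
    for start in range(0, len(data), batch_size):   # inner loop: mini-batches
        batch = data[start:start + batch_size]
        errors = [((w * x + b) - y, x) for x, y in batch]
        # Gradient estimated from this mini-batch only.
        grad_w = (2.0 / len(batch)) * sum(e * x for e, x in errors)
        grad_b = (2.0 / len(batch)) * sum(e for e, _ in errors)
        w -= lr * grad_w
        b -= lr * grad_b

print(w, b)
```

Each update touches only batch_size samples, which is where the memory savings come from.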
Learn how momentum speeds up gradient descent by introducing velocity and a momentum term, reducing zigzagging and accelerating convergence on uneven gradient landscapes.
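A small comparison on an elongated quadratic bowl illustrates the effect. The loss function, starting point, learning rate, and momentum value of 0.9 are all assumptions made for this sketch; setting the momentum term to 0 recovers plain gradient descent.

```python
def loss(w):
    # Elongated bowl: shallow along w[0], steep along w[1].
    return 0.5 * (1.0 * w[0] ** 2 + 50.0 * w[1] ** 2)

def grad(w):
    return [1.0 * w[0], 50.0 * w[1]]

def run(mu, lr=0.03, steps=100):
    """Gradient descent with momentum term mu; returns the final loss."""
    w = [10.0, 1.0]
    v = [0.0, 0.0]   # velocity
    for _ in range(steps):
        g = grad(w)
        # The velocity accumulates a decaying sum of past gradients.
        v = [mu * vi - lr * gi for vi, gi in zip(v, g)]
        w = [wi + vi for wi, vi in zip(w, v)]
    return loss(w)

loss_gd = run(mu=0.0)         # plain gradient descent
loss_momentum = run(mu=0.9)   # with momentum
print(loss_gd, loss_momentum)
```

After the same number of steps, the momentum run ends at a lower loss because it makes faster progress along the shallow direction.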
Explore variable and adaptive learning rates in deep learning, including step and exponential decay, AdaGrad and RMSProp, with notes on caches, decay rates, and initialization.
Adam blends momentum and RMSProp to optimize neural networks with adaptive learning rates, using first and second moment estimates of the gradient.
Apply bias correction to the Adam optimizer by using bias-corrected m and v in updates, and review typical defaults for learning rate, beta1, beta2, and epsilon.
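A scalar version of the update, with bias correction, might look like the sketch below. The defaults shown (learning rate 0.001, beta1 0.9, beta2 0.999, epsilon 1e-8) are the values suggested in the original Adam paper; libraries may use slightly different defaults, and adam_minimize is a hypothetical helper written for this sketch.

```python
import math

def adam_minimize(grad_fn, w, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Hypothetical helper: minimize a scalar function with Adam."""
    m = 0.0   # first moment estimate (mean of gradients)
    v = 0.0   # second moment estimate (mean of squared gradients)
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        # Bias correction: m and v start at zero, so early estimates are too small.
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# f(w) = (w - 3)^2, so the gradient is 2 * (w - 3).
w_after_one_step = adam_minimize(lambda w: 2.0 * (w - 3.0), w=5.0, steps=1)
print(w_after_one_step)
```

With bias correction, the first step has magnitude close to the learning rate regardless of the gradient's scale; without it, the first step would be roughly three times larger with these defaults.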
*** NOW IN TENSORFLOW 2 and PYTHON 3 ***
Ever wondered how AI technologies like OpenAI ChatGPT, GPT-4, DALL-E, Midjourney, and Stable Diffusion really work? In this course, you will learn the foundations of these groundbreaking applications.
Learn about one of the most powerful Deep Learning architectures yet!
The Recurrent Neural Network (RNN) has been used to obtain state-of-the-art results in sequence modeling.
This includes time series analysis, forecasting and natural language processing (NLP).
Learn why RNNs beat old-school machine learning algorithms like Hidden Markov Models.
This course will teach you:
The basics of machine learning and neurons (just a review to get you warmed up!)
Neural networks for classification and regression (just a review to get you warmed up!)
How to model sequence data
How to model time series data
How to model text data for NLP (including preprocessing steps for text)
How to build an RNN using TensorFlow 2
How to use a GRU and LSTM in TensorFlow 2
How to do time series forecasting with TensorFlow 2
How to predict stock prices and stock returns with LSTMs in TensorFlow 2 (hint: it's not what you think!)
How to use Embeddings in TensorFlow 2 for NLP
How to build a Text Classification RNN for NLP (examples: spam detection, sentiment analysis, parts-of-speech tagging, named entity recognition)
All of the materials required for this course can be downloaded and installed for FREE. We will do most of our work in NumPy, Matplotlib, and TensorFlow. I am always available to answer your questions and help you along your data science journey.
This course focuses on "how to build and understand", not just "how to use". Anyone can learn to use an API in 15 minutes after reading some documentation. It's not about "remembering facts", it's about "seeing for yourself" via experimentation. It will teach you how to visualize what's happening in the model internally. If you want more than just a superficial look at machine learning models, this course is for you.
See you in class!
"If you can't implement it, you don't understand it"
Or as the great physicist Richard Feynman said: "What I cannot create, I do not understand".
My courses are the ONLY courses where you will learn how to implement machine learning algorithms from scratch.
Other courses will teach you how to plug in your data into a library, but do you really need help with 3 lines of code?
After doing the same thing with 10 datasets, you realize you didn't learn 10 things. You learned 1 thing, and just repeated the same 3 lines of code 10 times...
Suggested Prerequisites:
matrix addition, multiplication
basic probability (conditional and joint distributions)
Python coding: if/else, loops, lists, dicts, sets
NumPy coding: matrix and vector operations, loading a CSV file
WHAT ORDER SHOULD I TAKE YOUR COURSES IN?:
Check out the lecture "Machine Learning and AI Prerequisite Roadmap" (available in the FAQ of any of my courses, including the free NumPy course)
UNIQUE FEATURES
Every line of code explained in detail - email me any time if you disagree
No wasted time "typing" on the keyboard like other courses - let's be honest, nobody can really write code worth learning about in just 20 minutes from scratch
Not afraid of university-level math - get important details about algorithms that other courses leave out