Deep Learning: Advanced Natural Language Processing and RNNs

Name: Deep Learning: Advanced Natural Language Processing and RNNs
Rating: 4.6 (7686 reviews)

Natural Language Processing (NLP) with Sequence-to-sequence (seq2seq), Attention, CNNs, RNNs, and Memory Networks!

Created by Lazy Programmer Inc., Lazy Programmer Team

Last updated 7/2026

English

English [Auto], Italian [Auto]

What you'll learn

Build a text classification system (can be used for spam detection, sentiment analysis, and similar problems)
Build a neural machine translation system (can also be used for chatbots and question answering)
Build a sequence-to-sequence (seq2seq) model
Build an attention model
Build a memory network (for question answering based on stories)
Understand important foundations for OpenAI ChatGPT, GPT-4, DALL-E, Midjourney, and Stable Diffusion

Course content

13 sections • 67 lectures • 8h 21m total length

Introduction (2:51)
Explore how word embeddings and RNNs power practical deep NLP systems, with bi-directional RNNs, Seq2Seq, attention, and memory networks for text classification, translation, and question answering.
Outline (4:09)
Outline covers bidirectional RNNs, seq2seq with attention, memory networks, and reading comprehension via question answering. Learn the Keras workflow—from data loading to building, training, and evaluating models.
Where to get the code (4:45)
Discover where to obtain the course code via git clone from the official GitHub page, avoid forking, and follow the theory-to-code workflow across data sets.
How to Succeed in this Course (3:04)
Learn how to succeed in this deep learning course by using the Q&A, meeting prerequisites, and engaging with conceptual and coding lectures through notes and hands-on practice.

Review Section Introduction (4:24)
Review foundational deep learning and natural language processing concepts through text-based NLP problems, including word embeddings, CNNs for text, and RNNs using Keras.
How to Open Files for Windows Users (2:18)
Learn to open plain text files on Windows using utf-8 encoding with the open function, and avoid common pitfalls from compressed files and GitHub downloads by cloning repositories.
What is a word embedding? (15:10)
Discover how word embeddings convert words into numerical feature vectors, forming a vocabulary size by embedding dimension matrix that maps vocabulary indices to dense representations, enabling efficient neural network input.
Using word embeddings (4:33)
Explore how pre-trained word embeddings like Word2Vec and GloVe enable transfer learning in NLP by initializing neural networks with external embeddings, handling out-of-vocabulary words, and testing fine-tuning in Keras.
What is a CNN? (13:36)
Explore how 1-D convolution on word embeddings enables NLP in a simple, trainable architecture built in Keras, covering cross-correlation, pooling, and a sequence-modeling convolutional neural network.
Where to get the data (5:06)
Download the toxic comments dataset from Kaggle and train a single neural network with six binary outputs, using sigmoid activations for toxic, severe_toxic, obscene, threat, insult, and identity_hate.
CNN Code (part 1) (15:08)
Explore a CNN for text classification in Python using Keras, covering tokenization, padding, and embedding with pre-trained GloVe vectors. Preprocess data with word2vec mappings and vocabulary handling.
CNN Code (part 2) (6:14)
Train a CNN for text classification with pre-trained word embeddings, three convolution layers, three max pools, a global max pool, and six binary sigmoid outputs.
What is an RNN? (13:11)
Explore how a simple recurrent unit adds memory to neural networks, using shared weights (Elman unit) to process sequences and compare with fixed-size feedforward models in Keras and TensorFlow.
GRUs and LSTMs (10:47)
Examine GRUs and LSTMs, their gates and two states, and how they address vanishing gradients to model long-term dependencies in language. Learn practical Keras usage for return_state and return_sequences.
Different Types of RNN Tasks (12:27)
Discover how RNN task types depend on input and output shapes, from one-to-one and many-to-one to many-to-many and one-to-many, using Tx by D, Ty by K.
A Simple RNN Experiment (6:29)
Explore how return_sequences and return_state shape LSTM and GRU outputs, including hidden and cell states, and verify results with a simple RNN test.
RNN Code (3:25)
Build an LSTM-based RNN for toxic comment classification using embeddings, max sequence length, and pooling, with sigmoid output and higher AUC and accuracy despite slower training.
Review Section Summary (4:49)
Review the core concepts of word embeddings, CNNs, and RNNs, including GRU and LSTM, and apply them to toxic comment classification using pre-trained embeddings like Word2Vec, GloVe, or FastText.
Suggestion Box (3:10)
The lecture invites learners to share feedback via a simple suggestion box, guiding improvements for the deep learning NLP and RNNs course through targeted questions at lazyprogrammer.me/suggestions.
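
The review lectures above describe an embedding matrix of size vocabulary by embedding dimension, a simple recurrent unit with shared weights, and the difference between returning the full sequence of hidden states and only the last one. The following is a minimal NumPy sketch of those ideas, not the course's Keras code. All sizes, the word indices, and the weight names (Wxh, Whh, bh) are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

V, D, M = 10, 4, 3   # assumed vocabulary size, embedding dimension, hidden units
T = 5                # assumed sequence length

# Embedding matrix: one D-dimensional row per vocabulary index.
embedding = rng.normal(size=(V, D))

# A sentence already converted to word indices (made-up example data).
word_indices = np.array([2, 7, 7, 1, 9])

# Looking up embeddings is plain row indexing: (T,) -> (T, D).
x = embedding[word_indices]

# Simple (Elman) recurrent unit: the same weights are reused at every step.
Wxh = rng.normal(size=(D, M))
Whh = rng.normal(size=(M, M))
bh = np.zeros(M)

h = np.zeros(M)
all_states = []
for t in range(T):
    h = np.tanh(x[t] @ Wxh + h @ Whh + bh)
    all_states.append(h)

# Analogous to return_sequences=True: one hidden state per time step, (T, M).
sequence_output = np.stack(all_states)
# Analogous to return_sequences=False: only the last hidden state, (M,).
final_output = sequence_output[-1]

print(x.shape, sequence_output.shape, final_output.shape)
```

An LSTM additionally carries a cell state, which is what the return_state option exposes in Keras; the sketch leaves that out to stay short.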

Bidirectional RNNs Motivation (8:31)
Explore bidirectional RNNs by concatenating forward and backward hidden states to form the output, improving NLP sequence labeling, with guidance on Keras wrappers and debugging return options.
Bidirectional RNN Experiment (5:09)
Explore the Keras bidirectional LSTM API by running experiments in bilstm_test.py with return_sequences true and false, inspecting outputs, and comparing forward and backward hidden and cell states.
Bidirectional RNN Code (2:33)
Explore the code for a bidirectional LSTM on the toxic comments dataset, compare with a unidirectional LSTM, and note CPU vs GPU performance and library differences (TensorFlow, PyTorch, Keras).
Image Classification with Bidirectional RNNs (6:12)
Explore how bidirectional RNNs classify images by treating an image as a sequence of pixels, using vertical and horizontal scans, global max pooling, and concatenation for softmax classification.
Image Classification Code (5:45)
Implements a dual bidirectional LSTM for image classification on MNIST, using bilstm_mnist.py, with concatenation, permute_dimensions, a lambda layer, and sparse categorical cross-entropy.
Bidirectional RNNs Section Summary (2:36)
Explore bidirectional RNNs in Keras, showing outputs as the last forward hidden state concatenated with the first backward hidden state, and demonstrate an image classification application.
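
The concatenation described in this section can be sketched in NumPy as follows. This is an illustration under assumptions, not the Keras Bidirectional wrapper: it uses a plain tanh RNN instead of an LSTM, and the helper run_rnn and all sizes are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
T, D, M = 6, 4, 3   # assumed sequence length, input size, hidden units


def run_rnn(x, Wxh, Whh):
    """Run a simple tanh RNN over x of shape (T, D); return all states, (T, M)."""
    h = np.zeros(Whh.shape[0])
    states = []
    for x_t in x:
        h = np.tanh(x_t @ Wxh + h @ Whh)
        states.append(h)
    return np.stack(states)


x = rng.normal(size=(T, D))

# Each direction has its own weights.
forward_weights = (rng.normal(size=(D, M)), rng.normal(size=(M, M)))
backward_weights = (rng.normal(size=(D, M)), rng.normal(size=(M, M)))

# Forward direction reads x[0] ... x[T-1].
h_forward = run_rnn(x, *forward_weights)
# Backward direction reads x[T-1] ... x[0]; reverse again to align with input order.
h_backward = run_rnn(x[::-1], *backward_weights)[::-1]

# Per-time-step output: concatenate both directions, giving (T, 2M).
sequence_output = np.concatenate([h_forward, h_backward], axis=1)

# Single-vector output: the final state of each direction. After re-alignment,
# the backward direction's final state sits at input position 0.
final_output = np.concatenate([h_forward[-1], h_backward[0]])

print(sequence_output.shape, final_output.shape)
```

The last line of the computation is why the single-vector output is not simply the last row of the per-time-step output: the backward half comes from the opposite end of the sequence.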

Seq2Seq Theory (7:29)
Explore sequence-to-sequence models, a dual-RNN encoder-decoder that compresses input into a thought vector and unfolds it into a variable-length output. Apply to translation and other seq-to-seq tasks.
Seq2Seq Applications (3:27)
Explore Seq2Seq applications like machine translation, question answering, and chatbots, learn how input and output word sequences form a thought vector for decoding, and understand why conversation requires memory beyond simple Seq2Seq.
Decoding in Detail and Teacher Forcing (6:47)
Explore decoding in detail within seq2seq architectures, and implement teacher forcing to train decoders with true previous words, addressing constant-size inputs in Keras and dual training and sampling models.
Poetry Revisited (3:28)
Revisit poetry generation in Keras to learn language modeling with RNNs, focusing on next-word prediction, SOS and EOS tokens, and the seq2seq decoder–encoder relationship via teacher forcing.
Poetry Revisited Code 1 (8:29)
Train a language model for poetry generation with poetry.py, using start and end tokens, tokenizer, padding, and an LSTM-based seq2seq-like model, and compare greedy translation to sampling from posterior distributions.
Poetry Revisited Code 2 (6:58)
Explore poetry generation with an end-to-end sampling approach in advanced NLP deep learning, reusing trained LSTM layers to predict next words, manage hidden states, and print four-line verses.
Seq2Seq in Code 1 (7:55)
Learn neural machine translation with a seq2seq model using an encoder LSTM and a decoder, handling two vocabularies, tokenizers, and SOS/EOS tokens, trained via teacher forcing.
Seq2Seq in Code 2 (5:14)
Implement a sampling seq2seq for neural machine translation by wiring the encoder and decoder LSTMs, defining initial states, and performing greedy one-word decoding with SOS and EOS tokens.
Seq2Seq Section Summary (3:04)
Explore Seq2Seq, the encoder-decoder approach for translation and question answering, and how decoding with teacher forcing, maxpooling, and bidirectional RNNs overcome fixed-vector limits.
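
Teacher forcing and greedy decoding, as described in this section, differ mainly in what the decoder is fed at each step. The sketch below shows that difference in plain Python. The token strings, the helper names, and the lookup table standing in for a trained decoder are all hypothetical; a real decoder would also carry its hidden state from step to step.

```python
SOS, EOS = "<sos>", "<eos>"   # assumed start and end-of-sentence tokens


def make_decoder_pair(target_words):
    """Training data for teacher forcing: input is the target shifted by one."""
    decoder_input = [SOS] + target_words     # what the decoder is fed
    decoder_target = target_words + [EOS]    # what it should predict
    return decoder_input, decoder_target


def greedy_decode(step_fn, max_len=10):
    """Inference loop: feed the model's own previous prediction back in."""
    word, output = SOS, []
    for _ in range(max_len):
        word = step_fn(word)
        if word == EOS:
            break
        output.append(word)
    return output


decoder_input, decoder_target = make_decoder_pair(["je", "suis", "ici"])

# Toy stand-in for a trained decoder: maps the previous word to the next word.
toy_next_word = {SOS: "je", "je": "suis", "suis": "ici", "ici": EOS}
translation = greedy_decode(toy_next_word.get)

print(decoder_input)
print(decoder_target)
print(translation)
```

During training the true previous word is always available, so the whole target sequence can be processed in one pass. At inference time it is not, which is why the loop generates one word at a time.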

Attention Section Introduction (2:28)
Introduce attention in recurrent neural networks, comparing final-output predictions with mid-sequence hidden-state weighting using softmax. Explain hardmax versus softmax and why attention matters.
Attention Theory (18:04)
Explore how attention works in seq2seq models by computing a context vector from encoder hidden states via attention weights, guiding the decoder LSTM to translate with bidirectional encoders.
Teacher Forcing (2:09)
Explains reconciling teacher forcing with attention by concatenating the previous word and the context vector at the decoder input, for training and inference.
Helpful Implementation Details (11:21)
Implement attention in Keras from scratch, outlining encoder-decoder architecture, context vectors, alphas, and softmax over time, while emphasizing shapes and creating robust, version-safe code.
Attention Code 1 (9:48)
Explore attention in neural machine translation with code, implementing a bi-directional encoder and a decoder using teacher forcing, attention context vectors, and a two-layer network with softmax over time.
Attention Code 2 (3:50)
Finish the attention script with a one-step encoder–decoder model that uses an attention context vector and embedding to predict word probabilities. Initialize s and c to zeros, and avoid loops.
Visualizing Attention (2:26)
Explore how attention weights reveal which input parts influence each output step by visualizing alpha as a t by t' matrix, and interpret near linear patterns in translations.
Building a Chatbot without any more Code (10:31)
Build a chatbot without writing code by converting Twitter conversation data into the right format and training a model, highlighting data-driven input and output patterns and ESL dialogue.
Attention Section Summary (3:33)
Learn how attention augments seq2seq by weighting encoder states with a neural network, enabling end-to-end differentiable training for long sequences.
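
One decoder step of the attention computation described in this section can be sketched in NumPy: a small two-layer network scores each encoder hidden state against the previous decoder state, a softmax over time turns the scores into weights (the alphas), and the context vector is the weighted sum of encoder states. The sizes, random weights, and variable names below are assumptions for illustration, not the course's Keras implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed sizes: input length, encoder state size, decoder state size,
# and hidden units of the scoring network.
Tx, H, S, A = 5, 4, 3, 8


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()


h = rng.normal(size=(Tx, H))   # encoder hidden states, one per input step
s_prev = rng.normal(size=S)    # previous decoder state

# Two-layer scoring network (untrained, random weights).
W1 = rng.normal(size=(S + H, A))
b1 = np.zeros(A)
W2 = rng.normal(size=(A, 1))

# Pair the decoder state with every encoder state: (Tx, S + H).
pairs = np.concatenate([np.tile(s_prev, (Tx, 1)), h], axis=1)

# One scalar score per input step: (Tx,).
scores = (np.tanh(pairs @ W1 + b1) @ W2).ravel()

# Softmax over time gives the attention weights.
alphas = softmax(scores)

# Context vector: attention-weighted sum of encoder states, (H,).
context = alphas @ h

print(alphas.round(3), context.shape)
```

Collecting the alphas from every decoder step into a matrix gives the kind of picture used for visualizing attention.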

Memory Networks Section Introduction (9:19)
Introduce memory networks for story-based question answering, using story, question, and answer inputs; link to bAbI data, attention mechanisms, and word embeddings.
Memory Networks Theory (8:55)
Explore memory networks for reading comprehension, building sentence embeddings by summing word vectors and scoring attention-based relevance with dot products and softmax, including one- and two-hop models for supporting facts.
Memory Networks Code 1 (7:55)
Explore memory networks in code by loading triplets of stories, questions, and answers, training single and two supporting facts models, and visualizing attention weights across story lines.
Memory Networks Code 2 (5:05)
Build and interpret a memory network for a single supporting fact by embedding sentences, computing story weights with dot products and softmax, and visualizing results with a debug model.
Memory Networks Code 3 (5:41)
Extend memory networks to two supporting facts stories using embed and sum for story and question representations, with two hops and a dense ELU-activated layer.
Memory Networks Section Summary (3:50)
Explore memory networks that retain past information to answer questions about stories from the bAbI dataset. Use attention with softmax and vector representations, with optional RNN-based hops, to derive answers.
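
A single hop of the memory network described in this section can be sketched in NumPy: sentence embeddings are sums of word vectors, relevance is a dot product with the question embedding, and a softmax turns the scores into weights over story lines. Everything concrete here is an assumption for illustration: the word indices, the sizes, the random untrained weights, and the choice to reuse the same sentence embeddings for the weighted sum rather than a separate output embedding.

```python
import numpy as np

rng = np.random.default_rng(3)
V, D = 12, 5   # assumed vocabulary size and embedding dimension


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()


# A made-up story of three sentences and one question, as word indices.
# Index 0 is padding.
story = np.array([[1, 2, 3, 0],
                  [4, 5, 6, 0],
                  [1, 7, 8, 9]])
question = np.array([10, 11, 4, 0])

# Separate embeddings for story words and question words (random, untrained).
E_story = rng.normal(size=(V, D))
E_question = rng.normal(size=(V, D))
E_story[0] = 0.0       # padding contributes nothing to the sums
E_question[0] = 0.0

# Embed and sum: one vector per sentence, one vector for the question.
sentence_vecs = E_story[story].sum(axis=1)         # (3, D)
question_vec = E_question[question].sum(axis=0)    # (D,)

# Relevance of each story line to the question, then softmax over lines.
weights = softmax(sentence_vecs @ question_vec)    # (3,)

# Weighted sum of sentence vectors: the retrieved memory, (D,).
memory = weights @ sentence_vecs

# A final dense softmax layer picks an answer word from the vocabulary.
W_out = rng.normal(size=(D, V))
answer_probs = softmax(memory @ W_out)             # (V,)

print(weights.round(3), answer_probs.argmax())
```

With trained weights, the largest entry in weights would point at the supporting fact, which is what the attention visualizations across story lines show. A two-hop model repeats the score, softmax, and weighted-sum step, using the first hop's result to form the second query.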

(Review) Keras Discussion (6:48)
Explore how Keras, a high-level library built on Theano, TensorFlow, and CNTK, enables quick convolutional and dense neural network construction with activation layers, while noting its limits for deep understanding.
(Review) Keras Neural Network in Code (6:37)
Learn to build a Keras sequential neural network with two hidden layers (500 and 300), using dense and activation, and train with the fit function and multi-class cross entropy loss.
(Review) Keras Functional API (4:26)
Learn the Keras functional API for building neural networks with model, input, and dense layers, offering a compact alternative to sequential. Build a model by connecting input to output.
(Review) How to easily convert Keras into Tensorflow 2.0 code (1:49)
Learn to convert Keras code to TensorFlow 2.0 using the built-in Keras API, with minimal import changes and a CNN text classification example.

Pre-Installation Check (4:12)
Understand installation lectures as guidelines and focus on principles over syntax when installing Python and deep learning libraries like CNTK, Theano, and OpenAI Gym.
Anaconda Environment Setup (20:20)
Discover how to install data science libraries on Windows with Anaconda, including NumPy, SciPy, matplotlib, pandas, NLTK, scikit-learn, and popular deep learning tools like TensorFlow, PyTorch, and OpenAI Gym.
How to install Numpy, Theano, Tensorflow, etc... (17:30)
Master how to set up a deep learning development environment across Windows, Linux, and Mac, including virtual machines, and installing numpy, scipy, matplotlib, ipython, pandas, Theano, and TensorFlow.

Requirements

Understand what deep learning is for and how it is used
Decent Python coding skills, especially tools for data science (Numpy, Matplotlib)
Preferable to have experience with RNNs, LSTMs, and GRUs
Preferable to have experience with Keras
Preferable to understand word embeddings

Description

Ever wondered how AI technologies like OpenAI ChatGPT, GPT-4, DALL-E, Midjourney, and Stable Diffusion really work? In this course, you will learn the foundations of these groundbreaking applications.

It’s hard to believe it's been over a year since I released my first course on Deep Learning with NLP (natural language processing).

A lot of cool stuff has happened since then, and I've been deep in the trenches learning, researching, and accumulating the best and most useful ideas to bring them back to you.

So what is this course all about, and how have things changed since then?

In previous courses, you learned about some of the fundamental building blocks of Deep NLP. We looked at RNNs (recurrent neural networks), CNNs (convolutional neural networks), and word embedding algorithms such as word2vec and GloVe.

This course takes you to a higher systems level of thinking.

Since you know how these things work, it’s time to build systems using these components.

At the end of this course, you'll be able to build applications for problems like:

text classification (examples are sentiment analysis and spam detection)
neural machine translation
question answering

We'll take a brief look at chatbots, and as you’ll learn in this course, this problem is actually no different from machine translation and question answering.

To solve these problems, we’re going to look at some advanced Deep NLP techniques, such as:

bidirectional RNNs
seq2seq (sequence-to-sequence)
attention
memory networks

All of the materials of this course can be downloaded and installed for FREE. We will do most of our work in Python libraries such as Keras, Numpy, Tensorflow, and Matplotlib to make things super easy and focus on the high-level concepts. I am always available to answer your questions and help you along your data science journey.

This course focuses on "how to build and understand", not just "how to use". Anyone can learn to use an API in 15 minutes after reading some documentation. It's not about "remembering facts", it's about "seeing for yourself" via experimentation. It will teach you how to visualize what's happening in the model internally. If you want more than just a superficial look at machine learning models, this course is for you.

See you in class!

"If you can't implement it, you don't understand it"

Or as the great physicist Richard Feynman said: "What I cannot create, I do not understand".
My courses are the ONLY courses where you will learn how to implement machine learning algorithms from scratch
Other courses will teach you how to plug in your data into a library, but do you really need help with 3 lines of code?
After doing the same thing with 10 datasets, you realize you didn't learn 10 things. You learned 1 thing, and just repeated the same 3 lines of code 10 times...

Suggested Prerequisites:

Decent Python coding skills
Understand RNNs, CNNs, and word embeddings
Know how to build, train, and evaluate a neural network in Keras

WHAT ORDER SHOULD I TAKE YOUR COURSES IN?:

Check out the lecture "Machine Learning and AI Prerequisite Roadmap" (available in the FAQ of any of my courses, including the free Numpy course)

UNIQUE FEATURES

Every line of code explained in detail - email me any time if you disagree
No wasted time "typing" on the keyboard like other courses - let's be honest, nobody can really write code worth learning about in just 20 minutes from scratch
Not afraid of university-level math - get important details about algorithms that other courses leave out

Who this course is for:

Students in machine learning, deep learning, artificial intelligence, and data science
Professionals in machine learning, deep learning, artificial intelligence, and data science
Anyone interested in state-of-the-art natural language processing

Course content

Welcome: 4 lectures • 15min

Recurrent Neural Networks, Convolutional Neural Networks, and Word Embeddings: 15 lectures • 2hr 1min

Bidirectional RNNs: 6 lectures • 31min

Sequence-to-sequence models (Seq2Seq): 9 lectures • 53min

Attention: 9 lectures • 1hr 4min

Memory Networks: 6 lectures • 41min

Keras and Tensorflow 2 Basics: 4 lectures • 20min

Course Conclusion: 1 lecture • 4min

Appendix / FAQ Intro: 1 lecture • 4min

Setting Up Your Environment (FAQ by Student Request): 3 lectures • 42min
