
Explore how word embeddings and RNNs power practical deep NLP systems, with bi-directional RNNs, Seq2Seq, attention, and memory networks for text classification, translation, and question answering.
The outline covers bidirectional RNNs, seq2seq with attention, memory networks, and reading comprehension via question answering. Learn the Keras workflow, from data loading to building, training, and evaluating models.
Discover where to obtain the course code via git clone from the official GitHub page, avoid forking, and follow the theory-to-code workflow across data sets.
Learn how to succeed in this deep learning course by using the Q&A, meeting prerequisites, and engaging with conceptual and coding lectures through notes and hands-on practice.
Review foundational deep learning and natural language processing concepts through text-based NLP problems, including word embeddings, CNNs for text, and RNNs using Keras.
Learn to open plain text files on Windows using utf-8 encoding with the open function, and avoid common pitfalls from compressed files and GitHub downloads by cloning repositories.
Discover how word embeddings convert words into numerical feature vectors, forming a vocabulary size by embedding dimension matrix that maps vocabulary indices to dense representations, enabling efficient neural network input.
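As a rough illustration of the vocabulary-size-by-embedding-dimension matrix described above, here is a minimal plain-Python sketch. The matrix values, the `word2idx` mapping, and the `embed` helper are made up for this example and are not taken from the course code.

```python
# Toy embedding matrix: V = 4 vocabulary entries, D = 3 dimensions.
# The numbers are invented for illustration; real embeddings are learned.
embedding_matrix = [
    [0.0, 0.0, 0.0],  # index 0: reserved for padding
    [0.1, 0.2, 0.3],  # index 1: "the"
    [0.4, 0.5, 0.6],  # index 2: "cat"
    [0.7, 0.8, 0.9],  # index 3: "sat"
]

# Hypothetical word-to-index mapping (a tokenizer would normally build this).
word2idx = {"the": 1, "cat": 2, "sat": 3}


def embed(sentence):
    """Turn a whitespace-tokenized sentence into a T x D list of vectors."""
    indices = [word2idx[word] for word in sentence.split()]
    # Embedding is just row lookup: index i selects row i of the matrix.
    return [embedding_matrix[i] for i in indices]


vectors = embed("the cat sat")
print(len(vectors), "words x", len(vectors[0]), "dimensions")
```

The lookup replaces a one-hot vector multiplied by the matrix, which is why it is cheap as a neural network input.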
Explore how pre-trained word embeddings like Word2Vec and GloVe enable transfer learning in NLP by initializing neural networks with external embeddings, handling out-of-vocabulary words, and testing fine-tuning in Keras.
Explore how 1-D convolution on word embeddings enables NLP in a simple, trainable architecture built in Keras, covering cross-correlation, pooling, and a sequence-modeling convolutional neural network.
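To make the cross-correlation and pooling steps concrete, here is a small plain-Python sketch of a single 1-D filter sliding over a sequence of embedding vectors, followed by a global max pool. The function names and toy numbers are assumptions for illustration; a Keras model would learn the kernel and use many filters.

```python
def conv1d(sequence, kernel):
    """'Valid' cross-correlation of a T x D sequence with a K x D kernel.

    Returns T - K + 1 scalar features, one per window position.
    """
    T, K = len(sequence), len(kernel)
    features = []
    for t in range(T - K + 1):
        total = 0.0
        for k in range(K):
            # Dot product between the k-th kernel row and the word vector at t + k.
            for x, w in zip(sequence[t + k], kernel[k]):
                total += x * w
        features.append(total)
    return features


def global_max_pool(features):
    """Collapse the time axis by keeping only the strongest response."""
    return max(features)


sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]  # T = 4, D = 2
kernel = [[1.0, 0.0], [0.0, 1.0]]                            # K = 2, D = 2

features = conv1d(sequence, kernel)
pooled = global_max_pool(features)
print(features, pooled)
```

Because the pool keeps a single value per filter, the output size no longer depends on the sequence length.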
Download the toxic comments dataset from Kaggle and train a single neural network with six binary outputs, using sigmoid activations for toxic, severe_toxic, obscene, threat, insult, and identity_hate.
Explore a CNN for text classification in Python using Keras, covering tokenization, padding, and embedding with pre-trained GloVe vectors. Preprocess data with word2vec mappings and vocabulary handling.
Train a CNN for text classification with pre-trained word embeddings, three convolution layers, three max pools, a global max pool, and six binary sigmoid outputs.
Explore how a simple recurrent unit adds memory to neural networks, using shared weights (Elman unit) to process sequences and compare with fixed-size feedforward models in Keras and TensorFlow.
Examine GRUs and LSTMs, their gates and two states, and how they address vanishing gradients to model long-term dependencies in language. Learn practical Keras usage for return_state and return_sequences.
Discover how RNN task types depend on input and output shapes, from one-to-one and many-to-one to many-to-many and one-to-many, using Tx by D, Ty by K.
Explore how return_sequences and return_state shape LSTM and GRU outputs, including hidden and cell states, and verify results with a simple RNN test.
Build an LSTM-based RNN for toxic comment classification using embeddings, max sequence length, and pooling, with sigmoid output and higher AUC and accuracy despite slower training.
Review the core concepts of word embeddings, CNNs, and RNNs, including GRU and LSTM, and apply them to toxic comment classification using pre-trained embeddings like Word2Vec, GloVe, or FastText.
The lecture invites learners to share feedback via a simple suggestion box, guiding improvements for the deep learning NLP and RNNs course through targeted questions at lazyprogrammer.me/suggestions.
Explore bidirectional RNNs by concatenating forward and backward hidden states to form the output, improving NLP sequence labeling, with guidance on Keras wrappers and debugging return options.
Explore the Keras bidirectional LSTM API by running experiments in bilstm_test.py with return_sequences set to True and False, inspecting outputs, and comparing forward and backward hidden and cell states.
Explore the code for a bidirectional LSTM on the toxic comments dataset, compare with a unidirectional LSTM, and note CPU vs GPU performance and library differences (TensorFlow, PyTorch, Keras).
Explore how bidirectional RNNs classify images by treating an image as a sequence of pixels, using vertical and horizontal scans, global max pooling, and concatenation for softmax classification.
Implements a dual bidirectional LSTM for image classification on MNIST, using bilstm_mnist.py, with concatenation, permute_dimensions, a lambda layer, and sparse categorical cross-entropy.
Explore bidirectional RNNs in Keras, showing outputs as the last forward hidden state concatenated with the first backward hidden state, and demonstrate an image classification application.
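The concatenation described above can be sketched without Keras. The toy below uses a scalar recurrent unit and, as a simplifying assumption, shares the same weights between the two directions (a real bidirectional layer learns separate weights for each); `run_rnn` and `bidirectional` are hypothetical helper names.

```python
import math


def run_rnn(xs, w=0.5, u=0.5):
    """Toy scalar recurrent unit: h_t = tanh(w * x_t + u * h_{t-1}), with h_0 = 0."""
    h, states = 0.0, []
    for x in xs:
        h = math.tanh(w * x + u * h)
        states.append(h)
    return states


def bidirectional(xs, return_sequences=False):
    forward = run_rnn(xs)
    # Run over the reversed input, then flip back so index t lines up with xs[t].
    backward = run_rnn(xs[::-1])[::-1]
    if return_sequences:
        # One [forward, backward] pair per time step.
        return [[f, b] for f, b in zip(forward, backward)]
    # Last forward state and first backward state: both have seen the whole input.
    return [forward[-1], backward[0]]


xs = [1.0, 2.0, 3.0]
seq_out = bidirectional(xs, return_sequences=True)
final_out = bidirectional(xs)
print(seq_out)
print(final_out)
```

Note that the final output is not simply the last row of the full sequence output: the backward half comes from the first time step.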
Explore sequence-to-sequence models, a dual-RNN encoder-decoder that compresses input into a thought vector and unfolds it into a variable-length output. Apply to translation and other seq-to-seq tasks.
Explore Seq2Seq applications like machine translation, question answering, and chatbots; learn how an input word sequence is encoded into a thought vector for decoding; and understand why conversation requires memory beyond simple Seq2Seq.
Explore decoding in detail within seq2seq architectures, and implement teacher forcing to train decoders with true previous words, addressing constant-size inputs in Keras and dual training and sampling models.
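Teacher forcing comes down to how the decoder's inputs and targets are lined up during training. A minimal sketch, assuming `<sos>` and `<eos>` as the start and end tokens and a hypothetical helper name:

```python
def make_decoder_pairs(target_tokens, sos="<sos>", eos="<eos>"):
    """Build teacher-forcing inputs and targets for one training sentence.

    The decoder input is the true target shifted right by one step, so at
    step t the decoder sees the true word from step t - 1 instead of its
    own (possibly wrong) prediction.
    """
    decoder_input = [sos] + target_tokens
    decoder_target = target_tokens + [eos]
    return decoder_input, decoder_target


dec_in, dec_out = make_decoder_pairs(["je", "suis", "ici"])
for given, expected in zip(dec_in, dec_out):
    print(f"input: {given:6s} -> target: {expected}")
```

Both sequences have the same length, which is what lets the whole target sentence be fed to the decoder in one fixed-size batch during training.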
Revisit poetry generation in Keras to learn language modeling with RNNs, focusing on next-word prediction, SOS and EOS tokens, and the seq2seq decoder–encoder relationship via teacher forcing.
Train a language model for poetry generation with poetry.py, using start and end tokens, tokenizer, padding, and an LSTM-based seq2seq-like model, and compare greedy translation to sampling from posterior distributions.
Explore poetry generation with an end-to-end sampling approach in advanced NLP deep learning, reusing trained LSTM layers to predict next words, manage hidden states, and print four-line verses.
Learn neural machine translation with a seq2seq model using an encoder LSTM and a decoder, handling two vocabularies, tokenizers, and SOS/EOS tokens, trained via teacher forcing.
Implement a sampling seq2seq for neural machine translation by wiring the encoder and decoder LSTMs, defining initial states, and performing greedy one-word decoding with SOS and EOS tokens.
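The greedy one-word-at-a-time loop can be sketched as follows. A real decoder would carry LSTM states and produce a softmax over the vocabulary; here a hand-written lookup table of next-word probabilities stands in for the model, and every name and number is a made-up assumption.

```python
def greedy_decode(next_word_probs, sos="<sos>", eos="<eos>", max_len=10):
    """Generate words one at a time, always taking the most probable next word."""
    word, output = sos, []
    for _ in range(max_len):
        probs = next_word_probs[word]      # stand-in for one decoder step
        word = max(probs, key=probs.get)   # greedy choice (argmax)
        if word == eos:
            break                          # stop once the end token is produced
        output.append(word)
    return output


# Toy "model": probabilities of the next word given only the previous word.
toy_model = {
    "<sos>": {"hello": 0.9, "world": 0.1},
    "hello": {"world": 0.8, "<eos>": 0.2},
    "world": {"<eos>": 0.7, "hello": 0.3},
}

translation = greedy_decode(toy_model)
print(" ".join(translation))
```

The `max_len` cap matters in practice because nothing guarantees the model ever emits the end token.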
Explore Seq2Seq, the encoder-decoder approach for translation and question answering, and how decoding with teacher forcing, max pooling, and bidirectional RNNs overcome fixed-vector limits.
Introduce attention in recurrent neural networks, comparing final-output predictions with mid-sequence hidden-state weighting using softmax. Explain hardmax versus softmax and why attention matters.
Explore how attention works in seq2seq models by computing a context vector from encoder hidden states via attention weights, guiding the decoder LSTM to translate with bidirectional encoders.
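The context vector is a weighted average of the encoder hidden states, with weights given by a softmax over time. In the sketch below the raw scores are passed in directly as an assumption; in an attention model they would come from a small neural network that looks at the decoder state and each encoder state.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_vector(encoder_states, scores):
    """Return attention weights (alphas) and the weighted sum of encoder states."""
    alphas = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(a * h[d] for a, h in zip(alphas, encoder_states))
        for d in range(dim)
    ]
    return alphas, context


encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # Tx = 3 states, 2 dims each
scores = [0.0, 0.0, 0.0]                               # equal scores: uniform attention

alphas, context = context_vector(encoder_states, scores)
print(alphas, context)
```

With equal scores the context is the plain average of the encoder states; sharper scores push it toward a single state.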
Explains reconciling teacher forcing with attention by concatenating the previous word and the context vector at the decoder input, for training and inference.
Implement attention in Keras from scratch, outlining encoder-decoder architecture, context vectors, alphas, and softmax over time, while emphasizing shapes and creating robust, version-safe code.
Explore attention in neural machine translation with code, implementing a bi-directional encoder and a decoder using teacher forcing, attention context vectors, and a two-layer network with softmax over time.
Finish off the attention script with a one-step encoder-decoder model that uses an attention context vector and embedding to predict word probabilities. Initialize s and c to zeros, and avoid loops.
Explore how attention weights reveal which input parts influence each output step by visualizing alpha as a t by t' matrix, and interpret near-linear patterns in translations.
Build a chatbot without writing code by converting Twitter conversation data into the right format and training a model, highlighting data-driven input and output patterns and ESL dialogue.
Learn how attention augments seq2seq by weighting encoder states with a neural network, enabling end-to-end differentiable training for long sequences.
Introduce memory networks for story-based question answering, using story, question, and answer inputs; link to bAbI data, attention mechanisms, and word embeddings.
Explore memory networks for reading comprehension, building sentence embeddings by summing word vectors and scoring attention-based relevance with dot products and softmax, including one- and two-hop models for supporting facts.
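A single hop of this scoring scheme can be sketched in plain Python. The 2-D word vectors are invented for the example (unknown words map to zero), and the helper names are hypothetical; a trained memory network would learn the embeddings.

```python
import math

# Made-up word vectors; any word not listed is treated as the zero vector.
word_vecs = {
    "john": [1.0, 0.0], "mary": [0.0, 1.0],
    "kitchen": [0.5, 0.0], "garden": [0.0, 0.5],
}


def embed_sentence(sentence, dim=2):
    """Sentence embedding = sum of its word vectors (bag of words)."""
    total = [0.0] * dim
    for word in sentence.lower().split():
        vec = word_vecs.get(word, [0.0] * dim)
        total = [t + v for t, v in zip(total, vec)]
    return total


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def story_weights(story, question):
    """Score each story line against the question with a dot product, then softmax."""
    q = embed_sentence(question)
    scores = [
        sum(s_i * q_i for s_i, q_i in zip(embed_sentence(line), q))
        for line in story
    ]
    return softmax(scores)


story = ["John went to the kitchen", "Mary went to the garden"]
weights = story_weights(story, "Where is Mary")
best_line = story[weights.index(max(weights))]
print(weights, "->", best_line)
```

The weights play the role of attention over story lines, so printing them is a simple way to see which sentence the model treats as the supporting fact.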
Explore memory networks in code by loading triplets of stories, questions, and answers, training single and two supporting facts models, and visualizing attention weights across story lines.
Build and interpret a memory network for a single supporting fact by embedding sentences, computing story weights with dot products and softmax, and visualizing results with a debug model.
Extend memory networks to two supporting facts stories using embed and sum for story and question representations, with two hops and a dense elu-activated layer.
Explore memory networks that retain past information to answer questions about stories from the bAbI dataset. Use attention with softmax and vector representations, with optional RNN-based hops, to derive answers.
Explore how Keras, a high-level library built on Theano, TensorFlow, and CNTK, enables quick convolutional and dense neural network construction with activation layers, while noting its limits for deep understanding.
Learn to build a Keras sequential neural network with two hidden layers (500 and 300), using dense and activation, and train with the fit function and multi-class cross entropy loss.
Learn the Keras functional API for building neural networks with model, input, and dense layers, offering a compact alternative to sequential. Build a model by connecting input to output.
Learn to convert Keras code to TensorFlow 2.0 using the built-in Keras API, with minimal import changes and a CNN text classification example.
Treat the appendix, or FAQ (frequently asked questions), as optional supplementary material outside the main content; it answers common questions and clarifies course topics.
Understand installation lectures as guidelines and focus on principles over syntax when installing Python and deep learning libraries like CNTK, Theano, and OpenAI Gym.
Discover how to install data science libraries on Windows with Anaconda, including NumPy, SciPy, matplotlib, pandas, NLTK, scikit-learn, and popular deep learning tools like TensorFlow, PyTorch, and OpenAI Gym.
Master how to set up a deep learning development environment across Windows, Linux, and Mac, including virtual machines, and installing numpy, scipy, matplotlib, ipython, pandas, Theano, and TensorFlow.
Ever wondered how AI technologies like OpenAI ChatGPT, GPT-4, DALL-E, Midjourney, and Stable Diffusion really work? In this course, you will learn the foundations of these groundbreaking applications.
It's hard to believe it's been over a year since I released my first course on Deep Learning with NLP (natural language processing).
A lot of cool stuff has happened since then, and I've been deep in the trenches learning, researching, and accumulating the best and most useful ideas to bring them back to you.
So what is this course all about, and how have things changed since then?
In previous courses, you learned about some of the fundamental building blocks of Deep NLP. We looked at RNNs (recurrent neural networks), CNNs (convolutional neural networks), and word embedding algorithms such as word2vec and GloVe.
This course takes you to a higher systems level of thinking.
Since you know how these things work, it’s time to build systems using these components.
At the end of this course, you'll be able to build applications for problems like:
text classification (examples are sentiment analysis and spam detection)
neural machine translation
question answering
We'll take a brief look at chatbots and, as you'll learn in this course, this problem is actually no different from machine translation and question answering.
To solve these problems, we’re going to look at some advanced Deep NLP techniques, such as:
bidirectional RNNs
seq2seq (sequence-to-sequence)
attention
memory networks
All of the materials of this course can be downloaded and installed for FREE. We will do most of our work in Python libraries such as Keras, NumPy, TensorFlow, and Matplotlib to make things super easy and focus on the high-level concepts. I am always available to answer your questions and help you along your data science journey.
This course focuses on "how to build and understand", not just "how to use". Anyone can learn to use an API in 15 minutes after reading some documentation. It's not about "remembering facts", it's about "seeing for yourself" via experimentation. It will teach you how to visualize what's happening in the model internally. If you want more than just a superficial look at machine learning models, this course is for you.
See you in class!
"If you can't implement it, you don't understand it"
Or as the great physicist Richard Feynman said: "What I cannot create, I do not understand".
My courses are the ONLY courses where you will learn how to implement machine learning algorithms from scratch.
Other courses will teach you how to plug in your data into a library, but do you really need help with 3 lines of code?
After doing the same thing with 10 datasets, you realize you didn't learn 10 things. You learned 1 thing, and just repeated the same 3 lines of code 10 times...
Suggested Prerequisites:
Decent Python coding skills
Understand RNNs, CNNs, and word embeddings
Know how to build, train, and evaluate a neural network in Keras
WHAT ORDER SHOULD I TAKE YOUR COURSES IN?
Check out the lecture "Machine Learning and AI Prerequisite Roadmap" (available in the FAQ of any of my courses, including the free NumPy course).
UNIQUE FEATURES
Every line of code explained in detail - email me any time if you disagree
No wasted time "typing" on the keyboard like other courses - let's be honest, nobody can really write code worth learning about in just 20 minutes from scratch
Not afraid of university-level math - get important details about algorithms that other courses leave out