
Explore supervised learning, where a model f(x; w) predicts targets y_hat from observations x using ground truth labels y and a loss function L to optimize weights w.
Represent text numerically for machine learning using simple count-based methods and one-hot encoding, building from tokens to sentences and documents. Also examine binary encoding and tf-idf relevance in NLP.
Learn tf-idf concepts: weight terms by term frequency and inverse document frequency to emphasize rare, informative words in a corpus, with patent document examples.
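The weighting described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the lecture's code: the three-sentence corpus is invented, and the definitions used (tf as count divided by document length, idf as log(N / document frequency)) are one common variant among several.

```python
import math
from collections import Counter

def tf_idf(corpus):
    """Return one {term: weight} dict per document.

    Assumed definitions (one common variant):
    tf  = term count / number of tokens in the document
    idf = log(N / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in corpus]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy corpus, standing in for patent documents.
corpus = [
    "the patent describes a valve",
    "the patent describes a sensor",
    "the sensor measures pressure",
]
scores = tf_idf(corpus)
```

A term that appears in every document, such as "the", receives a weight of zero, while a term unique to one document, such as "valve", receives the largest weight in that document.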
Explore target encoding in NLP and how computational graphs drive supervised learning, enabling forward evaluation, loss signaling, and automatic differentiation with PyTorch.
Learn to create tensors in PyTorch, inspect their type, shape, and values, and convert between NumPy arrays and tensors while using random, zeros, ones, and in-place fills.
Explore tensor size and types in PyTorch, learn how to set or cast a dtype (float, long, double), and use shape and size to inspect dimensions for debugging.
Explore tensor operations in PyTorch using functions like dot, add, and sum to manipulate tensors, including 2D tensors where rows are dim 0 and columns are dim 1. Practice indexing, slicing, joining, and mutating with built-in PyTorch tools, similar to NumPy.
Learn joining, slicing, and indexing in PyTorch, building on NumPy familiarity. Access non-contiguous elements with long-tensor indices and index_select, and use concatenation, view, and stack for linear algebra.
Learn how PyTorch tensors encapsulate data and operations within a computational graph. Track gradients with requires_grad, perform forward passes, and compute backpropagation via backward from a loss function.
Explore the perceptron as the simplest neural network, combining an affine transform w·x + b with a nonlinear activation like sigmoid, and see how PyTorch handles it.
Explore sigmoid activation functions in neural networks, nonlinearities that capture complex relationships and squash real-valued inputs to 0–1, and examine vanishing and exploding gradients that affect learning.
Learn how the tanh activation relates to the sigmoid: tanh is a scaled and shifted (linear) transform of the sigmoid. See its expression, PyTorch code, and a plot.
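The relationship between the two activations is the identity tanh(x) = 2 * sigmoid(2x) - 1. The short check below, written with the standard library rather than PyTorch, is only a numerical sanity check of that identity; the function names are illustrative.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: squashes a real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh_via_sigmoid(x):
    """tanh written as a scaled and shifted sigmoid."""
    return 2.0 * sigmoid(2.0 * x) - 1.0

# Compare against the library tanh at a few sample points.
sample_points = (-2.0, -0.5, 0.0, 0.5, 2.0)
matches = [
    math.isclose(tanh_via_sigmoid(x), math.tanh(x), abs_tol=1e-12)
    for x in sample_points
]
```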
Explore ReLU, the rectified linear unit activation function, which clips negative values to zero and addresses vanishing gradients while noting dying ReLU and leaky or parametric variants.
Explore softmax as the activation that yields a discrete probability distribution over classes. Compare it with sigmoid and other activations and note its link to cross-entropy loss for classification.
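As a rough illustration of how softmax turns arbitrary scores into a discrete probability distribution, here is a plain-Python sketch. Subtracting the maximum score before exponentiating is a standard stability trick and does not change the result; the input scores are made up.

```python
import math

def softmax(scores):
    """Map a list of real-valued scores to probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])          # larger score -> larger probability
stable = softmax([1000.0, 1000.0])        # would overflow without the shift
```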
Learn how mean squared error loss measures the distance between the predicted y hat and the target y in regression with continuous outputs, using PyTorch to implement MSE loss.
Learn how categorical cross entropy loss evaluates multi-class predictions by comparing network outputs to true distributions, emphasizing numerically stable use with softmax, one-hot targets, and PyTorch implementation.
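The numerically stable pairing of softmax and cross-entropy can be sketched without PyTorch by working in log space. This is an illustrative stand-in, not the library's implementation; the scores and one-hot targets are invented.

```python
import math

def log_softmax(scores):
    """Log of the softmax, computed with the log-sum-exp trick."""
    shift = max(scores)
    log_total = shift + math.log(sum(math.exp(s - shift) for s in scores))
    return [s - log_total for s in scores]

def cross_entropy(scores, one_hot_target):
    """Categorical cross-entropy between raw scores and a one-hot target."""
    log_probs = log_softmax(scores)
    return -sum(t * lp for t, lp in zip(one_hot_target, log_probs))

scores = [4.0, 0.5, 0.1]
loss_good = cross_entropy(scores, [1, 0, 0])  # target is the top-scoring class
loss_bad = cross_entropy(scores, [0, 0, 1])   # target is the lowest-scoring class
```

The loss is small when the highest score falls on the true class and grows as the true class receives less probability mass.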
Explore binary cross entropy loss for binary classification, using a sigmoid output and ground truth vectors to compute the loss, with examples and preparing for supervised training.
Construct toy data for supervised learning using synthetic two-dimensional points to separate stars and circles by a line, then train a model with a loss function and gradient-based optimization.
Explore choosing an optimizer for supervised training: compare learning rates and optimizers like Adam and SGD in PyTorch, balancing convergence and update dynamics.
Learn gradient-based supervised learning in Python: compute loss, backpropagate through the graph, and update parameters with an optimizer on batch data across epochs for binary classification.
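The loop of computing a loss, obtaining gradients, and updating parameters across epochs can be illustrated without PyTorch. The sketch below swaps in hand-derived gradients and plain full-batch gradient descent for autograd and an optimizer, and uses invented toy data (points labeled by which side of the line x1 + x2 = 0 they fall on), so treat it as an analogy to the lecture's routine rather than a copy of it.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_toy_data(n, seed=0):
    """Hypothetical 2D points; label 1.0 above the line x1 + x2 = 0, else 0.0."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        data.append(((x1, x2), 1.0 if x1 + x2 > 0 else 0.0))
    return data

def train(data, epochs=200, lr=0.5):
    """Full-batch gradient descent on binary cross-entropy for a perceptron."""
    w1 = w2 = b = 0.0
    n = len(data)
    for _ in range(epochs):
        g1 = g2 = gb = 0.0
        for (x1, x2), y in data:
            y_hat = sigmoid(w1 * x1 + w2 * x2 + b)   # forward pass
            error = y_hat - y                        # d(BCE)/d(pre-activation)
            g1 += error * x1
            g2 += error * x2
            gb += error
        # Parameter update with the averaged gradient.
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
        b -= lr * gb / n
    return w1, w2, b

def accuracy(data, w1, w2, b):
    correct = sum(
        (sigmoid(w1 * x1 + w2 * x2 + b) > 0.5) == (y == 1.0)
        for (x1, x2), y in data
    )
    return correct / len(data)

data = make_toy_data(100)
w1, w2, b = train(data)
train_accuracy = accuracy(data, w1, w2, b)
```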
Learn to classify sentiment of Yelp reviews by mapping 1–2 stars to negative and 3–4 stars to positive, using a lite dataset and PyTorch dataset class with train/validation/test splits.
Import and prepare the dataset, then create train, validation, and test splits with a fixed random seed. Perform minimal text cleaning by spacing around punctuation and applying a preprocessing function.
Explore PyTorch dataset representation by subclassing the Dataset class, implementing __getitem__ and __len__, and using a review vectorizer that tokenizes text on whitespace and converts it to numeric vectors for a data loader.
Load the dataset from a CSV file, build a vectorizer from the dataframe, and create train, val, and test splits while implementing a PyTorch dataset that returns the feature vector x and the label y.
Explore building a vocabulary from a serialized dictionary, adding tokens, and mapping text to token indices with token-to-index lookups and unknown handling. Preview the vectorizer in the next lecture.
Vectorize text by mapping tokens to integers from the data frame and producing a fixed-length vector. Builds a vocabulary with cutoff and an unk token, using a collapsed one-hot representation.
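The vocabulary and vectorizer ideas from the last two lectures can be sketched together in plain Python. The class and function names, the `<UNK>` token string, and the three toy reviews are all assumptions for illustration; the course's own classes may differ in detail.

```python
from collections import Counter

class Vocabulary:
    """Maps tokens to integer indices, with a fallback for unknown tokens."""

    def __init__(self, unk_token="<UNK>"):
        self.token_to_idx = {}
        self.unk_index = self.add_token(unk_token)

    def add_token(self, token):
        if token not in self.token_to_idx:
            self.token_to_idx[token] = len(self.token_to_idx)
        return self.token_to_idx[token]

    def lookup(self, token):
        return self.token_to_idx.get(token, self.unk_index)

    def __len__(self):
        return len(self.token_to_idx)

def build_vocabulary(texts, cutoff=1):
    """Keep only tokens that occur more than `cutoff` times."""
    counts = Counter(tok for text in texts for tok in text.split())
    vocab = Vocabulary()
    for token, count in counts.items():
        if count > cutoff:
            vocab.add_token(token)
    return vocab

def vectorize(text, vocab):
    """Collapsed one-hot: a 1 at the index of every token present in the text."""
    one_hot = [0] * len(vocab)
    for token in text.split():
        one_hot[vocab.lookup(token)] = 1
    return one_hot

reviews = ["good food good service", "bad food", "good place"]  # toy data
vocab = build_vocabulary(reviews, cutoff=1)
vector = vectorize("good pizza", vocab)
```

With this toy data only "good" and "food" pass the cutoff, so the vector has three positions (`<UNK>`, "good", "food"), and the unseen word "pizza" sets the `<UNK>` position.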
Explore how the vectorizer converts text to numeric vectors using a vocabulary, one-hot encoding, and token lookup, built from a dataframe with frequency-based cutoff.
Learn how to use PyTorch's DataLoader to group vectorized data into mini batches, set batch size and shuffle, and move data between CPU and GPU with a generate_batches generator.
Reimplement a perceptron classifier in a PyTorch module with a single linear output for binary sentiment classification. It explains when to apply sigmoid versus BCE loss, with a Yelp example.
Explore the training routine that instantiates the model, iterates the dataset, computes outputs and loss, updates parameters, and uses a central args object to coordinate hyperparameters.
Explore feedforward networks for natural language processing, including multilayer perceptrons and convolutional neural networks, and see how MLPs handle non-linear data and CNNs detect localized patterns in sequential data.
Build an MLP in PyTorch with two linear layers and a ReLU between them. Learn how the forward pass computes outputs, while PyTorch handles the backward pass and gradients.
Train a multilayer perceptron to classify surnames by country of origin, vectorizing surname characters with a vocabulary vectorizer and a dataloader for mini-batches.
Explore a 10,000-surname dataset from 18 nationalities, highlighting data imbalance and orthography linked to origin, then learn subsampling, grouping by nationality, and train, validation, and test splits.
Extend the surnames dataset in a PyTorch dataset by detailing __getitem__ and __len__, returning a vectorized surname and nationality index for the vocabulary, vectorizer, and dataloader.
Build a surname classifier with a two-layer MLP, featuring a first linear layer and a second linear layer that produce a prediction vector, with optional softmax and cross-entropy loss.
Explore the convolution operation and channels in PyTorch, with Conv1d for NLP, Conv2d for images, and Conv3d for video, and learn how in_channels and out_channels shape the output.
Explore how kernel size controls local information and output size in convolution, with smaller kernels capturing fine grained patterns like n-grams and larger ones producing coarser features in NLP.
Explore how dilation in convolutional kernels expands the receptive field by spacing kernel elements, enabling larger region summarization without extra parameters in dilated CNNs.
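The effect of kernel size and dilation on output size can be worked out with the output-length formula given in the PyTorch Conv1d documentation. The helper below is a small sketch of that arithmetic; its name and the example sizes are illustrative.

```python
def conv1d_output_length(length, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a 1D convolution over an input of `length` positions."""
    # A dilated kernel spans dilation * (kernel_size - 1) + 1 input positions.
    effective_kernel = dilation * (kernel_size - 1) + 1
    return (length + 2 * padding - effective_kernel) // stride + 1

small_kernel = conv1d_output_length(10, kernel_size=3)             # 8
large_kernel = conv1d_output_length(10, kernel_size=5)             # 6
dilated = conv1d_output_length(10, kernel_size=3, dilation=2)      # 6
```

A kernel of size 3 with dilation 2 covers the same span of five positions as a kernel of size 5, so both shrink the output to the same length, but the dilated kernel has three weights per channel instead of five.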
Explore CNN design in PyTorch through an end-to-end surname classification example, replacing an MLP with convolutional layers that feed a final linear layer to a prediction.
Explore data loader, vocabulary, and a character-level vectorizer that maps each character to an integer and builds a one-hot matrix for CNN models, aligning batch, channel, and feature dimensions.
Learn how to build a surname classifier with convolutional networks in Python, using Conv1d layers, a Sequential module, and the ELU nonlinearity to produce a final class prediction.
Execute the standard training routine by instantiating the data set, model, loss function, and optimizer, then iterate over training and validation partitions across epochs to measure performance.
Evaluate test data set performance with quantitative and qualitative metrics; compare CNN and MLP results for textual data. Predict nationality from surnames using top-k probabilities with a vectorizer.
Explore how word embeddings encode syntactic and semantic relationships and support analogy tasks using a difference vector added to a word, with a nearest neighbor index predicting the fourth word.
Build a text dataset from Mary Shelley’s Frankenstein via Project Gutenberg using NLTK for tokenization. Construct a CBOW dataset with a context window of two and split it 70/15/15 into train, val, and test.
The CBOW classifier embeds context words with an embedding layer, combines vectors by sum (or max, average, or an MLP), and uses a linear layer to produce a vocabulary-sized probability distribution.
Learn how to initialize a news classifier's embedding layer with pre-trained GloVe embeddings, load and subset embeddings from disk, and handle missing words with Xavier uniform initialization.
Build a GloVe-based embedding pipeline by loading GloVe vectors from file, creating a word-to-index mapping, and assembling a final embedding matrix for dataset words, initializing unknown embeddings with Xavier uniform.
Continue building a convnet news classifier using one-hot character embeddings, an embedding layer mapping indices to vectors, and a pre-trained embeddings subset in the PyTorch module.
Examine a 1d convnet with dropout and embed-and-permute steps, followed by average pooling, squeeze, and final linear layers that yield class predictions.
Explore sequence modeling in natural language processing, where data depends on prior items, from phonemes to verb agreement, and see end-to-end recurrent neural networks applied to classification tasks.
Model sequences with a basic Elman RNN, where a hidden state updates from the current input and previous hidden state, trained via backpropagation through time with shared weights.
Explore the Elman RNN and its single time-step computations using an RNN cell in PyTorch. Understand input-to-hidden and hidden-to-hidden weight matrices through a hands-on, explicit implementation.
Continue exploring Elman RNN theory and the forward pass in the Elman RNN class. Handle batch first, initial hidden states, and stacking outputs into a three-dimensional tensor.
Classify surname nationality with a character RNN (Elman) by loading a surnames dataset, creating a vectorizer, and building a PyTorch dataset to map surname sequences to nationalities.
Explore end of sequence handling and surname vectorization for sequence prediction, transforming surnames into two indexed sequences with begin and end tokens and forming input-output pairs for training.
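The idea of turning one surname into an input sequence and a shifted output sequence can be sketched as follows. The token strings `<BEGIN>` and `<END>`, the helper name, and the tiny character vocabulary are assumptions made for this example.

```python
BEGIN, END = "<BEGIN>", "<END>"  # hypothetical special tokens

def make_input_output(surname, token_to_idx):
    """Wrap a surname in begin/end tokens and split it into (input, target).

    At each position the target is the next index in the sequence, so the
    model learns to predict the following character (or the end token).
    """
    indices = [token_to_idx[BEGIN]]
    indices += [token_to_idx[ch] for ch in surname]
    indices.append(token_to_idx[END])
    return indices[:-1], indices[1:]

# Toy vocabulary built from a single surname.
chars = sorted(set("rao"))                                   # ['a', 'o', 'r']
token_to_idx = {tok: i for i, tok in enumerate([BEGIN, END] + chars)}
x, y = make_input_output("rao", token_to_idx)
```

Here the input starts with the begin token and the target ends with the end token, and both sequences have the same length.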
Examine how the encoder processes a sequence of integers to produce per-position feature vectors and a final hidden state, which initializes the decoder in the next lecture.
Explore unconditioned surname generation using a GRU that ignores nationality, starting from a zero initial hidden state, embedding characters, and predicting tokens with a linear layer in PyTorch.
Examine the embedding setup, RNN/GRU configuration, dropout, and padding, then reshape the three-dimensional output to a two-dimensional matrix to compute predictions for every sample with a linear layer.
Learn how to train a character-level surname generator, handling variable length sequences with masking, reshaping tensors for time-step predictions, and cross-entropy loss with ignore index, while examining results.
Preprocess the English-French translation dataset by lowercasing and NLTK tokenization, then narrow data with syntax patterns such as 'I am' and 'you are' before 70/15/15 train, validation, and test splits.
Explore the vectorization pipeline for NMT, where source English and target French sentences use two separate vocabularies and max sequence lengths, preparing data for PyTorch packed sequences.
Continue the discussion by vectorizing source text and target text, generating x indices and y indices via vocab lookups, and preparing for the target decoder.
Encode and decode in neural machine translation using an encoder–decoder with a bidirectional GRU and attention, and learn how the NMT forward method coordinates encoder and decoder components.
Learn how sequences are packed and unpacked for mini-batches and fed into a GRU to form decoder vectors, and see the NMT decoder construct target sentences from encoded sources with embeddings.
Continue with the NMT decoder by walking through its initialization: embedding layers, GRU-based RNN components, linear layers that map hidden states, and the begin-of-sequence index.
Explore the neural machine translation decoder in depth, covering initialization of context and hidden states, encoder outputs, and the forward pass with scheduled sampling and prediction vectors.
Explore Python data collection structures, including lists, dictionaries, tuples, series, data frames, and panels, with practical examples and key concepts like mutability, index starting at zero, and square brackets.
Practice creating lists of strings and integers, including empty and nested lists, print lists like colors (blue, purple, red), and explore counting and accessing values in lists.
Learn how to access values in lists using forward and backward indexing with Python, and slice lists with a colon to retrieve elements by index.
Update and add items to Python lists using slice assignment and append or extend. See practical examples with student names and marks that illustrate updating single or multiple elements.
Discover Python list operations, including concatenation and repetition with plus and asterisk, printing results, obtaining length with len, membership testing with in, and traversing elements with for loops.
Explore slicing, matrices, and indexing in Python, mastering list creation and negative-index slices, and using functions like len, max, min, and converting tuples to lists.
Explore essential Python list methods for editing and managing data. Learn how to append, extend, insert, pop, remove, count, index, reverse, and sort to manipulate lists efficiently.
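The list methods named above can be seen side by side in a short session. The marks are made-up values; the comments show the list after each call.

```python
marks = [72, 85]
marks.append(91)             # [72, 85, 91]
marks.extend([60, 85])       # [72, 85, 91, 60, 85]
marks.insert(0, 50)          # [50, 72, 85, 91, 60, 85]
last = marks.pop()           # removes and returns the last item (85)
marks.remove(50)             # removes the first 50 -> [72, 85, 91, 60]
count_85 = marks.count(85)   # how many times 85 occurs
position = marks.index(91)   # index of the first 91
marks.reverse()              # [60, 91, 85, 72]
marks.sort()                 # [60, 72, 85, 91]
```

All of these except count and index modify the list in place, and of the modifying methods only pop returns a value.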
Master traversing and sorting in Python by creating lists from sequences, iterating to process elements, and printing sorted versus unsorted lists.
Explore strings and lists in Python by splitting strings into characters or words with delimiters, and using the join method to reconstruct strings, with practical natural language processing examples.
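Splitting and joining are inverse operations when the same delimiter is used, as this small sketch shows. The example strings are illustrative.

```python
sentence = "natural language processing"

words = sentence.split(" ")                    # split into words on spaces
chars = list("nlp")                            # split into single characters
colors = "blue,purple,red".split(",")          # split on a custom delimiter

rebuilt = " ".join(words)                      # join restores the sentence
hyphenated = "-".join(words)                   # or joins with a new delimiter
```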
Explore dictionaries as key-value stores written with curly braces, colons, and commas, where keys are unique and immutable (since Python 3.7, dictionaries also preserve insertion order). Create them with dict() or direct assignment, with examples such as car prices and student marks.
Access and update dictionary values in Python to manage student marks, add new items like stud marks or Andrew Cook, and print the updated dictionary.
Explore built-in dictionary functions in Python, learn how the cmp function compared dictionaries in Python 2 (it was removed in Python 3), and use len, str, and type to inspect dictionaries with practical examples.
Explore built-in dictionary methods in Python, including clear, copy, get, items, keys, setdefault, update, and values, then learn to sort dictionaries by keys or by values with sorted.
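Sorting a dictionary by keys or by values usually means sorting its items and rebuilding a dictionary from the result. The sketch below uses invented student names and marks, and also shows get and setdefault for comparison.

```python
marks = {"Chen": 78, "Ada": 91, "Bo": 64}  # hypothetical student marks

# sorted() on items orders the (key, value) pairs by key.
by_key = dict(sorted(marks.items()))

# A key function selects the value; reverse=True puts the highest mark first.
by_value = dict(sorted(marks.items(), key=lambda item: item[1], reverse=True))

missing = marks.get("Dee", 0)          # default returned, dictionary unchanged
added = marks.setdefault("Dee", 55)    # default inserted and returned
```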
Learn how to concatenate tuples and access values with indices in Python, while practicing immutability, deleting a tuple, and using forward and backward slicing.
Explore sentiment analysis on movie reviews using Python, NLTK, and scikit-learn in Google Colab, covering data loading, preprocessing with tokenization, stopword removal, and lemmatization, plus tf-idf features and multinomial naive Bayes.
Learn how to download and load the movie reviews dataset from nltk, a collection of labeled reviews. Install and unzip key resources such as stopwords and WordNet.
Load and preprocess the data by organizing documents, movie reviews, categories, and file IDs, then perform a random shuffle of the documents.
Shuffle the data so that positive and negative reviews are mixed, then extract text reviews and their labels. Create xtrain, xtest, ytrain, and ytest with train_test_split, specifying test size and random state.
Build a naive Bayes text classifier for sentiment analysis with NLTK and scikit-learn, using tf-idf features in a pipeline with TfidfVectorizer and MultinomialNB, and evaluate its accuracy on the test set.
Build an LLM-powered study assistant in Google Colab using Python, the OpenAI SDK, and LangChain to create embeddings and a vector database for retrieval-augmented generation, question answering, and summaries.
Install and import external libraries for natural language processing with Python, including OpenAI, LangChain, and Chroma DB, to manage embeddings with a tokenizer for token counting.
Import libraries and set up OpenAI integration with LangChain core, text splitters, embeddings, and vector stores for retrieval-based QA.
Perform a simple GPT-4 sanity check by sending a single prompt to a GPT model and printing the response to verify the API works.
Natural Language Processing (NLP) is at the forefront of artificial intelligence, enabling machines to understand, interpret, and generate human language. This course provides a comprehensive introduction to NLP, covering both foundational linguistic concepts and advanced deep learning techniques. Through a hands-on approach with PyTorch, students will learn to build, train, and evaluate deep learning models for a variety of NLP tasks.
The course begins with an Introduction to Natural Language Processing (NLP), exploring key applications such as machine translation, chatbots, and text summarization. Following this, students will dive into Text Preprocessing Techniques, including tokenization, stopword removal, stemming, lemmatization, and vectorization—essential steps for preparing textual data for machine learning models.
Next, we will explore fundamental NLP applications, including Sentiment Analysis and Text Classification, using traditional machine learning approaches before advancing to deep learning-based methods. Students will also work with Named Entity Recognition (NER) and Part-of-Speech (POS) Tagging, essential for information extraction and linguistic analysis.
To understand how machines interpret textual data, we will cover Word Embeddings and Semantic Similarity, including Word2Vec, GloVe, and contextual embeddings from modern models. This leads naturally into deep learning fundamentals, starting with an Introduction to Neural Networks, Perceptrons and Feedforward Networks, and Backpropagation and Gradient Descent, which power most deep learning models.
A key focus will be on Activation Functions and Optimization Algorithms, helping students fine-tune their models for improved performance. The course then explores sequence-based deep learning models, such as Recurrent Neural Networks (RNNs) and Long Short-Term Memory Networks (LSTMs), which are critical for processing sequential text data.
Modern NLP relies on Transformers for NLP Tasks, including the groundbreaking Transformer architecture behind BERT and GPT models. We will then introduce PyTorch and its Ecosystem, equipping students with the tools to build, train, and deploy deep learning models.
Hands-on projects will guide students through Building NLP Models with PyTorch, Implementing Neural Networks with PyTorch, and Training and Evaluating Deep Learning Models to ensure proficiency in real-world applications.
By the end of the course, students will have a strong foundation in both classical and deep learning approaches to NLP, with the ability to build cutting-edge models using PyTorch. This course is ideal for data scientists, machine learning engineers, and AI enthusiasts eager to advance their skills in NLP and deep learning.