Natural Language Processing (NLP) Using NLTK in Python
3.7 (3 ratings)
22 students enrolled

Build smart AI-driven linguistic applications using deep learning and NLP techniques
Created by Packt Publishing
Last updated 4/2019
English [Auto-generated]
30-Day Money-Back Guarantee
This course includes
  • 3 hours on-demand video
  • 1 downloadable resource
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What you'll learn
  • Attain a strong foundation in Python for deep learning and NLP
  • Build applications with Python, using the Natural Language Toolkit (NLTK) for NLP
  • Get to grips with various NLP techniques to build an intelligent chatbot
  • Classify text and speech using the Naive Bayes Algorithm
  • Use various tools and algorithms to build real-world applications
  • Build solutions such as text similarity, summarization, sentiment analysis and anaphora resolution to get up to speed with new trends in NLP
  • Write your own POS taggers and grammars so that any syntactic analyses can be performed easily
  • Use the inbuilt chunker and create your own chunker to evaluate trained models
  • Create your own named entities using dictionaries to use inbuilt text classification algorithms
Course content
54 lectures 03:04:52
+ Natural Language Processing in Practice
28 lectures 01:47:17

This video provides an overview of the entire course.

Preview 03:44

How to follow along with the practical steps

  • Download and install Python

  • Download and install PyCharm Community Edition

Setup and Installation

What is NLP?

  • Introduction to why NLP was invented

  • Define NLP

Understanding NLP and Its Benefits


Exploring NLP Tools and Libraries

Tokenizing text into sentences or words

  • Create a tokenizer from NLTK

  • Process or tokenize your text

Preview 06:39

What are stop words? How to filter or remove them to keep only the important terms

  • Build a list of stop words

  • Filter them out from your text

Stop Words

Build the lexical structure of your text or sentence

  • Import a Part of Speech tagger from NLTK

  • Process or tag the terms in the sentence

  • Check out the results or tags

Part of Speech Tagging

How to get the root of the different terms in order to combine similar terms or concepts

  • Initialize a stemmer and a lemmatizer

  • Process your tagged text through them

  • Check out the lemmas and stems

Stemming and Lemmatization

How to extract the names of people and places

  • Import a Named Entity recognizer from NLTK

  • Process your text to extract the existing named entities

Named Entity Recognition

Extract Keywords from the provided NLTK Corpus

  • Import the corpus

  • Apply TF-IDF

  • Check out the top 10 keywords for each document
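The TF-IDF scoring used above can be sketched in plain Python on a toy corpus of our own (NLTK and scikit-learn offer ready-made implementations as well):

```python
import math

# Toy corpus: two tokenized "documents".
docs = {
    "doc1": "the cat sat on the mat".split(),
    "doc2": "the dog chased the cat".split(),
}

def tf_idf(term, doc, corpus):
    tf = doc.count(term) / len(doc)                     # term frequency in this doc
    df = sum(1 for d in corpus.values() if term in d)   # how many docs contain it
    idf = math.log(len(corpus) / df)                    # inverse document frequency
    return tf * idf

# Terms shared by every document score zero; rarer terms score higher,
# which is what makes them good keywords.
print(tf_idf("the", docs["doc1"], docs))
print(tf_idf("mat", docs["doc1"], docs))
```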


What is Sentiment Analysis?

  • Definition

Introduction to Sentiment Analysis

Which dataset to use, where to download it, and how to preprocess it

  • Download the dataset using Keras

  • Split to Train and Test data

Pre-Processing the Dataset

What are Word Embeddings?

  • Define word embeddings

  • Add a word embeddings layer to our network

Word Embeddings

What other layers should we add, and how do we build the network?

  • Add two more layers

  • Compile the network

Build the Network
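A minimal Keras sketch of the kind of network described: an embedding layer followed by pooling and dense layers. The vocabulary size and layer widths here are assumptions for illustration, not the course's exact values:

```python
# Embedding layer followed by pooling and dense layers for binary sentiment.
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Embedding, GlobalAveragePooling1D, Dense

model = Sequential([
    Embedding(input_dim=10000, output_dim=32),  # 10k-word vocab -> 32-dim vectors
    GlobalAveragePooling1D(),                   # average the word vectors per review
    Dense(16, activation="relu"),
    Dense(1, activation="sigmoid"),             # probability of a positive review
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```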

Training the model using the train data

  • Train the model

Train the Model

Test the accuracy of the model

  • Use test data to test the model

Test the Model

Test the model with a real example

  • Predict the sentiment of a review

Apply to a Single Input

What is Machine Learning?

  • Define Machine Learning

  • Applications

  • Algorithms

Machine Learning

What is Classification and Text Classification?

  • Define Classification

  • Text Classification


What steps should we follow to pre-process the data?

  • Load the data

  • Apply TF-IDF

Pre-Processing the Dataset

What are Naïve Bayes Multinomial and SVM?

  • Define Naïve Bayes Multinomial

  • Define SVM

Naïve Bayes and SVM
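Both classifiers can be sketched with scikit-learn on TF-IDF features; the tiny training set below is our own illustration, not the course's dataset:

```python
# Multinomial Naive Bayes and a linear SVM trained on TF-IDF features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

texts = ["great movie", "wonderful film", "terrible movie", "awful film"]
labels = ["pos", "pos", "neg", "neg"]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)           # documents -> TF-IDF vectors

nb = MultinomialNB().fit(X, labels)
svm = LinearSVC().fit(X, labels)

print(nb.predict(vectorizer.transform(["great film"])))
```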

Build and train the classifier

  • Train the classifier using pre-processed data

Train the Classifier

Testing the classifier

Test the Classifier

What are Chatbots?

  • Define Chatbots

  • Introduction to ChatterBot


NLTK Chatbots

  • Simple NLTK Chatbot conversation

Simple NLTK Bot
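A minimal sketch of such a rule-based bot using NLTK's `nltk.chat.util` module (the pattern-response pairs are our own):

```python
# A tiny pattern-matching chatbot built on nltk.chat.util.
from nltk.chat.util import Chat, reflections

pairs = [
    (r"hi|hello", ["Hello! How can I help you?"]),
    (r"what is your name\??", ["I am a simple NLTK bot."]),
]
bot = Chat(pairs, reflections)   # reflections maps "I" -> "you", etc.
print(bot.respond("hello"))
```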

Creating the first ChatterBot

  • Install ChatterBot library

  • Instantiate a Chatbot

Create a ChatterBot

How to make the Chatbot better?

  • Add pre-processors

Enhancing the Chatbot

Train the bot for more vocabulary

  • Import the corpus trainer

  • Train and test using English corpus

  • Train and test using French corpus

Training the Chatbot
Test Your Knowledge
5 questions
+ Developing NLP Applications Using NLTK in Python
26 lectures 01:17:35

This video gives an overview of the entire course.

Preview 03:25

In this video, we use the Python NLTK library to understand more about the POS tagging features in a given text.

  • Create a variable called simpleSentence

  • Invoke the NLTK built-in tokenizer function word_tokenize()

  • Invoke the NLTK built-in tagger pos_tag()

Exploring the In-Built Tagger

Now, we will explore the NLTK library by writing our own taggers. We’ll write various types of taggers such as Default tagger, Regular expression tagger and Lookup tagger.

  • Define a new Python function called learnDefaultTagger

  • Create an object of the DefaultTagger() class

  • Call the tag() function of the tagger object

Writing Your Own Tagger
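The default and regular-expression taggers described above can be sketched as follows (the tag patterns are our own examples):

```python
# A default tagger and a regular-expression tagger from NLTK.
from nltk.tag import DefaultTagger, RegexpTagger

default_tagger = DefaultTagger("NN")        # tag every token as a noun
print(default_tagger.tag(["hello", "world"]))

patterns = [
    (r".*ing$", "VBG"),  # gerunds
    (r".*ed$", "VBD"),   # past-tense verbs
    (r".*", "NN"),       # fall back to noun
]
regexp_tagger = RegexpTagger(patterns)
print(regexp_tagger.tag(["running", "jumped", "dog"]))
```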

Next, let’s learn how to train our own tagger and save the trained model to disk so that we can use it later for further computations.

  • Define a function called sampleData()

  • Define a function called buildDictionary()

  • Build an nltk.UnigramTagger() object

Training Your Own Tagger
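A minimal sketch of training a `UnigramTagger` on a tiny hand-tagged corpus (the training sentences are our own; a trained tagger can be saved to disk with `pickle` for later reuse):

```python
# Train a unigram tagger on hand-tagged sentences, then tag new text.
from nltk.tag import UnigramTagger

train_sents = [
    [("the", "DT"), ("cat", "NN"), ("sat", "VBD")],
    [("the", "DT"), ("dog", "NN"), ("ran", "VBD")],
]
tagger = UnigramTagger(train_sents)   # learns the most frequent tag per word
print(tagger.tag(["the", "cat", "ran"]))
```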

This video will teach us how to define grammar and understand production rules.

  • Import the generate function from the nltk.parse.generate

  • Define a new grammar

  • Create a new grammar object using the nltk.CFG.fromstring()

Learning to Write Your Own Grammar
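The steps above can be sketched with a tiny CFG of our own, using the `generate` function to enumerate every sentence the grammar licenses:

```python
# Define a small CFG and generate all the sentences it produces.
from nltk import CFG
from nltk.parse.generate import generate

grammar = CFG.fromstring("""
S -> NP VP
NP -> 'the' N
N -> 'cat' | 'dog'
VP -> 'sleeps'
""")

for sentence in generate(grammar):
    print(" ".join(sentence))
```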

Probabilistic CFG is a special type of CFG in which, for each non-terminal token (left-hand side), the probabilities of all its productions sum to one. Let's write a simple example to understand more.

  • Identify tokens in the grammar

  • Join the list of all the production rules into a string

Writing a Probabilistic CFG

Recursive CFGs are a special type of CFG in which tokens on the left-hand side also appear on the right-hand side of a production rule. Palindromes are among the best examples of a recursive CFG.

  • Create a new list data structure called productions

  • Add production rules that define palindromes

  • Pass the newly constructed grammarString to the NLTK built-in nltk.CFG.fromstring function

Writing a Recursive CFG

In this video, we will learn how to use the in-built chunker. We will use some features that will be used from NLTK as part of this process.

  • Add string to a variable called text

  • Break the given text into multiple sentences

  • Do POS analysis using the default tagger

Using the Built-In Chunker

Now that we know how to use the built-in chunker, in this video we will write our own regex chunker.

  • Write regular expressions

  • Understand tag patterns

  • Identify chunks

Writing Your Own Simple Chunker
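The regex chunker described above can be sketched with `RegexpParser`; the tag pattern below groups determiner-adjective-noun sequences into NP chunks (the tagged sentence is our own):

```python
# A regex chunker that finds noun-phrase chunks in a tagged sentence.
from nltk import RegexpParser

tagged = [("the", "DT"), ("little", "JJ"), ("dog", "NN"), ("barked", "VBD")]
chunker = RegexpParser("NP: {<DT>?<JJ>*<NN>}")  # tag pattern for a noun phrase
tree = chunker.parse(tagged)
print(tree)
```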

In this video, we will learn the training process, training our own chunker, and evaluating it.

  • Import the conll2000 corpus and treebank corpus

  • Define a new function, mySimpleChunker()

  • Create a list of two datasets

Training a Chunker

Recursive descent parsers belong to the family of parsers that read the input from left to right and build the parse tree in a top-down fashion, traversing nodes in pre-order.

  • Define a new function, RDParserExample

  • Iterate over the list of sentences in the textlist variable

  • Create a new CFG object using grammar

Parsing Recursive Descent

In this video, we will learn to use and understand shift-reduce parsing.

  • Define a new function, SRParserExample

  • Iterate over the list of sentences in the textlist variable

  • Define two sample sentences to understand the shift-reduce parser

Parsing Shift-Reduce

We will now learn how to parse dependency grammar and use it with the projective dependency parser.

  • Create a grammar object using the nltk.grammar.DependencyGrammar class

  • Define the sample sentence on which parser will be run

Parsing Dependency Grammar and Projective Dependency

Chart parsers are a special type of parser well suited to natural languages, since natural languages tend to have ambiguous grammars. Let's learn about them in detail.

  • Import CFG module, ChartParser and BU_LC_STRATEGY features

  • Create a sample grammar for the example

  • Acquire all the parse trees

Parsing a Chart
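A sketch of chart parsing with the `BU_LC_STRATEGY` named above, using a deliberately ambiguous grammar of our own (the classic PP-attachment ambiguity), so the chart parser returns more than one tree:

```python
# Chart-parse an ambiguous sentence and collect all parse trees.
from nltk import CFG
from nltk.parse.chart import ChartParser, BU_LC_STRATEGY

grammar = CFG.fromstring("""
S -> NP VP
NP -> 'I' | Det N | Det N PP
Det -> 'the'
N -> 'man' | 'telescope'
VP -> V NP | VP PP
V -> 'saw'
PP -> 'with' NP
""")

parser = ChartParser(grammar, BU_LC_STRATEGY)
trees = list(parser.parse("I saw the man with the telescope".split()))
print(len(trees))   # one tree per reading of the PP attachment
```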

Python NLTK has built-in support for Named Entity Recognition (NER). Let’s learn to use inbuilt NERs.

  • Define a new function called sampleNE()

  • Define a function called sampleNE2()

  • Call the two sample functions

Using Inbuilt NERs

Is it possible to print the list of all the words in the sentence that are nouns? Yes, for this, we will learn how to use a Python dictionary.

  • Define a new class called LearningDictionary

  • Create buildDictionary() and buildReverseDictionary()

  • Define getPOSForWord()

Creating, Inversing, and Using Dictionaries

Features are one of the most powerful components of the NLTK library. They represent clues within the language for easy tagging of the data that we are dealing with.

  • Create learnSimpleFeatures()

  • Create learnFeatures()

  • Compare both the functions

Choosing the Feature Set
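The feature-set idea can be sketched with NLTK's `NaiveBayesClassifier` and a single last-letter feature, a classic toy example; the tiny training set is our own:

```python
# Classify names by gender using one simple feature: the last letter.
from nltk import NaiveBayesClassifier

def gender_features(name):
    return {"last_letter": name[-1].lower()}

train = [(gender_features(n), g) for n, g in [
    ("Alice", "female"), ("Maria", "female"), ("Anna", "female"),
    ("John", "male"), ("Mark", "male"), ("Peter", "male"),
]]
classifier = NaiveBayesClassifier.train(train)
print(classifier.classify(gender_features("Julia")))
```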

A natural language that supports question marks (?), full stops (.), and exclamations (!) poses a challenge to us in identifying whether a statement has ended or it still continues after the punctuation characters. Let’s try and solve this classic problem.

  • Define featureExtractor()

  • Create segmentTextAndPrintSentences()

  • Extract all the features from the traindata and store it in traindataset

Segmenting Sentences Using Classification

In previous videos, we wrote regular-expression-based POS taggers that leverage word suffixes. Let's now write a program that leverages the feature extraction concept to find the POS of the words in a sentence.

  • Indicate the dual behavior of the words

  • Define a new function called withContextTagger()

  • Build a featuredata list

Writing a POS Tagger with Context

In computing, a pipeline can be thought of as a multi-phase data flow system where the output from one component is fed to the input of another component.

  • Create a new empty list to keep track of all the threads in the program

  • Define a new function, extractWords()

Creating an NLP Pipeline

The text similarity problem deals with the challenge of finding how close given text documents are.

  • Define an IDF that finds the IDF value

  • Define a TF_IDF

  • Display the contents of vectors

Solving the Text Similarity Problem
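Once documents are represented as vectors (TF-IDF or plain term counts), their closeness is commonly measured with cosine similarity, sketched here in plain Python on toy count vectors of our own:

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Term-count vectors over the toy vocabulary ["cat", "sat", "mat", "dog"].
doc1 = [1, 1, 1, 0]
doc2 = [1, 0, 0, 1]
print(round(cosine_similarity(doc1, doc2), 3))   # 1.0 means identical direction
```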

In many natural languages, while forming sentences, we avoid the repeated use of certain nouns with pronouns to simplify the sentence construction.

  • Define a new class called AnaphoraExample

  • Create a unique list of males and females

  • Create a NaiveBayesClassifier object called _classifier

Resolving Anaphora

In previous videos, we learned how to identify POS of the words, find named entities, and so on. Just like a word in English behaves as both a noun and a verb, finding the sense in which a word is used is very difficult for computer programs.

  • Define a function with the name understandWordSenseExamples()

  • Define a new function, understandBuiltinWSD()

  • Define a new variable called maps

Disambiguating Word Sense

Feedback is one of the most powerful measures for understanding relationships. In order to write computer programs that can measure and find the emotional quotient, we should have some good understanding of the ways these emotions are expressed in these natural languages.

  • Define a new function, wordBasedSentiment()

  • Define sample text to analyze

  • Create multiWordBasedSentiment()

Performing Sentiment Analysis
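The word-based sentiment idea above can be sketched with small hand-made lexicons; the word lists and scoring rule here are our own illustration, not the course's exact lexicon:

```python
# Word-based sentiment: count positive words minus negative words.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "awful"}

def word_based_sentiment(text):
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

print(word_based_sentiment("I love this great course"))   # positive score
print(word_based_sentiment("the demo was terrible"))      # negative score
```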

Let’s write our own sentiment analysis program based on what we have learned in the previous video.

  • Define a new function, mySentimentAnalyzer()

  • Extract the sentences from the variable feedback

Exploring Advanced Sentiment Analysis

Conversational assistants or chatbots are not very new. One of the foremost of this kind is ELIZA, which was created in the early 1960s and is worth exploring. NLTK has a module that simplifies building these engines by providing a generic framework. Let's see that in detail.

  • Define builtinEngines()

  • Create a new function called myEngine()

  • Define a nested tuple data structure

Creating a Conversational Assistant or Chatbot
Test Your Knowledge
4 questions
  • Basic knowledge of NLP and some prior programming experience in Python is assumed. Familiarity with deep learning will be helpful.

Natural Language Processing (NLP) is one of the most interesting subfields of data science. It offers powerful ways to interpret and act on spoken and written language. It's used to help deal with customer support enquiries, analyse how customers feel about a product, and provide intuitive user interfaces. If you wish to build high-performing day-to-day apps by leveraging NLP, this course is for you.

This course teaches you to write applications using one of the popular data science concepts, NLP. You will begin with learning various concepts of natural language understanding, Natural Language Processing, and syntactic analysis. You will learn how to implement text classification, identify parts of speech, tag words, and more. You will also learn how to analyze sentence structures and master syntactic and semantic analysis. You will learn all of these through practical demonstrations, clear explanations, and interesting real-world examples. This course will give you a versatile range of NLP skills, which you will put to work in your own applications.

Contents and Overview

This training program includes 2 complete courses, carefully chosen to give you the most comprehensive training possible.

The first course, Natural Language Processing in Practice, will help you gain NLP skills by practical demonstrations, clear explanations, and interesting real-world examples. It will give you a versatile range of deep learning and NLP skills that you can put to work in your own applications.

The second course, Developing NLP Applications Using NLTK in Python, is designed with advanced solutions that will take you from newbie to pro in performing natural language processing with NLTK. You will come across various concepts covering natural language understanding, natural language processing, and syntactic analysis. It consists of everything you need to efficiently use NLTK to implement text classification, identify parts of speech, tag words, and more. You will also learn how to analyze sentence structures and master syntactic and semantic analysis.

By the end of this course, you will be ready to bring deep learning and NLP techniques to build intelligent systems using NLTK in Python.
Meet Your Expert(s):

We have the best work of the following esteemed author(s) to ensure that your learning journey is smooth:

  • Smail Oubaalla is a talented Software Engineer with an interest in building the most effective, beautiful, and correct piece of software possible. He has helped companies build excellent programs. He also manages projects and has experience in designing and managing new ones. When not on the job, he loves hanging out with friends, hiking, and playing sports (football, basketball, rugby, and more). He also loves working his way through every recipe he can find in the family cookbook or elsewhere, and indulging his love for seeing new places.

  • Krishna Bhavsar has spent around 10 years working on natural language processing, social media analytics, and text mining in various industry domains such as hospitality, banking, healthcare, and more. He has worked on many different NLP libraries such as Stanford CoreNLP, IBM's SystemText and BigInsights, GATE, and NLTK to solve industry problems related to textual analysis. He has also worked on analyzing social media responses for popular television shows and popular retail brands and products. He published a paper on sentiment analysis augmentation techniques at NAACL 2010, and recently created an NLP pipeline/toolset and open sourced it for public use. Apart from academics and technology, Krishna has a passion for motorcycles and football. In his free time, he likes to travel and explore. He has gone on pan-India road trips on his motorcycle and backpacking trips across most of the countries in South East Asia and Europe.

  • Naresh Kumar has more than a decade of professional experience in designing, implementing, and running very-large-scale Internet applications in Fortune Top 500 companies. He is a full-stack architect with hands-on experience in domains such as ecommerce, web hosting, healthcare, big data and analytics, data streaming, advertising, and databases. He believes in open source and contributes to it actively. Naresh keeps himself up-to-date with emerging technologies, from Linux systems internals to frontend technologies. He studied at BITS Pilani, Rajasthan, earning a dual degree in computer science and economics.

  • Pratap Dangeti develops machine learning and deep learning solutions for structured, image, and text data at TCS, in its research and innovation lab in Bangalore. He has acquired a lot of experience in both analytics and data science. He received his master's degree from IIT Bombay in its industrial engineering and operations research program. Pratap is an artificial intelligence enthusiast. When not working, he likes to read about nextgen technologies and innovative methodologies. He is also the author of the book Statistics for Machine Learning by Packt.

Who this course is for:
  • This course is for data science professionals who would like to expand their knowledge from traditional NLP techniques to state-of-the-art techniques in the application of NLP.