Natural Language Processing (NLP) Using NLTK in Python

Rating: 4.1 (8 reviews)

Build smart AI-driven linguistic applications using deep learning and NLP techniques

Created by Packt Publishing

Last updated 4/2019

English

What you'll learn

Attain a strong foundation in Python for deep learning and NLP
Build NLP applications in Python using the Natural Language Toolkit (NLTK)
Get to grips with various NLP techniques to build an intelligent chatbot
Classify text and speech using the Naive Bayes Algorithm
Use various tools and algorithms to build real-world applications
Build solutions such as text similarity, summarization, sentiment analysis and anaphora resolution to get up to speed with new trends in NLP
Write your own POS taggers and grammars so that any syntactic analyses can be performed easily
Use the inbuilt chunker and create your own chunker to evaluate trained models
Create your own named entities using dictionaries to use inbuilt text classification algorithms

Course content

2 sections • 54 lectures • 3h 4m total length

Course Overview (3:44)
This video provides an overview of the entire course.
Setup and Installation (4:26)
How to follow along with the practical steps
Download and install Python
Download and install PyCharm community
Understanding NLP and Its Benefits (5:18)
What is NLP?
Introduction to why NLP was invented
Define NLP
Exploring NLP Tools and Libraries (6:30)
How to get the root of the different terms in order to combine similar terms or concepts
Initialize a stemmer and a lemmatizer
Process your tagged text through them
Check out the lemmas and stems
Tokenization (6:39)
Tokenizing text into sentences or words
Create a tokenizer from NLTK
Process or tokenize your text
Stop Words (5:31)
What are stop words, and how do you filter or remove them to keep only the important terms?
Build a list of stop words
Filter them out from your text
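A minimal sketch of the two steps above, assuming a small hand-built stop-word list. The STOP_WORDS set and the remove_stop_words helper are hypothetical names chosen for illustration; the course may instead rely on NLTK's stopwords corpus, which requires a separate data download.

```python
import re

# Hypothetical hand-built stop-word list (an assumption, not the course's list).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "on", "it"}

def remove_stop_words(text):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]

filtered = remove_stop_words("The cat is on the mat and it is happy")
print(filtered)  # ['cat', 'mat', 'happy']
```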
Part of Speech Tagging (3:42)
Build the lexical structure of your text or sentence
Import a Part of Speech tagger from NLTK
Process or tag the terms in the sentence
Check out the results or tags
Stemming and Lemmatization (4:55)
How to get the root of the different terms in order to combine similar terms or concepts
Initialize a stemmer and a lemmatizer
Process your tagged text through them
Check out the lemmas and stems
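As a rough illustration of the stemming half of this lecture, the sketch below uses NLTK's PorterStemmer, which needs no extra data download. The word list is an assumption made for the example. Lemmatization (for instance with NLTK's WordNetLemmatizer) is left out here because it requires the WordNet corpus to be downloaded first.

```python
from nltk.stem import PorterStemmer

# Stemming strips suffixes so that related word forms share one root.
stemmer = PorterStemmer()
words = ["running", "cats", "connected", "connection"]  # illustrative input
stems = {word: stemmer.stem(word) for word in words}

for word, stem in stems.items():
    print(f"{word} -> {stem}")
```

Note that a stem is not always a dictionary word; it is only a shared key for grouping similar terms.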
Named Entity Recognition (3:24)
How to extract names of people and places
Import a Named Entity recognizer from NLTK
Process your text to extract the existing named entities
TF-IDF (5:28)
Extract keywords from the provided NLTK corpus
Import the corpus
Apply TF-IDF
Check out the top 10 keywords for each document
Introduction to Sentiment Analysis (1:28)
What is Sentiment Analysis?
Definition
Pre-Processing the Dataset (5:52)
Which dataset to use, where to download it, and how to preprocess it
Download the dataset using Keras
Split into train and test data
Word Embeddings (2:12)
What are Word Embeddings?
Define word embeddings
Add a word embeddings layer to our network
Build the Network (1:20)
What other layers should we add? How to build the network
Add two more layers
Compile the network
Train the Model (2:06)
Training the model using the train data
Train the model
Test the Model (1:00)
Test the accuracy of the model
Use test data to test the model
Apply to a Single Input (2:01)
Test the model with a real example
Predict the sentiment of a review
Machine Learning (8:12)
What is Machine Learning?
Define Machine Learning
Applications
Algorithms
Classification (5:15)
What is Classification and Text Classification?
Define Classification
Text Classification
Pre-Processing the Dataset (5:52)
What steps should we follow to pre-process the data?
Load the data
Apply TF-IDF
Naïve Bayes and SVM (1:22)
What are multinomial Naïve Bayes and SVM?
Define multinomial Naïve Bayes
Define SVM
Train the Classifier (3:09)
Build and train the classifier
Train the classifier using pre-processed data
Test the Classifier (2:42)
Testing the classifier
Chatbots (3:05)
What are Chatbots?
Define Chatbots
Introduction to ChatterBot
Simple NLTK Bot (2:27)
NLTK Chatbots
Simple NLTK Chatbot conversation
Create a ChatterBot (3:26)
Creating the first ChatterBot
Install the ChatterBot library
Instantiate a chatbot
Enhancing the Chatbot (1:32)
How to make the Chatbot better?
Add pre-processors
Training the Chatbot (4:39)
Train the bot for more vocabulary
Import the corpus trainer
Train and test using English corpus
Train and test using French corpus
Test Your Knowledge

The Course Overview (3:25)
This video gives an overview of the entire course.
Exploring the In-Built Tagger (2:08)
In this video, we use the Python NLTK library to understand more about the POS tagging features in a given text.
Create a variable called simpleSentence
Invoke the NLTK built-in tokenizer function word_tokenize()
Invoke the NLTK built-in tagger pos_tag()
Writing Your Own Tagger (5:42)
Now, we will explore the NLTK library by writing our own taggers. We’ll write several types of taggers: a default tagger, a regular expression tagger, and a lookup tagger.
Define a new Python function called learnDefaultTagger
Create an object of the DefaultTagger() class
Call the tag() function of the tagger object
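The three tagger types named above can be sketched with NLTK classes that need no corpus download. The token list, the suffix patterns, and the one-entry lookup table are assumptions made for illustration, not the course's own data.

```python
from nltk.tag import DefaultTagger, RegexpTagger, UnigramTagger

tokens = ["Mysore", "is", "running", "quickly"]  # illustrative, already tokenized

# Default tagger: assigns the same tag to every token.
default_tagger = DefaultTagger("NN")
default_tags = default_tagger.tag(tokens)

# Regular expression tagger: the first matching pattern decides the tag.
regexp_tagger = RegexpTagger([
    (r".*ing$", "VBG"),  # gerunds
    (r".*ly$", "RB"),    # adverbs
    (r".*", "NN"),       # fallback: treat everything else as a noun
])
regexp_tags = regexp_tagger.tag(tokens)

# Lookup tagger: a word-to-tag dictionary, backing off to the regexp tagger.
lookup_tagger = UnigramTagger(model={"is": "VBZ"}, backoff=regexp_tagger)
lookup_tags = lookup_tagger.tag(tokens)

print(default_tags)
print(regexp_tags)
print(lookup_tags)
```

Chaining taggers through the backoff argument lets the most specific knowledge (the lookup table) win while simpler rules cover unknown words.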
Training Your Own Tagger (3:05)
Next, let’s learn how to train our own tagger and save the trained model to disk so that we can use it later for further computations.
Define a function called sampleData()
Define a function called buildDictionary()
Build an nltk.UnigramTagger() object
Learning to Write Your Own Grammar (1:55)
This video will teach us how to define a grammar and understand production rules.
Import the generate function from nltk.parse.generate
Define a new grammar
Create a new grammar object using nltk.CFG.fromstring()
Writing a Probabilistic CFG (2:28)
A probabilistic CFG is a special type of CFG in which the probabilities of all production rules that share the same left-hand-side non-terminal must sum to one. Let's write a simple example to understand more.
Identify tokens in the grammar
Join the list of all the production rules into a string
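A small sketch of these steps, assuming a toy grammar invented for illustration. The rules are kept in a list, joined into one string, and passed to NLTK's PCFG.fromstring; the loop then checks the sum-to-one property for each left-hand side.

```python
from collections import defaultdict
from nltk.grammar import PCFG

# Toy production rules (an assumption); each rule carries its probability.
productions = [
    "S -> NP VP [1.0]",
    "NP -> 'she' [0.5] | 'he' [0.5]",
    "VP -> 'runs' [0.75] | 'sleeps' [0.25]",
]

# Join the list of production rules into a single grammar string.
grammar_string = "\n".join(productions)
grammar = PCFG.fromstring(grammar_string)

# Sum the probabilities of the rules that share each left-hand side.
totals = defaultdict(float)
for production in grammar.productions():
    totals[str(production.lhs())] += production.prob()

print(dict(totals))
```

PCFG.fromstring itself rejects a grammar whose probabilities for a left-hand side do not sum to one, so the explicit check is only there to make the constraint visible.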
Writing a Recursive CFG (2:10)
Recursive CFGs are a special type of CFG in which a non-terminal on the left-hand side of a production rule also appears on its right-hand side. Palindromes are a classic example of a recursive CFG.
Create a new list data structure called productions
Add production rules that define palindromes
Pass the newly constructed grammarString to the NLTK built-in nltk.CFG.fromstring function
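The palindrome idea can be sketched as follows, assuming a two-letter alphabet chosen for illustration. The names productions and grammar_string mirror the steps above, but the rules themselves are an assumption rather than the course's exact grammar.

```python
from nltk import CFG
from nltk.parse.generate import generate

# S appears on both sides of the first two rules, which makes the grammar recursive.
productions = [
    "S -> 'a' S 'a'",
    "S -> 'b' S 'b'",
    "S -> 'a' 'a' | 'b' 'b'",
    "S -> 'a' | 'b'",
]
grammar_string = "\n".join(productions)
grammar = CFG.fromstring(grammar_string)

# Limit the derivation depth so the recursive rules do not expand forever.
palindromes = ["".join(tokens) for tokens in generate(grammar, depth=4)]

for text in palindromes:
    print(text)
```

Every generated string reads the same forwards and backwards because each recursive rule adds the same symbol at both ends.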
Using the Built-In Chunker (2:02)
In this video, we will learn how to use the in-built chunker, drawing on several NLTK features along the way.
Add string to a variable called text
Break the given text into multiple sentences
Do POS analysis using the default tagger
Writing Your Own Simple Chunker (2:13)
Now that we know how to use the built-in chunker, in this video, we will write our own Regex chunker.
Write regular expressions
Understand tag patterns
Identify chunks
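A minimal sketch of a regex chunker built with NLTK's RegexpParser. To keep the example free of data downloads, the sentence is tagged by hand; both the sentence and the NP tag pattern are assumptions made for illustration.

```python
from nltk import RegexpParser

# Hand-tagged sentence (an assumption; normally a POS tagger would produce this).
tagged = [
    ("the", "DT"), ("little", "JJ"), ("dog", "NN"),
    ("barked", "VBD"), ("at", "IN"),
    ("the", "DT"), ("cat", "NN"),
]

# Tag pattern: an optional determiner, any number of adjectives, then a noun.
chunker = RegexpParser("NP: {<DT>?<JJ>*<NN>}")
tree = chunker.parse(tagged)

# Collect the words of every NP chunk found in the tree.
noun_phrases = [
    " ".join(word for word, tag in subtree.leaves())
    for subtree in tree.subtrees()
    if subtree.label() == "NP"
]
print(noun_phrases)  # ['the little dog', 'the cat']
```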
Training a Chunker (2:24)
In this video, we will learn the training process, training our own chunker, and evaluating it.
Import the conll2000 corpus and treebank corpus
Define a new function, mySimpleChunker()
Create a list of two datasets
Parsing Recursive Descent (1:40)
Recursive descent parsers belong to the family of parsers that read the input from left to right, build the parse tree top-down, and traverse nodes in pre-order.
Define a new function, RDParserExample
Iterate over the list of sentences in the textlist variable
Create a new CFG object using grammar
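A short sketch of top-down parsing with NLTK's RecursiveDescentParser. The grammar and the sentence are toy assumptions, deliberately free of left recursion, which a recursive descent parser cannot handle.

```python
from nltk import CFG
from nltk.parse import RecursiveDescentParser

# Toy grammar (an assumption) with no left-recursive rules.
grammar = CFG.fromstring("""
S -> NP VP
NP -> 'Bob' | 'Mary'
VP -> V NP
V -> 'likes' | 'sees'
""")

parser = RecursiveDescentParser(grammar)
sentence = "Bob likes Mary".split()

# parse() yields every tree the grammar allows for the token list.
trees = list(parser.parse(sentence))
for tree in trees:
    print(tree)
```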
Parsing Shift-Reduce (1:42)
In this video, we will learn to use and understand shift-reduce parsing.
Define a new function, SRParserExample
Iterate over the list of sentences in the textlist variable
Define two sample sentences to understand the shift-reduce parser
Parsing Dependency Grammar and Projective Dependency (1:38)
We will now learn how to parse dependency grammar and use it with the projective dependency parser.
Create a grammar object using the nltk.grammar.DependencyGrammar class
Define the sample sentence on which the parser will be run
Parsing a Chart (2:56)
Chart parsers are special types of parsers which are suitable for natural languages as they have ambiguous grammars. Let’s learn about them in detail.
Import CFG, ChartParser, and BU_LC_STRATEGY
Create a sample grammar for the example
Acquire all the parse trees
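To make the point about ambiguity concrete, here is a sketch using NLTK's ChartParser with the bottom-up left-corner strategy. The grammar and sentence are assumptions built around the well-known prepositional-phrase attachment ambiguity; they are not taken from the course.

```python
from nltk import CFG
from nltk.parse.chart import ChartParser, BU_LC_STRATEGY

# Ambiguous toy grammar: the PP can attach to the noun phrase or the verb phrase.
grammar = CFG.fromstring("""
S -> NP VP
PP -> P NP
NP -> 'I' | Det N | NP PP
VP -> V NP | VP PP
Det -> 'the' | 'a'
N -> 'man' | 'telescope'
V -> 'saw'
P -> 'with'
""")

parser = ChartParser(grammar, BU_LC_STRATEGY)
tokens = "I saw the man with a telescope".split()

# Acquire all the parse trees; an ambiguous sentence yields more than one.
trees = list(parser.parse(tokens))
for tree in trees:
    print(tree)
```

The chart stores partial results, so the parser can return every reading without re-deriving shared sub-phrases.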
Using Inbuilt NERs (2:09)
Python NLTK has built-in support for Named Entity Recognition (NER). Let’s learn to use inbuilt NERs.
Define a new function called sampleNE()
Define a function called sampleNE2()
Call the two sample functions
Creating, Inversing, and Using Dictionaries (3:44)
Is it possible to print the list of all the words in the sentence that are nouns? Yes, for this, we will learn how to use a Python dictionary.
Define a new class called LearningDictionary
Create buildDictionary() and buildReverseDictionary()
Define getPOSForWord()
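One way the dictionary idea could look in plain Python is sketched below. The class name follows the outline above, but the method bodies, the snake_case method names, and the hand-tagged sentence are assumptions; the course's own implementation may differ.

```python
from collections import defaultdict

class LearningDictionary:
    """Maps words to POS tags and, in reverse, POS tags to words."""

    def __init__(self, tagged_words):
        self.tagged_words = tagged_words
        self.word_to_pos = self.build_dictionary()
        self.pos_to_words = self.build_reverse_dictionary()

    def build_dictionary(self):
        # Forward dictionary: word -> tag (a later occurrence overwrites an earlier one).
        return {word: tag for word, tag in self.tagged_words}

    def build_reverse_dictionary(self):
        # Reverse dictionary: tag -> list of words carrying that tag.
        reverse = defaultdict(list)
        for word, tag in self.tagged_words:
            reverse[tag].append(word)
        return dict(reverse)

    def get_pos_for_word(self, word):
        return self.word_to_pos.get(word)

    def get_words_for_pos(self, tag):
        return self.pos_to_words.get(tag, [])

# Hand-tagged sentence (an assumption; normally produced by a POS tagger).
tagged = [("Mysore", "NNP"), ("is", "VBZ"), ("a", "DT"), ("beautiful", "JJ"),
          ("city", "NN"), ("with", "IN"), ("a", "DT"), ("palace", "NN")]
learner = LearningDictionary(tagged)
nouns = learner.get_words_for_pos("NN")
print(nouns)  # ['city', 'palace']
```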
Choosing the Feature Set (3:48)
Features are one of the most powerful components of the NLTK library. They represent clues within the language for easy tagging of the data that we are dealing with.
Create learnSimpleFeatures()
Create learnFeatures()
Compare both the functions
Segmenting Sentences Using Classification (2:31)
A natural language that supports question marks (?), full stops (.), and exclamations (!) poses a challenge to us in identifying whether a statement has ended or it still continues after the punctuation characters. Let’s try and solve this classic problem.
Define featureExtractor()
Create segmentTextAndPrintSentences()
Extract all the features from the traindata and store it in traindataset
Writing a POS Tagger with Context (2:28)
In previous videos, we wrote regular-expression-based POS taggers that leverage word suffixes. Now let’s write a program that uses feature extraction to find the POS of the words in a sentence.
Indicate the dual behavior of the words
Define a new function called withContextTagger()
Build a featuredata list
Creating an NLP Pipeline (7:04)
In computing, a pipeline can be thought of as a multi-phase data flow system where the output from one component is fed to the input of another component.
Create a new empty list to keep track of all the threads in the program
Define a new function, extractWords()
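The multi-phase data flow described above can be sketched with threads connected by queues. Everything here is an assumption made for illustration: the stage functions, the SENTINEL end-of-stream marker, and the sample text are hypothetical and stand in for whatever stages the course builds.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-stream marker passed between stages

def extract_words(text, out_queue):
    """Stage 1: split the text into words and feed them to the next stage."""
    for word in text.split():
        out_queue.put(word)
    out_queue.put(SENTINEL)

def normalize_words(in_queue, out_queue):
    """Stage 2: strip punctuation and lowercase each word."""
    while True:
        word = in_queue.get()
        if word is SENTINEL:
            out_queue.put(SENTINEL)
            break
        out_queue.put(word.strip(".,!?").lower())

def collect_words(in_queue, results):
    """Stage 3: gather the processed words into a list."""
    while True:
        word = in_queue.get()
        if word is SENTINEL:
            break
        results.append(word)

raw_queue, clean_queue = queue.Queue(), queue.Queue()
results = []

# Keep track of all the threads so they can be started and joined together.
threads = [
    threading.Thread(target=extract_words, args=("Hello, world! This is a pipeline.", raw_queue)),
    threading.Thread(target=normalize_words, args=(raw_queue, clean_queue)),
    threading.Thread(target=collect_words, args=(clean_queue, results)),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(results)  # ['hello', 'world', 'this', 'is', 'a', 'pipeline']
```

Because each stage has a single producer and a single consumer, the queues preserve word order from one end of the pipeline to the other.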
Solving the Text Similarity Problem (4:02)
The text similarity problem deals with the challenge of finding how close given text documents are.
Define an IDF that finds the IDF value
Define a TF_IDF
Display the contents of vectors
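A compact sketch of TF-IDF vectors compared with cosine similarity, written in plain Python. The three documents and the particular IDF variant, log(N / df) + 1, are assumptions; several TF-IDF weightings exist and the course may use a different one.

```python
import math

# Illustrative corpus (an assumption): two similar documents and one unrelated.
documents = [
    "the cat sat on the mat",
    "the cat lay on the rug",
    "stock prices fell sharply today",
]
corpus = [document.split() for document in documents]
vocabulary = sorted({term for tokens in corpus for term in tokens})

def tf(term, tokens):
    """Term frequency: share of the document's tokens equal to the term."""
    return tokens.count(term) / len(tokens)

def idf(term, corpus):
    """Inverse document frequency, using the log(N / df) + 1 variant."""
    document_frequency = sum(1 for tokens in corpus if term in tokens)
    return math.log(len(corpus) / document_frequency) + 1.0

def tf_idf_vector(tokens, corpus, vocabulary):
    return [tf(term, tokens) * idf(term, corpus) for term in vocabulary]

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

vectors = [tf_idf_vector(tokens, corpus, vocabulary) for tokens in corpus]
similar = cosine_similarity(vectors[0], vectors[1])
unrelated = cosine_similarity(vectors[0], vectors[2])
print(f"cat/mat vs cat/rug: {similar:.3f}")
print(f"cat/mat vs stocks:  {unrelated:.3f}")
```

Documents that share no terms get a similarity of zero, while documents with overlapping vocabulary score between zero and one.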
Resolving Anaphora (3:35)
In many natural languages, while forming sentences, we avoid repeating certain nouns by replacing them with pronouns to simplify the sentence construction.
Define a new class called AnaphoraExample
Create a unique list of males and females
Create a NaiveBayesClassifier object called _classifier
Disambiguating Word Sense (2:44)
In previous videos, we learned how to identify POS of the words, find named entities, and so on. Because an English word can behave as both a noun and a verb, finding the sense in which a word is used is very difficult for computer programs.
Define a function with the name understandWordSenseExamples()
Define a new function, understandBuiltinWSD()
Define a new variable called maps
Performing Sentiment Analysis (3:01)
Feedback is one of the most powerful measures for understanding relationships. In order to write computer programs that can measure and find the emotional quotient, we should have some good understanding of the ways these emotions are expressed in these natural languages.
Define a new function, wordBasedSentiment()
Define sample text to analyze
Create multiWordBasedSentiment()
Exploring Advanced Sentiment Analysis (3:04)
Let’s write our own sentiment analysis program based on what we have learned in the previous video.
Define a new function, mySentimentAnalyzer()
Extract the sentences from the variable feedback
Creating a Conversational Assistant or Chatbot (3:57)
Conversational assistants or chatbots are not very new. One of the earliest of this kind is ELIZA, which was created in the mid-1960s and is worth exploring. NLTK has a module, nltk.chat, which simplifies building these engines by providing a generic framework. Let’s see that in detail.
Define builtinEngines()
Create a new function called myEngine()
Define a nested tuple data structure
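A minimal sketch of a custom engine built on nltk.chat, assuming a handful of made-up patterns. The nested tuple pairs each regular expression with a tuple of candidate replies; the patterns, the replies, and the variable names are illustrative assumptions rather than the course's engine.

```python
from nltk.chat.util import Chat, reflections

# Nested tuple: (pattern, (possible responses, ...)). %1 inserts the first captured group.
pairs = (
    (r"hello|hi", ("Hello! How can I help you?",)),
    (r"my name is (.*)", ("Nice to meet you, %1.",)),
    (r"(.*)", ("Tell me more.",)),  # catch-all fallback
)

bot = Chat(pairs, reflections)

greeting_reply = bot.respond("hello")
name_reply = bot.respond("my name is Alice")
fallback_reply = bot.respond("The weather is odd today")

print(greeting_reply)
print(name_reply)
print(fallback_reply)
```

Patterns are tried in order, so the catch-all rule goes last. Calling bot.converse() instead of respond() starts an interactive session on the console.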
Test Your Knowledge

Requirements

Basic knowledge of NLP and some prior programming experience in Python is assumed. Familiarity with deep learning will be helpful.

Description

Natural Language Processing (NLP) is the most interesting subfield of data science. It offers powerful ways to interpret and act on spoken and written language. It’s used to help deal with customer support enquiries, analyse how customers feel about a product, and provide intuitive user interfaces. If you wish to build high performing day-to-day apps by leveraging NLP, then go for this course.

This course teaches you to write applications using one of the popular data science concepts, NLP. You will begin with learning various concepts of natural language understanding, Natural Language Processing, and syntactic analysis. You will learn how to implement text classification, identify parts of speech, tag words, and more. You will also learn how to analyze sentence structures and master syntactic and semantic analysis. You will learn all of these through practical demonstrations, clear explanations, and interesting real-world examples. This course will give you a versatile range of NLP skills, which you will put to work in your own applications.

Contents and Overview

This training program includes 2 complete courses, carefully chosen to give you the most comprehensive training possible.

The first course, Natural Language Processing in Practice, will help you gain NLP skills by practical demonstrations, clear explanations, and interesting real-world examples. It will give you a versatile range of deep learning and NLP skills that you can put to work in your own applications.

The second course, Developing NLP Applications Using NLTK in Python, is designed with advanced solutions that will take you from newbie to pro in performing natural language processing with NLTK. You will come across various concepts covering natural language understanding, natural language processing, and syntactic analysis. It consists of everything you need to efficiently use NLTK to implement text classification, identify parts of speech, tag words, and more. You will also learn how to analyze sentence structures and master syntactic and semantic analysis.

By the end of this course, you will be ready to apply deep learning and NLP techniques to build intelligent systems using NLTK in Python.

Meet Your Expert(s):

We have the best work of the following esteemed author(s) to ensure that your learning journey is smooth:

Smail Oubaalla is a talented Software Engineer with an interest in building the most effective, beautiful, and correct piece of software possible. He has helped companies build excellent programs. He also manages projects and has experience in designing and managing new ones. When not on the job, he loves hanging out with friends, hiking, and playing sports (football, basketball, rugby, and more). He also loves working his way through every recipe he can find in the family cookbook or elsewhere, and indulging his love for seeing new places.
Krishna Bhavsar has spent around 10 years working on natural language processing, social media analytics, and text mining in various industry domains such as hospitality, banking, healthcare, and more. He has worked with many different NLP libraries such as Stanford CoreNLP, IBM's SystemText and BigInsights, GATE, and NLTK to solve industry problems related to textual analysis. He has also worked on analyzing social media responses for popular television shows and popular retail brands and products. He published a paper on sentiment analysis augmentation techniques at NAACL 2010. He recently created an NLP pipeline/toolset and open sourced it for public use. Apart from academics and technology, Krishna has a passion for motorcycles and football. In his free time, he likes to travel and explore. He has gone on pan-India road trips on his motorcycle and backpacking trips across most of the countries in South East Asia and Europe.
Naresh Kumar has more than a decade of professional experience in designing, implementing, and running very-large-scale Internet applications in Fortune Top 500 companies. He is a full-stack architect with hands-on experience in domains such as ecommerce, web hosting, healthcare, big data and analytics, data streaming, advertising, and databases. He believes in open source and contributes to it actively. Naresh keeps himself up-to-date with emerging technologies, from Linux systems internals to frontend technologies. He studied at BITS-Pilani, Rajasthan, earning a dual degree in computer science and economics.
Pratap Dangeti develops machine learning and deep learning solutions for structured, image, and text data at TCS, in its research and innovation lab in Bangalore. He has acquired a lot of experience in both analytics and data science. He received his master's degree from IIT Bombay in its industrial engineering and operations research program. Pratap is an artificial intelligence enthusiast. When not working, he likes to read about nextgen technologies and innovative methodologies. He is also the author of the book Statistics for Machine Learning by Packt.

Who this course is for:

This course is for data science professionals who would like to expand their knowledge from traditional NLP techniques to state-of-the-art techniques in the application of NLP.

Course content

Natural Language Processing in Practice • 28 lectures • 1hr 47min

Developing NLP Applications Using NLTK in Python • 26 lectures • 1hr 18min