Natural Language Processing with Python: 3-in-1

Build solutions to get up to speed with new trends in NLP. Three complete courses in one comprehensive training program.
3.5 (21 ratings)
123 students enrolled
Created by Packt Publishing
Last updated 8/2018
English
English [Auto]
This course includes
  • 4.5 hours on-demand video
  • 1 downloadable resource
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What you'll learn
  • Discover how to create frequency distributions on your text with NLTK
  • Build your own movie review sentiment application in Python
  • Import and access an external corpus, and explore the frequency distribution of the text in the corpus file
  • Perform tokenization, stemming, lemmatization, spelling corrections, stop words removals, and more
  • Build solutions such as text similarity, summarization, sentiment analysis and anaphora resolution to get up to speed with new trends in NLP
  • Use dictionaries to create your own named entities using this easy-to-follow guide
Course content
72 lectures 04:29:02
+ Natural Language Processing with Python
18 lectures 01:47:13

This video provides an overview of the entire course.

Preview 03:35
This video will describe what software we will need to get started with the course and will demonstrate how to download, install, and set up the NLTK library.
Installing and Setting Up NLTK
06:31
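
For reference, setup usually amounts to just a couple of commands; a minimal sketch (the "book" collection bundles the example texts, corpora, and models used throughout the course):

    # Install from a shell first:  pip install nltk
    import nltk
    # Download the "book" collection: example texts, corpora, and models
    nltk.download('book')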

This video will demonstrate how to open up the Jupyter Notebook programming environment and introduce you to basic commands. We’ll begin by importing the NLTK library and exploring some of the books and corpora that are included as native datasets.

Implementing Simple NLP Tasks and Exploring NLTK Libraries
09:05

This video will introduce Part-Of-Speech tagging, describe the motivation for its use, and explore various examples to explain how it can be done using NLTK.

Part-Of-Speech Tagging
08:38
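
As a taste of what the lesson covers, a minimal sketch (assuming the punkt and averaged_perceptron_tagger resources have been downloaded):

    import nltk

    tokens = nltk.word_tokenize("NLTK makes part-of-speech tagging easy.")
    # pos_tag returns (token, tag) pairs using the Penn Treebank tagset
    print(nltk.pos_tag(tokens))
    # e.g. [('NLTK', 'NNP'), ('makes', 'VBZ'), ...]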

This video will introduce stemming and lemmatization, describe the motivation for their use, and explore various examples to explain how they can be done using NLTK.

Stemming and Lemmatization
09:32
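
A minimal sketch of the two techniques side by side (assuming the wordnet resource has been downloaded; the example words are illustrative):

    from nltk.stem import PorterStemmer, WordNetLemmatizer

    stemmer = PorterStemmer()
    lemmatizer = WordNetLemmatizer()
    print(stemmer.stem("running"))                   # 'run'   (crude suffix stripping)
    print(lemmatizer.lemmatize("mice"))              # 'mouse' (dictionary lookup)
    print(lemmatizer.lemmatize("running", pos="v"))  # 'run'   (needs a POS hint)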

This video will introduce named entity recognition, describe the motivation for its use, and explore various examples to explain how it can be done using NLTK.

Named Entity Recognition
07:30
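
A minimal sketch of NER in NLTK (assuming the maxent_ne_chunker and words resources have been downloaded; the sentence is illustrative):

    import nltk

    sent = "Apple is headquartered in Cupertino, California."
    tagged = nltk.pos_tag(nltk.word_tokenize(sent))
    # ne_chunk wraps recognized entities (PERSON, GPE, ORGANIZATION, ...) in subtrees
    print(nltk.ne_chunk(tagged))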

This video will describe what a frequency distribution is and how we can create one using NLTK.

Frequency Distribution with NLTK
04:56

This video will build on the previous lesson and demonstrate how to create some sample text, produce a cumulative frequency plot, and introduce related topics, including hapaxes and text searches with conditional statements.

Frequency Distribution on Your Text with NLTK
06:13
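
A minimal sketch covering both of these frequency-distribution lessons (text1 is Moby Dick, one of the example texts bundled with nltk.book; the plot assumes matplotlib is installed):

    import nltk
    from nltk.book import text1

    fdist = nltk.FreqDist(text1)
    print(fdist.most_common(10))       # the ten most frequent tokens
    print(fdist.hapaxes()[:10])        # words that occur exactly once
    fdist.plot(50, cumulative=True)    # cumulative frequency plot of the top 50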

This video will introduce the student to the concordance function, explain why it is important in the context of NLP, and demonstrate how to create a concordance using the NLTK library.

Concordance Function in NLTK
04:06
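
For reference, a one-line example using the classic nltk.book text:

    from nltk.book import text1

    # Every occurrence of "monstrous" in Moby Dick, with surrounding context
    text1.concordance("monstrous")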

This video will introduce the similar function, explain why it is important in the context of NLP, and demonstrate how to identify similar words using the NLTK library.

Similar Function in NLTK
03:33

This video will introduce the dispersion plot function, explain why it is important in the context of NLP, and demonstrate how to create a dispersion plot using the NLTK library.

Dispersion Plot Function in NLTK
04:15

This video will introduce the count function, explain why it is important in the context of NLP, and demonstrate how to count tokens using the NLTK library.

Count Function in NLTK
04:44
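
A compact sketch covering these three Text utilities in one go (text4 is the inaugural-address corpus bundled with nltk.book; the plot needs matplotlib; the chosen words are illustrative):

    from nltk.book import text4

    text4.similar("freedom")       # words that appear in similar contexts
    text4.dispersion_plot(["citizens", "freedom", "democracy"])  # positions in the text
    print(text4.count("America"))  # raw token count (case-sensitive)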

This video will introduce recurrent neural networks and the long short-term memory (LSTM) architecture. We’ll also learn about the motivation behind their use in the context of NLP.

Introduction to Recurrent Neural Network and Long Short Term Memory
03:54

This video walks through a step-by-step tutorial showing how to construct your own sentiment classifier.

Programming Your Own Sentiment Classifier Using NLTK
04:04

This video will finish constructing our Deep Learning classifier using Keras and we’ll train it to make predictions on the IMDB movie rating dataset. We’ll then create a performance metric and use it to demonstrate how well our classifier predicts the positive or negative sentiment classes.

Perform Sentiment Classification on a Movie Rating Dataset
06:46
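
A minimal sketch of the kind of model this lesson builds (assuming TensorFlow’s bundled Keras; the vocabulary size, sequence length, and layer sizes are illustrative, not the course’s exact settings):

    from tensorflow.keras.datasets import imdb
    from tensorflow.keras.preprocessing.sequence import pad_sequences
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Embedding, LSTM, Dense

    # Keep the 5,000 most frequent words; pad reviews to a fixed length
    (x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=5000)
    x_train = pad_sequences(x_train, maxlen=200)
    x_test = pad_sequences(x_test, maxlen=200)

    model = Sequential([
        Embedding(5000, 32),
        LSTM(64),
        Dense(1, activation="sigmoid"),   # positive vs. negative sentiment
    ])
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=2, batch_size=64,
              validation_data=(x_test, y_test))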

This section introduces latent semantic analysis and explains how it can be used to classify text datasets. We begin the LSA example by importing the native NLTK Reuters dataset. Then we introduce and implement a technique to create a weighted vectorization of the text dataset in preparation for more advanced analysis like clustering and classification.

Starting with Latent Semantic Analysis
05:54
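
A minimal sketch of the vectorization step (scikit-learn’s TfidfVectorizer is shown as one common way to build the weighted matrix; the course’s own implementation may differ; assumes the reuters corpus has been downloaded):

    from nltk.corpus import reuters
    from sklearn.feature_extraction.text import TfidfVectorizer

    # One raw-text document per Reuters file (first 500 for brevity)
    docs = [reuters.raw(fid) for fid in reuters.fileids()[:500]]
    vectorizer = TfidfVectorizer(stop_words="english")
    X = vectorizer.fit_transform(docs)   # sparse (documents x terms) TF-IDF matrix
    print(X.shape)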

This section introduces the concept of dimensionality reduction and explains why it is used in the context of latent semantic analysis. An example problem is then worked out by importing the native NLTK Reuters dataset and performing dimensionality reduction using principal component analysis.

Programming Example of Principal Component Analysis
06:32

This section introduces singular value decomposition and explains how it underpins dimensionality reduction in latent semantic analysis. An example problem is then worked out by importing the native NLTK Reuters dataset and performing dimensionality reduction using singular value decomposition.

Programming Example of Singular Value Decomposition
07:25
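
Building on the TF-IDF matrix X from the sketch above, truncated SVD is the standard way to perform this reduction on sparse text data (the component count is illustrative):

    from sklearn.decomposition import TruncatedSVD

    svd = TruncatedSVD(n_components=100)         # keep 100 latent dimensions
    X_reduced = svd.fit_transform(X)
    print(X_reduced.shape)                       # (n_documents, 100)
    print(svd.explained_variance_ratio_.sum())   # fraction of variance retained
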
+ Text Processing Using NLTK in Python
28 lectures 01:24:16

This video gives an overview of the entire course.

Preview 03:22

Our first task involves learning how to access any one of these corpora. We have decided to run some tests on the Reuters corpus.

Accessing In-Built Corpora
04:07

Now that we have learned how to load and access an inbuilt corpus, we will learn how to download and also how to load and access any external corpus.

Downloading an External Corpus
03:33

The objective of this video is to get you to perform a simple counting task on any given corpus.

Counting All the wh-words
03:43
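
A minimal sketch of such a counting task (the corpus and word list are my choices for illustration):

    import nltk
    from nltk.corpus import webtext

    wh_words = {"what", "when", "where", "which", "who", "whom", "whose", "why", "how"}
    words = (w.lower() for w in webtext.words("firefox.txt"))
    fdist = nltk.FreqDist(w for w in words if w in wh_words)
    for word, count in fdist.most_common():
        print(word, count)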

The web and chat text corpus is non-formal literature that, as the name implies, contains content from Firefox discussion forums, movie scripts, wine reviews, personal advertisements, and overheard conversations. Our objective in this video is to understand the use of frequency distributions and their features/functions.

Frequency Distribution Operations
02:40

From this video onwards, we will turn our attention to WordNet. As you can read in the title, we are going to explore what word sense is.

WordNet
03:09

A hyponym is a word with a more specific meaning than a more generic word, such as bat, which we explored in the introduction section of our previous video.

The Concepts of Hyponyms and Hypernyms Using WordNet
03:39
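
A minimal sketch of navigating the WordNet hierarchy (synset names follow NLTK’s word.pos.nn convention; the chosen synsets are illustrative):

    from nltk.corpus import wordnet as wn

    bat = wn.synset("bat.n.01")    # the flying-mammal sense of "bat"
    print(bat.definition())
    print(bat.hypernyms())         # more general concepts
    dog = wn.synset("dog.n.01")
    print([h.name() for h in dog.hyponyms()][:5])   # more specific kinds of dog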

This video is different from previous videos. Rather than just discovering an API concept, we are going to discover a linguistic concept here.

Compute the Average Polysemy According to WordNet
03:28

As an NLP expert, you are going to work on a lot of textual content. And when you are working with text, you must know string operations. We are going to start with a couple of short and crisp recipes that will help you understand the str class and operations with it in Python.

The Importance of String Operations
03:09

Moving ahead from the previous video, we will see substrings, string replacements, and how to access all the characters of a string.

Getting Deeper with String Operations
02:58

We start off with a small video for accessing PDF files from Python. For this, you need to install the PyPDF2 library.

Reading a PDF File in Python
02:54
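
A minimal sketch using the current PyPDF2 interface (the video, dating from 2018, likely uses the older PdfFileReader API; the file name is hypothetical):

    from PyPDF2 import PdfReader   # pip install PyPDF2

    reader = PdfReader("sample.pdf")
    print(len(reader.pages))                 # number of pages
    print(reader.pages[0].extract_text())    # text of the first page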

In this video, we will see how to load and read Word/DOCX documents. The libraries available for reading DOCX Word documents are more comprehensive, in that we can also see paragraph boundaries, text styles, and work with what are called runs.

Reading Word Documents in Python
03:56
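
A minimal sketch with the python-docx library (the file name is hypothetical):

    from docx import Document   # pip install python-docx

    doc = Document("sample.docx")
    for para in doc.paragraphs:
        print(para.text)
        for run in para.runs:    # runs carry styling such as bold or italic
            print("  run:", run.text, "bold =", run.bold)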

For this video, we are not going to use anything new in terms of libraries or concepts. We are reinvoking the concept of a corpus from the first section, except that we are now going to create our own corpus instead of using one we got from the Internet.

Creating a User-Defined Corpus
04:30
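
A minimal sketch of wrapping your own text files as an NLTK corpus (the directory and file pattern are hypothetical):

    from nltk.corpus import PlaintextCorpusReader

    # Treat every .txt file under ./my_corpus as part of one corpus
    corpus = PlaintextCorpusReader("my_corpus", r".*\.txt")
    print(corpus.fileids())
    print(corpus.words()[:20])   # the corpus now supports words(), sents(), raw(), ...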

The objective of this video is to read an RSS feed and access the content of one of the posts from that feed. For this purpose, we will be using the RSS feed of Mashable.

Reading Contents from an RSS Feed
02:49
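
A minimal sketch with the feedparser library (the feed URL is illustrative and may have changed since the course was recorded):

    import feedparser   # pip install feedparser

    feed = feedparser.parse("https://mashable.com/feeds/rss/all")
    post = feed.entries[0]
    print(post.title)
    print(post.summary)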

Most of the time, when you have to deal with data on the Web, it will be in the form of HTML pages. For this reason, we thought it necessary to introduce you to HTML parsing in Python.

HTML Parsing Using BeautifulSoup
03:50
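
A minimal sketch of HTML parsing with BeautifulSoup (the HTML snippet is made up for illustration):

    from bs4 import BeautifulSoup   # pip install beautifulsoup4

    html = "<html><body><h1>Title</h1><p>First <a href='/x'>link</a>.</p></body></html>"
    soup = BeautifulSoup(html, "html.parser")
    print(soup.h1.text)          # 'Title'
    print(soup.find_all("a"))    # every anchor tag
    print(soup.get_text())       # all visible text, markup stripped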

Understand the meaning of tokenization, why we need it, and how to do it.

Tokenization – Learning to Use the Inbuilt Tokenizers of NLTK
02:51

Let's understand the concept of a stem and the process of stemming. We will learn why we need to do it and how to perform it using inbuilt NLTK stemming classes.

Stemming – Learning to Use the Inbuilt Stemmers of NLTK
02:28

Understand what lemmas and lemmatization are. Learn how lemmatization differs from stemming, why we need it, and how to perform it using the NLTK library's WordNetLemmatizer.

Lemmatization – Learning to Use the WordNetLemmatizer of NLTK
02:20

We will be using the Gutenberg corpus as an example in this recipe. The Gutenberg corpus is part of the NLTK data module. It contains a selection of 18 texts from some 25,000 electronic books in the Project Gutenberg text archives.

Stopwords – Learning to Use the Stopwords Corpus
03:14
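
A minimal sketch of stopword removal on a Gutenberg text (the chosen file is illustrative; assumes the stopwords and gutenberg resources have been downloaded):

    from nltk.corpus import gutenberg, stopwords

    stops = set(stopwords.words("english"))
    words = [w.lower() for w in gutenberg.words("austen-emma.txt") if w.isalpha()]
    content = [w for w in words if w not in stops]
    print(len(words), len(content))   # token count before and after removal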

Edit distance, also called Levenshtein distance, is a metric used to measure the similarity between two strings.

Edit Distance – Writing Your Own Algorithm to Find Edit Distance Between Two Strings
02:48
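
One classic dynamic-programming (Wagner-Fischer) way to write it yourself; a sketch, not necessarily the course’s exact code (NLTK also ships a ready-made nltk.edit_distance):

    def edit_distance(s1, s2):
        # dp[i][j] = cost of turning s1[:i] into s2[:j]
        dp = [[0] * (len(s2) + 1) for _ in range(len(s1) + 1)]
        for i in range(len(s1) + 1):
            dp[i][0] = i                              # i deletions
        for j in range(len(s2) + 1):
            dp[0][j] = j                              # j insertions
        for i in range(1, len(s1) + 1):
            for j in range(1, len(s2) + 1):
                cost = 0 if s1[i - 1] == s2[j - 1] else 1
                dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                               dp[i][j - 1] + 1,          # insertion
                               dp[i - 1][j - 1] + cost)   # substitution
        return dp[-1][-1]

    print(edit_distance("intention", "execution"))   # 5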

This video is supposed to give you an idea of how to handle a typical text analytics problem when you come across it.

Processing Two Short Stories and Extracting the Common Vocabulary
02:38

We start off with a video that will elaborate on the use of the *, +, and ? operators in regular expressions.

Regular Expression – Learning to Use *, +, and ?
03:23

The starts-with (^) and ends-with ($) operators are used to match a given pattern at the start or end of an input text. Let’s study them in detail.

Regular Expression – Learning to Use Non-Start and Non-End of Word
03:20

In this video, we shall run some iterative functions with regular expressions. More specifically, we shall run multiple patterns on an input string with a “for” loop, and we shall also run a single pattern for multiple matches on the input. Let's see how to do this.

Searching Multiple Literal Strings and Substrings Occurrences
01:53

In this video, we shall first run a simple date regex. Along with that, we will learn the significance of () groups. Since that alone is too little for one recipe, we shall also throw in a few more things, like square brackets [], which indicate a set.

Creating Date Regex
02:40
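
A minimal sketch of a date regex combining groups and sets (the pattern and sample string are illustrative):

    import re

    # () captures groups; [-/] is a set matching either separator; {n} is a repeat count
    pattern = r"(\d{2})[-/](\d{2})[-/](\d{4})"
    match = re.search(pattern, "The invoice is dated 18-08-2018.")
    if match:
        day, month, year = match.groups()
        print(day, month, year)   # 18 08 2018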

We have covered all the important notations that I wanted to cover with examples in the previous videos. Now, going forward, we will look at topics that are geared more towards accomplishing a certain task using regular expressions than explaining any notations.

Making Abbreviations
01:19

We already know the concepts of tokens, tokenizers, and why we need them from the previous section. We have also seen how to use the inbuilt tokenizers of the NLTK module. In this video, we will write our own tokenizer; it will evolve to mimic the behavior of nltk.word_tokenize().

Learning to Write Your Own Regex Tokenizer
01:21
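
A minimal sketch using NLTK's RegexpTokenizer (the pattern is the illustrative one from the NLTK documentation, not the course's final tokenizer):

    from nltk.tokenize import RegexpTokenizer

    tokenizer = RegexpTokenizer(r"\w+|\$[\d.]+|\S+")
    print(tokenizer.tokenize("A $29.99 ticket, please!"))
    # ['A', '$29.99', 'ticket', ',', 'please', '!']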

We already know the concepts of stems/lemmas and stemmers, and why we need them, from the previous section. We have seen how to use the inbuilt Porter and Lancaster stemmers of the NLTK module.

Learning to Write Your Own Regex Stemmer
02:14
+ Developing NLP Applications Using NLTK in Python
26 lectures 01:17:33

This video gives an overview of the entire course.

Preview 03:25

In this video, we use the Python NLTK library to understand more about the POS tagging features in a given text.

Exploring the In-Built Tagger
02:08

Now, we will explore the NLTK library by writing our own taggers. We’ll write various types of taggers, such as the default tagger, the regular expression tagger, and the lookup tagger.

Writing Your Own Tagger
05:42
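
A minimal sketch of the three tagger types chained with backoff (the training data and rules are illustrative; assumes the brown corpus has been downloaded):

    import nltk
    from nltk.corpus import brown

    train = brown.tagged_sents(categories="news")
    default = nltk.DefaultTagger("NN")                  # tag everything 'NN' as a fallback
    regexp = nltk.RegexpTagger(                         # suffix rules, backing off to default
        [(r".*ing$", "VBG"), (r".*ed$", "VBD"), (r".*s$", "NNS")],
        backoff=default)
    lookup = nltk.UnigramTagger(train, backoff=regexp)  # per-word lookup learned from Brown
    print(lookup.tag("The cats were sleeping soundly".split()))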

Next, let’s learn how to train our own tagger and save the trained model to disk so that we can use it later for further computations.

Training Your Own Tagger
03:05

This video will teach us how to define grammar and understand production rules.

Learning to Write Your Own Grammar
01:55

A probabilistic CFG is a special type of CFG in which the probabilities of all the production rules for a given non-terminal (left-hand side) must sum to one. Let's write a simple example to understand more.

Writing a Probabilistic CFG
02:28
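
A minimal sketch of a PCFG in NLTK (the grammar is a toy example; note that each non-terminal's rule probabilities sum to 1.0):

    import nltk

    grammar = nltk.PCFG.fromstring("""
        S -> NP VP   [1.0]
        NP -> Det N  [0.6] | 'John' [0.4]
        VP -> V NP   [1.0]
        Det -> 'the' [1.0]
        N -> 'dog'   [0.5] | 'ball' [0.5]
        V -> 'saw'   [1.0]
    """)
    parser = nltk.ViterbiParser(grammar)
    for tree in parser.parse("John saw the dog".split()):
        print(tree)   # the parse tree, annotated with its probability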

Recursive CFGs are a special type of CFG in which tokens on the left-hand side also appear on the right-hand side of a production rule. Palindromes are among the best examples of recursive CFGs.

Writing a Recursive CFG
02:10

In this video, we will learn how to use the in-built chunker, making use of several NLTK features as part of this process.

Using the Built-In Chunker
02:02

Now that we know how to use the built-in chunker, in this video we will write our own regex chunker.

Writing Your Own Simple Chunker
02:13
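
A minimal sketch of a regex chunker (the NP rule is a common textbook pattern, not necessarily the one used in the video):

    import nltk

    grammar = "NP: {<DT>?<JJ>*<NN.*>}"   # optional determiner, adjectives, then a noun
    chunker = nltk.RegexpParser(grammar)
    tagged = nltk.pos_tag(nltk.word_tokenize("The quick brown fox jumped over the lazy dog"))
    print(chunker.parse(tagged))         # a tree with NP subtrees marked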

In this video, we will learn about the training process, train our own chunker, and evaluate it.

Training a Chunker
02:23

Recursive descent parsers belong to the family of parsers that read the input from left to right, build the parse tree in a top-down fashion, and traverse nodes in pre-order.

Parsing Recursive Descent
01:40
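
A minimal sketch of recursive descent parsing with a toy grammar (the grammar and sentence are illustrative; note this parser cannot handle left-recursive rules):

    import nltk

    grammar = nltk.CFG.fromstring("""
        S -> NP VP
        NP -> Det N
        VP -> V NP
        Det -> 'the'
        N -> 'dog' | 'cat'
        V -> 'chased'
    """)
    parser = nltk.RecursiveDescentParser(grammar)
    for tree in parser.parse("the dog chased the cat".split()):
        print(tree)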

In this video, we will learn to use and understand shift-reduce parsing.

Parsing Shift-Reduce
01:42

We will now learn how to define a dependency grammar and use it with the projective dependency parser.

Parsing Dependency Grammar and Projective Dependency
01:38

Chart parsers are a special type of parser well suited to natural languages, whose grammars are often ambiguous. Let’s learn about them in detail.

Parsing a Chart
02:56

Python NLTK has built-in support for Named Entity Recognition (NER). Let’s learn to use inbuilt NERs.

Using Inbuilt NERs
02:09

Is it possible to print a list of all the words in a sentence that are nouns? Yes; for this, we will learn how to use a Python dictionary.

Creating, Inversing, and Using Dictionaries
03:44

Features are one of the most powerful components of the NLTK library. They represent clues within the language for easy tagging of the data that we are dealing with.

Choosing the Feature Set
03:48

A natural language that supports question marks (?), full stops (.), and exclamations (!) poses a challenge to us in identifying whether a statement has ended or it still continues after the punctuation characters. Let’s try and solve this classic problem.

Segmenting Sentences Using Classification
02:31

In previous videos, we wrote regular-expression-based POS taggers that leverage word suffixes. Now let’s try to write a program that leverages the feature extraction concept to find the POS of the words in a sentence.

Writing a POS Tagger with Context
02:28

In computing, a pipeline can be thought of as a multi-phase data flow system where the output from one component is fed to the input of another component.

Creating an NLP Pipeline
07:03

The text similarity problem deals with the challenge of finding how close given text documents are.

Solving the Text Similarity Problem
04:02
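
One common way to make "how close" concrete is TF-IDF vectors plus cosine similarity; a minimal sketch (the documents are illustrative, and the course's own approach may differ):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    docs = [
        "NLTK is a platform for natural language processing.",
        "Natural language processing is done easily with NLTK.",
        "Football is a popular sport.",
    ]
    X = TfidfVectorizer().fit_transform(docs)
    # Pairwise cosine similarity: 1.0 = same direction, 0.0 = no shared terms
    print(cosine_similarity(X).round(2))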

In many natural languages, while forming sentences, we avoid repeating certain nouns by using pronouns, to simplify the sentence construction.

Resolving Anaphora
03:35

In previous videos, we learned how to identify the POS of words, find named entities, and so on. Just as a word in English can behave as both a noun and a verb, finding the sense in which a word is used is very difficult for computer programs.

Disambiguating Word Sense
02:44

Feedback is one of the most powerful measures for understanding relationships. In order to write computer programs that can measure and find the emotional quotient, we should have a good understanding of the ways these emotions are expressed in natural languages.

Performing Sentiment Analysis
03:01

Let’s write our own sentiment analysis program based on what we have learned in the previous video.

Exploring Advanced Sentiment Analysis
03:04

Conversational assistants or chatbots are not very new. One of the foremost of this kind is ELIZA, which was created in the early 1960s and is worth exploring. NLTK has a module, nltk.chat, which simplifies building these engines by providing a generic framework. Let’s see that in detail.

Creating a Conversational Assistant or Chatbot
03:57
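
A minimal sketch of the nltk.chat framework (the pattern-response pairs are my own toy examples):

    from nltk.chat.util import Chat, reflections

    # Each pair: a regex to match the user's input, and candidate responses;
    # %1 substitutes the first captured group, with pronouns "reflected"
    pairs = [
        (r"my name is (.*)", ["Hello %1, how can I help you today?"]),
        (r"quit", ["Goodbye!"]),
        (r"(.*)", ["Tell me more."]),
    ]
    chatbot = Chat(pairs, reflections)
    chatbot.converse()   # interactive loop on stdin; type 'quit' to stop
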
Requirements
  • Good knowledge of Python is a must
Description

Natural Language Processing is a part of Artificial Intelligence that deals with the interactions between human (natural) languages and computers.

This comprehensive 3-in-1 training course includes unique videos that will teach you various aspects of performing Natural Language Processing with NLTK—the leading Python platform for the task. Go through various topics in Natural Language Processing, ranging from an introduction to the relevant Python libraries to applying specific linguistics concepts, while exploring text datasets with the help of real-world examples.

About the Authors

Tyler Edwards is a senior engineer and software developer with over a decade of experience creating analysis tools in the space, defense, and nuclear industries. Tyler is experienced using a variety of programming languages (Python, C++, and more), and his research areas include machine learning, artificial intelligence, engineering analysis, and business analytics. Tyler holds a Master of Science degree in Mechanical Engineering from Ohio University. Looking forward, Tyler hopes to mentor students in applied mathematics, and demonstrate how data collection, analysis, and post-processing can be used to solve difficult problems and improve decision making.

Krishna Bhavsar has spent around 10 years working on natural language processing, social media analytics, and text mining. He has worked on many different NLP libraries, such as Stanford CoreNLP, IBM's SystemT and BigInsights, GATE, and NLTK, to solve industry problems related to textual analysis. He has also published a paper on sentiment analysis augmentation techniques at NAACL 2010. Apart from academics, he has a passion for motorcycles and football. In his free time, he likes to travel and explore.

Naresh Kumar has more than a decade of professional experience in designing, implementing, and running very-large-scale Internet applications in Fortune Top 500 companies. He is a full-stack architect with hands-on experience in domains such as e-commerce, web hosting, healthcare, big data and analytics, data streaming, advertising, and databases. He believes in open source and contributes to it actively. Naresh keeps himself up-to-date with emerging technologies, from Linux systems internals to frontend technologies. He studied at BITS Pilani, Rajasthan, earning a dual degree in computer science and economics.

Pratap Dangeti develops machine learning and deep learning solutions for structured, image, and text data at TCS, in its research and innovation lab in Bangalore. He has acquired a lot of experience in both analytics and data science. He received his master's degree from IIT Bombay in its industrial engineering and operations research program. Pratap is an artificial intelligence enthusiast. When not working, he likes to read about next-gen technologies and innovative methodologies. He is also the author of the book Statistics for Machine Learning by Packt.

Who this course is for:
  • Python developers who wish to master Natural Language Processing and want to make their applications smarter by implementing NLP