Text mining and Natural Language Processing (NLP) are among the most active research areas. Pre-processing your text data before feeding it to an algorithm is a crucial part of NLP. In this course, you will learn NLP using the Natural Language Toolkit (NLTK), a Python library. You will learn how to pre-process data so that it is ready for any NLP application.
We go through text cleaning, stemming, lemmatization, part-of-speech tagging, and stop word removal. The difference between this course and others is that this course dives deep into NLTK instead of teaching everything at a fast pace.
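To give a feel for these steps, here is a minimal sketch that chains text cleaning, stop word removal, and stemming. It uses NLTK's `RegexpTokenizer` and `PorterStemmer`, which work without any extra data downloads. The `preprocess` function and the tiny `STOP_WORDS` set are assumptions made for illustration only; in practice you would more likely use NLTK's own stop word corpus, which has to be downloaded first.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import RegexpTokenizer

# Assumption: a tiny hand-written stop word list, used only so this sketch
# runs without downloading NLTK's stopwords corpus.
STOP_WORDS = {"the", "a", "an", "and", "are", "is", "in", "of", "to"}


def preprocess(text):
    """Lowercase, tokenize, drop stop words, and stem (hypothetical helper)."""
    # Text cleaning: lowercase and keep only alphabetic runs,
    # which also discards punctuation and digits.
    tokenizer = RegexpTokenizer(r"[a-z]+")
    tokens = tokenizer.tokenize(text.lower())

    # Stop word removal: filter out very common words.
    kept = [token for token in tokens if token not in STOP_WORDS]

    # Stemming: reduce each remaining word to its stem.
    stemmer = PorterStemmer()
    return [stemmer.stem(token) for token in kept]


print(preprocess("The cats are running in the gardens!"))
# prints ['cat', 'run', 'garden']
```

Lemmatization and part-of-speech tagging follow the same pattern, but they rely on NLTK data packages that need to be downloaded, so they are left out of this sketch.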
This course has 3 sections. In the first section, you will learn the definition of NLP and its applications. Additionally, you will learn how to install NLTK and learn about its components.
In the second section, you will learn the core functions of NLTK along with its methods and techniques. We examine the different algorithms available for pre-processing text data.
In the last section, we will build 3 NLP applications using the methods we learnt in the previous section.
Specifically, we will develop a topic modeling application that identifies the main topics discussed in a large corpus.
Then, we will build a text summarization application. We will teach the computer to condense a large text into its important points.
The last application is about sentiment analysis. Sentiment analysis in Python is a very popular application that can be used on a variety of text data. One of its applications is Twitter sentiment analysis. Since tweets are short pieces of text, they are well suited to sentiment analysis. We will go through building a sentiment analysis system in the last example.
Finally, we compare NLTK with spaCy, another popular NLP library in Python. It's going to be a very exciting course. Let's start learning.