Introduction to Natural Language Processing

Learn how to analyze text data.
4.5 (58 ratings)
297 students enrolled
$50
  • Lectures 42
  • Contents Video: 3 hours
    Other: 1 min
  • Skill Level Beginner Level
  • Languages English
  • Includes Lifetime access
    30 day money back guarantee!
    Available on iOS and Android
    Certificate of Completion

About This Course

Published 11/2015 English

Course Description

This course introduces Natural Language Processing using Python and the Natural Language Toolkit (NLTK). Through a practical approach, you'll get hands-on experience working with and analyzing text.

As a student of this course, you'll get updates for free, which include lecture revisions, new code examples, and new data projects.

By the end of this course you will:

  • Understand how to use the Natural Language Toolkit.
  • Be able to load and manipulate your own text data.
  • Know how to formulate solutions to text-based problems.
  • Know when it is appropriate to apply solutions such as sentiment analysis and classification techniques.


What are the requirements?

  • A computer running Windows, OS X, or Linux.
  • Basic Python programming knowledge.

What am I going to get from this course?

  • Work with text data using the Natural Language Toolkit.
  • Load and manipulate custom text data.
  • Analyze text to discover sentiment, important keywords, and statistics.

What is the target audience?

  • This course is for anyone who is not familiar with Natural Language Processing and is looking for a way to start.
  • This course is probably not for you if you already understand Natural Language Processing and the Natural Language Toolkit.

Curriculum

Section 1: Course Introduction
Course Intro and Outline
Preview
03:38
Section 2: Setup
04:39

We will be using the Anaconda distribution of Python throughout this course. You can download it for free from https://www.continuum.io/downloads.

Using the Anaconda Prompt (you can search for this program after Anaconda has installed), type conda install jupyter to install Jupyter. Jupyter is a notebook-style interface for interactive coding.

To launch Jupyter, open your Anaconda Prompt and type jupyter notebook. This will launch a new notebook instance in your internet browser. The Anaconda Prompt is now actively running Jupyter notebook locally. If you close the prompt, Jupyter will no longer be able to run. You can simply open the Anaconda Prompt and type jupyter notebook at any time to restart Jupyter, and pick up where you left off.

In your Windows Command Prompt (search for the "command prompt" program) type the following:

python

import nltk

nltk.download()

This will open a new window which will allow you to download extra data and packages we will be using throughout the course.

04:15

We will be using the Anaconda distribution of Python throughout this course. You can download it for free from https://www.continuum.io/downloads.

Using the Terminal (you can search for this app), type conda install jupyter to install Jupyter. Jupyter is a notebook-style interface for interactive coding.

To launch Jupyter, type jupyter notebook in your Terminal window. This will launch a new notebook instance in your web browser. The Terminal is now actively running Jupyter notebook locally. If you close that Terminal window, Jupyter will no longer be able to run. You can simply open the Terminal and type jupyter notebook at any time to restart Jupyter, and pick up where you left off.

In a new Terminal window or tab (one that is not running Jupyter), type the following:

python

import nltk

nltk.download()

This will open a new window which will allow you to download extra data and packages we will be using throughout the course.

Section 3: Python Refresher
02:34

A Python list stores an ordered sequence of values, written as comma-separated items inside square brackets.

mylist = ["a", "b", "c"]
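As a minimal sketch of working with such a list (the variable names first and length are illustrative, not from the course), the following reads an item by position, appends a value, and checks the size:

```python
# A small list of string values
mylist = ["a", "b", "c"]

first = mylist[0]       # indexing starts at 0
mylist.append("d")      # add a value to the end of the list
length = len(mylist)    # number of items now stored

print(first, length, mylist)
```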


02:55

A Python dictionary stores key-value pairs.

d = {
    'Python': 'programming',
    'English': 'natural',
    'French': 'natural',
    'Javascript': 'programming',
}

If we were to look up any of the keys (Python, English, French, or Javascript), we would get back the associated value.
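A short sketch of that lookup, assuming the same dictionary; the "Klingon" key and the "unknown" default are made up here to show what happens when a key is absent:

```python
d = {
    "Python": "programming",
    "English": "natural",
    "French": "natural",
    "Javascript": "programming",
}

kind = d["French"]                       # direct lookup by key
missing = d.get("Klingon", "unknown")    # .get avoids a KeyError for absent keys

print(kind, missing)
```

Square-bracket lookup raises a KeyError when the key does not exist, so .get with a default is often the safer choice when scanning arbitrary text.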

05:18

We will often use for-loops to scan through lists.

We will use if statements to look for special conditions.
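The two ideas usually appear together. A minimal sketch, using a made-up word list, that scans a list and keeps only the words meeting a condition:

```python
words = ["the", "quick", "brown", "fox", "jumps"]  # made-up sample data

long_words = []
for word in words:           # visit each item in the list in order
    if len(word) > 4:        # special condition: more than four characters
        long_words.append(word)

print(long_words)
```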

02:21

We will use functions to reuse code that we write.
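For instance, a word-counting loop can be wrapped in a function and reused with different settings. The function name count_long_words and its min_length parameter are hypothetical, chosen only for this sketch:

```python
def count_long_words(words, min_length=5):
    """Return how many words have at least min_length characters."""
    count = 0
    for word in words:
        if len(word) >= min_length:
            count += 1
    return count

sample = ["natural", "language", "is", "fun"]   # made-up sample data
result = count_long_words(sample)                # uses the default of 5
short_result = count_long_words(sample, min_length=2)

print(result, short_result)
```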

Section 4: NLTK and the Basics
01:15

We will be using the Natural Language Tool Kit (NLTK) throughout this course.

Counting Text
Preview
07:45
Example - Words Per Sentence Trends
07:12
Frequency Distribution
03:16
Conditional Frequency Distribution
03:04
Example - Informative Words
07:36
Bigrams
05:51
04:02

Regular expressions are a great way to find specific character patterns in your text data.
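A minimal sketch using Python's built-in re module; the sample sentence and the three patterns are invented for illustration and are not taken from the lecture:

```python
import re

text = "Contact us at info@example.com or call 555-1234 before 10/31/2015."

# Three digits, a hyphen, then four digits
phones = re.findall(r"\d{3}-\d{4}", text)

# Dates written as month/day/four-digit year
dates = re.findall(r"\d{1,2}/\d{1,2}/\d{4}", text)

# Words that start with a capital letter followed by lowercase letters
capitalized = re.findall(r"\b[A-Z][a-z]+\b", text)

print(phones, dates, capitalized)
```

re.findall returns every non-overlapping match as a list, which makes it convenient for pulling repeated patterns out of longer text.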

Regular Expression Practice
06:55
Section 5: Tokenization, Tagging, Chunking
00:58

Tokenization is the act of breaking text into smaller entities. These can either be sentences (sentence tokens) or individual words (word tokens).

02:56

Tokenization is the act of breaking text into smaller entities. These can either be sentences (sentence tokens) or individual words (word tokens).
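To make the idea concrete, here is a rough sketch that tokenizes with regular expressions from the standard library. It is a simplified stand-in, not the course's method: NLTK provides sent_tokenize and word_tokenize, which handle abbreviations and punctuation far more carefully than these patterns.

```python
import re

text = "NLTK is a Python library. It makes working with text easier!"

# Sentence tokens: split after ., ! or ? when followed by whitespace
sentences = re.split(r"(?<=[.!?])\s+", text)

# Word tokens: runs of letters, digits, or apostrophes in the first sentence
words = re.findall(r"[A-Za-z0-9']+", sentences[0])

print(sentences)
print(words)
```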

Normalizing
08:33
Part of Speech Tagging
08:09
Example - Multiple Parts of Speech
03:40
Example - Choices
04:23
Chunking
03:35
Named Entity Recognition
04:45
Section 6: Custom Sources
Overview - Character Encoding
01:12
Text File
Preview
01:47
HTML
04:47
URL
03:30
CSV File
02:06
Exporting
02:23
NLTK Resources
02:27
Example - Remove Stopwords
02:29
Section 7: Projects
01:15

Sentiment analysis is the act of determining if a given collection of text is more positive or negative in nature.

13:39

Sentiment analysis is the act of determining if a given collection of text is more positive or negative in nature.
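One very simple way to approximate this is to count positive and negative words. The sketch below assumes two tiny hand-picked word lists (POSITIVE and NEGATIVE are hypothetical) and is not necessarily the approach taken in the lecture:

```python
import re

# Hypothetical, hand-picked word lists for illustration only
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "sad"}

def sentiment_score(text):
    """Return (number of positive words) minus (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(1 for w in words if w in POSITIVE)
    neg = sum(1 for w in words if w in NEGATIVE)
    return pos - neg

def label(text):
    """Map the score to a coarse label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(label("I love this great course"))
print(label("The audio was terrible and the pacing was bad"))
```

A word-count approach ignores negation ("not good") and context, which is one reason trained classifiers are often preferred for real data.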

02:22

We can use Naive Bayes classification techniques to determine if a name is traditionally more masculine or feminine.

12:40

We can use Naive Bayes classification techniques to determine if a name is traditionally more masculine or feminine.
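The sketch below hand-rolls a tiny Naive Bayes classifier with the standard library so the arithmetic is visible. It rests on several assumptions that are not from the course: the ten training names are toy data, the only feature is the last letter of the name, and add-one smoothing over a 26-letter alphabet is used. NLTK ships its own NaiveBayesClassifier, which would normally replace this code.

```python
import math
from collections import Counter, defaultdict

# Toy training data, made up for illustration; real projects need far more names
TRAIN = [
    ("Anna", "female"), ("Maria", "female"), ("Sophia", "female"),
    ("Emma", "female"), ("Olivia", "female"),
    ("John", "male"), ("Robert", "male"), ("David", "male"),
    ("Mark", "male"), ("Kevin", "male"),
]

def feature(name):
    """Use the last letter of the name as the single feature."""
    return name[-1].lower()

def train(examples):
    """Count how often each label and each (label, feature) pair occurs."""
    label_counts = Counter(label for _, label in examples)
    feature_counts = defaultdict(Counter)
    for name, label in examples:
        feature_counts[label][feature(name)] += 1
    return label_counts, feature_counts

def classify(name, label_counts, feature_counts, alphabet_size=26):
    """Pick the label maximizing log P(label) + log P(feature | label)."""
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in label_counts.items():
        prior = math.log(count / total)
        # Add-one smoothing so unseen letters never get zero probability
        seen = feature_counts[label][feature(name)]
        likelihood = math.log((seen + 1) / (count + alphabet_size))
        score = prior + likelihood
        if score > best_score:
            best_label, best_score = label, score
    return best_label

label_counts, feature_counts = train(TRAIN)
print(classify("Julia", label_counts, feature_counts))
print(classify("Brian", label_counts, feature_counts))
```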

01:20

With Term Frequency-Inverse Document Frequency (TF-IDF), we can easily determine key words in documents.
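As a rough sketch of the calculation, the code below scores each word as (count in document / document length) × log(number of documents / number of documents containing the word). This is one common TF-IDF variant; the lecture may use a different weighting, and the three-document corpus is made up:

```python
import math
from collections import Counter

# Made-up mini corpus: each document is a list of word tokens
docs = {
    "doc1": "the cat sat on the mat".split(),
    "doc2": "the dog chased the cat".split(),
    "doc3": "the stock market fell sharply".split(),
}

def tf_idf(docs):
    """Return {document name: {word: TF-IDF score}}."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word
    df = Counter(word for tokens in docs.values() for word in set(tokens))
    scores = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        scores[name] = {
            word: (count / len(tokens)) * math.log(n_docs / df[word])
            for word, count in counts.items()
        }
    return scores

scores = tf_idf(docs)

# Rank the words of one document from most to least distinctive
ranked = sorted(scores["doc1"].items(), key=lambda kv: kv[1], reverse=True)
print(ranked)
```

Words that appear in every document, such as "the" here, score zero because log(3/3) is 0, while words unique to one document score highest.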

04:51

With Term Frequency-Inverse Document Frequency (TF-IDF), we can find similarities across entities.
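One common way to measure such similarity is the cosine of the angle between two TF-IDF vectors. The sketch below assumes that choice, along with a made-up corpus and hypothetical helper names (tf_idf_vectors, cosine); the lecture may compare entities differently:

```python
import math
from collections import Counter

# Made-up mini corpus: each document is a list of word tokens
docs = {
    "a": "the cat sat on the mat".split(),
    "b": "the cat lay on the rug".split(),
    "c": "stock prices fell on monday".split(),
}

def tf_idf_vectors(docs):
    """Return a sparse TF-IDF vector (word -> weight) for each document."""
    n_docs = len(docs)
    df = Counter(w for tokens in docs.values() for w in set(tokens))
    return {
        name: {
            w: (c / len(tokens)) * math.log(n_docs / df[w])
            for w, c in Counter(tokens).items()
        }
        for name, tokens in docs.items()
    }

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

vectors = tf_idf_vectors(docs)
sim_ab = cosine(vectors["a"], vectors["b"])   # two sentences about a cat
sim_ac = cosine(vectors["a"], vectors["c"])   # unrelated topics

print(sim_ab, sim_ac)
```

Documents "a" and "c" share only the word "on", which occurs in every document and therefore carries zero weight, so their similarity comes out as zero.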

05:48

With Term Frequency-Inverse Document Frequency (TF-IDF), we can easily determine key words in documents.

Section 8: Appendix
Additional NLP Resources
Article
Learning Python
Article
Future Course Content
Article

Instructor Biography

Brian Sacash, Data Scientist

Brian has a BS in Physics and an MS in Quantitative Analysis from the University of Cincinnati, with over 5 years of experience in data analysis. Over the course of his career he has developed a skill set in natural language processing and big data. He has helped solve data problems for organizations ranging from the Department of Defense to global financial institutions. He previously was a consultant, helping organizations understand their data through advanced analytics methods. Brian currently works at a big data and analytics startup where he is working to disrupt the way people approach large data analysis.
