Introduction to Natural Language Processing
4.2 (201 ratings)
903 students enrolled

Learn how to analyze text data.
Bestselling
Created by Brian Sacash
Last updated 1/2016
English
Current price: $10 Original price: $100 Discount: 90% off
30-Day Money-Back Guarantee
Includes:
  • 3 hours on-demand video
  • 3 Articles
  • 16 Supplemental Resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Work with text data using the Natural Language Tool Kit.
  • Load and manipulate custom text data.
  • Analyze text to discover sentiment, important keywords, and statistics.
Requirements
  • A computer running Windows, OS X, or Linux.
  • Basic Python programming knowledge.
Description

This course introduces Natural Language Processing through the use of Python and the Natural Language Tool Kit. Through a practical approach, you'll get hands-on experience working with and analyzing text.

As a student of this course, you'll get updates for free, which include lecture revisions, new code examples, and new data projects.

By the end of this course you will:

  • Have an understanding of how to use the Natural Language Tool Kit.
  • Be able to load and manipulate your own text data.
  • Know how to formulate solutions to text-based problems.
  • Know when it is appropriate to apply solutions such as sentiment analysis and classification techniques.


Who is the target audience?
  • This course is for anyone who is not familiar with Natural Language Processing and is looking for a way to start.
  • This course is probably not for you if you already have an understanding of Natural Language Processing and the Natural Language Tool Kit.
Curriculum For This Course
42 Lectures
02:53:04
Course Introduction
1 Lecture 03:38
Setup
2 Lectures 08:54

We will be using the Anaconda distribution of Python throughout this course. You can download it for free from https://www.continuum.io/downloads.

Using the Anaconda Prompt (you can search for this program after Anaconda has installed), type conda install jupyter to install Jupyter. Jupyter is a notebook-style interface for interactive coding.

To launch Jupyter, open your Anaconda Prompt and type jupyter notebook. This will launch a new notebook instance in your internet browser. The Anaconda Prompt is now actively running Jupyter notebook locally. If you close the prompt, Jupyter will no longer be able to run. You can simply open the Anaconda Prompt and type jupyter notebook at any time to restart Jupyter, and pick up where you left off.

In your Windows Command Prompt (search for the "command prompt" program) type the following:

python

import nltk

nltk.download()

This will open a new window which will allow you to download extra data and packages we will be using throughout the course.

Windows Setup
04:39

We will be using the Anaconda distribution of Python throughout this course. You can download it for free from https://www.continuum.io/downloads.

Using the Terminal (you can search for this app), type conda install jupyter to install Jupyter. Jupyter is a notebook-style interface for interactive coding.

To launch Jupyter, type jupyter notebook in your Terminal window. This will launch a new notebook instance in your internet browser. The Terminal is now actively running Jupyter notebook locally. If you close that Terminal window, Jupyter will no longer be able to run. You can simply open the Terminal and type jupyter notebook at any time to restart Jupyter, and pick up where you left off.

In a new Terminal window or tab (one that is not running Jupyter), type the following:

python

import nltk

nltk.download()

This will open a new window which will allow you to download extra data and packages we will be using throughout the course.

OS X Setup
04:15
Python Refresher
4 Lectures 13:08

A Python list is an ordered collection of values, written as comma-separated items inside square brackets.

mylist = ["a", "b", "c"]
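
As a brief sketch of the list operations that come up most often when handling text (the variable names other than mylist are illustrative and not taken from the course):

```python
mylist = ["a", "b", "c"]

first = mylist[0]        # indexing starts at 0
mylist.append("d")       # add an item to the end of the list
length = len(mylist)     # number of items currently in the list
last_two = mylist[-2:]   # slicing returns a new list

print(first, length, last_two)
```

Tokenized text is usually handled as a list of strings, so these same operations apply to lists of words or sentences.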


Lists
02:34

A Python dictionary stores key-value pairs.

d = {
    'Python': 'programming',
    'English': 'natural',
    'French': 'natural',
    'Javascript': 'programming'
}

If we were to look up any of the keys (Python, English, French, or Javascript), we would get back the associated value.
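
A minimal sketch of such lookups follows. The key "Klingon" is a made-up example of a missing key, and the variable names are illustrative.

```python
d = {
    "Python": "programming",
    "English": "natural",
    "French": "natural",
    "Javascript": "programming",
}

kind = d["French"]                      # direct lookup; raises KeyError if the key is missing
fallback = d.get("Klingon", "unknown")  # .get() returns a default instead of raising

# Collect every key whose value is "programming".
programming = [key for key, value in d.items() if value == "programming"]

print(kind, fallback, programming)
```

Using .get() with a default is often convenient when counting or looking up words that may not be in the dictionary yet.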

Dictionaries
02:55

We will often use for-loops to scan through lists.

We will use if statements to look for special conditions.
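
For instance, a for-loop and an if statement can be combined to pick out words that meet a condition. This is only a sketch: the word list and the length threshold are arbitrary choices for illustration.

```python
words = ["the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"]

long_words = []
for word in words:
    # Keep only words longer than four characters.
    if len(word) > 4:
        long_words.append(word)

print(long_words)
```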

Loops and Conditionals
05:18

We will use functions to reuse code that we write.
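
As a small illustration, the counting logic below is wrapped in a function so it can be reused with different inputs. The function name count_word and the sample tokens are hypothetical.

```python
def count_word(words, target):
    """Return how many times target appears in words, ignoring case."""
    count = 0
    for word in words:
        if word.lower() == target.lower():
            count += 1
    return count


tokens = ["The", "cat", "saw", "the", "other", "cat"]
print(count_word(tokens, "the"))
print(count_word(tokens, "cat"))
```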

Functions
02:21
NLTK and the Basics
9 Lectures 46:56

We will be using the Natural Language Tool Kit (NLTK) throughout this course.

Overview - The Natural Language Tool Kit
01:15


Example - Words Per Sentence Trends
07:12

Frequency Distribution
03:16

Conditional Frequency Distribution
03:04

Example - Informative Words
07:36

Bigrams
05:51

Regular expressions are a great way to find specific character patterns in your text data.
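
A short sketch using Python's built-in re module shows the idea. The sample sentence and the three patterns are made up for illustration and are deliberately simple, so they would not cover every real-world date or email format.

```python
import re

text = "Contact us at info@example.com or call 555-0134 before 2016-01-31."

# Dates written as YYYY-MM-DD.
dates = re.findall(r"\d{4}-\d{2}-\d{2}", text)

# Words that start with a capital letter followed by lowercase letters.
capitalized = re.findall(r"\b[A-Z][a-z]+\b", text)

# A rough email pattern: name characters, "@", then a dotted domain.
emails = re.findall(r"[\w.]+@[\w.]+\.\w+", text)

print(dates, capitalized, emails)
```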

Overview - Regular Expressions
04:02

Regular Expression Practice
06:55
Tokenization, Tagging, Chunking
8 Lectures 36:59

Tokenization is the act of breaking text into smaller entities. These can either be sentences (sentence tokens) or individual words (word tokens).

Preview 00:58

Tokenization is the act of breaking text into smaller entities. These can either be sentences (sentence tokens) or individual words (word tokens).
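
To make the two kinds of tokens concrete, here is a simplified sketch that uses only Python's re module rather than NLTK. The sample text and the splitting rules are assumptions for illustration. NLTK's own sent_tokenize and word_tokenize functions handle cases such as abbreviations and contractions that this sketch does not.

```python
import re

text = "NLTK makes text analysis easier. It ships with many corpora! Have you tried it?"

# Sentence tokens: split on whitespace that follows ., ! or ?
sentences = re.split(r"(?<=[.!?])\s+", text)

# Word tokens: runs of word characters, or single punctuation marks.
words = re.findall(r"\w+|[^\w\s]", sentences[0])

print(sentences)
print(words)
```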

Tokenization
02:56

Normalizing
08:33

Part of Speech Tagging
08:09

Example - Multiple Parts of Speech
03:40

Example - Choices
04:23

Chunking
03:35

Named Entity Recognition
04:45
Custom Sources
8 Lectures 20:41
Overview - Character Encoding
01:12


HTML
04:47

URL
03:30

CSV File
02:06

Exporting
02:23

NLTK Resources
02:27

Example - Remove Stopwords
02:29
Projects
7 Lectures 41:55

Sentiment analysis is the act of determining if a given collection of text is more positive or negative in nature.

Sentiment Analysis Intro
01:15

Sentiment analysis is the act of determining if a given collection of text is more positive or negative in nature.
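
One very simple way to approach this, sketched here as an assumption rather than as the course's method, is to count words from a positive list and a negative list. The word lists below are tiny and hypothetical, and ties are reported as "neutral".

```python
import re

# Hypothetical, tiny word lists for illustration only.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "sad"}


def sentiment(text):
    """Label text by comparing counts of positive and negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment("I love this great course"))
print(sentiment("The audio was bad and the pacing was terrible"))
```

A word-counting approach like this ignores negation ("not good") and context, which is one reason trained classifiers are often preferred.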

Basic Sentiment Analysis
13:39

We can use Naive Bayes classification techniques to determine if a name is traditionally more masculine or feminine.

Gender Prediction Intro
02:22

We can use Naive Bayes classification techniques to determine if a name is traditionally more masculine or feminine.
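
The sketch below implements a minimal Naive Bayes classifier by hand in plain Python to show the underlying arithmetic. It is not the course's code: the training names, the choice of the last letter as the only feature, and the add-one smoothing are all assumptions made for illustration. NLTK provides a NaiveBayesClassifier class that would normally be used instead.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training data; a real project would use a much larger labeled list.
TRAIN = [
    ("Anna", "female"), ("Maria", "female"), ("Sophia", "female"), ("Emily", "female"),
    ("John", "male"), ("Kevin", "male"), ("Brian", "male"), ("Mark", "male"),
]


def features(name):
    """Use the last letter of the name as the only feature."""
    return name[-1].lower()


def train(examples):
    """Count labels, and count feature values per label."""
    label_counts = Counter(label for _, label in examples)
    feature_counts = defaultdict(Counter)
    for name, label in examples:
        feature_counts[label][features(name)] += 1
    return label_counts, feature_counts


def classify(name, model, vocab_size=26):
    """Return the label with the highest log P(label) + log P(feature | label)."""
    label_counts, feature_counts = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in label_counts.items():
        prior = math.log(count / total)
        # Add-one smoothing so unseen letters do not get zero probability.
        likelihood = math.log(
            (feature_counts[label][features(name)] + 1) / (count + vocab_size)
        )
        if prior + likelihood > best_score:
            best_label, best_score = label, prior + likelihood
    return best_label


model = train(TRAIN)
print(classify("Olivia", model))
print(classify("Steven", model))
```

With only eight training names the predictions reflect little more than the last-letter pattern in the toy data, so the output should be read as a demonstration of the mechanics, not as a reliable prediction.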

Gender Prediction
12:40

With Term Frequency-Inverse Document Frequency (TF-IDF), we can easily determine key words in documents.

TF-IDF Intro
01:20

With Term Frequency-Inverse Document Frequency (TF-IDF), we can find similarities across entities.

TF-IDF Part 1
04:51

With Term Frequency-Inverse Document Frequency (TF-IDF), we can easily determine key words in documents.
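
As a rough sketch of how key words fall out of the scores, the code below computes TF-IDF by hand with one common formulation: term frequency normalized by document length, multiplied by the natural log of (number of documents / documents containing the term). The three-document corpus is hypothetical, and other TF-IDF variants weight terms slightly differently.

```python
import math
from collections import Counter

# Hypothetical mini-corpus for illustration.
docs = {
    "doc1": "the cat sat on the mat",
    "doc2": "the dog chased the cat",
    "doc3": "the bird sang a song",
}

tokenized = {name: text.split() for name, text in docs.items()}

# Document frequency: the number of documents that contain each term.
df = Counter()
for tokens in tokenized.values():
    df.update(set(tokens))


def tfidf(tokens):
    """Return a {term: score} dict for one tokenized document."""
    counts = Counter(tokens)
    n_docs = len(tokenized)
    return {
        term: (count / len(tokens)) * math.log(n_docs / df[term])
        for term, count in counts.items()
    }


scores = tfidf(tokenized["doc1"])
top = sorted(scores, key=scores.get, reverse=True)[:3]
print(top)
```

The word "the" appears in every document, so its score is zero, while words unique to doc1 rank highest. That is the behavior that makes TF-IDF useful for surfacing key words.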

TF-IDF Part 2
05:48
Appendix
3 Lectures 00:58
Additional NLP Resources
00:16

Learning Python
00:25

Future Course Content
00:17
About the Instructor
Brian Sacash
4.2 Average rating
199 Reviews
903 Students
1 Course
Data Scientist

Brian has a BS in Physics and an MS in Quantitative Analysis from the University of Cincinnati, with over 6 years of experience in data analysis. Over the course of his career he has developed a skill set in natural language processing and big data. He has helped solve data problems for organizations ranging from the Department of Defense to global financial institutions. He was previously a consultant, helping organizations understand their data through advanced analytics methods. Brian currently works on developing methods and software for large data systems.