Text Mining and Natural Language Processing in R

Rating: 4.7 (875 reviews)

Hands-on text mining and natural language processing (NLP) training for data science applications in R

Created by Minerva Singh

Last updated 11/2025

English

What you'll learn

Students will be able to read in data from different sources, including databases
Basic web scraping: extracting text and tabular data from HTML pages
Social media mining from Facebook and Twitter
Extract information relating to tweets and posts
Analyze text data for emotions
Carry out sentiment analysis
Implement natural language processing (NLP) on different types of text data

Course content

10 sections • 86 lectures • 8h 48m total length

About the Course and Instructor • 7:58
Explore hands-on text mining and natural language processing in R for data science, using real-world social media data to perform sentiment analysis, machine learning, and unstructured text insights.
Data and Scripts For the Course • 0:01
Introduction to R and RStudio • 6:36
Conclusion to Section 1 • 1:18
Conclude section one by reiterating text mining goals and prerequisites, then outline reading data from diverse sources and installing R and RStudio with attached code.

Read in CSV & Excel Data • 9:56
Read in Data from Online CSV • 4:04
Learn to read online CSV data into R, handle metadata, skip top rows, set headers, and access data frames before indexing and subsetting.
Read in Zipped File • 3:04
Read Data from a Database • 8:23
Read in JSON Data • 5:28
Read JSON data in R, parse World Bank JSON files, and extract country IDs and ISO codes using lapply and custom functions to access regional attributes.
Read in Data from PDF Documents • 8:33
Read in Tables from PDF Documents • 4:38
Conclusion to Section 2 • 1:03
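
As an illustration of the online-CSV lecture in this section, here is a minimal sketch in base R of skipping metadata rows and setting headers. The inline text, column names, and values are made up for the example and are not course data; with a real online file you would pass its URL to read.csv in place of the text argument.

```r
# Made-up CSV content with two metadata lines above the real header row
raw <- "Source: example export (hypothetical)
Downloaded: 2017-01-01
country,year,value
India,2016,10.5
Spain,2016,7.2"

# skip = 2 drops the metadata lines; header = TRUE uses the next row as column names
dat <- read.csv(text = raw, skip = 2, header = TRUE, stringsAsFactors = FALSE)

# Index and subset the resulting data frame
high_value <- dat[dat$value > 8, "country"]
print(dat)
print(high_value)
```

The same skip and header arguments apply when the first argument is a file path or URL.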

Read in Data From Online Google Sheets • 4:03
Read in Data from Online HTML Tables-Part 1 • 4:13
Read in Data from Online HTML Tables-Part 2 • 6:24
Get and Clean Data from HTML Tables • 7:30
Read Text Data from an HTML Page • 8:52
Introduction to Selector Gadget • 6:11
More Webscraping With rvest-IMDB Webpage • 8:52
Scrape IMDB pages with rvest and the Selector Gadget to extract rankings, titles, runtimes, and genres, build a data frame, and gain insights from 2017 film data.
Another Way of Accessing Webpage Elements • 2:52
Discover another way to access and inspect elements on a dynamic New Zealand tourism page, inspecting popular cities via linked HTML pages.
Conclusions to Section 3 • 1:35

Extract Data from Facebook • 4:12
Get More out Of Facebook • 6:51
Set up a Twitter App for Mining Data from Twitter • 3:52
Extract Tweets Using R • 5:21
More Twitter Data Extraction Using R • 6:28
Learn to extract data from Twitter in R by configuring credentials, connecting to the Twitter API, and retrieving tweets with date ranges and geolocation such as the Barcelona area.
Get Tweet Locations • 5:06
Get Location Specific Trends • 2:02
Learn More About the Followers of a Twitter Handle • 6:55
Another Way of Extracting Information From Twitter- the rtweet Package • 3:18
Learn to extract tweets with the rtweet package in R by authenticating with app keys and querying English tweets on a topic for text mining.
Geolocation Specific Tweets With "rtweet" • 7:49
Learn how to extract geolocated tweets with rtweet, stream London tweets for 60 seconds, convert to a data frame, and analyze text, coordinates, and trends.
More Data Extraction Using rtweet • 3:18
Locations of Tweets • 4:02
Authenticate with the rtweet package, search 500 users tweeting the hashtag, and plot the top locations from the location column, notably Washington, D.C.
Mining Github Using R • 7:04
Set up the FourSquare App • 4:32
Register your app on Foursquare developers, obtain a client ID and client secret, install the orkun package from GitHub, and authenticate to access venue tips via the API.
Extract Reviews for Venues on FourSquare • 11:28
Learn to extract venue reviews and check-ins from the Foursquare API using R, focusing on Indian restaurants in Copenhagen, and analyze user tips and comments.
Conclusions to Section 5 • 1:46

Explore Tweet Data • 7:51
Explore tweet data from Hillary Clinton and Donald Trump using a preprocessed dataset to reveal original versus retweeted content and reply activity.
A Brief Explanation • 4:22
EDA With Text Data • 9:02
Examine Multiple Document Corpus of Text • 5:30
Brief Introduction to tidytext • 8:28
Explore tidytext basics by converting Jane Austen texts to tidy data, tokenizing into words or sentences with unnest_tokens, detecting chapters via regex, and examining Pride and Prejudice as an example.
Text Exploration & Visualization with tidytext • 11:09
Explore and visualize text from Pride and Prejudice using tidytext in R, including tokenization, stop-word removal, word clouds, and sentiment analysis with the bing lexicon.
Explore Multiple Texts with tidytext • 9:22
Count Unique Words in Tweets • 4:54
Visualizing Text Data as TF-IDF • 7:55
TF-IDF in Graphical Form • 5:49
Conclusions to Section 6 • 1:18
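
To make the tokenization and TF-IDF lectures in this section concrete, here is a minimal sketch in base R. It is not the course's code: the lectures use tidytext, whereas this version uses no packages, a toy two-document corpus, and hypothetical helper names (tokenize, term_freq). It assumes the common convention of a natural-log inverse document frequency.

```r
# Toy corpus: two short documents (illustrative only, not course data)
docs <- c(
  austen = "It is a truth universally acknowledged that a single man in possession of a good fortune must be in want of a wife",
  tweet  = "fake news is news that is fake"
)

# Tokenize: lower-case, then split on anything that is not a letter
tokenize <- function(text) {
  words <- strsplit(tolower(text), "[^a-z]+")[[1]]
  words[nzchar(words)]
}
tokens <- lapply(docs, tokenize)

# Term frequency: share of each word within its own document
term_freq <- function(w) {
  counts <- table(w)
  setNames(as.numeric(counts) / length(w), names(counts))
}
tf <- lapply(tokens, term_freq)

# Inverse document frequency: ln(number of documents / documents containing the word)
vocab <- unique(unlist(tokens))
doc_freq <- sapply(vocab, function(v) sum(sapply(tokens, function(w) v %in% w)))
idf <- log(length(docs) / doc_freq)

# TF-IDF: words shared by every document score zero, distinctive words score highest
tf_idf <- lapply(tf, function(f) f * idf[names(f)])
print(sort(tf_idf$tweet, decreasing = TRUE))
```

Words such as "is" and "that" occur in both documents, so their IDF is zero and they drop out, which is the reason TF-IDF is preferred over raw counts for finding characteristic terms.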

Wordclouds for Visualizing Tweet Sentiments: India's Demonetization Policy • 12:29
Explore how word clouds visualize tweet sentiments about India's 2016 demonetization, using pre-processing, corpus construction, and frequency-based sentiment analysis in R.
Wordclouds for Visualizing Reviews • 10:32
Tidy Wordclouds • 5:35
Quanteda Wordcloud • 8:34
Create word clouds with quanteda by building a corpus and document-feature matrix, then clean text with stemming and stop-word removal for immigration manifestos and tweets.
Word Frequency in Text Data • 3:24
Learn to compute word frequency from text data in R, cleaning and converting to a corpus, then analyze Twitter data with API keys and plot the most frequent terms.
Tweet Sentiments- Mugabe's Ouster • 4:52
Tidy Sentiments- Sentiment Analysis Using tidytext • 8:38
Examine the Polarity of Text • 10:58
Explore text polarity by calculating negative, neutral, and positive content with the qdap library on Mugabe tweets, revealing overall negativity and extracting positive and negative keywords.
Examine the Polarity of Tweets • 6:24
Topic Modelling a Document • 8:15
Topic Modelling Multiple Documents • 14:18
Download four public-domain novels, tokenize chapters, and build a four-topic LDA model in R to distinguish the books by themes.
Topic Modelling Tweets Using Quanteda • 8:21
Conclusions to Section 7 • 2:14
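
The polarity lectures in this section classify text as negative, neutral, or positive. A minimal base R sketch of the underlying lexicon-lookup idea follows. The six-word lexicon, the example texts, and the score_text helper are all hypothetical and exist only for illustration; the course itself relies on packaged lexicons and libraries, which also handle negation and weighting that this sketch ignores.

```r
# Hypothetical mini-lexicon: +1 for positive words, -1 for negative words
lexicon <- c(good = 1, great = 1, happy = 1, bad = -1, corrupt = -1, crisis = -1)

# Made-up example texts (not course data)
texts <- c(
  "Great news, people are happy today",
  "A bad day: corrupt leaders and a deepening crisis",
  "The announcement was made at noon"
)

# Score a text by summing the lexicon values of its words
score_text <- function(text) {
  words <- strsplit(tolower(text), "[^a-z]+")[[1]]
  hits <- lexicon[words]      # NA for words that are not in the lexicon
  sum(hits, na.rm = TRUE)     # a text with no lexicon words scores 0
}
scores <- vapply(texts, score_text, numeric(1), USE.NAMES = FALSE)

# Map the numeric score to a polarity label
polarity <- ifelse(scores > 0, "positive",
                   ifelse(scores < 0, "negative", "neutral"))
print(data.frame(score = scores, polarity = polarity))
```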

Clustering for Text Data • 7:17
Clustering Tweets with Quanteda • 4:35
Cluster tweets with quanteda by building a document-feature matrix, selecting the top 50 words via IDF, and performing hierarchical clustering to reveal word groupings like fake news and media.
Regression on Text Data • 6:11
Identify Spam Emails with Supervised Classification • 10:09
Apply supervised classification in R to distinguish ham from spam emails, building a document-term matrix from cleaned text and achieving about 94 percent accuracy.
Introduction to RTextTools • 6:16
More on RTextTools • 9:10
Use RTextTools to create a document-term matrix from email text and classify ham versus spam with a linear support vector machine. Split data 75/25 and predict unseen emails.
Classifying Textual Data • 4:00
ML Approaches For Predicting a Binary Outcome in Text Data • 12:24
ML Approaches For Predicting a Multi-Class Outcome in Text Data • 7:45
Explore multiclass classification on text data using R, including encoding removal, tokenization, TF-IDF vectorization, and training a support vector machine with tenfold cross-validation and evaluation via a confusion matrix.

A Small (Social) Network • 2:43
A More Theoretical Explanation • 4:25
Build & Visualize a Network • 14:31
Network of Emails • 6:50
More on Network Visualization • 4:10
Analysis of Tweet Network • 8:13
Extract 250 tweets for a hashtag using SocialMediaLab and magrittr, build an actor network and semantic term network, identify communities and clusters, and visualize user-hashtag associations.
Identify Word Pair Networks • 9:13
Network of Words • 4:42
Text Analysis of Jane Austen's Mansfield Park

Requirements

Should have prior experience of R and RStudio
Prior experience of statistical and machine learning techniques will be beneficial
Should have an interest in learning practical text mining and natural language processing (NLP)
Should have an interest in deriving insights from social media and text data

Description

Do You Want to Gain an Edge by Gleaning Novel Insights from Social Media?

Do You Want to Harness the Power of Unstructured Text and Social Media to Predict Trends?

Over the past decade there has been an explosion in social media sites, and sites like Facebook and Twitter are now used for everything from sharing information to distributing news. Social media both captures and sets trends. Mining unstructured text data and social media is the latest frontier of machine learning and data science.

LEARN FROM AN EXPERT DATA SCIENTIST WITH 5+ YEARS OF EXPERIENCE:

My name is Minerva Singh and I am an Oxford University MPhil (Geography and Environment) graduate. I recently finished a PhD at Cambridge University (Tropical Ecology and Conservation). I have several years of experience in analyzing real-life data from different sources using data science-related techniques and producing publications for international peer-reviewed journals. Unlike other courses out there, which focus on theory and outdated methods, this course will teach you practical techniques to harness the power of both text data and social media to build powerful predictive models. We will cover web-scraping, text mining and natural language processing along with mining social media sites like Twitter and Facebook for text data. Additionally, you will learn to apply both exploratory data analysis and machine learning techniques to gain actionable insights from text and social media data.

TAKE YOUR DATA SCIENCE CAREER TO THE NEXT LEVEL

BECOME AN EXPERT IN TEXT MINING & NATURAL LANGUAGE PROCESSING:

My course will help you implement the methods using real data obtained from different sources. Many courses use made-up data that does not empower students to implement R-based data science in real life. After taking this course, you’ll easily use packages like caret and dplyr to work with real data in R. You will also learn to use the common social media mining and natural language processing packages to extract insights from text data. I will even introduce you to some very important practical case studies, such as identifying important words in a text and predicting movie sentiments based on textual reviews. You will also extract tweets pertaining to trending topics, analyze their underlying sentiments, and identify topics with Latent Dirichlet allocation. With this powerful course, you’ll know it all: extracting text data from websites, extracting data from social media sites, and carrying out analysis of these using visualization, stats, machine learning, and deep learning!

Start analyzing data for your own projects, whatever your skill level, and impress your potential employers with actual examples of your data science projects.

HERE IS WHAT YOU WILL GET:

Data Structures and Reading in R, including CSV, Excel, JSON, HTML data.
Web-Scraping using R
Extracting text data from Twitter and Facebook using APIs
Extract and clean data from the FourSquare app
Exploratory data analysis of textual data
Common Natural Language Processing techniques such as sentiment analysis and topic modelling
Implement machine learning techniques such as clustering, regression and classification on textual data
Network analysis

Plus you will apply your newly gained skills and complete a practical text analysis assignment

We will spend some time dealing with some of the theoretical concepts. However, the majority of the course will focus on implementing different techniques on real data and interpreting the results.

After each video, you will learn a new concept or technique which you may apply to your own projects.

All the data and code used in the course has been made available free of charge and you can use it as you like. You will also have access to additional lectures that are added in the future for FREE.

JOIN THE COURSE NOW!

Who this course is for:

People who wish to learn practical text mining and natural language processing
People with prior experience of using RStudio
People with some prior experience of implementing machine learning techniques in R
People who were previously enrolled for my Data Science: Data Mining and Natural Language Processing course
People who wish to derive insights from textual and social media data

Course content

INTRODUCTION TO THE COURSE: The Key Concepts and Software Tools • 4 lectures • 16min

Reading in Data from Different Sources • 8 lectures • 45min

Webscraping: Extract Data from Webpages • 9 lectures • 51min

Introduction to APIs • 2 lectures • 9min

Text Data Mining from Social Media • 16 lectures • 1hr 24min

Exploring Text Data For Preliminary Ideas • 11 lectures • 1hr 16min

Natural Language Processing: Sentiment Analysis • 13 lectures • 1hr 45min

Text Data and Machine Learning • 9 lectures • 1hr 8min

Network Analysis • 8 lectures • 55min

Miscellaneous Lectures • 6 lectures • 21min
