Text Analysis and Natural Language Processing With Python

Rating: 4.5 (525 reviews)

Use Python and Google CoLab For Social Media Mining and Text Analysis and Natural Language Processing (NLP)

Created by Minerva Singh

Last updated 11/2024

English

What you'll learn

Students will be able to read in data from different sources, including websites and social media
Social media mining from Twitter
Extract information relating to tweets and posts
Analyze text data for emotions
Carry out Sentiment analysis
Implement natural language processing (NLP) on different types of text data
Introduction to some of the most common Python text analysis packages

Course content

10 sections • 71 lectures • 4h 56m total length

Welcome to the Course • 2:51
Gain hands-on skills in Python, social media mining, and natural language processing with Google Colab, analyzing tweets and cryptocurrency news for actionable insights.
Data and Code • 0:10
Python Installation • 5:44
Install the Anaconda data science platform, choose between individual or team editions, install Python 3.7 64-bit, and launch Jupyter notebooks via the Anaconda prompt to run data science workflows.
What Is Google CoLab? • 7:13
Explore Google Colab, a cloud-based platform for running your Jupyter notebooks in the browser. Learn to create notebooks from drive folders, add code cells, and import NumPy, TensorFlow, and Keras.
Google Colabs and GPU • 5:50
Explore how to enable GPU and TPU access in Google Colab to accelerate training of neural networks and deep learning models, and understand when GPUs are necessary versus CPU power.
Google Colab Packages • 4:27
Discover how Google Colab ships with a pre-installed suite of deep learning and data science packages, including Keras, Keras Preprocessing, TensorFlow, and pandas, letting you hit the ground running.

What Is Pandas? • 12:06
Explore pandas by creating series and data frames, using labels and indices, and reading data from external sources for practical data analysis.
Basic Data Cleaning With Pandas • 4:30
Learn fundamental data cleaning with pandas, including reading CSV data, dropping columns, identifying null values, and filling missing values with mean or specified replacements to prepare datasets for modeling.
Basics of Data Visualization • 6:46
Master the principles and chart types for visualizing data, including bar, pie, line, histogram, and box plots for categorical, discrete, and quantitative data, with scatter plots for relationships.

Obtaining Tweets Without A Twitter Account • 2:34
Learn how to obtain tweets without a Twitter account by installing and using Twint in Google Colab, including essential pip commands and setup steps.
Lets Dip Our Toes Into Twitter • 1:25
Install and import the necessary packages, instantiate and configure the client, set the username to Elon Musk, and run the search to retrieve tweets.
Get Elon Musk's Tweet • 2:25
Learn how to fetch Elon Musk tweets from a specific user using the Twint package in Google Colab, including installation workarounds and rerunning scripts after errors.
Obtain The Most Popular Tweets of a User • 4:50
Extract the 1000 most recent tweets of a user and filter for popularity by likes, replies, and retweets using Python, applicable to any username.
Obtain Tweets For A User Between A Certain Date • 4:09
Learn to fetch tweets from a specific user within a date range, filter by language, and experiment with search terms to prepare text data for analysis.
Look With For With a Specific Term • 2:46
Configure Twint to search for the term BTC, limit results to 200 tweets, specify a date range from April 25 to 29, 2021, and review tweets from various users.
Elon Musk's Bitcoin Tweets • 1:13
Identify Elon Musk's Bitcoin tweets by configuring a username-based search for BTC and Bitcoin, then run the query to retrieve relevant posts.
Tweets From a Location • 2:09
Learn how to fetch tweets pertaining to BTC from a location using Twint, focusing on tweets near London, adjusting limits and parameters, and exploring a workaround to query multiple cities.
Tweets From Multiple Locations • 3:23
Fetch tweets for a search term from multiple cities (London, Singapore, Beijing) by iterating over a city list, setting the language and the near-city parameter with a 100-tweet limit, and storing the results in a pandas data frame.
Tweets From Multiple Locations and Multiple Terms • 5:58
Learn to scrape tweets from multiple locations for multiple search terms using Python, and store results in a pandas data frame for analysis.
Another Way of Obtaining Tweets • 3:17
Install snscrape on Windows or macOS and use its Twitter search scraper to obtain 100 tweets, then convert them to a pandas dataframe with date and content.
More Snscrape Tweets • 2:49

What is API? • 2:33
Discover how application programming interfaces enable access to data from websites and social media such as Twitter and Facebook, avoid scraping, and use R to extract information.
Using APIs: Singapore MRT Stations • 3:21
Explore extracting geolocation data for Singapore MRT stations via the map.gov.sg API, using requests and json to collect addresses, latitudes, and longitudes, and organize them into a pandas dataframe.
Obtain Financial News Headlines • 4:13
Learn to extract financial headlines for specific stock tickers with finviz, parsing the news table using Beautiful Soup and urllib to build a ticker-based headlines dataset.
Obtaining Textual Data From Reddit • 8:21
Learn to obtain textual data from Reddit with Python and the PRAW package, including account setup, app creation, authentication, and pulling posts from a subreddit for analysis.

Introduction to Theory • 4:22
Learn to clean social media text using the HTM package, applying lowercase, punctuation removal, whitespace stripping, stopword removal, and stemming to build a document-term matrix.
Lets Start Cleaning The Text • 3:23
Begin by extracting tweet text from a dataframe, convert it to a single string, and clean the data by lowercasing and removing mentions and links to prepare a word cloud.
Final Cleaned Text • 3:11
Learn to clean text for analysis by lowering case, removing URLs and usernames, keeping letters only, and discarding short words to create cleaned text for word clouds and NLP.
A Function For Text Cleaning • 3:05
Master Python-based text cleaning by processing tweets: extract the text, lowercase it, remove usernames and URLs, delete non-letter characters, short words, and stopwords, then save the result as a clean string or new column.
More Text Cleaning • 2:22
Learn to clean tweet data by applying a defined function, creating cleaned tweets and tweets without stopwords, and removing emojis to prepare a data frame ready for analysis.
Another NLTK-Based Workflow • 3:49
Apply an NLTK-based text cleaning workflow to tweets by defining a clean function, lowering case, removing punctuation and stopwords, and applying stemming (Porter stemmer) and lemmatization.
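
The cleaning steps listed in this section (lowercase, strip URLs and usernames, keep letters only, drop short words and stopwords) can be sketched with the standard library alone. This is a minimal illustration, not the course's own code: the function name clean_tweet, the min_len parameter, the sample tweet, and the tiny stopword set are all assumptions made for the example, and a real workflow would more likely use NLTK's stopword list.

```python
import re

# Tiny illustrative stopword set (an assumption; NLTK ships a much fuller list).
STOPWORDS = {"the", "and", "for", "this", "that", "with", "are", "was", "you"}


def clean_tweet(text, min_len=3):
    """Lowercase, strip URLs and @usernames, keep letters only,
    then drop short words and stopwords."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove links
    text = re.sub(r"@\w+", " ", text)                   # remove @mentions
    text = re.sub(r"[^a-z\s]", " ", text)               # keep letters only
    words = [w for w in text.split()
             if len(w) >= min_len and w not in STOPWORDS]
    return " ".join(words)


# Hypothetical sample tweet, invented for this sketch.
sample = "@elonmusk Bitcoin is up 5% today!! Read this: https://example.com #BTC"
cleaned = clean_tweet(sample)
print(cleaned)  # bitcoin today read btc
```

The order matters: URLs and mentions are removed before the letters-only filter, otherwise fragments such as "https" or the username would survive as ordinary words.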

Tweet Lengths • 4:17
How People Interact With Tweets • 1:43
Analyze tweet frequency by counting repeated tweets and identifying the top tweets. Plot a histogram with log-scaled frequencies to visualize tweet copies.
Of Mentions and Hashtags • 2:38
Learn to extract hashtags and mentions from tweets and identify who is mentioned or retweeted, using functions and dataframe operations to create new columns.
Identify The Most Popular Hashtags • 2:16
Analyze tweet data to extract hashtags, flatten the hashtag lists into rows, and compute frequencies to identify the most popular hashtags, such as BTC appearing 134 times.
Identify the Most Common Usernames • 1:55
This lecture demonstrates using Python to analyze tweets and identify the most common usernames by grouping by name, counting occurrences, and extracting the top five, such as Tony Tan (92 occurrences).
What Are Wordclouds? • 3:21
Learn how word clouds visualize text by highlighting frequent keywords while filtering out common words, and see how to build them from text using Python.
Basic Wordcloud-Install • 2:57
Learn to analyze text data using Python by building a word cloud, installing the wordcloud package, importing modules, and extracting tweet text from a dataframe for basic textual analysis.
A Basic Wordcloud • 5:33
Create a word cloud from cleaned BTC tweets using the wordcloud library and matplotlib, removing stopwords and usernames to highlight bitcoin, XRP, ETH, doge, and other crypto themes.
Word Count of Common Words • 5:33
Create a word cloud from cleaned tweet texts by removing stopwords and using the wordcloud library, showing Bitcoin, XRP, ETH, and other crypto terms prominently.
N-Grams • 4:07
Explore n-grams, including unigrams, bigrams, and trigrams, and learn to clean tweets with stopwords and WordNet using NLTK, counting top co-occurring word pairs.
Network of Bigrams • 3:38
Build a network of bigrams around bitcoin (BTC) to show term relationships using pandas and networkx. See how terms like hourly dominance, daily candle, and market cap cluster near BTC.
Topic Modelling With Gensim • 6:15
Explore topic modeling by preprocessing text and using Gensim to build an LDA model that reveals topics and their word weights from tweets.
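
Two of the ideas in this section, counting hashtags and mentions and counting bigrams, can be sketched roughly as follows. The course works with pandas data frames and NLTK; this sketch swaps in plain Python lists and collections.Counter so it runs without extra packages. The sample tweets and the helper names extract_tags and bigrams are invented for illustration.

```python
import re
from collections import Counter

# Made-up sample tweets (assumption: real data would come from a data frame).
tweets = [
    "BTC breaks out again #BTC #crypto @whale_alert",
    "market cap of #BTC keeps rising",
    "daily candle looks strong #crypto",
]


def extract_tags(text, symbol):
    """Return lowercased tokens that follow `symbol` ('#' or '@')."""
    pattern = re.escape(symbol) + r"(\w+)"
    return [m.lower() for m in re.findall(pattern, text)]


def bigrams(text):
    """Return consecutive word pairs from the alphabetic words in `text`."""
    words = re.findall(r"[a-z]+", text.lower())
    return list(zip(words, words[1:]))


# Flatten the per-tweet lists and count frequencies.
hashtag_counts = Counter(tag for t in tweets for tag in extract_tags(t, "#"))
mention_counts = Counter(name for t in tweets for name in extract_tags(t, "@"))
bigram_counts = Counter(pair for t in tweets for pair in bigrams(t))

print(hashtag_counts)
print(mention_counts)
print(bigram_counts.most_common(3))
```

The same counters could feed a bar chart of popular hashtags or supply edge weights for a bigram network.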

Identify the Polarity of Text • 4:19
Analyze text data for polarity and subjectivity using a cleaned, lowercased, punctuation-removed, stopword-filtered, stemmed and lemmatized workflow, then apply TextBlob sentiment scoring to assign positive, negative, or neutral values.
Polarity: Positive or Negative • 2:49
Apply polarity analysis to tweet data by lemmatizing text, computing polarity, and classifying tweets as positive, neutral, or negative using defined thresholds in Python.
Dealing With Dates • 3:15
Isolate the day from the Twitter timestamp to study vanilla time variation and daily polarity by grouping by date after converting to Python date.
Introduction to VADER Sentiment Analysis • 2:33
Explore the VADER sentiment analysis algorithm, a valence-aware dictionary approach for Twitter and microblogs, which rates words on a -4 to 4 valence scale and reports a normalized compound score between -1 and 1.
Dealing With Dates • 3:24
Implement sentiment-based analysis on tweets using the VADER sentiment intensity analyzer, preprocess text, compute compound, positive, negative, and neutral scores, and assemble results in a dataframe.
VADER Sentiment For Financial News • 4:16
Apply VADER sentiment analysis to finviz financial news headlines and compute compound scores by ticker and date, then plot mean scores for AMC, GME, and Clean.
Visualise the Sentiments • 3:09
Visualize the sentiments by plotting tweet polarity with seaborn for June 2021, showing neutral, positive, and negative counts and tracing polarity over time with a line plot.
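
The thresholding and date-grouping steps described in this section might look roughly like the sketch below. It assumes the polarity scores have already been computed (for example by TextBlob or VADER) and uses invented timestamps and scores. The 0.05 cut-off and the names label_polarity and daily_mean are illustrative assumptions, not values taken from the course.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def label_polarity(score, threshold=0.05):
    """Map a score in [-1, 1] to a label; the threshold is an assumption."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


# Hypothetical (timestamp, polarity score) pairs, invented for this sketch.
scored = [
    ("2021-06-01 09:15:00", 0.60),
    ("2021-06-01 18:40:00", -0.20),
    ("2021-06-02 11:05:00", 0.00),
]

labels = [label_polarity(score) for _, score in scored]

# Isolate the calendar day from each timestamp and group the scores by it.
by_day = defaultdict(list)
for stamp, score in scored:
    day = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").date()
    by_day[day].append(score)

daily_mean = {day: mean(scores) for day, scores in by_day.items()}

print(labels)
print(daily_mean)
```

In a pandas workflow the same grouping would typically be a groupby on a date column; the dictionary version here only shows the idea.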

What Is Machine Learning? • 5:32
Explore the basic theory of machine learning, where algorithms learn from data without formal equations, and distinguish unsupervised from supervised learning, including classification and regression.
Preprocessing-Toy Example • 3:20
Explore a toy dataset with dummy hotel reviews to illustrate text preprocessing for classification, including tokenization, Lancaster stemming to root forms, and building a bag-of-words with a count vectorizer.
A Simple Machine Learning Model on Textual Data • 5:27
Fit a random forest classifier on a bag of words to predict review scores from textual data, illustrating the classification approach and the importance of training and testing data.
Predicting Stock Price Movements Based On Newspaper Headlines • 6:26
Predict stock price movements from newspaper headlines using a bag-of-words model and a random forest classifier, with data prep, count vectorization, date-based training and testing, achieving about 85% accuracy.
Unsupervised Learning With K-Means Algorithm • 1:57
Explores unsupervised clustering with k-means, a method that partitions data into k clusters by assigning observations to the nearest centroid using Euclidean distance, with iterative refinement.
Identifying Textual Clusters With K-means • 5:45
Identify textual clusters in tweets using k-means, tf-idf vectorization, stopword removal, and PCA visualization to determine optimal cluster numbers and interpret common words.
DBSCAN Based Textual Clustering • 2:35
Explore DBSCAN for textual clustering by preprocessing text with a count vectorizer and tf-idf, removing stopwords and including n-grams up to 3, then cluster with DBSCAN and compare to k-means.
Classify the Tweet Sentiment-GBM • 4:30
Classify tweets using lemmatized text and a count vectorizer, train a gradient boosting model on a 70/30 split to predict positive, neutral, or negative, achieving about 80% accuracy.
Keras Installation-Windows • 5:16
Install Keras on Windows with Anaconda, create and activate a conda environment, install Keras, and verify by importing Keras and configuring the backend (TensorFlow or Theano) via keras.json.
Keras Installation-Mac • 4:19
Activate the Anaconda deep learning environment on Mac, install Keras with conda, and verify the TensorFlow backend, then switch to Theano by editing keras.json.
Long short-term memory (LSTM): Theory • 5:40
Explore long short-term memory (LSTM) as a special recurrent neural network, its forget gate, update gate, and output gate, and apply LSTM to cryptocurrency time series forecasting.
Brief Lowdown on Word Embeddings • 3:19
Explore word embeddings that map words to vectors to reduce dimensionality, contrast with one-hot encoding, and introduce GloVe as an unsupervised model built on word co-occurrence and log-linear regression.
LSTM For Classifying Tweet Sentiment-1 • 5:54
Visualize building an LSTM model for tweet sentiment classification using GloVe embeddings, tokenization, and padding, achieving about 70% training and validation accuracy.
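
Several lectures in this section start from a bag-of-words representation built with a count vectorizer. As a rough illustration of what that representation contains, here is a standard-library stand-in. It is not scikit-learn's vectorizer: the reviews are dummy data and the names tokenize, vocabulary, and vectorize are invented for the sketch.

```python
from collections import Counter

# Dummy hotel reviews, invented for illustration.
reviews = [
    "great hotel great staff",
    "bad room bad service",
    "great room",
]


def tokenize(text):
    """Split on whitespace after lowercasing (no stemming in this sketch)."""
    return text.lower().split()


# The vocabulary fixes the column order of the document-term matrix.
vocabulary = sorted({word for review in reviews for word in tokenize(review)})


def vectorize(text):
    """Count how often each vocabulary word occurs in `text`."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


matrix = [vectorize(review) for review in reviews]

print(vocabulary)  # ['bad', 'great', 'hotel', 'room', 'service', 'staff']
for row in matrix:
    print(row)
```

Each row is one review and each column one vocabulary word; a classifier such as a random forest would be trained on rows like these together with their labels.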

Lets Do Dictionaries • 10:33
Explore how dictionaries store data as key–value pairs in Python, manipulate them with copy, delete, and add operations, access values, and sort keys.
Set up the FourSquare App • 4:32
Set up the Foursquare app API, register your project, and obtain a client ID and client secret. Authenticate and access venue tips for geolocation-based recommendations.
NLTK Cleaning • 3:52
Master text cleaning with NLTK in Python to prepare data for text analysis and natural language processing tasks.
Text Summarisation With AI-Case Study • 7:48
Learn how to generate concise summaries of long texts with AI, comparing Jen and BART transformer methods in a case study using earth observation data and blockchain.
Of NLP and ChatGPT • 5:05
Discover how natural language processing analyzes human language, converts input into useful representations, and enables tasks like email filtering, translation, and sentiment analysis.
Posit On POSIT • 3:31
Posit enables browser-based deployment of data science projects from RStudio or Jupyter, allowing you to share with teams, teach and learn, deploy apps like Shiny, Streamlit, Dash, and scale work.
Distributed Computing • 4:03
Distributed computing frameworks coordinate multiple nodes to split large problems into tasks, run them in parallel, ensure fault tolerance, allocate resources, and scale using Hadoop or Apache Spark.

Requirements

Should have prior experience of Python data science
Prior experience of statistical and machine learning techniques will be beneficial
Should have an interest in extracting unstructured text data from social media and websites
Should have an interest in extracting insights from text analysis
Should have an interest in applying machine learning models on text data

Description

ENROLL IN MY LATEST COURSE ON HOW TO LEARN ALL ABOUT PYTHON SOCIAL MEDIA & NATURAL LANGUAGE PROCESSING (NLP)

Do you want to harness the power of social media to make financial decisions?
Are you looking to gain an edge in the fields of retail, online selling, real estate and geolocation services?
Do you want to turn unstructured data from social media and web pages into real insights?
Do you want to develop cutting edge analytics and visualisations to take advantage of the millions of Twitter posts that appear each day?

Gaining proficiency in social media mining can help you harness the power of the freely available data and information on the world wide web (including popular social media sites such as Twitter) and turn it into actionable insights.

MY COURSE IS A HANDS-ON TRAINING WITH REAL PYTHON SOCIAL MEDIA MINING. You will learn to carry out text analysis and natural language processing (NLP) to gain insights from unstructured text data, including tweets.

My course provides a foundation to carry out PRACTICAL, real-life social media mining. By taking this course, you are taking an important step forward in your data science journey to become an expert in harnessing the power of social media for deriving insights and identifying trends.

Why Should You Take My Course?

I have an MPhil (Geography and Environment) from the University of Oxford, UK. I also completed a data science-intensive PhD at Cambridge University (Tropical Ecology and Conservation).

I have several years of experience in analyzing real-life data from different sources and producing publications for international peer-reviewed journals.

This course will help you gain fluency in the different aspects of text analysis and NLP by working through a real-life example of cryptocurrency tweets and financial news, using a powerful cloud-based Python environment called Google Colab. Specifically, you will:

Gain proficiency in setting up and using Google CoLab for Python Data Science tasks
Carry out common social media mining tasks such as obtaining tweets (e.g. tweets relating to bitcoins)
Work with complicated web pages and extract information
Process the extracted textual information into a usable form via preprocessing techniques implemented with powerful Python packages such as NLTK
Gain a thorough grounding in text analysis and NLP-related Python packages such as NLTK and snscrape, among others
Carry out common text analytics tasks such as Sentiment Analysis
Implement machine learning and artificial intelligence techniques on text data

You will work on practical mini case studies relating to (a) extracting and pre-processing tweets from certain users and topics relating to cryptocurrencies, (b) identifying the sentiments of cryptocurrency tweets, and (c) classifying your tweets using machine learning models.

In addition to all the above, you’ll have MY CONTINUOUS SUPPORT to make sure you get the most value out of your investment!

ENROLL NOW :)

Who this course is for:

People who wish to learn practical text mining and natural language processing
People who wish to derive insights from textual and social media data
People wanting to understand the impact of human sentiments on financial markets


Course content

Introduction To Social Media Mining With Python • 6 lectures • 26min

Basic Data Preprocessing • 3 lectures • 23min

Welcome To Social Media • 1 lecture • 4min

Extracting Tweets (Without An API) • 12 lectures • 37min

Other Ways of Obtaining Textual Data • 4 lectures • 18min

Basic Textual Data Preprocessing • 6 lectures • 20min

Exploring Text Data • 12 lectures • 44min

Exploring Sentiments • 7 lectures • 24min

Machine Learning and Deep learning For Text Data • 13 lectures • 1hr

Miscellaneous Information • 7 lectures • 39min
