
Gain hands-on skills in Python, social media mining, and natural language processing with Google Colab, analyzing tweets and cryptocurrency news for actionable insights.
Install the Anaconda data science platform, choose between individual or team editions, install Python 3.7 64-bit, and launch Jupyter notebooks via the Anaconda prompt to run data science workflows.
Explore Google Colab, a cloud-based platform for running Jupyter notebooks in the browser. Learn to create notebooks from Google Drive folders, add code cells, and import NumPy, TensorFlow, and Keras.
Explore how to enable GPU and TPU access in Google Colab to accelerate training of neural networks and deep learning models, and understand when GPUs are necessary versus CPU power.
Discover how Google Colab ships with a pre-installed suite of deep learning and data science packages, including Keras, Keras Preprocessing, TensorFlow, and pandas, letting you hit the ground running.
Explore pandas by creating series and data frames, using labels and indices, and reading data from external sources for practical data analysis.
Learn fundamental data cleaning with pandas, including reading csv data, dropping columns, identifying null values, and filling missing values with mean or specified replacements to prepare datasets for modeling.
Master the principles and chart types for visualizing data, including bar, pie, line, histogram, and box plots for categorical, discrete, and quantitative data, with scatter plots for relationships.
Analyze whether social media can be a force for good by mining Twitter data to reveal patterns in the COVID-19 crisis, comparing Mumbai and New Delhi with 30,000 tweets.
Learn how to obtain tweets without a Twitter account by installing and using Twint in Google Colab, including essential pip commands and setup steps.
Install and import the necessary packages, instantiate and configure the client, set the username to Elon Musk, and run the search to retrieve tweets.
Learn how to fetch tweets from a specific user (here, Elon Musk) using the Twint package in Google Colab, including installation workarounds and rerunning scripts after errors.
Extract the 1000 most recent tweets of a user and filter for popularity by likes, replies, and retweets using Python, applicable to any username.
Learn to fetch tweets from a specific user within a date range, filter by language, and experiment with search terms to prepare text data for analysis.
Configure Twint to search for the term BTC, limit results to 200 tweets, specify a date range from April 25 to 29, 2021, and review tweets from various users.
Identify Elon Musk's Bitcoin tweets by configuring a username-based search for BTC and Bitcoin, then run the query to retrieve relevant posts.
Learn how to fetch tweets pertaining to BTC from a location using Twint, focusing on tweets near London, adjusting limits and parameters, and exploring a workaround to query multiple cities.
Fetch tweets from multiple cities (London, Singapore, Beijing) for a search term by iterating over a city list, setting the language and the near-city parameter with a 100-tweet limit, and storing the results in a pandas data frame.
Learn to scrape tweets from multiple locations for multiple search terms using Python, and store results in a pandas data frame for analysis.
Install snscrape on Windows or macOS and use its Twitter search scraper to obtain 100 tweets, then convert them to a pandas dataframe with date and content.
Discover how application programming interfaces enable access to data from websites and social media such as Twitter and Facebook, avoid scraping, and use R to extract information.
Explore extracting geolocation data for Singapore MRT stations via the map.gov.sg API, using requests and json to collect addresses, latitudes, and longitudes, and organize them into a pandas dataframe.
Learn to extract financial headlines for specific stock tickers from Finviz, parsing the news table using Beautiful Soup and urllib to build a ticker-based headlines dataset.
Learn to obtain textual data from Reddit with Python and the PRAW package, including account setup, app creation, authentication, and pulling posts from a subreddit for analysis.
Learn to clean social media text using the tm package, applying lowercase, punctuation removal, whitespace stripping, stopword removal, and stemming to build a document-term matrix.
Begin by extracting tweet text from a dataframe, convert to a single string, and clean the data by lowercasing and removing mentions and links to prepare a word cloud.
Learn to clean text for analysis by lowering case, removing URLs and usernames, keeping letters only, and discarding short words to create cleaned text for word clouds and NLP.
Master Python-based text cleaning by processing tweets: extract text, lowercase it, remove usernames and URLs, delete non-letter characters, short words, and stopwords, then save the result as a clean string or new column.
Learn to clean tweet data by applying a defined function, creating cleaned tweets and tweets without stopwords, and removing emojis to prepare a data frame ready for analysis.
Apply an NLTK-based text cleaning workflow to tweets by defining a clean function, lowering case, removing punctuation and stopwords, and applying stemming (Porter stemmer) and lemmatization.
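The cleaning steps described in the lectures above can be sketched with the standard library alone. This is a minimal illustration, not the course's own code: the function name `clean_tweet`, the tiny `STOPWORDS` set, and the sample tweet are all assumptions made for the example (NLTK provides a much fuller stopword list).

```python
import re

# Tiny illustrative stopword list (an assumption; NLTK ships a fuller one).
STOPWORDS = {"the", "and", "for", "this", "that", "with", "are", "was", "you"}


def clean_tweet(text, min_len=3):
    """Lowercase, drop URLs and @usernames, keep letters only,
    then discard short words and stopwords."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove links
    text = re.sub(r"@\w+", " ", text)                    # remove mentions
    text = re.sub(r"[^a-z\s]", " ", text)                # keep letters only
    words = [w for w in text.split()
             if len(w) >= min_len and w not in STOPWORDS]
    return " ".join(words)


# Hypothetical sample tweet, made up for this sketch.
sample = "@elonmusk Bitcoin is up 5% today!!! Read this: https://example.com #BTC"
print(clean_tweet(sample))  # bitcoin today read btc
```

With a pandas data frame, the same function could be applied to a text column to create a new cleaned column.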
Analyze tweet frequency by counting repeated tweets and identifying the top tweets. Plot a histogram with log-scaled frequencies to visualize tweet copies.
Learn to extract hashtags and mentions from tweets and identify who is mentioned or retweeted, using functions and dataframe operations to create new columns.
Analyze tweet data to extract hashtags, flatten the hashtag lists into rows, and compute frequencies to identify the most popular hashtags, such as BTC appearing 134 times.
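As a rough sketch of the hashtag workflow (extract, flatten, count), assuming plain Python lists instead of a data frame: the helper name `extract_hashtags` and the three sample tweets are hypothetical.

```python
import re
from collections import Counter


def extract_hashtags(text):
    """Return the hashtags in one tweet, lowercased, without the leading '#'."""
    return [tag.lower() for tag in re.findall(r"#(\w+)", text)]


# Hypothetical tweets, made up for this sketch.
tweets = [
    "Watching the #BTC daily candle #crypto",
    "#btc dominance keeps rising",
    "New to #ETH and #crypto",
]

# Flatten the per-tweet hashtag lists into one list, then count frequencies.
all_tags = [tag for tweet in tweets for tag in extract_hashtags(tweet)]
hashtag_counts = Counter(all_tags)
print(hashtag_counts.most_common(2))
```

In pandas, the flattening step is commonly done by exploding a list-valued column into rows before counting.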
This lecture demonstrates using Python to analyze tweets and identify the most common usernames by grouping by name, counting occurrences, and extracting the top five, such as Tony Tan with 92 occurrences.
Learn how word clouds visualize text by highlighting frequent keywords while filtering out common words, and see how to build them from text using Python.
Learn to analyze text data using Python by building a word cloud, installing the wordcloud package, importing modules, and extracting tweet text from a dataframe for basic textual analysis.
Create a word cloud from cleaned BTC tweets using the wordcloud library and matplotlib, removing stopwords and usernames to highlight Bitcoin, XRP, ETH, Doge, and other crypto themes.
Create a word cloud from cleaned tweet texts by removing stopwords and using the wordcloud library, showing Bitcoin, XRP, ETH, and other crypto terms prominently.
Explore n-grams, including unigrams, bigrams, and trigrams, and learn to clean tweets with stopwords and WordNet using NLTK, counting top co-occurring word pairs.
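Counting co-occurring word pairs can be illustrated without any NLP library. The sketch below assumes the tweets are already cleaned and whitespace-tokenized; the helper `ngrams` and the three sample strings are made up for the example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return list(zip(*(tokens[i:] for i in range(n))))


# Hypothetical cleaned tweets, made up for this sketch.
cleaned = [
    "btc market cap rising",
    "btc market cap falling",
    "daily candle btc",
]

# Count bigrams (n = 2) across all tweets.
bigram_counts = Counter()
for tweet in cleaned:
    bigram_counts.update(ngrams(tweet.split(), 2))

print(bigram_counts.most_common(2))
```

Setting `n` to 1 or 3 yields unigrams or trigrams from the same helper.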
Build a network of bigrams around bitcoin (BTC) to show term relationships using pandas and networkx. See how terms like hourly dominance, daily candle, and market cap cluster near BTC.
Explore topic modeling by preprocessing text and using gensim to build an LDA model that reveals topics and their word weights from tweets.
Analyze text data for polarity and subjectivity using a cleaned, lowercased, punctuation-removed, stopword-filtered, stemmed and lemmatized workflow, then apply TextBlob sentiment scoring to assign positive, negative, or neutral values.
Apply polarity analysis to tweet data by lemmatizing text, computing polarity, and classifying tweets as positive, neutral, or negative using defined thresholds in Python.
Isolate the day from the Twitter timestamp to study vanilla time variation and daily polarity by grouping by date after converting to Python date.
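A minimal sketch of threshold-based labelling and daily grouping, assuming polarity scores have already been computed. The 0.05 threshold, the timestamp format, and the three sample rows are assumptions for illustration only; the course may use different values.

```python
from collections import defaultdict
from datetime import datetime


def label_polarity(score, threshold=0.05):
    """Map a polarity score to a label. The threshold is an assumed value."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


# Hypothetical (timestamp, polarity) rows, made up for this sketch.
rows = [
    ("2021-06-01 09:15:00", 0.40),
    ("2021-06-01 18:30:00", -0.10),
    ("2021-06-02 11:00:00", 0.00),
]

# Keep only the calendar day of each timestamp, then group the scores by day.
by_day = defaultdict(list)
for timestamp, score in rows:
    day = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").date().isoformat()
    by_day[day].append(score)

# Average the polarity for each day and label the result.
daily_mean = {day: sum(scores) / len(scores) for day, scores in by_day.items()}
for day, mean in daily_mean.items():
    print(day, round(mean, 2), label_polarity(mean))
```

With pandas, the equivalent would be a group-by on the date part of the timestamp column.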
Explore the VADER sentiment analysis algorithm, a valence-aware dictionary approach for Twitter and microblogs, which rates words on a -4 to 4 valence scale and normalizes the overall compound score to between -1 and 1.
Implement sentiment-based analysis on tweets using VADER's sentiment intensity analyzer, preprocess text, compute compound, positive, negative, and neutral scores, and assemble results in a dataframe.
Apply VADER sentiment analysis to Finviz financial news headlines and compute compound scores by ticker and date, then plot mean scores for AMC, GME, and Clean.
Visualize the sentiments by plotting tweet polarity with seaborn for June 2021, showing neutral, positive, and negative counts and tracing polarity over time with a line plot.
Explore the basic theory of machine learning, where algorithms learn from data without formal equations, and distinguish unsupervised from supervised learning, including classification and regression.
Explore a toy dataset with dummy hotel reviews to illustrate text preprocessing for classification, including tokenization, Lancaster stemming to root forms, and building a bag-of-words with a count vectorizer.
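To make the bag-of-words idea concrete, here is a standard-library stand-in for a count vectorizer. It is only a sketch: the three dummy reviews are invented for the example, and stemming is omitted for brevity.

```python
from collections import Counter

# Hypothetical dummy reviews, made up for this sketch.
reviews = ["great hotel great staff", "dirty room", "great room"]

# The vocabulary is the sorted set of all words seen in the corpus.
vocabulary = sorted({word for review in reviews for word in review.split()})


def vectorize(text):
    """Return the count of each vocabulary word in the given text."""
    counts = Counter(text.split())
    return [counts[word] for word in vocabulary]


# One row per review, one column per vocabulary word.
matrix = [vectorize(review) for review in reviews]
print(vocabulary)
for row in matrix:
    print(row)
```

Each row of `matrix` is the kind of numeric feature vector a classifier such as a random forest can be trained on.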
Fit a random forest classifier on a bag of words to predict review scores from textual data, illustrating the classification approach and the importance of training and testing data.
Predict stock price movements from newspaper headlines using a bag-of-words model and a random forest classifier, with data prep, count vectorization, date-based training and testing, achieving about 85% accuracy.
Explore unsupervised clustering with k-means, a method that partitions data into k clusters by assigning observations to the nearest centroid using Euclidean distance, with iterative refinement.
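The assign-then-update loop of k-means can be sketched in pure Python. This is a simplified illustration under stated assumptions: centroids are initialized from evenly spaced input points (real implementations usually use random or k-means++ initialization), the loop runs a fixed number of iterations, and the six 2D points are made up.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=10):
    """Naive k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of its members, and repeat."""
    # Simplistic deterministic initialization (an assumption of this sketch).
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    assignments = []
    for _ in range(iterations):
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignments


# Hypothetical 2D points forming two obvious groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centroids, labels = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

For text, the points would be document vectors (for example TF-IDF rows) rather than 2D coordinates.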
Identify textual clusters in tweets using k-means, TF-IDF vectorization, stopword removal, and PCA visualization to determine optimal cluster numbers and interpret common words.
Explore DBSCAN for textual clustering by preprocessing text with count vectorizer and TF-IDF, removing stopwords and using n-grams up to length 3, then cluster with DBSCAN and compare to k-means.
Classify tweets using lemmatized text and a count vectorizer, train a gradient boosting model on a 70/30 split to predict positive, neutral, or negative, achieving about 80% accuracy.
Install Keras on Windows with Anaconda, create and activate a conda environment, install Keras, and verify by importing Keras and configuring the backend (TensorFlow or Theano) via keras.json.
Activate the Anaconda deep learning environment on Mac, install Keras with conda, verify the TensorFlow backend, then switch to Theano by editing keras.json.
Explore long short-term memory (LSTM) as a special recurrent neural network, its forget gate, update gate, and output gate, and apply LSTM to cryptocurrency time series forecasting.
Explore word embeddings that map words to vectors to reduce dimensionality, contrast with one-hot encoding, and introduce GloVe as an unsupervised model built on word co-occurrence and log-bilinear regression.
Visualize building an LSTM model for tweet sentiment classification using GloVe embeddings, tokenization, and padding, achieving about 70% training and validation accuracy.
Explore how dictionaries store data as key–value pairs in Python, manipulate them with copy, delete, and add operations, access values, and sort keys.
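The dictionary operations listed above look roughly like this. The dictionary contents are illustrative: the BTC count echoes the figure quoted in the hashtag lecture summary, while the other keys and values are made up.

```python
# Hashtag counts as key-value pairs (values other than BTC are hypothetical).
hashtag_counts = {"btc": 134, "eth": 40}

hashtag_counts["doge"] = 12                # add a new key-value pair
backup = hashtag_counts.copy()             # shallow copy, independent of later edits
del hashtag_counts["eth"]                  # delete an entry by key

btc_count = hashtag_counts["btc"]          # access a value by key
xrp_count = hashtag_counts.get("xrp", 0)   # safe access with a default value
sorted_keys = sorted(hashtag_counts)       # keys in alphabetical order

print(sorted_keys, btc_count, xrp_count)
```

Because `backup` was copied before the deletion, it still contains the `"eth"` entry.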
Set up the Foursquare app API, register your project, and obtain a client ID and client secret. Authenticate and access venue tips for geolocation-based recommendations.
Master text cleaning with NLTK in Python to prepare data for text analysis and natural language processing tasks.
Learn how to generate concise summaries of long texts with AI, comparing Jen and BART transformer methods in a case study using earth observation data and blockchain.
Discover how natural language processing analyzes human language, converts input into useful representations, and enables tasks like email filtering, translation, and sentiment analysis.
Posit enables browser-based deployment of data science projects from RStudio or Jupyter, allowing you to share with teams, teach and learn, deploy apps like Shiny, Streamlit, Dash, and scale work.
Distributed computing frameworks coordinate multiple nodes to split large problems into tasks, run them in parallel, ensure fault tolerance, allocate resources, and scale using Hadoop or Apache Spark.
ENROLL IN MY LATEST COURSE ON HOW TO LEARN ALL ABOUT PYTHON SOCIAL MEDIA & NATURAL LANGUAGE PROCESSING (NLP)
Do you want to harness the power of social media to make financial decisions?
Are you looking to gain an edge in the fields of retail, online selling, real estate and geolocation services?
Do you want to turn unstructured data from social media and web pages into real insights?
Do you want to develop cutting edge analytics and visualisations to take advantage of the millions of Twitter posts that appear each day?
Gaining proficiency in social media mining can help you harness the power of the freely available data and information on the world wide web (including popular social media sites such as Twitter) and turn it into actionable insights.
MY COURSE IS A HANDS-ON TRAINING WITH REAL PYTHON SOCIAL MEDIA MINING: You will learn to carry out text analysis and natural language processing (NLP) to gain insights from unstructured text data, including tweets.
My course provides a foundation to carry out PRACTICAL, real-life social media mining. By taking this course, you are taking an important step forward in your data science journey to become an expert in harnessing the power of social media for deriving insights and identifying trends.
Why Should You Take My Course?
I have an MPhil (Geography and Environment) from the University of Oxford, UK. I also completed a data science-intensive PhD at Cambridge University (Tropical Ecology and Conservation).
I have several years of experience in analyzing real-life data from different sources and producing publications for international peer-reviewed journals.
This course will help you gain fluency in the different aspects of text analysis and NLP by working through a real-life example of cryptocurrency tweets and financial news using a powerful cloud-based Python environment called Google Colab. Specifically, you will:
Gain proficiency in setting up and using Google Colab for Python Data Science tasks
Carry out common social media mining tasks such as obtaining tweets (e.g. tweets relating to Bitcoin)
Work with complicated web pages and extract information
Process the extracted textual information into a usable form via preprocessing techniques implemented with powerful Python packages such as NLTK
Gain a thorough grounding in text analysis and NLP-related Python packages such as NLTK and snscrape, among others
Carry out common text analytics tasks such as Sentiment Analysis
Implement machine learning and artificial intelligence techniques on text data
You will work on practical mini case studies relating to (a) extracting and pre-processing tweets from certain users and topics relating to cryptocurrencies, (b) identifying the sentiments of cryptocurrency tweets, and (c) classifying your tweets using machine learning models.
In addition to all the above, you’ll have MY CONTINUOUS SUPPORT to make sure you get the most value out of your investment!
ENROLL NOW :)