
Explore how natural language processing uses big data, powerful computing, and enhanced algorithms to derive meaning from language pieces, with a focus on sentiment analysis, contextual extraction, and machine translation.
No background is required as this course guides you to build a sentiment analysis project in R, analyzing character sentiments from a Netflix show.
Learn basic and advanced programming concepts, web scraping tools, and data manipulation in NLP technologies while applying sentiment analysis through challenges, quizzes, and practical projects.
R is a programming language and software environment for statistical analysis, graphics, and reporting. It features conditionals, looping, and data handling, with a vast ecosystem of packages for data analytics.
Learn to find and install R packages from the official repository, then load them with library() and explore their docs with help() to use package functions effectively.
Install and load packages in R using install.packages() and library() to enable data visualization, data manipulation, web scraping, and machine learning workflows.
Identify and use the four main data types in R—logical, integer, numeric, and character—recognize that everything is an object, and learn how data types and data structures drive day-to-day analysis.
Learn how the R assignment operator works and that variables auto print when you type their name. Compare this to Python’s print function and use print for explicit output.
Learn how to assign the same value to multiple variables in a single line in R, with practical examples and printing results.
Discover how to create and assign variables in R using = or <-, print values, and concatenate names with paste or paste0, while exploring numeric and string types.
Use descriptive variable names in R. Start with a letter, avoid spaces, and use only letters, digits, and underscores (an underscore cannot come first); remember that names are case sensitive and that reserved words cannot be used.
Explore data types and type casting in programming, including converting characters to integers, integers to strings, and booleans, with examples and practical implications for data manipulation and modeling.
Learn how assignment operators work in R, using both the equal sign and alternative symbols to create variables by naming them and assigning content, with Python examples for context.
Explore how vectors in R store items of the same type, create numeric, string, and logical vectors with c() and ranges, count items with length(), and sort them with sort().
Access vector items in R using one-based indexing and brackets. Use c() to select single or multiple elements, and apply negative indexing to exclude items.
Generate a sequenced vector by specifying from, to, and by in the seq() function, yielding 0, 20, 40, 60, 80, 100 and illustrating the interval concept for project work.
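A minimal sketch of the sequence described above, using base R's seq(). The variable name intervals is only illustrative.

```r
# Build a vector that runs from 0 to 100 in steps of 20.
intervals <- seq(from = 0, to = 100, by = 20)

print(intervals)   # 0 20 40 60 80 100
length(intervals)  # 6 values in total
```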
Create lists in R with the list() function to hold mixed data types. Access and modify elements by index, and determine the list length with length().
Learn to check item existence in a list using the %in% operator, with apple present yielding TRUE and pineapple absent yielding FALSE.
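As a rough illustration of the membership check, the sketch below uses R's %in% operator on a small list. Only apple and pineapple come from the lecture description; the other fruit names are made up for the example.

```r
# A small list of fruits; "banana" and "cherry" are placeholder values.
fruits <- list("apple", "banana", "cherry")

has_apple     <- "apple" %in% fruits      # TRUE: apple is in the list
has_pineapple <- "pineapple" %in% fruits  # FALSE: pineapple is not

print(has_apple)
print(has_pineapple)
```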
Discover how to append items to the end of a list using append, and remove items via negative indexing to create a new updated list, with Python examples.
Explore lists in R, which hold mixed data types, and learn to access, modify, and append items using indices. Determine the list length and remove elements with negative indexing.
Learn how to create matrices in R using the matrix function, specify rows and columns, understand vector recycling, and access elements with row and column indices.
Explore relational data as tabular data frames in R, learn how columns and rows form variables and observations, and enforce consistent data types across each column (strings, floats, integers).
Explore how data frames present data in a tabular format with columns of consistent types, and learn to build them in R using vectors and data.frame.
Access data frame columns using single brackets, double brackets, or the dollar sign, with the training column demonstrated; prefer the dollar sign for simplicity and readability.
Add rows to the data frame by using the rbind() function to insert a new observation with training and duration values, then print the updated data frame to verify.
This lecture shows how to add a new column named steps to a data frame using the cbind() function, turning three columns into four with values 1000, 6000, and 2000.
Explore data frames in R, learn their tabular structure of variables and observations, view built-in datasets like cars and titanic, and access columns with the dollar sign or indexing.
Demonstrate creating a data frame from scratch, adding and viewing columns and rows with cbind and rbind, and summarizing numeric columns such as pulse and duration using summary.
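The workflow above might look roughly like the sketch below. The column names training, pulse, duration, and steps and the steps values come from the lecture descriptions; every other value is invented for illustration.

```r
# Build a data frame from scratch (training, pulse, and duration values are made up).
workouts <- data.frame(
  training = c("Strength", "Stamina", "Other"),
  pulse    = c(100, 150, 120),
  duration = c(60, 30, 45),
  stringsAsFactors = FALSE
)

# cbind() adds a new variable (column): three columns become four.
workouts <- cbind(workouts, steps = c(1000, 6000, 2000))

# rbind() adds a new observation (row); its columns must match the existing ones.
new_row <- data.frame(
  training = "Strength", pulse = 110, duration = 50, steps = 4000,
  stringsAsFactors = FALSE
)
workouts <- rbind(workouts, new_row)

print(workouts)
summary(workouts$pulse)     # min, quartiles, mean, and max of pulse
summary(workouts$duration)  # same summary for duration
```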
Create and manipulate factors in R to categorize data, view unique categories with labels, and obtain frequency distributions with table, including accessing items by index.
Explore miscellaneous operators in R, including the colon operator for sequences, the %in% membership test, and the %*% matrix multiplication, with practical examples.
Convert a gender vector from character to factor in R, check its class, levels, and labels, and use table to count males and females.
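A short sketch of that conversion, assuming a hypothetical five-element gender vector:

```r
# Hypothetical character vector of genders.
gender <- c("male", "female", "female", "male", "female")
class(gender)                 # "character"

# Convert to a factor; levels are the sorted unique categories.
gender_factor <- factor(gender)
class(gender_factor)          # "factor"
levels(gender_factor)         # "female" "male"

# table() gives the frequency of each level.
counts <- table(gender_factor)
print(counts)
```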
Explore loops in R, focusing on while loops and basic control. Learn to set a condition, increment a counter, and print one to five while avoiding infinite loops.
Learn for loops in R to iterate over sequences and lists, print elements, break on a condition, and nest loops to map multiple collections such as executives and fruits.
Define a function in R with the function keyword, write a body and arguments, then call the custom function (positional or named) to return a result.
Define and call functions in R, pass arguments, print outputs, and return results; explore functions as objects, default arguments, and reusable name-combining examples.
Explore how default arguments work in R by defining function parameters and calling with or without arguments to obtain default or updated results.
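To make default arguments concrete, here is a minimal sketch. The function full_name and its default value are hypothetical and not taken from the course.

```r
# 'last' has a default, so callers may omit it.
full_name <- function(first, last = "Unknown") {
  paste(first, last)
}

default_result <- full_name("Ada")                           # uses the default
updated_result <- full_name("Ada", "Lovelace")               # positional override
named_result   <- full_name(last = "Hopper", first = "Grace")  # named arguments, any order

print(default_result)  # "Ada Unknown"
print(updated_result)  # "Ada Lovelace"
print(named_result)    # "Grace Hopper"
```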
Explore the dplyr package, a suite of data manipulation verbs in R that helps you select, filter, mutate, transmute, and use the pipe operator to shape data for your analysis.
Learn how to use the select function in R to extract columns from a data frame after loading the dplyr package, including selecting single or multiple columns.
Explore the select function in the dplyr package to choose specific columns from a data frame, including columns that start with or end with particular letters, using the provided data set.
Explore how to use the filter function in R to subset a data frame by conditions, such as values greater than six, less than six, and missing values.
Apply the filter function in R to subset a data frame by conditions such as displacement greater than 150 and horsepower above 120, using the dplyr package.
Explore how mutate and transmute from the dplyr package perform element-wise addition of numeric columns, creating a new variable and showing when results stay in the original data frame versus a separate one.
Explore mutate and transmute in R using the dplyr package: mutate adds columns element-wise; transmute creates a new data frame with the computed results.
Learn the diff() function in R, a base function, by applying it to a five-element vector to compute consecutive differences. The result has four values: -31, -11, 1, 8.
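The lecture description does not say which vector is used, so the sketch below assumes one that happens to produce the stated differences.

```r
# Assumed five-element vector; chosen so the differences match the description.
x <- c(50, 19, 8, 9, 17)

# diff() subtracts each element from the one after it,
# so a vector of length 5 yields 4 differences.
d <- diff(x)
print(d)  # -31 -11 1 8
```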
Learn how the pipe operator in R enhances readability by chaining functions, from log and diff to exp and round, streamlining natural language processing and sentiment analysis workflows.
Learn to use the pipe operator in R to chain calculations from vector operations to log, diff, exp, and round, then apply select and filter for clean, readable code.
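A small sketch of the log, diff, exp, round chain. It uses base R's native pipe |> (available from R 4.1) so that it runs without extra packages; the %>% pipe loaded with dplyr should behave the same way here. The input vector is made up.

```r
# Hypothetical vector that doubles at every step.
x <- c(1, 2, 4, 8, 16)

# Nested form: read from the inside out.
nested <- round(exp(diff(log(x))), 1)

# Piped form: read left to right, one step at a time.
piped <- x |> log() |> diff() |> exp() |> round(1)

print(piped)  # the ratio between consecutive values: 2 2 2 2
```

Both forms compute the same thing; the piped version simply lists the steps in the order they are applied.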
Transform unstructured text into structured data through text mining and NLP techniques, enabling information extraction and improved decision making with machine learning and deep learning.
Learn to preprocess text for mining with natural language processing techniques, including language identification, tokenization, POS tagging, chunking, information retrieval, syntax parsing, and stemming for effective analysis.
Explore tokenization in R by breaking text into words and sentences, create bag-of-words representations, and visualize frequency distributions for text clustering and word clouds.
Explore natural language processing, from computational linguistics to machine understanding of written and spoken language. Learn subtasks like summarisation, part-of-speech tagging, text categorization, and sentiment analysis, with an R project.
Identify key terms in text mining, including corpus, corpora, document-term matrix, and stemming. Stop words like pronouns add little value; offensive words may be removed for mining but inform sentiment.
Web scraping automates extracting structured data from websites for analysis and business purposes, enabling price monitoring, market research, and lead generation; always obtain permission for non-public data.
Discover how to scrape only episode scripts from web pages for sentiment analysis in R, using Chrome extensions like SelectorGadget to extract text.
Learn to install and load the rvest package in R, use it for web scraping, and extract episode titles and seasons for sentiment analysis and text mining.
Learn to scrape a web page in R by loading the required package, reading HTML content, and extracting episode titles while refining results to avoid printing the entire page.
Use the SelectorGadget Chrome extension to identify and extract HTML nodes for scraping. Specify a base, capture the text content (not links) as episode titles, and store them for output.
Convert scraped titles into a data frame in R, keep strings as text (not factors), remove the about and duplicate pilot episode rows, and rename the column.
Change a data frame column name in R by inspecting the current names, then assign a new name such as titles to replace the old one.
Clean the data and extract episode titles by creating a data frame, extracting titles from strings, and adding a new name column using an apply function.
Filter season one from the dataset using the pipe operator and filter, creating a data frame of 17 episodes for sentiment analysis in R.
Learn to scrape season one episode scripts from 17 links, extract clean content with html_text, and fetch each link via a reusable function for sentiment analysis in R.
Split the episode content into character and dialogue using a string split method, then build a two-column dataframe for text analytics in R.
Use loops to process all 17 season one episode links, retrieve each episode content, and build a final data frame with the create final data frame function.
Learn to build a data frame from episode dialogue by separating character and dialogue columns, converting lists to data frames, and filtering to main characters for sentiment analysis.
Refine the data frame by renaming columns, removing non-scene entries, and filtering to the five main characters for focused dialogue analysis, sentiment analysis, and word cloud visualization.
View and inspect a prepared data frame with two columns, count the 3,652 dialogues (rows) in the season, and begin text mining in R, including printing results.
Define a corpus as a collection of linguistic data used in text mining and NLP, supporting sentiment analysis, hypothesis testing, and machine translation.
Explore how unstructured data can be transformed into structured form for machine learning in the term-document matrix framework, using the bag-of-words model and the vector space model.
Discover the vector space model, a generalization of bag-of-words, where a corpus yields a document-term matrix with weights from binary, term frequency, and inverse document frequency.
Explore term frequency in text mining by counting how often a word appears in a document within a corpus, converting text to lowercase, and removing stop words during preprocessing.
Learn how inverse document frequency assigns higher weights to rare terms in a corpus. The IDF formula and a three-document example illustrate contrasting common versus uncommon words.
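The lecture's exact formula and three documents are not given here, so the sketch below assumes one common form, idf(t) = log(N / df(t)), where N is the number of documents and df(t) is the number of documents containing term t. Other variants add smoothing or use a different log base. The three toy documents are invented.

```r
# Three made-up documents, already lowercased and tokenized.
docs <- list(
  d1 = c("sheldon", "likes", "trains"),
  d2 = c("penny", "likes", "coffee"),
  d3 = c("leonard", "likes", "physics", "trains")
)

n_docs <- length(docs)
vocab  <- unique(unlist(docs))

# Document frequency: in how many documents does each term appear?
doc_freq <- sapply(vocab, function(term) {
  sum(sapply(docs, function(d) term %in% d))
})

# Assumed IDF variant: natural log of N / df.
idf <- log(n_docs / doc_freq)
print(round(idf, 3))
```

With these documents, "likes" appears everywhere and gets a weight of 0, "trains" appears in two of three documents and gets a small weight, and words found in only one document get the largest weight.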
Install and load a text mining package, create a corpus from dialogues, and build a document-term matrix using term frequency weights to convert unstructured text into structured features.
Remove sparse terms from the document-term matrix by applying a sparsity threshold, dropping high-sparsity words to improve the corpus representation.
Transform a corpus into a document-term matrix, count word frequencies to identify the most frequent terms, remove stop words, and build a two-column data frame of words and their frequencies.
Explore how word clouds visualize text data, where bigger, bolder words indicate higher frequency, enabling comparison of texts and implementation in a project.
Transform text to lowercase, remove punctuation, whitespace, numbers, and stop words using English stop lists and custom words in R's corpus workflow, preparing data for word analysis and visualization.
Explore text mining of season two dialogues from The Big Bang Theory, build a corpus, clean text by removing punctuation and stop words, and create a word cloud using the term-document matrix.
Explore frequency distributions in R by counting dialogue per character, converting to factors, building a frequency table, and sorting to reveal who had the most screen time in season one.
Plot a bar graph of character dialogue frequency using the grammar of graphics with the ggplot2 package, showing that Leonard speaks the most, followed by Sheldon and Penny.
Enhance a bar graph in R by adding color, borders, and rotated axis labels, then set custom axis titles and transition to sentiment analysis with online lexicons.
Learn sentiment analysis, automatically identifying positive or negative feelings in text. Explore applications in social media, surveys, and product reviews for business intelligence.
Explore how sentiment analysis works by using word lists or lexicons, counting positive and negative words, and adapting lexicons such as Bing Lexicon and the NRC Lexicon to context.
Explore how sentiment classification works using lexicons such as the Bing lexicon and the NRC lexicon, applying tokenization, stop-word removal, and stemming to classify words as positive or negative.
Learn how to perform sentiment analysis by tokenizing text into tokens word by word in R using tidytext, turning dialogues into tokens, and preparing data for sentiment evaluation.
Learn to reshape data in R with melt and cast, converting a wide data frame to long format to support machine learning and sentiment analysis models.
Learn casting in R to reshape data by melting to long form, apply aggregate functions with a formula, and cast back to an aggregated wide form.
Use the NRC emotion lexicon to map five The Big Bang Theory characters' dialogues to emotions like anger, fear, anticipation, trust, joy, sadness, and sentiments, and visualize per-character sentiments.
Learn to visualize character-based sentiments in R using ribbon plots, facet wrap, and coordinate flip to compare frequency of eight basic emotions across dialogue.
Caution before taking this course:
This course will not make you an expert in R programming; rather, it teaches the concepts you need, which are more than enough for use in machine learning and natural language processing models.
About the course:
In this practical, hands-on course you’ll learn how to program in R, how to use R for effective data analysis and visualization, and how to put that data to practical use. You will learn how to install and configure the software necessary for a statistical programming environment and describe generic programming language concepts as they are implemented in a high-level statistical language.
Our main objective is to give you the education not just to understand the ins and outs of the R programming language, but also to learn exactly how to become a professional Data Scientist with R and land your first job.
This course covers the following topics:
1. R programming concepts: variables, data structures (vector, matrix, list, data frame), loops, functions, the dplyr package, apply() functions
2. Web scraping: how to scrape titles and links and store them in data structures
3. NLP technologies: Bag-of-Words model, Term Frequency model, Inverse Document Frequency model
4. Sentiment analysis: Bing and NRC lexicons
5. Text mining
By the end of the course, you’ll be on your way to becoming a Data Scientist with R, able to apply for jobs confidently, knowing that you have the skills and knowledge to back it up.