
This video provides an overview of the entire course.
In this video, we will see how to download and install RStudio, and set it up as an R editing environment.
Understand the R language, software, and RStudio editing environment
Download and install RStudio
Set up the RStudio Source and Console panes
In this video, we will learn how to write, run, save, and load R scripts in the RStudio source pane.
Writing code in the RStudio source pane
Running code from the RStudio source pane
Saving and loading R scripts
In this video, we will understand how to use numbers and perform arithmetic operations in R.
Perform basic arithmetic operations
Use the exponent and modulus operators
Understand the order of operations and parentheses
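As a rough illustration of the operators named in this lesson, here is a minimal sketch. The variable names are hypothetical and chosen only for this example; they are not taken from the course.

```r
# Basic arithmetic operators
total      <- 7 + 3     # addition: 10
difference <- 7 - 3     # subtraction: 4
product    <- 7 * 3     # multiplication: 21
quotient   <- 7 / 2     # division: 3.5

# Exponent and modulus operators
power     <- 2 ^ 5      # exponent: 32
remainder <- 17 %% 5    # modulus (remainder): 2
int_div   <- 17 %/% 5   # integer division: 3

# Order of operations and parentheses
no_parens   <- 2 + 3 * 4     # multiplication first: 14
with_parens <- (2 + 3) * 4   # parentheses first: 20
neg_square  <- -2 ^ 2        # ^ binds tighter than unary minus: -4
```

Note the last line: writing `(-2) ^ 2` is needed to square the negative number itself.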
This video explains how to create and use R variables, and covers the basics of vectors and vectorised operations.
Create and use variables
Understand vectors
Perform vectorised operations
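The ideas in this lesson can be sketched as follows. The names and values are invented for illustration and are not from the course material.

```r
# Create a variable with the assignment operator and reuse it
radius <- 2
area   <- pi * radius ^ 2

# c() combines values into a vector
prices     <- c(10, 20, 30)
quantities <- c(1, 2, 3)

# Vectorised operations work element by element
line_totals <- prices * quantities   # 10 40 90
half_prices <- prices / 2            # the single value 2 is recycled: 5 10 15

# Many functions accept a whole vector
grand_total <- sum(line_totals)      # 140
```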
In this video, we will understand how to find and use functions.
Understand calling functions
Explore nesting and vectorising functions
Understand function arguments and finding function documentation
In this video, we will see what data types are and how to work with vectors of different data types.
Understand the logical and character data types
Learn how to subset vectors
Learn how to name vector elements
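A small sketch of the three data types, vector subsetting, and element names, using made-up data. It shows one possible way to combine these ideas, not necessarily the way the video does.

```r
# A named numeric vector
scores <- c(alice = 82, bob = 67, carol = 91)

# Comparisons give a logical vector; names() gives a character vector
passed <- scores > 70       # TRUE FALSE TRUE
people <- names(scores)     # "alice" "bob" "carol"
types  <- c(class(scores), class(passed), class(people))

# Four common ways to subset a vector
second   <- scores[2]                     # by position
by_name  <- scores[c("alice", "carol")]   # by name
by_logic <- scores[passed]                # by logical vector
no_first <- scores[-1]                    # negative index drops an element

# Rename a single element
names(scores)[2] <- "robert"
```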
This video explains the purpose and properties of matrices and arrays, how to create them, and how to subset elements from them.
Understand what matrices and arrays are
Learn how to create matrices and arrays
Learn how to subset elements from matrices and arrays
This video explains what the list data structure is and how lists differ from vectors.
Understand what a list is
Learn how to create and modify lists
Learn how to subset list elements
In this video, we will understand how to use data frames as a flexible way to represent and work with tabular data in R.
Understand what a data frame is
Learn how to create data frames
Learn how to subset data frames
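Creating and subsetting a data frame might look like the sketch below. The `people` data frame and its columns are hypothetical.

```r
# Build a data frame from equal-length vectors, one per column
people <- data.frame(
  name = c("Ana", "Ben", "Cho"),
  age  = c(34, 28, 45),
  stringsAsFactors = FALSE   # keep text columns as character
)

ages       <- people$age          # one column as a vector
second_row <- people[2, ]         # one row, all columns
first_age  <- people[1, "age"]    # a single cell
names_only <- people[, "name"]    # one column selected by name
dims       <- dim(people)         # rows and columns: 3 2
```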
This video explains why factors exist and how to use them in base R.
Understand what factors are and why they exist
Learn to create factors and work with factor levels
Understand the stringsAsFactors argument
Datasets are often provided to you in a delimited format such as CSV (comma-separated value). In this video, we will learn how to load data from this and other delimited formats into R.
Understand the CSV format
Learn how to read CSV files with read.csv()
Learn how to read any delimited format with read.table()
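To make the two functions concrete, the sketch below writes a tiny CSV to a temporary file and reads it back both ways. The file contents and column names are invented for this example.

```r
# Write a small CSV file to a temporary location
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("id,species,mass_g",
    "1,sparrow,24.5",
    "2,robin,77.0"),
  csv_path
)

# read.csv() assumes a header row and comma separators
birds <- read.csv(csv_path, stringsAsFactors = FALSE)

# read.table() reads any delimited format once the delimiter is given
birds_tab <- read.table(csv_path, header = TRUE, sep = ",",
                        stringsAsFactors = FALSE)

str(birds)   # inspect the column types that were inferred
```

For a tab-delimited file, the same `read.table()` call would use `sep = "\t"`.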
When working with data, it’s often useful to subset a data frame by value. In this video, we will learn how to combine logical operators with data frame subsetting to subset datasets by value.
Understand the six most important logical operators
Apply logical operators to perform logical subsetting of data frames
Manage missing data with the is.na() function
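Logical subsetting with a missing value could be sketched as below. The `readings` data are made up, and the operators shown are only a sample; the lesson may cover a different set of six.

```r
readings <- data.frame(
  site = c("A", "B", "C", "D"),
  temp = c(21.5, NA, 18.2, 25.1),
  stringsAsFactors = FALSE
)

# A comparison against NA is itself NA
is_warm <- readings$temp > 20          # TRUE NA FALSE TRUE

# is.na() flags the missing values
missing <- is.na(readings$temp)        # FALSE TRUE FALSE FALSE

# Combine conditions with & (and), | (or), and ! (not)
warm     <- readings[!missing & readings$temp > 20, ]
not_site <- readings[readings$site != "A", ]
complete <- readings[!is.na(readings$temp), ]
```

Dropping the `!missing` condition would leave a row of `NA` values in the result, which is why `is.na()` is usually part of the filter.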
Large data sets can be difficult to understand at a glance. This video aims to explain how to apply a range of statistical summary functions to condense key statistical properties from dataset variables.
Apply the summary() and table() functions
Apply the min(), max(), range(), and unique() functions
Apply the mean(), median(), and sd() functions
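The summary functions listed above can be tried on a small made-up vector, as in this sketch.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

summary(x)           # minimum, quartiles, median, mean, maximum
counts <- table(x)   # how often each value occurs

lo     <- min(x)      # 2
hi     <- max(x)      # 9
span   <- range(x)    # 2 9
values <- unique(x)   # 2 4 5 7 9

centre <- mean(x)     # 5
middle <- median(x)   # 4.5
spread <- sd(x)       # sample standard deviation (divides by n - 1)
```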
Although there are hundreds of statistical tests that can be performed in R, many of them are applied according to a similar pattern. In this video, we will learn how to perform three common statistical tests in two different ways.
Perform a t-test with vector arguments and with formulas
Perform a Mann-Whitney test with vector arguments and with formulas
Calculate a Spearman rank correlation between variables
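The "two different ways" pattern can be sketched with the built-in `mtcars` dataset. Using `mtcars`, and `t.test()` as the first of the three tests, are assumptions made for this example; the video may use other data.

```r
# Split fuel economy by transmission type (am: 0 = automatic, 1 = manual)
auto   <- mtcars$mpg[mtcars$am == 0]
manual <- mtcars$mpg[mtcars$am == 1]

# t-test: vector arguments, then the equivalent formula interface
t_vec  <- t.test(auto, manual)
t_form <- t.test(mpg ~ am, data = mtcars)

# Mann-Whitney test: wilcox.test() with two samples
# exact = FALSE avoids a warning caused by tied values
mw_vec  <- wilcox.test(auto, manual, exact = FALSE)
mw_form <- wilcox.test(mpg ~ am, data = mtcars, exact = FALSE)

# Spearman rank correlation between two variables
rho_test <- cor.test(mtcars$mpg, mtcars$wt,
                     method = "spearman", exact = FALSE)
rho <- unname(rho_test$estimate)
```

Both interfaces give the same p-value for a given test; the formula form is often more convenient when the data are already in a data frame.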
Data sets will not always contain all the information you need. In this video, we will learn how to manipulate and combine variables to reshape a data set for your application.
Combine character variables with paste()
Replace text substrings with sub() and factor levels with levels()
Create new data frame columns and replace existing columns
When you finish working with a data frame, you need to write it back to a file so that you can work with it later or pass it to somebody else. In this video, we will learn how to write a data frame to file.
Write a data frame to file with write.csv()
Create a complete data analysis script finishing with write.csv()
Review data analysis concepts
This video gives an overview of the entire course.
This video covers data preparation for clustering.
Discuss iris data as an example
Show how to normalize data
Show how to calculate distances
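One way to prepare the iris data is sketched below. Using z-score standardisation via `scale()` and Euclidean distance via `dist()` are assumptions; the video may normalise the data differently.

```r
# iris ships with R; keep only the four numeric measurements
features <- iris[, 1:4]

# Standardise each column to mean 0 and standard deviation 1
scaled <- scale(features)

# Pairwise Euclidean distances between all 150 flowers
distances <- dist(scaled)

# Peek at the distances among the first three rows
round(as.matrix(distances)[1:3, 1:3], 2)
```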
This video covers clustering using a dendrogram.
Show dendrogram with complete and average linkages
Learn characterization of clusters
Show how to make a silhouette plot
In this video, k-means clustering is covered.
Show steps to do k-means in R
Provide output interpretation
Show steps for making a scree plot
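A minimal sketch of k-means on the iris measurements with a scree plot follows. The seed, the choice of three clusters, `nstart = 25`, and the range of 1 to 8 clusters are assumptions for this example, not values confirmed by the course.

```r
set.seed(42)   # k-means starts from random centres, so fix the seed

scaled <- scale(iris[, 1:4])

# Fit k-means with three clusters, trying 25 random starts
km <- kmeans(scaled, centers = 3, nstart = 25)

# Interpret the output: cluster sizes, centres, and agreement with species
km$size
km$centers
table(cluster = km$cluster, species = iris$Species)

# Scree plot: total within-cluster sum of squares for k = 1..8
wss <- sapply(1:8, function(k) {
  kmeans(scaled, centers = k, nstart = 25)$tot.withinss
})
plot(1:8, wss, type = "b",
     xlab = "Number of clusters",
     ylab = "Total within-cluster sum of squares")
```

The bend in the curve, where adding clusters stops reducing the sum of squares by much, is usually read as a reasonable number of clusters.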
In this video, we will see how to prepare data for density-based clustering.
Use iris data as an example
Show what key packages need to be installed
Show steps for obtaining optimal eps value
In this video, we will see how to do density-based clustering.
See how to use the fpc package
Learn how to use the dbscan package
Show how to do cluster visualization
In this video, we will see how to prepare for text data clustering.
Show steps to read text file and build corpus
Show steps to create a term-document matrix
Show steps to plot frequent terms
In this video, we will see how to do clustering for words or tweets.
Show steps for hierarchical clustering
Provide interpretation of dendrogram
Show steps for k-means clustering
This video shows how to do discriminant analysis in R.
Discuss iris data, correlations, and scatter plot
Show how to do data partition
Show how to do linear discriminant analysis
This video shows how to do model interpretation.
Discuss coefficients of linear discriminants
Discuss proportion of trace
Discuss prediction
This video shows how to do visualizations in discriminant analysis.
Show steps to do stacked histograms
Show steps to do bi-plot
Show steps to do partition plots
This video shows how to do model assessment.
Show steps to do confusion matrix and accuracy calculations for training data
Show steps to do confusion matrix and accuracy calculations for testing data
Discuss interpretations
This video shows how to do time series decomposition in R.
Discuss an example of time series data
Show how to do log transformation of data
Show how to do decomposition of additive time series
This video shows how to do time series forecasting in R.
Show steps to develop ARIMA model
Show steps for ACF, PACF, and residual plots
Show steps to do forecast using the model
This video shows how to do time series clustering in R.
Show steps to do data partitioning
Show steps to calculate distances
Show steps to do hierarchical clustering
This video shows how to do time series classification in R.
Show steps to do data preparation
Show steps to do classification using a decision tree
Show how to do classification performance assessment
This video shows how to build a decision tree in R.
Discuss an example using iris data
Show how to do data partition
Show how to develop decision tree model
This video shows how to visualize a decision tree.
Show steps to plot decision tree
Discuss interpretation of the tree model
Discuss categorical versus numerical dependent variable
This video shows how to assess classification performance.
Show steps to do prediction
Show steps to create confusion matrix for training data
Show steps to create confusion matrix for testing data
This video gives an overview of the entire course.
This video covers steps for obtaining Twitter data.
Show how to register for API access using a Twitter account
Show how to use information from earlier step to get tweets
Show how to create a CSV file
This video covers steps for data cleaning and preparation.
Show how to read the CSV file
Show how to build corpus and clean text
Show how to create term document matrix
In this video, visualization of text data is covered.
Show steps to do bar plot of most frequent words
Show steps to create a wordcloud
Show steps for improving wordcloud
In this video, we will see how to do sentiment analysis using Twitter data.
Show what packages are needed
Show how to obtain sentiment scores
Show steps for plotting sentiment scores
This video covers steps for network analysis using tweets.
Show how to create term document matrix of tweets
Show how to develop network of terms using igraph package in R
Show how to create a network diagram
This video covers steps for term network visualization.
Show how to visualize network of terms in communities using various algorithms
Show how to visualize hubs and authorities
Show how to highlight degrees in the network diagrams
This video covers steps for tweet network visualization.
Show steps to visualize network of tweets
Show steps to delete vertices that have low degrees
Show steps to delete edges to improve visualization of network
Are you looking to become well versed in classifying and clustering data with R? Then this is the perfect course for you!
The amount of data produced every day keeps increasing, which has led to demand for skilled professionals who can analyze this data and make decisions. R is a programming language and environment used in statistical computing, data analytics, and scientific research. Thanks to its expressive syntax and easy-to-use interface, it has grown in popularity in recent years.
This comprehensive 3-in-1 course takes a practical and incremental approach. Analyze and manage large volumes of data using advanced techniques. Attain a greater understanding of the fundamentals of applied statistics. Load, manipulate, and analyze data from different sources! Develop decision tree models for classification and prediction. Learn how to carry out hierarchical cluster analysis with visualization methods such as dendrograms and silhouette plots!
Contents and Overview
This training program includes 3 complete courses, carefully chosen to give you the most comprehensive training possible.
The first course, Learn R programming, covers R programming to create data structures and perform extensive statistical data analysis and synthesis. You’ll work with powerful R tools and techniques. Boost your productivity with the most popular R packages and tackle data structures such as matrices, lists, and factors. Create vectors, handle variables, and perform other core functions. You’ll be able to tackle issues with data input/output and will learn to work with strings and dates. Explore more advanced concepts such as metaprogramming with R and functional programming. Finally, you’ll learn to tackle issues while working with databases and data manipulation.
The second course, Classifying and Clustering Data with R, covers classifying and clustering data with R. This video course provides the steps you need to carry out classification and clustering with R/RStudio software. You’ll understand hierarchical clustering, non-hierarchical clustering, density-based clustering, and clustering of tweets. It also provides steps to carry out classification using discriminant analysis and decision tree methods. In addition, we cover time-series decomposition, forecasting, clustering, and classification.
By the end of the course, you will be well versed in clustering and classification using cluster analysis, discriminant analysis, time-series analysis, and decision trees.
The third course, Bringing Order to Unstructured Data with R, covers obtaining, cleansing, and visualizing data with R. This video course will demonstrate the steps for analyzing unstructured data with the R/RStudio software.
By the end of the video course, you’ll have mastered obtaining and visualizing data with R. You’ll also be confident with data cleaning, preparation, and sentiment analysis with R.
By the end of the course, you’ll be able to classify as well as cluster data and bring order to unstructured data with R.
About the Authors
Dr. David Wilkins has been writing R for over a decade. He is the author of a number of popular open-source R packages, two previous Packt Publishing courses on the R language, and over a dozen scientific publications involving R analyses. He holds a Bachelor's degree in Science and a PhD in molecular genetics. David has a particular passion for creating beautiful and informative statistical graphics, and enjoys teaching people to use R to find and express insights in their own datasets.
Dr. Bharatendra Rai is Professor of Business Statistics and Operations Management in the Charlton College of Business at UMass Dartmouth. He received his Ph.D. in Industrial Engineering from Wayne State University, Detroit. His two master's degrees include specializations in quality, reliability, and OR from the Indian Statistical Institute and another in statistics from Meerut University, India. He teaches courses on topics such as Analyzing Big Data, Business Analytics and Data Mining, Twitter and Text Analytics, Applied Decision Techniques, Operations Management, and Data Science for Business. He has over twenty years' consulting and training experience in industries including automotive, cutting tool, electronics, food, software, chemical, and defense, in the areas of SPC, design of experiments, quality engineering, problem-solving tools, Six Sigma, and QMS. His work experience includes over five years of extensive research at Ford in the areas of quality, reliability, and Six Sigma. His research has been published in journals such as IEEE Transactions on Reliability, Reliability Engineering & System Safety, Quality Engineering, International Journal of Product Development, International Journal of Business Excellence, and JSSSE. He has been a keynote speaker at conferences and has presented his research work at conferences such as SAE World Conference, INFORMS Annual Meetings, Industrial Engineering Research Conference, ASQ's Annual Quality Congress, Taguchi's Robust Engineering Symposium, and Canadian RAMS. Dr. Rai has won awards for excellence and exemplary teamwork at Ford for his contributions in the area of applied statistics. He also received an Employee Recognition Award from FAIA for his Ph.D. dissertation in support of Ford Motor Company. He is certified as an ISO 9000 lead assessor by the British Standards Institute, an ISO 14000 lead assessor by Marsden Environmental International, and a Six Sigma Black Belt by ASQ.