Learning Path: R: Data Analysis and Machine Learning with R
Conquer the wider world of data science with R
0.0 (0 ratings)
1 student enrolled
Created by Packt Publishing
Last updated 9/2017
English
English [Auto-generated]
Includes:
  • 8.5 hours on-demand video
  • 1 Supplemental Resource
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Understand how to organize and set up data
  • Learn to label and scale data
  • Use the caret package to apply and score a model
  • Handle missing values and duplicates
  • Apply classification and regression techniques
  • Conduct independent data analysis
  • Know the essentials of ROC curves
  • Explore multinomial logistic regression with categorical response variables at three levels
Requirements
  • Working knowledge of R is expected
  • Basic knowledge of math and statistics is needed
Description

With its popularity as a statistical programming language rapidly increasing, R is becoming the tool of choice for data analysts and data scientists who want to make sense of large amounts of data as quickly as possible. R has a rich set of libraries that can be used for basic as well as advanced data analysis and machine learning tasks.

So, if you want to understand how the R programming environment and its packages can be used for data analysis and machine learning, this Learning Path is for you.

Packt’s Video Learning Path is a series of individual video products put together in a logical and stepwise manner such that each video builds on the skills learned in the video before it.

This Learning Path starts with organizing data and then moves on to making predictions from it. You will work through various examples in which you explore RStudio and its libraries, apply linear regression, score test sets, and plot test results on a Cartesian plane. You will also see how to use logistic regression to make predictions for a classification problem on automobile data. Further, you will learn different ways to use R to generate professional analysis reports. Moving ahead, you will learn important analysis and machine learning tasks that you can try out on readily available data with the help of examples. Finally, you will learn advanced data analysis concepts such as cluster analysis, time-series analysis, PCA (Principal Component Analysis), sentiment analysis, and spatial data analysis.

By the end of this Learning Path, you will have a solid understanding of how to efficiently perform data analysis and machine learning tasks using R.

About the Author:

For this course, we have combined the best works of these esteemed authors:

  • Tim Hoolihan currently works at DialogTech, a marketing analytics company focused on conversations. He is the senior director of data science there. Prior to that, he was CTO at Level Seven, a regional consulting company in the US Midwest. He is the organizer of the Cleveland R User Group. In his job, he uses deep neural networks to help automate a lot of conversation classification problems. In addition, he works on some side projects researching other areas of artificial intelligence and machine learning.
  • Viswa Viswanathan is an associate professor of computing and decision sciences at the Stillman School of Business in Seton Hall University. After completing his PhD in Artificial Intelligence, Viswa has taught extensively in diverse fields, including operations research, computer science, software engineering, management information systems, and enterprise systems. In addition to teaching at the university, he has conducted training programs for industry professionals. He has written several peer-reviewed research publications in journals such as Operations Research, IEEE Software, Computers and Industrial Engineering, and International Journal of Artificial Intelligence in Education.
  • Shanthi Viswanathan is an experienced technologist who has delivered technology management and enterprise architecture consultations to many enterprise customers. She has worked for Infosys Technologies, Oracle Corporation, and Accenture. As a consultant, Shanthi has helped several large organizations, such as Canon, Cisco, Celgene, Amway, Time Warner Cable, and GE, among others, in areas such as data architecture and analytics, master data management, service-oriented architecture, business process management, and modeling.
  • Dr. Bharatendra Rai is a professor of business statistics and operations management in the Charlton College of Business at UMass Dartmouth. He received his Ph.D. in Industrial Engineering from Wayne State University, Detroit. His two master's degrees include specializations in quality, reliability, and OR from Indian Statistical Institute and another in statistics from Meerut University, India. He teaches courses on topics such as analyzing big data, business analytics and data mining, Twitter and text analytics, applied decision techniques, operations management, and data science for business. Dr. Rai has won awards for excellence and exemplary teamwork at Ford for his contributions in the area of applied statistics.


    Who is the target audience?
    • This Learning Path is for data scientists and data analysts who want to perform advanced data analysis and machine learning tasks using R.
    Curriculum For This Course
    85 Lectures
    08:15:23
    Getting Started with Machine Learning with R
    19 Lectures 01:20:26

    This video provides an overview of the entire course.

    Preview 01:56

    The goal of this video is to examine the IDE setup we will be using, the packages installed, and other basics that are needed after the setup stage.

    Your R Environment
    03:16

    In this video, we will do exploratory analysis of the USArrests dataset.

    Exploring the US Arrests Dataset
    05:01

    In this video, we will split our data into two sets, one for training our model, and another for testing (or validating) the model.

    Creating Test and Train Datasets
    03:49
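
As a rough illustration of the split described above, here is a minimal base R sketch on the built-in USArrests data. The 70/30 ratio, the seed, and the variable names are assumptions made for this example and are not taken from the video.

```r
# Reproducible random split of USArrests into train and test sets.
# The 70/30 ratio and the seed are illustrative assumptions.
set.seed(42)
n <- nrow(USArrests)
train_idx <- sample(seq_len(n), size = floor(0.7 * n))

train <- USArrests[train_idx, ]   # rows used to fit the model
test  <- USArrests[-train_idx, ]  # held-out rows used to score it

nrow(train)
nrow(test)
```

Fixing the seed keeps the split repeatable, which makes later model comparisons easier to interpret.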

    In this video, we will create our first model.

    Creating a Linear Regression Model
    03:46

    In this video, we will score our model on the test set.

    Scoring on the Test Set
    05:28
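
One possible way to fit and score a linear model on a held-out test set is sketched below in base R. The formula (Murder predicted from Assault and UrbanPop), the split size, and the use of RMSE as the score are assumptions for illustration; the video may use different choices.

```r
# Split USArrests, fit a linear model on the training rows,
# and score it on the test rows. Formula and metric are assumptions.
set.seed(42)
idx   <- sample(seq_len(nrow(USArrests)), size = 35)
train <- USArrests[idx, ]
test  <- USArrests[-idx, ]

fit  <- lm(Murder ~ Assault + UrbanPop, data = train)
pred <- predict(fit, newdata = test)

# Root mean squared error of the predictions on unseen data.
rmse <- sqrt(mean((test$Murder - pred)^2))
rmse
```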

    In this video, we will plot the results of the test set against the actuals, and look at ways to tweak our results.

    Plotting the Test Results
    04:14

    The goal of this video is to examine the mtcars dataset that is built into R.

    Preview 03:26

    The goal of this video is to know how we can work with factors in our dataset. 

    Working with Factors
    04:27

    The goal of this video is to learn about scaling the data, particularly the continuous values.

    Scaling Data
    04:09

    The goal of this video is to create a classification model.

    Creating a Classification Model
    02:30

    In this video, we will plot the results of the test set against the actuals, and look at ways to tweak our results.

    Advanced Formulas
    03:48

    Now that we have created our model, we will look at a different way for scoring our results, that is, F-Score.

    Precision, Recall, and F-Score
    05:19
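
For reference, precision, recall, and the F-score can be computed directly from predicted and actual labels. The sketch below uses made-up label vectors (an assumption for illustration, not data from the course) and plain base R.

```r
# Hypothetical binary labels: 1 = positive class, 0 = negative class.
actual    <- c(1, 0, 1, 1, 0, 0, 1, 0)
predicted <- c(1, 0, 0, 1, 0, 1, 1, 0)

tp <- sum(predicted == 1 & actual == 1)  # true positives
fp <- sum(predicted == 1 & actual == 0)  # false positives
fn <- sum(predicted == 0 & actual == 1)  # false negatives

precision <- tp / (tp + fp)  # share of predicted positives that are correct
recall    <- tp / (tp + fn)  # share of actual positives that were found
f_score   <- 2 * precision * recall / (precision + recall)

c(precision = precision, recall = recall, f_score = f_score)
```

The F-score here is the harmonic mean of precision and recall, often called F1.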

    The goal of this video is to discuss and examine the caret package.

    Preview 02:32

    In this video, we will preprocess our data using caret.

    EDA and Preprocessing
    10:59

    In this video, we will be using caret to split our data into test and train datasets. 

    Preparing Test and Train Datasets
    03:07

    In this video, we will create a model using caret this time.       

    Creating a Model
    03:07

    In this video, we will use cross-validation instead of a test train split.

    Cross Validation
    03:32

    We want to get our model metrics out of the caret results this time.

    F-Score
    06:00
    R Data Analysis Solutions - Machine Learning Techniques
    43 Lectures 03:11:27

    This video gives an overview of the entire course.

    Preview 03:48

    CSV formats are best used to represent sets or sequences of records in which each record has an identical list of fields.

    Reading Data from CSV Files
    06:30

    You may sometimes need to extract data from websites. Many providers also supply data in XML and JSON formats.

    Reading XML and JSON Data
    06:06

    In fixed-width formatted files, columns have fixed widths; if a data element does not use up the entire allotted column width, it is padded with spaces to fill the specified width. During data analysis, you will also create several R objects.

    Reading Data from Fixed-Width Formatted Files, R Files, and R Libraries
    06:39

    When we have abundant data, we sometimes want to eliminate the cases that have missing values for one or more variables. When you disregard cases with any missing variables, you lose useful information that the non-missing values in that case convey.

    Removing and Replacing Missing Values
    06:17

    We sometimes end up with duplicate cases in our datasets and want to retain only one among the duplicates.

    Removing Duplicate Cases
    02:03

    Variables with higher values tend to dominate distance computations and you may want to rescale the values to be in the range of 0 - 1.

    Rescaling a Variable
    02:15
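
A minimal sketch of rescaling a variable to the 0 - 1 range with min-max scaling in base R follows. The helper name rescale01 and the choice of mtcars$mpg are assumptions for illustration; the video may rely on a package function instead.

```r
# Min-max rescaling: map the smallest value to 0 and the largest to 1.
# rescale01 is a hypothetical helper written for this example.
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

mpg01 <- rescale01(mtcars$mpg)
range(mpg01)
```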

    Variables with higher values tend to dominate distance computations and you may want to use the standardized values.       

    Normalizing or Standardizing Data in a Data Frame
    03:04
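
Standardizing can be sketched with the base R scale function, which subtracts each column's mean and divides by its standard deviation. The columns chosen from mtcars are an assumption for illustration.

```r
# Standardize a few numeric columns so each has mean 0 and sd 1.
num <- mtcars[, c("mpg", "hp", "wt")]
std <- as.data.frame(scale(num))

colMeans(std)    # approximately 0 for every column
sapply(std, sd)  # approximately 1 for every column
```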

    Sometimes we need to convert numerical data to categorical data or a factor.

    Binning Numerical Data
    03:27
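
Binning a numeric variable into a factor can be sketched with the base R cut function. The break points and labels below are arbitrary assumptions chosen for the example.

```r
# Convert horsepower into three ordered bins.
# Breaks and labels are illustrative assumptions.
hp_bin <- cut(mtcars$hp,
              breaks = c(0, 100, 200, Inf),
              labels = c("low", "medium", "high"))

table(hp_bin)
```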

    In situations where we have categorical variables (factors) but need to use them in analytical methods that require numbers, we need to create dummy variables.

    Creating Dummies for Categorical Variables
    03:49
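
One base R way to create dummy variables is model.matrix, sketched below on a small made-up data frame (the data and column name are assumptions). The video may use a dedicated package instead.

```r
# Hypothetical data frame with one categorical column.
df <- data.frame(color = factor(c("red", "green", "blue", "green")))

# "- 1" drops the intercept so every level gets its own 0/1 column.
dummies <- model.matrix(~ color - 1, data = df)
dummies
```

Each row has exactly one 1, in the column that matches its original factor level.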

    In this video, we summarize the data using the summary function.

    Preview 03:28

    In this video, we will look at two ways to subset data.

    Extracting Subset of a Dataset
    05:45

    Split a dataset to create groups corresponding to each level and to analyze each group separately.

    Splitting a Dataset
    01:55

    By partitioning data, we can get an unbiased evaluation of the quality of our models.

    Creating Random Data Partitions
    07:37

    Before even embarking on any numerical analyses, you may want to get a good idea about the data through a few quick plots. So we cover only the simplest forms of basic graphs.

    Generating Standard Plots
    05:23

    We often want to see plots side by side for comparisons. This video shows how we can achieve this.

    Generating Multiple Plots
    01:49

    R can send its output to several different graphics devices to display graphics in different formats. This video deals with selecting the proper graphics device.

    Selecting a Graphics Device
    01:53

    The lattice package produces Trellis plots to capture multivariate relationships in the data. Also, ggplot2 graphs are built iteratively, starting with the most basic plot.

    Creating Plots with the lattice and ggplot2 Packages
    09:05

    In large datasets, we often gain good insights by examining how different segments behave. This video shows how to create graphs that enable such comparisons.

    Creating Charts that Facilitate Comparisons
    02:43

    Visualizing hypothesized causality helps to communicate our ideas clearly.

    Creating Charts that Visualize Possible Causality
    01:35

    When exploring data, we want to get a feel for the interaction of as many variables as possible. In this video, we will show you how you can bring up to five variables into play.

    Creating Multivariate Plots
    02:14

    Getting an idea of how the model does in training data itself is useful, but you should never use that as an objective measure.

    Preview 04:25

    Receiver operating characteristic (ROC) charts help by giving a visual representation of the true and false positives at various cutoff levels.

    Generating ROC Charts
    03:47
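
To make the idea of cutoff levels concrete, the sketch below computes true and false positive rates by hand in base R and plots them. The scores and labels are made-up values (an assumption for illustration); the video may well use a dedicated ROC package instead.

```r
# Hypothetical predicted probabilities and true classes (1 = positive).
scores <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2)
labels <- c(1, 1, 0, 1, 0, 1, 0, 0)

# Treat every distinct score as a cutoff, from strictest to most lenient.
cutoffs <- sort(unique(scores), decreasing = TRUE)

tpr <- sapply(cutoffs, function(th) sum(scores >= th & labels == 1) / sum(labels == 1))
fpr <- sapply(cutoffs, function(th) sum(scores >= th & labels == 0) / sum(labels == 0))

# ROC curve: false positive rate on x, true positive rate on y.
plot(c(0, fpr), c(0, tpr), type = "s",
     xlab = "False positive rate", ylab = "True positive rate")
abline(0, 1, lty = 2)  # reference line for a random classifier
```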

    This video shows you how you can use the rpart package to build classification trees and the rpart.plot package to generate nice-looking tree diagrams.       

    Building, Plotting, and Evaluating – Classification Trees
    06:07

    The randomForest package can help you to easily apply the very powerful but computationally intensive random forest classification technique.

    Using Random Forest Models for Classification
    04:20

    The e1071 package can help you to easily apply the very powerful Support Vector Machine (SVM) classification technique.

    Classifying Using the Support Vector Machine Approach
    05:26

    The e1071 package contains the naiveBayes function for the Naïve Bayes classification.

    Classifying Using the Naïve Bayes Approach
    02:22

    The class package contains the knn function for KNN classification.

    Classifying Using the KNN Approach
    05:02

    The nnet package contains the nnet function for classification using neural networks.

    Using Neural Networks for Classification
    04:18

    The MASS package contains the lda function for classification using linear discriminant function analysis.

    Classifying Using Linear Discriminant Function Analysis
    02:48

    The stats package contains the glm function for classification using logistic regression.

    Classifying Using Logistic Regression
    04:01

    R has several libraries that implement boosting where we combine many relatively inaccurate models to get a much more accurate model. The ada package provides boosting functionality on top of classification trees.

    Using AdaBoost to Combine Classification Tree Models
    03:32

    You may first check a model's performance on the training data, but you should rely on its performance on the holdout data for an objective measure.

    Preview 02:43

    In this video, we look at the use of the knn.reg function to build the model and then the process of predicting with the model as well. We also show some additional convenience mechanisms to make the process easier.

    Building KNN Models for Regression
    08:23

    In this video, we will discuss linear regression, arguably the most widely used technique. The stats package has the functionality for linear regression and R loads it automatically at startup.

    Performing Linear Regression
    07:17

    The MASS package has the functionality for variable selection, and this video illustrates its use.

    Performing Variable Selection in Linear Regression
    02:22

    This video covers the use of tree models for regression. The rpart package provides the necessary functions to build regression trees.

    Building Regression Trees
    07:55

    This video looks at random forests—one of the most successful machine learning techniques.

    Building Random Forest Models for Regression
    05:07

    The nnet package contains functionality to build neural network models for classification as well as prediction. In this video, we cover the steps to build a neural network regression model using nnet.

    Using Neural Networks for Regression
    03:35

    The R implementation of some techniques, such as classification and regression trees, performs cross-validation out of the box to aid in model selection and to avoid overfitting.

    Performing k-Fold Cross-Validation and Leave-One-Out-Cross-Validation
    05:07

    The standard R package stats provides the function for K-means clustering. We also use the cluster package to plot the results of our cluster analysis.

    Performing Cluster Analysis Using K-Means Clustering
    06:48
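
A minimal sketch of K-means clustering with the kmeans function from the stats package is shown below. The dataset (USArrests), the choice of three clusters, and the seed are assumptions for illustration; plotting with the cluster package is left out.

```r
# K-means on standardized USArrests; k = 3 is an arbitrary choice.
set.seed(1)
scaled <- scale(USArrests)

# nstart = 25 reruns the algorithm from 25 random starts and keeps the best fit.
km <- kmeans(scaled, centers = 3, nstart = 25)

table(km$cluster)  # number of states assigned to each cluster
km$centers         # cluster centers in standardized units
```

Scaling first keeps the Assault column, which has the largest raw values, from dominating the distance computation.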

    The hclust function in the package stats helps us perform hierarchical clustering.

    Performing Cluster Analysis Using Hierarchical Clustering
    04:00

    The stats package offers the prcomp function to perform PCA. This video shows you how to perform PCA using these capabilities.

    Reducing Dimensionality with Principal Component Analysis
    04:37
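
As a small illustration of prcomp, the sketch below runs PCA on the built-in USArrests data and computes the share of variance explained by each component. The dataset choice is an assumption and may differ from the one used in the video.

```r
# PCA on centered and scaled variables.
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Proportion of total variance captured by each principal component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
round(var_explained, 3)

# Scores of the first few observations on the first two components.
head(pca$x[, 1:2])
```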
    Mastering Data Analysis with R
    23 Lectures 03:43:30

    This video will give an overview of the entire course.

    Preview 03:24

    The aim of this video is to introduce R/RStudio to those using it for the first time.

    Getting Started and Data Exploration with R/RStudio
    28:16

    The aim of this video is to introduce commonly used visualization tools in R.

    Introduction to Visualization
    20:29

    The aim of this video is to introduce the interactive visualization package “plotly” in R.

    Interactive Visualization
    10:35

    The aim of this video is to introduce the “googleVis” package in R.

    Geographic Plots
    10:03

    The aim of this video is to introduce visualization with ggplot2, d3heatmap, and googleVis packages.

    Advanced Visualization
    11:00

    The aim of this video is to introduce the idea of regression, logistic regression, and data partitioning.        

    Getting Introductory Concepts
    06:46

    The aim of this video is to introduce data partitioning.

    Data Partitioning with R
    13:49

    The aim of this video is to present steps for multiple linear regression.

    Multiple Linear Regression with R
    11:59

    The aim of this video is to introduce multicollinearity issues with regression models. 

    Multicollinearity Issues
    07:31

    The aim of this video is to introduce logistic regression using R.

    Logistic Regression with Categorical Response Variables at Two Levels
    13:46

    The aim of this video is to provide a logistic model interpretation.

    Logistic Regression Model and Interpretation
    04:23

    The aim of this video is to show how to calculate the confusion matrix and misclassification error.

    Misclassification Error and Confusion Matrix
    06:40

    The aim of this video is to show how to create ROC curves in R.

    ROC Curves
    06:02

    The aim of this video is to provide an overall view of prediction and model assessment.       

    Prediction and Model Assessment
    08:43

    The aim of this video is to introduce multinomial logistic regression using R.       

    Multinomial Logistic Regression with Categorical Response Variables at 3 Levels
    07:29

    The aim of this video is to provide the interpretation to the multinomial logistic model.

    Multinomial Logistic Regression Model and Its Interpretation
    08:14

    The aim of this video is to show how to calculate the confusion matrix and misclassification error.

    Misclassification Error and Confusion Matrix
    06:33

    The aim of this video is to provide an overall view of prediction and model assessment.

    Prediction and Model Assessment
    09:54

    The aim of this video is to introduce ordinal logistic regression using R.

    Ordinal Logistic Regression with R
    12:54

    The aim of this video is to provide ordinal logistic model interpretation.

    Ordinal Logistic Regression Model and Interpretation
    04:40

    The aim of this video is to show how to calculate the confusion matrix and misclassification error.

    The Misclassification Error and Confusion Matrix
    04:28

    The aim of this video is to provide an overall view of the prediction and model assessment.

    Prediction and Model Assessment
    05:52
    About the Instructor
    Packt Publishing
    3.9 Average rating
    8,197 Reviews
    58,869 Students
    687 Courses
    Tech Knowledge in Motion

    Packt has been committed to developer learning since 2004. A lot has changed in software since then - but Packt has remained responsive to these changes, continuing to look forward at the trends and tools defining the way we work and live, and at how to put them to work.

    With an extensive library of content - more than 4000 books and video courses - Packt's mission is to help developers stay relevant in a rapidly changing world. From new web frameworks and programming languages to cutting-edge data analytics and DevOps, Packt takes software professionals in every field to what's important to them now.

    From skills that will help you to develop and future-proof your career to immediate solutions to everyday tech challenges, Packt is a go-to resource to make you a better, smarter developer.

    Packt Udemy courses continue this tradition, bringing you comprehensive yet concise video courses straight from the experts.