R Data Analysis Solutions - Machine Learning Techniques
0.0 (0 ratings)
1 student enrolled

Over 40 recipes dedicated to machine learning techniques
Created by Packt Publishing
Last updated 8/2017
  • 3 hours on-demand video
  • 1 Supplemental Resource
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Learn to handle missing values and duplicates
  • Learn to scale and standardize values
  • Reveal underlying patterns
  • Learn to apply classification techniques
  • Learn to apply regression techniques
  • Learn to reduce data
Requirements
  • A basic knowledge of the R programming language is expected.

Data analysis has recently emerged as an important focus for a wide range of organizations and businesses. R makes detailed data analysis easier, putting advanced data exploration and insight within reach of anyone interested in learning it. This video course shows you ways to use R to generate professional analysis reports. It provides examples of important analysis and machine-learning tasks that you can try out with the associated, readily available data. You will learn to carry out a range of tasks that put the data to work. By the end of this course, you will be able to apply different analysis techniques, perform classification and regression, and reduce data.

About the Authors

Shanthi Viswanathan

Shanthi Viswanathan is an experienced technologist who has delivered technology management and enterprise architecture consulting to many enterprise customers. She has worked for Infosys Technologies, Oracle Corporation, and Accenture. As a consultant, Shanthi has helped several large organizations, such as Canon, Cisco, Celgene, Amway, Time Warner Cable, and GE among others, in areas such as data architecture and analytics, master data management, service-oriented architecture, business process management, and modeling. When she is not in front of her Mac, Shanthi spends time hiking in the suburbs of NY/NJ, working in the garden, and teaching yoga.

Shanthi would like to thank her husband, Viswa, for all the great discussions on numerous topics during their hikes together and for exposing her to R and Java. She would also like to thank her sons, Nitin and Siddarth, for getting her into the data analytics world.
Viswa Viswanathan

Viswa Viswanathan is an associate professor of Computing and Decision Sciences at the Stillman School of Business in Seton Hall University. After completing his PhD in Artificial Intelligence, Viswa spent a decade in Academia and then switched to a leadership position in the software industry for a decade. During this period, he worked for Infosys, Igate, and Starbase. He embraced Academia once again in 2001.

Viswa has taught extensively in diverse fields, including operations research, computer science, software engineering, management information systems, and enterprise systems. In addition to teaching at the university, Viswa has conducted training programs for industry professionals. He has written several peer-reviewed research publications in journals such as Operations Research, IEEE Software, Computers and Industrial Engineering, and International Journal of Artificial Intelligence in Education. He authored a book entitled Data Analytics with R: A Hands-on Approach.

Viswa thoroughly enjoys hands-on software development, and has single-handedly conceived, architected, developed, and deployed several web-based applications.

Apart from his deep interest in technical fields such as data analytics, Artificial Intelligence, computer science, and software engineering, Viswa harbors a deep interest in education, with a special emphasis on the roots of learning and methods to foster deeper learning. He has done research in this area and hopes to pursue the subject further.

Viswa would like to express deep gratitude to professors Amitava Bagchi and Anup Sen, who were inspirational during his early research career. He is also grateful to several extremely intelligent colleagues, notably Rajesh Venkatesh, Dan Richner, and Sriram Bala, who significantly shaped his thinking. His aunt, Analdavalli; his sister, Sankari; and his wife, Shanthi, taught him much about hard work, and even the little he has absorbed has helped him immensely.

His sons, Nitin and Siddarth, have helped with numerous insightful comments on various topics.

Who is the target audience?
  • This course is for anyone who wants to learn analytical techniques from scratch.
Curriculum For This Course
43 Lectures
Acquire and Prepare the Ingredients – Your Data
10 Lectures 43:58
This video gives an overview of the entire course.
Preview 03:48

CSV formats are best used to represent sets or sequences of records in which each record has an identical list of fields.

Reading Data from CSV Files
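
As a rough illustration of the idea above, the sketch below writes a tiny CSV file and reads it back with base R's read.csv. The file contents and the column names (name, price) are made up for this example and are not taken from the course data.

```r
# Write a small, made-up CSV file to a temporary location
tmp <- tempfile(fileext = ".csv")
writeLines(c("name,price", "widget,9.99", "gadget,24.50"), tmp)

# Each record has the same fields, so it maps directly onto a data frame
items <- read.csv(tmp, stringsAsFactors = FALSE)
str(items)
```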

You may sometimes need to extract data from websites. Many providers also supply data in XML and JSON formats.
Reading XML and JSON Data

In fixed-width formatted files, columns have fixed widths; if a data element does not use up the entire allotted column width, the element is padded with spaces to make up the specified width. During data analysis, you will also create several R objects.
Reading Data from Fixed-Width Formatted Files, R Files, and R Libraries

When we have abundant data, we sometimes want to eliminate the cases that have missing values for one or more variables. When you disregard cases with any missing variables, you lose useful information that the non-missing values in that case convey.
Removing and Replacing Missing Values
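
The trade-off described above can be sketched in base R as follows. The data frame and its columns (age, income) are hypothetical, and replacing missing values with the column mean is only one of several possible strategies, chosen here for brevity.

```r
# Hypothetical data frame with some missing values
df <- data.frame(age = c(25, NA, 40, 31),
                 income = c(50000, 62000, NA, 48000))

# Option 1: drop every case that has at least one missing value
complete_cases <- na.omit(df)

# Option 2: keep all cases and replace each NA with its column mean
imputed <- df
for (col in names(imputed)) {
  col_mean <- mean(imputed[[col]], na.rm = TRUE)
  imputed[[col]][is.na(imputed[[col]])] <- col_mean
}
```

Option 1 discards half of this small dataset, which is the information loss the description warns about; option 2 keeps every case at the cost of introducing estimated values.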

We sometimes end up with duplicate cases in our datasets and want to retain only one among the duplicates.
Removing Duplicate Cases
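
A minimal sketch of this, assuming a small made-up data frame: base R's duplicated flags every repeat of an earlier row, so negating it keeps the first copy of each case.

```r
# Hypothetical data with one exact duplicate row (rows 2 and 3)
df <- data.frame(id = c(1, 2, 2, 3, 3),
                 city = c("A", "B", "B", "C", "D"))

# Keep only the first occurrence of each distinct row
deduped <- df[!duplicated(df), ]

# unique() applied to a data frame gives the same set of rows
deduped_alt <- unique(df)
```

Rows 4 and 5 share an id but differ in city, so both are retained; only fully identical rows count as duplicates here.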

Variables with higher values tend to dominate distance computations, so you may want to rescale the values to lie in the range 0 to 1.
Rescaling a Variable
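
One common way to do this is min-max rescaling, sketched below in base R. The helper name rescale01 is made up for this example, and the course may well use a package function instead.

```r
# Min-max rescaling: the smallest value maps to 0 and the largest to 1.
# rescale01 is a hypothetical helper name, not a function from the course.
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Apply it to a column of the built-in mtcars dataset
mpg_scaled <- rescale01(mtcars$mpg)
summary(mpg_scaled)
```

Note that a constant column would produce a division by zero, so this sketch assumes the variable takes at least two distinct values.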

Variables with higher values tend to dominate distance computations and you may want to use the standardized values.
Normalizing or Standardizing Data in a Data Frame
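
As a sketch of standardization, base R's scale function subtracts each column's mean and divides by its standard deviation. The built-in iris data is used here purely for illustration.

```r
# Standardize the four numeric columns of the built-in iris data
num_cols <- iris[, 1:4]
standardized <- as.data.frame(scale(num_cols))

# Each column now has mean 0 and standard deviation 1 (up to rounding)
round(colMeans(standardized), 10)
sapply(standardized, sd)
```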

Sometimes we need to convert numerical data to categorical data or a factor.

Binning Numerical Data
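
A minimal sketch of binning with base R's cut function. The income values, the break points, and the labels are all made up for this example.

```r
# Hypothetical incomes to be binned into three categories
income <- c(12000, 35000, 58000, 91000, 150000)

# cut() assigns each value to an interval; by default intervals are
# open on the left and closed on the right, e.g. (30000, 80000]
income_bin <- cut(income,
                  breaks = c(0, 30000, 80000, Inf),
                  labels = c("low", "medium", "high"))
table(income_bin)
```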

In situations where we have categorical variables (factors) but need to use them in analytical methods that require numbers, we need to create dummy variables.

Creating Dummies for Categorical Variables
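
One way to create dummy variables without extra packages is base R's model.matrix, sketched below. The fuel factor is hypothetical, and the course may use a dedicated package for this instead.

```r
# Hypothetical factor with three levels
df <- data.frame(fuel = factor(c("gas", "diesel", "gas", "electric")))

# "- 1" drops the intercept so that every level gets its own 0/1 column
dummies <- model.matrix(~ fuel - 1, data = df)
dummies
```

Each row has exactly one 1. For methods that fit an intercept, such as linear regression, one of the columns is usually dropped to avoid redundancy.
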
What's in There? – Exploratory Data Analysis
11 Lectures 43:27
In this video, we summarize the data using the summary function.
Preview 03:28

In this video, we will look at two ways to subset data.
Extracting Subset of a Dataset

Split a dataset to create groups corresponding to each level of a factor, and analyze each group separately.
Splitting a Dataset
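
As a sketch of this idea, base R's split function returns a list with one data frame per factor level, which can then be analyzed group by group. The built-in iris data is used for illustration.

```r
# Split the built-in iris data into one data frame per species
groups <- split(iris, iris$Species)
names(groups)

# Analyze each group separately, e.g. the mean petal length per species
mean_petal <- sapply(groups, function(g) mean(g$Petal.Length))
mean_petal
```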

By partitioning the data into training and holdout sets, we can evaluate the quality of a model without bias.

Creating Random Data Partitions
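
A simple random partition can be sketched in base R with sample. The 80/20 split and the seed value are arbitrary choices for this example, and the course may use a package that also balances the partitions on the target variable.

```r
set.seed(42)  # arbitrary seed so the partition is reproducible

# Randomly pick about 80% of the row indices for training
n <- nrow(mtcars)
train_idx <- sample(n, size = round(0.8 * n))

train <- mtcars[train_idx, ]
holdout <- mtcars[-train_idx, ]  # the remaining rows are held out
```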

Before embarking on any numerical analysis, you may want to get a good idea of the data through a few quick plots. This video covers only the simplest forms of basic graphs.

Generating Standard Plots

We often want to see plots side by side for comparisons. This video shows how we can achieve this.

Generating Multiple Plots

R can send its output to several different graphics devices to display graphics in different formats. This video deals with selecting the proper graphics device.
Selecting a Graphics Device

The lattice package produces Trellis plots to capture multivariate relationships in the data. Also, ggplot2 graphs are built iteratively, starting with the most basic plot.
Creating Plots with the lattice and ggplot2 Packages

In large datasets, we often gain good insights by examining how different segments behave. This video shows how to create graphs that enable such comparisons.
Creating Charts that Facilitate Comparisons

Visualizing hypothesized causality helps to communicate our ideas clearly.
Creating Charts that Visualize Possible Causality

When exploring data, we want to get a feel for the interaction of as many variables as possible. In this video, we will show you how you can bring up to five variables into play.
Creating Multivariate Plots
Where Does It Belong? – Classification
11 Lectures 46:08

Getting an idea of how the model does in training data itself is useful, but you should never use that as an objective measure.

Preview 04:25

Receiver operating characteristic (ROC) charts help by giving a visual representation of the true and false positives at various cutoff levels.

Generating ROC Charts
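
To make the idea of cutoff levels concrete, the sketch below computes the true and false positive rates by hand in base R. The scores and class labels are invented, and the course most likely relies on a dedicated package rather than this manual approach.

```r
# Hypothetical predicted probabilities and true classes (1 = positive)
scores <- c(0.95, 0.85, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10)
actual <- c(1, 1, 0, 1, 0, 1, 0, 0)

# Treat every distinct score as a cutoff, from strictest to most lenient
cutoffs <- sort(unique(scores), decreasing = TRUE)
tpr <- sapply(cutoffs, function(cut_val)
  sum(scores >= cut_val & actual == 1) / sum(actual == 1))
fpr <- sapply(cutoffs, function(cut_val)
  sum(scores >= cut_val & actual == 0) / sum(actual == 0))
roc <- data.frame(cutoff = cutoffs, tpr = tpr, fpr = fpr)

# Plot the curve; the dashed diagonal marks a random classifier
plot(roc$fpr, roc$tpr, type = "b",
     xlab = "False positive rate", ylab = "True positive rate")
abline(0, 1, lty = 2)
```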

This video shows you how you can use the rpart package to build classification trees and the rpart.plot package to generate nice-looking tree diagrams.

Building, Plotting, and Evaluating – Classification Trees

The randomForest package can help you to easily apply the very powerful but computationally intensive random forest classification technique.

Using Random Forest Models for Classification

The e1071 package can help you to easily apply the very powerful Support Vector Machine (SVM) classification technique.

Classifying Using the Support Vector Machine Approach

The e1071 package contains the naiveBayes function for the Naïve Bayes classification.

Classifying Using the Naïve Bayes Approach

The class package contains the knn function for KNN classification.

Classifying Using the KNN Approach

The nnet package contains the nnet function for classification using neural networks.
Using Neural Networks for Classification

The MASS package contains the lda function for classification using linear discriminant function analysis.
Classifying Using Linear Discriminant Function Analysis

The stats package contains the glm function for classification using logistic regression.

Classifying Using Logistic Regression
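
A minimal sketch of logistic regression with glm, using the built-in mtcars data. The choice of am (transmission type) as the class and wt (weight) as the only predictor, as well as the 0.5 cutoff, are assumptions made for illustration.

```r
# Model the probability that a car has a manual transmission (am = 1)
# from its weight; family = binomial requests logistic regression
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Predicted probabilities on the training data
prob <- predict(fit, type = "response")

# Turn probabilities into class labels with a 0.5 cutoff
pred <- ifelse(prob > 0.5, 1, 0)

# Confusion table of actual versus predicted classes
table(actual = mtcars$am, predicted = pred)
```

This table is computed on the training data itself, so it is only a first impression rather than an objective measure of the model.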

R has several libraries that implement boosting where we combine many relatively inaccurate models to get a much more accurate model. The ada package provides boosting functionality on top of classification trees.
Using AdaBoost to Combine Classification Tree Models
Give Me a Number – Regression
8 Lectures 42:29
You can gauge a model's performance on the training data, but you should rely on its performance on the holdout data to get an objective measure.
Preview 02:43

In this video, we look at the use of the knn.reg function to build the model and then the process of predicting with the model as well. We also show some additional convenience mechanisms to make the process easier.
Building KNN Models for Regression

In this video, we will discuss linear regression, arguably the most widely used technique. The stats package has the functionality for linear regression and R loads it automatically at startup.

Performing Linear Regression
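
As a sketch, a linear regression can be fitted with lm from the stats package. The mtcars data, the chosen predictors, and the values for the new car are assumptions made for this example.

```r
# Fit fuel efficiency as a linear function of weight and horsepower
fit <- lm(mpg ~ wt + hp, data = mtcars)
summary(fit)

# Predict for a hypothetical new car
new_car <- data.frame(wt = 3, hp = 110)
predicted_mpg <- predict(fit, newdata = new_car)
predicted_mpg
```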

The MASS package has the functionality for variable selection and this recipe illustrates its use.
Performing Variable Selection in Linear Regression

This video covers the use of tree models for regression. The rpart package provides the necessary functions to build regression trees.

Building Regression Trees

This video looks at random forests—one of the most successful machine learning techniques.

Building Random Forest Models for Regression

The nnet package contains functionality to build neural network models for classification as well as prediction. In this recipe, we cover the steps to build a neural network regression model using nnet.
Using Neural Networks for Regression

The R implementation of some techniques, such as classification and regression trees, performs cross-validation out of the box to aid in model selection and to avoid overfitting.
Performing k-Fold Cross-Validation and Leave-One-Out Cross-Validation
Can You Simplify That? – Data Reduction Techniques
3 Lectures 15:25

The standard R package stats provides the function for K-means clustering. We also use the cluster package to plot the results of our cluster analysis.

Preview 06:48
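
A minimal sketch of K-means clustering with kmeans from the stats package. The iris data, the choice of three clusters, and the seed are assumptions for illustration; plotting with the cluster package is left out here.

```r
set.seed(1)  # K-means starts from random centers, so fix the seed

# Standardize first so that no single variable dominates the distances
features <- scale(iris[, 1:4])

# Ask for 3 clusters and try 20 random starts, keeping the best result
km <- kmeans(features, centers = 3, nstart = 20)

km$size                                            # cases per cluster
table(cluster = km$cluster, species = iris$Species)
```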

The hclust function in the package stats helps us perform hierarchical clustering.
Performing Cluster Analysis Using Hierarchical Clustering
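
As a sketch, hierarchical clustering with hclust starts from a distance matrix. The mtcars data, average linkage, and the cut into three groups are arbitrary choices for this example.

```r
# Distances between standardized cases
d <- dist(scale(mtcars))

# Build the hierarchy with average linkage and draw the dendrogram
hc <- hclust(d, method = "average")
plot(hc)

# Cut the tree to obtain three clusters
groups <- cutree(hc, k = 3)
table(groups)
```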

The stats package offers the prcomp function to perform PCA. This recipe shows you how to perform PCA using these capabilities.

Reducing Dimensionality with Principal Component Analysis
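
A minimal sketch of PCA with prcomp, using the built-in iris data for illustration. Centering and scaling the variables is an assumption that suits variables measured on different scales.

```r
# Principal component analysis on the four numeric iris columns
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Share of the total variance explained by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
round(var_explained, 3)

# Keep only the first two components as a reduced dataset
reduced <- pca$x[, 1:2]
head(reduced)
```
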
About the Instructor
Packt Publishing
3.9 Average rating
7,264 Reviews
51,824 Students
616 Courses
Tech Knowledge in Motion

Packt has been committed to developer learning since 2004. A lot has changed in software since then, but Packt has remained responsive to these changes, continuing to look forward at the trends and tools defining the way we work and live, and at how to put them to work.

With an extensive library of content - more than 4000 books and video courses - Packt's mission is to help developers stay relevant in a rapidly changing world. From new web frameworks and programming languages to cutting-edge data analytics and DevOps, Packt takes software professionals in every field to what's important to them now.

From skills that will help you to develop and future-proof your career to immediate solutions to everyday tech challenges, Packt is a go-to resource to make you a better, smarter developer.

Packt Udemy courses continue this tradition, bringing you comprehensive yet concise video courses straight from the experts.