R: Complete Machine Learning Solutions

Name: R: Complete Machine Learning Solutions
Rating: 3.7 (56 reviews)

Use over 100 solutions to analyze data and build predictive models

Created by Packt Publishing

Last updated 3/2017

English

What you'll learn

Create and inspect the transaction dataset and perform association analysis with the Apriori algorithm
Predict possible churn users with the classification approach
Implement the clustering method to segment customer data
Compress images with the dimension reduction method
Build a product recommendation system

Course content

12 sections • 125 lectures • 8h 34m total length

Introduction4:05
Downloading and Installing R5:34
R must be installed on your system before you can work with it.

Download the R installer for your operating system
Install R
Downloading and Installing RStudio3:10
RStudio makes the process of development with R easier.

Download RStudio
Install RStudio
Installing and Loading Packages5:46
R packages are an essential part of R, as nearly every program relies on them. Let’s learn how to install and load them.

Download packages
Install them
Reading and Writing Data5:54
To work with data in R, you first need to know how to load it. You will learn that here.

Load the iris dataset
Use the read.table and write.table functions to read and write data
Using R to Manipulate Data5:46
Data manipulation is time-consuming, so it is best done with the help of built-in R functions.

Load the dataset
Select and subset data according to conditions
Applying Basic Statistics4:47
R is widely used for statistical applications, so it is necessary to learn about its built-in functions.

Load the dataset
Observe the format of data
Visualizing Data3:33
To communicate information effectively and make data easier to comprehend, we need graphical representations. You will learn to plot figures in this section.

Calculate the frequency of the species
Plot a histogram, boxplot and scatterplot
Getting a Dataset for Machine Learning2:38
Because of some limitations, it is a good practice to get data from external repositories. You will be able to do just that after this video.

Access the UCI Machine Learning Repository
Download iris.data or use read.csv
Test Your Knowledge

Reading a Titanic Dataset from a CSV File7:52
Reading a dataset is the first and foremost step in data exploration. We need to learn how to do that.

Download the train dataset
Use read.csv and the str function to load and display the dataset respectively
Converting Types on Character Variables3:05
In R, since nominal, ordinal, interval, and ratio variables are treated differently in statistical modeling, we have to convert a nominal variable from a character into a factor.

Display the structure of the data using str
Find the attribute name, data type, and values contained in each attribute
Use the factor function to transform data from character to factor
Detecting Missing Values3:18
Missing values affect the inference of a dataset. Thus it is important to detect them.

Sum up all the NA values
Divide the sum by the number of values in each attribute
Apply the calculation to all attributes using sapply
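The three steps above can be sketched in base R as follows. This is a minimal illustration rather than the course's own code: the `passengers` data frame and its values are hypothetical stand-ins for the Titanic training data.

```r
# Hypothetical toy data standing in for the Titanic training set
passengers <- data.frame(
  Age  = c(22, NA, 35, NA, 54),
  Fare = c(7.25, 71.28, NA, 8.05, 51.86),
  Sex  = c("male", "female", "female", "male", "male")
)

# For each attribute: count the NA values and divide by the number of values
na_share <- sapply(passengers, function(col) sum(is.na(col)) / length(col))

print(na_share)
```

Attributes with a high share of missing values are the ones most likely to need imputation or removal.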
Imputing Missing Values4:30
After detecting missing values, we need to impute them as their absence may affect the conclusion.

Produce statistics using a table
Sort the table. Use str_match to find the title with missing values
Replace each missing value with the mean value
Exploring and Visualizing Data4:24
After imputing the missing values, you should perform an exploratory analysis to summarize the data characteristics.

Generate a bar plot and histogram of each attribute
Examine the relation between all attributes, two at a time
Predicting Passenger Survival with a Decision Tree3:58
The exploratory analysis helps users gain insights into how single or multiple variables may affect the survival rate. However, it does not determine what combinations may generate a prediction model. We need to use a decision tree for that.

Construct a data split function
Split data according to the need
Generate the prediction model and plot the tree
Validating the Power of Prediction with a Confusion Matrix2:08
After constructing the prediction model, it is important to validate how the model performs while predicting the labels.

Predict the survival of the testing set
Generate the statistics of the output matrix using a confusion matrix
Assessing Performance with the ROC Curve2:32
Test Your Knowledge

Understanding Data Sampling in R2:38
Operating a probability distribution in R3:50
Working with univariate descriptive statistics in R5:00
Univariate statistics deals with a single variable and hence is very simple.

Load data into a data frame. Compute the length of the variable.
Obtain mean, median, standard deviation and variance.
Obtain IQR, quantile, maxima, minima, and so on. Plot a histogram.
Performing Correlations and Multivariate Analysis3:00
To analyze the relation among more than two variables, multivariate analysis is done.

Get the co-variance matrix
Obtain the correlation matrix
Operating Linear Regression and Multivariate Analysis3:24
Assessing the relation between dependent and independent variables is carried out through linear regression.

Fit variables into a model
Create an analysis of variance (ANOVA) table
Plot the regression line
Conducting an Exact Binomial Test3:47
To validate that the experiment results are significant, hypothesis testing is done.

Conduct an exact binomial test
Performing Student's t-test3:12
To compare means of two different groups, one- and two-sample t-tests are conducted.

Visualize the attributes
Perform the statistical procedure
Performing the Kolmogorov-Smirnov Test4:42
Comparing a sample with a reference probability distribution, or comparing the cumulative distributions of two data sets, calls for a Kolmogorov-Smirnov test.

Check a normal distribution with a one-sample Kolmogorov-Smirnov test.
Generate uniformly distributed sample data.
Plot the ecdf of two samples. Apply a two-sample Kolmogorov-Smirnov test.
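As a rough sketch of these steps, base R's `ks.test` and `ecdf` can be used as shown below. The simulated samples and the seed are assumptions made for illustration, not the data used in the lecture.

```r
set.seed(1)  # arbitrary seed so the simulated samples are reproducible

# One-sample test: compare a sample against the standard normal distribution
x <- rnorm(100)
one_sample <- ks.test(x, "pnorm")

# Generate uniformly distributed sample data
y <- runif(100)

# Empirical cumulative distribution functions of the two samples
Fx <- ecdf(x)
Fy <- ecdf(y)

# Two-sample test: do x and y come from the same distribution?
two_sample <- ks.test(x, y)

print(one_sample)
print(two_sample)
```

Calling `plot(Fx)` followed by `lines(Fy, col = "red")` draws both ecdf curves on one chart; a small p-value in the two-sample test suggests the samples come from different distributions.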
Understanding the Wilcoxon Rank Sum and Signed Rank Test2:03
The Wilcoxon test is a non-parametric hypothesis test that does not assume the data are normally distributed.

Plot data with boxplot
Perform a Wilcoxon Rank Sum test
Working with Pearson's Chi-Squared Test5:08
To check the distribution of categorical variables of two groups, Pearson’s chi-squared test is used.

Use the contingency table to make the counting table
Plot the mosaic plot
Perform Pearson’s Chi-squared test.
Conducting a One-Way ANOVA4:15
To examine the relation between categorical independent variables and continuous dependent variables, ANOVA is used. When there is a single categorical variable, one-way ANOVA is used.

Visualize the data with a boxplot
Conduct a one-way ANOVA and perform ANOVA analysis
Plot the differences in mean level.
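A minimal sketch of a one-way ANOVA in base R is shown below. It assumes the built-in `iris` data rather than the dataset used in the lecture, and it uses `TukeyHSD` as one common way to inspect differences in mean level.

```r
# One-way ANOVA: does mean sepal length differ across species?
fit <- aov(Sepal.Length ~ Species, data = iris)

# ANOVA table with the F statistic and its p-value
anova_table <- summary(fit)
p_value <- anova_table[[1]][["Pr(>F)"]][1]

# Pairwise differences in mean level between the groups
tukey <- TukeyHSD(fit)

print(anova_table)
print(tukey)
```

Calling `boxplot(Sepal.Length ~ Species, data = iris)` gives the initial visual check, and `plot(tukey)` draws the confidence intervals of the pairwise differences.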
Performing a Two-Way ANOVA4:01
When there are two categorical variables to be compared, two-way ANOVA is used.

Plot the two boxplots.
Use an interaction plot.
Perform two-way ANOVA. Plot the differences in mean level.
Test Your Knowledge

Fitting a Linear Regression Model with lm4:19
Linear regression is the simplest model in regression and can be used when there is one predictor variable.

Prepare data with a linear relationship between predictor and response variables
Generate the regression line
Plot the regression line
Summarizing Linear Model Fits5:19
To obtain summarized information of a fitted model, we need to learn how to summarize linear model fits.

Compute the summary using the summary function
Using Linear Regression to Predict Unknown Values2:50
It would be really convenient for us if we could predict unknown values. You can do that using linear regression.

Build a linear fitted model
Compute the prediction result using confidence interval
Compute the prediction result using prediction interval
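The difference between the two interval types can be sketched with `predict` on an `lm` fit. The example assumes the built-in `cars` data and two arbitrary new speed values; neither comes from the course material.

```r
# Build a linear fitted model on the built-in cars data
fit <- lm(dist ~ speed, data = cars)

# Hypothetical new observations to predict
new_data <- data.frame(speed = c(10, 20))

# Confidence interval: uncertainty about the mean response
conf <- predict(fit, new_data, interval = "confidence", level = 0.95)

# Prediction interval: uncertainty about a single new observation
pred <- predict(fit, new_data, interval = "prediction", level = 0.95)

print(conf)
print(pred)
```

Both results share the same fitted values, but the prediction interval is wider because it also accounts for the residual variance of an individual observation.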
Generating a Diagnostic Plot of a Fitted Model3:56
To check if the fitted model adequately represents the data, we perform diagnostics.

Generate a diagnostic plot
Fitting a Polynomial Regression Model with lm2:15
In the case of a non-linear relationship between predictor and response variables, a polynomial regression model is formed. We need to fit the model. This video will enable you to do that.

Illustrate the polynomial regression model in formula
Fitting a Robust Linear Regression Model with rlm2:14
An outlier can pull the fitted regression line away from the trend of the remaining data. To avoid that, we need to fit a robust linear regression model.

Generate the scatter plot
Apply the rlm function
Visualize the fitted line
Studying a case of linear regression on SLID data6:38
We will perform linear regression on a real-life example, the SLID dataset.

Load the SLID data. Fit all attributes.
Generate the diagnostic plot.
Test multicollinearity and heteroscedasticity.
Applying the Gaussian Model for Generalized Linear Regression2:10
GLM generalizes linear regression by allowing the linear model to be related to the response variable via a link function and by allowing the magnitude of the variance of each measurement to be a function of its predicted value.

Input the independent and dependent variables
Fit variables to a model
Compare the fitted models with ANOVA function
Applying the Poisson model for Generalized Linear Regression1:33
GLM allows response variables with error distribution other than a normal distribution. We apply the Poisson model to see how that is done.

Load sample count data
Apply the glm function
View the fitted log-linear model
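A small sketch of a Poisson GLM follows. Because the lecture's count data is not given here, the example simulates counts with assumed coefficients (0.5 and 1.2) so the fitted log-linear model can be compared against known values.

```r
set.seed(123)  # arbitrary seed for reproducible simulated data

# Simulated count data: log of the expected count is linear in x
n <- 500
x <- runif(n)
counts <- rpois(n, lambda = exp(0.5 + 1.2 * x))
sample_data <- data.frame(x = x, counts = counts)

# Apply the glm function with the Poisson family (log link by default)
fit <- glm(counts ~ x, family = poisson, data = sample_data)

# View the fitted log-linear model
print(summary(fit))
print(coef(fit))
```

The estimated coefficients are on the log scale, so `exp(coef(fit))` gives the multiplicative effect on the expected count.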
Applying the Binomial Model for Generalized Linear Regression2:01
When a variable is binary, we apply the binomial model.

Load the binary dependent variable
Fit the model to the binary data
Obtain a description using summary
Fitting a Generalized Additive Model to Data3:12
GAM has the ability to deal with non-linear relationships between dependent and independent variables. We learn to fit a regression using GAM.

Load the Boston dataset
Generate a fitted model
Summarize the GAM fit
Visualizing a Generalized Additive Model1:26
Visualizing a GAM makes it easier to understand.

Generate a scatter plot
Add regression line
Visualize the fitted regression lines
Diagnosing a Generalized Additive Model3:37
You can also diagnose a GAM model to analyze it.

Produce the smoothing parameter estimation convergence information
View the four diagnostic plots
Test Your Knowledge

Preparing the Training and Testing Datasets2:54
Training and testing datasets are both essential for building a classification model.

Preprocess the dataset. Remove attributes which are unimportant
Split the data into training and testing sets
Generate a sequence accordingly and interpret the output
Building a Classification Model with Recursive Partitioning Trees6:10
A partitioning tree works on the basis of split condition starting from the base node to the terminal node.

Load the rpart package. Build a classification model
Display the tree node details
Generate the information graphic
Visualizing a Recursive Partitioning Tree3:03
Plotting the classification tree will make analyzing the data easier. You will learn to do this now.

Plot the classification tree
Specify parameters to adjust the layout
Measuring the Prediction Performance of a Recursive Partitioning Tree2:47
Before making a prediction, it is essential to compute the prediction performance of the model.

Generate a predicted label and classification table for the testing dataset
Generate a confusion matrix
Pruning a Recursive Partitioning Tree2:37
A classification tree can contain branches that are not essential for classification. To remove these branches, we have to prune the tree.

Locate the record with minimum cross validation errors
Extract the CP of the record and assign the value to churn
Prune the classification tree
Building a Classification Model with a Conditional Inference Tree1:56
Conditional inference trees are better than traditional classification trees because they adapt the test procedures for selecting the output.

Build the classification model
Examine the built tree model
Visualizing a Conditional Inference Tree2:38
Visualizing a conditional inference tree will make it easier to extract and analyze data from the dataset.

Plot the built classification model
Measuring the Prediction Performance of a Conditional Inference Tree2:10
Like the prediction performance of a traditional classification tree, we can also evaluate the performance of a conditional inference tree.

Predict the category of the testing dataset
Generate a classification table
Determine the performance measurements
Classifying Data with the K-Nearest Neighbor Classifier5:31
The k-nearest neighbor classifier is a non-parametric lazy learning method. Thus it has the advantages of both types of methods.

Build a classification model
Generate a classification table. Generate a confusion matrix from it
Examine the sensitivity and specificity
Classifying Data with Logistic Regression4:37
Classification in logistic regression is done based on one or more features. It is more robust and doesn’t have as many conditions as the traditional classification model.

Generate a logistic regression model. Generate the model’s summary
Predict the categorical dependent variable of the testing dataset
Generate the confusion matrix
Classifying data with the Naïve Bayes Classifier6:16
The Naïve Bayes classifier is based on applying Bayes’ theorem with a strong independence assumption.

Specify the variables as first input parameters and churn label as the second input parameter in the function call
Assign the classification model to the classifier variable
Use a confusion matrix to calculate the performance measurement
Test Your Knowledge

Classifying Data with a Support Vector Machine4:59
Support vector machines are better at classification because they can capture complex relations between data points and provide both linear and non-linear classifications.

Train a support vector machine
Use different functions and arguments as desired for the output
Obtain a summary of the built support vector machine
Choosing the Cost of an SVM2:55
To control training errors and margins, we use the cost parameter. The SVM classifier is sensitive to its value.

Create an iris subset
Use SVM with small cost and large cost and see its effect
Visualizing an SVM Fit3:32
To visualize the SVM fit, we can use the plot function.

Train an SVM and use plot to visualize the fitted SVM
Specify appropriate parameters while generating the scatter plot
Predicting Labels Based on a Model Trained by an SVM3:47
We can use the trained SVM to predict labels for new data.

Obtain the predicted labels of the testing dataset.
Generate the classification table. Compute coefficients of a contingency table.
Use a confusion matrix to measure performance.
Tuning an SVM2:47
According to the desired output, you may need to generate different combinations of gamma and cost to train different SVMs. This is called tuning.

Generate a set of parameters.
Obtain the best parameters. Train a new SVM.
Obtain a classification table. Compare the two models.
Training a Neural Network with neuralnet4:07
A neural network is used in classification, clustering and prediction. Its efficiency depends on how well you train it. Let’s learn to do that.

Split the dataset into training and testing datasets.
Add the required columns. Train the network model.
Configure the hidden neurons. Examine the information of the neural network model.
Visualizing a Neural Network Trained by neuralnet2:21
Visualizing a neural network would make understanding the process easier for you.

Visualize the trained neural network using plot.
View the generalized weights using gwplot.
Predicting Labels based on a Model Trained by neuralnet3:06
Similar to other classification models, we can predict labels using neural networks and also validate performance using a confusion matrix.

Create an output probability matrix. Convert the probability matrix to class labels.
Generate a classification matrix based on the labels obtained.
Employ a confusion matrix to measure the prediction performance of the built neural network.
Training a Neural Network with nnet2:45
The nnet package provides the functionality to train feed-forward neural networks with backpropagation.

Use nnet to train the neural network. Set different parameters in the function.
Use the summary function to obtain information about the built neural network.
Predicting labels based on a model trained by nnet2:49
As we have already trained the neural network using nnet, we can use the model to predict labels.

Generate the predicted labels based on a testing dataset.
Generate a classification table based on predicted labels.
Employ a confusion matrix to measure the prediction performance of the trained neural network.
Test Your Knowledge

Estimating Model Performance with k-fold Cross Validation3:16
The k-fold cross-validation technique is commonly used to estimate the performance of a classifier because it helps guard against over-fitting. In this video we will illustrate how to perform a k-fold cross-validation.

Generate an index with 10 folds with the cut function
Use a for loop to perform a 10-fold cross-validation
Generate average accuracies with the mean function
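The loop described above can be sketched in base R as follows. To stay self-contained, this example assumes the built-in `iris` data and a simple nearest-centroid classifier; the course itself may use a different dataset and model.

```r
set.seed(42)  # arbitrary seed for a reproducible shuffle

# Shuffle the rows, then generate an index with 10 folds using cut
shuffled <- iris[sample(nrow(iris)), ]
folds <- cut(seq_len(nrow(shuffled)), breaks = 10, labels = FALSE)

accuracies <- numeric(10)
for (i in 1:10) {
  test_idx <- which(folds == i)
  train <- shuffled[-test_idx, ]
  test <- shuffled[test_idx, ]

  # "Train": compute the mean of each feature per species (class centroids)
  centroids <- aggregate(. ~ Species, data = train, FUN = mean)
  centers <- as.matrix(centroids[, -1])

  # Predict: assign each test row to the species with the nearest centroid
  predicted <- apply(as.matrix(test[, 1:4]), 1, function(obs) {
    distances <- colSums((t(centers) - obs)^2)
    as.character(centroids$Species[which.min(distances)])
  })

  accuracies[i] <- mean(predicted == as.character(test$Species))
}

# Average accuracy over the 10 folds
mean_accuracy <- mean(accuracies)
print(mean_accuracy)
```

Each observation is used for testing exactly once, so the averaged accuracy is a less optimistic estimate than accuracy measured on the training data.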
Performing Cross Validation with the e1071 Package3:22
In this video, we will illustrate how to use tune.svm to perform 10-fold cross-validation and obtain the optimum classification model.

Apply tune.svm to the training dataset
Obtain the summary information of the model
Access the performance details of the tuned model
Generate a classification table
Performing Cross Validation with the caret Package2:59
In this video we will demonstrate how to perform k-fold cross validation using the caret package.

Set up the control parameter
Train the classification model on telecom churn data
Examine the output of the generated model
Ranking the Variable Importance with the caret Package2:20
This video will show you how to rank the variable importance with the caret package.

Estimate the variable importance
Generate the variable importance plot
Ranking the Variable Importance with the rminer Package2:30
Finding Highly Correlated Features with the caret Package2:13
In this video we will show how to find highly correlated features using the caret package.

Remove the features that are not coded in numeric characters
Obtain the correlation of each attribute
Obtain the names of highly correlated attributes
Selecting Features Using the caret Package4:58
Measuring the Performance of the Regression Model3:57
To measure the performance of a regression model, we can calculate the distance from the predicted output and the actual output as a quantifier of the performance of the model. In this video we will illustrate how to compute these measurements from a built regression model.

Load the dataset
Calculate the root mean square error, relative square error and R-Square value
Measuring Prediction Performance with a Confusion Matrix2:07
In this video we will demonstrate how to retrieve a confusion matrix using the caret package.

Train an svm model using the training dataset
Predict labels using the fitted model
Generate a classification table and a confusion matrix
Measuring Prediction Performance Using ROCR2:45
In this video, we will demonstrate how to illustrate an ROC curve and calculate the AUC to measure the performance of a classification model.

Install and load the ROCR package
Visualize the ROC curve using the plot function
Comparing an ROC Curve Using the caret Package3:43
In this video we will use the function provided by the caret package to compare different algorithm-trained models on the same dataset.

Install and load the pROC library
Generate the ROC curve of each model, and plot the curve
Measuring Performance Differences between Models with the caret Package3:40
In this video we will see how to measure performance differences between fitted models with the caret package.

Resample the three generated models and obtain their summary
Plot the re-sampling result in the ROC metric or box-whisker plot
Test Your Knowledge

Classifying Data with the Bagging Method7:07
The adabag package implements both boosting and bagging methods. For the bagging method, the package first generates multiple versions of classifiers, and then obtains an aggregated classifier. Let’s learn the bagging method from adabag to generate a classification model.

Install the adabag package and use the bagging function
Generate the classification model
Obtain a classification table and average error
Performing Cross Validation with the Bagging Method1:56
To assess the prediction power of a classifier, you can run a cross validation method to test the robustness of the classification model. This video will show how to use bagging.cv to perform cross validation with the bagging method.

Use bagging.cv to perform cross-validation
Obtain the confusion matrix
Retrieve the minimum estimation error
Classifying Data with the Boosting Method6:04
Boosting starts with a simple or weak classifier and gradually improves it by reweighting the misclassified samples. Thus, the new classifier can learn from previous classifiers. One can use the boosting method to perform ensemble learning. Let’s see how to use the boosting method to classify the telecom churn dataset.

Use the boosting function from the adabag package
Make a prediction based on the boosted model and testing dataset
Retrieve the classification table and obtain average errors
Performing Cross Validation with the Boosting Method2:06
Similar to the bagging function, adabag provides a cross validation function for the boosting method, named boosting.cv. In this video, we will learn how to perform cross-validation using boosting.cv.

Use boosting.cv to cross-validate the training dataset
Obtain the confusion matrix
Retrieve the average errors
Classifying Data with Gradient Boosting7:09
Gradient boosting creates a new base learner that maximally correlates with the negative gradient of the loss function. One may apply this method on either regression or classification problems. But first, we need to learn how to use gbm.

Install the gbm package and use the gbm function to train a training dataset
Use cross-validation and plot the ROC curve
Use the coords function and obtain a classification table from the predicted results
Calculating the Margins of a Classifier5:30
A margin is a measure of certainty of a classification. It calculates the difference between the support of a correct class and the maximum support of an incorrect class. This video will show us how to calculate the margins of the generated classifiers.

Use the margins function
Use the plot function to plot a marginal cumulative distribution graph
Compute the percentage of negative margin
Calculating the Error Evolution of the Ensemble Method2:18
The adabag package provides the errorevol function for a user to estimate the ensemble method errors in accordance with the number of iterations. Let’s explore how to use errorevol to show the evolution of errors of each ensemble classifier.

Use the errorevol function for error evolution of boosting classifiers
Use the errorevol function for error evolution of bagging classifiers
Classifying Data with Random Forest7:01
Random forest grows multiple decision trees which will output their own prediction results. The forest will use the voting mechanism to select the most voted class as the prediction result. In this video, we illustrate how to classify data using the randomForest package.

Install and load the randomForest package
Plot the mean square error of the forest object
Use the varImpPlot function, the margin function, hist, and boxplot
Estimating the Prediction Errors of Different Classifiers4:34
At the beginning of this section, we discussed why we use ensemble learning and how it can improve the prediction performance. Let’s now validate whether the ensemble model performs better than a single decision tree by comparing the performance of each method.

Estimate the error rate of the bagging model
Estimate the error rate of the boosting method
Estimate the error rate of the random forest model
Use churn.predict and estimate the error rate of single decision tree
Test Your Knowledge

Clustering Data with Hierarchical Clustering7:48
Hierarchical clustering adopts either an agglomerative or a divisive method to build a hierarchy of clusters. This video shows us how to cluster data with the help of hierarchical clustering.

Load the data and save it
Examine the dataset structure
Use agglomerative hierarchical clustering to cluster data
Cutting Trees into Clusters3:29
In this video we demonstrate how to use the cutree function to separate the data into a given number of clusters.

Categorize the data and examine its cluster labels
Count the number of data within each cluster
Visualize how data is clustered
Clustering Data with the k-Means Method4:10
In this video, we will demonstrate how to perform k-means clustering on the customer dataset.

Use k-means to cluster the data
Inspect the center of each cluster
Draw a scatter plot of data and color the points
Drawing a Bivariate Cluster Plot3:31
We will now illustrate how to create a bivariate cluster plot.

Install and load the cluster package
Draw a bivariate cluster plot
Comparing Clustering Methods4:15
In this video we will see how to compare different clustering methods using cluster.stat from the fpc package.

Install and load the fpc package
Use different clustering methods
Generate the cluster statistics of each clustering method
Extracting Silhouette Information from Clustering2:40
In this video we will see how to compute silhouette information.

Use k-means to generate a k-means object
Compute and plot the silhouette information
Obtaining the Optimum Number of Clusters for k-Means2:48
In this video we will discuss how to find the optimum number of clusters for the k-means clustering method.

Calculate the withinss of different numbers of clusters and plot them
Calculate the average silhouette and plot it
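The first step, comparing the within-cluster sum of squares across cluster counts, can be sketched with base R's `kmeans`. The example assumes the numeric columns of the built-in `iris` data in place of the course's customer dataset, and an arbitrary range of 2 to 8 clusters.

```r
set.seed(7)  # arbitrary seed; kmeans uses random starting centers

# Numeric features only (assumed stand-in for the customer data)
features <- iris[, 1:4]

# Total within-cluster sum of squares for each candidate number of clusters
ks <- 2:8
wss <- sapply(ks, function(k) {
  kmeans(features, centers = k, nstart = 10)$tot.withinss
})
names(wss) <- ks

print(wss)
```

Plotting `wss` against `ks` with `plot(ks, wss, type = "b")` shows where the curve flattens; the "elbow" is a common heuristic for the number of clusters. The average silhouette step needs a separate package and is not shown here.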
Clustering Data with the Density-Based Method6:42
In this video, we will demonstrate how to use DBSCAN to perform density-based clustering.

Install and load the fpc and mlbench packages
Cluster data with regard to its density measurement
Clustering Data with the Model-Based Method4:39
In this video, we will demonstrate how to use the model-based method to determine the most likely number of clusters.

Install and load the mclust library
Perform model-based clustering on the customer dataset
Visualizing a Dissimilarity Matrix3:23
A dissimilarity matrix can be used as a measurement for the quality of a cluster. In this video, we will discuss some techniques that are useful to visualize a dissimilarity matrix.

Install and load the seriation package
Visualize the dissimilarity matrix
Validating Clusters Externally4:11
In this video, we will demonstrate how clustering methods differ with regard to data with known clusters.

Install and load the package png
Perform k-means and the dbscan clustering method on the handwriting digits
Test Your Knowledge

Transforming Data into Transactions2:58
Before mining association rules, you need to transform the data into transactions. This video will show how to transform a list, matrix, or data frame into transactions.

Install and load the arules package
Use the as function
Transform the matrix-format data and data-frame-format dataset into transactions
Displaying Transactions and Associations2:14
The arules package uses its own transactions class to store transaction data. As such, we must use the generic functions provided by arules to display transactions and association rules. Let’s see how to display transactions and association rules via various functions in the arules package.

Obtain a LIST representation and use the summary function
Use the inspect function and filter transactions by size
Use the image function and itemFrequencyPlot
Mining Associations with the Apriori Rule7:24
Association mining is a technique that can discover interesting relationships hidden in transaction datasets. This approach first finds all frequent itemsets and then generates strong association rules from frequent itemsets. In this video, we see how to perform association analysis using the apriori rule.

Load the Groceries dataset and examine the summary
Use itemFrequencyPlot and apriori
Inspect the first few rules
Pruning Redundant Rules2:25
Among the generated rules, we sometimes find repeated or redundant rules (for example, one rule is the subset of another rule). Let’s explore how to prune (or remove) repeated or redundant rules.

Find redundant rules
Remove redundant rules
Visualizing Association Rules5:06
Besides listing rules as text, you can visualize association rules, making it easier to find the relationship between itemsets. In this video, we will learn how to use the arulesViz package to visualize the association rules.

Install and load the arulesViz package
Make a scatter plot from the pruned rules and add jitter to it
Plot soda_rule in a graph plot and a balloon plot
Mining Frequent Itemsets with Eclat3:36
The Apriori algorithm performs a breadth-first search to scan the database, so support counting becomes time-consuming. Alternatively, if the database fits into memory, you can use the Eclat algorithm, which performs a depth-first search to count the supports. Let’s see how to use the Eclat algorithm.

Use the eclat function to generate a frequent itemset
Obtain the summary information
Examine the top ten support frequent itemsets
Creating Transactions with Temporal Information2:41
In addition to mining interesting associations within the transaction database, we can mine interesting sequential patterns using transactions with temporal information. This video demonstrates how to create transactions with temporal information.

Install and load the arulesSequences package
Turn the list into transactions and use the inspect function
Obtain summary information and read transaction data in basket format
Mining Frequent Sequential Patterns with cSPADE4:14
In contrast to association mining, we should explore patterns shared among transactions where a set of itemsets occurs sequentially. One of the most famous frequent sequential pattern mining algorithms is the Sequential Pattern Discovery using Equivalence classes (SPADE) algorithm. Let’s see how to use SPADE to mine frequent sequential patterns.

Use the cspade function to generate frequent sequential patterns
Examine the summary of the frequent sequential patterns
Transform the generated sequence-format data back into a data frame
Test Your Knowledge

Requirements

No prior knowledge of R is required

Description

Are you interested in understanding machine learning concepts and building real-time projects with R, but don’t know where to start? Then, this is the perfect course for you!

The aim of machine learning is to uncover hidden patterns and unknown correlations and to find useful information in data. In addition, when combined with data analysis, machine learning can be used to perform predictive analysis. With machine learning, the analysis of business operations and processes is not limited to human-scale thinking; machine-scale analysis enables businesses to capture hidden value in big data.

Machine learning has similarities to the human reasoning process. Unlike traditional analysis, in which the generated model cannot evolve as data accumulates, machine learning can learn from the data that is processed and analyzed. In other words, the more data that is processed, the more it can learn.

R, a dialect of the S language (also known as GNU S), is a powerful statistical language that can be used to manipulate and analyze data. Additionally, R provides many machine learning packages and visualization functions, which enable users to analyze data on the fly. Most importantly, R is open source and free.

Using R greatly simplifies machine learning. All you need to know is how each algorithm can solve your problem; you can then use an existing package to generate prediction models on your data with a few lines of code.

By taking this course, you will gain a detailed and practical knowledge of R and machine learning concepts to build complex machine learning models.

What details do you cover in this course?

We start off with basic R operations: reading data into R, manipulating data, computing simple statistics, and visualizing data. We will then walk through the processes of transforming, analyzing, and visualizing the RMS Titanic data. You will also learn how to perform descriptive statistics.

This course will teach you to use regression models. We will then see how to fit data with tree-based classifiers, the Naive Bayes classifier, and so on.

We then move on to powerful classification methods, namely neural networks and support vector machines. Along the way, we will introduce ensemble learners, which can produce better classification and regression results.

We will see how to apply the clustering technique to segment customers and further compare differences between each clustering method.

We will discover associated terms and underlying frequent patterns in transaction data.

We will go through the process of compressing and restoring images using the dimension reduction approach. We will also cover RHadoop, from setting up the environment to actual big data processing and machine learning on big data.

By the end of this course, we will build our own project in the e-commerce domain.

This course will take you from the very basics of R to creating insightful machine learning models with R.

We have combined the best of the following Packt products:

R Machine Learning Solutions by Yu-Wei, Chiu (David Chiu)
Machine Learning with R Cookbook by Yu-Wei, Chiu (David Chiu)
R Machine Learning By Example by Raghav Bali and Dipanjan Sarkar

Testimonials:

The source content has been well received by the audience. Here is one of the reviews:

"good product, I enjoyed it"

- Ertugrul Bayindir

Meet your expert instructors:

Yu-Wei, Chiu (David Chiu) is the founder of LargitData, a startup company that mainly focuses on providing big data and machine learning products. He has previously worked for Trend Micro as a software engineer, where he was responsible for building big data platforms for business intelligence and customer relationship management systems.

Dipanjan Sarkar is an IT engineer at Intel, the world's largest silicon company, where he works on analytics, business intelligence, and application development. His areas of specialization include software engineering, data science, machine learning, and text analytics.

Raghav Bali has a master's degree (gold medalist) in IT from the International Institute of Information Technology, Bangalore. He is an IT engineer at Intel, the world's largest silicon company, where he works on analytics, business intelligence, and application development.

Meet your managing editor:

This course has been planned and designed for you by me, Tanmayee Patil. I'm here to help you be successful every step of the way, and get maximum value out of your course purchase. If you have any questions along the way, you can reach out to me and our author group via the instructor contact feature on Udemy.

Who this course is for:

If you are interested in understanding machine learning concepts and building real-time projects with R, then this is the perfect course for you!

Course content

Getting Started with R: 9 lectures • 41min

Data Exploration with RMS Titanic: 8 lectures • 32min

R and Statistics: 12 lectures • 45min

Understanding Regression Analysis: 13 lectures • 42min

Classification (I) – Tree, Lazy, and Probabilistic: 11 lectures • 41min

Classification (II) – Neural Network and SVM: 10 lectures • 33min

Model Evaluation: 12 lectures • 38min

Ensemble Learning: 9 lectures • 44min

Clustering: 11 lectures • 48min

Association Analysis and Sequence Mining: 8 lectures • 31min
