Training Sets, Test Sets, R, and ggplot
4.7 (226 ratings)
Instead of using a simple lifetime average, Udemy calculates a course's star rating by considering a number of different factors such as the number of ratings, the age of ratings, and the likelihood of fraudulent ratings.
5,991 students enrolled
How to evaluate regression model performance in R
Last updated 6/2015
English
Price: Free
Includes:
  • 1.5 hours on-demand video
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
randomly divide a data set into a training set and a test set
calculate the test MSE (mean squared error)
quickly calculate the MSE for a number of models
visualize the variability of the MSE with ggplot
row-slice data frames
use R's predict function
write for loops in R
write functions of two variables in R
combine functions and for loops
add titles and labels to plots in ggplot
Requirements
  • Students need the background provided by my two Udemy courses on linear and polynomial regression.
  • Students will need to have R and RStudio installed on their own computers.
Description

In this course, I show you how to evaluate the performance of a regression model using training sets and test sets. We will use R and ggplot as our tools. Along the way, we will learn how to row-slice data frames, use the predict function in R, and add titles and labels to our plots. We will also work on our programming skills by learning how to write for loops and functions of two variables.

Students should have the background in R, ggplot, and regression equivalent to what one would have after viewing my two Udemy courses on linear and polynomial regression. At a relaxed pace, it should take about two weeks to complete the course.

Who is the target audience?
  • This course is for those looking to improve their R programming skills.
  • This course is for those with the background equivalent to what one would have after viewing my first two Udemy courses in linear and polynomial regression.
Curriculum For This Course
15 Lectures 01:30:25
Training and Test Sets
7 Lectures 42:28

In this video, I introduce the course.

Introduction
01:28

After this lecture, you will be able to extract specific rows from a data frame. You will also be able to sample from a given set of integers. Putting all of this together, you will be able to randomly divide a data frame into two distinct data frames, each of the same size.

Row-slicing Data Frames
08:01

In this lecture, I generate plots of both the training set and the test set. I also show you how to add a title to your plots in ggplot.

Plotting the Training and Test Sets
07:37

After viewing this lecture, you will be able to use R's predict function and ggplot to obtain a plot of the least-squares line. You will also begin thinking about applying the least-squares line, generated from the training data, to the test data.

Plotting the Least-Squares Line
09:43

After viewing this video, you will be able to use R's predict function to calculate the test mean squared error.

Calculating the Test MSE
05:58

After viewing this video, you will be able to use R's predict function to plot the quadratic polynomial that best fits the training data set. We will do this by writing our own function and using stat_function in ggplot.

Generating a Quadratic Model
06:59

After viewing this video, you will be able to use R's predict function to calculate the test MSE for the quadratic model.

Calculating the Test MSE for the Quadratic Model
02:42
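
The workflow covered in this section can be sketched in a few lines of base R. This is only an illustrative sketch, not the instructor's code: the simulated data frame `dat`, the seed, and all variable names are assumptions, and the ggplot plotting steps from the lectures are left out.

```r
# Illustrative sketch with simulated data (not the course's data set).
set.seed(1)
n <- 100
x <- runif(n, 0, 10)
dat <- data.frame(x = x, y = 2 + 3 * x - 0.2 * x^2 + rnorm(n))

# Randomly divide the rows into two halves by sampling row indices.
train_rows <- sample(1:n, n / 2)
train <- dat[train_rows, ]   # row-slice: keep the sampled rows
test  <- dat[-train_rows, ]  # row-slice: drop the sampled rows

# Fit the least-squares line and a quadratic on the training set only.
fit_line <- lm(y ~ x, data = train)
fit_quad <- lm(y ~ poly(x, 2), data = train)

# Test MSE: mean squared difference between the test responses
# and the predictions made from the training-set fit.
mse_line <- mean((test$y - predict(fit_line, newdata = test))^2)
mse_quad <- mean((test$y - predict(fit_quad, newdata = test))^2)
```

Comparing `mse_line` with `mse_quad` shows which model predicts the held-out half better for this particular split.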
More with the MSE
8 Lectures 47:57

After viewing this lecture, you will be able to write for loops in R.

For Loops
03:30

After viewing this lecture, it will be easier for you to generate higher-degree polynomial models.

lm Revisited
03:31

After viewing this video, you will be able to quickly generate test MSEs for higher-degree polynomials, using a bit of programming.

MSE via a For Loop
07:39

After viewing this video, you will be able to generate a plot, via ggplot, of polynomial degree vs. MSE.

Visualizing the MSE's
05:41

After viewing this lecture, you will be able to write functions of two variables in R.

Functions of Two Variables
01:59

In this video, we will take our for loop and set it inside a function of two variables.

For Loop inside a Function
07:44

After viewing this lecture, you will be able to repeatedly divide the original data set into training and test sets, calculate the test MSEs for a range of polynomial models, and plot the results. You will do this with the help of a bit of programming.

Variability of the Test MSE
15:29

In this lecture, I mention some problems associated with the method we have been discussing throughout the course. I also give information about an excellent resource on machine learning.

Course Wrap-up
02:24
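
The programming ideas in this section (a for loop placed inside a function of two variables, then called repeatedly on fresh splits) might look roughly like the following. The function name `test_mse_by_degree`, its arguments, and the simulated data are hypothetical, chosen only for illustration; the lectures plot the results with ggplot, which is omitted here.

```r
# Hypothetical helper: a function of two variables (a data frame with
# columns x and y, and a maximum polynomial degree) wrapping a for loop.
test_mse_by_degree <- function(dat, max_degree) {
  n <- nrow(dat)
  train_rows <- sample(1:n, floor(n / 2))  # new random split on each call
  train <- dat[train_rows, ]
  test  <- dat[-train_rows, ]
  mse <- numeric(max_degree)
  for (d in 1:max_degree) {
    # Fit a degree-d polynomial on the training set, score it on the test set.
    fit <- lm(y ~ poly(x, d), data = train)
    mse[d] <- mean((test$y - predict(fit, newdata = test))^2)
  }
  mse
}

# Simulated data, assumed for illustration only.
set.seed(2)
x <- runif(100, 0, 10)
dat <- data.frame(x = x, y = 2 + 3 * x - 0.2 * x^2 + rnorm(100))

# Repeat the split several times to see how the test MSE varies.
results <- data.frame()
for (run in 1:5) {
  results <- rbind(results,
                   data.frame(run = run, degree = 1:4,
                              mse = test_mse_by_degree(dat, 4)))
}
```

Each row of `results` holds one run, one degree, and the corresponding test MSE, a long layout that should be convenient for a ggplot of polynomial degree vs. MSE with one line per run.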
About the Instructor
Charles Redmond
4.6 Average rating
1,437 Reviews
19,458 Students
7 Courses
Professor at Mercyhurst University

Dr. Charles Redmond is a professor in the Tom Ridge School of Intelligence Studies and Information Science at Mercyhurst University. He has been a member of the Department of Mathematics and Computer Systems at Mercyhurst for 21 years and has recently completed a term as chair of the department. Dr. Redmond received his PhD in mathematics from Lehigh University in 1993 and has published in the Annals of Applied Probability, the Journal of Stochastic Processes and Their Applications, Mathematics Magazine, the College Mathematics Journal, and Mathematics Teacher. In his spare time he enjoys making music and computer generated art, reading, and owning a Clumber Spaniel.