Training Sets, Test Sets, R, and ggplot

How to evaluate regression model performance in R
  • Lectures: 15
  • Length: 1.5 hours
  • Skill Level: Intermediate
  • Language: English
  • Includes Lifetime access
    30 day money back guarantee!
    Available on iOS and Android
    Certificate of Completion

About This Course

Published 6/2015 English

Course Description

In this course, I show you how to evaluate the performance of a regression model using training sets and test sets. We will use R and ggplot as our tools. Along the way, we will learn how to row-slice data frames, use the predict function in R, and add titles and labels to our plots. We will also work on our programming skills by learning how to write for loops and functions of two variables.

Students should have the background in R, ggplot, and regression equivalent to what one would have after viewing my two Udemy courses on linear and polynomial regression. At a relaxed pace, it should take about two weeks to complete the course.

What are the requirements?

  • Students need the background one would get by viewing my two Udemy courses on linear and polynomial regression.
  • Students will need to have R and RStudio installed on their own computers.

What am I going to get from this course?

  • randomly divide a data set into a training set and a test set
  • calculate the test MSE (mean squared error)
  • quickly calculate the MSE for a number of models
  • visualize the variability of the MSE with ggplot
  • row-slice data frames
  • use R's predict function
  • write for loops in R
  • write functions of two variables in R
  • combine functions and for loops
  • add titles and labels to plots in ggplot

What is the target audience?

  • This course is for those looking to improve their R programming skills.
  • This course is for those with the background equivalent to what one would have after viewing my first two Udemy courses in linear and polynomial regression.

Curriculum

Section 1: Training and Test Sets
01:28

In this video, I introduce the course.

08:01

After this lecture, you will be able to extract specific rows from a data frame. You will also be able to sample from a given set of integers. Putting all of this together, you will be able to randomly divide a data frame into two distinct data frames, each of the same size.
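
A minimal sketch of such a random split, assuming a made-up data frame named df (the course's own data set is not shown on this page, and all variable names here are illustrative):

```r
# Hypothetical data; not the data set used in the course.
set.seed(1)                          # make the random split reproducible
df <- data.frame(x = 1:20, y = 2 * (1:20) + rnorm(20))

n <- nrow(df)
train_rows <- sample(1:n, n / 2)     # sample half of the row numbers, without replacement

train <- df[train_rows, ]            # row-slice: keep only the sampled rows
test  <- df[-train_rows, ]           # negative indices drop the sampled rows

nrow(train)
nrow(test)
```

Because sample() draws without replacement by default, no row can land in both data frames.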

07:37

In this lecture, I generate plots of both the training set and the test set. I also show you how to add a title to your plots in ggplot.
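
As a rough illustration, assuming the ggplot2 package is installed and using made-up training and test data frames, a title can be added with ggtitle() or with labs(), which also sets axis labels:

```r
library(ggplot2)

# Hypothetical data; not the data set used in the course.
set.seed(2)
df <- data.frame(x = 1:20, y = 2 * (1:20) + rnorm(20))
train_rows <- sample(1:nrow(df), nrow(df) / 2)
train <- df[train_rows, ]
test  <- df[-train_rows, ]

# Scatterplot of the training set, titled with ggtitle()
p_train <- ggplot(train, aes(x = x, y = y)) +
  geom_point() +
  ggtitle("Training Set")

# Scatterplot of the test set; labs() sets the title and axis labels together
p_test <- ggplot(test, aes(x = x, y = y)) +
  geom_point() +
  labs(title = "Test Set", x = "Predictor", y = "Response")

print(p_train)
print(p_test)
```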

09:43

After viewing this lecture, you will be able to use R's predict function and ggplot to obtain a plot of the least-squares line. You will also begin thinking about applying the least-squares line, generated from the training data, to the test data.
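
One possible way to draw the least-squares line with predict, sketched on made-up data (the data, the variable names, and the choice of geom_line are assumptions, not necessarily what the lecture uses):

```r
library(ggplot2)

# Hypothetical data; not the data set used in the course.
set.seed(3)
df <- data.frame(x = seq(0, 10, length.out = 40))
df$y <- 3 + 2 * df$x + rnorm(40)
train_rows <- sample(1:nrow(df), nrow(df) / 2)
train <- df[train_rows, ]

# Fit the least-squares line to the training data only
fit <- lm(y ~ x, data = train)

# predict() returns the fitted y value for each row of newdata
train$fitted <- predict(fit, newdata = train)

p <- ggplot(train, aes(x = x, y = y)) +
  geom_point() +
  geom_line(aes(y = fitted), colour = "blue") +
  ggtitle("Least-squares line on the training set")
print(p)
```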

05:58

After viewing this video, you will be able to use R's predict function to calculate the test mean squared error.
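
The test MSE is the average of the squared differences between the observed test responses and the model's predictions for them. A minimal sketch on made-up data, with illustrative variable names:

```r
# Hypothetical data; not the data set used in the course.
set.seed(4)
df <- data.frame(x = seq(0, 10, length.out = 40))
df$y <- 3 + 2 * df$x + rnorm(40)
train_rows <- sample(1:nrow(df), nrow(df) / 2)
train <- df[train_rows, ]
test  <- df[-train_rows, ]

# Fit on the training set only
fit <- lm(y ~ x, data = train)

# Predict the response for the rows the model has not seen
test_pred <- predict(fit, newdata = test)

# Mean squared error on the test set
test_mse <- mean((test$y - test_pred)^2)
test_mse
```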

06:59

After viewing this video, you will be able to use R's predict function to plot the quadratic polynomial that best fits the training data set. We will do this by writing our own function and passing it to ggplot's stat_function.
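
A sketch of that idea on made-up data. The helper name quad_curve and the formula y ~ x + I(x^2) are assumptions for illustration; the lecture may set the model up differently:

```r
library(ggplot2)

# Hypothetical training data; not the data set used in the course.
set.seed(5)
train <- data.frame(x = seq(-3, 3, length.out = 30))
train$y <- 1 + train$x - 0.5 * train$x^2 + rnorm(30, sd = 0.5)

# Quadratic model fitted to the training set
quad_fit <- lm(y ~ x + I(x^2), data = train)

# Our own function: takes a vector of x values, returns the predicted y values.
# stat_function() calls it on a grid of x values to draw the curve.
quad_curve <- function(x) {
  predict(quad_fit, newdata = data.frame(x = x))
}

p <- ggplot(train, aes(x = x, y = y)) +
  geom_point() +
  stat_function(fun = quad_curve, colour = "red") +
  ggtitle("Quadratic fit to the training set")
print(p)
```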

02:42

After viewing this video, you will be able to use R's predict function to calculate the test MSE for the quadratic model.

Section 2: More with the MSE
03:30

After viewing this lecture, you will be able to write for loops in R.
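
For reference, a small for loop in R looks like this (a generic example, not taken from the lecture):

```r
# Pre-allocate a numeric vector, then fill it one element at a time
squares <- numeric(5)
for (i in 1:5) {
  squares[i] <- i^2
}
squares   # holds 1, 4, 9, 16, 25
```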

03:31

After viewing this lecture, it will be easier for you to generate higher-degree polynomial models.
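
The lecture's own shortcut is not described on this page. One common option, shown here as an assumption, is poly() with raw = TRUE, which avoids typing out every power by hand. The data are made up:

```r
# Hypothetical data; not the data set used in the course.
set.seed(6)
d <- data.frame(x = seq(-2, 2, length.out = 25))
d$y <- d$x^3 - d$x + rnorm(25, sd = 0.3)

# Cubic model with every power written out
fit_long <- lm(y ~ x + I(x^2) + I(x^3), data = d)

# The same model, with the degree given as a single number
fit_short <- lm(y ~ poly(x, 3, raw = TRUE), data = d)

coef(fit_long)
coef(fit_short)
```

Both calls fit the same cubic, so the estimated coefficients agree; only the coefficient names differ.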

07:39

After viewing this video, you will be able to quickly generate test MSEs for higher-degree polynomials, using a bit of programming.
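
A sketch of how a for loop might collect one test MSE per polynomial degree. The data are made up, and poly(x, d, raw = TRUE) is an assumed way of building the degree-d model:

```r
# Hypothetical data; not the data set used in the course.
set.seed(7)
df <- data.frame(x = seq(-3, 3, length.out = 60))
df$y <- sin(df$x) + rnorm(60, sd = 0.3)
train_rows <- sample(1:nrow(df), nrow(df) / 2)
train <- df[train_rows, ]
test  <- df[-train_rows, ]

max_degree <- 6
mse <- numeric(max_degree)             # one slot per polynomial degree

for (d in 1:max_degree) {
  fit  <- lm(y ~ poly(x, d, raw = TRUE), data = train)   # degree-d model
  pred <- predict(fit, newdata = test)                   # predictions on unseen rows
  mse[d] <- mean((test$y - pred)^2)                      # test MSE for degree d
}
mse
```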

05:41

After viewing this video, you will be able to generate a plot, via ggplot, of polynomial degree vs. MSE.
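
A possible version of such a plot, again on made-up data. The results data frame and its column names are illustrative choices:

```r
library(ggplot2)

# Hypothetical data; not the data set used in the course.
set.seed(8)
df <- data.frame(x = seq(-3, 3, length.out = 60))
df$y <- sin(df$x) + rnorm(60, sd = 0.3)
train_rows <- sample(1:nrow(df), 30)
train <- df[train_rows, ]
test  <- df[-train_rows, ]

# One row per polynomial degree, with the test MSE filled in by the loop
results <- data.frame(degree = 1:6, mse = NA_real_)
for (d in results$degree) {
  fit <- lm(y ~ poly(x, d, raw = TRUE), data = train)
  results$mse[d] <- mean((test$y - predict(fit, newdata = test))^2)
}

p <- ggplot(results, aes(x = degree, y = mse)) +
  geom_line() +
  geom_point() +
  labs(title = "Test MSE by polynomial degree",
       x = "Polynomial degree", y = "Test MSE")
print(p)
```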

01:59

After viewing this lecture, you will be able to write functions of two variables in R.
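
A small example of a function of two variables (the function name and arguments are illustrative, not taken from the lecture):

```r
# Takes two numeric vectors of equal length and returns their mean squared difference
mse <- function(actual, predicted) {
  mean((actual - predicted)^2)
}

mse(c(1, 2, 3), c(1, 2, 5))   # (0 + 0 + 4) / 3
```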

07:44

In this video, we will take our for loop and set it inside a function of two variables.
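
The page does not say which two variables the lecture's function takes. As one plausible sketch, the hypothetical function below takes a data frame (with columns x and y) and a maximum degree, performs a random split, and returns the test MSE for each degree:

```r
# Hypothetical function of two variables; name and arguments are assumptions.
mse_by_degree <- function(data, max_degree) {
  n <- nrow(data)
  train_rows <- sample(1:n, floor(n / 2))   # random half of the rows
  train <- data[train_rows, ]
  test  <- data[-train_rows, ]

  mse <- numeric(max_degree)
  for (d in 1:max_degree) {
    fit <- lm(y ~ poly(x, d, raw = TRUE), data = train)
    mse[d] <- mean((test$y - predict(fit, newdata = test))^2)
  }
  mse                                       # vector of test MSEs, one per degree
}

# Usage on made-up data
set.seed(9)
df <- data.frame(x = seq(-3, 3, length.out = 60))
df$y <- sin(df$x) + rnorm(60, sd = 0.3)
mse_by_degree(df, 5)
```

Each call draws a fresh split, so repeated calls generally return different values unless the seed is reset.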

15:29

After viewing this lecture, you will be able to repeatedly divide the original data set into training and test sets, calculate the test MSEs for a range of polynomial models, and plot the results. You will do this with the help of a bit of programming.
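
A sketch of the whole procedure under stated assumptions: made-up data, a hypothetical helper mse_by_degree(data, max_degree), and ten random splits. Each line in the plot corresponds to one split, which makes the variability of the test MSE visible:

```r
library(ggplot2)

# Hypothetical helper: one random split, then the test MSE for degrees 1..max_degree
mse_by_degree <- function(data, max_degree) {
  n <- nrow(data)
  train_rows <- sample(1:n, floor(n / 2))
  train <- data[train_rows, ]
  test  <- data[-train_rows, ]
  mse <- numeric(max_degree)
  for (d in 1:max_degree) {
    fit <- lm(y ~ poly(x, d, raw = TRUE), data = train)
    mse[d] <- mean((test$y - predict(fit, newdata = test))^2)
  }
  mse
}

# Hypothetical data; not the data set used in the course.
set.seed(11)
df <- data.frame(x = seq(-3, 3, length.out = 60))
df$y <- sin(df$x) + rnorm(60, sd = 0.3)

n_splits   <- 10
max_degree <- 5
results <- data.frame()
for (s in 1:n_splits) {
  # Stack the MSEs from each new random split under the previous ones
  results <- rbind(results,
                   data.frame(split = s,
                              degree = 1:max_degree,
                              mse = mse_by_degree(df, max_degree)))
}

p <- ggplot(results, aes(x = degree, y = mse, group = split)) +
  geom_line(alpha = 0.5) +
  labs(title = "Test MSE across repeated random splits",
       x = "Polynomial degree", y = "Test MSE")
print(p)
```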

02:24

In this lecture, I mention some problems associated with the method we have been discussing throughout the course. I also give information about an excellent resource on machine learning.

Instructor Biography

Charles Redmond, Professor at Mercyhurst University

Dr. Charles Redmond is a professor in the Tom Ridge School of Intelligence Studies and Information Science at Mercyhurst University. He has been a member of the Department of Mathematics and Computer Systems at Mercyhurst for 21 years and has recently completed a term as chair of the department. Dr. Redmond received his PhD in mathematics from Lehigh University in 1993 and has published in the Annals of Applied Probability, the Journal of Stochastic Processes and Their Applications, Mathematics Magazine, the College Mathematics Journal, and Mathematics Teacher. In his spare time he enjoys making music and computer generated art, reading, and owning a Clumber Spaniel.
