Poly Regression: Data Visualization Tutorial

A free video tutorial from Dr. Ryan Ahmed, Ph.D., MBA
Professor & Best-selling Udemy Instructor, 100K+ students
4.5 instructor rating • 26 courses • 200,379 students

Learn more from the full course

Machine Learning Regression Masterclass in Python

Build 8+ Practical Projects and Master Machine Learning Regression Techniques Using Python, Scikit Learn and Keras

10:19:57 of on-demand video • Updated October 2020

  • Master Python programming and Scikit learn as applied to machine learning regression
  • Understand the underlying theory behind simple and multiple linear regression techniques
  • Apply simple linear regression techniques to predict product sales volume and vehicle fuel economy
  • Apply multiple linear regression to predict stock prices and university acceptance rates
  • Cover the basics and underlying theory of polynomial regression
  • Apply polynomial regression to predict employees’ salary and commodity prices
  • Understand the theory behind logistic regression
  • Apply logistic regression to predict the probability that a customer will purchase a product on Amazon using customer features
  • Understand the underlying theory and mathematics behind Artificial Neural Networks
  • Learn how to train network weights and biases and select the proper transfer functions
  • Train Artificial Neural Networks (ANNs) using back propagation and gradient descent methods
  • Optimize ANN hyperparameters such as the number of hidden layers and neurons to enhance network performance
  • Apply ANNs to predict house prices given parameters such as area, number of rooms, etc.
  • Assess the performance of trained machine learning models using KPIs (key performance indicators) such as Mean Absolute Error, Mean Squared Error, Root Mean Squared Error, R-Squared, Adjusted R-Squared and the F-Test
  • Understand the underlying theory and intuition behind Lasso and Ridge regression techniques
  • Sample real-world, practical projects
Hello everyone and welcome to this lecture. I'm super excited because now we're getting a little bit closer to actually visualizing the data and training our model using polynomial regression. So the first step in this lecture is to visualize the dataset. In the previous lecture we mainly covered how to import all our libraries and import our dataset into a pandas DataFrame. So here we have salary, which is a DataFrame that contains all the information in our data. It consists of two columns: the number of years of experience and the salary. We were able to obtain the head and the tail, showing the first couple of samples and the last couple of samples. Now let's go ahead and visualize the data using seaborn in this section.

All right, let's go ahead. So I'm going to say sns.jointplot and pass along x equal to the number of years of experience. My apologies, we have to make sure that the name exactly matches the column as it appears in the data. Please make sure that this letter is uppercase, and this one as well. That is the first variable. Then we pass y, which will be our salary. And we have to specify the source of the data, so data is equal to our salary DataFrame. All right, let's press Shift+Enter on that. And here we go. Looks great.

As we increase the number of years of experience, you will find that the salary increases as well. You will also find that a simple linear model won't be a good fit in this case, which means that we need to step up our game and move to polynomial regression instead of simple linear regression. All right. The next step is to call sns.lmplot.
For the lmplot call, I pass x equal to the number of years of experience, and please make sure that the Y is uppercase. Then we pass our y value, which is going to be our salary, and then we pass data, the source, which is going to be our salary DataFrame. All right. If you recall from previous sections, lmplot is used to plot a quick estimate of the best straight line that can fit the data. So here is our straight line, and the straight line is basically terrible, right? It doesn't fit the data very well, which makes sense, because now we need to step up our game a little bit and use polynomial regression instead of simple linear regression.

All right. Now it's time for a quick challenge. I'm asking you guys to do two tasks. The first one is to call sns.jointplot, but instead of plotting the number of years of experience versus salary, I want you to do it the other way around and plot the salary versus the number of years of experience. The second one is to use pairplot, as we have done before, to plot all the data at once. Please go ahead, pause the video, and I'll see you guys after the challenge.

All right. I hope you guys were able to figure out the challenge yourselves. What I asked you to do is simply go to the joint plot, copy it, and put it here. Instead of having years of experience on the x axis, we put the salary on the x axis, and for y we put the number of years of experience. Again, make sure the capitalization matches. Run that, and here we go. So here we have the salary, and here we have the number of years of experience, and obviously the relationship is still nonlinear.
For the next challenge, I asked you guys to use pairplot to view all the data. So I'm going to say sns.pairplot and simply pass our salary information in there, which is the entire DataFrame, and pairplot will take care of everything for us. All right, let's press Shift+Enter. And here you go.

I personally prefer pairplot because you don't need to plot each pair of variables separately; it just tries all the different combinations for all the data. So here you will find the number of years of experience, and here you have the salary, and that's the curve going up in this fashion. Again, as you increase the number of years of experience, the salary increases. This is basically the exact same curve as we plotted here. And then here, this is simply our salary versus the years of experience, which is the curve that I asked you guys to plot here. Again, you don't need to do them separately. You can just use pairplot, pass along the DataFrame, and you're good to go.

All right. And here it shows you the distribution of the number of years of experience. You will find that the average is around 10 years of experience, and the salary is around, let's say, a hundred thousand dollars as a mean value. And if we go up, we actually have this information somewhere in here. So the salary mean is at one hundred and eleven thousand, which makes sense and pretty much matches our data in here.

All right, let's keep going. In step four we're going to create our training dataset. So I'm going to say X, which is our input to the model, equals our salary DataFrame, and inside the brackets I pass along the years of experience column. That will be our input to the model. Let's run that. Let's take a look at X, and this is our X input.
X is basically our number of years of experience, the first column. And we have two thousand samples, so 2000 rows by one column. Let's actually move this. And if you want to check, let's take a look at X.shape. It's two thousand by one, which looks perfect.

For the next step, I say y equals our salary DataFrame, and inside the brackets we select the salary column. Again, if you go up to the DataFrame, you'll find that we have two columns, the years of experience and the salary. The years of experience is our independent variable, so that will be our X, our input, and the salary is going to be our output, which is our dependent variable. So let's go down here. That's our y. Let's run it. Looks good. Take a look at y. All right, this is our y variable. Looks perfect.

And the last step is actually very important, and there is a very important note here. In this polynomial regression example I'm not going to divide the data into training and testing. What we're going to do is just try to get the best-fit model using the entire data as the training dataset. Again, this is a very simple example. In the future, when we move to more advanced models, we're going to divide the data into training and testing. Here we're not going to be doing this. We're just going to use the entire data for training. As you guys can see, X_train equals X, and then y_train is going to be equal to y. Let's run it. And here you go. That's pretty much it.

All right. That's all we have for this section. Let's go up and recap what we have done so far.
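As a rough sketch of the training-set step described above, again on a synthetic stand-in with assumed column names. The transcript confirms that X has shape 2000 by 1; selecting y with double brackets is an assumption, and a one-dimensional Series would also work with scikit-learn.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the lecture's dataset (invented curve and noise).
rng = np.random.default_rng(0)
years = rng.uniform(1, 20, size=2000)
salary = pd.DataFrame({
    "Years of Experience": years,
    "Salary": 40000 + 500 * years**2 + rng.normal(0, 5000, size=2000),
})

# Double brackets keep a two-dimensional (n_samples, 1) selection, the
# layout scikit-learn estimators expect for the input features.
X = salary[["Years of Experience"]]   # independent variable (input)
y = salary[["Salary"]]                # dependent variable (output)
print(X.shape)

# No train/test split in this simple example: the whole dataset is used
# for training, as the lecture describes.
X_train = X
y_train = y
```

Skipping the split keeps the example short, but it means there is no held-out data to measure how well the fitted model generalizes, which is why the lecture says later, more advanced models will divide the data.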
So in this section we were able to visualize the data using seaborn, with jointplot and lmplot, and we realized that a straight line would be kind of a failure here. That's why we need to go to a polynomial regression model. We also used jointplot in the mini challenge to plot the salary versus the years of experience, and we used pairplot to visualize all the data as a kind of one-stop shop. We were also able to create our training dataset, as you guys can see here, and we are pretty much ready with X_train and y_train to go ahead and train our model.

In the next section I'm going to walk you through the first solution, which uses a linear assumption, and then in the following section I'm going to show you polynomial regression, which is the best model that we'll be able to obtain. And that's it. I hope you guys enjoyed this lecture, and see you in the next lecture.