Poly Regression: Data Visualization Tutorial

Dr. Ryan Ahmed, Ph.D., MBA
A free video tutorial from Dr. Ryan Ahmed, Ph.D., MBA
Professor & Best-selling Instructor, 250K+ students
4.6 instructor rating • 45 courses • 297,641 students

Learn more from the full course

Machine Learning Regression Masterclass in Python

Build 8+ Practical Projects and Master Machine Learning Regression Techniques Using Python, Scikit Learn and Keras

10:19:57 of on-demand video • Updated June 2022

  • Master Python programming and Scikit learn as applied to machine learning regression
  • Understand the underlying theory behind simple and multiple linear regression techniques
  • Apply simple linear regression techniques to predict product sales volume and vehicle fuel economy
  • Apply multiple linear regression to predict stock prices and university acceptance rates
  • Cover the basics and underlying theory of polynomial regression
  • Apply polynomial regression to predict employees’ salary and commodity prices
  • Understand the theory behind logistic regression
  • Apply logistic regression to predict the probability that a customer will purchase a product on Amazon using customer features
  • Understand the underlying theory and mathematics behind Artificial Neural Networks
  • Learn how to train network weights and biases and select the proper transfer functions
  • Train Artificial Neural Networks (ANNs) using back propagation and gradient descent methods
  • Optimize ANN hyperparameters such as the number of hidden layers and neurons to enhance network performance
  • Apply ANNs to predict house prices given parameters such as area, number of rooms, etc.
  • Assess the performance of trained machine learning models using key performance indicators (KPIs) such as mean absolute error, mean squared error, root mean squared error, R-squared, adjusted R-squared, and the F-test
  • Understand the underlying theory and intuition behind Lasso and Ridge regression techniques
  • Sample real-world, practical projects
Hello, everyone, and welcome to this lecture. I'm super excited because we're getting closer to actually visualizing the data and training our model using polynomial regression. The first step is to visualize the dataset. In the previous lecture we mainly covered how to import our libraries and load our dataset into a pandas data frame. So here we have our salary data frame, which consists of two columns: the number of years of experience and the salary. We were also able to obtain the head and the tail, showing the first and last few samples.
Let's go ahead and visualize the data using seaborn. I'm going to say sns.jointplot and pass along x equals the number of years... my apologies, we have to make sure that the years-of-experience name exactly matches the column name in the data frame, so please make sure the uppercase letters match as well. That's the first variable. Then y equals our salary. And we have to specify the source of the data, so data is equal to our salary data frame. Let's run that, and here we go. Looks great. As we increase the number of years of experience, you'll find that the salary increases as well. You'll also find that a simple linear model won't be a good fit in this case, which means we need to step up our game and move to polynomial regression instead of simple linear regression.
The next step: we're going to say sns.lmplot, then pass along x equals the number of years of experience (again, please make sure the Y is uppercase), then our y value, which is going to be our salary, and then our data source, which is going to be our salary data frame. If you recall from previous sections, lmplot is used to plot a quick estimate of the best straight line that can fit the data. So here is our straight line, and the straight line is basically terrible, right? It doesn't fit the data well, which makes sense, because we need to step up our game a little bit and use polynomial regression instead of simple linear regression.
Now it's time for a quick challenge. I'm asking you to do two tasks. The first one is to use sns.jointplot, but instead of plotting the number of years of experience versus salary, do it the other way around: put the salary on the x axis and the number of years of experience on the y axis. The second one is to use pairplot, as we have done before, to plot all the data at once. Please go ahead, pause the video, and I'll see you after the challenge.
All right, I hope you were able to figure out the challenge yourself. For the first task, we simply copy the jointplot line and paste it here. Instead of having years of experience on the x axis, we put the salary there, and for y we put the number of years of experience. Again, please make sure the uppercase letters match. And here we go: salary on the x axis and the number of years of experience on the y axis.
And obviously the relationship in the data is still nonlinear. For the second challenge, I asked you to use pairplot to view all the data. So I'm going to say sns.pairplot and simply pass in our salary data frame, the entire data frame, and pairplot will take care of everything for us. Let's hit Shift+Enter, and here you go. I personally prefer pairplot because you don't need to plot each pair of variables separately; it tries all the different combinations for all the data. Here you'll find the number of years of experience and here the salary, and that's the curve going up: as you increase the number of years of experience, the salary increases. This is exactly the same curve as we plotted here. And here is the salary versus the years of experience, which is the curve I asked you to plot in the challenge. Again, you don't need to do them separately. You can just use pairplot, pass along the data frame, and you're good to go.
It also shows the distribution of each variable. For the number of years of experience, the average is around 10 years, and for the salary the mean value is around, let's say, a hundred thousand dollars. If you go up, we actually have this information: the mean salary is about one hundred and eleven thousand, which pretty much matches what we see here.
All right, let's keep going. In step four, we're going to create our training dataset. We're going to say X, which is our input to the model, is going to come from our salary data frame, and we pass along the years-of-experience column.
That will be our input to the model. Let's run that and take a look at X. This is our X input, which is the number of years of experience, the first column. We have 2000 samples: 2000 rows by one column. If you take a look at the shape, it's 2000 by 1. Looks perfect.
The next step is y, which will be our salary, so we select the salary column. Again, if you go up to the data frame, you'll find that we have two columns, the years of experience and the salary. Because the years of experience is our independent variable, that will be our X, our input, and the salary will be our output, which is our dependent variable. So let's go down here, and that's our y. Let's run it. Looks good. Let's take a look at y. All right, this is our y variable. Looks perfect.
The last step comes with a very important note. In this polynomial regression example, we're not going to divide the data into training and testing sets. We're just going to try to get the best-fit model using the entire dataset as the training set. Again, this is a very simple example. In the future, when we move to more advanced models, we'll divide the data into training and testing sets. Here we're not going to do that. So, as you can see, X_train is going to be equal to X, and y_train is going to be equal to y. Let's run it, and here you go. That's pretty much it.
OK, that's all we have for this section. Let's go up and recap what we have done so far.
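A minimal sketch of the training-set step follows, assuming the same synthetic stand-in data and column names as before. Selecting X with double brackets is one way to get the 2000 by 1 shape mentioned in the lecture; whether the course keeps y as a one-dimensional Series or a one-column frame is not stated, so the Series here is an assumption.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the course data (assumed shape and column names).
rng = np.random.default_rng(0)
years = rng.uniform(0, 20, size=2000)
salary = pd.DataFrame({
    "Years of Experience": years,
    "Salary": 40_000 + 350 * years**2 + rng.normal(0, 5_000, size=2000),
})

# Double brackets return a DataFrame, so X stays two-dimensional: (2000, 1).
X = salary[["Years of Experience"]]

# Single brackets return a Series, so y is one-dimensional: (2000,).
y = salary["Salary"]

# No train/test split in this example: the whole dataset is used for training.
X_train = X
y_train = y

print(X.shape, y.shape)
```

Using all of the data for training is fine for fitting and plotting a curve, but it leaves nothing held out to judge how the model generalizes, which is why the lecture defers the split to later, more advanced models.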
So in this section, we were able to visualize the data using seaborn, with jointplot and lmplot, and we realized that a straight line would be kind of a failure here. That's why we need to move to a polynomial regression model. In the mini challenge, we also used jointplot to plot the salary versus the years of experience, and we used pairplot to visualize all the data, kind of a one-stop shop. We were also able to create our training dataset, as you can see here, so with X_train and y_train we're pretty much ready to go ahead and train our model. In the next section, I'm going to walk you through the first solution, which assumes a linear relationship. In the following section, I'm going to show you polynomial regression, which is the best model we'll be able to obtain. And that's it. I hope you enjoyed this lecture, and see you in the next one.