Python has become one of data scientists' favorite tools for doing Predictive Analytics. In this hands-on course, you will learn how to build predictive models with Python.
During the course, we cover the theoretical concepts that are essential when building predictive models for real-world problems. The main tool used in this course is scikit-learn, which offers a wide variety of models, many useful routines, and a consistent interface that makes it easy to use. All topics are taught through practical examples, and throughout the course we build many models using real-world datasets.
By the end of this course, you will have learned various techniques for making predictions about bankruptcy and identifying spam text messages, and you will be able to apply that knowledge to a credit card problem using linear models for classification such as logistic regression.
About the author
Alvaro Fuentes is a Data Scientist with an M.S. in Quantitative Economics and an M.S. in Applied Mathematics, and has more than 10 years of experience in analytical roles. He worked at the Central Bank of Guatemala as an Economic Analyst, building models for economic and financial data. He founded Quant Company to provide consulting and training services in Data Science topics and has been a consultant for many projects in fields such as business, education, psychology, and mass media. He has also taught many courses (online and on-site) to students from around the world on topics such as Data Science, Mathematics, Statistics, R programming, and Python.
Alvaro Fuentes is a big Python fan. He has been working with Python for about 4 years and uses it routinely for analyzing data and producing predictions. He has also used it in a couple of software projects. He is also a big R fan and doesn't like the controversy over which is “best,” R or Python; he uses them both. He is also very interested in the Spark approach to Big Data and likes the way it simplifies complicated things. He is not a software engineer or a developer but is generally interested in web technologies.
He also has technical skills in R programming, Spark, SQL (PostgreSQL), MS Excel, machine learning, statistical analysis, econometrics, and mathematical modeling.
Predictive Analytics is a topic in which he has both professional and teaching experience. He has solved practical problems in his consulting practice using the Python tools for predictive analytics, and the topics of predictive analytics are part of a more general course on Data Science with Python that he teaches online.
Explain what the Anaconda Distribution is and why we are using it in this course. Also show how to get and install the software.
Introduce the computing environment in which we will work for the rest of the course.
Explain what NumPy is, the problem it solves, and why it is important for Python’s data stack. Also show some of the most common ways to create ndarrays and how to operate on them.
Explain what pandas is and what we can do with it. Talk about the main objects in this library, that is, Series and DataFrames.
Explain to the viewer what matplotlib is and what the main concepts are when working with this library.
Show some of the visualization capabilities included in pandas objects and how we can modify some elements of a pandas plot with matplotlib.
Introduce the Seaborn library and show some of the specialized and complex statistical visualizations that can be produced with this library.
Explain to the viewer the definition of the term Predictive Analytics and how it differs from other forms of making predictions.
Since Predictive Analytics is the use of data combined with quantitative tools, it is possible to distinguish between three approaches to doing Predictive Analytics: mathematical, statistical, and machine learning models. In this video, we explain the differences between them.
Explain the main categories of Machine Learning: supervised and unsupervised learning. Briefly mention reinforcement learning.
Explain the distinction between the two types of problems that can be found in Supervised Learning, that is, Regression and Classification.
Provide a clear definition for the terms model and algorithm and their relation with the term learning model. Also give the 3 conditions we must check before using Machine Learning for doing Predictive Analytics.
Present the scikit-learn library and make a demonstration of how to use it to build a predictive model.
Present to the viewer the Multiple Regression Model and explain at a high level the general formulation of the model and the scikit-learn class that is used to build these types of models.
Explain the principle behind the KNN model for regression; present the general steps of the algorithm using a simplified example. Introduce the class used in scikit-learn to produce these models.
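The general steps of KNN regression can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the course's own example: the function name knn_predict, the one-feature toy data, and the choice of k=3 are all assumptions made here. In scikit-learn, the corresponding class is presumably KNeighborsRegressor.

```python
# Minimal sketch of k-nearest-neighbors regression with a single feature.
# The function name and the toy data are hypothetical, for illustration only.

def knn_predict(train_x, train_y, query, k=3):
    """Predict the target for `query` as the mean target of its k nearest neighbors."""
    # Step 1: compute the distance from the query to every training point.
    distances = [(abs(x - query), y) for x, y in zip(train_x, train_y)]
    # Step 2: keep the k training points with the smallest distance.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Step 3: the prediction is the average of their target values.
    return sum(y for _, y in nearest) / k


train_x = [1.0, 2.0, 3.0, 10.0, 11.0]
train_y = [10.0, 20.0, 30.0, 100.0, 110.0]

prediction = knn_predict(train_x, train_y, query=2.5, k=3)
print(prediction)
```

With this toy data, the three points closest to 2.5 are 2.0, 3.0, and 1.0, so the prediction is the average of their targets.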
Explain the construction of the lasso regression and compare it to the multiple regression model. Show the formulation of the model and the modification to the optimization objective. Introduce the class used in scikit-learn to produce these models.
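For reference, the change to the optimization objective can be written as follows. The notation is an assumption made here, not taken from the course: n observations, p features, coefficients \beta_j, and a penalty strength \alpha \ge 0. Libraries may scale the squared-error term differently (for example, dividing it by 2n), so treat this as the textbook form rather than any particular implementation.

```latex
% Multiple (least-squares) regression: minimize the sum of squared residuals.
\hat{\beta}^{\text{OLS}} = \arg\min_{\beta}
  \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Bigr)^2

% Lasso: the same objective plus an L1 penalty on the coefficients.
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta}
  \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Bigr)^2
  + \alpha \sum_{j=1}^{p} \lvert \beta_j \rvert
```

Setting \alpha = 0 recovers the multiple regression model; larger values of \alpha shrink coefficients toward zero and can set some of them exactly to zero.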
In this video we show how to evaluate regression models, give a short list of the metrics and explain the MSE. Then we explain the intuition behind the concepts of cross-validation, overfitting and regularization.
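As a rough illustration of the MSE and of the idea behind cross-validation, the sketch below uses plain Python. The helper names, the toy target values, and the deliberately trivial "model" (always predict the mean of the training targets) are assumptions made for this example; the course itself works with scikit-learn.

```python
# Minimal sketch: mean squared error and k-fold cross-validation.
# All names and data here are hypothetical, for illustration only.

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, test_indices) for contiguous, non-shuffled folds."""
    indices = list(range(n_samples))
    fold_size = n_samples // n_folds
    for fold in range(n_folds):
        start = fold * fold_size
        # The last fold absorbs any leftover samples.
        stop = n_samples if fold == n_folds - 1 else start + fold_size
        yield indices[:start] + indices[stop:], indices[start:stop]


y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

cv_scores = []
for train_idx, test_idx in k_fold_indices(len(y), n_folds=3):
    # "Fit" the trivial model on the training fold only.
    train_mean = sum(y[i] for i in train_idx) / len(train_idx)
    # Evaluate on the held-out fold, which the model has not seen.
    y_test = [y[i] for i in test_idx]
    y_pred = [train_mean] * len(y_test)
    cv_scores.append(mean_squared_error(y_test, y_pred))

print(cv_scores)
```

Each score is computed on data that was held out during fitting, which is what makes cross-validated error a more honest estimate than error measured on the training data.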
Demonstrate how to build, evaluate and compare different predictive models for predicting diamond prices and use the best model to make predictions.
Demonstrate how to build, evaluate and compare different predictive models for predicting crime in United States communities and use the best model to make predictions.
Demonstrate how to build, evaluate and compare different predictive models for predicting post popularity and use the best model to make predictions. Also talk about some of the common challenges found when building predictive models.
Mention the types of classification tasks. Then talk intuitively about the Logistic Regression model. Also mention some methods of the LogisticRegression object from scikit-learn.
Provide an intuitive understanding of how classification trees work, how to interpret these models and how they come up with the decision rules.
Explain at a very high level where the Naïve Bayes models come from and give some of the general characteristics of these models. Talk about the two types of Naïve Bayes that can be used in scikit-learn.
Explain the different kinds of evaluation metrics for classification models. Explain the confusion matrix and the main metrics derived from it: accuracy, precision and recall.
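The relationship between the confusion matrix and the three metrics can be sketched as follows. The function name confusion_counts and the labels below are made up for illustration and assume a binary problem where 1 is the positive class.

```python
# Minimal sketch: confusion matrix counts and the metrics derived from them.
# Function name and labels are hypothetical, for illustration only.

def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # predicted positive, actually positive
        elif pred == positive:
            fp += 1  # predicted positive, actually negative
        elif truth == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)

accuracy = (tp + tn) / (tp + fp + fn + tn)  # share of all predictions that are correct
precision = tp / (tp + fp)                  # share of predicted positives that are correct
recall = tp / (tp + fn)                     # share of actual positives that were found

print(tp, fp, fn, tn, accuracy, precision, recall)
```

Precision and recall can differ noticeably from accuracy, which is why accuracy alone can be misleading when the classes are imbalanced.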
Demonstrate how to build, evaluate and compare different classification models for predicting credit card default and use the best model to make predictions.
Demonstrate how to build, evaluate and compare different classification models for predicting bankruptcy for European companies and use the best model to make predictions.
Demonstrate how to build a spam classifier using the Bag of Words model and the Naïve Bayes model. Use the model to predict the class of actual text messages.
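The idea of combining Bag of Words with Naïve Bayes can be sketched without any libraries. This is a simplified multinomial Naïve Bayes with add-one smoothing, written for illustration only: the function names, the whitespace tokenizer, and the four toy messages are assumptions, not the dataset or the scikit-learn code used in the course.

```python
# Minimal sketch: bag-of-words counts feeding a multinomial Naive Bayes classifier.
# Names and messages are hypothetical, for illustration only.
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train_naive_bayes(messages, labels):
    """Count words per class (the bag of words) and messages per class."""
    word_counts = {label: Counter() for label in set(labels)}
    class_counts = Counter(labels)
    for text, label in zip(messages, labels):
        word_counts[label].update(tokenize(text))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocabulary


def predict(text, model):
    """Return the class with the highest log-probability score."""
    word_counts, class_counts, vocabulary = model
    total_messages = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Start from the log prior of the class.
        score = math.log(class_counts[label] / total_messages)
        # Add-one (Laplace) smoothing avoids zero probabilities.
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for word in tokenize(text):
            if word in vocabulary:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denominator)
        scores[label] = score
    return max(scores, key=scores.get)


messages = [
    "win a free prize now",
    "free cash win now",
    "are we meeting for lunch",
    "see you at lunch tomorrow",
]
labels = ["spam", "spam", "ham", "ham"]

model = train_naive_bayes(messages, labels)
print(predict("free prize", model))
print(predict("lunch tomorrow", model))
```

Word order is discarded entirely; only the per-class word counts matter, which is what "Bag of Words" refers to.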
Briefly mention some Predictive Analytics topics that were not addressed in the course, namely ensemble methods, working with features, hyper-parameter tuning, neural networks, and deep learning.
Packt has been committed to developer learning since 2004. A lot has changed in software since then, but Packt has remained responsive to these changes, continuing to look forward at the trends and tools defining the way we work and live, and at how to put them to work.
With an extensive library of content (more than 4000 books and video courses), Packt's mission is to help developers stay relevant in a rapidly changing world. From new web frameworks and programming languages to cutting-edge data analytics and DevOps, Packt takes software professionals in every field to what's important to them now.
From skills that will help you develop and future-proof your career to immediate solutions to everyday tech challenges, Packt is a go-to resource to make you a better, smarter developer.
Packt Udemy courses continue this tradition, bringing you comprehensive yet concise video courses straight from the experts.