Ensemble models in machine learning with Python
What you'll learn
- Bias-variance tradeoff
- What ensemble models are
- Bagging and random forest
- Boosting and XGBoost
Requirements
- Python programming language
Description
In this practical course, we are going to focus on ensemble models in supervised machine learning using the Python programming language.
Ensemble models are a particular kind of machine learning model that combines several models. The general idea is that a team of models can outperform a single one, both in terms of stability (i.e. lower variance) and in terms of accuracy (i.e. lower bias). The most common ensemble models are Random Forests and Gradient Boosting Decision Trees, which are explained extensively in the lessons of this course. Other types of ensemble models are voting and stacking, which are more complex procedures that can further increase the performance of a model.
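As a rough illustration of the idea above, the following sketch compares a single decision tree with a bagging ensemble (Random Forest) and a boosting ensemble (gradient boosted trees) using scikit-learn. The synthetic dataset, the hyperparameters and the variable names are assumptions made for this example and are not taken from the course notebooks.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, used only for illustration (not a course dataset).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=0
)

# One single model and two ensembles built from decision trees.
models = {
    "single tree": DecisionTreeClassifier(random_state=0),
    "random forest (bagging)": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    # 5-fold cross-validated accuracy: the mean reflects accuracy,
    # the standard deviation gives a feel for stability across folds.
    scores = cross_val_score(model, X, y, cv=5)
    results[name] = (scores.mean(), scores.std())
    print(f"{name}: mean accuracy = {scores.mean():.3f}, std = {scores.std():.3f}")
```

On most datasets the two ensembles are expected to score higher than the single tree, but the exact numbers depend on the data and on the random seed.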
With this course, you are going to learn:
- What the bias-variance tradeoff is and how to deal with it
- Bagging and some bagging models (like Random Forest)
- Boosting and some boosting models (like XGBoost or AdaBoost)
- Voting
- Stacking
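To give a feel for the last two topics in this list, here is a minimal sketch of voting and stacking with scikit-learn. The choice of base models, the synthetic dataset and the variable names are assumptions made for this example, not the setup used in the lessons.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data, used only for illustration (not a course dataset).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Three different base models (an arbitrary choice for this sketch).
base_models = [
    ("logreg", LogisticRegression(max_iter=1000)),
    ("knn", KNeighborsClassifier()),
    ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
]

# Voting: every base model predicts a class and the majority wins.
voting = VotingClassifier(estimators=base_models, voting="hard")

# Stacking: a final model learns how to combine the base models'
# cross-validated predictions.
stacking = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

voting_acc = voting.fit(X_train, y_train).score(X_test, y_test)
stacking_acc = stacking.fit(X_train, y_train).score(X_test, y_test)
print(f"voting accuracy:   {voting_acc:.3f}")
print(f"stacking accuracy: {stacking_acc:.3f}")
```

Both classes clone the base models before fitting, so the same `base_models` list can be passed to each of them.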
All the lessons of this course start with a brief introduction and end with a practical example in the Python programming language using its scikit-learn library. The environment used is Jupyter, which is a standard in the data science industry. All the Jupyter notebooks are downloadable.
This course is part of my Supervised Machine Learning in Python online course, so you'll find some lessons that are already included in the larger course.
Who this course is for:
- Python developers
- Data scientists
- Computer engineers
- Researchers
- Students
Instructor
My name is Gianluca Malato. I'm Italian and have a Master's Degree cum laude in Theoretical Physics of disordered systems from "La Sapienza" University of Rome.
I'm a Data Scientist who has been working for years in the banking and insurance sector. I have extensive experience in software programming and project management and I have been dealing with data analysis and machine learning in the corporate environment for several years.
I am also skilled in data analysis (e.g. relational databases and the SQL language), numerical algorithms (e.g. ODE integration, optimization algorithms) and simulation (e.g. Monte Carlo techniques).
I've written many articles about Machine Learning, R and Python, and I've been a Top Writer on Medium in the Artificial Intelligence category.