XGBoost Machine Learning for Data Science and Kaggle

Rating: 3.9 (102 reviews)

Master the XGBoost machine learning algorithm, join Kaggle contests, and start a data science career

Created by Shenggang Li

Last updated 6/2020

English

What you'll learn

How the XGBoost algorithm works to predict different model targets
The roles that decision trees play in gradient boosting and XGBoost modeling
Why XGBoost has so far been one of the most powerful and stable machine learning methods in Kaggle contests
How to explain and set appropriate XGBoost modeling parameters
How to apply data exploration, cleaning, and preparation for the XGBoost method
How to effectively implement the different types of XGBoost models using Python packages
How to perform feature engineering in XGBoost predictive modeling
How to conduct statistical analysis and feature selection in XGBoost modeling
How to explain and select the typical evaluation measures and model objectives for building XGBoost models
How to perform cross-validation and determine the best parameter thresholds
How to carry out parameter tuning in XGBoost model building
How to successfully apply XGBoost to various machine learning problems

Course content

6 sections • 63 lectures • 9h 47m total length

What am I teaching in this course • 5:18
Introduction to XGBoost modeling • 7:54
Walk through gradient boost machine • 9:09
Introduce advantages and applications of XGBoost (1) • 5:29
Introduce advantages and applications of XGBoost (2) • 3:07
Introduce advantages and applications of XGBoost (3) • 4:43
Introduce advantages and applications of XGBoost (4) • 4:15

Overview of decision tree modeling • 5:25
Explain the concepts and components in decision trees • 5:44
Understand the framework of decision trees • 6:55
Introduction on decision tree nodes split and growth • 10:06
Explain how to construct decision tree by examples • 7:14
Learn Gini and split rules in decision tree modeling • 9:17
Understand decision tree in terms of two dimensional hyperplane plot • 8:15
Understand decision tree classifier and regressor (1) • 5:22
Understand decision tree classifier and regressor (2) • 5:47
Learn model performance measures (1) • 6:33
Learn model performance measures (2) • 9:26
Introduction of Anaconda Installation • 7:39
The Python programming code and data used in this course • 3:24
Implement decision tree modeling in Python (1) • 14:13
Implement decision tree modeling in Python (2) • 12:09
Implement decision tree modeling in Python (3) • 11:03

Create first XGBoost model in Python • 13:28
Lecture on the explanation of XGBoost’s parameters (1) • 8:03
Lecture on the explanation of XGBoost’s parameters (2) • 9:11
Lecture on the explanation of XGBoost’s parameters (3) • 6:28
Lecture on the explanation of XGBoost’s parameters (4) • 5:32
Lecture on the explanation of XGBoost’s parameters (5) • 4:23
Lecture on the explanation of XGBoost’s parameters (6) • 6:49
Lecture on the explanation of XGBoost’s parameters (7) • 4:49
Build XGBoostClassifier for credit risk score card using Python (1) • 17:15
Build XGBoostClassifier for credit risk score card using Python (2) • 11:27
Build XGBoostClassifier for credit risk score card using Python (3) • 13:08
Lecture on the XGBoost’s fit method and native XGBoost booster (1) • 12:32
Lecture on the XGBoost’s fit method and native XGBoost booster (2) • 17:27
Implement native XGBoost booster in Python by examples • 12:28

Introduction of XGBoost algorithm for multi-classification solutions • 6:57
Use case of XGBoost for predicting ordinal model objectives in Python • 16:33
Use case of XGBoost for predicting multi-categorical model objectives • 20:51
Overview of feature importance and application for XGBoost modeling • 12:11
Python programs on feature importance and feature selection in XGBoost (1) • 20:35
Python programs on feature importance and feature selection in XGBoost (2) • 13:31
Introduce Parameter tuning methods in XGBoost modeling • 15:42
Introduce online sales forecasting project with XGBoost modeling • 9:05
Lecture on the Python program of online sales XGBoost modeling (1) • 13:01
Lecture on the Python program of online sales XGBoost modeling (2) • 8:40
Lecture on the Python program of online sales XGBoost modeling (3) • 13:38
Lecture on the Python program of online sales XGBoost modeling (4) • 11:09
Lecture on the Python program of online sales XGBoost modeling (5) • 12:05

What you have learned in this course and some supplementary materials • 2:51
Summary on feature engineering in XGBoost modeling • 6:12
Summary on feature standardization in XGBoost modeling • 6:22
Summary on handling categorical and missing data in XGBoost modeling • 4:31
Summary on feature selection in XGBoost modeling • 3:30
Summary on training and validating model in XGBoost modeling • 6:54
Summary on parameters tuning and model persistence in XGBoost • 9:51
Show Python program for XGBoost model persistence • 3:45

Requirements

Basic math background
Basic computer skills

Description

The future belongs to AI and machine learning, so mastering applied machine learning is like holding a key to your future career. If you could learn only one tool or algorithm for machine learning or predictive modeling right now, which should it be? Without a doubt, XGBoost! If you are going to enter a Kaggle contest, what is your preferred modeling tool? Again, the answer is XGBoost! This has been borne out by countless experienced data scientists and newcomers alike. So register for this course!

XGBoost is famous in Kaggle contests for its excellent accuracy, speed, and stability. For example, according to one survey, more than 70% of the top Kaggle winners said they have used XGBoost.

XGBoost is highly versatile in the data science world. This powerful algorithm is frequently used to predict various types of targets (continuous, binary, and categorical), and it has also proven very effective for multiclass and multilabel classification problems. In addition, contests on the Kaggle platform cover almost every application area and industry, such as retail, banking, insurance, pharmaceutical research, traffic control, and credit risk management.
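As a rough illustration of how the target type shapes the model setup, the sketch below maps each kind of target to an XGBoost objective. It is not taken from the course materials: the helper name `objective_params` is hypothetical, while the objective and metric strings are standard XGBoost names. The resulting dict would typically be passed as the parameter set when training a model.

```python
# Hypothetical helper: choose XGBoost objective settings by target type.
# The objective and eval_metric strings are standard XGBoost names.

def objective_params(target_type, num_class=None):
    """Return a parameter dict for the given kind of prediction target."""
    if target_type == "continuous":
        # Regression with squared-error loss.
        return {"objective": "reg:squarederror", "eval_metric": "rmse"}
    if target_type == "binary":
        # Binary classification; predictions are probabilities.
        return {"objective": "binary:logistic", "eval_metric": "logloss"}
    if target_type == "multiclass":
        # Multiclass classification needs the number of classes.
        if num_class is None or num_class < 2:
            raise ValueError("multiclass targets require num_class >= 2")
        return {
            "objective": "multi:softprob",
            "num_class": num_class,
            "eval_metric": "mlogloss",
        }
    raise ValueError(f"unknown target type: {target_type!r}")


params = objective_params("multiclass", num_class=3)
```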

XGBoost is powerful, but it is not easy to exercise its full capabilities without expert guidance. For example, to implement the XGBoost algorithm successfully, you need to understand and adjust many parameter settings. To that end, I will teach you the underlying algorithm so that you can configure XGBoost to suit different data and application scenarios. In addition, I will provide intensive lectures on feature engineering, feature selection, and parameter tuning aimed at XGBoost. After this training, you should also be able to prepare suitable data and features to feed the XGBoost model.
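To give a feel for what adjusting parameter settings can involve, here is a minimal sketch of enumerating a small grid of candidate settings. It is an assumption-laden illustration, not the course's own tuning code: `max_depth`, `eta`, and `subsample` are real XGBoost parameters, but the candidate values and the helper `iter_param_grid` are made up for this example. In practice each candidate would be scored with cross-validation.

```python
from itertools import product

# Candidate values are arbitrary placeholders, not recommendations.
search_space = {
    "max_depth": [3, 5, 7],     # maximum depth of each tree
    "eta": [0.05, 0.1],         # learning rate (shrinkage per boosting round)
    "subsample": [0.8, 1.0],    # fraction of rows sampled for each tree
}


def iter_param_grid(space):
    """Yield one parameter dict per combination of candidate values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


candidates = list(iter_param_grid(search_space))
print(len(candidates))  # 3 * 2 * 2 = 12 combinations
```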

This course is very practical without neglecting theory. We start from decision trees and their related concepts and components, move on to constructing gradient boosting methods, and then arrive at XGBoost modeling. Math and statistics are used lightly to explain the mechanisms behind the machine learning methods. We use Python pandas data frames for data exploration and cleaning. One significant feature of this course is that we use many Python program examples to demonstrate every knowledge point and skill covered in the lectures.
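The step from decision trees to gradient boosting can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the course's code and not XGBoost itself: it uses one-split trees (stumps) on a single feature, squared-error loss, and made-up toy data, and all function names are hypothetical. Each round fits a stump to the current residuals and adds a shrunken copy of it to the ensemble.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimizes squared error."""
    best = None
    # Try every distinct value except the largest, so both sides are non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]  # (threshold, left_mean, right_mean)


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Fit stumps to residuals round by round (squared-error loss)."""
    base = sum(ys) / len(ys)          # initial prediction: the mean target
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Toy data (made up): low targets for small x, high targets for large x.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]

model = gradient_boost(xs, ys)
baseline_mse = mse(ys, [sum(ys) / len(ys)] * len(ys))
boosted_mse = mse(ys, [predict(model, x) for x in xs])
print(baseline_mse, boosted_mse)
```

XGBoost builds on the same additive idea but adds regularization, second-order gradient information, and many engineering optimizations, which is why its parameters need more care than this sketch suggests.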

Who this course is for:

Anyone who enjoys the Kaggle contests
Anyone who wishes to learn how to apply machine learning and data science approaches to business

Course content

Introduction • 7 lectures • 40min

Decision tree and implementation • 16 lectures • 2hr 9min

Create gradient boost machine using decision trees • 5 lectures • 58min

Introduce XGBoost method and application • 14 lectures • 2hr 23min

Advanced topics on XGBoost algorithm • 13 lectures • 2hr 54min

Summary of XGBoost modeling and important things • 8 lectures • 44min
