Feature Selection for Machine Learning

From beginner to advanced
4.6 (950 ratings)
6,470 students enrolled
Created by Soledad Galli
Last updated 5/2020
English
This course includes
  • 3.5 hours on-demand video
  • 23 articles
  • 2 downloadable resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What you'll learn
  • Understand different methods of feature selection
  • Implement different methods of feature selection
  • Reduce feature space in a dataset
  • Build simpler, faster and more reliable machine learning models
  • Analyse and understand the selected features
Course content
53 lectures • 05:11:14
+ Introduction
11 lectures 14:41
Additional Requirements | Nice to have
00:26
How to approach this course
01:01
Guide to setting up your computer
01:02
Installing XGBoost in Windows
00:15
Download the data sets
00:33
Presentations covered in this course
00:04
Jupyter notebooks covered in this course
00:04
FAQ: Data Science and Python programming
00:32
+ Feature Selection
5 lectures 25:41
What is feature selection?
06:15
Feature selection methods | Overview
06:19
Filter Methods
03:33
Wrapper methods
05:42
Embedded Methods
03:52
+ Filter Methods | Basics
5 lectures 31:43
Constant, quasi-constant, and duplicated features – Intro
04:02
Constant features
09:52
Quasi-constant features
09:40
Duplicated features
06:51
Basic methods | review
01:18
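
A minimal sketch, in Python, of the basic filter methods listed in this section: dropping constant, quasi-constant, and duplicated features. The toy DataFrame, the 80% dominance threshold, and the pandas-only approach are illustrative assumptions, not the course's own code.

import pandas as pd

df = pd.DataFrame({
    "constant": [1, 1, 1, 1, 1, 1],
    "quasi_constant": [0, 0, 0, 0, 0, 1],     # one value covers ~83% of rows
    "informative": [3, 7, 1, 9, 4, 6],
    "duplicate": [3, 7, 1, 9, 4, 6],           # exact copy of "informative"
})

# Constant features: a single unique value across all rows.
constant = [c for c in df.columns if df[c].nunique() == 1]

# Quasi-constant features: one value dominates (here, more than 80% of rows).
quasi_constant = [
    c for c in df.columns
    if c not in constant
    and df[c].value_counts(normalize=True).iloc[0] > 0.80
]

# Duplicated features: identical columns, found by transposing the frame
# and reusing pandas' duplicated() on the resulting rows.
duplicated = df.T[df.T.duplicated()].index.tolist()

to_drop = set(constant + quasi_constant + duplicated)
print("dropped:", sorted(to_drop))
print("kept:", [c for c in df.columns if c not in to_drop])
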
+ Filter methods | Correlation
3 lectures 23:24
Correlation – Intro
05:32
Correlation
14:35
Basic methods plus Correlation pipeline
03:17
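
A minimal sketch of the correlation filter covered in this section: compute the absolute pairwise Pearson correlations and drop one feature from each pair above a threshold. The synthetic data and the 0.8 cut-off are illustrative assumptions.

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
df = pd.DataFrame({
    "x1": x1,
    "x2": 0.95 * x1 + rng.normal(scale=0.1, size=200),  # nearly a copy of x1
    "x3": rng.normal(size=200),                          # unrelated noise
})

corr = df.corr().abs()
# Keep only the upper triangle so each pair is inspected exactly once.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.8).any()]

print("dropped:", to_drop)                       # expected: ['x2']
print("kept:", [c for c in df.columns if c not in to_drop])
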
+ Filter methods | Statistical measures
7 lectures 54:26
Statistical methods – Intro
13:41
Mutual information
08:04
Chi-square for categorical variables | Fisher score
04:46
Univariate approaches
09:27
Univariate ROC-AUC
06:59
Basic methods + Correlation + univariate ROC-AUC pipeline
04:11
BONUS: select features by mean encoding | KDD 2009
07:18
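
A minimal sketch of two of the univariate measures in this section, using scikit-learn: mutual information through SelectKBest, and a simple univariate ROC-AUC ranking that scores each feature on its own. The dataset, the value of k, and the model-free ROC-AUC variant are illustrative assumptions.

from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.metrics import roc_auc_score

X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# Mutual information: keep the 10 features sharing most information with y.
selector = SelectKBest(score_func=mutual_info_classif, k=10).fit(X, y)
print("mutual information picks:", list(X.columns[selector.get_support()]))

# Univariate ROC-AUC: use each raw feature as a score for the target and keep
# the strongest (values below 0.5 are flipped, since the direction of the
# relationship does not matter for ranking).
auc = {c: max(roc_auc_score(y, X[c]), 1 - roc_auc_score(y, X[c])) for c in X.columns}
top10 = sorted(auc, key=auc.get, reverse=True)[:10]
print("univariate ROC-AUC picks:", top10)
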
+ Wrapper methods
4 lectures 38:02
Wrapper methods – Intro
06:39
Step forward feature selection
11:58
Step backward feature selection
11:32
Exhaustive search
07:53
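
A minimal sketch of step forward selection, one of the wrapper methods in this section, using scikit-learn's SequentialFeatureSelector; the course may rely on a different library, and the estimator, scoring metric and stopping point below are illustrative assumptions.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector

X, y = load_breast_cancer(return_X_y=True, as_frame=True)

sfs = SequentialFeatureSelector(
    RandomForestClassifier(n_estimators=50, random_state=0),
    n_features_to_select=5,     # stop once 5 features have been added
    direction="forward",        # "backward" gives step backward selection
    scoring="roc_auc",
    cv=3,
)
sfs.fit(X, y)
print(list(X.columns[sfs.get_support()]))

Exhaustive search evaluates every possible feature subset, which quickly becomes intractable as the number of features grows; the greedy forward and backward variants are the practical compromise.
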
+ Embedded methods – Lasso regularisation
3 lectures 19:19
Regularisation – Intro
05:42
Lasso
08:38
Basic filter methods + LASSO pipeline
04:59
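
A minimal sketch of embedded selection with Lasso (L1) regularisation as described in this section: coefficients shrunk to zero drop out, and SelectFromModel keeps the rest. The dataset, the scaling step and the penalty strength C=0.1 are illustrative assumptions.

from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_scaled = StandardScaler().fit_transform(X)   # scale so the penalty acts evenly

l1_logit = LogisticRegression(penalty="l1", C=0.1, solver="liblinear")
selector = SelectFromModel(l1_logit).fit(X_scaled, y)

kept = X.columns[selector.get_support()]
print(f"kept {len(kept)} of {X.shape[1]} features:", list(kept))
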
+ Embedded methods | Linear models
5 lectures 25:21
Regression Coefficients – Intro
04:15
Selection by Logistic Regression Coefficients
07:38
Coefficients change with penalty
05:50
Selection by Linear Regression Coefficients
03:01
Feature selection with linear models | review
04:37
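
A minimal sketch of selection by regression coefficients, in the spirit of this section: fit a linear model on standardised features and keep those whose absolute coefficient is above the mean. The dataset, the logistic model and the "mean" threshold are illustrative assumptions.

from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_scaled = StandardScaler().fit_transform(X)   # comparable coefficients need comparable scales

logit = LogisticRegression(max_iter=1000)
selector = SelectFromModel(logit, threshold="mean").fit(X_scaled, y)

kept = X.columns[selector.get_support()]
print(f"kept {len(kept)} of {X.shape[1]} features:", list(kept))
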
+ Embedded methods | Trees
5 lectures 23:08
Selecting Features by Tree importance – Intro
06:46
Select by model importance random forests | embedded
05:34
Select by model importance random forests | recursively
03:41
Select by model importance gradient boosted machines
02:22
Feature selection with decision trees | review
04:44
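
A minimal sketch of selection by tree importance, as covered in this section: fit a random forest and let SelectFromModel keep the features whose importance exceeds the mean, its default threshold for tree ensembles. The dataset and forest settings are illustrative assumptions.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

X, y = load_breast_cancer(return_X_y=True, as_frame=True)

selector = SelectFromModel(
    RandomForestClassifier(n_estimators=100, random_state=0)
).fit(X, y)

kept = X.columns[selector.get_support()]
print(f"kept {len(kept)} of {X.shape[1]} features:", list(kept))
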
+ Reading Resources
1 lecture 00:18
Additional reading resources
00:18
Requirements
  • A Python installation
  • Jupyter notebook installation
  • Python coding skills
  • Some experience with NumPy and Pandas
  • Familiarity with Machine Learning algorithms
  • Familiarity with scikit-learn
Description

Learn how to select features and build simpler, faster and more reliable machine learning models.

This is the most comprehensive, yet easy to follow, course for feature selection available online. Throughout this course you will learn a variety of techniques used worldwide for variable selection, gathered from data competition websites, white papers, blogs and forums, and from the instructor's experience as a data scientist.

You will have at your fingertips, all in one place, multiple methods that you can apply to select features from your data set.

The course starts by describing simple and fast methods to quickly screen the data set and remove redundant and irrelevant features. It then moves on to more complex techniques that select variables by taking into account feature interactions, feature importance, and how the features interact with the machine learning algorithm. Finally, it covers specific techniques used in data competitions and in industry.

Each lecture includes an explanation of the feature selection technique, the rationale for using it, and the advantages and limitations of the procedure, together with full code that you can take home and apply to your own data sets.

This course is therefore suitable for complete beginners in data science looking to learn how to select features from a data set, as well as for intermediate and even advanced data scientists seeking to level up their skills.

With more than 50 lectures and over 5 hours of content, this comprehensive course covers every aspect of variable selection. Throughout the course you will use Python as your main language.

So what are you waiting for? Enrol today, learn how to select variables for machine learning, and build simpler, faster and more reliable machine learning models.

Who this course is for:
  • Beginner Data Scientists who want to understand how to select variables for machine learning
  • Intermediate Data Scientists who want to level up their experience in feature selection for machine learning
  • Advanced Data Scientists who want to discover alternative methods for feature selection
  • Software engineers and academics switching careers into data science
  • Data analysts who want to level up their skills in data science