Udemy
2020-12-29 14:04:04
30-Day Money-Back Guarantee

This course includes:

  • 5 hours on-demand video
  • 16 articles
  • 1 downloadable resource
  • Full lifetime access
  • Access on mobile and TV

Feature Selection for Machine Learning

Select the variables in your data to build simpler, faster and more reliable machine learning models.
Rating: 4.6 out of 5 (1,187 ratings)
8,006 students
Created by Soledad Galli
Last updated 12/2020
English
English [Auto]

What you'll learn

  • Learn about filter, embedded and wrapper methods for feature selection
  • Find out about hybrid methods for feature selection
  • Select features with Lasso and decision trees
  • Implement different methods of feature selection with Python
  • Learn why less (features) is more
  • Reduce the feature space in a dataset
  • Build simpler, faster and more reliable machine learning models
  • Analyse and understand the selected features
  • Discover feature selection techniques used in data science competitions
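
As an illustration of the Lasso item above, here is a minimal sketch of embedded selection with scikit-learn. It is not taken from the course notebooks: the synthetic data set, the `alpha` value and the variable names are assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Synthetic regression data (an assumption for this sketch):
# 10 features, only 3 of which actually drive the target.
X, y, true_coef = make_regression(
    n_samples=200, n_features=10, n_informative=3,
    noise=1.0, coef=True, random_state=0,
)

# Lasso penalises coefficient size, so features should share a scale.
X_scaled = StandardScaler().fit_transform(X)

# The L1 penalty shrinks the coefficients of uninformative features
# to exactly zero; SelectFromModel keeps the non-zero ones.
selector = SelectFromModel(Lasso(alpha=1.0))  # alpha is a hypothetical choice
selector.fit(X_scaled, y)

selected = np.flatnonzero(selector.get_support())
informative = np.flatnonzero(true_coef)
print("selected feature indices:   ", selected)
print("informative feature indices:", informative)
```

A larger `alpha` removes more features; in practice its value would be tuned, for example by cross-validation.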
Curated for the Udemy for Business collection

Course content

12 sections • 80 lectures • 5h 12m total length

  • Preview
    04:03
  • Preview
    03:33
  • Preview
    03:07
  • Course Aim
    01:44
  • Optional: How to approach this course
    01:00
  • Course Material
    02:01
  • The code | Jupyter notebooks
    00:15
  • Presentations covered in this course
    00:04
  • Download the data sets
    00:38
  • FAQ: Data Science and Python programming
    00:36

  • What is feature selection?
    06:15
  • Feature selection methods | Overview
    06:19
  • Filter Methods
    03:33
  • Wrapper methods
    05:42
  • Embedded Methods
    03:52
  • Moving Forward
    04:05
  • Open-source packages for feature selection
    03:00

  • Constant, quasi constant, and duplicated features – Intro
    04:02
  • Constant features
    07:53
  • Quasi-constant features
    07:07
  • Duplicated features
    05:23
  • Install Feature-engine
    00:14
  • Drop constant and quasi-constant with Feature-engine
    04:20
  • Drop duplicates with Feature-engine
    05:23

  • Correlation - Intro
    02:41
  • Correlation Feature Selection
    05:32
  • Correlation procedures to select features
    03:37
  • Correlation | Notebook demo
    11:49
  • Basic methods plus Correlation pipeline
    00:14
  • Correlation with Feature-engine
    08:01
  • Feature Selection Pipeline with Feature-engine
    02:19
  • Additional reading resources
    00:07

  • Statistical methods – Intro
    03:25
  • Mutual information
    06:11
  • Mutual information demo
    04:39
  • Chi-square
    06:09
  • Chi-square | Demo
    03:34
  • Anova
    05:54
  • Anova | Demo
    06:10
  • Basic methods + Correlation + Filter with stats pipeline
    00:16

  • Filter Methods with other metrics
    03:04
  • Univariate model performance metrics
    05:52
  • Univariate model performance metrics | Demo
    04:23
  • KDD 2009: Select features by target mean encoding
    06:39
  • KDD 2009: Select features by mean encoding | Demo
    06:59
  • Univariate model performance with Feature-engine
    04:54
  • Target Mean Encoding Selection with Feature-engine
    05:20

  • Wrapper methods – Intro
    06:39
  • MLXtend
    00:16
  • Step forward feature selection
    03:14
  • Step forward feature selection | Demo
    06:00
  • Step backward feature selection
    03:13
  • Step backward feature selection | Demo
    05:50
  • Exhaustive search
    02:45
  • Exhaustive search | Demo
    03:37

  • Regression Coefficients – Intro
    04:21
  • Selection by Logistic Regression Coefficients
    06:52
  • Selection by Linear Regression Coefficients
    02:44
  • Coefficients change with penalty
    05:26
  • Basic methods + Correlation + Embedded method using coefficients
    00:17

  • Regularisation – Intro
    05:39
  • Lasso
    06:39
  • A note on SelectFromModel
    00:35
  • Basic filter methods + LASSO pipeline
    00:16

  • Feature Selection by Tree importance | Intro
    06:46
  • Feature Selection by Tree importance | Demo
    03:40
  • Feature Selection by Tree importance | Recursively
    05:04
  • Feature selection with decision trees | review
    00:16

Requirements

  • A Python installation
  • Jupyter notebook installation
  • Python coding skills
  • Some experience with Numpy and Pandas
  • Familiarity with Machine Learning algorithms
  • Familiarity with scikit-learn

Description

Welcome to Feature Selection for Machine Learning, the most comprehensive course on feature selection available online.

In this course, you will learn how to select the variables in your data set and build simpler, faster, more reliable and more interpretable machine learning models.


Who is this course for?

You’ve taken your first steps into data science, you know the most commonly used machine learning models, and you have probably built a few linear regression or decision tree based models. You are familiar with data pre-processing techniques like removing missing data, transforming variables and encoding categorical variables. At this stage you’ve probably realized that many data sets contain an enormous number of features: some of them are identical or very similar, some are not predictive at all, and for others it is harder to say.

You wonder how to go about finding the most predictive features. Which ones are OK to keep and which ones could you do without? You also wonder how to code the methods in a professional manner. You have probably searched online and found that there is not much out there about feature selection. So you start to wonder: how are things really done in tech companies?

This course will help you! It is the most comprehensive online course on variable selection. You will learn a wide variety of feature selection procedures, used worldwide in different organizations and in data science competitions, to select the most predictive features.


What will you learn?

I have put together a fantastic collection of feature selection techniques, based on scientific articles, data science competitions and of course my own experience as a data scientist.

Specifically, you will learn:

  • How to remove features with low variance

  • How to identify redundant features

  • How to select features based on statistical tests

  • How to select features based on changes in model performance

  • How to find predictive features based on importance attributed by models

  • How to code procedures elegantly and in a professional manner

  • How to leverage the power of existing Python libraries for feature selection
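
To make the first two items concrete, here is a minimal sketch of the basic filter steps (low variance, duplicates, high correlation) using pandas and scikit-learn. It is not the course's own code: the toy data, the column names and both thresholds are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import VarianceThreshold

# Toy data set; every column name here is hypothetical.
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "age": rng.normal(40, 10, n),
    "income": rng.normal(50_000, 8_000, n),
    "country_code": np.ones(n),                      # constant
    "is_test_account": np.r_[1.0, np.zeros(n - 1)],  # quasi-constant
})
X["age_copy"] = X["age"]                                     # exact duplicate
X["income_k"] = X["income"] / 1000 + rng.normal(0, 0.01, n)  # near-copy

# 1. Drop constant and quasi-constant features (variance <= 0.01).
vt = VarianceThreshold(threshold=0.01).fit(X)
X_var = X.loc[:, vt.get_support()]

# 2. Drop duplicated features: transpose so columns become rows,
#    then flag every row that repeats an earlier one.
duplicated = X_var.T.duplicated().to_numpy()
X_unique = X_var.loc[:, ~duplicated]

# 3. Drop one feature from each highly correlated pair (|r| > 0.9),
#    scanning only the upper triangle so each pair is seen once.
corr = X_unique.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]
X_selected = X_unique.drop(columns=to_drop)

print(list(X_selected.columns))
```

Which member of a correlated pair to keep is a design choice. This sketch keeps whichever appears first, whereas a more careful procedure might keep the one that is more predictive of the target.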


Throughout the course, you are going to learn multiple techniques for each of the mentioned tasks, and you will learn to implement these techniques in an elegant, efficient, and professional manner, using Python, Scikit-learn, pandas and mlxtend.
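
As an example of selecting features based on changes in model performance, the sketch below runs step forward selection. Note the swap: it uses scikit-learn's `SequentialFeatureSelector` rather than mlxtend, and the data set, estimator and number of features to keep are assumptions for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

# Synthetic data (an assumption): with shuffle=False the three
# informative features are columns 0, 1 and 2.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=3,
    n_redundant=0, shuffle=False, random_state=0,
)

# Step forward selection: start from no features and, at each step,
# add the feature that most improves the cross-validated ROC-AUC.
sfs = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),
    n_features_to_select=3,   # hypothetical target size
    direction="forward",
    scoring="roc_auc",
    cv=3,
)
sfs.fit(X, y)

selected = sfs.get_support(indices=True)
X_reduced = sfs.transform(X)
print("selected feature indices:", selected)
```

Setting `direction="backward"` gives step backward selection instead: it starts from all features and removes one at a time.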


At the end of the course, you will have a variety of tools to select and compare different feature subsets and identify the one that returns the simplest, yet most predictive machine learning model. This will allow you to minimize the time it takes to put your predictive models into production.
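
One way to compare feature subsets is to wrap each selection method in a pipeline and cross-validate the whole pipeline. The sketch below is an assumption about how this could be done with scikit-learn, not the course's own procedure; the data set, the selectors and their settings are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption): 20 features, 4 of them informative.
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=4,
    n_redundant=2, random_state=0,
)


def final_model():
    # The same classifier is used after every selection method.
    return LogisticRegression(max_iter=1000)


# Selection sits inside each pipeline, so it is re-fitted on the
# training folds only and never sees the held-out fold.
candidates = {
    "all features": make_pipeline(StandardScaler(), final_model()),
    "ANOVA, top 5": make_pipeline(
        StandardScaler(), SelectKBest(f_classif, k=5), final_model()
    ),
    "forest importance": make_pipeline(
        StandardScaler(),
        SelectFromModel(RandomForestClassifier(n_estimators=50, random_state=0)),
        final_model(),
    ),
}

scores = {
    name: cross_val_score(pipe, X, y, cv=5, scoring="roc_auc").mean()
    for name, pipe in candidates.items()
}
for name, score in scores.items():
    print(f"{name}: mean ROC-AUC = {score:.3f}")
```

If a smaller subset scores about as well as the full feature set, the simpler model is usually the better candidate for production.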


This comprehensive feature selection course includes 80 lectures spanning about 5 hours of video, and all topics include hands-on Python code examples which you can use for reference and practice, and re-use in your own projects.


In addition, I update the course regularly to keep up with new releases of the Python libraries and to include new techniques when they appear.

So what are you waiting for? Enroll today, embrace the power of feature selection and build simpler, faster and more reliable machine learning models.

Who this course is for:

  • Beginner Data Scientists who want to understand how to select variables for machine learning
  • Intermediate Data Scientists who want to level up their experience in feature selection for machine learning
  • Advanced Data Scientists who want to discover alternative methods for feature selection
  • Software engineers and academics switching careers into data science
  • Software engineers and academics stepping into data science
  • Data analysts who want to level up their skills in data science

Featured review

Josep Maria Niubo Marti
21 courses
8 reviews
Rating: 5.0 out of 5, a year ago
Certainly, above expectations. Feature selection is a crucial part of any machine learning process. This course explains very well the different techniques that can be applied, the pros and the cons, on a very comprehensive manner. Thank you.

Instructor

Soledad Galli
Lead Data Scientist
  • 4.6 Instructor Rating
  • 5,560 Reviews
  • 24,109 Students
  • 6 Courses

Soledad Galli is a lead data scientist and founder of Train in Data. She has experience in finance and insurance, received a Data Science Leaders Award in 2018 and was selected “LinkedIn’s voice” in data science and analytics in 2019. Sole is passionate about sharing knowledge and helping others succeed in data science.

As a data scientist in finance and insurance companies, Sole researched, developed and put into production machine learning models to assess credit risk and insurance claims and to prevent fraud, leading the adoption of machine learning in those organizations.

Sole is passionate about empowering people to step into and excel in data science. She mentors data scientists, writes articles online, speaks at data science meetings, and teaches online courses on machine learning.

Sole has recently created Train In Data, with the mission to facilitate and empower people and organizations worldwide to step into and excel in data science and analytics.

Sole has an MSc in Biology, a PhD in Biochemistry and 8+ years of experience as a research scientist in well-known institutions like University College London and the Max Planck Institute. She has scientific publications in various fields such as Cancer Research and Neuroscience, and her research was covered by the media on multiple occasions.

Soledad has 4+ years of experience as an instructor in Biochemistry at the University of Buenos Aires, has taught seminars and tutorials at University College London, and has mentored MSc and PhD students at universities.

Feel free to contact her on LinkedIn.


========================


Soledad Galli is a data scientist and founder of Train in Data. She has experience in finance and insurance, received the Data Science Leaders Award in 2018 and was selected as "LinkedIn's voice" in data science and analytics in 2019. Soledad is passionate about sharing knowledge and helping others succeed in data science.


As a data scientist in finance and insurance companies, Sole developed and put into production machine learning models to assess credit risk, automate insurance claims and prevent fraud, facilitating the adoption of machine learning in these organizations.


Sole is passionate about helping people learn and excel in data science, which is why she regularly speaks at data science meetings, writes articles available on the web and creates courses on machine learning.


Sole has recently created Train In Data, with the mission of helping people and organizations around the world learn and excel in data science and analytics.


Sole has a master's degree in biology, a doctorate in biochemistry and more than 8 years of experience as a research scientist in prestigious institutions such as University College London and the Max Planck Institute. She has scientific publications in various fields, such as cancer research and neuroscience, and her results were covered by the media on multiple occasions.


Soledad has more than 4 years of experience as a biochemistry instructor at the University of Buenos Aires, gave seminars and tutorials at University College London, in London, and mentored master's and doctoral students at different universities.


Feel free to contact her on LinkedIn.

© 2021 Udemy, Inc.