Feature Selection for Machine Learning
What you'll learn
- Learn about filter, embedded and wrapper methods for feature selection
- Find out about hybrid methods for feature selection
- Select features with Lasso and decision trees
- Implement different methods of feature selection with Python
- Learn why less (features) is more
- Reduce the feature space in a dataset
- Build simpler, faster and more reliable machine learning models
- Analyse and understand the selected features
- Discover feature selection techniques used in data science competitions
Requirements
- A Python installation
- A Jupyter Notebook installation
- Python coding skills
- Some experience with NumPy and pandas
- Familiarity with Machine Learning algorithms
- Familiarity with scikit-learn
Description
Welcome to Feature Selection for Machine Learning, the most comprehensive course on feature selection available online.
In this course, you will learn how to select the variables in your data set and build simpler, faster, more reliable and more interpretable machine learning models.
Who is this course for?
You’ve taken your first steps into data science, you know the most commonly used machine learning models, and you have probably built a few linear regression or decision tree based models. You are familiar with data pre-processing techniques like removing missing data, transforming variables and encoding categorical variables. At this stage you’ve probably realized that many data sets contain an enormous number of features: some are identical or very similar, some are not predictive at all, and for others it is harder to say.
You wonder how to go about finding the most predictive features. Which ones are OK to keep and which ones could you do without? You also wonder how to code the methods in a professional manner. You have probably searched online and found that there is not much out there about feature selection. So you start to wonder: how are things really done in tech companies?
This course will help you! This is the most comprehensive online course on variable selection. You will learn a wide variety of feature selection procedures, used worldwide in different organizations and in data science competitions, to select the most predictive features.
What will you learn?
I have put together a fantastic collection of feature selection techniques, based on scientific articles, data science competitions and of course my own experience as a data scientist.
Specifically, you will learn:
- How to remove features with low variance
- How to identify redundant features
- How to select features based on statistical tests
- How to select features based on changes in model performance
- How to find predictive features based on importance attributed by models
- How to code procedures elegantly and in a professional manner
- How to leverage the power of existing Python libraries for feature selection
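As a rough illustration of the first three items in the list above (low variance, redundancy, statistical tests), here is a minimal sketch. It is not taken from the course materials. It assumes scikit-learn, NumPy and pandas are installed and uses synthetic data; the column names, the 0.95 correlation cut-off and `k=3` are arbitrary choices made only for this example.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, VarianceThreshold, f_classif

# Synthetic data, used only for illustration (not a course dataset).
X_arr, y = make_classification(
    n_samples=500, n_features=8, n_informative=3, n_redundant=0,
    random_state=0,
)
X = pd.DataFrame(X_arr, columns=[f"var_{i}" for i in range(8)])
X["constant"] = 1.0               # a feature with zero variance
X["copy_of_var_0"] = X["var_0"]   # a perfectly redundant feature

# 1) Remove constant features (variance equal to zero).
vt = VarianceThreshold(threshold=0.0)
vt.fit(X)
X_var = X.loc[:, vt.get_support()]

# 2) Drop one feature from each highly correlated pair.
#    Only the upper triangle is inspected, so the first feature of a pair is kept.
corr = X_var.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.95).any()]
X_uncorr = X_var.drop(columns=to_drop)

# 3) Keep the k features with the highest ANOVA F-statistic against the target.
skb = SelectKBest(score_func=f_classif, k=3).fit(X_uncorr, y)
selected = list(X_uncorr.columns[skb.get_support()])

print("Dropped as redundant:", to_drop)
print("Selected by the F-test:", selected)
```

The three steps are chained so that each one works on the output of the previous one. In a real project they would normally be fitted on the training set only and then applied to the test set.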
Throughout the course, you are going to learn multiple techniques for each of the mentioned tasks, and you will learn to implement these techniques in an elegant, efficient, and professional manner, using Python, scikit-learn, pandas and mlxtend.
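To make the model-based tasks more concrete, the sketch below shows one possible way to select features from Lasso coefficients, from tree-based importance, and from changes in model performance. It is an illustration under stated assumptions, not the course's own code: the data is synthetic, `alpha=1.0` and the choice of three features are arbitrary, and scikit-learn's `SequentialFeatureSelector` is used as a stand-in for the wrapper step (mlxtend offers a similar selector).

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel, SequentialFeatureSelector
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.preprocessing import StandardScaler

# Synthetic regression data, used only for illustration.
X, y = make_regression(
    n_samples=300, n_features=10, n_informative=3, noise=5.0, random_state=0
)
X = StandardScaler().fit_transform(X)  # Lasso is sensitive to feature scale

# Embedded method 1: Lasso shrinks the coefficients of unhelpful features
# to zero; SelectFromModel keeps the features with non-zero coefficients.
lasso_selector = SelectFromModel(Lasso(alpha=1.0)).fit(X, y)
lasso_mask = lasso_selector.get_support()

# Embedded method 2: keep features whose random forest importance is
# at least the mean importance across all features.
forest_selector = SelectFromModel(
    RandomForestRegressor(n_estimators=50, random_state=0), threshold="mean"
).fit(X, y)
forest_mask = forest_selector.get_support()

# Wrapper method: forward selection adds, one at a time, the feature that
# most improves the cross-validated score, until 3 features are chosen.
sfs = SequentialFeatureSelector(
    LinearRegression(), n_features_to_select=3, direction="forward", cv=3
).fit(X, y)
sfs_mask = sfs.get_support()

print("Lasso keeps:", lasso_mask.nonzero()[0])
print("Forest keeps:", forest_mask.nonzero()[0])
print("Forward selection keeps:", sfs_mask.nonzero()[0])
```

The embedded selectors need a single model fit each, while the wrapper refits the model many times, so it is usually the slowest of the three.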
At the end of the course, you will have a variety of tools to select and compare different feature subsets and identify the one that returns the simplest, yet most predictive machine learning model. This will allow you to minimize the time it takes to put your predictive models into production.
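One simple way to compare feature subsets, sketched here under stated assumptions, is to score the same model on each subset with cross-validation and then prefer the smallest subset whose score is close to the best. The data is synthetic, and the subset names and the 0.01 tolerance are hypothetical choices for this example only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic data; with shuffle=False the 4 informative columns come first
# and the remaining columns are pure noise.
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=4, n_redundant=0,
    n_clusters_per_class=1, shuffle=False, random_state=0,
)

# Hypothetical candidate subsets, given as column indices.
subsets = {
    "all_features": list(range(12)),
    "first_4": [0, 1, 2, 3],
    "last_4": [8, 9, 10, 11],
}

# Cross-validated ROC-AUC of the same model on each subset.
results = {}
for name, cols in subsets.items():
    scores = cross_val_score(
        LogisticRegression(max_iter=1000), X[:, cols], y, cv=5, scoring="roc_auc"
    )
    results[name] = (len(cols), scores.mean())

# Prefer the smallest subset whose score is within a tolerance of the best.
tolerance = 0.01  # arbitrary value chosen for this sketch
best_score = max(score for _, score in results.values())
candidates = [
    (n_features, name)
    for name, (n_features, score) in results.items()
    if score >= best_score - tolerance
]
chosen = min(candidates)[1]

for name, (n_features, score) in results.items():
    print(f"{name}: {n_features} features, mean ROC-AUC = {score:.3f}")
print("Chosen subset:", chosen)
```

The tolerance makes the trade-off explicit: a slightly lower score is accepted in exchange for a model with fewer features.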
This comprehensive feature selection course includes about 70 lectures spanning ~8 hours of video, and ALL topics include hands-on Python code examples which you can use for reference and for practice, and re-use in your own projects.
In addition, I update the course regularly to keep up with new releases of the Python libraries and to include new techniques as they appear.
So what are you waiting for? Enroll today, embrace the power of feature selection and build simpler, faster and more reliable machine learning models.
Who this course is for:
- Beginner Data Scientists who want to understand how to select variables for machine learning
- Intermediate Data Scientists who want to level up their experience in feature selection for machine learning
- Advanced Data Scientists who want to discover alternative methods for feature selection
- Software engineers and academics switching careers into data science
- Software engineers and academics stepping into data science
- Data analysts who want to level up their skills in data science
Instructors
Hey, I am Sole. I am a data scientist and open-source Python developer with a passion for teaching and programming.
I teach intermediate and advanced courses on machine learning, covering topics like how to improve machine learning pipelines, better engineer and select features, optimize models, and deal with imbalanced datasets.
I am the developer and maintainer of Feature-engine, an open-source Python library for feature engineering and selection, and the author of Packt's "Python Feature Engineering Cookbook" and the "Feature Selection in Machine Learning with Python" book.
I received a Data Science Leaders Award in 2018 and was selected as one of "LinkedIn’s voices" in data science and analytics in 2019.
I worked as a data scientist for financial and insurance firms, developing and putting into production machine learning models to assess credit risk, process insurance claims, and prevent fraud.
I love sharing knowledge about data science and machine learning. This is why I teach online, create and contribute to open-source software, and also speak at meetups, write blogs, and participate in podcasts.
I've got an MSc in Biology, a PhD in Biochemistry, and 8+ years of experience as a research scientist at well-known institutions like University College London and the Max Planck Institute. I've also taught biochemistry for 4+ years at the University of Buenos Aires and mentored MSc and PhD students.
Feel free to contact me on LinkedIn, follow me on Twitter, or visit our website for blogs about machine learning.
Hey, we are a team of data scientists and Python developers with a passion for teaching and programming.
We teach intermediate and advanced courses on machine learning, covering topics like how to improve machine learning pipelines, better engineer and select features, optimize models, and deal with imbalanced datasets.
We are the developers of Feature-engine, an open-source Python library for feature engineering and selection, and the authors of Packt's "Python Feature Engineering Cookbook" and the "Feature Selection in Machine Learning with Python" book.
Feel free to contact our lead instructor on LinkedIn, follow her on Twitter, or visit our website for blogs about machine learning.