Udemy

Feature Engineering for Machine Learning

Learn imputation, variable encoding, discretization, feature extraction, how to work with datetime, outliers, and more.
Rating: 4.7 out of 5 (2,539 ratings)
17,200 students
Created by Soledad Galli
Last updated 6/2022
English
English [Auto]

What you'll learn

  • Learn multiple techniques for missing data imputation.
  • Transform categorical variables into numbers while capturing meaningful information.
  • Learn how to deal with infrequent, rare, and unseen categories.
  • Learn how to work with skewed variables.
  • Convert numerical variables into discrete ones.
  • Remove outliers from your variables.
  • Extract useful features from dates and time variables.
  • Learn techniques used in organizations worldwide and in data competitions.
  • Increase your repertoire of techniques to preprocess data and build more powerful machine learning models.
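
As a rough illustration of two of the items above, grouping rare categories and removing outliers, here is a minimal pandas sketch. It is not taken from the course notebooks: the data, the column names (`colour`, `price`), the 20% frequency threshold, and the 1.5 * IQR rule are all assumptions chosen for the example.

```python
import pandas as pd

# Toy data, invented for illustration only.
df = pd.DataFrame({
    "colour": ["red", "red", "blue", "red", "blue",
               "green", "red", "blue", "teal", "red"],
    "price": [10.0, 12.0, 11.0, 13.0, 9.0, 12.0, 10.0, 11.0, 95.0, 12.0],
})

# Rare categories: group labels seen in fewer than 20% of rows under "Rare".
# The 20% threshold is an arbitrary choice for this sketch.
freq = df["colour"].value_counts(normalize=True)
frequent = freq[freq >= 0.2].index
df["colour_grouped"] = df["colour"].where(df["colour"].isin(frequent), "Rare")

# Outliers: keep only rows within 1.5 * IQR of the quartiles.
q1, q3 = df["price"].quantile([0.25, 0.75])
iqr = q3 - q1
inside = df["price"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
df_no_outliers = df[inside]

print(df["colour_grouped"].value_counts())
print(df_no_outliers["price"].describe())
```

Because anything outside the `frequent` list is mapped to "Rare", the same rule would also give a category never seen during training a defined value.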

Requirements

  • A Python installation.
  • Jupyter notebook installation.
  • Python coding skills.
  • Some experience with Numpy and Pandas.
  • Familiarity with machine learning algorithms.
  • Familiarity with Scikit-Learn.

Description

Welcome to Feature Engineering for Machine Learning, the most comprehensive course on feature engineering available online. In this course, you will learn about variable imputation, variable encoding, feature transformation, discretization, and how to create new features from your data.


Master Feature Engineering and Feature Extraction.

In this course, you will learn multiple feature engineering methods that will allow you to transform your data and leave it ready to train machine learning models. Specifically, you will learn:


  • How to impute missing data

  • How to encode categorical variables

  • How to transform numerical variables and change their distribution

  • How to perform discretization

  • How to remove outliers

  • How to extract features from date and time

  • How to create new features from existing ones
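
The last two items in the list above can be sketched with pandas as follows. This is only an illustration under assumed data: the columns `order_time`, `total_paid`, and `n_items` are hypothetical and do not come from the course.

```python
import pandas as pd

# Toy data, invented for illustration only.
df = pd.DataFrame({
    "order_time": pd.to_datetime([
        "2022-06-03 08:15:00",
        "2022-06-04 23:40:00",
        "2022-12-25 12:00:00",
    ]),
    "total_paid": [120.0, 45.0, 300.0],
    "n_items": [4, 1, 6],
})

# Features from date and time, via the pandas .dt accessor.
df["month"] = df["order_time"].dt.month
df["day_of_week"] = df["order_time"].dt.dayofweek  # Monday=0 ... Sunday=6
df["hour"] = df["order_time"].dt.hour
df["is_weekend"] = df["day_of_week"] >= 5

# A new feature from existing ones: average price per item.
df["price_per_item"] = df["total_paid"] / df["n_items"]

print(df)
```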


Create useful Features with Math, Statistics and Domain Knowledge

Feature engineering is the process of transforming existing features or creating new variables for use in machine learning. Raw data is rarely suitable for training machine learning algorithms as is, so data scientists devote a lot of time to data preprocessing. This course teaches you everything you need to know to leave your data ready to train your models.


While most online courses will teach you the very basics of feature engineering, like imputing variables with the mean or transforming categorical variables using one hot encoding, this course will teach you that, and much, much more.


In this course, you will first learn the most popular and widely used techniques for variable engineering, like mean and median imputation, one-hot encoding, transformation with logarithm, and discretization. Then, you will discover more advanced methods that capture information while encoding or transforming your variables to improve the performance of machine learning models.
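
To give a feel for the basic techniques named above, here is a minimal sketch using pandas and NumPy. The data and column names (`age`, `income`, `city`) are made up for the example, and the choice of three equal-width bins is arbitrary; the course itself may present these steps differently.

```python
import numpy as np
import pandas as pd

# Toy data, invented for illustration only.
df = pd.DataFrame({
    "age": [22.0, 35.0, np.nan, 58.0, 41.0],
    "income": [1000.0, 2500.0, 4000.0, 100000.0, 3000.0],
    "city": ["London", "Paris", "London", "Madrid", "Paris"],
})

# Median imputation: replace missing ages with the median of the observed ones.
age_median = df["age"].median()
df["age_imputed"] = df["age"].fillna(age_median)

# One-hot encoding: one binary column per category.
city_dummies = pd.get_dummies(df["city"], prefix="city")

# Logarithm transformation: compress the long right tail of income.
df["income_log"] = np.log(df["income"])

# Equal-width discretization: split age into 3 intervals of equal width.
df["age_bin"] = pd.cut(df["age_imputed"], bins=3, labels=False)

print(df)
print(city_dummies)
```

In a real project the median and the bin edges would be learned on the training set only and then reused on new data.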


You will learn methods like the weight of evidence, used in finance, and how to create monotonic relationships between variables and targets to boost the performance of linear models. You will also learn how to create features from date and time variables and how to handle categorical variables with a lot of categories.
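
As an illustration of the weight of evidence idea, the sketch below encodes a categorical variable with the log ratio of its share of positives to its share of negatives. This is one common formulation, assumed here rather than quoted from the course (some references invert the ratio), and the `grade` and `default` columns are invented.

```python
import numpy as np
import pandas as pd

# Toy credit data, invented for illustration only (target 1 = default).
df = pd.DataFrame({
    "grade": ["A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C"],
    "default": [0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1],
})

# Count positives and negatives per category.
counts = df.groupby("grade")["default"].agg(pos="sum", total="count")
counts["neg"] = counts["total"] - counts["pos"]

# Share of all positives / all negatives that fall in each category.
pos_share = counts["pos"] / counts["pos"].sum()
neg_share = counts["neg"] / counts["neg"].sum()

# Weight of evidence: log of the ratio of the two shares.
# Undefined when a category has zero positives or zero negatives.
woe = np.log(pos_share / neg_share)

# Encode: replace each category with its weight of evidence.
df["grade_woe"] = df["grade"].map(woe)

print(woe)
```

In this toy data the encoded values increase from grade A to grade C along with the default rate, which is the kind of monotonic relationship with the target that the paragraph above refers to. In practice the mapping would be learned on the training set only.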


The methods that you will learn were described in scientific articles, are used in data science competitions, and are commonly utilized in organizations. And what’s more, they can be easily implemented by utilizing Python's open-source libraries!

Throughout the lectures, you’ll find detailed explanations of each technique and a discussion about their advantages, limitations, and underlying assumptions, followed by the best programming practices to implement them in Python.


By the end of the course, you will be able to decide which feature engineering technique you need based on the variable characteristics and the models you wish to train. And you will also be well placed to test various transformation methods and let your models decide which ones work best.
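
One way to "let your models decide" is to treat the transformation as a hyperparameter and compare the options by cross-validation. The sketch below does this for the imputation strategy with Scikit-learn; the synthetic data, the logistic regression model, and the ROC-AUC metric are assumptions made for the example.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic data, generated only for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
X[rng.random(X.shape) < 0.1] = np.nan  # knock out about 10% of the values

pipe = Pipeline([
    ("imputer", SimpleImputer()),
    ("model", LogisticRegression()),
])

# Treat the imputation strategy as a hyperparameter and compare
# the candidates by 5-fold cross-validation.
search = GridSearchCV(
    pipe,
    param_grid={"imputer__strategy": ["mean", "median", "most_frequent"]},
    cv=5,
    scoring="roc_auc",
)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```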


Step up your Career in Data Science

You’ve taken your first steps into data science. You know about the most commonly used prediction models. You've even trained a few linear regression or classification models. At this stage, you’re probably starting to find some challenges: your data is dirty, lots of values are missing, some variables are not numerical, and others are extremely skewed. You may also wonder whether your code is efficient and performant or if there is a better way to program. You search online, but you can’t find consolidated resources on feature engineering. Maybe just blogs? So you may start to wonder: how are things really done in tech companies?


In this course, you will find answers to those questions. Throughout the course, you will learn multiple techniques for the different aspects of variable transformation, and how to implement them in an elegant, efficient, and professional manner using Python. You will leverage the power of Python’s open source ecosystem, including the libraries NumPy, Pandas, Scikit-learn, and special packages for feature engineering: Feature-engine and Category encoders.


By the end of the course, you will be able to implement all your feature engineering steps into a single elegant pipeline, which will allow you to put your predictive models into production with maximum efficiency.
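
A single pipeline of this kind might look like the following Scikit-learn sketch. It is an assumed arrangement, not the course's own code: the columns `age` and `city`, the chosen steps, and the classifier are placeholders.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data, invented for illustration only.
X = pd.DataFrame({
    "age": [22.0, 35.0, np.nan, 58.0, 41.0, 29.0, np.nan, 63.0],
    "city": ["London", "Paris", "London", "Madrid",
             "Paris", "Madrid", "London", "Paris"],
})
y = np.array([0, 0, 0, 1, 1, 0, 0, 1])

# Numerical columns: median imputation, then standardization.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Route each group of columns to its own feature engineering steps.
preprocess = ColumnTransformer([
    ("num", numeric, ["age"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
])

# Feature engineering and model in one object.
model = Pipeline([
    ("preprocess", preprocess),
    ("classifier", LogisticRegression()),
])
model.fit(X, y)

# New data goes through exactly the same learned steps.
new_data = pd.DataFrame({"age": [np.nan], "city": ["Berlin"]})
prediction = model.predict(new_data)
print(prediction)
```

Because every parameter (the median, the scaling statistics, the known categories) is learned inside `fit`, the fitted object can be saved and applied to new data without repeating any step by hand.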


Leverage the Power of Open Source

We will perform all feature engineering methods utilizing Pandas and Numpy, and we will compare the implementation with Scikit-learn, Feature-engine, and Category encoders, highlighting the advantages and limitations of each library. As you progress in the course, you will be able to choose the library you like the most to carry out your projects.
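
To make the comparison concrete, here is the same median imputation written once with pandas and once with Scikit-learn. The data is invented, and only these two libraries are shown; Feature-engine and Category encoders are left out of this sketch.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Toy data, invented for illustration only.
train = pd.DataFrame({"age": [22.0, 35.0, np.nan, 58.0, 41.0]})
test = pd.DataFrame({"age": [np.nan, 30.0]})

# pandas: learn the median from the training set and store it yourself.
train_median = train["age"].median()
test_pandas = test["age"].fillna(train_median)

# Scikit-learn: the imputer learns the median in fit() and reuses it
# in transform().
imputer = SimpleImputer(strategy="median")
imputer.fit(train[["age"]])
test_sklearn = imputer.transform(test[["age"]])  # NumPy array by default

print(test_pandas.tolist())
print(test_sklearn.ravel().tolist())
```

The pandas version is shorter but leaves you to keep track of the learned value, while the Scikit-learn version stores it in the fitted object and by default returns a NumPy array instead of a DataFrame.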

There is a dedicated Python notebook with code to implement each feature engineering method, which you can reuse in your projects to speed up the development of your machine learning models.


The Most Comprehensive Online Course for Feature Engineering

There is no one single place to go to learn about feature engineering. It involves hours of searching on the web to find out what people are doing to get the most out of their data.


That is why this course gathers plenty of techniques used worldwide for feature transformation, learnt from Kaggle and KDD data competitions, scientific articles, and the instructor’s experience as a data scientist. This course therefore provides a source of reference where you can learn new methods and also revisit the techniques and code needed to modify variables whenever you need to.


This course is taught by a lead data scientist with experience in the use of machine learning in finance and insurance, who is also a book author and the lead developer of a Python open source library for feature engineering. And there is more:


  • The course is constantly updated to include new feature engineering methods.

  • Notebooks are regularly refreshed to ensure all methods are carried out with the latest releases of the Python libraries, so the code keeps working as those libraries evolve.

  • The course combines videos, presentations, and Jupyter notebooks to explain the methods and show their implementation in Python.

  • The curriculum was developed over a period of four years with continuous research in the field of feature engineering to bring you the latest technologies, tools, and trends.


Want to know more? Read on...

This comprehensive feature engineering course contains over 100 lectures spread across approximately 10 hours of video, and ALL topics include hands-on Python code examples that you can use for reference, practice, and reuse in your own projects.


REMEMBER, the course comes with a 30-day money-back guarantee, so you can sign up today with no risk.


So what are you waiting for? Enrol today and join the world's most comprehensive course on feature engineering for machine learning.

Who this course is for:

  • Data scientists who want to learn how to preprocess datasets in order to build machine learning models.
  • Data scientists who want to learn more techniques for feature engineering for machine learning.
  • Data scientists who want to improve their coding skills and programming practices for feature engineering.
  • Software engineers, mathematicians and academics switching careers into data science.
  • Data scientists interested in experimenting with various feature engineering techniques in data competitions.
  • Software engineers who want to learn how to use Scikit-learn and other open-source packages for feature engineering.

Featured review

Josep Maria Niubo Marti
30 courses
8 reviews
Rating: 5.0 out of 5, 3 years ago
It is an eye opener! This course tackles the task of feature engineering on a very exhaustive and precise way. It explores ways I ignored and certainly helped me broaden my feature engineering toolkit, and thus helped me obtain better ML models. Thank you for such a great course!

Instructor

Soledad Galli
Lead Data Scientist
  • 4.6 Instructor Rating
  • 9,289 Reviews
  • 41,386 Students
  • 7 Courses

Soledad Galli is a lead data scientist and founder of Train in Data. She has experience in finance and insurance, received a Data Science Leaders Award in 2018 and was selected “LinkedIn’s voice” in data science and analytics in 2019. Sole is passionate about sharing knowledge and helping others succeed in data science.

As a data scientist in finance and insurance companies, Sole researched, developed, and put into production machine learning models to assess credit risk and insurance claims and to prevent fraud, leading the adoption of machine learning in those organizations.

Sole is passionate about empowering people to step into and excel in data science. She mentors data scientists, writes articles online, speaks at data science meetings, and teaches online courses on machine learning.

Sole has recently created Train In Data, with the mission to facilitate and empower people and organizations worldwide to step into and excel in data science and analytics.

Sole has an MSc in Biology, a PhD in Biochemistry and 8+ years of experience as a research scientist in well-known institutions like University College London and the Max Planck Institute. She has scientific publications in various fields such as Cancer Research and Neuroscience, and her research was covered by the media on multiple occasions.

Soledad has 4+ years of experience as an instructor in Biochemistry at the University of Buenos Aires, has taught seminars and tutorials at University College London, and has mentored MSc and PhD students at different universities.

Feel free to contact her on LinkedIn.


