Udemy

Machine learning with Scikit-learn

Learn the most important machine learning techniques using the best machine learning library available
Rating: 4.0 out of 5 (86 ratings)
588 students
Created by Francisco Juretig
Last updated 3/2017
English
English [Auto]

What you'll learn

  • Load data into scikit-learn and run many machine learning algorithms on both supervised and unsupervised problems
  • Assess model accuracy and performance
  • Decide which model is best for each scenario
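
As a rough illustration of the last two goals, the sketch below compares two candidate classifiers by cross-validated accuracy. The synthetic data, the choice of models, and the names `candidates`, `scores` and `best_name` are assumptions made for this example; they are not taken from the course material.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

# Synthetic classification data, used only for illustration
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# Hypothetical set of candidate models to compare
candidates = {
    "naive_bayes": GaussianNB(),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

# Mean 5-fold cross-validated accuracy for each candidate
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in candidates.items()
}

# Pick the candidate with the highest mean accuracy
best_name = max(scores, key=scores.get)
print(scores, best_name)
```

Accuracy is only one possible criterion; other metrics can be selected through the `scoring` argument of `cross_val_score`.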

Requirements

  • Some Python and statistics knowledge is required. You need to be able to code loops, functions, and classes in Python, and to understand what random variables are, what a Gaussian distribution is, and the concepts underlying linear regression.

Description

This course will explain how to use scikit-learn to do advanced machine learning. If you are aiming to work as a professional data scientist, you need to master scikit-learn!

It is expected that you have some familiarity with statistics and Python programming. You don't need to be an expert, but you should understand what a Gaussian distribution is, be able to code loops and functions in Python, and know the basics of maximum likelihood estimation. The course focuses entirely on the Python implementation, and the math behind it is omitted as much as possible.

The objective of this course is to give you a good understanding of scikit-learn, so that you can identify which technique to use for a particular problem. If you follow this course, you should be able to handle a machine learning interview quite well, although for that you will need to study the math in more detail.

We'll start by explaining the machine learning problem, its methodology and its terminology, and the differences between AI, machine learning (ML), statistics, and data mining. Scikit-learn, being a Python library, benefits from Python's spectacular simplicity and power. We'll explain how to install scikit-learn and its dependencies, and then show how we can use Pandas data in scikit-learn and also benefit from SciPy and NumPy. We'll then show how to create synthetic datasets using scikit-learn, specifically tailored for regression, classification and clustering.
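
A minimal sketch of these two ideas follows, assuming scikit-learn and pandas are installed (for example with `pip install scikit-learn pandas`). It uses scikit-learn's `make_regression`, `make_classification` and `make_blobs` generators, and passes pandas data straight to an estimator. The column names and sample sizes are arbitrary choices for the example.

```python
import pandas as pd
from sklearn.datasets import make_blobs, make_classification, make_regression
from sklearn.linear_model import LinearRegression

# Synthetic datasets tailored to each kind of problem
X_reg, y_reg = make_regression(n_samples=100, n_features=3, noise=5.0, random_state=0)
X_clf, y_clf = make_classification(n_samples=100, n_features=4, random_state=0)
X_blob, y_blob = make_blobs(n_samples=100, centers=3, random_state=0)

# Wrap the regression data in a DataFrame (column names are illustrative)
feature_cols = ["x1", "x2", "x3"]
df = pd.DataFrame(X_reg, columns=feature_cols)
df["target"] = y_reg

# Estimators accept DataFrames and Series directly
model = LinearRegression().fit(df[feature_cols], df["target"])
r2 = model.score(df[feature_cols], df["target"])  # in-sample R^2
print(f"In-sample R^2: {r2:.3f}")
```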

In essence, machine learning can be divided into two big groups: supervised and unsupervised learning. In supervised learning we have a target variable (which can be continuous or categorical) and we want to use certain features to predict it. Scikit-learn provides estimators for both classification and regression problems. We will start by discussing one of the simplest classifiers, "Naive Bayes". We will then see some powerful regression techniques that, via a special trick called regularization, help us get much better linear estimators. We will then analyze Support Vector Machines, a powerful technique for both regression and classification, and use classification and regression trees to estimate very complex models. Finally, we will see how we can combine many existing estimators into "ensemble" methods, which are more robust in out-of-sample performance: in particular random forests, random trees, and boosting methods. These methods win many data science competitions nowadays.
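
Two of these ideas can be sketched briefly. The example below first fits ordinary least squares and ridge regression on the same small synthetic dataset to show how the L2 penalty shrinks the coefficients, and then compares a single decision tree with a random forest by cross-validation. The datasets, the value of `alpha` and the number of trees are assumptions chosen for illustration, not settings from the course.

```python
import numpy as np
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# --- Regularization: few samples relative to the number of features ---
X, y = make_regression(n_samples=40, n_features=20, noise=10.0, random_state=0)
ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)  # alpha sets the strength of the L2 penalty

# The penalty shrinks the coefficient vector towards zero
ols_norm = np.linalg.norm(ols.coef_)
ridge_norm = np.linalg.norm(ridge.coef_)
print(f"OLS norm: {ols_norm:.1f}, ridge norm: {ridge_norm:.1f}")

# --- Ensembles: one tree versus a forest of 100 trees ---
Xc, yc = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)
tree_score = cross_val_score(
    DecisionTreeClassifier(random_state=0), Xc, yc, cv=5
).mean()
forest_score = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=0), Xc, yc, cv=5
).mean()
print(f"Tree accuracy: {tree_score:.3f}, forest accuracy: {forest_score:.3f}")
```

Whether the forest actually outperforms the single tree depends on the data, which is why both scores are estimated out of sample rather than assumed.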

We will see how we can use all these techniques for online data, image classification, sales data, and more. We also use real datasets from Kaggle, such as spam SMS data and house prices in the United States, to show what to expect when working with real data.
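
To give a flavour of the spam SMS use case, here is a minimal sketch of a bag-of-words Naive Bayes text classifier. The six messages below are invented for this example and are not the Kaggle dataset; the pipeline layout is one common approach, not necessarily the one used in the course.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny invented training set (hypothetical, for illustration only)
messages = [
    "win a free prize now",
    "free cash claim now",
    "urgent call to claim your prize",
    "are we meeting for lunch today",
    "see you at home tonight",
    "can you call me after lunch",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# Word counts feed a multinomial Naive Bayes classifier
spam_filter = make_pipeline(CountVectorizer(), MultinomialNB())
spam_filter.fit(messages, labels)

predictions = spam_filter.predict(["claim your free prize", "see you for lunch"])
print(list(predictions))
```

A real dataset would need a train/test split and far more messages before the predictions mean anything.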

On the other hand, in unsupervised learning we have a set of features (but no outcome or target variable) and we attempt to learn from that data: whether it has outliers, whether it can be split into groups, whether we can remove some of the features, and so on. For example, we will see k-means, one of the simplest algorithms for grouping observations into clusters, and we will see that sometimes there are better techniques, such as DBSCAN. We will then explain how we can use principal components to reduce the dimensionality of a dataset. Finally, we will use some very powerful scikit-learn functions that learn the density of the data and are able to flag outliers.
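
The following sketch touches each of these unsupervised tasks on synthetic data. The parameter values (`eps`, `min_samples`, `n_neighbors`) are illustrative guesses, and `LocalOutlierFactor` is used here as one density-based outlier detector available in scikit-learn; the course may rely on a different estimator.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs, make_moons
from sklearn.decomposition import PCA
from sklearn.neighbors import LocalOutlierFactor

# Clustering: two interleaved half-moons, a shape k-means tends to split poorly
X_moons, _ = make_moons(n_samples=200, noise=0.05, random_state=0)
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_moons)
dbscan_labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(X_moons)  # -1 marks noise

# Dimensionality reduction: project 5 features onto 2 principal components
X_blobs, _ = make_blobs(n_samples=200, n_features=5, centers=3, random_state=0)
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_blobs)
print("Explained variance ratio:", pca.explained_variance_ratio_)

# Outlier detection: 100 Gaussian points plus one distant point appended last
rng = np.random.RandomState(0)
X_out = np.vstack([rng.normal(size=(100, 2)), [[8.0, 8.0]]])
lof_labels = LocalOutlierFactor(n_neighbors=20).fit_predict(X_out)  # -1 = outlier
print("Distant point flagged as outlier:", lof_labels[-1] == -1)
```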

I try to keep this course as up to date as possible, especially since scikit-learn is constantly being updated. For example, neural networks were added in the latest release. I have tried to keep the examples as simple as possible, keeping the number of observations (samples) and features (variables) as small as possible. In real situations we would use hundreds of features and thousands of samples, and most of the methods presented here scale really well to those scenarios. I don't want this course to focus on very realistic examples, because I think they obscure what we are trying to achieve in each example. Nevertheless, some more complex examples will be added as additional exercises.

  

Who this course is for:

  • Students with some analytics/data-science knowledge who want to model comfortably in scikit-learn
  • Experienced data scientists working in R/SAS/MATLAB, wanting to transition into ML with Python

Instructor

Francisco Juretig
Mr
  • 3.9 Instructor Rating
  • 445 Reviews
  • 24,070 Students
  • 9 Courses

I have 7+ years of industry experience as a statistical programmer. I am an expert in programming, statistics, data science, and statistical algorithms, with wide experience in many programming languages. I am a regular contributor to the R community, with 3 published packages, and an expert SAS programmer. I also contribute to scientific statistical journals; my latest publication is in the Journal of Statistical Software.
