Byte-Sized-Chunks: Decision Trees and Random Forests

Cool machine learning techniques to predict survival probabilities aboard the Titanic - a Kaggle problem!
3.9 (33 ratings)
1,188 students enrolled
Created by Loony Corn
Last updated 3/2016
English
$10 (50% off the list price of $20)
3 days left at this price!
30-Day Money-Back Guarantee
Includes:
  • 4.5 hours on-demand video
  • 23 Supplemental Resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
Design and implement a solution to a famous problem in machine learning: predicting survival probabilities aboard the Titanic
Understand the perils of overfitting, and how random forests help overcome this risk
Identify the use-cases for Decision Trees as well as Random Forests
Requirements
  • No prerequisites: knowledge of some undergraduate-level mathematics would help but is not mandatory. A working knowledge of Python would be helpful if you want to perform the coding exercises and understand the provided source code
Description

Note: This course is a subset of our 20+ hour course 'From 0 to 1: Machine Learning & Natural Language Processing', so please don't sign up for both :-)

In an age of decision fatigue and information overload, this course is a crisp yet thorough primer on 2 great ML techniques that help cut through the noise: decision trees and random forests.

Prerequisites: none; knowledge of some undergraduate-level mathematics would help but is not mandatory. A working knowledge of Python would be helpful if you want to run the source code that is provided.

Taught by a Stanford-educated ex-Googler and an IIT- and IIM-educated ex-Flipkart lead analyst. This team has decades of practical experience in quant trading, analytics and e-commerce.

What's Covered:

  • Decision Trees are a visual and intuitive way of predicting what the outcome will be given some inputs. They assign an order of importance to the input variables that helps you see clearly what really influences your outcome.
  • Random Forests avoid overfitting: Decision trees are cool but painstaking to build - because they really tend to overfit. Random Forests to the rescue! Use an ensemble of decision trees - all the benefits of decision trees, few of the pains!
  • Python Activity: Surviving aboard the Titanic! Build a decision tree to predict the survival of a passenger on the Titanic. This is a challenge posed by Kaggle (a competitive online data science community). We'll start off by exploring the data and transforming the data into feature vectors that can be fed to a Decision Tree Classifier.

Mail us about anything - anything! - and we will always reply :-)

Who is the target audience?
  • Nope! Please don't enroll in this class if you have already enrolled in our 21-hour course 'From 0 to 1: Machine Learning and NLP in Python'
  • Yep! Analytics professionals, modelers, big data professionals who haven't had exposure to machine learning
  • Yep! Engineers who want to understand or learn machine learning and apply it to problems they are solving
  • Yep! Product managers who want to have intelligent conversations with data scientists and engineers about machine learning
  • Yep! Tech executives and investors who are interested in big data, machine learning or natural language processing
Curriculum For This Course
19 Lectures 04:34:46
Decision Fatigue, And Decision Trees
11 Lectures 02:30:59

What are Decision Trees and how are they useful? Decision Trees are a visual and intuitive way of predicting what the outcome will be given some inputs. They assign an order of importance to the input variables that helps you see clearly what really influences your outcome.
Preview 17:00

Recursive Partitioning is the most common strategy for growing Decision Trees from a training set.

Learn what makes one attribute sit higher up in a Decision Tree than others.

Growing the Tree - Decision Tree Learning
18:03
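
To make that concrete, here is a minimal sketch of growing a tree and reading off the attribute ordering - our own illustration using scikit-learn and its built-in iris data, not the lecture's code:

    # Sketch: grow a small decision tree and inspect how it partitioned the data.
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, export_text

    iris = load_iris()
    tree = DecisionTreeClassifier(max_depth=3, random_state=0)
    tree.fit(iris.data, iris.target)

    # The printed tree shows the recursive partitioning of the feature space;
    # feature_importances_ reflects which attributes ended up higher in the tree.
    print(export_text(tree, feature_names=iris.feature_names))
    print(dict(zip(iris.feature_names, tree.feature_importances_)))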

We'll take a small detour into Information Theory to understand the concept of Information Gain. This concept forms the basis of how popular Decision Tree Learning algorithms work.
Preview 18:51
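
If you'd like a preview in code, here is a tiny numpy sketch of entropy and information gain (our illustration, not the lecture's material):

    # Sketch: entropy H(S) = -sum(p * log2(p)) and the information gain of a split.
    import numpy as np

    def entropy(labels):
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    def information_gain(parent, left, right):
        # Gain = H(parent) minus the size-weighted entropy of the children.
        n = len(parent)
        return entropy(parent) - (len(left) / n) * entropy(left) \
                               - (len(right) / n) * entropy(right)

    labels = np.array([1, 1, 1, 0, 0, 0])
    print(information_gain(labels, labels[:3], labels[3:]))  # a pure split: gain = 1.0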

ID3, C4.5, CART and CHAID are commonly used Decision Tree Learning algorithms. Learn what makes them different from each other. Pruning is a mechanism to avoid one of the risks inherent in Decision Trees: overfitting.

Decision Tree Algorithms
07:50
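
As a taste of what pruning looks like in practice, here is a sketch using scikit-learn's CART implementation, which supports cost-complexity pruning via the ccp_alpha parameter (scikit-learn 0.22+; the lecture itself may use different tooling):

    # Sketch: an unpruned vs. a cost-complexity-pruned decision tree.
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    unpruned = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
    pruned = DecisionTreeClassifier(ccp_alpha=0.01, random_state=0).fit(X_train, y_train)

    # The pruned tree is much smaller and often generalizes better.
    print(unpruned.tree_.node_count, unpruned.score(X_test, y_test))
    print(pruned.tree_.node_count, pruned.score(X_test, y_test))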

Anaconda's IPython is an interactive Python environment. The best part about it is the ease with which one can install packages from within it - one line is virtually always enough. Just say '!pip'

Installing Python - Anaconda and Pip
09:00
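
For instance, from inside a notebook cell (the package name below is just an example):

    # One line is virtually always enough to install a package from a notebook:
    !pip install scikit-learn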

Numpy arrays are pretty cool for performing mathematical computations on your data.
Back to Basics : Numpy in Python
18:05
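
A flavor of what that looks like (our own minimal example):

    # Sketch: element-wise math on numpy arrays.
    import numpy as np

    a = np.array([1.0, 2.0, 3.0, 4.0])
    b = np.arange(4)             # array([0, 1, 2, 3])

    print(a + b)                 # element-wise addition
    print(a * 2)                 # broadcasting a scalar
    print(a.mean(), a.std())     # built-in statistics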

We continue with a basic tutorial on Numpy and Scipy.
Back to Basics : Numpy and Scipy in Python
14:19
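
Again as a quick illustration (ours, not the lecture's):

    # Sketch: descriptive statistics in one call with scipy.
    import numpy as np
    from scipy import stats

    data = np.array([2.1, 2.5, 2.8, 3.0, 3.4])
    print(stats.describe(data))  # n, min/max, mean, variance, skewness, kurtosis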

Build a decision tree to predict the survival of a passenger on the Titanic. This is a challenge posed by Kaggle (a competitive online data science community). We'll start off by exploring the data and transforming the data into feature vectors that can be fed to a Decision Tree Classifier.
Titanic : Decision Trees predict Survival (Kaggle) - I
19:21
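
Roughly what that first step looks like - a sketch assuming pandas and Kaggle's train.csv with its standard columns (Pclass, Sex, Age, Fare, Survived); the code in the lecture may differ:

    # Sketch: turn the Titanic training data into numeric feature vectors.
    import pandas as pd

    train = pd.read_csv('train.csv')                   # Kaggle's Titanic training set

    X = train[['Pclass', 'Sex', 'Age', 'Fare']].copy()
    X['Sex'] = X['Sex'].map({'male': 0, 'female': 1})  # encode the categorical column
    X['Age'] = X['Age'].fillna(X['Age'].median())      # fill in missing ages
    y = train['Survived']

    print(X.head())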

We continue with the Kaggle challenge. Let's feed the training set to a Decision Tree Classifier and then parse the results.
Titanic : Decision Trees predict Survival (Kaggle) - II
14:16
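
Continuing the sketch above (same assumptions), feeding those feature vectors to a classifier takes just a few lines:

    # Sketch: fit a Decision Tree Classifier on the Titanic feature vectors.
    import pandas as pd
    from sklearn.tree import DecisionTreeClassifier

    train = pd.read_csv('train.csv')
    X = train[['Pclass', 'Sex', 'Age', 'Fare']].copy()
    X['Sex'] = X['Sex'].map({'male': 0, 'female': 1})
    X['Age'] = X['Age'].fillna(X['Age'].median())
    y = train['Survived']

    clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)
    print(clf.score(X, y))                             # accuracy on the training set
    print(dict(zip(X.columns, clf.feature_importances_)))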

We'll use our Decision Tree Classifier to predict the results on Kaggle's test data set. Submit the results to Kaggle and see where you stand!

Titanic : Decision Trees predict Survival (Kaggle) - III
13:00
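
A sketch of that last step, assuming the clf fitted in the previous sketch, Kaggle's test.csv, and its PassengerId/Survived submission format:

    # Sketch: predict on Kaggle's test set and write a submission file.
    import pandas as pd

    test = pd.read_csv('test.csv')
    X_test = test[['Pclass', 'Sex', 'Age', 'Fare']].copy()
    X_test['Sex'] = X_test['Sex'].map({'male': 0, 'female': 1})
    X_test['Age'] = X_test['Age'].fillna(X_test['Age'].median())
    X_test['Fare'] = X_test['Fare'].fillna(X_test['Fare'].median())

    # clf is the Decision Tree Classifier fitted in the previous sketch.
    submission = pd.DataFrame({'PassengerId': test['PassengerId'],
                               'Survived': clf.predict(X_test)})
    submission.to_csv('submission.csv', index=False)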
A Few Useful Things to Know About Overfitting
6 Lectures 01:31:16

Overfitting is one of the biggest problems with Machine Learning - it's a trap that's easy to fall into and important to be aware of.

Overfitting - The Bane of Machine Learning
19:03

Overfitting is a difficult problem to solve - there is no way to avoid it completely, and in correcting for it we risk falling into the opposite error of underfitting.

Overfitting Continued
11:19
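
To see the tension concretely, compare training and test accuracy as a tree is allowed to grow deeper (an illustrative sketch on a stand-in scikit-learn dataset):

    # Sketch: deeper trees keep improving on the training set,
    # while test accuracy stalls or drops - that gap is overfitting.
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    for depth in (1, 2, 4, 8, None):                   # None = grow the tree fully
        clf = DecisionTreeClassifier(max_depth=depth, random_state=0)
        clf.fit(X_train, y_train)
        print(depth, clf.score(X_train, y_train), clf.score(X_test, y_test))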

Cross-Validation is a popular way to choose between models. There are a few different variants - K-Fold Cross-Validation is the best known.

Cross-Validation
18:55
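
In scikit-learn terms, K-Fold cross-validation is a couple of lines (a sketch on a built-in dataset):

    # Sketch: 5-fold cross-validation of a decision tree.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)
    print(scores, scores.mean())   # one accuracy per fold, then the average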

Overfitting occurs when the model becomes too complex. Regularization helps maintain the balance between accuracy and complexity of the model.

Simplicity is a virtue - Regularization
07:18
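
For tree models, the knobs that play the role of regularization are complexity limits such as min_samples_leaf and max_depth (a sketch, again on a stand-in dataset):

    # Sketch: regularizing a decision tree by capping its complexity.
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)
    for leaf in (1, 5, 20):        # bigger leaves = simpler, more regularized trees
        clf = DecisionTreeClassifier(min_samples_leaf=leaf, random_state=0)
        print(leaf, cross_val_score(clf, X, y, cv=5).mean())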

The crowd is indeed wiser than the individual - at least in ensemble learning. The Netflix competition showed that ensemble learning helps achieve tremendous improvements in accuracy - many learners perform better than just one.

The Wisdom Of Crowds - Ensemble Learning
16:39

Bagging, Boosting and Stacking are different techniques to help build an ensemble that rocks!

Ensemble Learning continued - Bagging, Boosting and Stacking
18:02
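
A sketch of bagging and boosting in scikit-learn (stacking is left out here for brevity):

    # Sketch: bagging and boosting ensembles built from decision trees.
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)

    bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, random_state=0)
    boosting = AdaBoostClassifier(n_estimators=50, random_state=0)  # boosts decision stumps

    print(cross_val_score(bagging, X, y, cv=5).mean())
    print(cross_val_score(boosting, X, y, cv=5).mean())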
Random Forests
2 Lectures 32:31

Decision trees are cool but painstaking to build - because they really tend to overfit. Random Forests to the rescue! Use an ensemble of decision trees - all the benefits of decision trees, few of the pains!

Random Forests - Much more than trees
12:28
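
In scikit-learn, swapping a single tree for a forest is essentially a one-line change (a sketch):

    # Sketch: a random forest is an ensemble of randomized decision trees.
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    X, y = load_breast_cancer(return_X_y=True)
    forest = RandomForestClassifier(n_estimators=100, random_state=0)
    print(cross_val_score(forest, X, y, cv=5).mean())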

Machine learning is not a one-shot process. You'll need to iterate and test multiple models to see what works best. Let's use cross-validation to compare the accuracy of different models: Decision Trees vs. Random Forests.

Back on the Titanic - Cross Validation and Random Forests
20:03
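
That head-to-head comparison is exactly what cross-validation makes easy - a sketch on a stand-in dataset (the lecture does the same thing on the Titanic data):

    # Sketch: cross-validated comparison of a single tree vs. a random forest.
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)
    for name, model in [('tree', DecisionTreeClassifier(random_state=0)),
                        ('forest', RandomForestClassifier(random_state=0))]:
        print(name, cross_val_score(model, X, y, cv=5).mean())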
About the Instructor
4.3 Average rating
2,972 Reviews
20,688 Students
61 Courses
A 4-person team; ex-Google; Stanford, IIM Ahmedabad, IIT

Loonycorn is us, Janani Ravi, Vitthal Srinivasan, Swetha Kolalapudi and Navdeep Singh. Between the four of us, we have studied at Stanford, IIM Ahmedabad, the IITs and have spent years (decades, actually) working in tech, in the Bay Area, New York, Singapore and Bangalore.

Janani: 7 years at Google (New York, Singapore); Studied at Stanford; also worked at Flipkart and Microsoft

Vitthal: Also Google (Singapore) and studied at Stanford; Flipkart, Credit Suisse and INSEAD too

Swetha: Early Flipkart employee, IIM Ahmedabad and IIT Madras alum

Navdeep: longtime Flipkart employee too, and IIT Guwahati alum

We think we might have hit upon a neat way of teaching complicated tech courses in a funny, practical, engaging way, which is why we are so excited to be here on Udemy!

We hope you will try our offerings, and think you'll like them :-)
