Find online courses made by experts from around the world.
Take your courses with you and learn anywhere, anytime.
Learn and practice real-world skills and achieve your goals.
Note: This course is a subset of our 20+ hour course 'From 0 to 1: Machine Learning & Natural Language Processing' so please don't sign up for both:-)
In an age of decision fatigue and information overload, this course is a crisp yet thorough primer on 2 great ML techniques that help cut through the noise: decision trees and random forests.
Prerequisites: No prerequisites, knowledge of some undergraduate level mathematics would help but is not mandatory. Working knowledge of Python would be helpful if you want to run the source code that is provided.
Taught by a Stanford-educated, ex-Googler and an IIT, IIM - educated ex-Flipkart lead analyst. This team has decades of practical experience in quant trading, analytics and e-commerce.
Mail us about anything - anything! - and we will always reply :-)
Not for you? No problem.
30 day money back guarantee.
Learn on the go.
Desktop, iOS and Android.
Certificate of completion.
|Section 1: Decision Fatigue, And Decision Trees|
You, This Course, and Us!Preview
|What are Decision Trees and how are they useful? Decision Trees are a visual and intuitive way of predicting what the outcome will be given some inputs. They assign an order of importance to the input variables that helps you see clearly what really influences your outcome.|
Recursive Partitioning is the most common strategy for growing Decision Trees from a training set.
Learn what makes one attribute be higher up in a Decision Tree compared to others.
|We'll take a small detour into Information Theory to understand the concept of Information Gain. This concept forms the basis of how popular Decision Tree Learning algorithms work.|
ID3, C4.5, CART and CHAID are commonly used Decision Tree Learning algorithms. Learn what makes them different from each other. Pruning is a mechanism to avoid one of the risks inherent with Decision Trees ie overfitting.
Anaconda's iPython is a Python IDE. The best part about it is the ease with which one can install packages in iPython - 1 line is virtually always enough. Just say '!pip'
|Numpy arrays are pretty cool for performing mathematical computations on your data.|
|We continue with a basic tutorial on Numpy and Scipy|
|Build a decision tree to predict the survival of a passenger on the Titanic. This is a challenge posed by Kaggle (a competitive online data science community). We'll start off by exploring the data and transforming the data into feature vectors that can be fed to a Decision Tree Classifier.|
|We continue with the Kaggle challenge. Let's feed the training set to a Decision Tree Classifier and then parse the results.|
We'll use our Decision Tree Classifier to predict the results on Kaggle's test data set. Submit the results to Kaggle and see where you stand!
|Section 2: A Few Useful Things to Know About Overfitting|
Overfitting is one of the biggest problems with Machine Learning - it's a trap that's easy to fall into and important to be aware of.
Overfitting is a difficult problem to solve - there is no way to avoid it completely, by correcting for it, we fall into the opposite error of underfitting.
Cross Validation is a popular way to choose between models. There are a few different variants - K-Fold Cross validation is the most well known.
Overfitting occurs when the model becomes too complex. Regularization helps maintain the balance between accuracy and complexity of the model.
The crowd is indeed wiser than the individual - at least with ensemble learning. The Netflix competition showed that ensemble learning helps achieve tremendous improvements in accuracy - many learners perform better than just 1.
Bagging, Boosting and Stacking are different techniques to help build an ensemble that rocks!
|Section 3: Random Forests|
Decision trees are cool but painstaking to build - because they really tend to overfit. Random Forests to the rescue! Use an ensemble of decision trees - all the benefits of decision trees, few of the pains!
Machine learning is not a one-shot process. You'll need to iterate, test multiple models to see what works better. Let's use cross validation to compare the accuracy of different models - Decision trees vs Random Forests
Loonycorn is us, Janani Ravi, Vitthal Srinivasan, Swetha Kolalapudi and Navdeep Singh. Between the four of us, we have studied at Stanford, IIM Ahmedabad, the IITs and have spent years (decades, actually) working in tech, in the Bay Area, New York, Singapore and Bangalore.
Janani: 7 years at Google (New York, Singapore); Studied at Stanford; also worked at Flipkart and Microsoft
Vitthal: Also Google (Singapore) and studied at Stanford; Flipkart, Credit Suisse and INSEAD too
Swetha: Early Flipkart employee, IIM Ahmedabad and IIT Madras alum
Navdeep: longtime Flipkart employee too, and IIT Guwahati alum
We think we might have hit upon a neat way of teaching complicated tech courses in a funny, practical, engaging way, which is why we are so excited to be here on Udemy!
We hope you will try our offerings, and think you'll like them :-)