From 0 to 1: Machine Learning, NLP & Python-Cut to the Chase
4.2 (563 ratings)
5,728 students enrolled


A down-to-earth, shy but confident take on machine learning techniques that you can put to work today
Created by Loony Corn
Last updated 3/2017
English
Current price: $10 Original price: $50 Discount: 80% off
30-Day Money-Back Guarantee
Includes:
  • 20 hours on-demand video
  • 112 Supplemental Resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Identify situations that call for the use of Machine Learning
  • Understand which type of Machine learning problem you are solving and choose the appropriate solution
  • Use Machine Learning and Natural Language Processing to solve problems like text classification and text summarization in Python
Requirements
  • No prerequisites. Knowledge of some undergraduate-level mathematics would help but is not mandatory. Working knowledge of Python would be helpful if you want to run the source code that is provided.
Description

Prerequisites: No prerequisites. Knowledge of some undergraduate-level mathematics would help but is not mandatory. Working knowledge of Python would be helpful if you want to run the source code that is provided.

Taught by a Stanford-educated ex-Googler and an IIT- and IIM-educated ex-Flipkart lead analyst. This team has decades of practical experience in quant trading, analytics and e-commerce.

This course is a down-to-earth, shy but confident take on machine learning techniques that you can put to work today.

Let’s parse that.

The course is down-to-earth: it makes everything as simple as possible, but not simpler.

The course is shy but confident: it is authoritative, drawn from decades of practical experience, but shies away from needlessly complicating stuff.

You can put ML to work today : If Machine Learning is a car, this car will have you driving today. It won't tell you what the carburetor is.

The course is very visual: most of the techniques are explained with the help of animations to help you understand better.

This course is practical as well: there are hundreds of lines of source code with comments that can be used directly to implement natural language processing and machine learning for text summarization and text classification in Python.

The course is also quirky. The examples are irreverent. Lots of little touches: repetition, zooming out so we remember the big picture, active learning with plenty of quizzes. There’s also a peppy soundtrack, and art - all shown by studies to improve cognition and recall.

What's Covered:

Machine Learning:

Supervised/Unsupervised learning, Classification, Clustering, Association Detection, Anomaly Detection, Dimensionality Reduction, Regression.

Naive Bayes, K-nearest neighbours, Support Vector Machines, Artificial Neural Networks, K-means, Hierarchical clustering, Principal Components Analysis, Linear regression, Logistic regression, Random variables, Bayes theorem, Bias-variance tradeoff

Natural Language Processing with Python:

Corpora, stopwords, sentence and word parsing, auto-summarization, sentiment analysis (as a special case of classification), TF-IDF, Document Distance, Text summarization, Text classification with Naive Bayes and K-Nearest Neighbours and Clustering with K-Means

Sentiment Analysis: 

Why it's useful, Approaches to solving - Rule-Based , ML-Based , Training , Feature Extraction, Sentiment Lexicons, Regular Expressions, Twitter API, Sentiment Analysis of Tweets with Python

Mitigating Overfitting with Ensemble Learning:

Decision trees and decision tree learning, Overfitting in decision trees, Techniques to mitigate overfitting (cross validation, regularization), Ensemble learning and Random forests

Recommendations:  Content based filtering, Collaborative filtering and Association Rules learning

Get started with Deep learning: Apply Multi-layer perceptrons to the MNIST Digit recognition problem

A Note on Python: The code-alongs in this class all use Python 2.7. Source code (with copious amounts of comments) is attached as a resource with all the code-alongs. The source code has been provided for both Python 2 and Python 3 wherever possible.


Using discussion forums

Please use the discussion forums on this course to engage with other students and to help each other out. Unfortunately, much as we would like to, it is not possible for us at Loonycorn to respond to individual questions from students:-(

We're super small and self-funded with only 2-3 people developing technical video content. Our mission is to make high-quality courses available at super low prices.

The only way to keep our prices this low is to *NOT offer additional technical support over email or in-person*. The truth is, direct support is hugely expensive and just does not scale.

We understand that this is not ideal and that a lot of students might benefit from this additional support. Hiring resources for additional support would make our offering much more expensive, thus defeating our original purpose.

It is a hard trade-off.

Thank you for your patience and understanding!


Who is the target audience?
  • Yep! Analytics professionals, modelers, big data professionals who haven't had exposure to machine learning
  • Yep! Engineers who want to understand or learn machine learning and apply it to problems they are solving
  • Yep! Product managers who want to have intelligent conversations with data scientists and engineers about machine learning
  • Yep! Tech executives and investors who are interested in big data, machine learning or natural language processing
  • Yep! MBA graduates or business professionals who are looking to move to a heavily quantitative role
Curriculum For This Course
93 Lectures
19:50:05
Introduction
2 Lectures 06:36

We - the course instructors - start with introductions. We are a team that has studied at Stanford, IIT Madras, IIM Ahmedabad and spent several years working in top tech companies, including Google and Flipkart.

Next, we talk about the target audience for this course: Analytics professionals, modelers and big data professionals certainly, but also Engineers, Product managers, Tech Executives and Investors, or anyone who has some curiosity about machine learning.

If Machine Learning is a car, this class will teach you how to drive. By the end of this class, students will be able to spot situations where machine learning can be used and deploy the appropriate solutions. Product managers and executives will learn enough of the 'how' to be able to converse intelligently with their data science counterparts.


Preview 02:24

This course is both broad and deep. It covers several different types of machine learning problems, their solutions and shows you how to practically apply them using Python. 

A sneak peek at what's coming up
04:12
Jump right in : Machine learning for Spam detection
5 Lectures 42:19

There are different approaches to using computers to solve problems. We'll compare and contrast those approaches in this section.

Solving problems with computers
02:11

Machine learning is quite the buzzword these days. While it's been around for a long time, today its applications are wide and far-reaching - from computer science to social science, quant trading and even genetics. From the outside, it seems like a very abstract science that is heavy on the math and tough to visualize. But it is not at all rocket science. Machine learning is like any other science - if you approach it from first principles and visualize what is happening, you will find that it is not that hard. So let's get right into it: we'll take an example and see what machine learning is and why it is so useful.

Preview 07:28

Machine learning usually involves a lot of terms that sound really obscure. We'll see a real-life implementation of a machine learning algorithm (Naive Bayes), and by the end of it you should be able to speak some of the language of ML with confidence.

Plunging In - Machine Learning Approaches to Spam Detection
11:48
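As a taste of the idea behind this lecture - purely an illustrative sketch on invented toy messages, not the course's attached source code - a word-counting Naive Bayes spam classifier can be built from scratch in a few lines:

```python
from collections import Counter
import math

# Toy training data: (words, label) pairs. The messages are invented for
# illustration; a real filter would train on thousands of emails.
train = [
    ("win money now".split(), "spam"),
    ("free money offer".split(), "spam"),
    ("meeting at noon".split(), "ham"),
    ("lunch at noon tomorrow".split(), "ham"),
]

word_counts = {"spam": Counter(), "ham": Counter()}
class_counts = Counter()
for words, label in train:
    class_counts[label] += 1
    word_counts[label].update(words)

vocab = {w for words, _ in train for w in words}

def log_posterior(words, label):
    # log P(label) plus the sum of log P(word | label), with Laplace
    # smoothing so unseen words don't zero out the probability.
    total = sum(word_counts[label].values())
    logp = math.log(class_counts[label] / sum(class_counts.values()))
    for w in words:
        logp += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
    return logp

def classify(message):
    words = message.split()
    return max(("spam", "ham"), key=lambda c: log_posterior(words, c))

print(classify("free money"))     # prints: spam
print(classify("lunch at noon"))  # prints: ham
```

The "naive" part is the assumption that words are independent given the class - wrong in general, but it works surprisingly well for spam.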

We have gotten our feet wet and seen the implementation of one ML solution to spam detection - let's venture a little further and see some other ways to solve the same problem. We'll see how K-Nearest Neighbors and Support Vector machines can be used to solve spam detection.

Spam Detection with Machine Learning Continued
11:07

So far we have been slowly getting comfortable with machine learning - we took one example and saw a few different approaches. That was just the tip of the iceberg - this class is an aerial maneuver: we will scout ahead and see the different classes of problems that machine learning can solve and that we will cover in this class.

Get the Lay of the Land : Types of Machine Learning Problems
09:45
Solving Classification Problems
10 Lectures 01:42:58

We've described how to identify classification problems. This section covers some of the most popular classification algorithms such as the Naive Bayes classifier, K-Nearest neighbors, Support Vector machines and Artificial Neural Networks

Solving Classification Problems
00:59

Many popular machine learning techniques are probabilistic in nature and having some working knowledge helps. We'll cover random variables, probability distributions and the normal distribution.
Random Variables
11:27

We have been learning some fundamentals that will help us with probabilistic concepts in Machine Learning. In this class, we will learn about conditional probability and Bayes theorem, which is the foundation of many ML techniques.
Bayes Theorem
11:55
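As a quick worked example of the kind of calculation this lecture covers, here is Bayes theorem applied to a made-up spam-filtering scenario (all numbers invented for illustration):

```python
# Invented numbers: 20% of mail is spam; the word "offer" appears in 40% of
# spam messages but only 5% of legitimate ones.
p_spam = 0.20
p_word_given_spam = 0.40
p_word_given_ham = 0.05

# Law of total probability: overall chance of seeing "offer" in a message.
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

# Bayes theorem: P(spam | "offer") = P("offer" | spam) * P(spam) / P("offer")
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(round(p_spam_given_word, 3))  # prints: 0.667
```

Even though most mail is legitimate, seeing "offer" flips the odds to two-to-one in favour of spam.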

Naive Bayes Classifier is a probabilistic classifier. We have built the foundation to understand what goes on under the hood - let's understand how the Naive Bayes classifier uses the Bayes theorem
Naive Bayes Classifier
05:26

We will see how the Naive Bayes classifier can be used with an example.
Preview 09:18

Let's understand the k-Nearest Neighbors setup with a visual representation of how the algorithm works.
K-Nearest Neighbors
13:09
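The algorithm this lecture visualizes is simple enough to sketch from scratch - here on invented 2-D points, not the course's own data:

```python
import math
from collections import Counter

# Invented 2-D training points with class labels.
train = [((1.0, 1.0), "a"), ((1.5, 2.0), "a"),
         ((5.0, 5.0), "b"), ((6.0, 5.5), "b"), ((5.5, 6.5), "b")]

def knn_predict(point, k=3):
    # Sort training points by Euclidean distance to the query point,
    # then take a majority vote among the k nearest labels.
    nearest = sorted(train, key=lambda item: math.dist(point, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(knn_predict((1.2, 1.5)))  # prints: a
print(knn_predict((5.8, 5.8)))  # prints: b
```

There is no training step at all - all the work happens at query time, which is both KNN's charm and its main cost.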

There are a few wrinkles in k-Nearest Neighbors. These are just things to keep in mind if and when you decide to implement it.
K-Nearest Neighbors : A few wrinkles
14:47

We have been talking about different classifier algorithms. We'll learn about Support Vector Machines which are linear classifiers.

Support Vector Machines Introduced
08:16

The Support Vector Machines algorithm can be framed as an optimization problem. The kernel trick can be used along with SVM to perform non-linear classification.
Support Vector Machines : Maximum Margin Hyperplane and Kernel Trick
16:23
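The essence of the kernel trick can be seen without any SVM machinery at all: map the data into a higher-dimensional space where a straight line does the job. A tiny invented example (a real kernel does this lift implicitly, without ever computing the mapped coordinates):

```python
# A 1-D data set that no single threshold can separate: the "inner" class
# sits between the two "outer" points. (Data invented for illustration.)
inner = [-1.0, 1.0]   # class +1
outer = [-3.0, 3.0]   # class -1

# Feature map phi(x) = (x, x^2). In the mapped space, the x^2 coordinate
# alone separates the classes with a straight line.
def phi(x):
    return (x, x * x)

threshold = 4.0  # the separating line x2 = 4 in the mapped space
for x in inner + outer:
    label = +1 if phi(x)[1] < threshold else -1
    print(x, label)
```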

Artificial Neural Networks are much misunderstood because of the name. We will see the Perceptron (a prototypical example of ANNs) and how it is analogous to Support Vector Machine

Artificial Neural Networks:Perceptrons Introduced
11:18
Clustering as a form of Unsupervised learning
2 Lectures 32:49
Clustering helps us understand the patterns in a large set of data that we don't know much about. It is a form of unsupervised learning.
Preview 19:07

K-Means and DBSCAN are 2 very popular clustering algorithms. How do they work and what are the key considerations?
Clustering : K-Means and DBSCAN
13:42
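K-Means is really just two steps repeated: assign points to the nearest centroid, then move each centroid to the mean of its points. A from-scratch sketch on two invented blobs (not the course's code):

```python
import math

# Two obvious blobs of 2-D points (invented data).
points = [(1, 1), (1.5, 2), (2, 1.5), (8, 8), (8.5, 9), (9, 8.5)]

def kmeans(points, centroids, steps=10):
    clusters = [[] for _ in centroids]
    for _ in range(steps):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (keeping the old centroid if a cluster ends up empty).
        centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
print(sorted(len(c) for c in clusters))  # prints: [3, 3]
```

DBSCAN takes a different tack - it grows clusters from dense neighbourhoods instead of fixing the number of clusters up front.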
Association Detection
1 Lecture 09:12
It is all about finding relationships in the data - sometimes there are relationships that you would not intuitively expect to find. It is pretty powerful - so let's take a peek at what it does.
Association Rules Learning
09:12
Dimensionality Reduction
2 Lectures 29:15

Data that you are working with can be noisy or garbled or difficult to make sense of. It can be so complicated that it's difficult to process efficiently. Dimensionality reduction to the rescue - it cleans up the noise and shows you a clear picture. Getting rid of unnecessary features makes the computation simpler.

Dimensionality Reduction
10:22

PCA is one of the most famous Dimensionality Reduction techniques. When you have data with a lot of variables and confusing interactions, PCA clears the air and finds the underlying causes.
Principal Component Analysis
18:53
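Under the hood, PCA is an eigendecomposition of the covariance matrix. A minimal sketch (assuming NumPy is available; the data is synthetic, generated so that nearly all the variance lies along one direction):

```python
import numpy as np

# Toy data: 2-D points that vary almost entirely along the y = x direction.
rng = np.random.default_rng(0)
t = rng.normal(size=(100, 1))
data = np.hstack([t, t + 0.05 * rng.normal(size=(100, 1))])

# PCA by hand: centre the data, then take the eigenvectors of the
# covariance matrix. The eigenvector with the largest eigenvalue is the
# first principal component.
centered = data - data.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order

# One component explains nearly all the variance in this data set.
explained = eigvals[-1] / eigvals.sum()
print(explained > 0.99)  # prints: True
```

Dropping the second coordinate here loses almost nothing - that is dimensionality reduction in miniature.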
Regression as a form of supervised learning
2 Lectures 24:07
Regression can be used to predict the value of a variable, given some predictor variables. We'll see an example to understand its use and cover two popular methods : Linear and Logistic regression
Regression Introduced : Linear and Logistic Regression
13:54
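For simple linear regression, the least-squares fit has a closed form that you can compute by hand. A from-first-principles sketch on invented data (not the course's code):

```python
# Fit y = slope * x + intercept by least squares, from first principles.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]  # roughly y = 2x (invented data)

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Slope = covariance(x, y) / variance(x); the intercept makes the fitted
# line pass through the point of means.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

print(round(slope, 2), round(intercept, 2))  # close to 2 and 0
```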

In this class, we will talk about some trade-offs which we have to be aware of when we choose our training data and model.
Preview 10:13
Natural Language Processing and Python
18 Lectures 03:41:42

This section will help you put all your hard earned knowledge to practice! Here's a quick overview of what's coming up. 

Applying ML to Natural Language Processing
00:56

Anaconda's IPython is a Python IDE. The best part about it is the ease with which one can install packages in IPython - one line is virtually always enough. Just say '!pip'

Installing Python - Anaconda and Pip
09:00

Natural Language Processing is a serious application for all the Machine Learning techniques we have been using. Let's get our feet wet by understanding a few of the common NLP problems and tasks. We'll get familiar with NLTK - an awesome Python toolkit for NLP

Preview 07:26

We'll continue exploring NLTK and all the cool functionality it brings out of the box - tokenization, Parts-of-Speech tagging, stemming, stopwords removal etc

Natural Language Processing with NLTK - See it in action
14:14

Web Scraping is an integral part of NLP - it's how you prepare the text data that you will actually process. Web Scraping can be a headache - but Beautiful Soup makes it elegant and intuitive.

Web Scraping with BeautifulSoup
18:09

Auto-summarize newspaper articles from a website (Washington Post). We'll use NLP techniques to remove stopwords, tokenize text and sentences and compute term frequencies. The Python source code (with many comments) is attached as a resource.
A Serious NLP Application : Text Auto Summarization using Python
11:34
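The course's own summarizer uses NLTK; purely to show the spirit of the technique, here is a minimal frequency-based summarizer using only the standard library (text and stopword list invented for illustration):

```python
import re
from collections import Counter

# A tiny invented stopword list; NLTK ships a much better one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it",
             "that", "was", "from"}

text = ("Machine learning is changing software. "
        "Machine learning systems learn patterns from data. "
        "The weather was pleasant yesterday. "
        "Data and patterns drive modern machine learning applications.")

# Split into sentences, then count non-stopword frequencies over the text.
sentences = re.split(r"(?<=[.!?])\s+", text.strip())
words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
freqs = Counter(words)

def score(sentence):
    # A sentence's importance = total frequency of its non-stopwords.
    return sum(freqs[w] for w in re.findall(r"[a-z]+", sentence.lower())
               if w not in STOPWORDS)

# The "summary" is the highest-scoring sentence; the off-topic weather
# sentence scores lowest and drops out.
summary = max(sentences, key=score)
print(summary)
```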

Code along with us in Python - we'll use NLTK to compute the frequencies of words in an article.

Python Drill : Autosummarize News Articles I
18:33

Code along with us in Python - we'll use NLTK to compute the frequencies of words in an article and the importance of sentences in an article.

Python Drill : Autosummarize News Articles II
11:28

Code along with us in Python - we'll use Beautiful Soup to parse an article downloaded from the Washington Post and then summarize it using the class we set up earlier.
Python Drill : Autosummarize News Articles III
10:23

Classify newspaper articles into tech and non-tech. We'll see how to scrape websites to build a corpus of articles. Use NLP techniques to do feature extraction and selection. Finally, apply the K-Nearest Neighbours algorithm to classify a test instance as Tech/NonTech. The Python source code (with many comments) is attached as a resource.

Put it to work : News Article Classification using K-Nearest Neighbors
19:29

Classify newspaper articles into tech and non-tech. We'll see how to scrape websites to build a corpus of articles. Use NLP techniques to do feature extraction and selection. Finally, apply the Naive Bayes Classification algorithm to classify a test instance as Tech/NonTech. The Python source code (with many comments) is attached as a resource.

Put it to work : News Article Classification using Naive Bayes Classifier
19:24

Code along with us in Python - we'll use BeautifulSoup to build a corpus of news articles
Python Drill : Scraping News Websites
15:45

Code along with us in Python - we'll use NLTK to extract features from articles.

Python Drill : Feature Extraction with NLTK
18:51

Code along with us in Python - we'll use KNN algorithm to classify articles into Tech/NonTech
Python Drill : Classification with KNN
04:15

Code along with us in Python - we'll use a Naive Bayes Classifier to classify articles into Tech/Non-Tech
Python Drill : Classification with Naive Bayes
08:08

See how search engines compute the similarity between documents. We'll represent a document as a vector, weight it with TF-IDF and see how cosine similarity or euclidean distance can be used to compute the distance between two documents.
Document Distance using TF-IDF
11:03
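The TF-IDF weighting and cosine similarity described above fit in a few lines of plain Python - a sketch on three invented mini-documents, not the course's code:

```python
import math
from collections import Counter

# Three tiny "documents" (invented). The first two share vocabulary; the
# third is about something else entirely.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "stock markets fell sharply today".split(),
]

def tf_idf(doc, docs):
    # TF = term count in this document;
    # IDF = log(N / number of documents containing the term).
    n = len(docs)
    counts = Counter(doc)
    return {w: c * math.log(n / sum(w in d for d in docs))
            for w, c in counts.items()}

def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(u[w] * v.get(w, 0.0) for w in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

vecs = [tf_idf(d, docs) for d in docs]
# The two pet sentences are closer to each other than to the finance one.
print(cosine(vecs[0], vecs[1]) > cosine(vecs[0], vecs[2]))  # prints: True
```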

Create clusters of similar articles within a large corpus of articles. We'll scrape a blog to download all the blog posts, use TF-IDF to represent them as vectors. Finally, we'll perform K-Means clustering to identify 5 clusters of articles. The Python source code (with many comments) is attached as a resource.

Put it to work : News Article Clustering with K-Means and TF-IDF
14:32

Code along with us in Python - We'll cluster articles downloaded from a blog using the KMeans algorithm.
Python Drill : Clustering with K Means
08:32
Sentiment Analysis
10 Lectures 02:32:05

Lots of new stuff coming up in the next few classes. Sentiment Analysis (or Opinion Mining) is a field of NLP that deals with extracting subjective information (positive/negative, like/dislike, emotions). Learn why it's useful and how to approach the problem. There are Rule-Based and ML-Based approaches. The details are really important - training data and feature extraction are critical. Sentiment Lexicons provide us with lists of words in different sentiment categories that we can use for building our feature set. All this is in the run-up to a serious project to perform Twitter Sentiment Analysis. We'll spend some time on Regular Expressions, which are pretty handy to know as we'll see in our code-along.

Solve Sentiment Analysis using Machine Learning
02:36

As people spend more and more time on the internet, and the influence of social media explodes, knowing what your customers are saying about you online becomes crucial. Sentiment Analysis comes in handy here - this is an NLP problem that can be approached in multiple ways. We examine a couple of rule-based approaches, one of which has become standard fare (VADER).

Sentiment Analysis - What's all the fuss about?
17:17

SVM and Naive Bayes are popular ML approaches to Sentiment Analysis. But the devil really is in the details. What do you use for training data? What features should you use? Getting these right is critical.

ML Solutions for Sentiment Analysis - the devil is in the details
19:57

Sentiment lexicons are a great help in solving problems where the subjectivity/emotion expressed by a word is important. SentiWordNet is different even among the popular sentiment lexicons (General Inquirer, LIWC, MPQA etc), all of which are touched upon.

Sentiment Lexicons ( with an introduction to WordNet and SentiWordNet)
18:49

Regular expressions are a handy tool to have when you deal with text processing. They are a bit arcane, but pretty useful in the right situation. Understanding the operators from the basics helps you build up to constructing complex regexps.

Regular Expressions
17:53

re is the module in Python for dealing with regular expressions. It has functions to find a pattern, substitute a pattern etc. within a string.
Regular Expressions in Python
05:41
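As a taste of the module in action, here is the kind of tweet clean-up the later code-alongs do (the tweet is invented for illustration):

```python
import re

# An invented example tweet with the usual noise.
tweet = "Loving this course!! Check http://example.com @janani #ML :)"

cleaned = re.sub(r"http\S+", "", tweet)         # drop URLs
cleaned = re.sub(r"@\w+", "", cleaned)          # drop @mentions
cleaned = re.sub(r"[^a-zA-Z#\s]", "", cleaned)  # keep letters and hashtags
words = cleaned.split()
print(words)  # prints: ['Loving', 'this', 'course', 'Check', '#ML']
```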

A serious project - Accept a search term from a user and output the prevailing sentiment on Twitter for that search term. We'll use the Twitter API, Sentiwordnet, SVM, NLTK, Regular Expressions - really work that coding muscle :)

Put it to work : Twitter Sentiment Analysis
17:48

We'll accept a search term from a user and download 100 tweets with that term. You'll need a corpus to train a classifier which can classify these tweets. The corpus has only tweet_ids, so connect to the Twitter API and fetch the text for the tweets.

Twitter Sentiment Analysis - Work the API
20:00

The tweets that we downloaded contain a lot of garbage. We'll clean them up using regular expressions and NLTK to get a nice list of words representing each tweet.

Preview 12:24

We'll train two different classifiers on our training data: Naive Bayes and SVM. The SVM will use Sentiwordnet to assign weights to the elements of the feature vector.

Twitter Sentiment Analysis - Naive Bayes, SVM and Sentiwordnet
19:40
Decision Trees
8 Lectures 01:49:21

Tree based models are very useful to solve a variety of classification problems. The next few sections will introduce you to decision trees, problems inherent to tree learning such as overfitting and how to use ensemble learning techniques to solve these problems. 

Using Tree Based Models for Classification
01:00

What are Decision Trees and how are they useful? Decision Trees are a visual and intuitive way of predicting what the outcome will be given some inputs. They assign an order of importance to the input variables that helps you see clearly what really influences your outcome.

Preview 17:00

Recursive Partitioning is the most common strategy for growing Decision Trees from a training set.

Learn what makes one attribute be higher up in a Decision Tree compared to others.

Growing the Tree - Decision Tree Learning
18:03

We'll take a small detour into Information Theory to understand the concept of Information Gain. This concept forms the basis of how popular Decision Tree Learning algorithms work.

Branching out - Information Gain
18:51
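The information gain calculation this lecture builds up to is short enough to work through by hand - a sketch on an invented 10-example split, not the course's code:

```python
import math
from collections import Counter

def entropy(labels):
    # H = -sum(p_i * log2(p_i)) over the class proportions.
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A toy split: 10 examples, and an attribute that divides them into two
# branches that are each fairly pure. (Labels invented for illustration.)
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"]
right = ["yes"] + ["no"] * 4

# Information gain = parent entropy minus the weighted child entropies.
weighted_children = ((len(left) / len(parent)) * entropy(left)
                     + (len(right) / len(parent)) * entropy(right))
gain = entropy(parent) - weighted_children
print(round(gain, 3))  # prints: 0.278
```

Decision tree learners pick, at each node, the attribute whose split gives the largest such gain.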

ID3, C4.5, CART and CHAID are commonly used Decision Tree Learning algorithms. Learn what makes them different from each other. Pruning is a mechanism to avoid one of the risks inherent in Decision Trees, i.e. overfitting.

Decision Tree Algorithms
07:50

Build a decision tree to predict the survival of a passenger on the Titanic. This is a challenge posed by Kaggle (a competitive online data science community). We'll start off by exploring the data and transforming the data into feature vectors that can be fed to a Decision Tree Classifier.

Titanic : Decision Trees predict Survival (Kaggle) - I
19:21

We continue with the Kaggle challenge. Let's feed the training set to a Decision Tree Classifier and then parse the results.

Titanic : Decision Trees predict Survival (Kaggle) - II
14:16

We'll use our Decision Tree Classifier to predict the results on Kaggle's test data set. Submit the results to Kaggle and see where you stand!

Titanic : Decision Trees predict Survival (Kaggle) - III
13:00
6 More Sections
About the Instructor
Loony Corn
4.3 Average rating
4,595 Reviews
36,740 Students
75 Courses
An ex-Google, Stanford and Flipkart team

Loonycorn is us, Janani Ravi and Vitthal Srinivasan. Between us, we have studied at Stanford, been admitted to IIM Ahmedabad and have spent years working in tech, in the Bay Area, New York, Singapore and Bangalore.

Janani: 7 years at Google (New York, Singapore); Studied at Stanford; also worked at Flipkart and Microsoft

Vitthal: Also Google (Singapore) and studied at Stanford; Flipkart, Credit Suisse and INSEAD too

We think we might have hit upon a neat way of teaching complicated tech courses in a funny, practical, engaging way, which is why we are so excited to be here on Udemy!

We hope you will try our offerings, and think you'll like them :-)