All-in-One:Machine Learning,DL,NLP,AWS Deply [Hindi][Python]
4.2 (245 ratings)
20,514 students enrolled

Complete hands-on Machine Learning Course with Data Science, NLP, Deep Learning and Artificial Intelligence
Created by Rishi Bansal
Last updated 8/2020
This course includes
  • 17.5 hours on-demand video
  • 1 article
  • 59 downloadable resources
  • Full lifetime access
  • Access on mobile and TV
  • Assignments
  • Certificate of Completion
What you'll learn
  • Master building Machine Learning models in Python.
  • Visualize ML models wherever possible to build a better understanding of them.
  • How to analyse, clean and prepare data (Data Preprocessing techniques) before feeding it into Machine Learning models.
  • Learn the basic mathematics behind Simple Linear Regression and its best-fit line.
  • What Gradient Descent is and how it works internally, with a full mathematical explanation.
  • Make predictions using Simple Linear Regression and Multiple Linear Regression.
  • Deploy your own model on AWS using Flask so that anyone can access it and get predictions.
  • Make predictions using Logistic Regression, K-Nearest Neighbours and Naive Bayes.
  • Fundamental concepts of Deep Learning and Natural Language Processing, with Python code included in places for explanation.
  • Regularisation and the idea behind it. See it in action using Lasso and Ridge Regression.
  • No prerequisite for the Machine Learning concepts. Anyone can take this course.
  • Prior understanding of Python is required.

This course is designed to cover as many Machine Learning concepts as possible. Anyone can opt for this course. No prior understanding of Machine Learning is required.

As a bonus, an introduction to Natural Language Processing and Deep Learning is included.

The topics below are covered:

Chapter - Introduction to Machine Learning

- Machine Learning?

- Types of Machine Learning

Chapter - Setup Environment

- Installing Anaconda, how to use Spyder and Jupyter Notebook

- Installing Libraries

Chapter - Creating Environment on cloud (AWS)

- Creating EC2, connecting to EC2

- Installing libraries, transferring files to EC2 instance, executing python scripts

Chapter - Data Preprocessing

- Null Values

- Correlated Feature check

- Data Molding

- Imputing

- Scaling

- Label Encoder

- One-Hot Encoder

Chapter - Supervised Learning: Regression

- Simple Linear Regression

- Minimizing Cost Function - Ordinary Least Square(OLS), Gradient Descent

- Assumptions of Linear Regression, Dummy Variable

- Multiple Linear Regression

- Regression Model Performance - R-Square

- Polynomial Linear Regression

Chapter - Supervised Learning: Classification

- Logistic Regression

- K-Nearest Neighbours

- Naive Bayes

- Saving and Loading ML Models

- Classification Model Performance - Confusion Matrix

Chapter: UnSupervised Learning: Clustering

- Partitioning Algorithm: K-Means Algorithm, Random Initialization Trap, Elbow Method

- Hierarchical Clustering: Agglomerative, Dendrogram

- Density Based Clustering: DBSCAN

- Measuring UnSupervised Clusters Performance - Silhouette Index

Chapter: UnSupervised Learning: Association Rule

- Apriori Algorithm

- Association Rule Mining

Chapter: Deploy Machine Learning Model using Flask

- Understanding the flow

- Serverside and Clientside coding, Setup Flask on AWS, sending request and getting response back from flask server

Chapter: Non-Linear Supervised Algorithm: Decision Tree and Support Vector Machines

- Decision Tree Regression

- Decision Tree Classification

- Support Vector Machines(SVM) - Classification

- Kernel SVM, Soft Margin, Kernel Trick

Chapter - Natural Language Processing

Below Text Preprocessing Techniques with python Code

- Tokenization, Stop Words Removal, N-Grams, Stemming, Word Sense Disambiguation

- Count Vectorizer, Tfidf Vectorizer. Hashing Vector

- Case Study - Spam Filter

Chapter - Deep Learning

- Artificial Neural Networks, Hidden Layer, Activation function

- Forward and Backward Propagation

- Implementing Gate in python using perceptron

Chapter: Regularization, Lasso Regression, Ridge Regression

- Overfitting, Underfitting

- Bias, Variance

- Regularization

- L1 & L2 Loss Function

- Lasso and Ridge Regression

Chapter: Dimensionality Reduction

- Feature Selection - Forward and Backward

- Feature Extraction - PCA, LDA

Chapter: Ensemble Methods: Bagging and Boosting

- Bagging - Random Forest (Regression and Classification)

- Boosting - Gradient Boosting (Regression and Classification)

Who this course is for:
  • Anyone who wants to learn Machine Learning, Deep Learning and Natural Language Processing but doesn't know where to start can opt for this course.
  • It provides a good foundation for understanding the concepts of Machine Learning.
Course content
177 lectures 17:34:05
+ Introduction to Machine Learning
4 lectures 28:03

Full course material can be downloaded from GitHub:

Preview 09:38

Supervised - labeled data is used to help the machine recognize characteristics and apply them to future data. E.g.: classifying pictures of cats and dogs.

Unsupervised - we simply supply unlabeled data and let the machine discover its characteristics and group it. E.g.: clustering news articles.

Reinforcement Learning - the model interacts with the environment by producing actions and then learns from the resulting errors or rewards. E.g.: a chess game.

Preview 06:46

Regression: This is a type of problem where we need to predict a continuous response value (i.e. a number that can range from -infinity to +infinity).

E.g: House Price, Value of stock

Classification: This is a type of problem where we predict the categorical response value where the data can be separated into specific “classes” (ex: we predict one of the values in a set of values).

E.g: Mail spam or not, Diabetes or not, etc

Supervised Learning

Test your understanding about Regression and Classification Problems

Quiz 1
2 questions
+ Optional: Setup Environment
4 lectures 21:57

Anaconda is a distribution of Python that includes a selection of libraries and other useful tools. It is not an IDE itself, but it does include Jupyter Notebook and the Spyder IDE.

Installing Anaconda
How to Use Spyder Notebook
How to use Jupyter Notebook
Installing Library
+ Optional: Setup Environment on cloud (AWS)
5 lectures 25:53
Why AWS?
Creating EC2 instance
Connect to EC2 instance

#sudo yum install python3

#sudo pip3 install pandas

#sudo pip3 install scikit-learn

Installing Packages
Transferring Files to AWS EC2 instance
+ Data Preprocessing
9 lectures 55:07

•Preprocessing refers to the transformations applied to data before feeding it to a machine learning model

•The quality of the data is important for training the model

•Sources – government databases, professional or company data sources (e.g. Twitter), your own company, etc.

•Data will never be in the format you need – use a Pandas DataFrame for reformatting

•Columns to remove – columns with no values and duplicates (correlated columns, e.g. house size in feet and in metres)

•Learning algorithms understand only numbers, so text and images must be converted to numbers

•Unscaled or unstandardized data might lead to unacceptable predictions

What is Data Preprocessing?

•Check for Null values

•Remove or Impute


•df = df.dropna(how='any',axis=0)

Checking for Null Values: Concept + Python Code

•Sometimes two features that are meant to measure different characteristics of a model are influenced by common mechanism and they move together.

How to Handle Correlation:

•Remove one of the features

•Apply Principal Component Analysis (PCA)

Correlated Feature Check: Concept + Python Code
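
The correlated-feature check described above can be sketched in plain Python. This is only an illustration: the column names, the data and the 0.95 threshold are assumptions made for the example and are not taken from the course material.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: the same house sizes in square feet and in square metres.
size_sqft = [850.0, 1200.0, 1500.0, 2100.0]
size_sqm = [s * 0.092903 for s in size_sqft]

r = pearson(size_sqft, size_sqm)

# A correlation very close to 1 (or -1) suggests one of the two columns is redundant.
# The 0.95 cut-off is an arbitrary choice for this sketch.
redundant = abs(r) > 0.95
```

Here the two columns carry the same information, so one of them could be dropped before training.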

•Adjusting Data Types - Inspect data types to see if there are any issues. Data should be numeric.

•If required create new columns

Data Molding(Encoding): Concept + Python Code

Missing Data - Ways to Handle

•Drop rows

•Replace values (Impute)

Impute Missing Values: Concept + Python Code
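
As a minimal sketch of the "replace values" option, the snippet below fills missing entries with the column mean. The column, its values and the use of `None` as the missing-value marker are assumptions for illustration; the course's own code may use a different strategy or library.

```python
from statistics import mean

def impute_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

# Hypothetical column with two missing entries.
ages = [20, None, 30, 40, None]
imputed = impute_mean(ages)   # the gaps are filled with 30, the mean of 20, 30 and 40
```

Dropping rows is simpler but discards data; imputing keeps every row at the cost of introducing estimated values.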

•Feature Scaling is a technique to standardize the independent features present in the data in a fixed range.

•It is performed during the data pre-processing to handle highly varying magnitudes or values or units.


• Without feature scaling, a machine learning algorithm tends to weigh larger values higher and treat smaller values as less important, regardless of the units of the values.

Scaling: Python Code
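
Two common ways to scale a feature are min-max scaling (to the range 0 to 1) and standardization (zero mean, unit variance). The sketch below shows both in plain Python; the salary figures and function names are made up for the example.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale values linearly so the smallest becomes 0 and the largest 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical feature with large magnitudes.
salaries = [30000, 45000, 60000, 90000]
scaled = min_max_scale(salaries)   # [0.0, 0.25, 0.5, 1.0]
z_scores = standardize(salaries)
```

Both helpers assume the feature is not constant; a constant column would cause a division by zero.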

Convert text values to numbers. These can be used in the following situations:

•There are only two values for a column in your data. The values will then become 0/1 - effectively a binary representation

•The values have relationship with each other where comparisons are meaningful (e.g. low<medium<high)

Label Encoder: Concept + Code

•Use when there is no meaningful comparison between values in the column

•Creates a new column for each unique value for the specified feature in the data set

One-Hot Encoder: Concept + Python Code
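
One-hot encoding can be sketched without any library, as below. The city names are hypothetical, and the column order (sorted unique values) is a choice made for this example only.

```python
def one_hot(values):
    """Return (categories, rows) with one 0/1 column per unique value, in sorted order."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

# Hypothetical column with no meaningful ordering between its values.
cities = ["Delhi", "Mumbai", "Delhi", "Pune"]
categories, encoded = one_hot(cities)
# categories -> ['Delhi', 'Mumbai', 'Pune']
# encoded    -> [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]]
```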
Data Preprocessing
3 questions
+ Supervised Learning: Regression
20 lectures 02:35:52

Full Course content (Code) can be downloaded from Github:

Simple Linear Regression: Concept

•Error = (y_pred – y_act)^2

•Two Methods:

1.Least Square Criterion (OLS)

2.Gradient Descent

Minimizing Cost Function

•A non-iterative method that fits a model such that the sum of squares of the differences between observed and predicted values is minimized

•Error = (y_pred – y_act)^2

•Line => y = b0 + b1x

Ordinary Least Square(OLS)
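
For a single feature, the least-squares line y = b0 + b1x has a closed-form solution: b1 is the covariance of x and y divided by the variance of x, and b0 = mean(y) - b1 * mean(x). The sketch below applies it to made-up data; the function name is illustrative.

```python
def ols_fit(xs, ys):
    """Closed-form least-squares fit of y = b0 + b1 * x for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Hypothetical data generated from y = 1 + 2x, so the fit should recover b0 = 1 and b1 = 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
b0, b1 = ols_fit(xs, ys)
```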

•Cost Function, J(m,c) = Σ (y_pred – y_act)^2 / n, where n is the no. of data points

•Hypothesis: y_pred = c + mx

Gradient Descent
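
A minimal gradient-descent sketch for the hypothesis y_pred = c + mx, minimizing the mean squared error. The learning rate, the number of epochs and the data are assumptions chosen so that this toy example converges; they are not values from the course.

```python
def gradient_descent(xs, ys, lr=0.05, epochs=5000):
    """Fit y_pred = c + m * x by repeatedly stepping against the gradient of the MSE."""
    m, c = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [(c + m * x) - y for x, y in zip(xs, ys)]
        grad_m = (2 / n) * sum(e * x for e, x in zip(errors, xs))  # dJ/dm
        grad_c = (2 / n) * sum(errors)                              # dJ/dc
        m -= lr * grad_m
        c -= lr * grad_c
    return m, c

# Hypothetical data generated from y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
m, c = gradient_descent(xs, ys)
```

Unlike OLS, this method is iterative: too large a learning rate makes the updates diverge, and too small a rate makes convergence slow.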

It tells how well the regression equation explains the data.

•A value of R^2 = 1 means the regression predictions perfectly fit/explain the data.

Question: Can R^2 be negative?

•Ans: Yes, when Sum of Squared Errors (SSE) > Total Sum of Squares (SST), since R^2 = 1 – SSE/SST

•This means our model performs worse than the average line, which is a very rare case.

Measuring Regression Model Performance: R^2 (R - Square)
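
The sketch below computes R^2 = 1 - SSE/SST and shows both a perfect fit and a negative value. The data is invented, and the "bad" predictions are deliberately reversed so that the model is worse than predicting the mean.

```python
def r_squared(y_act, y_pred):
    """R^2 = 1 - SSE / SST, where SST is measured around the mean of y_act."""
    mean_y = sum(y_act) / len(y_act)
    sse = sum((a - p) ** 2 for a, p in zip(y_act, y_pred))
    sst = sum((a - mean_y) ** 2 for a in y_act)
    return 1 - sse / sst

y_act = [3.0, 5.0, 7.0, 9.0]

perfect = r_squared(y_act, y_act)               # 1.0: predictions match exactly
bad = r_squared(y_act, [9.0, 7.0, 5.0, 3.0])    # -3.0: SSE (80) exceeds SST (20)
```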

Code file and datasets can be found in the

Simple Linear Regression: Python Code -1
Simple Linear Regression: Python Code -2


1. Linear Relationship

•linear regression is sensitive to outlier effects

•needs the relationship between the independent and dependent variables to be linear

•linearity assumption can best be tested with scatter plots

2. Homoscedasticity

•meaning the variance of the residuals is the same across the regression line

•Heteroscedasticity Test to check - The Goldfeld-Quandt Test

3. Multivariate Normality

•This assumption can best be checked with a histogram or a Q-Q-Plot

•Normality can be checked with a goodness of fit test(Kolmogorov-Smirnov)

4. No Autocorrelation in the Data

•Autocorrelation occurs when the residuals are not independent of each other.

•In simple terms, when the value of y(x+1) is not independent of the value of y(x)

•Durbin-Watson test

5. Lack of Multicollinearity

•Multicollinearity: Model cannot differentiate between the effect of D1 and D2 as these are totally related.

•fixed using correlation in data pre processing

Assumptions of Linear Regression

There is a linear relationship between both the dependent and independent variables.
It also assumes no major correlation between the independent variables.

•Multiple regressions can be linear and nonlinear.

Multiple Linear Regression: Concept

y = b0 + b1x1 + b2x2 + b3x3 + b4x4 + b5x5

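Here x4, x5 are dummy variables

x5 = 1 – x4

Including both causes multicollinearity, which is why this is called the dummy variable trap

For 2 categories -> one column of 1 & 0

For > 2 categories -> one column less than the number of categories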

Dummy Variable
Multiple Linear Regression: Python - 1
Multiple Linear Regression: Python - 2
Multiple Linear Regression: Python - 3

If data is not linear, we need polynomial terms to fit it better

Polynomial Linear Regression: Concept
Polynomial Linear Regression: Python - 1
Polynomial Linear Regression: Python - 2
Polynomial Linear Regression: Python - 3
Polynomial Linear Regression: Python - 4
Linear Regressions Comparisons
Simple Linear Regression: Quiz
4 questions
Question in this section is related to Supervised Learning: Regression
Boston Housing Price Prediction
1 question
Assignment: Predicting Housing Prices (Boston Data Solution): Optional
+ Supervised Learning: Classification
14 lectures 01:31:30

Issue with Linear Regression

•But if we have an outlier, it will go horribly wrong

•Because of one outlier, whole linear regression prediction is going wrong

Logistic Regression

  • Logistic regression can be understood through the standard logistic function. The logistic function is a sigmoid function, which takes any real value and maps it to a value between zero and one.

  • If we plot the sigmoid function, the graph is an S-shaped curve. The sigmoid function copes with an outlier better than a straight line does.

  • Linear regression assumes that the data follows a linear function.

  • Logistic regression models the data using the sigmoid function

Logistic Regression
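
A small sketch of the sigmoid function and how a logistic regression model would use it to turn a linear score into a probability. The coefficient names `b0` and `b1` follow the line equation used earlier; `predict_proba` is a hypothetical helper, not a library function.

```python
import math

def sigmoid(z):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, b0, b1):
    """Probability of the positive class for one feature x (hypothetical helper)."""
    return sigmoid(b0 + b1 * x)

middle = sigmoid(0)      # 0.5, the midpoint of the S curve
low = sigmoid(-10)       # very close to 0
high = sigmoid(10)       # very close to 1
```

Because the output flattens out near 0 and 1, an extreme input moves the prediction far less than it would on a straight line.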

Describe the performance of a classification model

•Accuracy: the fraction of correct predictions among all predictions made by the model

•Precision: the fraction of correct positive predictions among all positive predictions made by the model

•Recall: the fraction of actual positives that the model correctly predicted as positive

Confusion Matrix: Measuring Performance of Classification Model
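
The three metrics above follow directly from the four cells of a binary confusion matrix. The labels below are invented for the example, with 1 as the positive class.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn

# Hypothetical ground truth and predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + fp + fn + tn)   # 7 / 10
precision = tp / (tp + fp)                   # 3 / 5
recall = tp / (tp + fn)                      # 3 / 4
```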

Spam Filter (positive class: spam): Optimize for precision or specificity because false negatives (spam goes to inbox) are more acceptable than false positives (non-spam caught by spam filter).

Fraudulent transaction detector ( positive class: fraud): Optimize for sensitivity because false positives (normal transactions that are flagged as possible fraud) are more acceptable than false negatives (fraudulent transactions that are not detected)

Confusion Matrix: Case Study
Logistic Regression: Python 1
Logistic Regression: Python 2
Logistic Regression: Python 3
Logistic Regression: Python 4

It assumes that similar things exist in close proximity.

* Step 1: Choose the no. K of neighbours
* Step 2: Take the K nearest neighbours of the new data points by Euclidean distance
* Step 3: Among K Neighbours, count the no. of data points in each category
* Step 4: Assign the new data point to the category where you counted the most neighbours

K - Nearest Neighbours Algorithm
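
The four steps above can be sketched as follows. The training points, labels and the default k = 3 are assumptions for illustration; `math.dist` computes the Euclidean distance and needs Python 3.8 or later.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Step 2: sort training points by Euclidean distance to the query.
    distances = sorted(
        (math.dist(p, query), label) for p, label in zip(train_points, train_labels)
    )
    # Step 3: count the labels among the k nearest neighbours.
    votes = Counter(label for _, label in distances[:k])
    # Step 4: return the most common label.
    return votes.most_common(1)[0][0]

# Hypothetical training data with two well-separated groups.
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["A", "A", "A", "B", "B", "B"]

near_a = knn_predict(points, labels, (2, 2))
near_b = knn_predict(points, labels, (6, 5))
```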
K - Nearest Neighbours: Python 1
K - Nearest Neighbours: Python 2

It is called Naive (innocent) because it assumes that all the features are independent of each other, which is almost never the case.

•Easy to understand.

•All features are independent.

•All impact results equally.

•Need small amount of data to train the model.

•Fast – up to 100X faster.

•It is highly scalable.

•It can make probabilistic predictions.

•It's simple & out-performs many sophisticated methods.

•Stable to data changes.

Naive Bayes
Naive Bayes: Python Code
Pickle File: Saving and Loading ML Models: Python
Question in this section is related to Supervised Learning: Classification
Wine Quality Prediction
1 question
Assignment 2: Predicting Wine Quality: Optional
It includes three iris species with 50 samples each as well as some properties about each flower. One flower species is linearly separable from the other two, but the other two are not linearly separable from each other.
Classify iris plants into three species
1 question
+ UnSupervised Learning: Clustering
15 lectures 01:26:19


1.Choose the number K of clusters.

2.Select K points at random as the initial centroids (not necessarily from the dataset).

3.Assign each data point to the nearest centroid; this step creates the clusters.

4.Compute and place the new centroid of each cluster.

5.Reassign each data point to the new closest centroid. If any reassignment took place, repeat from step 4; otherwise finish.

K-Means Algorithm
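
A compact sketch of the loop described above. To keep the result reproducible, the initial centroids are passed in explicitly instead of being chosen at random; that is a simplification for this example, as are the data points.

```python
import math

def k_means(points, centroids, max_iter=100):
    """Alternate assignment and centroid-update steps until the centroids stop moving."""
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical 2-D data with two obvious groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = k_means(points, centroids=[(1.0, 1.0), (8.0, 8.0)])
```

With a poor starting choice the same loop can settle on a worse grouping, which is the problem that smarter initialization schemes try to avoid.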

•Solution(Fix) -> K-Means++

•K-Means++ -> smarter initialization of centroids, rest is same

Random Initialization Trap


•Euclidean distance between a given point and centroid to which it is assigned.

•Iterate this process for all the points in the cluster

•Sum all the values and divide by no. of points

Total WCSS decreases as no. of clusters increases

Total WCSS is minimum when No. of clusters is equal to no. of data points

Elbow Method to find the optimal number of clusters

Elbow Method: Choosing optimum no of clusters
K-Means++ : Python 1
K-Means++ : Python 2
K-Means++ : Python 3

•These methods perform a hierarchical decomposition of the dataset.

Agglomerative method (Bottom-Up): assume each data as cluster & merge to create a bigger cluster

Divisive method (Top-Down): start with one cluster & continue splitting


•Start by assigning each data point to its own cluster - N clusters

•Combine the two closest points into one cluster - (N - 1) clusters

•Combine the two closest clusters into one cluster - (N - 2) clusters

•Repeat step 3 until there is only one cluster left

Hierarchical - Agglomerative Algorithm
Agglomerative - Dendrogram
Agglomerative - Python 1
Agglomerative - Python 2

All the above techniques are distance based; such methods can find only spherical clusters and are not suited to clusters of other shapes. They are also severely impacted by noise or outliers in the data.


•If data is of arbitrary shape

•Data contain noise

Algorithm has two parameters:
eps: The radius of the neighborhood around a data point p. If the distance between two points is lower than or equal to eps, they are neighbours. A small value will mark a large number of data points as outliers, and a large value will put the majority of data points in the same cluster.

minPts: The minimum number of data points we want in a neighborhood to define a cluster. minPts >= D + 1, where D is the number of dimensions, and it should be at least 3.

Density Based Clustering - DBSCAN
DBSCAN - Python 1
DBSCAN - Python 2

•Not as straightforward as for supervised algorithms

•Question of Good clustering is relative

Some Popular Index:


•Evaluates intra-cluster similarity and inter-cluster differences

•Not Normalized, so difficult to compare between two different datasets

Silhouette Index

•Calculated using the mean intra-cluster distance (a) and the mean nearest-cluster distance (b) for each sample

•The Silhouette Coefficient for a sample is (b - a) / max(a, b). To clarify, b is the mean distance between a sample and the points of the nearest cluster that the sample is not a part of.

•Normalized to the range -1 to 1; a value close to 1 is good

•good for spherical data structures

Measuring UnSupervised Clusters Performance
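
The Silhouette Coefficient for a single sample can be computed directly from the definition (b - a) / max(a, b). The sketch below uses made-up one-dimensional points and assumes that all points in a cluster are distinct and that each cluster has at least two points.

```python
import math

def silhouette_sample(point, own_cluster, other_clusters):
    """Silhouette coefficient (b - a) / max(a, b) for one sample."""
    # a: mean distance to the other points of the sample's own cluster.
    others = [p for p in own_cluster if p != point]
    a = sum(math.dist(point, p) for p in others) / len(others)
    # b: smallest mean distance to the points of any other cluster.
    b = min(sum(math.dist(point, p) for p in c) / len(c) for c in other_clusters)
    return (b - a) / max(a, b)

# Hypothetical, well-separated clusters on a line.
cluster_1 = [(0.0,), (1.0,)]
cluster_2 = [(10.0,), (11.0,)]

s = silhouette_sample((0.0,), cluster_1, [cluster_2])   # a = 1, b = 10.5
```

Averaging this value over all samples gives a single score for the whole clustering.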
Silhouette Index - Python 1
+ UnSupervised Learning: Association Rule
5 lectures 42:16

Apriori Algorithm:

•Used to identify frequent item sets.

•Uses a bottom-up approach: it first identifies individual items that fulfill a minimum occurrence threshold. After this, it adds one item at a time and checks whether the resulting item set still meets the specified threshold.

•The algorithm stops when there are no more items left to add that meet the min. occurrence threshold

Apriori Algorithm

•Once we generated itemsets using Apriori, we can apply association rules.

•As our item sets contain 2 items, our association rules will be of the form (A) -> (B)

Three measures:

1. Support

2. Confidence

3. Lift

Association Rule Mining
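
Support, confidence and lift for a rule (A) -> (B) can be computed from a list of transactions as sketched below. The shopping baskets are invented for the example, and the function names are illustrative rather than taken from any library.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)

def confidence(transactions, a, b):
    """Of the transactions containing A, the fraction that also contain B."""
    return support(transactions, [a, b]) / support(transactions, [a])

def lift(transactions, a, b):
    """Confidence of A -> B relative to how often B occurs anyway."""
    return confidence(transactions, a, b) / support(transactions, [b])

# Hypothetical shopping baskets.
transactions = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["bread", "milk"],
    ["milk"],
]

sup = support(transactions, ["bread", "butter"])     # 2 of 4 baskets
conf = confidence(transactions, "bread", "butter")   # 2 of the 3 bread baskets
lft = lift(transactions, "bread", "butter")          # above 1: bought together more often than by chance
```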
Apriori Association: Python 1
Apriori Association - Python 2
Apriori Association- Python 3
+ Deploy Machine Learning Model on AWS Using Flask
5 lectures 27:16
Deploying ML on AWS - Concept
Saving the ML Model
Serverside - Python
Clientside - Python
Configuring and sending request
+ Supervised Learning: Decision Tree and Support Vector Machines
17 lectures 01:40:19

•It's a tree-like data structure used to model the data

•uses if-else at every node of the tree

•can be used for both classification and regression analysis

Algorithm : Decision Trees

•ID3 (Entropy and Information Gain)

•Gini Index

•Chi Square

My Github:

For detailed Entropy explanation refer to file : "Decision Tree" in above Repo.
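
As a rough illustration of the splitting criteria listed above, the sketch below computes entropy, information gain and the Gini index for a candidate split. The labels and the split are made up; this is not the code from the referenced repository.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical node with 5 "yes" and 5 "no" samples, split into two children.
parent = ["yes"] * 5 + ["no"] * 5
children = [["yes"] * 4 + ["no"], ["yes"] + ["no"] * 4]

parent_entropy = entropy(parent)          # 1.0 for a 50/50 node
parent_gini = gini(parent)                # 0.5 for a 50/50 node
gain = information_gain(parent, children)
```

A tree-building algorithm would evaluate many candidate splits this way and keep the one with the highest gain (or the lowest impurity).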

Decision Tree Regression - Concept 1
Decision Tree Regression - Concept 2
Decision Tree Regression - Python 1
Decision Tree Regression - Python 2
Decision Tree Classification - Concept 1
Decision Tree Classification - Concept 2
Decision Tree Classification - Python 1
Decision Tree Classification - Python 2

•Operates well in higher dimensions

•Avoids curse of dimensionality

•Fast to compute

•Max Margin: A slight error in measurement will not cause a misclassification

Support Vector Machines - Concept
Support Vector Machines - Python 1
Support Vector Machines - Python 2
Kernel SVM
Kernel SVM - Python 1
Kernel SVM - Python 2

•The fit line is the hyperplane that contains the maximum number of points within the margin

•Y = mx +c

•-e < Ypred – Yact < e

Support Vector Regression - Concept
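
The condition -e < Ypred – Yact < e means that errors inside a tube of width e around the line are ignored. One way to express this is the epsilon-insensitive loss sketched below; the data and the value of epsilon are assumptions for the example.

```python
def epsilon_insensitive_loss(y_act, y_pred, epsilon):
    """Per-sample loss: zero inside the epsilon tube, linear in the excess outside it."""
    return [max(0.0, abs(p - a) - epsilon) for a, p in zip(y_act, y_pred)]

# Hypothetical targets and predictions.
y_act = [1.0, 2.0, 3.0]
y_pred = [1.05, 2.5, 2.0]

losses = epsilon_insensitive_loss(y_act, y_pred, epsilon=0.1)
# The first prediction is within the tube, so it contributes no loss;
# the other two are penalized by how far they fall outside it.
```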
Support Vector Regression - Python 1
Support Vector Regression - Python 2