Practical Data Science
3.3 (24 ratings)
Instead of using a simple lifetime average, Udemy calculates a course's star rating by considering a number of different factors such as the number of ratings, the age of ratings, and the likelihood of fraudulent ratings.
714 students enrolled
You will gain the necessary practical skills to jump start your career as a Data Scientist!
Created by Atul Bhardwaj
Last updated 8/2015
Price: $20
30-Day Money-Back Guarantee
  • 5.5 hours on-demand video
  • 1 Article
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Understand the entire Data Science Process
  • Use Python and its Scientific Libraries: Pandas, NumPy, StatsModels and more...
  • Put Theory and Concepts into action through Practical Application
  • Use various Statistical Methods to Extract useful Information from Data
  • Hands on Experience with handling Big Data
Requirements
  • Python - IPython Notebook (Download/Installation instructions will be provided)
  • You should have Microsoft Excel

"Junior Level Data Scientist Median Salary from $91,000 and up to $250,000".

As an experienced Data Analyst I understand the job market and the expectations of employers. This data science course is specifically designed with those expectations and requirements in mind. As a result you will be exposed to the most popular data mining tools, and you will be able to leverage my knowledge to jump start (or further advance) your career in Data Science.

You do not need an advanced degree in mathematics to learn what I am about to teach you. Where books and other courses fail, this data science course excels: each section of code is broken down in Jupyter and explained in an easy-to-digest manner. Furthermore, you will work with real data and solve real problems, which gives you valuable experience!

Who is the target audience?
  • Junior Data Scientist
  • Statistical Analyst
  • Data Analyst
  • This course is suited for individuals who want to advance their career in data science or data analytics
Curriculum For This Course
41 Lectures 05:22:05
What is Data Science?
2 Lectures 19:18

This is an introduction to the topic of Data Science. We discuss what Data Science is and some of the buzzwords surrounding this subject.

Preview 05:23

We look at the most popular views on the Data Science Process to gain important insights into this topic. The topics include the Knowledge Discovery Process (KDD), the Industry Standard Data Mining Process (CRISP-DM) and much more.

The Process
Python Basics
5 Lectures 56:55

In this lecture we will install Anaconda, which is a completely free (and popular) Python distribution.

Python Installation

We look at the updated version of IPython, now known as Jupyter.

Jupyter (formerly iPython) Introduction

In this lecture, we cover the basics of a very popular scientific library in Python, called NumPy.


For the purpose of creating visuals, we look at matplotlib which is a 2D plotting library.


Pandas "aims to be the fundamental high-level building block for doing practical, real world data analysis in Python". This is one of the most important libraries for a data analyst to be familiar with when using Python. It leverages the power of NumPy and matplotlib, among other things.

Statistical Methods → Data Summarization
5 Lectures 30:19

In this two-part lecture on Data (or Variable) Types, we look at identifying different types of variables.

Data Types (Part 1) — Identifying Types of Variables

In the second part, we learn numerical methods of summarizing individual variables, whether they are qualitative or quantitative.

Data Types (Part 2) — Summarizing Variables Numerically

Here we look at calculating descriptive statistics in Python.

Descriptive Statistics (in Python)
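
As a rough idea of what calculating descriptive statistics in Python can look like, here is a minimal sketch using only the standard library `statistics` module. The `ages` values are made up for illustration, and the lecture itself may well use Pandas or NumPy instead.

```python
import statistics

# Hypothetical sample of ages; the values are made up for illustration.
ages = [22, 25, 25, 31, 38, 40, 54]

summary = {
    "count": len(ages),
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    "stdev": statistics.stdev(ages),  # sample standard deviation (divides by n - 1)
    "min": min(ages),
    "max": max(ages),
}

for name, value in summary.items():
    print(f"{name:>6}: {value}")
```

The same numbers are what a spreadsheet's descriptive statistics tool would report for a single quantitative variable.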

We use Excel to generate Descriptive Statistics.

Descriptive Statistics (in Excel)

This can be thought of as a bonus lecture, where we use SAS to access Descriptive Statistics.

Descriptive Statistics (in SAS)
Statistical Methods → Exploratory Data Analysis
9 Lectures 01:08:57

Perhaps the most commonly used data visualization technique is a Histogram. This lecture answers: What is a Histogram and How to generate one in Python.

Preview 09:08

Probability Mass Functions are not routinely covered in statistics texts; however, they can provide more information than a Histogram. We look at implementing Probability Mass Functions in Python.

Analyzing Individual Variable — Probability Mass Functions
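
A Probability Mass Function maps each distinct value to the fraction of observations that take that value. The sketch below is one simple way to build it with the standard library; the helper name `pmf` and the `siblings` data are hypothetical and are not taken from the course.

```python
from collections import Counter


def pmf(values):
    """Map each distinct value to its relative frequency."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in sorted(counts.items())}


# Hypothetical data: number of siblings reported by eight people.
siblings = [0, 1, 1, 2, 0, 1, 3, 1]
probabilities = pmf(siblings)

for value, probability in probabilities.items():
    print(value, probability)
```

Unlike raw Histogram counts, these probabilities sum to 1, which makes samples of different sizes directly comparable.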

The next logical concept in Exploratory Data Analysis (after Probability Mass Functions) is the Cumulative Distribution Function. We use smoothing to gain insights about the underlying distribution of our empirical data.

Analyzing Individual Variable — Cumulative Distribution Functions
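
An empirical Cumulative Distribution Function gives, for any value x, the fraction of observations less than or equal to x. The following is a minimal sketch with a hypothetical helper `ecdf` and made-up `weights`; it shows the unsmoothed empirical version only, not the smoothing step mentioned in the lecture.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function x -> fraction of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)

    def cdf(x):
        # bisect_right counts how many sorted values are <= x.
        return bisect_right(ordered, x) / n

    return cdf


# Hypothetical birth weights in kilograms.
weights = [2.9, 3.1, 3.4, 3.4, 3.8, 4.0, 4.2, 4.5]
cdf = ecdf(weights)

print(cdf(3.4))  # fraction of observations at or below 3.4 kg
```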

In this lecture we look at Probability Density Functions and the difference between Empirical and Analytical distributions.

Probability Density Functions & Modelling Empirical Distribution

We look at the differences between Probability Density and Probability Distribution. Additionally, we look at how to generate a Kernel Density Plot in Python.

Smoothing Variable Distribution — Kernel Density Estimation

We move away from analysing individual variables and look at how variables affect each other. Specifically, we look at a very common technique, the Box Plot, to examine the relationship between two variables.

Relationship Between Two Variables — Box Plots

We continue with the Exploratory Data Analysis techniques to visualize two variables in concert. In this lecture, we look at Scatter Plots.

Relationship Between Two Variables — Scatter Plots

Here we look at methods that quantify the relationship between two variables. Specifically, the two common measures are Correlation and Covariance.

Relationship Between Two Variables — Correlation & Covariance
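
To make the two measures concrete, here is a small sketch that computes the sample covariance and the Pearson correlation by hand. The function names and the `hours` and `scores` data are hypothetical; in practice a library such as NumPy or Pandas would normally be used.

```python
import math


def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance rescaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


# Hypothetical data: hours studied and exam scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

print(covariance(hours, scores))
print(correlation(hours, scores))
```

Covariance depends on the units of both variables, while correlation always lies between -1 and 1, which is why correlation is easier to compare across pairs of variables.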

Analyzing the relationship between two Categorical Variables can prove to be very insightful. In this lecture we look at comparing different populations, testing the difference and visualizing the relationship.

Bivariate Relationship Between Categorical Variables
Exploratory Data Analysis (EDA) → Practical Example
1 Lecture 16:46

We conduct exploratory data analysis on the Titanic passenger data set made popular by Kaggle.

Exploratory Data Analysis of The Titanic Disaster
Statistical Methods → Statistical Analysis
7 Lectures 39:03

The Central Limit Theorem (CLT) is a critical concept in statistics. Its properties allow us to make inferences about a population without knowing its true distribution. In this lecture we use simulations (in Python) to demonstrate the CLT and use its properties to evaluate the central tendency and variance of a non-normal population distribution.

Central Limit Theorem
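
A simulation along these lines might look like the sketch below. It assumes an exponential population (mean 1, standard deviation 1) as the non-normal distribution; the sample size, number of samples and seed are arbitrary choices for illustration, not values from the lecture.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 50     # observations per sample (assumed)
NUM_SAMPLES = 2000   # number of repeated samples (assumed)


def sample_mean(n):
    """Mean of n draws from an exponential population with mean 1 and sd 1."""
    return statistics.fmean([random.expovariate(1.0) for _ in range(n)])


sample_means = [sample_mean(SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print("mean of sample means:", mean_of_means)
print("sd of sample means:  ", sd_of_means)
print("sigma / sqrt(n):     ", 1 / SAMPLE_SIZE ** 0.5)
```

Even though the population is strongly skewed, the sample means cluster around the population mean, and their spread is close to the population standard deviation divided by the square root of the sample size.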

We expand on the previous lecture about Central Limit Theorem and introduce estimation, specifically looking at the probability of correctly estimating a parameter.


In this lecture we answer:

  • Why use vectors in data analysis?
  • How to Represent vectors in Python?
  • What's the difference between using List and NumPy array?
  • Vectorization vs. Loops, Why use loops?
  • What are Matrices in Python?
  • and more...

Linear Algebra and Matrices — Basics
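
The questions above can be illustrated with a short sketch that contrasts a plain Python list with a NumPy array. The `prices` and `quantities` values are made up, and the example assumes NumPy is installed.

```python
import numpy as np

# Hypothetical data: unit prices and quantities sold.
prices = [10.0, 12.5, 8.0, 20.0]
quantities = [3, 1, 4, 2]

# With plain lists, element-wise arithmetic needs an explicit loop.
revenue_loop = []
for price, quantity in zip(prices, quantities):
    revenue_loop.append(price * quantity)

# With NumPy arrays, the same operation is a single vectorized expression.
revenue_vec = np.array(prices) * np.array(quantities)

# A matrix is a two-dimensional array; @ performs matrix multiplication.
matrix = np.array([[1, 2], [3, 4]])
product = matrix @ matrix

print(revenue_loop)
print(revenue_vec)
print(product)
```

Note that multiplying a list by an integer repeats the list rather than scaling its elements, which is one practical reason to prefer arrays for numerical work.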

You do not need to rely on any external packages in order to generate summary statistics. In this lecture, we discuss how matrices can be used to calculate summary statistics of one, two or many variables.

Linear Algebra and Matrices — Summary Statistics

We introduce Parametric Models (for Statistics) and extend this idea to Linear Response Modelling. Before we can apply this to popular statistical techniques such as Linear Regression, we need to discuss the assumptions of Linear Response Models.

Parametric Statistical Analysis — Linear Response Models

In this lecture we define linear regression, estimate model parameters and list regression assumptions.

Linear Regression

In this lecture we estimate regression model parameters through Ordinary Least Squares using Matrices.

Linear Algebra and Matrices — Ordinary Least Squares
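
In matrix form, the Ordinary Least Squares estimate solves the normal equations (X'X) b = X'y, where X is the design matrix with a leading column of ones for the intercept. The sketch below is one way to do this with NumPy; the data is a made-up noiseless line, so the recovered coefficients should match the intercept 2 and slope 3 used to generate it.

```python
import numpy as np

# Hypothetical data generated from y = 2 + 3x with no noise.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 + 3.0 * x

# Design matrix: a column of ones (intercept) next to the predictor.
X = np.column_stack([np.ones_like(x), x])

# Solve the normal equations (X'X) b = X'y instead of inverting X'X explicitly.
beta = np.linalg.solve(X.T @ X, X.T @ y)

intercept, slope = beta
print(intercept, slope)
```

Using `np.linalg.solve` rather than an explicit matrix inverse is a common choice because it is generally more numerically stable.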
Application of Statistical Methods
3 Lectures 17:02

Multiple regression in Excel: we look at important regression statistics and how they can be calculated from the sample and our regression line. We also look at the implications of multiple t-tests and why the F-test is more important for assessing the regression model.

Multiple Regression (in Excel)

Linear Regression forms the basis of Statistical Analysis. We use a trusted Python library to find the Ordinary Least Squares (OLS) estimate in this practical example.

Linear Regression (in Python)

In this practical example we extend simple linear regression to multiple regression using the StatsModels Python library.

Multiple Regression (in Python)
Information Retrieval Using Query Language
5 Lectures 15:04

The tools you need to complete the exercises for this section are discussed in this lecture. We also look at an important learning resource for SQL.

Getting Started with SQL

We discuss the CREATE TABLE statement in SQL and create our demo table.

CREATE TABLE Statement — Creating a Table in Database

We look at the SELECT statement and its SELECT DISTINCT variation in SQL. We also look at the LIMIT clause, which serves the same purpose as the SELECT TOP clause in some other SQL dialects.

SELECT & LIMIT — Selecting Data from Database

The ORDER BY keyword is used to sort the output in SQL. We discuss its usage in this video demonstration.

ORDER BY — Sorting Query Output

Grouping is commonly used to perform aggregation, and in this lecture we discuss the usage of GROUP BY in SQL.

GROUP BY — Grouping Output
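
The statements covered in this section can be tried out from Python with the built-in `sqlite3` module. This is only a sketch: the use of SQLite, the `sales` table and its rows are all assumptions for illustration, and the course may use a different database and demo table.

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# CREATE TABLE: define a hypothetical demo table.
conn.execute(
    "CREATE TABLE sales ("
    "id INTEGER PRIMARY KEY, region TEXT NOT NULL, amount REAL NOT NULL)"
)
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("East", 120.0), ("West", 80.0), ("East", 60.0),
     ("North", 200.0), ("West", 40.0)],
)

# SELECT DISTINCT with ORDER BY: each region once, sorted alphabetically.
regions = [row[0] for row in conn.execute(
    "SELECT DISTINCT region FROM sales ORDER BY region")]

# ORDER BY ... DESC with LIMIT: the two largest individual sales.
top_two = conn.execute(
    "SELECT region, amount FROM sales ORDER BY amount DESC LIMIT 2").fetchall()

# GROUP BY with an aggregate: total sales per region.
totals = conn.execute(
    "SELECT region, SUM(amount) AS total FROM sales "
    "GROUP BY region ORDER BY total DESC").fetchall()

conn.close()

print(regions)
print(top_two)
print(totals)
```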
Big Data
3 Lectures 38:48

Data Integration is performed at an early stage of the data science process. This video introduces you to HDF (Hierarchical Data Format), and you will learn how to easily use this platform-independent technology in Python.

Data Integration — Introduction to HDF (Hierarchical Data Format)

We look at various methods available in Python that deal with large datasets which do not fit into memory. In addition we will look at combining chunks of these datasets to generate a Data Warehouse.

Data Integration — A Practical Example

This lecture contains the updated Notebook for the example discussed in the previous video. Specifically, we use vectorization instead of for loops for the Table solution. This lecture is completely optional.

Data Integration — A Practical Example (Update)
Data Science for Business & Marketing
1 Lecture 19:54

It is a common business objective to find which products or promotions increase sales. This lecture gives you an idea of how to use Exploratory Data Analysis as a means of both Feature Selection and Knowledge Discovery. We then use multiple regression to verify whether the effect really exists (based on what we learned in our Exploratory Data Analysis!).

Product Promotion (in Python)
About the Instructor
Atul Bhardwaj
3.3 Average rating
24 Reviews
714 Students
1 Course
Data Analyst

I have an educational background in statistics, data mining and data science. In addition to being a SAS 9 Base certified programmer, I have experience with real world data science projects and research (in the health care sector). Data Science is my passion, and I want to pass my knowledge on to like-minded people. Please review my LinkedIn page to learn more about me.