The Comprehensive Data Analyst Course.

Name: The Comprehensive Data Analyst Course.
Rating: 3.9 (17 reviews)

Learn about Numpy, Pandas, SQL, Linear Algebra, Visualization and more through solved case studies

Created by Newton Academy

Last updated 6/2025

English

What you'll learn

Basics of Python.
Introduction to Numpy package for handling arrays
Introduction to Pandas package for cleaning and analysing data
Introduction to SQL
Basics of Linear Algebra - what a point and a line are, and the distance of a point from a line
What is a Vector and Vector Operations
What is a Matrix and Matrix Operations
Visualizing data, including bar graphs, pie charts, histograms
Data distributions, including mean, variance, and standard deviation, and normal distributions and z-scores
Analyzing data, including mean, median, and mode, plus range and IQR and box plots
Data Distributions like Normal and Chi Square
Probability, including union vs. intersection and independent and dependent events and Bayes' theorem
Central Limit Theorem
Hypothesis Testing

Course content

9 sections • 191 lectures • 30h 50m total length

Keywords, Identifiers and Variables • 8:12
Learn how Python keywords are reserved words and why you cannot use them as identifiers or variable names, and how identifiers and variables support typing in Python 3.8.10 on Colab.
Variable Assignment • 6:53
Discover how Python assigns variables with the equals sign, handles int, float, and string values, and exposes memory behavior through id and type to ensure correct operations.
Strings & List • 17:13
Explore Python basics by assigning variables, understanding types and memory behavior, and practicing strings and lists with indexing, slicing, mutability, and append operations.
Tuple • 3:19
Set • 4:19
Dictionary • 5:20
Data type conversion • 9:07
Python Comments • 2:47
Learn how Python comments improve readability, using hashes for single-line notes and triple quotes for multi-line blocks, with start and end markers or repeated hashes.
Print Statement • 5:40
Learn to improve readability by breaking long lines with a backslash, printing values, and formatting output with curly-brace placeholders and .format() for two values, A and B.
Python Arithmetic and Logical Operators • 10:02
Explore Python arithmetic and logical operators, including addition, subtraction, division, multiplication, modulus, floor division, and exponentiation; then compare values, apply the logical operators "and", "or", and "not", and learn augmented assignment.
Identity & Membership Operators • 6:05
Discover how the identity operator "is" compares variables in Python, revealing when objects share storage. Explore membership with "in" and "not in" for lists.
For & While loop • 7:03
Explore how for and while loops enable iteration, using range and lists to print sequences and tables efficiently, while emphasizing scalable, minimal-repetition code.
Conditional Statement • 2:50
Functions • 19:10
Modules • 7:11
In this module, learn how Python files become modules that store code for reuse, import them with aliases, and selectively import classes to keep programs compact and organized.
List - Part 1 • 6:19
List - Part 2 • 13:26
Master Python list operations: append vs insert vs extend, and delete methods del, pop, and remove. Learn zero-based indexing, handling duplicates, and how extend differs from append.
List - Part 3 • 10:34
Use reverse, in, and not in to access and check list elements; leverage sorted with reverse to view ascending or descending orders. Understand how sort mutates lists and memory references.
List - Part 4 • 12:56
List - Part 5 • 9:27
Tuple - Part 1 • 6:01
Tuple - Part 2 • 6:01
Set - Part 1 • 5:38
Acquire hands-on skills to create and manage sets in Python, including unordered, mutable collections of unique elements; use curly braces or set(), and add or update for multiple values.
Set - Part 2 • 8:11
Set - Part 3 • 2:56
Dictionary • 16:38
Strings • 11:16
Explore strings as immutable, ordered data in Python, and learn indexing, slicing, concatenation, repetition, and use split, join, find, and replace for processing text.
Numpy Introduction • 8:56
Creating arrays • 16:54
Array Operations - Part 1 • 12:50
Explore indexing and slicing of arrays, including zero-based starts, negative indices, and reversing; learn that arrays are mutable, support filtering, and use .copy() to create independent copies.
Array Masking • 3:59
Array Operations - Part 2 • 9:33
Explore NumPy array operations, including element-by-element and dot multiplication, and learn shape requirements for matrix multiplication, with examples using 2x3 and 3x2 arrays.
Array Operations - Part 3 • 13:10
Array broadcasting • 6:37
Array - Shape Manipulation & Sorting • 10:18
Learn to shape arrays, flatten with ravel, and reshape 1D arrays into 2D or 3D, ensuring element counts match. Master axis-based sorting and argsort for indices without altering the original.
Pandas - Introduction • 15:04
Learn how pandas reads a CSV into a DataFrame, inspects data with head and tail, and checks shape and columns.
Creating a DataFrame • 6:12
Accessing elements in a DataFrame • 12:07
DataFrame Filtering • 4:32
DataFrame Operations • 24:50
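
The list methods named in the List lectures above can be sketched as follows. This is only an illustration: the list contents are made-up values, not taken from the course notebooks.

```python
# Illustrative values only; not taken from the course material.
nums = [10, 20, 30]

nums.append([40, 50])   # append adds its argument as ONE element
# nums is now [10, 20, 30, [40, 50]]

nums.pop()              # pop() removes and returns the last element
nums.extend([40, 50])   # extend adds each item of the iterable separately
# nums is now [10, 20, 30, 40, 50]

nums.insert(1, 15)      # insert places 15 at index 1 (indexing is zero-based)
# nums is now [10, 15, 20, 30, 40, 50]

nums.append(20)         # introduce a duplicate value
nums.remove(20)         # remove deletes only the FIRST matching value
# nums is now [10, 15, 30, 40, 50, 20]

del nums[0]             # del removes by index
# nums is now [15, 30, 40, 50, 20]

ordered = sorted(nums, reverse=True)  # sorted returns a new list
nums.sort()                           # sort mutates the list in place
```

After the last two lines, ordered holds the descending copy while nums itself has been reordered in place, which is the difference between sorted and sort that the lectures point to.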

SQL Introduction • 6:16
Select Command • 5:07
Limit Command • 2:58
Column Filtering • 2:11
DISTINCT command • 9:32
Master the DISTINCT command in SQL by returning unique city and country combinations and ordering by city and country to reveal non-redundant, structured data.
WHERE command • 3:56
Master SQL querying by using WHERE clauses to filter rows, select specific columns, order results, and apply DISTINCT to reveal unique values in product data.
AGGREGATE Functions • 6:13
GROUP BY command • 11:53
AND, OR, NULL commands • 7:58
Master the AND, OR, and NULL conditions in SQL by combining filters with BETWEEN, greater than, and less than, and handling missing data with IS NULL and IS NOT NULL.
LIKE command & WILDCARD characters • 7:05
Explore the LIKE operator and wildcard characters, including the percent sign (zero or more characters) and the underscore (exactly one character), to find Maria in the customers table.
JOINS - Part 1 • 11:18
Explore left, right, inner, and full outer joins, plus self joins, and learn how to connect order details and products with ON clauses.
JOINS - Part 2 • 9:17
Explore left and right joins to preserve data, select columns with table.*, and join orders with employees to show order details and product names.
JOINS - Part 3 • 7:48
Explore inner, left, and full outer joins with practical examples, showing how matches, nulls, and duplicates arise; introduces self joins to pair customers by city.
IN command • 1:06
Learn how to use the IN command to filter customers by multiple cities and countries, such as Berlin, Germany, France, and the UK, in a single query.
HAVING Command • 4:26
Learn how the HAVING clause, written after WHERE and GROUP BY, filters aggregated results like country counts, and why HAVING is an elegant, optimized alternative.
UNION command • 1:35
Master the UNION command to append results from two queries, ensuring both queries return the same number of columns, in the same order, with compatible data types.
ANY & ALL command • 7:15
Master the ANY and ALL SQL commands with subqueries to filter products by related order details, understand why subqueries return a single column, and manage duplicates.
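
A few of the SQL commands listed above (LIKE with wildcards, GROUP BY and HAVING) can be tried without a database server. The sketch below uses Python's built-in sqlite3 module with an in-memory table; the customers table layout and its rows are invented for illustration and are not the course's dataset.

```python
import sqlite3

# Hypothetical in-memory table; column names and rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("Maria Anders", "Berlin", "Germany"),
        ("Hanna Moos", "Mannheim", "Germany"),
        ("Thomas Hardy", "London", "UK"),
        ("Marie Bertrand", "Paris", "France"),
    ],
)

# LIKE with %: 'Maria' followed by zero or more characters.
maria_rows = conn.execute(
    "SELECT name FROM customers WHERE name LIKE 'Maria%'"
).fetchall()

# LIKE with _: exactly one character between 'Mari' and the space.
mari_rows = conn.execute(
    "SELECT name FROM customers WHERE name LIKE 'Mari_ %' ORDER BY name"
).fetchall()

# GROUP BY builds one group per country; HAVING then filters the groups.
busy_countries = conn.execute(
    """
    SELECT country, COUNT(*) AS n
    FROM customers
    GROUP BY country
    HAVING COUNT(*) > 1
    """
).fetchall()

conn.close()
```

Here maria_rows contains only Maria Anders, mari_rows contains both Maria Anders and Marie Bertrand, and busy_countries keeps only Germany with a count of 2. A WHERE clause could not express that last filter, because it is evaluated before the groups exist.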

Quick Introduction • 25:18
What is a random variable • 13:13
Explore the definition of a random variable as a variable whose value is the unknown outcome of an experiment, with discrete and continuous examples like dice faces and rain indicators.
Nominal and Ordinal Data • 23:51
Central tendency - Introduction • 24:26
Central tendency - Examples • 12:28
Data Visualization • 20:50
Types of Quartile, Inter Quartile Range • 10:16
Explore percentile, range, and quartiles, and learn to identify the median (50th percentile), the lower and upper quartiles (25th and 75th), and the interquartile range.
Types of Quartile, Inter Quartile Range - Example • 16:05
Standard Deviation & Variance • 17:35
Explore how standard deviation and variance quantify data spread around the mean, compare population and sample formulas, and apply the coefficient of variation to assess variability.
Sample Standard Deviation • 22:36
Explain why sample standard deviation uses n minus one, not n, and how using the sample mean instead of the population mean biases the sample variance.
Co Variance • 9:33
Normal Distribution • 23:40
Explore the normal (Gaussian) distribution, its properties, and how to standardize it into a unit normal via z-scores, linking to the chi-square distribution.
Chi Square Distribution • 23:05
Explore the chi-square distribution, the sum of squares of k independent standard normal variables, with degrees of freedom guiding its shape and table-based probabilities for categorical associations.
Chi Square Goodness of Fit • 21:10
Association between Categorical variables • 11:39
Explore how the chi-square test assesses the association between two categorical variables, comparing observed and expected values, formulating null and alternative hypotheses, and interpreting results with degrees of freedom.
Correlation • 26:02
Explore correlation and association between variables using the Pearson coefficient, scatter plots, and covariance concepts, then compare linear and monotonic relationships with Spearman rank.
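
The population versus sample formulas mentioned in the Standard Deviation lectures above differ only in the divisor (n versus n - 1). A minimal sketch using Python's statistics module, on a made-up data set chosen so the numbers are easy to check by hand:

```python
import statistics

# Made-up sample used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)

# Population formulas divide the sum of squared deviations by n.
pop_var = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)

# Sample formulas divide by n - 1 (Bessel's correction).
sample_var = statistics.variance(data)
sample_sd = statistics.stdev(data)

# Coefficient of variation: spread expressed relative to the mean.
cv = sample_sd / mean
```

For this data the squared deviations sum to 32, so the population variance is 32 / 8 = 4 and the sample variance is 32 / 7, which is slightly larger.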

Introduction to EDA • 13:49
Iris Dataset • 8:33
Scatter Plot • 11:12
Two dimensional Scatter plot • 21:39
Explore two-dimensional scatter plots using matplotlib and seaborn, color-coding iris species to reveal separations between Setosa and Versicolor versus Virginica, and learn multiple plotting approaches.
Three dimensional scatter plot • 4:11
Pair plots • 10:55
Explore pair plots to compare all four features across every pair, revealing six unique plots and a distribution plot, with petal length and petal width best separating Setosa from others.
One dimensional scatter plot • 3:41
Learn how to visualize a feature on the x axis with a 1D scatter plot, separating Setosa, Virginica, and Versicolor by color, and relate to histogram, PDF, and CDF concepts.
Histogram, PDF, CDF • 15:17
Kde plots • 6:07
Kde plot - Intuition • 6:32
PDF and its properties • 9:28
Explore histograms and PDFs, where the total area under the curve equals one and the area under the density curve over an interval gives the probability of that interval.
CDF - Code snippet • 1:29
Mean, Median, Standard deviation, MAD - Code snippet • 9:16
Box plots • 5:10
Violin plot • 2:41

Haberman Data - Introduction • 6:47
Explore the Haberman survival dataset through exploratory data analysis with matplotlib and seaborn, inspecting 306 records and four features: age at operation, year, nodes, and survival status.
Data Overview • 8:44
Univariate Analysis • 15:28
Explore univariate analysis of age, year, and nodes using histograms and distribution plots to reveal substantial overlap between survival and non-survival, with nodes under four indicating higher survival chances.
Bivariate Analysis • 8:44

Donors Choose - Introduction • 8:17
Dive into the DonorsChoose Kaggle dataset, learn hands-on data analysis with Python, and predict proposal approvals while honing data storytelling and interpretation.
Data Understanding • 6:39
Explore the Kaggle data by analyzing the train and resource CSVs. Map project IDs to connect resource needs, quantities, and prices with metadata like teacher and state.
Data Definition • 9:16
Understanding basic data statistics • 13:05
Univariate Analysis - Part 1 • 10:48
Explore univariate analysis by grouping applications by state to compute approval percentages and rank states by acceptance. Also analyze prefixes and grades for their effect on approvals.
Univariate Analysis - Part 2 • 14:53
Apply univariate analysis to clean and normalize project subcategories into single terms like literacy_language, then count and sort their occurrences with a Counter to reveal literacy_language as the top category.
Univariate Analysis - Part 3 • 10:01
Univariate Analysis - Part 4 • 7:57
Univariate Analysis - Part 5 • 5:38
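
The clean-then-count step described for Univariate Analysis - Part 2 might look roughly like the sketch below. The raw strings and the normalize helper are hypothetical: the actual column values and the course's cleaning rules may differ.

```python
from collections import Counter

# Hypothetical raw subcategory strings; the real column values may differ.
raw = [
    "Literacy & Language",
    "Math & Science",
    "Literacy & Language",
    "Health & Sports",
    "Literacy & Language, Math & Science",
]


def normalize(label):
    """Turn 'Literacy & Language' into 'literacy_language' (assumed rule)."""
    return label.strip().lower().replace(" & ", "_").replace(" ", "_")


counts = Counter()
for entry in raw:
    # One project may list several categories separated by commas.
    for label in entry.split(","):
        counts[normalize(label)] += 1

# most_common() returns (term, count) pairs sorted by descending count.
ranking = counts.most_common()
```

With these invented rows, literacy_language comes out on top with 3 occurrences, followed by math_science with 2 and health_sports with 1.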

Introduction to Linear Equations • 13:36
Learn how linear algebra solves for unknowns in systems of equations, using a bank chase example to connect speeds, head start, and vector and matrix concepts.
Application of Linear Algebra • 7:41
What is a scalar • 9:38
What is a point and distance between 2 points • 15:40
What is a vector • 9:15
Row and Column Vector • 21:28
Transpose of a Matrix • 2:35
Unit Vector • 9:48
Learn how to compute a vector's magnitude using the L2 norm (Euclidean distance) and convert any vector to a unit vector by dividing by its magnitude, illustrated with examples.
Vector Addition and Subtraction • 10:56
Learn how to perform vector addition and subtraction: ensure equal lengths and formats, add element-wise, apply dot product, and extend to n dimensions with x_i + a_i.
Inverse of a vector • 2:54
Dot Product between two vectors • 11:00
Explore how the dot product of vectors works and why it matters in data analysis, data science, and machine learning, with rules for compatibility and scalar results.
Multiplication of a vector with a scalar • 2:29
Angle between 2 vectors - Part 1 • 2:26
Explore distributive properties of vectors and scalars, and learn how the angle between two vectors is defined and measured, including multiple configurations and the smallest angle.
Angle between 2 vectors - Part 2 • 4:27
Orthogonal Vectors • 1:42
Orthonormal vectors • 2:50
Equation of a line - Part 1 • 11:04
Equation of a line - Part 2 • 5:33
Equation of a line - Part 3 • 6:28
Equation of a line - Part 4 • 15:57
Explore the line equation in vector form, where W^T x = 0 defines a line through the origin that is perpendicular to W, using dot products and origin-shifted cases.
Projection of a point on a line • 5:42
Explore projecting a vector onto a line in the plane using magnitude and angle, including projections on axes and arbitrary lines with cos theta and sin theta relationships.
Distance of a point from a line • 24:04
How to determine point on the negative and positive side of a line • 13:28
Determine the positive or negative side of a line using the signed distance w^T x / ||w|| and its extension to circles, spheres, and higher dimensions.
Matrix Introduction • 5:30
Matrix Operations • 15:07
Symmetric, Square, Identity and Diagonal Matrix • 9:09
Orthogonal Matrix • 9:14
Minor, Cofactor and Determinant of a Matrix (Optional) • 12:04
Explore how to compute a matrix's inverse to perform division, and master minors, cofactors, and determinants, including 2x2 and 3x3 expansion methods.
Inverse of a matrix (Optional) • 15:22
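
Three of the ideas from the vector lectures above (the L2 norm, the unit vector, and the signed distance w^T x / ||w|| from a line through the origin) fit in a few lines of plain Python. The function names and the example vector are illustrative choices, not the course's code.

```python
import math


def norm(v):
    """L2 norm (Euclidean length) of a vector given as a list of numbers."""
    return math.sqrt(sum(x * x for x in v))


def unit_vector(v):
    """Scale v to length 1 by dividing each component by its norm."""
    length = norm(v)
    return [x / length for x in v]


def dot(a, b):
    """Dot product; both vectors must have the same number of components."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    return sum(x * y for x, y in zip(a, b))


def signed_distance(w, x):
    """Signed distance of point x from the line w^T x = 0 through the origin."""
    return dot(w, x) / norm(w)


w = [3.0, 4.0]                            # normal of the line 3*x1 + 4*x2 = 0
u = unit_vector(w)                        # same direction as w, length 1
d_pos = signed_distance(w, [3.0, 4.0])    # positive: same side as w
d_neg = signed_distance(w, [-3.0, -4.0])  # negative: opposite side
```

The sign of the result tells which side of the line a point lies on, and the absolute value is its perpendicular distance; here the two test points sit 5 units away on opposite sides.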

Preface for Dimensionality Reduction - Part 1 • 13:18
Explore dimensionality and why reduction aids visualization. Learn to represent data as column vectors and matrices, with rows as points and columns as features, using X and X transpose.
Preface for Dimensionality Reduction - Part 2 • 11:43
Preface for Dimensionality Reduction - Part 3 • 12:27
Preface for Dimensionality Reduction - Part 4 • 17:56
Compute covariance and variance from data sets and matrices. Understand how to treat x and y vectors, and how column-wise means and a covariance matrix S form the covariance calculation.
Preface for Dimensionality Reduction - Part 5 • 9:48
Geometric Intuition of PCA • 9:51
Mathematical formulation of PCA - Part 1 • 16:58
Define the data matrix and mean vector, standardize to zero mean and unit variance, then project points onto a unit vector u to maximize variance.
Mathematical formulation of PCA - Part 2 • 7:22
Mathematical formulation of PCA - Part 3 • 24:34
Formulate PCA as a constrained optimization with the covariance matrix; solve S u = lambda u to obtain eigenvalues and eigenvectors, select top components, and project X onto these axes.
Failure cases of PCA • 4:01
Connecting Colab to Gdrive • 6:16
Explore a real data set and dimensionality in visualization by connecting Google Colab to Google Drive, accessing the MNIST train CSV from Kaggle, and reading it with pandas.
Understanding MNIST dataset • 12:41
Visualizing MNIST single digit • 5:27
MNIST Visualization - Method 1 • 17:14
MNIST Visualization - Method 2 • 2:41
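
The PCA recipe outlined above (centre the data, build the covariance matrix S, solve the eigenvalue problem, project onto the top eigenvector) can be sketched in plain Python for two features. Several simplifications are assumed here: the data points are invented, the data is only centred (not scaled to unit variance), and the eigenvalues use the closed form for a symmetric 2x2 matrix rather than a general solver.

```python
import math

# Made-up 2-D data: rows are points, columns are features.
X = [[2.0, 1.0], [3.0, 3.0], [4.0, 2.0], [5.0, 5.0], [6.0, 4.0]]
n = len(X)

# Column means, then centre every point.
means = [sum(row[j] for row in X) / n for j in range(2)]
Xc = [[row[j] - means[j] for j in range(2)] for row in X]

# Sample covariance matrix S = [[a, b], [b, c]] (divides by n - 1).
a = sum(r[0] * r[0] for r in Xc) / (n - 1)
b = sum(r[0] * r[1] for r in Xc) / (n - 1)
c = sum(r[1] * r[1] for r in Xc) / (n - 1)

# Eigenvalues of a symmetric 2x2 matrix in closed form.
mid = (a + c) / 2
half_gap = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
lam1, lam2 = mid + half_gap, mid - half_gap

# Eigenvector for lam1 (this form assumes b != 0), scaled to unit length.
vx, vy = b, lam1 - a
length = math.hypot(vx, vy)
u1 = [vx / length, vy / length]

# Project each centred point onto the first principal direction.
projected = [r[0] * u1[0] + r[1] * u1[1] for r in Xc]

# Share of the total variance kept by the first component.
explained = lam1 / (lam1 + lam2)
```

For this toy data S has 2.5 on the diagonal and 2.0 off it, so the eigenvalues are 4.5 and 0.5 and the first component keeps 90% of the variance. For real data such as MNIST, a numerical eigen-solver would replace the closed form.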

Probability Mass Function • 3:47
Probability Distribution Function • 9:11
Bernoulli Distribution • 4:51
Discover the Bernoulli distribution, a discrete two-outcome model where one outcome has probability p and the other 1-p, and its use in Bernoulli trials.
Binomial Distribution • 21:49
Expected Value • 8:51
Expected Value - Example • 8:25
Expected Value for Bernoulli Distribution • 2:22
Compute the expected value for a Bernoulli distribution by multiplying outcomes 0 and 1 by their probabilities and summing, yielding E[X] = p.
Expected Value for Binomial Distribution • 13:15
Law of large numbers • 5:26
Normal Distribution and its properties • 17:23
Explore the normal (Gaussian) distribution, a bell-shaped, continuous, symmetric curve where the area under the curve equals one, and learn how the mean and standard deviation position and scale it to define probabilities.
Impact of standard deviation on the PDF • 1:32
Cumulative Distribution Function • 10:16
Explore how histograms and cumulative distributions connect, showing how the CDF reflects the area under the PDF and how left-right areas relate in normal distributions.
Formula of Normal Distribution • 2:48
Understanding Normal Distribution through excel • 7:26
Explore normal distribution with a custom Excel utility to visualize how mu and sigma shape the PDF and CDF, compute them step by step, and illustrate the 68-95-99.7 rule.
Normal Standard deviation • 6:09
Understand the standard normal distribution by transforming any normal distribution to mean zero and standard deviation one, enabling use of a z table by subtracting mu and dividing by sigma.
Extreme values in normal distribution • 1:16
Z score Introduction • 5:11
Learn how the z score standardizes any normal distribution to a unit normal, using (x−μ)/σ, and how the area under the PDF relates to probabilities between x and y.
Z score detailed explanation • 7:20
How to read a z score table • 8:20
Master how to read a z score table, interpret positive and negative z values, and use left and right area concepts under the normal distribution to solve problems.
Using z score - Example 1 • 5:04
Using z score - Example 2 • 11:47
Analyze the normal distribution with mean 16.3 and sd 0.2 by calculating z scores for pizza sizes, then find right-tail probability above 16.5 and the interval probability using the z-table.
Using z score - Example 3 • 4:30
Apply z-score calculations to a normal distribution with mean 70 and sd 5, read the z-table to estimate probabilities for x<65, x>75, and 65≤x≤75, then convert to counts.
Using z score - Example 4 • 8:13
Use z-score methods for a normal distribution to determine mu and sigma from P(X<30)=0.15 and P(X>50)=0.10 in a battery lifespan example.
Symmetric Distribution and Skewness • 12:43
Central Limit Theorem - Introduction • 8:20
Central Limit Theorem - Revisiting • 10:13
The central limit theorem states that, for a population with finite variance, the means of samples of size greater than 30 are approximately normally distributed, centered at the population mean, with standard deviation equal to the population standard deviation divided by sqrt(n).
Central Limit Theorem - Conclusion • 5:48
Explore the central limit theorem: for samples of size 30 or more from any population, the distribution of sample means is approximately normal with mean mu and standard deviation sigma over sqrt(n); if the population is normal, any sample size suffices.
Central Limit Theorem - Solved Example 1 • 8:12
Central Limit Theorem - Solved Example 2 • 3:49
Apply the central limit theorem to find the probability that the sample mean of 49 shoppers, with mu 448 and sigma 21, lies between 441 and 446, yielding about 24.5%.
Uniform Distribution • 6:29
Explore discrete and continuous uniform distributions and their relation to the normal distribution. See the discrete uniform's equal-probability finite outcomes, like a die, and the continuous uniform's height making the area one.
Log Normal Distribution • 7:19
Log Normal Distribution - Examples • 9:11
Explore how lognormal distributions arise in everyday data, with examples from online comments, dwell time, game durations, tissue sizes, surgery times, income, citations, file sizes, and traffic.
Power Law Distribution • 8:33
Pareto Distribution • 6:13
Pareto Distribution Formula • 10:09
Q-Q plot • 10:41
Explore the quantile-quantile (Q-Q) plot to compare an unknown distribution with a known one, especially against normal and log normal distributions, by sorting data and assessing linearity.
Box Cox Transformation • 2:35
How distributions are used • 3:08
Introduction to Null Hypothesis • 15:46
Learn hypothesis testing by combining normal distribution and the central limit theorem, clarifying null and alternate hypotheses, alpha, and p value through experiments and data samples.
Confidence Interval - Example 1 • 13:39
Construct a 90% confidence interval for a mean using a random sample, central limit theorem, and z-scores, illustrated with a US-India trade example.
Confidence Interval - Example 2 • 3:42
z table vs t table • 11:21
Understand why z score and z table are limited when population standard deviation is unknown, and how the t table uses degrees of freedom and the sample standard deviation.
Hypothesis Testing - Example 1 • 15:08
Hypothesis Testing - Example 2 • 8:21
Hypothesis Testing - Example 3 • 5:54
Conduct a one-tailed t-test of mu = 82 against mu > 82 with n = 25: a sample mean of 85 and s = 4.1 give t ≈ 3.66 and p < 0.005, so the null is rejected.
Concluding Hypothesis Testing • 5:26
Explore how alpha and p-values guide hypothesis testing, defining null and alternate hypotheses, interpreting evidence levels from p values, and deciding correctly when to reject or fail to reject the null.
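
Two of the worked numbers quoted in the lecture summaries above can be reproduced with the standard library alone. The sketch writes the standard normal CDF with math.erf instead of a z table, and it takes 85 as the sample mean in the t-test example, which is an assumption about how that summary should be read.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# CLT example: n = 49 shoppers, mu = 448, sigma = 21.
mu, sigma, n = 448.0, 21.0, 49
std_error = sigma / math.sqrt(n)           # 21 / 7 = 3.0
z_low = (441.0 - mu) / std_error
z_high = (446.0 - mu) / std_error
prob = normal_cdf(z_high) - normal_cdf(z_low)

# One-tailed t-test example: H0 mu = 82, H1 mu > 82, n = 25, s = 4.1.
# The sample mean of 85 is an assumed reading of the summary.
sample_mean, mu0, s, n_t = 85.0, 82.0, 4.1, 25
t_stat = (sample_mean - mu0) / (s / math.sqrt(n_t))
dof = n_t - 1                              # degrees of freedom for the t table
```

The exact CDF gives a probability a little above 0.24, while a table lookup with z values rounded to two decimals can land slightly higher or lower. The t statistic comes out near 3.66 with 24 degrees of freedom, beyond the one-tailed 0.005 critical value of about 2.80.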

Requirements

Foundational Mathematics

Description

THE COMPREHENSIVE DATA ANALYST COURSE IS SET UP TO MAKE LEARNING FUN AND EASY

This 100+ lesson course includes 20+ hours of high-quality video and text explanations covering Linear Algebra, Probability, Statistics, Permutation and Combination. The topics are organized into the following sections:

Python Basics, Data Structures - List, Tuple, Set, Dictionary, Strings
Pandas and Numpy.
Linear Algebra - Understanding what is a point and equation of a line.
What is a Vector and Vector operations
What is a Matrix and Matrix operations
Data Type - Random variable, discrete, continuous, categorical, numerical, nominal, ordinal, qualitative and quantitative data types
Visualizing data, including bar graphs, pie charts, histograms, and box plots
Analyzing data, including mean, median, and mode, IQR and box-and-whisker plots
Data distributions, including standard deviation, variance, coefficient of variation, Covariance and Normal distributions and z-scores.
Different types of distributions - Uniform, Log Normal, Pareto, Normal, Binomial, Bernoulli
Chi Square distribution and Goodness of Fit
Central Limit Theorem
Hypothesis Testing
Probability, including union vs. intersection, independent and dependent events, Bayes' theorem, and the Law of Total Probability
Hypothesis testing, including inferential statistics, significance levels, test statistics, and p-values.
Permutation with examples
Combination with examples
Expected Value
Donors Choose case study.

AND HERE'S WHAT YOU GET INSIDE OF EVERY SECTION:

We will start with basics and understand the intuition behind each topic.
Video lecture explaining the concept with many real-life examples so that the concept is drilled in.
Walkthrough of worked-out examples to see different ways of asking questions and solving them.
Logically connected concepts which slowly build up.

Enroll today! Can't wait to see you guys on the other side and go through this carefully crafted course which will be fun and easy.

YOU'LL ALSO GET:

Lifetime access to the course
Friendly support in the Q&A section
Udemy Certificate of Completion available for download
30-day money back guarantee

Who this course is for:

Aspiring Data Analysts
Business Analysts
Business Managers
Anyone wanting to learn the basics of storytelling through data


Course content

Basic Python for Data Analysis • 40 lectures • 6hr 10min

Basics of SQL • 17 lectures • 1hr 46min

Basics of Statistics • 16 lectures • 5hr 2min

Visualization of Iris Dataset using Seaborn and Matplotlib • 15 lectures • 2hr 10min

Visualization of Haberman dataset • 4 lectures • 40min

Donors Choose • 9 lectures • 1hr 27min

Linear Algebra • 29 lectures • 4hr 37min

Principal Component Analysis • 15 lectures • 2hr 52min

Advanced Statistics • 46 lectures • 6hr 8min
