Fundamentals of Statistics and Visualization in Python
4.3 (2 ratings)
Course Ratings are calculated from individual students’ ratings and a variety of other signals, like age of rating and reliability, to ensure that they reflect course quality fairly and accurately.
11 students enrolled

Learn to display your data using Python's visualization tools
Created by Packt Publishing
Last updated 3/2020
English
This course includes
  • 3.5 hours on-demand video
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What you'll learn
  • Basic concepts in statistics and data visualization
  • Use Python's data visualization tools to plot and explore data
  • Apply probability to statistics with the use of Bayesian Inference, a powerful alternative to classical statistics
  • Calculate and build confidence intervals in Python
  • Run basic regressions focused on linear and multilinear data
  • Run hypothesis tests and perform Bayesian inference for effective analysis and visualization
  • Apply probability to statistics by updating beliefs
Course content
22 lectures 03:14:59
Getting Started with Statistics and Visualization
4 lectures 31:22

This video will give you an overview of the course.

Preview 05:23

In this video, we will download and install the Anaconda distribution, which includes the modules, libraries, and packages used in the course.

   •  Download and install the Anaconda distribution

   •  Check the installation

   •  Review the resources available for the course

Installing Anaconda for Python
06:53

The goal of this video is to highlight the key aspects of statistics and visualization.

   •  Explore the basics of descriptive statistics as a summary of data

   •  Understand inferential statistics to make inferences or claims about data

   •  Explore data visualization to represent data

Understanding the Key Aspects of Statistics and Visualization
09:23

In this video, we will learn how to get data in different ways.

   •  Pull data using pandas datareader

   •  Pull data using an API key

   •  Pull data by downloading a CSV file

Getting Data and Performing Operations
09:43
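
The third approach listed above, loading a CSV file, can be sketched with pandas. This is a minimal illustration rather than the course's own code: the inline CSV text and its column names are invented so that the example runs without downloading anything.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a downloaded file.
csv_text = """date,ticker,close
2020-01-02,AAA,10.0
2020-01-03,AAA,10.5
2020-01-02,BBB,20.0
2020-01-03,BBB,19.0
"""

# read_csv accepts a file path or any file-like object.
prices = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

print(prices.head())
print(prices.dtypes)
```

With a real download, the `io.StringIO(...)` argument would simply be replaced by the path to the file.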
Test your knowledge
5 questions
Getting Grips with Core Statistics Concepts
5 lectures 42:46

In this video, we will use descriptive statistics to summarize your data numerically.

   •  Understand the measures of central tendency to identify the typical, middle, and most frequent values

   •  Learn about ranges and percentiles to determine the spread of the data distribution

   •  Understand measures of spread to see how far the average data point deviates from the center

Preview 08:27
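
The measures named above can be computed directly with pandas. The sketch below uses a small made-up sample; the variable names are illustrative only.

```python
import pandas as pd

# Made-up sample used only for illustration.
values = pd.Series([2, 4, 4, 5, 7, 9, 11])

# Central tendency: typical, middle, and most frequent values.
mean = values.mean()
median = values.median()
mode = values.mode().tolist()

# Range and percentiles describe the spread of the distribution.
value_range = values.max() - values.min()
q1, q3 = values.quantile([0.25, 0.75])

# Sample standard deviation (pandas uses ddof=1 by default).
std = values.std()

print(mean, median, mode, value_range, q1, q3, round(std, 3))
```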

This video focuses on grouping operations, using the split-apply-combine method.

   •  Understand how to split your data into groups and examine how to use a multi-index object to subset further

   •  Apply a function to each group

   •  Combine the results into a data structure

Grouping Data
05:42
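
A minimal sketch of split-apply-combine with pandas `groupby`, assuming a small invented sales table. Grouping on two keys produces a multi-index, which can then be used to subset further.

```python
import pandas as pd

# Invented data for illustration.
df = pd.DataFrame({
    "region": ["east", "east", "west", "west", "west"],
    "product": ["a", "b", "a", "a", "b"],
    "sales": [10, 20, 30, 50, 40],
})

# Split on two keys, apply the mean to each group, and combine the results
# into a Series indexed by a (region, product) MultiIndex.
mean_sales = df.groupby(["region", "product"])["sales"].mean()

# Subset further with the multi-index: all products in the west region.
west = mean_sales.loc["west"]

print(mean_sales)
print(west)
```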

In this video, we will discover the characteristics of a normal distribution.

   •  Learn about a standard normal distribution, which is a special case of a normal distribution

   •  Check normality with histograms

   •  Check normality with Q-Q plots

Performing Normal Distribution
10:56
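
The ideas above can be sketched numerically. Standardizing a sample gives z-scores, which are compared with a standard normal distribution, and a Q-Q plot is just the sorted z-scores plotted against theoretical normal quantiles. The simulated sample and the plotting positions `(i - 0.5) / n` are assumptions made for this illustration.

```python
from statistics import NormalDist

import numpy as np

# Simulated sample, assumed here only for illustration.
rng = np.random.default_rng(0)
sample = rng.normal(loc=50.0, scale=5.0, size=1000)

# Standardize: z-scores have mean 0 and standard deviation 1.
z = (sample - sample.mean()) / sample.std(ddof=1)

# Q-Q data: sorted z-scores against theoretical standard normal quantiles.
n = len(z)
probs = (np.arange(1, n + 1) - 0.5) / n
theoretical = np.array([NormalDist().inv_cdf(p) for p in probs])
observed = np.sort(z)

# For normal data the points lie close to the line y = x, so the
# correlation between the two sets of quantiles is close to 1.
qq_corr = np.corrcoef(theoretical, observed)[0, 1]
print(round(qq_corr, 4))
```

Plotting `theoretical` against `observed` as a scatterplot gives the Q-Q plot itself.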

In this video, we will go over how to calculate confidence intervals and interpret them.

   •  Learn how a sample estimate can be used as a proxy for the true population parameter, given a normal distribution

   •  Look at an example of how a confidence interval is calculated in terms of its lower and upper bounds

   •  Learn how to make interpretations of confidence intervals

Confidence Intervals
09:57
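
As a sketch of the lower and upper bounds described above, a 95% confidence interval for a mean under the normal approximation is the sample mean plus or minus the critical value times the standard error. The simulated data and the 95% level are assumptions for this example.

```python
from statistics import NormalDist

import numpy as np

# Simulated sample, assumed here only for illustration.
rng = np.random.default_rng(42)
sample = rng.normal(loc=100.0, scale=15.0, size=200)

confidence = 0.95
z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96

mean = sample.mean()
std_err = sample.std(ddof=1) / np.sqrt(len(sample))

lower = mean - z_crit * std_err
upper = mean + z_crit * std_err
print(f"{confidence:.0%} CI: [{lower:.2f}, {upper:.2f}]")
```

The usual interpretation is that, if the sampling were repeated many times, about 95% of the intervals built this way would contain the true population mean.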

In this video, we will go over the relationship between two variables as measured by correlation.

   •  Understand how correlation is calculated

   •  Apply the correlation function

   •  Look at a correlation matrix and rolling correlations

Correlational Relationship
07:44
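
The three steps above map onto pandas calls roughly as follows. The simulated columns and the rolling window of 20 are arbitrary choices for this sketch.

```python
import numpy as np
import pandas as pd

# Simulated data, assumed here only for illustration.
rng = np.random.default_rng(1)
x = rng.normal(size=100)
df = pd.DataFrame({
    "x": x,
    "y": 2.0 * x + rng.normal(scale=0.5, size=100),  # strongly related to x
    "noise": rng.normal(size=100),                   # unrelated to x
})

r_xy = df["x"].corr(df["y"])   # Pearson correlation by default
corr_matrix = df.corr()        # pairwise correlation matrix
rolling_r = df["x"].rolling(window=20).corr(df["y"])  # correlation per window

print(round(r_xy, 3))
print(corr_matrix.round(2))
```

The first 19 values of `rolling_r` are missing because a full window of 20 observations is not yet available.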
Test your knowledge
5 questions
Running Linear and Logistic Regression
5 lectures 50:19

This video aims to highlight linear regression.

   •  Start with the linear regression equation, which is based on the equation for a line

   •  Estimate a linear regression model and look at the summary results

   •  Provide interpretations and inferences for linear regression

Linear Regression: The Big Picture
09:42
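
A minimal sketch of fitting the line y = intercept + slope * x by ordinary least squares. The course description mentions Statsmodels; NumPy's `polyfit` is swapped in here only to keep the example light on dependencies, and the data are simulated.

```python
import numpy as np

# Simulated data with a known intercept (3) and slope (2).
rng = np.random.default_rng(7)
x = np.linspace(0, 10, 50)
y = 3.0 + 2.0 * x + rng.normal(scale=1.0, size=x.size)

# Fit a degree-1 polynomial; coefficients come back highest degree first.
slope, intercept = np.polyfit(x, y, deg=1)

# R-squared: share of the variance in y explained by the fitted line.
fitted = intercept + slope * x
ss_res = np.sum((y - fitted) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(round(slope, 3), round(intercept, 3), round(r_squared, 3))
```

The slope is read as the expected change in y for a one-unit increase in x.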

This video will demonstrate multivariate linear regression.

   •  Start with the equation for a multivariate linear regression model with explicit statements for the hypothesis test

   •  Interpret the model summary results and predictions

   •  Discuss the residual errors to deepen your understanding of regression analysis

Multivariate Linear Regression
10:26
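
With more than one predictor, the same least-squares idea applies to a design matrix. The sketch below uses `numpy.linalg.lstsq` on simulated data (an assumption for illustration, not the course's code) and computes the residuals mentioned above.

```python
import numpy as np

# Simulated data with known coefficients: intercept 1, x1 -> 2, x2 -> -0.5.
rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.3, size=n)

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones(n), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Residuals are the differences between observed and predicted values.
predictions = X @ coef
residuals = y - predictions

print(coef.round(3))
print(round(residuals.std(ddof=3), 3))
```

Because the model includes an intercept, the residuals average to zero; patterns in them (curvature, changing spread) would suggest the linear model is missing something.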

The goal of this video is to demonstrate logistic regression.

   •  Learn about the logit function, which transforms the probability of a binary outcome into log odds

   •  Estimate a basic logistic regression model

   •  Learn how to make interpretations and inferences, using hypothesis testing and confidence intervals

Logistic Regression: The Big Picture
10:41
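
The logit function and its inverse are short enough to write out. This sketch uses hypothetical helper names (`logit`, `inverse_logit`) and shows that a probability of 0.5 corresponds to log odds of 0.

```python
import numpy as np


def logit(p):
    """Map a probability in (0, 1) to log odds."""
    p = np.asarray(p, dtype=float)
    return np.log(p / (1.0 - p))


def inverse_logit(log_odds):
    """Map log odds back to a probability (the logistic function)."""
    log_odds = np.asarray(log_odds, dtype=float)
    return 1.0 / (1.0 + np.exp(-log_odds))


probs = np.array([0.1, 0.5, 0.9])
log_odds = logit(probs)

print(log_odds.round(3))
print(inverse_logit(log_odds).round(3))
```

A logistic regression models the log odds as a linear function of the predictors, so `inverse_logit` is what turns a fitted linear score back into a predicted probability.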

The purpose of this video is to illustrate multivariate logistic regression.

   •  Check for collinearity among variables

   •  Run multivariate logistic regression and examine statistical significance and coefficient effects

   •  Conduct model prediction and then evaluate the model, using an accuracy score

Multivariate Logistic Regression
10:31

The aim of this video is to address missing data.

   •  Check for missing data in your dataset

   •  Learn how to fill in missing values by assigning a value, by interpolation, or by forward or backward fill

   •  Learn how to drop missing values

Handling Missing Data
08:59
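
Each option listed above corresponds to a pandas method. A minimal sketch on a made-up Series:

```python
import numpy as np
import pandas as pd

# Made-up Series with two missing values.
s = pd.Series([1.0, np.nan, 3.0, np.nan, 5.0])

n_missing = s.isna().sum()       # check for missing data
filled_const = s.fillna(0.0)     # assign a fixed value
filled_interp = s.interpolate()  # linear interpolation between neighbours
filled_ffill = s.ffill()         # carry the last valid value forward
filled_bfill = s.bfill()         # pull the next valid value backward
dropped = s.dropna()             # drop missing values

print(n_missing)
print(filled_interp.tolist())
```

Which option is appropriate depends on why the values are missing; interpolation and forward fill, for example, mostly make sense for ordered data such as time series.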
Test your knowledge
5 questions
Seeing and Understanding Your Data Through Visualization
4 lectures 36:28

In this video, we will visualize summary statistics with Python’s pandas library.

   •  Work with histograms and scatterplots to highlight the data’s distribution

   •  Visualize the summary statistics with boxplots

   •  Learn about the data’s distribution shape, focusing on skewness and kurtosis

Visualizing Summary Statistics with Pandas
10:47
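
Skewness and kurtosis, mentioned above, can be computed alongside the plots. A small sketch with invented data; note that pandas reports excess kurtosis, so a normal distribution scores about 0.

```python
import numpy as np
import pandas as pd

# Invented samples for illustration.
rng = np.random.default_rng(0)
symmetric = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
right_skewed = pd.Series([1.0, 1.0, 2.0, 2.0, 3.0, 10.0])
normal_like = pd.Series(rng.normal(size=5000))

skew_symmetric = symmetric.skew()      # 0 for a symmetric sample
skew_right = right_skewed.skew()       # positive: long right tail
excess_kurtosis = normal_like.kurt()   # near 0 for normal data
summary = normal_like.describe()       # count, mean, std, min, quartiles, max

print(skew_symmetric, round(skew_right, 3), round(excess_kurtosis, 3))
print(summary)
```

Calling `normal_like.plot(kind="hist")` or `normal_like.plot(kind="box")` on the same Series would draw the corresponding histogram or boxplot.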

In this video, you will learn about Python’s data visualization library called Matplotlib.

   •  Create line plots and a bar plot

   •  See how to make a scatterplot to see the relationship between variables

   •  Use histograms and boxplots to better understand the distribution of your data

How to Work with Matplotlib
08:15
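
A minimal Matplotlib sketch of the plot types listed above, drawn on one figure. The data are simulated, and the `Agg` backend line is an assumption that lets the script run without a display; it can be dropped inside a Jupyter notebook.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Simulated data for illustration.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)
y = np.sin(x)
data = rng.normal(size=500)

fig, axes = plt.subplots(2, 2, figsize=(8, 6))

axes[0, 0].plot(x, y)
axes[0, 0].set_title("Line plot")

axes[0, 1].bar(["a", "b", "c"], [3, 7, 5])
axes[0, 1].set_title("Bar plot")

axes[1, 0].scatter(x, y + rng.normal(scale=0.2, size=x.size), s=10)
axes[1, 0].set_title("Scatterplot")

axes[1, 1].hist(data, bins=30)
axes[1, 1].set_title("Histogram")

fig.tight_layout()
```

A boxplot panel could be added in the same way with `Axes.boxplot(data)`.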

The aim of this video is to apply Python’s data visualization library called seaborn.

   •  Create line plots and scatter plots

   •  Use category plots and distribution plots with focus on facet grids, pair grids, and pair plots

   •  See how to work with color palettes to highlight your data

Using Seaborn for Data Visualization
09:35

The aim of this video is to address how to handle outliers in your data set.

   •  Learn how to drop outliers

   •  Learn how to flag outliers with a value so that you can keep track of them

   •  Rescale with a log transformation to retain outliers in your data

Handling Outliers
07:51
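
The three options above can be sketched with pandas. How the course identifies outliers is not stated here; the 1.5 * IQR rule used below is a common rule of thumb and is an assumption of this example, as is the made-up data.

```python
import numpy as np
import pandas as pd

# Made-up data with one obvious outlier.
s = pd.Series([10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 100.0])

# Flag values more than 1.5 * IQR beyond the quartiles.
q1, q3 = s.quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (s < q1 - 1.5 * iqr) | (s > q3 + 1.5 * iqr)

# Option 1: drop the outliers.
dropped = s[~is_outlier]

# Option 2: keep every row and flag the outliers to keep track of them.
flagged = pd.DataFrame({"value": s, "outlier": is_outlier})

# Option 3: keep every row but compress large values with a log transform.
log_scaled = np.log(s)

print(flagged)
print(log_scaled.round(2).tolist())
```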
Test your knowledge
5 questions
Updating Beliefs with Bayesian Inference
4 lectures 34:04

In this video, we will learn about Bayes’ theorem.

   •  Understand the four components of Bayes’ theorem, namely the prior, the likelihood, the posterior, and the evidence

   •  Go through a classic example to see how to apply Bayes’ theorem

   •  Review how to input the probabilities for the four components of Bayes’ theorem

Understanding Bayes’ Theorem
09:00
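
A worked sketch of Bayes' theorem using the familiar diagnostic-test setup. The numbers (1% prevalence, 95% sensitivity, 5% false positive rate) and the function name `posterior` are illustrative assumptions, not taken from the course.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) via Bayes' theorem for a binary hypothesis H.

    prior               -- P(H)
    likelihood          -- P(E | H)
    false_positive_rate -- P(E | not H)
    """
    # Evidence: total probability of observing E under H and under not-H.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Illustrative numbers: 1% prevalence, 95% sensitivity, 5% false positives.
p_disease_given_positive = posterior(
    prior=0.01, likelihood=0.95, false_positive_rate=0.05
)
print(round(p_disease_given_positive, 3))
```

Even with an accurate test, the posterior here is only about 16%, because the condition is rare to begin with.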

In this video, we will explore how to perform statistical hypothesis testing with focus on frequentist and Bayesian inference.

   •  Get introduced to the idea of the plausibility of a hypothesis, given the data

   •  See how frequentists approach hypothesis testing

   •  Understand the Bayesian approach to hypothesis testing

How to Perform Statistical Hypothesis Testing
07:30
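
To contrast the two approaches, the sketch below asks whether a coin is fair after 60 heads in 100 flips. The data, the uniform prior, and the use of posterior sampling are all assumptions made for this illustration.

```python
from math import comb

import numpy as np

heads, flips = 60, 100  # made-up data

# Frequentist: exact two-sided binomial p-value under H0: p = 0.5.
# By symmetry this is twice the upper-tail probability P(X >= heads).
upper_tail = sum(comb(flips, k) for k in range(heads, flips + 1)) / 2 ** flips
p_value = min(1.0, 2 * upper_tail)

# Bayesian: a uniform Beta(1, 1) prior updated with the data gives a
# Beta(1 + heads, 1 + tails) posterior for the probability of heads.
rng = np.random.default_rng(0)
posterior_draws = rng.beta(1 + heads, 1 + flips - heads, size=100_000)
prob_biased_to_heads = (posterior_draws > 0.5).mean()

print(round(p_value, 4), round(prob_biased_to_heads, 3))
```

The p-value is the probability of data at least this extreme if the coin were fair, while the Bayesian quantity is the posterior probability that the coin favours heads, given the data and the prior.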

In this video, we will build a Bayesian linear regression model.

   •  Create simulated data

   •  Build a Bayesian linear regression model

   •  Look at posterior plots and regression lines

Bayesian Statistics with Linear Regression
07:19

In this video, we will go over Bayesian statistics with logistic regression.

   •  Read the data and perform exploratory data analysis

   •  Run a Bayesian logistic regression model

   •  Review traceplots and posterior predictive plots

Bayesian Statistics with Logistic Regression
10:15
Test your knowledge
5 questions
Requirements
  • Please note that prior knowledge of Python programming and some familiarity with pandas and NumPy are needed in order to get the best out of this course.
Description

Statistics and visualization in Python can be applied to a wide variety of areas; having these skills is crucial for data scientists. In this course, you will explore several core statistical concepts for working with data, construct confidence intervals in Python and assess the results, discover correlations, and update your beliefs using Bayesian inference.

In this tutorial, you will discover how to use the Statsmodels, Matplotlib, pandas, and Seaborn Python libraries for statistical data visualization. Follow along with the author, Dr. Karen Yang, a seasoned data scientist and data engineer, to explore, learn, and strengthen your skills in fundamental statistics and visualization. This course uses the Jupyter Notebook environment to execute tasks.

By the end of this learning journey, you'll have developed a solid understanding of fundamental statistics and visualization concepts and will be confident enough to apply them to your data analysis projects.

About the Author

Karen Yang has been a data engineer, an author, and a passionate computer science self-learner for 7 years. She has 6 years' experience in Python programming and big data processing. Her recent interests include cloud computing.

She holds a PhD in Political Science from Ohio State University and loves working with data to gather meaningful information by performing analysis and research. This interest led her to publish data analysis research papers on Inferential Data Analysis on Tooth Growth and Predicting Activity for Samsung SensorData. She is also a published author of the 'Apache Spark in 7 Days' course.

Who this course is for:
  • This course is for Python programmers who want to master essential statistics and visualization concepts in Python and learn to visualize data effectively with multiple visualization tools.