Fundamentals of Statistics and Visualization in Python
- 3.5 hours on-demand video
- Basic concepts in statistics and data visualization
- Use Python's data visualization tools to plot and explore data
- Apply probability to statistics with the use of Bayesian Inference, a powerful alternative to classical statistics
- Calculate and build confidence intervals in Python
- Run basic linear and multiple linear regressions
- Run hypothesis tests and perform Bayesian inference for effective analysis and visualization
- Apply probability to statistics by updating beliefs
The goal of this video is to highlight the key aspects of statistics and visualization.
• Explore the basics of descriptive statistics as a summary of data
• Understand inferential statistics to make inferences or claims about data
• Explore data visualization to represent data
In this video, we will learn the numerical analysis of descriptive statistics to summarize your data.
• Understand the measures of central tendency to identify the typical, middle, and most frequent values
• Learn about ranges and percentiles to determine the spread of the data distribution
• Understand measures of spread to see how far the average data point deviates from the center
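The measures listed above can be sketched with Python's built-in statistics module. This is a minimal illustration rather than the course's own code: the sample values are made up, and the course itself works with pandas and NumPy instead of the standard library.

```python
import statistics

# Hypothetical sample, made up for illustration only.
data = [2, 4, 4, 5, 7, 9, 11]

# Central tendency: typical, middle, and most frequent values.
mean_value = statistics.mean(data)
median_value = statistics.median(data)
mode_value = statistics.mode(data)

# Range and quartiles describe how widely the values are spread.
data_range = max(data) - min(data)
q1, q2, q3 = statistics.quantiles(data, n=4)

# Sample variance and standard deviation measure deviation from the mean.
sample_variance = statistics.variance(data)
sample_std = statistics.stdev(data)

print("mean:", mean_value, "median:", median_value, "mode:", mode_value)
print("range:", data_range, "quartiles:", (q1, q2, q3))
print("variance:", sample_variance, "std dev:", round(sample_std, 3))
```

With pandas, a call such as describe() on a Series reports several of these values at once, although its quartile method may differ slightly from the one used here.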
In this video, we will go over how to calculate confidence intervals and interpret them.
• Learn how a sample estimate can be used as a proxy for the true population parameter, given a normal distribution
• Look at an example of how a confidence interval is calculated in terms of its lower and upper bounds
• Learn how to make interpretations of confidence intervals
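As a rough sketch of the calculation described above, the lower and upper bounds of a confidence interval for a mean can be computed with the standard library. The function name mean_confidence_interval and the sample values are hypothetical, and the sketch assumes a normal approximation; for small samples a t-distribution is usually preferred.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return (lower, upper) bounds for the mean using a normal approximation."""
    center = mean(sample)
    # Standard error of the mean: sample standard deviation over sqrt(n).
    standard_error = stdev(sample) / sqrt(len(sample))
    # Critical value, for example about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return center - margin, center + margin


# Hypothetical measurements, made up for illustration only.
heights_cm = [168.2, 171.5, 169.9, 174.3, 166.8, 172.1, 170.4, 173.0]
lower, upper = mean_confidence_interval(heights_cm)
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")
```

One common interpretation: if the sampling were repeated many times, about 95% of the intervals built this way would contain the true population mean.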
This video aims to highlight linear regression.
• Start with the linear regression equation, which is based on the equation for a line
• Estimate a linear regression model and look at the summary results
• Provide interpretations and inferences for linear regression
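The equation for a line, y = intercept + slope * x, can be estimated by ordinary least squares in a few lines. The sketch below uses only the standard library so that each step is visible; the course description mentions Statsmodels, which would also report standard errors and p-values. The helper names and the data are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Estimate slope and intercept by ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    y_bar = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data: hours studied and exam score.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, score)
fit_quality = r_squared(hours, score, slope, intercept)
print(f"score = {intercept:.1f} + {slope:.1f} * hours, R^2 = {fit_quality:.3f}")
```

Here the slope is read as the expected change in score for one additional hour, holding nothing else in the model because there is only one predictor.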
This video will demonstrate multivariate linear regression.
• Start with the equation for a multivariate linear regression model with explicit statements for the hypothesis test
• Interpret the model summary results and predictions
• Finally, a discussion of the residual errors will further enhance your understanding of regression analysis
The goal of this video is to demonstrate logistic regression.
• Learn about the logit function, which maps the probability of a binary outcome to log odds
• Estimate a basic logistic regression model
• Learn how to make interpretations and inferences, using hypothesis testing and confidence intervals
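The logit transformation mentioned above can be illustrated directly. This is a minimal sketch using the standard math module; the function names logit and inverse_logit are chosen here for illustration and are not taken from the course.

```python
import math


def logit(p):
    """Map a probability in (0, 1) to log odds."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1 - p))


def inverse_logit(log_odds):
    """Map log odds back to a probability (the logistic function)."""
    return 1 / (1 + math.exp(-log_odds))


# A probability of 0.8 corresponds to odds of 4 to 1.
log_odds = logit(0.8)
print("log odds:", round(log_odds, 3))
print("back to probability:", round(inverse_logit(log_odds), 3))
```

A logistic regression coefficient is expressed on this log-odds scale, so exponentiating it gives an odds ratio.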
The purpose of this video is to illustrate multivariate logistic regression.
• Check for collinearity among variables
• Run multivariate logistic regression and examine statistical significance and coefficient effects
• Conduct model prediction and then evaluate the model, using an accuracy score
In this video, we will visualize summary statistics with Python’s pandas library.
• Work with histograms and scatterplots to highlight the data’s distribution
• Visualize the summary statistics with boxplots
• Learn about the data’s distribution shape, focusing on skewness and kurtosis
In this video, you will learn about Python’s data visualization library called Matplotlib.
• Create line plots and a bar plot
• See how to make a scatterplot to see the relationship between variables
• Use histograms and boxplots to better understand the distribution of your data
The aim of this video is to apply Python’s data visualization library called seaborn.
• Create line plots and scatter plots
• Use category plots and distribution plots with focus on facet grids, pair grids, and pair plots
• See how to work with color palettes to highlight your data
In this video, we will learn about Bayes’ theorem.
• Understand the four components of Bayes’ theorem, namely the prior, the likelihood, the posterior, and the evidence
• Go through a classic example to see how to apply Bayes’ theorem
• Review how to input the probabilities for the four components of Bayes’ theorem
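The four components can be tied together in a short worked example. The sketch below uses a diagnostic-test scenario, which is one classic illustration and not necessarily the example used in the video; all probabilities are hypothetical.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(hypothesis | positive evidence) using Bayes' theorem."""
    # Evidence: total probability of observing a positive result.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    # Posterior = likelihood * prior / evidence.
    return likelihood * prior / evidence


# Hypothetical numbers: 1% of people have the condition, the test detects
# it 95% of the time, and it gives a false positive 5% of the time.
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(f"P(condition | positive test) = {posterior:.3f}")
```

Even with an accurate test, the posterior stays well below the likelihood because the prior is small, which is the usual point of this kind of example.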
In this video, we will explore how to perform statistical hypothesis testing with focus on frequentist and Bayesian inference.
• Get introduced to the idea of the plausibility of a hypothesis, given the data
• See how frequentists approach hypothesis testing
• Understand the Bayesian approach to hypothesis testing
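To make the contrast concrete, the sketch below asks whether a coin is fair after 15 heads in 20 flips, once with a frequentist exact binomial test and once with a Bayesian posterior. The scenario, the function names, the uniform prior, and the grid approximation are all assumptions made for illustration, not the course's method.

```python
from math import comb


def binomial_two_sided_p_value(successes, trials, p_null=0.5):
    """Exact two-sided p-value: total probability of outcomes that are
    no more likely than the observed one under the null hypothesis."""
    pmf = [
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = pmf[successes]
    # Small tolerance guards against floating-point ties.
    return sum(p for p in pmf if p <= observed * (1 + 1e-9))


def posterior_prob_above(successes, trials, threshold=0.5, grid_size=1000):
    """Posterior P(theta > threshold) under a uniform prior,
    approximated on a grid of candidate theta values."""
    thetas = [(i + 0.5) / grid_size for i in range(grid_size)]
    # With a uniform prior, the posterior is proportional to the likelihood.
    weights = [t**successes * (1 - t) ** (trials - successes) for t in thetas]
    total = sum(weights)
    return sum(w for t, w in zip(thetas, weights) if t > threshold) / total


heads, flips = 15, 20
p_value = binomial_two_sided_p_value(heads, flips)
prob_biased = posterior_prob_above(heads, flips)
print(f"Frequentist p-value under a fair coin: {p_value:.4f}")
print(f"Bayesian P(theta > 0.5 | data): {prob_biased:.4f}")
```

The p-value describes how surprising the data are if the coin is fair, while the posterior probability is a direct statement about the parameter given the data and the chosen prior.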
- Please note that prior knowledge of Python programming and some familiarity with pandas and NumPy are needed in order to get the best out of this course.
Statistics and visualization in Python can be applied to a wide variety of areas, and these skills are crucial for data scientists. In this course, we explore several core statistical concepts for working with data, construct confidence intervals in Python and assess the results, discover correlations, and update beliefs using Bayesian inference.
In this tutorial, you will discover how to use the Statsmodels, Matplotlib, pandas, and Seaborn Python libraries for statistical data visualization. Follow along with the author, Dr. Karen Yang, a seasoned data scientist and data engineer, to explore, learn, and strengthen your skills in fundamental statistics and visualization. This course uses the Jupyter Notebook environment to execute tasks.
By the end of this learning journey, you'll have developed a solid understanding of fundamental statistics and visualization concepts and will be confident enough to apply them to your data analysis projects.
About the Author
Karen Yang has been a data engineer, an author, and a passionate computer science self-learner for 7 years. She has 6 years' experience in Python programming and big data processing. Her recent interests include cloud computing.
She holds a PhD in Political Science from Ohio State University and loves working with data to gather meaningful information by performing analysis and research. This interest led her to publish data analysis research papers on Inferential Data Analysis on Tooth Growth and Predicting Activity for Samsung SensorData. She is also a published author of the 'Apache Spark in 7 Days' course.
- This course is for Python programmers who want to master essential statistics and visualization concepts using the Python programming language and are keen to learn to perform visualization effectively in conjunction with multiple visualization tools.