Data Manipulation in Python: A Pandas Crash Course
4.7 (264 ratings)
Course Ratings are calculated from individual students’ ratings and a variety of other signals, like age of rating and reliability, to ensure that they reflect course quality fairly and accurately.
2,173 students enrolled

Learn how to use Python and Pandas for data analysis and data manipulation. Transform, clean and merge data with Python.
Bestseller
Last updated 8/2020
English
English [Auto]
Current price: $139.99 Original price: $199.99 Discount: 30% off
30-Day Money-Back Guarantee
This course includes
  • 9 hours on-demand video
  • 2 articles
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What you'll learn
  • Visualise data using methods from histograms to dimensionality reduction.
  • Create, save and serialise data frames in and out of multiple formats.
  • Clean and format data easily.
  • Detect and intelligently fill missing values.
  • Group, aggregate and summarise your data.
  • Merge data sources into a beautiful whole.
  • Pivot and cross-tabulate data like a pro.
  • Intersplice, summarise and investigate time series data.
  • Seamlessly work with data from different time zones.
  • Learn the common pitfalls and traps that ensnare beginners and how to avoid them.
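
To make a couple of these outcomes concrete, here is a minimal sketch of group-aware missing-value filling and time zone conversion. It is not taken from the course materials: the city and temperature data, the column names and the chosen time zone are all hypothetical, and filling with the group mean is just one reasonable strategy.

```python
import numpy as np
import pandas as pd

# Hypothetical data: temperature readings with gaps (not from the course).
df = pd.DataFrame({
    "city": ["Brisbane", "Brisbane", "Brisbane", "Perth", "Perth", "Perth"],
    "temp": [25.0, np.nan, 27.0, 30.0, 32.0, np.nan],
})

# Detect missing values: count the NaNs in each column.
missing_counts = df.isna().sum()

# Fill each gap with the mean of its own city rather than a global mean.
city_means = df.groupby("city")["temp"].transform("mean")
df["temp_filled"] = df["temp"].fillna(city_means)

# Group, aggregate and summarise the filled values.
summary = df.groupby("city")["temp_filled"].agg(["mean", "max"])

# Time zones: mark naive timestamps as UTC, then convert to local time.
stamps = pd.Series(pd.to_datetime(["2020-08-01 09:00", "2020-08-01 18:00"]))
utc = stamps.dt.tz_localize("UTC")
brisbane = utc.dt.tz_convert("Australia/Brisbane")

print(missing_counts)
print(summary)
print(brisbane)
```

Using `transform` keeps the group means aligned with the original rows, so `fillna` can match each gap to the mean of its own group.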
Course content
57 lectures, 08:46:51 total length
+ Introduction
6 lectures 27:26
BONUS: Learning Path
00:34
Setting up Python and editors
09:33
Live Install
06:30
Get the materials
00:04
+ Dataset Basics
6 lectures 44:51
Finding Datasets
04:29
Jupyter Notebooks and Loading Data
12:14
Pandas vs Numpy
07:38
Creating DataFrames
04:20
Saving and Serialising
09:13
Inspecting DataFrames
06:57
+ Visual exploration
7 lectures 01:10:26
Introduction and super basic plots
10:04
Pandas vs Matplotlib
08:56
Visualising 1D distributions
13:12
Visualising 2D distributions
14:16
Styling Pandas Table outputs
08:56
Higher dimension visualisations
13:08
Summary
01:54
+ Basic Data Manipulations
6 lectures 01:03:51
Introduction, Labelling and Ordering
12:21
Slicing and Filtering
13:11
Replacing and Thresholding
07:07
Removing and adding data
13:57
Apply, map and vectorised functions
14:44
Summary
02:31
+ Grouping
5 lectures 37:37
Introduction and motivation
01:42
Basic grouping syntax
13:30
Intelligent imputation
10:19
Grouping aggregation
08:56
Summary
03:10
+ Merging
4 lectures 41:38
Introduction and basic syntax
14:00
Different types of merging
16:14
Helpful merging functions
09:10
Summary
02:14
+ Advanced Manipulation - MultiIndex, Pivoting and more
8 lectures 01:32:35
Introduction and basic MultiIndexes
12:20
MultiIndex II - MultiIndex Strikes Back
13:29
Stacking and Unstacking
13:30
Pivoting
15:45
Pivot Margins
15:29
Crosstab
09:25
Melting
07:26
Summary
05:11
+ Time Series Data
6 lectures 57:14
Introduction and the Datetime Index
09:47
Reindexing
11:22
Resampling
10:50
Rolling functions
12:08
Summary
03:28
+ Conclusion
9 lectures 01:31:13
A recap and a thank you
05:09
Extra - Customising Jupyter Notebooks
08:54
Extra - Chapter 2 Data Runthrough
04:46
Extra - Chapter 3 Visualisation Runthrough
14:47
Extra - Chapter 4 Basics Runthrough
06:50
Extra - Chapter 5 Grouping Runthrough
14:13
Extra - Chapter 6 Merging Runthrough
11:25
Extra - Chapter 7 Advanced Runthrough
12:44
Extra - Chapter 8 TimeSeries Runthrough
12:25
Requirements
  • Basic knowledge of Python
Description

In the real world, data is anything but clean, which is why Python libraries like Pandas are so valuable.


If data manipulation is setting your data analysis workflow behind, then this course is the key to taking your power back.


Own your data; don’t let your data own you!


When data manipulation and preparation account for up to 80% of your work as a data scientist, it is essential to learn data munging techniques that take raw data to a final product for analysis as efficiently as possible.


Data analysis with the Python library Pandas makes it easier for you to achieve better results, increase your productivity, spend more time problem-solving and less time data-wrangling, and communicate your insights more effectively.


This course prepares you to do just that!


With the Pandas DataFrame, you will learn advanced data manipulation, preparation, sorting, blending, and data cleaning approaches that turn chaotic bits of data into a final pre-analysis product. This is why Pandas is one of the most popular Python libraries in data science, used by data scientists at Google, Facebook, JP Morgan, and many other major companies that analyze data.


If you want to learn how to use Pandas efficiently to manipulate, transform, pivot, stack, merge and aggregate your data in preparation for visualization, statistical analysis, or machine learning, then this course is for you.


Here’s what you can expect when you enroll with your instructor, Samuel Hinton, Ph.D.:


  • Learn common and advanced Pandas data manipulation techniques to take raw data to a final product for analysis as efficiently as possible.

  • Achieve better results by spending more time problem-solving and less time data-wrangling.

  • Learn how to shape and manipulate data to make statistical analysis and machine learning as simple as possible.

  • Utilize the latest version of Python and the industry-standard Pandas library.

Performing data analysis with Python’s Pandas library can help you do a lot, but it does have its downsides. And this course helps you beat them head-on:


1. Pandas has a steep learning curve: As you dive deeper into the Pandas library, the learning slope becomes steeper and steeper. This course guides beginners and intermediate users smoothly into every aspect of Pandas.


2. Inadequate documentation: Without proper documentation, it’s difficult to learn a new library. When it comes to advanced functions, Pandas documentation is rarely helpful. This course helps you grasp advanced Pandas techniques easily and saves you time in searching for help.


After this course, you will feel comfortable delving into complex and heterogeneous datasets knowing with absolute confidence that you can produce a useful result for the next stage of data analysis.


Here’s a closer look at the curriculum:

  • Loading and creating Pandas DataFrames

  • Displaying your data with basic plots, and 1D, 2D and multidimensional visualizations.

  • Performing basic DataFrame manipulations: indexing, labeling, ordering, slicing, filtering and more.

  • Performing advanced Pandas DataFrame manipulations: MultiIndexing, stacking, hierarchical indexing, pivoting, melting and more.

  • Carrying out DataFrame grouping: aggregation, imputation, and more.

  • Mastering time series manipulations: reindexing, resampling, rolling functions, method chaining and filtering, and more.

  • Merging Pandas DataFrames
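
As a rough illustration of three of the topics above (merging, pivoting with margins, and resampling), the following sketch uses a small made-up dataset. The table names, columns and values are hypothetical and are not drawn from the course exercises.

```python
import pandas as pd

# Hypothetical sales and store tables (illustrative only).
sales = pd.DataFrame({
    "store_id": [1, 1, 2, 2, 3],
    "quarter": ["Q1", "Q2", "Q1", "Q2", "Q1"],
    "revenue": [100, 120, 80, 90, 50],
})
stores = pd.DataFrame({
    "store_id": [1, 2],
    "region": ["North", "South"],
})

# Left merge keeps every sale; indicator=True adds a "_merge" column
# recording whether each row found a match in `stores`.
merged = sales.merge(stores, on="store_id", how="left", indicator=True)
unmatched = merged[merged["_merge"] == "left_only"]
matched = merged[merged["_merge"] == "both"]

# Pivot the matched rows: regions as rows, quarters as columns,
# with margins=True adding "All" row and column totals.
pivot = matched.pivot_table(
    index="region", columns="quarter", values="revenue",
    aggfunc="sum", margins=True,
)

# Resample a small half-daily time series down to daily totals.
index = pd.to_datetime([
    "2020-08-01 00:00", "2020-08-01 12:00",
    "2020-08-02 00:00", "2020-08-02 12:00",
])
ts = pd.Series([1.0, 2.0, 3.0, 4.0], index=index)
daily = ts.resample("D").sum()

print(unmatched)
print(pivot)
print(daily)
```

Checking the `_merge` column before aggregating is one way to notice rows that silently failed to match, such as the sale from store 3 here.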

Lastly, this course comes with a cheat sheet and practical exercises based on real-life examples. So not only will you learn the theory, but you will also get hands-on practice with Pandas.

Who this course is for:
  • Python students who want to learn how to manipulate data professionally.
  • Aspiring data analysts and scientists looking to upgrade their skillset.
  • People who would prefer to spend more time solving interesting problems than formatting data.
  • Old hands at programming who want to see what new methods and industry-leading tools are at their fingertips in the new decade.