Udemy
Data Manipulation with Pandas Masterclass
Rating: 4.3 out of 5 (34 ratings)
1,535 students

Learn the main functions of Pandas for data analysis and visualization in less than 2 hours. Theory and hands-on.
Last updated 5/2021
English

What you'll learn

  • This is a short masterclass in Pandas, the most famous library for data manipulation in Python.
  • You will learn what Pandas is, and how it can help you load, manage, and transform tabular data.
  • Learn to analyze real world data using Python & Pandas.
  • Import data from multiple sources, clean, reshape, impute and visualize your data.
  • Use Python and Pandas to select, group and summarize your data.
  • Decide what data to keep and what to ignore.
  • Create compelling visualizations using Seaborn and Matplotlib.

Course content

2 sections • 26 lectures • 1h 6m total length
  • Introduction (4:02)
  • Agenda (0:49)
  • Tabular Data (6:19)
  • Data Manipulation with Pandas (2:33)
  • Data Structures (1:25)

    Examine the two core data structures in Pandas: the Series, which is one-dimensional with an index, and the DataFrame, which is two-dimensional with rows and columns. Learn how selecting a single column from a DataFrame returns a Series, so both structures work within one unified framework.
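
    As a rough illustration of the relationship between the two structures, here is a minimal sketch. The passenger names and ages are invented for this example and are not taken from the course material.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame(
    {"name": ["Ann", "Bob", "Cid"], "age": [29, 41, 35]},
    index=["a", "b", "c"],
)

ages = df["age"]          # single brackets with one name -> Series
ages_frame = df[["age"]]  # a list of names -> DataFrame with one column

print(type(ages).__name__)          # Series
print(type(ages_frame).__name__)    # DataFrame
print(ages.index.equals(df.index))  # True: the Series keeps the row index
```

    Selecting with a list of column names, even a list of one, keeps the result two-dimensional, which is a common source of confusion when chaining operations.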

  • Pandas IO (1:16)
  • Selections & Filters (3:17)
  • Question: Numpy & Pandas (1:29)
  • Question: Indexes (0:55)

    Explain zero-indexed range notation in Pandas, where the start of a range is inclusive and the end is exclusive, using rows nine to twenty-five and columns two to five as the example.
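
    The range described above can be sketched as follows. This assumes positional selection with `iloc`, which the listing does not name explicitly, and uses a made-up frame of consecutive integers.

```python
import numpy as np
import pandas as pd

# Hypothetical 30x6 frame of consecutive integers, for illustration only.
df = pd.DataFrame(np.arange(180).reshape(30, 6), columns=list("ABCDEF"))

# Positional slicing: start is included, end is excluded.
subset = df.iloc[9:25, 2:5]

print(subset.shape)                       # (16, 3)
print(subset.index[0], subset.index[-1])  # 9 24
print(list(subset.columns))               # ['C', 'D', 'E']

# Label-based slicing with loc includes BOTH endpoints.
by_label = df.loc[9:25]
print(len(by_label))                      # 17
```

    The contrast with `loc` is worth keeping in mind: the same numbers select one extra row when they are treated as labels rather than positions.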

  • Feature Engineering (1:16)
  • Aggregations (2:42)

    Explore Pandas aggregations and summary statistics, including mean calculations, group by operations, multiple aggregations with agg, and counting with value_counts for passenger class analysis.
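
    A minimal sketch of these aggregation calls is shown below. The column names (`pclass`, `fare`, `age`) and the values are assumptions made for this example; they are not the course dataset.

```python
import pandas as pd

# Hypothetical passenger records, invented for illustration only.
df = pd.DataFrame(
    {
        "pclass": [1, 1, 2, 3, 3, 3],
        "fare": [80.0, 60.0, 20.0, 8.0, 7.0, 9.0],
        "age": [38, 35, 27, 22, 30, 26],
    }
)

# Mean of a single column.
mean_fare = df["fare"].mean()

# Split-apply-combine: mean fare within each passenger class.
mean_by_class = df.groupby("pclass")["fare"].mean()

# Several aggregations at once, using named aggregation with agg.
summary = df.groupby("pclass").agg(
    mean_fare=("fare", "mean"),
    max_age=("age", "max"),
)

# Number of rows per passenger class, most frequent first.
class_counts = df["pclass"].value_counts()

print(mean_by_class)
print(summary)
print(class_counts)
```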

  • Sort & Pivot (2:30)

    Sort values with sort_values by column and order, then pivot data between long and wide formats and build pivot tables with index, columns, and mean aggregation.
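
    The sorting and pivoting steps could look roughly like this. The city temperatures are invented for the example, and using `melt` to go back to long format is one possible choice rather than necessarily the approach taken in the lecture.

```python
import pandas as pd

# Hypothetical long-format data, invented for illustration only.
long_df = pd.DataFrame(
    {
        "city": ["Rome", "Rome", "Oslo", "Oslo", "Oslo"],
        "year": [2020, 2021, 2020, 2021, 2021],
        "temp": [16.0, 17.0, 6.0, 7.0, 9.0],
    }
)

# Sort by one column, highest value first.
by_temp = long_df.sort_values(by="temp", ascending=False)

# Long -> wide: one row per city, one column per year, mean of duplicates.
wide = long_df.pivot_table(
    index="city", columns="year", values="temp", aggfunc="mean"
)

# Wide -> long again.
back_to_long = wide.reset_index().melt(
    id_vars="city", var_name="year", value_name="temp"
)

print(by_temp)
print(wide)
print(back_to_long)
```

    `pivot_table` is used here instead of `pivot` because the sample data has two readings for Oslo in 2021, and `pivot` raises an error on duplicate index and column pairs.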

  • Joins (1:50)
  • Time Series (2:04)
  • Question: in memory (0:52)
  • Other Commands (2:31)
  • Data Visualization (3:46)

Requirements

  • Previous experience programming in Python is advised to make best use of the masterclass.
  • Some prior experience with tabular data formats such as CSV or Excel is also encouraged.

Description

This masterclass introduces you to concepts and practices for building compelling analyses and dashboards on datasets of any size. It is designed to be self-contained and to be completed quickly in a single session. It takes you from zero knowledge of Pandas to understanding how the library operates and using it in several different scenarios.

You will learn:

  • What tabular data is and where you find it

  • How Pandas allows you to load from, and save to, multiple data formats

  • How to use the two main components of Pandas: the Series and the DataFrame

  • The main methods to select, group and summarize your data using Pandas

  • How to perform complex operations such as pivot tables and split-apply-combine

  • How to create compelling visualizations using Seaborn and Matplotlib directly from Pandas
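
To give a flavor of the loading and saving point in the list above, here is a minimal sketch that moves a small table between CSV and JSON. The data is invented, and in-memory text buffers stand in for real files so the snippet runs anywhere, including a hosted notebook.

```python
import io

import pandas as pd

# Hypothetical CSV content, invented for illustration only.
csv_text = "name,age\nAnn,29\nBob,41\n"

# Load from CSV (a file path or URL would work in place of the buffer).
df = pd.read_csv(io.StringIO(csv_text))

# Save to JSON, one object per row, then load it back.
json_text = df.to_json(orient="records")
df_from_json = pd.read_json(io.StringIO(json_text))

# Save back to CSV without writing the row index as a column.
csv_out = df.to_csv(index=False)
df_from_csv = pd.read_csv(io.StringIO(csv_out))

print(json_text)
print(df_from_json)
```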

The masterclass is designed to maximize the learning experience for everyone and includes 50% theory and 50% hands-on practice. It includes a lab with hands-on exercises and solutions.

No software installation is required. You can run the code on Google Colab and get started right away.

This class is the fastest way to get up to speed in Pandas.

Why Pandas?

Pandas is the most famous data manipulation library, and it is used by millions of people every day to analyze and manipulate large datasets. It is mature, robust, and easy to use, and it has extensive documentation, so it is a good entry point for beginners and pros alike.

Who this course is for:

  • Python enthusiasts who want to deepen their knowledge of data analysis, data manipulation and data visualization.
  • Analysts in finance, insurance, and consulting who are proficient in Excel and want to start migrating to Python and Pandas to scale their work.