Data Analysis with Pandas and Python [2026]

Name: Data Analysis with Pandas and Python [2026]
Rating: 4.7 (26497 reviews)

Analyze data quickly and easily with Python's powerful pandas library! All datasets included --- beginners welcome!

Bestseller

Created by Boris Paskhaver

Last updated 6/2026

English

Bulgarian [Auto], Czech [Auto]

What you'll learn

Perform a multitude of data operations in Python's popular pandas library including grouping, pivoting, joining and more!
Learn hundreds of methods and attributes across numerous pandas objects
Possess a strong understanding of manipulating 1D, 2D, and 3D data sets
Resolve common issues in broken or incomplete data sets

Coding Exercises

This course includes our updated coding exercises so you can practice your skills as you learn.

Course content

16 sections • 155 lectures • 17h 43m total length

Introduction to the Course • 11:10
Welcome to Data Analysis with Pandas and Python! In this lesson, we'll introduce the pandas library, the Python language, the structure of the course, the prerequisites, and the setup process.
[macOS] Intro to the Terminal • 8:19
In this lesson, we introduce the Terminal application for issuing commands to the system via a command line. We also introduce the ls, pwd, cd, and clear commands and the Tab Autocompletion features.
[macOS] Install uv, a Python package and project manager • 4:11
In this lesson, we install the uv command-line tool for managing Python projects and dependencies. We also set up autocompletion for uv commands and discuss how to uninstall the tool.
[macOS] Download Course Materials and Setup Project • 5:05
It's time to get the course materials! In this lesson, we download the course repository and set up Python and our dependencies (Pandas, JupyterLab, and more) with the uv sync command.
[Windows] Intro to PowerShell • 7:56
In this lesson, we introduce the PowerShell/Terminal application for issuing commands to the system via a command line. We also introduce the ls, pwd, cd, and clear commands and the Tab Autocompletion features.
[Windows] Install uv, a Python package and project manager • 5:08
In this lesson, we install the uv command-line tool for managing Python projects and dependencies. We also set up autocompletion for uv commands and discuss how to uninstall the tool.
[Windows] Download Course Materials and Setup Project • 5:41
It's time to get the course materials! In this lesson, we download the course repository and set up Python and our dependencies (Pandas, JupyterLab, and more) with the uv sync command.
Jupyter Lab Startup and Shutdown • 9:25
In this lesson, we walk through the process of starting up and shutting down Jupyter Lab, our coding environment. We open some sample Jupyter Notebooks and describe how a Python server runs continuously in the background, waiting to execute the contents of a code cell.
Intro to Jupyter Lab • 12:38
In this lesson, we walk through the Jupyter Lab interface. A Notebook consists of cells, which can have different types (such as Markdown). We introduce some common actions like adding cells, deleting cells, restarting the kernel, and more.
Setting Up Ruff Formatter in Jupyter Lab • 2:16
In this lesson, we configure our Jupyter Notebook settings to enable Ruff, a code formatter that styles our Python code.
Import Libraries into Jupyter Lab • 3:28
To conserve memory, Jupyter won't load Python modules into your Notebook automatically. In this lesson, we use the import keyword to bring the pandas library into a Notebook. We also talk about assigning aliases with the as keyword.
Installation and Setup

Comments • 6:55
A comment is a line ignored by the Python interpreter when the program/cell runs. Declare a comment with a hashtag (#) symbol.
Data Types • 10:19
In this lesson, we introduce common data types in Python including integers, floating-points, strings, Booleans, and None.
Operators • 12:28
In this lesson, we discuss common mathematical and logical operators including addition, subtraction, multiplication, two types of division, concatenation, and modulo.
Equality and Inequality Operators • 8:52
In this lesson, we focus on the equality (==) and inequality (!=) operators for comparing two values against each other.
Variables • 7:01
A variable is a name we assign to a value in our program. In this lesson, we practice declaring variables and discuss Python community conventions for naming them.
Declare Variables
Built-In Functions • 12:03
A function is a reusable procedure, a sequence of steps to follow in order. In this lesson, we introduce Python's built-in functions and the syntax for invoking them. We cover len, str, int, and more.
Built-in Functions
Custom Functions • 15:27
Now it's time to build our own custom functions! In this lesson, we define a custom temperature conversion function from start to finish.
Custom Functions
String Methods • 18:23
A method is a function attached to an object. It's a command or action we can ask the object to take. In this lesson, we explore some common string methods including lower, upper, strip, and replace.
String Methods
Lists • 13:22
A list is a mutable collection of ordered values. In this lesson, we learn the square bracket syntax for declaring lists as well as some common methods like pop and append.
Creating Lists
Index Positions and Slicing • 17:49
Python assigns each list element and each string character an index position that reflects its place in line. In this lesson, we learn how to extract elements and characters from their lists/strings using square bracket notation. The index starts counting from 0!
Index Positions and Slicing
Tuples • 7:40
A tuple is an immutable list: an ordered sequence of values that cannot be modified after creation. We technically declare a tuple with a comma-separated sequence of values, but the community convention is to wrap the sequence in parentheses.
Dictionaries • 13:37
A dictionary is a mutable collection of key-value pairs. A key serves as a unique identifier for a value. The keys must be unique, while the values can contain duplicates. In this lesson, we practice declaring some dictionary objects.
Creating Dictionaries
Classes and Objects • 6:32
A class is a blueprint/template for creating an object, which we call an instance. The class defines the attributes and methods that all objects/instances will have. In this lesson, we walk through the terminology and provide a real-world analogy.
Importing Modules • 11:02
In this lesson, we review the import keyword for importing either a Python module or a library like pandas. We import the datetime library and use the as keyword to assign it an alias of dt.
Importing Libraries • 4:05
In this lesson, we utilize the same import keyword to bring the pandas library into our Jupyter Notebook. We'll have to repeat this step in every Jupyter Notebook.
Python Crash Course
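
As a rough taste of the Python basics this section covers, here is a minimal sketch. It is not taken from the course materials: the function name fahrenheit_to_celsius and all of the sample values are made up for illustration, and the course's own temperature conversion function may look different.

```python
def fahrenheit_to_celsius(fahrenheit):
    # Hypothetical helper; the course's own function may differ.
    return (fahrenheit - 32) * 5 / 9


# A list is mutable, so append changes it in place.
temperatures = [32, 212]
temperatures.append(50)
converted = [fahrenheit_to_celsius(t) for t in temperatures]

# Index positions start at 0; negative positions count from the end.
first, last = temperatures[0], temperatures[-1]

# A tuple is an ordered sequence that cannot be modified after creation.
point = (3, 4)

# A dictionary maps unique keys to values.
capitals = {"France": "Paris"}
capitals["Japan"] = "Tokyo"
```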

Create A Series Object from a List • 8:17
A Series is a one-dimensional labelled array that combines the best features of a list and a dictionary. In this lesson, we instantiate our first Series objects and introduce the index, the collection of identifiers for the Series's values.
Create A Series Object from a Dictionary • 3:10
In this lesson, we practice creating Series objects with dictionaries as the data source. Pandas will use the keys for the Series's index labels and the values for the Series's values.
Create a Series Object
Intro to Series Methods • 6:13
In this lesson, we invoke some sample methods like sum, product, and mean on Series objects. Methods utilize a dot, then the method name and a pair of parentheses.
Intro to Series Attributes • 7:20
An attribute is a piece of data that lives on an object. It's a fact, a detail, a characteristic of the object. In this lesson, we access various attributes on the Series and introduce the concept of composition, where an object is made up of many smaller objects.
Attributes and Methods on a Series
Parameters and Arguments • 12:52
A parameter is the name for an expected input to a function/method/class instantiation. An argument is the concrete value we provide for a parameter during invocation. In this lesson, we discuss the data and index parameters of the Series constructor.
Parameters and Arguments
Import Series with the pd.read_csv Function • 15:51
A CSV is a plain text file that uses line breaks to separate rows and commas to separate row values. In this lesson, we use the pd.read_csv function to import 2 CSV datasets into pandas. We also introduce the 2-dimensional DataFrame object and learn how to convert it to a 1-dimensional Series with the squeeze method.
Import Series with the read_csv Function
The head and tail Methods • 4:20
The head method returns a number of rows from the beginning of the Series. The complementary tail method returns a number of rows from the end of the Series.
The head and tail Methods
Passing Series to Python Built-In Functions • 7:50
In this lesson, we pass a Series to Python's built-in functions including len, type, list, dict, sorted, max, and min.
Check for Inclusion with Python's in Keyword • 5:03
In this lesson, we practice using Python's in and not in keywords to check for inclusion among the Series's values and index labels. We utilize the index and values attributes to make sure we perform the search within the right collection.
The sort_values Method • 3:58
The sort_values method sorts a Series's values in order. In this lesson, we invoke the method on both our alphabetical and numeric Series and also learn how to customize the sort type with the ascending parameter.
The sort_values Method
The sort_index Method • 6:46
In this lesson, we set a custom index on our Series with the read_csv function's index_col parameter and learn how to sort an index using the sort_index method.
The sort_index Method
Check for Inclusion with Python's in Keyword
Extract Series Values by Index Position • 11:09
In this lesson, we use the iloc accessor to extract a Series value by its index position. iloc is short for "integer location" and requires a special square bracket syntax. It supports single values, Python lists, and slices as well.
Extract Series Values by Index Label • 7:49
In this lesson, we use the loc accessor to extract a Series value by its index label. loc requires a special square bracket syntax. Like the iloc accessor, it supports single values, Python lists, and slices.
Extract Series Values by Index Position or Index Label
The get Method • 6:01
In this lesson, we introduce the get method for retrieving a Series value by index label and providing a fallback value in case the label does not exist. The default fallback value is None.
Overwrite a Series Value • 5:34
In this lesson, we show the syntax to overwrite a Series value. We first target it with the iloc/loc accessor, then provide an equal sign and the value to overwrite the original value with.
Copy-on-Write and the copy Method • 7:40
In this lesson, we cover Copy-on-Write, the default behavior as of Pandas 3. Pandas will create a copy when a mutational operation occurs. We can treat any filtered subset or targeted segment as effectively a copy, even though Pandas will try to reuse the same memory chunks under the hood.
Math Methods on Series • 11:50
In this lesson, we run through some common mathematical methods on Series including count, sum, product, mean, max, min, median, mode, and more.
Broadcasting • 5:53
Broadcasting describes the process of applying a consistent arithmetic operation to an array. We can combine mathematical operators with a Series to apply the mathematical operation to every value. In this lesson, we practice adding and subtracting a consistent value from every Series entry.
Pandas Aligns by Index Labels • 5:11
In this lesson, we show how Pandas uses index labels to align multiple Series together when performing mathematical operations between them.
The value_counts Method • 3:19
In this lesson, we explore the value_counts method, which returns the number of times each distinct value occurs in the Series. The normalize parameter returns the relative frequencies/percentages of the values instead of the counts.
The value_counts Method
The apply Method • 6:17
In this lesson, we use the apply method to invoke a function for every Series value. Pandas collects the results in a new Series. The advantage of apply is that we can utilize basic Python code to achieve whatever manipulation we want. If we don't know a specific Series method but can accomplish the same result with Python constructs, apply can be a useful tool.
The map Method • 5:08
The map method connects each Series value to a complementary value from another data structure. It provides a connection/association to the other value. In this lesson, we practice using the method with arguments of a dictionary and a Series.
Series
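
To make a few of the Series ideas above concrete (dictionary construction, loc/iloc/get, broadcasting, index alignment, and map), here is a minimal sketch. It is not taken from the course materials; the fruit prices and every variable name are made up for illustration.

```python
import pandas as pd

# Hypothetical data: the dictionary keys become the index labels.
prices = pd.Series({"apple": 1.5, "banana": 0.5, "cherry": 3.0})

by_label = prices.loc["banana"]        # lookup by index label
by_position = prices.iloc[0]           # lookup by integer position
fallback = prices.get("durian", 0.0)   # missing label, so the fallback is returned

# Broadcasting: the operation applies to every value.
doubled = prices * 2

# Operations between two Series align on index labels, not positions.
stock = pd.Series({"cherry": 10, "apple": 4})
inventory_value = prices * stock       # "banana" has no match and becomes NaN

# map connects each value to a counterpart in another structure.
colors = pd.Series(["red", "green", "red"])
codes = colors.map({"red": "R", "green": "G"})
counts = colors.value_counts()
```

Note that inventory_value keeps a "banana" entry even though stock has none; alignment by label fills the gap with NaN rather than dropping the row.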

Methods and Attributes between Series and DataFrames • 11:51
A DataFrame is a 2-dimensional table with an index. In this lesson, we introduce this new data structure and explore some of the methods and attributes it shares with the Series object. We also identify some unique attributes that exist only on one object but not the other.
Differences between Shared Methods • 7:42
In this lesson, we do a deeper dive into the sum method and how it operates differently between Series and DataFrame objects.
Select One Column from a DataFrame • 8:19
In this lesson, we introduce two syntax options to extract a column from a DataFrame: attribute access and square brackets. We also discuss the tradeoffs between the two approaches.
Select One Column from a DataFrame
Select Multiple Columns from a DataFrame • 2:41
In this lesson, we learn how to extract multiple DataFrame columns by passing a list of column names inside the square brackets. Pandas returns a copy/new DataFrame when extracting multiple columns.
Select Multiple Columns from a DataFrame
Add New Column to DataFrame • 8:43
In this lesson, we add a new column to a DataFrame using square bracket notation. We show how to populate the new Series with a single value or a dynamic calculation from performing an operation on another Series's values.
Create Columns with the assign Method • 2:55
In this lesson, we utilize the assign method to return a new DataFrame with new columns. Each keyword argument's name becomes a new column name, and its value supplies the contents of that column.
Drop Rows with Missing Values • 6:29
In this lesson, we practice using the dropna method to remove DataFrame rows containing missing/NaN values. We discuss how to target rows that only hold missing values as well as rows with a missing value in a target column.
Drop DataFrame Rows with Missing Values
Fill in Missing Values with the fillna Method • 4:08
In this lesson, we explore an alternative approach for dealing with missing values: using the fillna method to populate missing values with a static value. We invoke the method on both a DataFrame and a Series.
Fill Missing Values with a Forward Fill or Back Fill • 4:57
In this lesson, we use the ffill and bfill methods to forward-fill and back-fill the previous/next present value whenever there is a missing value. We also discuss how to set a max limit on the number of replaced consecutive values.
Convert Data Types with the astype Method • 10:56
In this lesson, we introduce the astype method for converting the data types in a Series. We practice converting our floating-point columns to store integers.
Convert to Numbers with the to_numeric Function • 4:38
In this lesson, we practice using the pd.to_numeric function to convert a Series's values into numeric types. One of the advantages of to_numeric over astype is the ability to react to errors in conversion.
Select Columns by Type with the select_dtypes Method • 3:02
In this lesson, we introduce the select_dtypes method to target DataFrame columns by their data type.
Convert Data Types with the astype Method II: Categories • 6:21
In this lesson, we introduce the category type, which is ideal when you have a small number of distinct values within a column. Categories help reduce total memory consumption.
Attribute Namespaces • 5:35
In this lesson, we introduce attributes that expose objects with additional methods and attributes. There's often a namespace for a specific data type. For example, string columns have string methods underneath a str attribute/namespace and datetime methods underneath a dt attribute/namespace.
The astype Method
Sort a DataFrame with the sort_values Method I • 5:54
In this lesson, we explore the sort_values method on a DataFrame. The default sort order is ascending (smallest to greatest, alphabetical), but we can customize the order with the ascending parameter. We also discuss the na_position parameter for placing the NaN values at the beginning or end of the sorted values.
Sort a DataFrame with the sort_values Method II: Multiple Columns • 8:19
In this lesson, we sort a DataFrame by multiple columns by passing a list of column names to the by parameter. We also customize the sort order for each column by passing a list to the ascending parameter.
Sort with Ordered Categories • 6:02
In this lesson, we showcase how to use the category data type to set a custom sort order for the values in a column.
Sort a DataFrame with the sort_values Method
Sort a DataFrame by its Index • 4:28
The sort_index method sorts a DataFrame by the index labels. In this lesson, we explore the method and a few of its parameters.
Rank Values with the rank Method • 9:31
In this lesson, we learn the rank method for ordering and ranking the values in a Series. We use it to rank our NBA players by their salaries, with the top player earning a rank of #1. We also show various approaches for dealing with ties.
DataFrames I
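
The cleaning and sorting methods in this section combine naturally. The following is a minimal sketch rather than anything from the course materials; the four-row roster and all names in it are invented, and it stands in for the NBA dataset the lessons actually use.

```python
import pandas as pd

# Hypothetical roster; None marks a missing salary.
players = pd.DataFrame({
    "name": ["Ana", "Ben", "Cal", "Dee"],
    "team": ["B", "A", "A", "B"],
    "salary": [300.0, None, 500.0, 300.0],
})

complete = players.dropna(subset=["salary"])   # drop rows missing a salary
filled = players.fillna({"salary": 0.0})       # or fill the gap with a static value

# astype can convert floats to integers once no values are missing.
complete = complete.assign(salary=complete["salary"].astype(int))

# to_numeric can coerce unparseable entries to NaN instead of raising an error.
parsed = pd.to_numeric(pd.Series(["1", "2", "oops"]), errors="coerce")

# Sort by team ascending, then by salary descending within each team.
ordered = complete.sort_values(by=["team", "salary"], ascending=[True, False])

# Rank salaries so the highest earns rank 1; tied values share the lowest rank.
ranks = complete["salary"].rank(ascending=False, method="min")
```

The method="min" argument is one of several tie-breaking choices; other options distribute ranks among tied values differently.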

This Module's Dataset + Memory Optimization • 15:26
Welcome to the next section of the course! In this lesson, we import and introduce the new employees DataFrame. We also convert some columns to their optimal formats (Booleans, categories, etc) and introduce the to_datetime function at the top level of pandas.
Filter a DataFrame Based on a Condition • 12:36
To filter a DataFrame, we must first generate a Boolean Series, then pass it in square brackets after the DataFrame. In this lesson, we practice extraction using a variety of data types and operations (equality, less than, greater than, and more).
Filter a DataFrame Based on a Condition
Filter with More than One Condition (AND - &) • 5:41
In this lesson, we introduce the & operator for combining two Boolean Series with AND logic. We use this technique to filter a subset of DataFrame rows that fit multiple conditions.
Filter DataFrame with More than One Condition (AND - &)
Filter with More than One Condition (OR - |) • 8:56
In this lesson, we introduce the | operator for combining two Boolean Series with OR logic. We use this technique to filter a subset of DataFrame rows that fit either one of several conditions. We also discuss caveats when combining & and | in the extraction syntax.
Filter DataFrame with More than One Condition (OR - |)
The isin Method • 3:30
The isin method checks whether each Series value is present in a predefined collection such as a list. It returns a Boolean Series; a True indicates the row's value is found within the collection.
The isin Method
The isnull and notnull Methods • 3:50
In this lesson, we discuss the isnull and notnull methods. They generate Boolean Series that validate whether a row's value is NaN (missing/absent) or non-NaN.
The between Method • 5:19
In this lesson, we utilize the between method to check if each Series value exists within a range/boundary of values. Both endpoints are inclusive. We utilize the resulting Boolean Series to filter our DataFrame.
The between Method
The duplicated Method • 8:06
The duplicated method marks a row's record as a duplicate when pandas encounters the value for a second time (and beyond). The first occurrence is not marked as a duplicate. In this lesson, we discuss the nuances of this method and the parameters we can customize to target the first duplicate, the last duplicate or all duplicates.
The drop_duplicates Method • 6:25
In this lesson, we explore the drop_duplicates method for removing rows with duplicate values from a DataFrame. We discuss how to declare a subset of columns to search for the duplicates within and also review the keep parameter options from the previous lesson.
The drop_duplicates Method
The where Method • 3:40
In this lesson, we use the where method, which accepts a Boolean Series but returns a DataFrame with the same dimensions as the original one. Rows that meet the condition are retained, and rows that do not meet the condition are populated with NaNs (missing values).
The query Method • 7:13
The query method enables extracting a subset of DataFrame rows using a string expression that reads close to natural language. We explore a few scenarios (equality, inequality, greater-than, inclusion, and even referencing an external Python variable).
DataFrames II
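
The filtering techniques in this section all follow the same pattern: build a Boolean Series, then pass it in square brackets. Here is a minimal sketch of that pattern; it is not from the course materials, and the five-row employees table is made up for illustration.

```python
import pandas as pd

# Hypothetical employees table.
employees = pd.DataFrame({
    "name": ["Ana", "Ben", "Cal", "Dee", "Eli"],
    "team": ["Sales", "Legal", "Sales", "IT", "Legal"],
    "salary": [50, 80, 65, 90, 80],
})

# A comparison yields a Boolean Series; square brackets keep the True rows.
high_paid = employees[employees["salary"] > 60]

# Wrap each condition in parentheses when combining with & (AND) or | (OR).
sales_and_high = employees[(employees["team"] == "Sales") & (employees["salary"] > 60)]
it_or_low = employees[(employees["team"] == "IT") | (employees["salary"] < 60)]

in_teams = employees[employees["team"].isin(["Legal", "IT"])]
mid_range = employees[employees["salary"].between(65, 80)]  # both endpoints inclusive

# duplicated flags repeats; the first occurrence is not marked by default.
repeat_teams = employees["team"].duplicated()

# query expresses the filter as a string; @ references a Python variable.
floor = 80
via_query = employees.query("salary >= @floor")
```

The parentheses around each condition matter because & and | bind more tightly than comparison operators in Python.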

This Module's Dataset • 1:29
Welcome to the DataFrames: Data Extraction section. In this lesson, we introduce the James Bond movie dataset we'll be using throughout the section.
The set_index and reset_index Methods • 3:32
The set_index method sets a column as the new DataFrame index, replacing the current index. The complementary reset_index method brings the current index into the table as a regular column and generates the standard numeric index.
Retrieve Rows by Index Position and Index Label • 11:38
The iloc accessor extracts a DataFrame row by its numeric index position. The complementary loc accessor extracts a DataFrame row by its index label. For both accessors, we discuss the variety of filtering options (single value, list, slice, and slice shortcuts).
Second Arguments to loc and iloc Accessors • 8:51
In this lesson, we pass a second value inside the square brackets for loc and iloc. We also discuss the variety of filtering options (single value, list, slice, and slice shortcuts).
Overwrite Value in a DataFrame • 7:59
In this lesson, we overwrite a single value in the DataFrame and discuss a warning you may encounter when working with filtered DataFrames in pandas 3.0. We discuss a solution to the problem: passing a Boolean Series directly to the loc accessor.
Rename Index Labels or Columns in a DataFrame • 5:28
In this lesson, we use the rename method for renaming one or more index labels on the row or column axis. We also discuss overwriting the DataFrame's columns attribute directly.
Delete Rows or Columns from a DataFrame • 5:21
In this lesson, we discuss the drop method, the pop method, and Python's del keyword for deleting DataFrame columns.
Create Random Sample with the sample Method • 2:49
In this lesson, we introduce the sample method for extracting one or more random rows or columns from the DataFrame. We can specify a percentage of total rows to target.
The nlargest and nsmallest Methods • 3:22
In this lesson, we discuss the nsmallest and nlargest methods for extracting rows with the smallest or largest values from a given DataFrame column. This is a faster, simpler alternative to using the sort_values method.
The clip Method • 2:00
The clip method bounds values in two ways: values that fall below a lower threshold are raised to it, and values that fall above an upper threshold are lowered to it. NaN (missing) values will remain NaN.
The apply Method with DataFrames • 8:04
In this lesson, we re-introduce the apply method for invoking a function once for every DataFrame row. We write a custom function for categorizing the Bond movies based on my own arbitrary film preferences. We then pass the function to the apply method; Pandas supplies each row as a Series into the function.
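
Several of the extraction tools above can be sketched on a tiny table. This example is not from the course materials: the three placeholder films, their numbers, and the era function are all invented, and they stand in for the James Bond dataset used in the lessons.

```python
import pandas as pd

# Hypothetical films table; set_index makes the titles the row labels.
films = pd.DataFrame({
    "title": ["Film A", "Film B", "Film C"],
    "year": [1990, 2005, 2020],
    "budget": [20, 150, 90],
}).set_index("title")

row = films.loc["Film B"]                        # one row by label, as a Series
cell = films.iloc[0, 1]                          # first row, second column by position
subset = films.loc["Film A":"Film B", ["year"]]  # label slices include both endpoints

# Overwrite in a single loc call: Boolean Series for rows, label for the column.
films.loc[films["year"] < 2000, "budget"] = 25

top_two = films.nlargest(2, "budget")
capped = films["budget"].clip(lower=50, upper=100)


def era(film):
    # With axis=1, pandas passes each row to the function as a Series.
    return "classic" if film["year"] < 2000 else "modern"


eras = films.apply(era, axis=1)
```

Passing the Boolean Series and the column label to loc together targets the original DataFrame directly instead of a filtered intermediate.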

This Module's Dataset • 4:45
Welcome to the Working with Text Data section! In this lesson, we import and optimize our chicago.csv dataset. It contains data for public employees in the city of Chicago (name, title, department, salary).
Common String Methods • 8:47
In this lesson, we use the str attribute on a Series to reach the StringMethods object. This object enables string-based operations across all of the Series's values. We practice using common string methods like lower, upper, title, and strip.
Common String Methods
Filtering with String Methods • 7:34
We can filter a subset of DataFrame rows as long as we have a Boolean Series. In this lesson, we introduce some string-based methods for generating those Series including contains, startswith, and endswith. We also talk about normalizing data before performing our inclusion checks.
String Methods on Index and Columns • 3:11
In this lesson, we apply string-based operations to the row and column indexes of the DataFrame. The process remains the same -- access the str property to get access to the StringMethods object, then invoke the correct method on the nested object.
The split Method • 7:27
In this lesson, we review the split method on Python strings and then apply it to a column within the chicago DataFrame. We find the most common first word among job titles in the city of Chicago.
The expand and n Parameters of the split Method • 7:19
In this lesson, we introduce two additional parameters to the split method: the expand parameter expands the list into new DataFrame columns and n limits the maximum number of splits. We use these strategies to find the most common first name among the employees.
The explode Method • 3:32
The explode method extracts every element within a list into a separate row in the resulting Series. In this lesson, we practice extracting every employee skill from a Series of lists.
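
The text lessons above can be sketched in a few lines. This example is not from the course materials; the three staff records and the skills lists are made up and only loosely mimic the shape of the Chicago employees data.

```python
import pandas as pd

# Hypothetical staff records with inconsistent casing and whitespace.
staff = pd.DataFrame({
    "name": ["  DOE, Jane ", "roe, Richard", "Poe, Edgar Allan"],
    "title": ["Water Engineer", "police officer", "Water Chemist"],
})

# The str namespace applies a string method to every value in the Series.
names = staff["name"].str.strip().str.title()

# Normalize case before an inclusion check so "Water" and "water" both match.
water_jobs = staff[staff["title"].str.lower().str.contains("water")]

# expand=True spreads the pieces across columns; n caps the number of splits.
parts = names.str.split(", ", n=1, expand=True)

# Without expand, each value becomes a list; str.get pulls one element out.
first_word = staff["title"].str.split(" ").str.get(0).str.title()

# explode gives every list element its own row, repeating the index label.
skills = pd.Series([["sql", "python"], ["excel"]])
one_per_row = skills.explode()
```

From here, calling value_counts on first_word or on a column of parts would surface the most common entries.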

Intro to the MultiIndex Module • 10:08
Welcome to the MultiIndex section. A MultiIndex is an index that consists of multiple levels or tiers. In this lesson, we create our first MultiIndex DataFrame with both the set_index method and the index_col parameter of the read_csv function. We discuss the benefits of a MultiIndex and best strategies for determining which layer to place first.
Create a MultiIndex
Extract and Rename Index Level Values • 6:56
In this lesson, we utilize the get_level_values method on the MultiIndex object to pull out the values from a certain level within the larger MultiIndex. We also use the set_names method on the MultiIndex to rename one or more levels of the MultiIndex.
Extract Index Level Values
Extract Rows from a MultiIndex DataFrame • 12:26
In this lesson, we review the iloc and loc accessors for extracting DataFrame rows and columns. We practice new syntax that incorporates the levels of a MultiIndex when targeting a specific row or a slice of rows.
Extract Rows from a MultiIndex DataFrame
The xs Method • 4:29
The xs (cross-section) method allows us to extract values based on a match in a MultiIndex level. The benefit over loc is that we can target based on a nested level.
The swaplevel Method • 2:51
The swaplevel method swaps two levels of the MultiIndex. The order in which we pass the two levels doesn't matter.
The transpose Method • 4:09
In this lesson, we learn the transpose method for swapping the axes of the DataFrame. The method will move the row axis to the column axis and move the column axis to the row axis.
The stack Method • 5:11
In this lesson, we use the stack method to move an index level from the column axis to the row axis. This action will automatically create a MultiIndex on the row axis.
The unstack Method • 9:51
In this lesson, we learn the unstack method to move an index level from the row axis to the column axis. This action will automatically create a MultiIndex on the column axis.
The pivot Method • 7:01
In this lesson, we tackle the pivot method for reshaping a DataFrame. The pivot method converts a long dataset to a wide one by distributing row values across multiple columns. It's ideal when you want to summarize a long collection of values.
The melt Method • 7:06
In this lesson, we introduce the complementary melt method for reshaping a DataFrame. It acts as an inverse of the pivot method. It converts a wide dataset to a long one by consolidating multiple columns' values into a single column. The column headers are placed in a column, and the values are placed in another one.
The melt Method
The pivot_table Method • 13:02
The pivot_table method offers similar functionality to Excel's Pivot Table feature. It organizes a table of data based on grouping distinct values on either the row axis or column axis (or both), then applies an aggregation function to each collection. Aggregation functions include sum, count, average, max, min, and more.
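
The MultiIndex and reshaping methods above are easier to follow on a small table. The sketch below is not from the course materials; the region/quarter sales figures are made up for illustration.

```python
import pandas as pd

# Hypothetical long-format sales data.
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "revenue": [10, 20, 30, 40],
})

# Passing two columns to set_index produces a two-level MultiIndex.
indexed = sales.set_index(["region", "quarter"])
east_q2 = indexed.loc[("East", "Q2"), "revenue"]   # a tuple targets both levels
all_q1 = indexed.xs("Q1", level="quarter")         # match on the nested level

# unstack moves the innermost row level to the column axis.
wide_from_index = indexed["revenue"].unstack()

# pivot reshapes long to wide; melt reverses the direction.
wide = sales.pivot(index="region", columns="quarter", values="revenue")
long = wide.reset_index().melt(
    id_vars="region", var_name="quarter", value_name="revenue"
)

# pivot_table also aggregates, so repeated region/quarter pairs would be allowed.
totals = sales.pivot_table(values="revenue", index="region", aggfunc="sum")
```

Here pivot and melt round-trip the same four values, while pivot_table collapses each region to a single total.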

The GroupBy Object • 5:39
In this lesson, we introduce the Fortune 1000 dataset that we'll be utilizing throughout the section. We employ the groupby method to create a DataFrameGroupBy object holding a collection of nested DataFrames, one for each distinct value in the Sector column.
Retrieve Groups and Rows • 3:26
In this lesson, we learn the get_group method to extract a nested DataFrame from a GroupBy object. The GroupBy object will store 20+ groups, one DataFrame for each distinct Sector value.
Methods on the GroupBy Object • 4:50
In this lesson, we apply aggregation operations like sum, average, max, and min to the GroupBy object. We target the specific columns we want to apply the operations to.
The agg Method • 2:44
In this lesson, we describe the agg method, an alternative strategy for performing aggregation calculations (sum, mean, count, etc) on the nested DataFrames in a GroupBy object.
Grouping by Multiple Columns • 3:05
In this lesson, we pass a list of columns to the groupby method to create a DataFrameGroupBy object that accounts for each unique combination of sector and industry.
Iterating over Groups • 7:46
In this lesson, we introduce two ways to iterate over the groups within a GroupBy object. The first is with the traditional Python for loop, which yields the group name and DataFrame within a two-element tuple. The second approach is using the apply method to invoke a function upon every nested group and capture the function's return value in a new Series.
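
As a compact illustration of the GroupBy lessons above, consider the sketch below. It is not from the course materials; the five companies and their figures are invented and merely imitate the Sector and Industry columns of the Fortune 1000 dataset.

```python
import pandas as pd

# Hypothetical company data.
companies = pd.DataFrame({
    "sector": ["Tech", "Tech", "Retail", "Retail", "Energy"],
    "industry": ["Software", "Hardware", "Grocery", "Grocery", "Oil"],
    "revenue": [100, 80, 60, 40, 90],
    "profit": [30, 10, 5, 3, 20],
})

sectors = companies.groupby("sector")
tech = sectors.get_group("Tech")               # the nested DataFrame for one group
revenue_by_sector = sectors["revenue"].sum()   # aggregate one targeted column

# agg accepts a mapping of column name to aggregation.
summary = sectors.agg({"revenue": "sum", "profit": "mean"})

# A list of columns groups by each unique combination of values.
pairs = companies.groupby(["sector", "industry"])["revenue"].sum()

# Iterating yields (group name, DataFrame) pairs.
sizes = {name: len(group) for name, group in sectors}
```

Grouping by two columns returns a result indexed by a MultiIndex, one entry per sector/industry combination that actually occurs in the data.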

Intro to the Merging DataFrames Module • 4:34
In this section, we'll explore various ways to merge or join two DataFrames together including concatenation, inner joins, left joins, outer joins, and more. To kick things off, we introduce the 4 datasets we'll be using throughout the lessons. They model a library management system and include books, members, and checkouts.
The pd.concat Function I • 5:44
The pd.concat function appends one DataFrame to the end of another. In this lesson, we combine our jan and feb checkouts DataFrames in a vertical/row-axis direction.
The pd.concat Function II • 9:04
In this lesson, we utilize the pd.concat function for concatenating on the column axis. Pandas places the columns from the second DataFrame to the right of the first DataFrame's columns.
Left Joins • 5:16
A left join brings in rows from a right DataFrame whenever there is a match with a column value in the left DataFrame. In this lesson, we introduce the merge method, the primary approach for joining DataFrames in pandas, and apply a left join to our checkouts and books tables.
The left_on and right_on Parameters • 4:57
The left_on and right_on parameters perform a join where the column names do not match between the two DataFrames. Each one accepts the column name from the respective DataFrame. We can only use the on parameter if the two column names match between the DataFrames.
Inner Joins • 6:52
An inner join identifies the shared values between two DataFrames. Values that exist in one DataFrame but not the other are excluded. In this lesson, we use an inner join to identify the members who took out a book in both January and February.
Matching across Multiple Columns • 4:36
A join is not limited to a single column. In this lesson, we pass a list of columns to the on parameter to ensure a join based on values across multiple columns. We use this technique to identify the members who came in both January and February and who took out the same book each month.
Full/Outer Joins • 9:04
A full join keeps all rows from both tables. Wherever there is a match, pandas will combine the row values together. Whenever a value only exists in one table, it will still be kept -- with complementary NaN values for the other table's columns.
Merging by Indexes with the left_index and right_index Parameters • 5:32
The left_index and right_index parameters of the merge method perform a join based on matching values in the indices of the respective tables. In this lesson, we practice joining the books and members tables together.
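A small sketch of an index-based merge. The books and stock tables here, and their shared book_id index, are assumptions made up for illustration and are not the course's datasets.

```python
import pandas as pd

# Hypothetical tables that both use book_id as their index
books = pd.DataFrame(
    {"title": ["Book A", "Book B"]},
    index=pd.Index([10, 11], name="book_id"),
)
stock = pd.DataFrame(
    {"copies": [3, 1]},
    index=pd.Index([11, 12], name="book_id"),
)

# Match rows on index labels instead of on a column
merged = books.merge(stock, how="inner", left_index=True, right_index=True)
print(merged)
```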
The join Method • 2:56
The join method offers a convenient shortcut when merging two DataFrames together using shared index labels.
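For comparison, a sketch of the join method with hypothetical tables that share a book_id index (names and values are assumptions):

```python
import pandas as pd

# Hypothetical tables indexed by book_id
books = pd.DataFrame(
    {"title": ["Book A", "Book B"]},
    index=pd.Index([10, 11], name="book_id"),
)
stock = pd.DataFrame(
    {"copies": [3, 1]},
    index=pd.Index([11, 12], name="book_id"),
)

# join matches on index labels and performs a left join by default
joined = books.join(stock)
print(joined)
```

Book 10 has no entry in stock, so its copies value is missing. Passing how="inner" to join would drop that row instead.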

Requirements

Basic/intermediate experience with spreadsheet software like Microsoft Excel or Google Sheets (common functions, VLOOKUP, COUNTIF, pivot tables, etc.)
Basic experience with the Python programming language (we'll cover the basics if you're brand new!)
Strong knowledge of data types (strings, integers, floating points, booleans, etc.)

Description

** Newly recorded in 2026 for the release of Pandas 3 **

Student Testimonials:

The instructor knows the material, and has detailed explanation on every topic he discusses. Has clarity too, and warns students of potential pitfalls. He has a very logical explanation, and it is easy to follow him. I highly recommend this class, and would look into taking a new class from him. - Diana
This is excellent, and I cannot compliment the instructor enough. Extremely clear, relevant, and high quality - with helpful practical tips and advice. Would recommend this to anyone wanting to learn pandas. Lessons are well constructed. I'm actually surprised at how well done this is. I don't give many 5 stars, but this has earned it so far. - Michael
This course is very thorough, clear, and well thought out. This is the best Udemy course I have taken thus far. (This is my third course.) The instruction is excellent! - James

Welcome to the most comprehensive Pandas course available on Udemy! An excellent choice for both beginners and experts looking to expand their knowledge on one of the most popular Python libraries in the world! This course has been re-recorded from scratch in 2026 for the release of Pandas 3.

Data Analysis with Pandas and Python offers 19+ hours of in-depth video tutorials on the most powerful data analysis toolkit available today. Lessons include:

installing
sorting
filtering
grouping
aggregating
de-duplicating
pivoting
munging
deleting
merging
visualizing

and more!

Why learn pandas?

If you've spent time in spreadsheet software like Microsoft Excel, Apple Numbers, or Google Sheets and are eager to take your data analysis skills to the next level, this course is for you!

Data Analysis with Pandas and Python introduces you to the popular Pandas library built on top of the Python programming language.

Pandas is a powerhouse tool that allows you to do anything and everything with colossal data sets -- analyzing, organizing, sorting, filtering, pivoting, aggregating, munging, cleaning, calculating, and more!

I call it "Excel on steroids"!

Over the course of more than 19 hours, I'll take you step-by-step through Pandas, from installation to visualization! We'll cover hundreds of different methods, attributes, features, and functionalities packed away inside this awesome library. We'll dive into tons of different datasets, short and long, broken and pristine, to demonstrate the incredible versatility and efficiency of this package.

Data Analysis with Pandas and Python is bundled with dozens of datasets for you to use. Dive right in and follow along with my lessons to see how easy it is to get started with pandas!

Whether you're a new data analyst or have spent years (*cough* too long *cough*) in Excel, Data Analysis with Pandas and Python offers you an incredible introduction to one of the most powerful data toolkits available today!

Who this course is for:

Data analysts and business analysts
Excel/Google Sheets users who are looking to learn more powerful software for data analysis

Course content

Installation and Setup • 11 lectures • 1hr 15min
Python Crash Course • 15 lectures • 2hr 46min
Series • 22 lectures • 2hr 38min
DataFrames I: Introduction • 19 lectures • 2hr 3min
DataFrames II: Filtering Data • 11 lectures • 1hr 21min
DataFrames III: Data Extraction • 11 lectures • 1hr 1min
Working with Text Data • 7 lectures • 43min
MultiIndex • 11 lectures • 1hr 23min
GroupBy • 6 lectures • 28min
Merging DataFrames • 10 lectures • 59min
