
This course includes our updated coding exercises so you can practice your skills as you learn.
Explore NumPy and pandas through a hands-on, project-based course, covering arrays, indexing, slicing, filtering, sorting, broadcasting, and essential tools for analyzing and transforming data frames and time series.
Analyze a course project with over 2 million transactions by product, household, and store. Use Python to read flat files, join tables, compute KPIs, and visualize discount impacts on margins.
Learn core pandas functionality for data frame manipulation and basic data visualization with the Matplotlib API in Jupyter notebooks. Google Colab is also introduced as an alternative.
Install and launch Jupyter notebooks using Anaconda on Mac and PC, navigate to Anaconda Navigator, create a coursework folder, and optionally use Google Colab for cloud notebooks.
Explore Pandas and NumPy fundamentals, including array creation, indexing, slicing, vectorization, and broadcasting; convert lists to NumPy arrays and use Pandas DataFrames built on NumPy for efficient data analysis.
Learn NumPy arrays as fixed-size containers offering efficiency over Python lists; create one-dimensional or two-dimensional arrays from lists, inspect ndim, shape, size, and dtype, and use transpose.
Convert a list to a NumPy array, inspect its shape, size, and data type, and explore reshaping it into 10x1, 2x5, or 5x2 arrays.
Explore how NumPy creates arrays with ones, zeros, arange, linspace, and reshape. Learn to specify dtype, create 2 by 5 arrays, and apply reshape, transpose, and identity matrices.
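The creation functions named above can be sketched as follows. This is a minimal illustration with made-up sizes and values, not the course's own notebook code.

```python
import numpy as np

# Arrays filled with a constant; dtype is set explicitly for the first one.
ones = np.ones((2, 5), dtype="int64")   # 2 by 5 array of integer ones
zeros = np.zeros(4)                     # four float zeros

# Evenly spaced values: arange uses a step, linspace uses a count.
steps = np.arange(0, 10, 2)             # 0, 2, 4, 6, 8
grid = np.linspace(0, 1, 5)             # five values from 0 to 1 inclusive

# Reshape a 1-D range into 2 by 5, then transpose it to 5 by 2.
reshaped = np.arange(10).reshape(2, 5)
flipped = reshaped.T

# 3 by 3 identity matrix.
ident = np.identity(3)

print(ones.shape, steps, flipped.shape)
```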
Explore NumPy arrays with indexing and slicing, from one-dimensional access using zero-based indices to two-dimensional coordinates, using start, stop, and step, including negative indices.
Add five to every price in a NumPy array, reshape the data, and compute final owed amounts after a random discount, then apply array filtering.
Filter the product array to prices greater than 25. Include cola in the fancy feast special and build a shipping cost array with zero for prices above 20, else five.
Learn to filter a six-product array by a prices array with NumPy, using a mask and or logic to include COLA, and set free shipping with numpy.where.
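A rough sketch of this kind of filter is shown below. The product names and prices are invented for illustration and are not the course's data.

```python
import numpy as np

# Hypothetical six-product catalogue (names and prices are assumptions).
products = np.array(
    ["salad", "bread", "mustard", "rare tomato", "cola", "gourmet ice cream"]
)
prices = np.array([5.99, 6.99, 22.49, 99.99, 4.99, 49.99])

# Boolean mask: price above 25 OR the product is cola.
mask = (prices > 25) | (products == "cola")
special = products[mask]

# Shipping cost: 0 where the price is above 20, otherwise 5.
shipping = np.where(prices > 20, 0, 5)

print(special)
print(shipping)
```

The parentheses around each comparison matter, because `|` binds more tightly than `>` and `==`.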
Sort NumPy arrays in place with the array sort method, compare it to the numpy sort function, which returns a sorted copy, and use indexing and negative-step slices to locate the min or max or to reverse the order.
Learn to compute mean, min, max, and median of the top three prices and extract unique price tiers using Python in a NumPy and Pandas context.
Learn vectorization with NumPy and pandas to maximize efficiency by pushing array operations into optimized C code, avoiding Python loops, and using broadcasting for element-wise computations.
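To make the idea concrete, here is a small sketch comparing a Python loop with a vectorized expression, followed by a broadcasting example. The prices and discount rates are made up.

```python
import numpy as np

prices = np.array([10.0, 20.0, 30.0])

# Loop version: each multiplication runs in the Python interpreter.
loop_result = [p * 1.1 for p in prices]

# Vectorized version: one expression, executed in NumPy's compiled code.
vec_result = prices * 1.1

# Broadcasting: a (3, 1) column of prices against a (2,) row of discounts
# yields a (3, 2) table with one price per row and one discount per column.
discounts = np.array([0.0, 0.5])
table = prices.reshape(3, 1) * (1 - discounts)

print(vec_result)
print(table)
```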
Analyze sales data with NumPy and Pandas by reading a CSV, filtering by product family, sampling half, and classifying as above both, above mean and median, or below both.
Explore pandas data types, including NumPy booleans, integers, floats, and object text, plus time series, and master type conversion with astype for reliable numeric and datetime handling.
Explore how Pandas series use the index to access data, compare default integer indices with custom labels, and slice with label-based indices for time series and data frames.
Learn how duplicate index values in pandas series and dataframes affect lookups and how reset_index with drop=True restores a clean integer index.
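A small sketch of the behavior described here, using an invented three-row series:

```python
import pandas as pd

# Hypothetical series whose label "a" appears twice.
s = pd.Series([10, 20, 30], index=["a", "a", "b"])

dupes = s.loc["a"]     # duplicate label: returns a Series with both rows
single = s.loc["b"]    # unique label: returns a scalar

# drop=True discards the old labels instead of keeping them as a column.
clean = s.reset_index(drop=True)

print(dupes)
print(clean)
```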
Create a date index for oil series, compute mean of first and last ten rows with positional slices, then slice from 20170101 to 20170107 by labels and reset the index.
Filter pandas series with boolean masks using the loc accessor, applying comparisons and membership tests; build masks for combined conditions and use the tilde (~) to negate a test, as in "not in".
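The filtering patterns mentioned above might look like this in practice. The series contents are hypothetical.

```python
import pandas as pd

# Hypothetical unit sales indexed by product name.
sales = pd.Series([5, 12, 0, 30], index=["coffee", "tea", "cola", "juice"])

# Comparison mask.
big = sales.loc[sales > 10]

# Membership test on the index, and its negation with the tilde.
picked = sales.loc[sales.index.isin(["coffee", "cola"])]
not_picked = sales.loc[~sales.index.isin(["coffee", "cola"])]

# Combined conditions: wrap each test in parentheses and join with & or |.
combo = sales.loc[(sales > 0) & (sales < 20)]

print(big, not_picked, combo, sep="\n")
```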
Learn how to sort a Pandas series by values or by index, using sort_values and sort_index, with ascending and in-place options.
Explore numeric series operations in pandas and NumPy, using Python operators or the equivalent pandas methods to add, subtract, multiply, divide, take the modulo, and exponentiate, and handle missing values with fill values.
Learn series operations in Python data analysis: apply percentage and fixed increases, compute max price, derive percentage differences, and extract month from date strings using series indexing and aggregations.
Learn to filter a date-indexed series by month, compute mean and sum, use quantiles for 10th and 90th percentiles, explore counts and value counts, and convert data types for analysis.
Identify missing data in pandas using NumPy nan values and pd.NA, and learn to use arithmetic methods with a fill value to handle nans.
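As a hedged illustration of the difference between the `+` operator and the `add` method with `fill_value`, consider two small made-up series:

```python
import numpy as np
import pandas as pd

a = pd.Series([1.0, np.nan, 3.0])
b = pd.Series([10.0, 20.0, np.nan])

# Detect missing values; pd.isna also recognizes pd.NA.
missing_count = a.isna().sum()
na_is_missing = pd.isna(pd.NA)

# The operator propagates NaN wherever either side is missing.
plain = a + b

# The method form substitutes fill_value for a missing side before adding.
filled = a.add(b, fill_value=0)

print(plain)
print(filled)
```

If both sides are missing at the same position, the result is still missing even with `fill_value`.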
Learn how the pandas where method keeps series values that pass a boolean test and replaces the rest with a fallback value, with an in-place option, and compare it to NumPy's where and tilde-inverted tests.
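The contrast between the two functions can be sketched as follows, using an invented series of three prices:

```python
import numpy as np
import pandas as pd

s = pd.Series([5, 15, 25])

# Series.where keeps values where the test is True and uses 0 elsewhere.
kept = s.where(s > 10, 0)

# numpy.where takes the test first, then the value for True, then for False,
# and returns a plain NumPy array rather than a Series.
np_version = np.where(s > 10, s, 0)

# Inverting the test with a tilde keeps the opposite set of values.
inverted = s.where(~(s > 10), 0)

print(kept.tolist(), np_version.tolist(), inverted.tolist())
```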
Apply a boolean function with a lambda and a 0.9 quantile to decide buy or wait, then use NumPy where to adjust prices and add a new price column.
Discover pandas series, their index, and how series form data frame columns; access rows with positional indexing (iloc) or label-based (loc), and use NumPy for filtering, sorting, and aggregation.
Practice reading a CSV file from a relative path and accessing the transactions data. Inspect its number of rows and columns, column names, and data types.
Learn to inspect data frames with head, tail, and sample, then use info and describe to assess data types, memory usage, missing values, and statistics for numeric and categorical columns.
Learn to use pandas' drop method to remove rows (axis=0) and columns (axis=1), drop redundant id column, manage in-place changes, and optimize memory by keeping intermediate dataframes.
Read and summarize missing data in oil price data, including missing dates and values, then compare mean oil prices when filling with zero versus the mean.
Rename columns with a dictionary via the rename method and axis=1, then reorder with reindex in a single chained pipeline, keeping the original data frame intact for next steps.
Create new pandas columns using arithmetic between a series and a scalar or between series, including tax amount from sales and total column, plus boolean-based conditional arithmetic like taxable category.
Create arithmetic and boolean columns in pandas, compute percent to target and met target, determine bonus payable, and extract date parts like month and day of week.
Create a seasonal bonus column in a pandas DataFrame with December holiday bonus, May corporate month Sundays, and July summer special Mondays, then compute total owed at $100 per day.
Drop unnecessary columns to keep date, store number, and transaction count, recreate target percent, target bonus payable, month, day of week, and seasonal bonus, then spot check totals against 822,900.
Explore memory optimization with the Pandas categorical data type, converting repeated text to integers to reduce memory usage and boost dataframe performance.
Learn memory optimization in pandas by dropping unused columns, downcasting numeric types, and converting objects to numeric or datetime; explore when using categorical types helps or hurts memory usage.
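A minimal sketch of two of these techniques, assuming a hypothetical frame with a highly repetitive text column and a small-valued integer column:

```python
import pandas as pd

# Hypothetical data: 1,000 rows with only two distinct text values.
df = pd.DataFrame({
    "family": ["GROCERY", "DAIRY"] * 500,
    "units": [1, 2] * 500,
})
before = df.memory_usage(deep=True).sum()

slim = df.assign(
    # Repeated text becomes small integer codes plus a lookup table.
    family=df["family"].astype("category"),
    # Downcast to the smallest integer type that holds the values.
    units=pd.to_numeric(df["units"], downcast="integer"),
)
after = slim.memory_usage(deep=True).sum()

print(before, after)
```

The categorical type tends to help when a column has few distinct values relative to its length; for columns where nearly every value is unique, the lookup table can outweigh the savings.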
Group data by store and month to compute total transactions, then sort by month ascending and by transactions descending within each month for dashboard-ready insights.
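One way to express this grouping and two-level sort is sketched below. The column names `store_nbr`, `month`, and `transactions` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical transactions (column names are assumptions).
df = pd.DataFrame({
    "store_nbr": [1, 1, 2, 2, 1, 2],
    "month": [1, 1, 1, 1, 2, 2],
    "transactions": [100, 50, 300, 20, 80, 90],
})

summary = (
    df.groupby(["store_nbr", "month"], as_index=False)["transactions"]
      .sum()
      # Month ascending, then transactions descending within each month.
      .sort_values(["month", "transactions"], ascending=[True, False])
      .reset_index(drop=True)
)

print(summary)
```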
Modify pandas multi-index data frames by resetting indices to turn levels into columns, swapping index levels, and dropping levels, returning to a simple integer base index for easier filtering.
Learn to work with multi-index dataframes in pandas by selecting rows with iloc and loc, manipulating column levels, resetting and dropping indexes, and performing multiple aggregations with agg.
Master pandas transform to compute group-level statistics while preserving rows, enabling per-row comparisons like store sales against store averages and home-team goals against league averages.
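As a rough example of how transform preserves rows, the sketch below compares each row with its store's average. The data and column names are invented.

```python
import pandas as pd

# Hypothetical per-day transactions for two stores.
df = pd.DataFrame({
    "store_nbr": [1, 1, 2, 2],
    "transactions": [100, 200, 300, 500],
})

# transform returns one value per original row, aligned to df's index,
# so the result can be assigned straight back as a new column.
df["store_avg"] = df.groupby("store_nbr")["transactions"].transform("mean")
df["diff_from_avg"] = df["transactions"] - df["store_avg"]

print(df)
```

A plain `groupby(...).mean()` would instead collapse the frame to one row per store.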
Calculate the mean of transactions by store number and day of week, add an average per store-day column, and compute a difference column, preserving all rows.
Pass min and max to aggfunc in pivot tables, use a dictionary of columns for multiple aggregations, and weigh pivot table versus groupby to avoid overly wide results.
Create heat maps from pivot tables by applying background gradients and a cmap color map to reveal insights across stores and product families.
Build a pivot table of store number by day of week, the sum of bonus payable, filter zeros, apply a heatmap, then melt to one row per store and day.
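The pivot and melt steps can be sketched as below; the heatmap styling is left out here. Column names and values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical bonus data (column names are assumptions).
df = pd.DataFrame({
    "store_nbr": [1, 1, 2, 2],
    "day_of_week": ["Mon", "Tue", "Mon", "Tue"],
    "bonus_payable": [100, 0, 50, 25],
})

# Wide form: one row per store, one column per day.
pivot = df.pivot_table(
    index="store_nbr",
    columns="day_of_week",
    values="bonus_payable",
    aggfunc="sum",
)

# Long form again: one row per store and day.
long_form = pivot.reset_index().melt(
    id_vars="store_nbr",
    var_name="day_of_week",
    value_name="bonus_payable",
)

print(pivot)
print(long_form)
```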
Learn to build a filtered pivot table from a transactions data frame, create a heat map by day of week and store, and melt and reshape the result for visualization.
Master aggregation with group by to generate summary reports from data frames, handling multi-index frames, and using pivot, melt, and named aggregations to shape insights for visualization.
Learn to visualize data in pandas with the plot method, creating line, bar, pie, scatter, and histograms, customizing charts via the Matplotlib integration to tell a story.
Plot oil prices by dates in the oil data frame to see trends with a simple line chart. Explore the 2014 oil price decline as a potential case study.
Format plots with titles, axis labels, colors, legends, figure size, and subplots; set a clear chart title, such as store 44 transactions 2013–2017, and axis labels like date and daily transactions.
Learn to add, reposition, and customize legends in pandas Matplotlib charts, including turning legends off, selecting locations, using bbox_to_anchor to anchor coordinates, and enabling grid lines.
Create a stylized line chart by renaming the price column, adding a title and axis labels, using a dark grid, and converting the date column to datetime64.
Learn to create subplots and pivot tables of sales by store, using a date index, and compare four charts with a shared y-axis to reveal store performance and seasonal patterns.
Learn to create and interpret pie charts for categorical data as 100 percent composition, and scatter plots for numerical relationships, including sorting slices, start angle, labeling, and regression exploration.
Save your Pandas visualizations as image files with proper padding and explore next steps in seaborn, matplotlib, plotly, dash, and geospatial libraries for richer, shareable visuals.
Tackle a midcourse project that analyzes large transactions and products datasets using pandas. Create new discount metrics, perform aggregations, and build plots to reveal insights.
Explore dates and times in Python and pandas, mastering the datetime data type, time series concepts, date formatting, and operations like shifting, deltas, and moving averages.
Format pandas datetime data with custom date codes like %D and %Y using strftime. Apply these formats in dataframes for presentation and charts, noting the resulting object dtype.
Explore time deltas in pandas: compute differences between datetimes, extract days, store shipping days, and use to_timedelta to offset dates by days or weeks, including leap-year adjustments.
Manipulate datetime data in pandas by converting to datetime64, adding a three-week delta to the last date, and deriving weeks from days using assignment and delta operations.
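A short sketch of date offsets with `pd.to_timedelta`, using made-up order dates chosen to straddle a leap day:

```python
import pandas as pd

# Hypothetical order dates near the end of February in a leap year.
ordered = pd.Series(pd.to_datetime(["2024-02-27", "2024-02-28"]))

# Offset by a fixed number of days; the calendar handles February 29.
shipped = ordered + pd.to_timedelta(2, unit="D")

# Subtracting datetimes gives timedeltas; .dt.days extracts whole days.
shipping_days = (shipped - ordered).dt.days

# Offsets can also be expressed in weeks.
one_week_later = ordered + pd.to_timedelta(1, unit="W")

print(shipped.tolist())
print(shipping_days.tolist())
```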
Explore time series missing data techniques, including forward fill, backfill, and linear interpolation, and compare their impact on sales data and plots.
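The three techniques can be compared side by side on a tiny invented series with a two-value gap:

```python
import numpy as np
import pandas as pd

# Hypothetical daily sales with two missing days in the middle.
sales = pd.Series([10.0, np.nan, np.nan, 40.0])

forward = sales.ffill()        # carry the last known value forward
backward = sales.bfill()       # pull the next known value backward
linear = sales.interpolate()   # straight line between known neighbours

print(forward.tolist(), backward.tolist(), linear.tolist())
```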
Shift a series by a specified number of rows for period comparisons. Compute growth by dividing current sales by the prior row, subtracting one, multiplying by 100, and rounding.
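The growth calculation described above translates fairly directly into pandas. The sales figures here are invented.

```python
import pandas as pd

# Hypothetical sales for three consecutive periods.
sales = pd.Series([100.0, 110.0, 99.0])

# shift(1) moves every value down one row, so each row lines up
# with the value from the prior row.
prior = sales.shift(1)

# Percentage growth: current / prior - 1, times 100, rounded.
growth_pct = ((sales / prior - 1) * 100).round(1)

print(growth_pct)
```

The first row has no prior value, so its growth is missing.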
Master diff to compute time-series changes by subtracting a shifted series from the original, enabling daily change, growth, and year-over-year analysis with pandas.
Plot the sum of monthly transactions for Store 47 in 2015 and 2014, grouped by year and month, to reveal year-over-year trends.
Filter store 47 transactions, group by year then month, and use the shift method to create year-prior sales for 2015 versus the prior year with a line plot.
Compute a 90-day rolling average of store 47 transactions using pandas, then plot the smoothed series to reduce daily noise.
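A minimal sketch of a rolling average follows. It uses a 3-row window on five invented values so the result is easy to check by hand; the lecture's 90-day window works the same way with a larger number.

```python
import pandas as pd

# Hypothetical daily transactions on a date index.
daily = pd.Series(
    [1.0, 2.0, 3.0, 4.0, 5.0],
    index=pd.date_range("2017-01-01", periods=5),
)

# Each value is the mean of the current row and the two before it;
# the first two rows lack a full window and come out missing.
smoothed = daily.rolling(3).mean()

print(smoothed)
```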
Use pandas converters to clean data during import, filling missing values with zero and formatting currency with lambda functions; also apply converters to extract year parts from dates.
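As an illustration of cleaning during import, the sketch below reads a small in-memory CSV. The file contents and the `price` column are assumptions, and `io.StringIO` stands in for a real file path.

```python
import io

import pandas as pd

# Hypothetical raw file: currency symbols and one empty price.
raw = io.StringIO("date,price\n2017-01-01,$1.50\n2017-01-02,\n")

# A converter receives each cell as a string before type inference,
# so it can strip the dollar sign and replace blanks with zero.
df = pd.read_csv(
    raw,
    converters={"price": lambda v: float(v.replace("$", "")) if v else 0.0},
)

print(df)
```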
Learn to read text files with pandas using the correct delimiter in read_csv, and to read, select sheets, and concatenate Excel data with read_excel and concat, handling ignore_index.
Set up an Excel writer, loop through 2013–2017, filter by year, and write each year’s data to a sheet named after the year, with a CSV export using f-strings in pandas.
Learn how pandas reads and writes JSON, Feather, HTML, pickle, and Python dictionaries, and use read_html to extract a Wikipedia GDP table for cleaning and plotting by state.
Learn how to combine Pandas data frames by joining on related fields and appending rows, and understand join types and when to use single or multiple keys.
Master how to append and join data frames in pandas by stacking rows with identical columns and adding columns via a shared key, using concat and index management.
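Stacking rows with `concat` might look like the sketch below, which uses two tiny invented frames with identical columns.

```python
import pandas as pd

# Hypothetical yearly extracts with the same columns.
sales_2014 = pd.DataFrame({"date": ["2014-01-01"], "sales": [10]})
sales_2015 = pd.DataFrame({"date": ["2015-01-01"], "sales": [20]})

# Without ignore_index both rows would keep index 0;
# ignore_index=True builds a fresh 0..n-1 index.
stacked = pd.concat([sales_2014, sales_2015], ignore_index=True)

print(stacked)
```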
Learn to append data frames with pandas by reading CSV and Excel files, dropping a useless column, and concatenating 2014 and 2015 data into a clean, unified frame.
Learn how inner joins merge tables on date, compare row counts, and identify when to switch to left joins using pandas merge with left_on and right_on for explicit joins.
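The row-count comparison between join types can be sketched as follows. Both frames and their column names are hypothetical.

```python
import pandas as pd

# Hypothetical tables whose date columns have different names.
transactions = pd.DataFrame({
    "date": ["2017-01-01", "2017-01-02", "2017-01-03"],
    "transactions": [100, 120, 90],
})
oil = pd.DataFrame({
    "oil_date": ["2017-01-02", "2017-01-03"],
    "price": [52.3, 53.1],
})

# Inner join: only dates present in both tables survive.
inner = transactions.merge(
    oil, how="inner", left_on="date", right_on="oil_date"
)

# Left join: every transaction row is kept; unmatched prices are missing.
left = transactions.merge(
    oil, how="left", left_on="date", right_on="oil_date"
)

print(len(inner), len(left))
```

A drop in row count after an inner join is a common signal that a left join may be the safer choice.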
This is a hands-on, project-based course designed to help you master two of the most popular Python packages for data analysis and business intelligence: NumPy and Pandas.
We'll start with a NumPy primer to introduce arrays and array properties, practice common operations like indexing, slicing, filtering and sorting, and explore important concepts like vectorization and broadcasting.
From there we'll dive into Pandas, and focus on the essential tools and methods to explore, analyze, aggregate and transform series and dataframes. You'll practice plotting dataframes with charts and graphs, manipulating time-series data, importing and exporting various file types, and combining dataframes using common join methods.
Throughout the course you'll play the role of Data Analyst for Maven Mega Mart, a large, multinational corporation that operates a chain of retail and grocery stores. Using the Python skills you learn throughout the course, you'll work with members of the Maven Mega Mart team to analyze products, pricing, transactions, and more.
COURSE OUTLINE:
Intro to NumPy & Pandas
Introduce NumPy and Pandas, two critical Python libraries that help structure data in arrays & DataFrames and contain built-in functions for data analysis
Pandas Series
Introduce Pandas Series, the Python equivalent of a column of data, and cover their basic properties, creation, manipulation, and useful functions for analysis
Intro to DataFrames
Work with Pandas DataFrames, the Python equivalent of an Excel or SQL table, and use them to store, manipulate, and analyze data efficiently
Manipulating Python DataFrames
Aggregate & reshape data in DataFrames by grouping columns, performing aggregation calculations, and pivoting & unpivoting data
Basic Python Data Visualization
Learn the basics of data visualization in Pandas, and use the plot method to create & customize line charts, bar charts, scatterplots, and histograms
MID-COURSE PROJECT
Put your skills to the test with a brand new dataset, and use your Python skills to analyze and evaluate a new retailer as a potential acquisition target for Maven Mega Mart
Analyzing Dates & Times
Learn how to work with the datetime data type in Pandas to extract date components, group by dates, and perform time intelligence calculations like moving averages
Importing & Exporting Data
Read in data from flat files and apply processing steps during import, create DataFrames by querying SQL tables, and write data back out to its source
Joining Python DataFrames
Combine multiple DataFrames by joining data from related fields to add new columns, and appending data with the same fields to add new rows
FINAL COURSE PROJECT
Put the finishing touches on your project by joining a new table, performing time series analysis, optimizing your workflow, and writing out your results
Join today and get immediate, lifetime access to the following:
13+ hours of high-quality video
Python NumPy & Pandas PDF ebook (350+ pages)
Downloadable project files & solutions
Expert support and Q&A forum
30-day Udemy satisfaction guarantee
If you're a data analyst, data scientist, business intelligence professional or data engineer looking to add Pandas to your Python skill set, this course is for you.
Happy learning!
-Chris Bruehl (Python Expert & Lead Python Instructor, Maven Analytics)
__________
Looking for our full business intelligence stack? Search for "Maven Analytics" to browse our full course library, including Excel, Power BI, MySQL, Tableau and Machine Learning courses!
See why our courses are among the TOP-RATED on Udemy:
"Some of the BEST courses I've ever taken. I've studied several programming languages, Excel, VBA and web dev, and Maven is among the very best I've seen!" Russ C.
"This is my fourth course from Maven Analytics and my fourth 5-star review, so I'm running out of things to say. I wish Maven was in my life earlier!" Tatsiana M.
"Maven Analytics should become the new standard for all courses taught on Udemy!" Jonah M.