Data analysis and visualization in Python with Pandas

Name: Data analysis and visualization in Python with Pandas
Rating: 4.8 (49 reviews)

The student will gain knowledge of the Python libraries pandas and matplotlib, and of data analysis and visualization.

Created by Ardian Grezda

Last updated 3/2024

English

What you'll learn

Basics of the pandas library
File reading and writing
Data visualization using matplotlib
Data wrangling
Data aggregation
Time series

Course content

7 sections • 92 lectures • 10h 39m total length

Series part 1 • 9:21
Begin exploring the pandas library by creating and inspecting a one-dimensional series, understanding its index and values, using default and custom indexes, and filtering positive values.
Series part 2 • 8:40
Explore basic series arithmetic, creating series from dictionaries, and index handling in pandas, including isnull checks, null value detection, and named series and index; plus adding series with aligned indexes.
DataFrame part 1 • 8:08
Create a data frame as a tabular structure with row and column indexes in pandas from dictionary. Explore columns like state, year, statistics, and depth, and access data via indexing.
DataFrame part 2 • 6:31
Modify a data frame by assigning to a column and using a series, create new columns, delete columns, and build data frames from dictionaries for Italy and Germany across years.
Index object • 2:20
Explore how the pandas index object maintains axis values for a series or data frame, illustrating creation, indexing, and the immutable nature of the index.
Reindexing • 10:55
Learn how reindexing aligns series and data frames, handles missing values with none, and uses forward fill or interpolation to manage new indexes and columns.
Deleting data from the axis • 6:41
Demonstrate deleting data from the axis in Pandas with the drop method on series and dataframe, removing indexes and columns using axis controls.
Indexing, selection and filtering Part 1 • 7:32
Explore indexing, selection, and filtering in pandas series, including label and position retrieval, slicing, boolean masking, and updating values such as B and C, similar to numpy.
Indexing, selection and filtering Part 2 • 9:34
Explore indexing, selection, and filtering in a data frame, including selecting columns and rows with lists, boolean masks, and conditional updates.
Indexing, selection and filtering Part 3 • 6:45
Arithmetics with Series and DataFrames • 9:21
Explore arithmetics with series and data frames by adding pandas objects and aligning by index and columns. See how sums handle missing indices or columns with NaN (not a number).
Functions and mapping • 7:32
Learn element-wise and row or column mapping in pandas with numpy, applying abs and custom lambda functions to transform data frames and compute max-min per axis.
Sorting in pandas • 9:55
Sort data in pandas by index and by columns using sort_index, and order values with sort_values. Learn to sort by a specific column (axis=1) or by index across rows.
Indexes with duplicate values • 2:48
Explore how pandas handles indexes with duplicate values by inspecting a series and a data frame, and verify index uniqueness using is_unique while selecting entries by index label.
Class work no 1 • 1:59
Create a pandas data frame with country index, populate population and female lists, compute male, add a dump column as 70% of male, then calculate percent male and drop dump.
Solution to class work no 1 • 7:24
Demonstrate a pandas workflow by building a country population dataframe, adding female and male columns, computing 70% of male, calculating percent of male, and removing the dump column.

Statistical description methods in pandas part 1 • 9:22
Explore how pandas computes statistical descriptions and simple aggregations. Use sum, axis options, and describe to obtain mean, std, min, max, and percentiles (20%, 50%, 70%) while handling NaN values.
Statistical description methods in pandas part 2 • 4:17
Describe how pandas handles non-numerical data, showing count, unique, top, and freq, and summarize methods like min, max, argmin, argmax, idxmin, idxmax, sum, mean, median, var, std, diff, and pct_change.
Unique Values and Value counting part 1 • 3:30
Unique Values and Value counting part 2 • 7:46
Explore using the isin method to filter unique values and build a boolean mask, then apply pandas value_counts across Q1–Q3 with zeros for non-numeric values.
Manipulation of missing data • 2:37
Learn to manipulate missing data in pandas by filtering with dropna, using a series containing None values, and observing how dropna drops missing entries while preserving valid data.
Filtering missing data • 8:21
Learn to filter missing data in pandas by using isnull and dropna to drop rows or columns with NaN, including how='all' to drop only all-NaN rows or columns.
Filling missing data • 7:17
Learn how to fill missing data in pandas using fillna with scalar values, per-column fills via a dictionary, and mean (average) imputation for a series in a data frame.
Hierarchical indexing Part 1 • 9:56
Explore hierarchical indexing in pandas, using multi-level indexes and unstack to transform a multi-index series into a data frame, with practical examples of indexing, selecting, and unstacking.
Hierarchical indexing Part 2 • 8:56
Explore hierarchical indexing in pandas by building a two-level index with two-level columns, renaming levels, swapping them, and summing by level and by columns.
Using dataframe columns • 4:51
Learn to work with dataframe columns in pandas by setting and resetting indices, creating multi-index dataframes, and understanding how columns become index in dataframes.
Class work no 2 • 2:33
Create a pandas data frame from cattle, chicken, and sheep, indexed by country; use lambda to find min and max, then compute sum and average; sort by index and sheep.
Solution to class work no 2 • 14:54
Explore the clustering exercise and build a pandas data frame from cattle, chicken, and sheep, perform filtering with loc, compute min, max, sums, means, and handle missing values.

Reading and writing data from file Part 1 • 8:38
Learn how to read and write data from files in Python using Pandas, including read_csv and read_table, handling headers, custom column names, and index columns.
Reading and writing data from file Part 2 • 8:24
Learn to read and write data from files with pandas, including multi-index creation using a list of index columns, handling missing values with na_values, and skipping rows with read_csv.
Partial reading of text files • 11:00
Learn to perform partial reading of large text files with pandas by reading in chunks using read_csv and chunk size, then compute value counts by key.
Writing data out to text format • 5:17
Learn how to write data to text formats using pandas, exporting with to_csv and reading with read_csv, and control headers and delimiters for csv files.
Reading Excel files • 2:43
Read excel files into pandas data frames using xlrd and openpyxl, load a file, parse sheet one, and produce a data frame with columns A, B, C, D, and message.
JSON data • 6:20
Learn JSON data concepts and handling in Python by using the json library to load and dump data, and pandas to build dataframes from JSON with selected fields.
Class work no 3 • 1:33
Read 1880, 1881, 1882 into data frames df_1880, df_1881, df_1882. Count 1880 births by gender and Alice; Grace across frames; set 1180–1200 in df_1880 to ABC; list df_1882 last names.
Solution to class work no 3 • 12:22
Learn to load three year data frames with pandas read_csv, filter by gender, sum totals, and count names like Grace and Alice.

Data visualization: line drawing • 11:47
Explore data visualization in Python using matplotlib and seaborn, learn to plot lines with various colors, markers, and styles, and apply auto scale and tight layout for clear figures.
Scatter graph • 4:38
Create a scatter graph in Python by plotting x and y values with Matplotlib, then customize size, color, marker type, and transparency to illustrate data points.
Bar graph • 6:10
Create and customize a bar graph in python using two one-dimensional arrays for x and y, including default and full-form bars, plus horizontal bars with colors and edge colors.
Pie graph • 5:37
Explore creating a pie graph with matplotlib using a one-dimensional data array, customizing with labels, explode, colors from seaborn palettes, auto percent, and an optional shadow.
Advanced drawing 2d • 7:46
Learn to create a single figure with multiple subplots on a 2x2 grid, placing bar, pie, and scatter plots in separate panels, and apply seaborn color palettes with panel titles.
Title, tick and label positioning • 7:56
Position titles, ticks, and labels in a matplotlib plot using set_title, set_xlabel, set_ylabel, and set_xticks/set_yticks, with custom tick labels and rotation.
Legend positioning • 4:18
Learn how to position legends in matplotlib to identify multiple graphs, using legend with loc='best' to automatically place the legend where it overlaps the plotted data least.
Line drawing in pandas • 8:02
Explore line drawing in pandas using the series and data frame plot methods, with matplotlib and numpy that plot indexed values and base ten logs.
Bar drawing in pandas • 7:30
Explore bar drawing in pandas by plotting bar and horizontal bar charts with kind='bar', using series from linspace data and a figure with two graphs.
Scatter graph in pandas • 5:28
Create a scatter plot in python using pandas and matplotlib by loading macro data, selecting CPI, M1, tb rate, and unemployment, and plotting M1 differences against unemployment with labeled axes.
Class work no 4 • 1:44
Read macro data and marker data csv files into data frames, create frames a, b, and c with year, real gdp, and real cons, and plot the graph for 1960–1963.
Solution to class work no 4 • 8:03
Read macro data from csv files with pandas to build data frames. Visualize real GDP and related metrics using line plots and horizontal bar charts.

Data wrangling • 1:01
Master data wrangling in Python with Pandas by reading, cleaning, transforming, and rearranging data, and learn how to combine datasets with merge and concat operations.
Merging DataFrames part 1 • 10:46
Merge data frames using pandas' merge method to link rows by keys. Learn inner, left, right, and outer joins and how to use on, left_on, and right_on for column names.
Merging DataFrames part 2 • 11:28
Learn to merge data frames in Python using left and outer joins on two keys, and resolve overlapping columns with suffixes.
Merging index objects • 12:27
Explore merging dataframes by index and by column in pandas, using left_on and right_index, including multi-index joins and the outer join to align all indexes.
Concatenation in pandas Part 1 • 4:43
Learn pandas concatenation with pd.concat to merge series and data frames along different axes, illustrating vertical and horizontal stacking and index/column alignment.
Concatenation in pandas Part 2 • 8:36
Explore how to concatenate data frames in pandas using pd.concat, control join behavior with axis and inner join, and build hierarchical indexes with keys.
DataFrame Rearrangement • 4:56
Learn how hierarchical indexing enables data frame rearrangement in pandas, using stack and unstack to rotate between rows and columns.
Removing duplicate data • 3:34
Learn to remove duplicate data in pandas by examining a dataframe, identifying duplicates with the duplicated method, and using drop_duplicates to keep non-duplicate rows.
Data transformation using function or mapping • 6:10
Learn how to transform data in pandas by mapping values through a dictionary and using lambda functions, including lowercasing text and creating a new animal column.
Replacing values in pandas • 4:59
Learn to replace values in pandas series using the replace method, including replacing with numpy NaN, handling multiple replacements, and using a dict to map old values to new ones.
Renaming indexes in pandas • 5:49
Learn to rename pandas indexes using map and rename, converting indexes to upper or title case with str.upper and str.title, and adjust column labels using a deque structure.
Class work no 5 • 2:36
Read Titanic data in pandas by loading train and test csv into dataframes, select pclass, sex, age, fare, embarked for C and D, perform left join to merge and concatenate.
Solution to class work no 5 • 9:58
Learn to analyze Titanic data with pandas by reading train and test csvs, selecting pclass, sex, age, fare, and embarked, and merging and concatenating frames using index operations.

Groupby mechanics part 1 • 8:25
Explore data aggregation with pandas groupby mechanics, splitting a dataframe by keys along rows or columns, applying functions, and combining results into grouped summaries.
Groupby mechanics part 2 • 7:29
Explore advanced group by mechanics in python with pandas, using two keys and arrays to build multi-index results, unstack for wide format, and compute mean values across UK and USA.
Group iteration • 2:12
Learn group iteration in Python by grouping data by a key, yielding tuples with a group name and its data, and inspecting each group's key and data.
Column selection • 4:38
Explore column selection in pandas; compare selecting columns for aggregation with group by operations, including single and multi-column grouping, yielding series or data frames depending on brackets.
Grouping with dictionary and Series • 4:39
Group a dataframe by a dictionary mapping columns to red and blue, then sum values to compare red (a, b) and blue (c, d) groups.
Grouping with functions • 5:31
Explore grouping with functions in Pandas, using len to group by index length and compute sums or minimum. Build multi index results; relate functions to lists, arrays, dictionaries, and series.
Data aggregation • 8:40
Explore data aggregation in pandas, transforming arrays into scalar values via group by and agg functions, including peak to peak (max minus min) and tips.csv data insights.
Grouping by columns • 7:04
Group by columns in a pandas tips data frame to compute mean and other metrics (std, min, max, count) across sex and smoker, producing hierarchical, multi-index results.
Multiple functions application • 3:34
Apply multiple functions to dataframe columns using a dictionary in pandas, aggregating tip_pct with min, max, mean, and std, and size with sum after grouping by sex and smoker.
General form of operation split apply combine • 9:24
Learn the split-apply-combine workflow in pandas, create a top function to select top tip pct values, and apply it with group by smoker and day.
Pivot tables • 7:09
Explore pivot tables in pandas to summarize and aggregate data by multiple keys, using pivot_table and group by, with margins and aggregate functions like count and sum.
Class work no 6 • 1:38
Analyze Titanic data with pandas by loading Titanic.csv, selecting pclass, sex, age, fare, embarked, performing group by counts and aggregations, and identifying the oldest traveler without an index.
Solution to class work no 6 • 9:37
Analyze Titanic data with pandas by reading the csv, selecting five columns, grouping by sex and embarked, and applying min, max, sum, count, and size aggregations to reveal distributions.

Time series introduction • 1:05
Explore the fundamentals of time series data, its importance across finance, economics, ecology, and physics, and how pandas supports time stamps, separation, rearrangement, aggregation, and fixed frequencies.
Date and time data types • 7:51
Explore Python date and time data types using the datetime module. Import datetime, create current time, inspect year, month, and day, and compute differences with time delta between two datetimes.
Converting from string to date • 11:44
Convert strings to date objects in Python using datetime, strftime, and strptime, with pandas to_datetime for date arrays and formatting codes like %Y-%m-%d.
Time series basics • 3:53
Explore time series basics in pandas by creating datetime indices, building a series with timestamped values, inspecting the datetime index and dtype, and converting indices to timestamp objects.
Indexing and selection • 11:40
Explore indexing and selection in Python with pandas, focusing on time series operations, date-like indices, slicing, and extracting year and month from data frames and series.
Time series with double indexes • 5:01
Explore time series with double indexes in pandas, learn how to handle duplicate date indices, check uniqueness, and select all entries matching a specific date like 2018-01-02.
Resample conversion • 4:05
Learn to convert irregular time series to a fixed daily frequency with pandas resample('D'), applying daily mean and handling missing days as NaN in the resulting series.
Date range generation • 9:38
Learn to generate date time indexes with pandas date_range, using start, end, or periods, and apply daily (D) or last business day (BM) frequencies.
Frequencies and date shift • 6:04
Explore how pandas uses base frequencies and multipliers, such as four hours, and calendar codes like D, H, T, S, M, BM to build date ranges.
Data replacement (before and after) • 7:44
Explore data replacement before and after using the shift method on time series and data frames, moving values forward and backward while preserving the index, with monthly and daily examples.
Periods and arithmetics • 9:36
Explore periods and arithmetics in pandas, creating period objects for year-end monthly or quarterly intervals and using period range and period index. See additions or subtractions shift the period end.
Period conversion • 6:18
Learn to convert period index objects to different frequencies in pandas, transitioning from annual December periods to monthly start and end indexes, and creating series with month-end and month-start indexes.
Conversion from timestamps to periods • 8:54
Convert timestamps to periods using the to_period method on datetime indexes and series, illustrating how monthly frequency yields period indexes for January and February 2020.
Creation of PeriodIndex from arrays • 7:45
Create a period index from year and quarter columns in pandas, with a Q-DEC frequency, and assign it as the dataframe index.
Resampling and frequency conversion: downsampling • 13:18
Explore resampling and frequency conversion in pandas, focusing on downsampling high-frequency data to lower frequencies and upsampling to higher frequencies, with mean and sum aggregations.
Upsampling • 5:21
Upsample by converting lower frequency data to higher frequency using resample and mean aggregation, turning weekly data into daily values and aligning date ranges.
Drawing time series • 7:59
Draw time series in pandas by loading stock data from csv and parsing dates; set a date index, select AAPL, MSFT, XOM, SP, fill missing values forward, and plot trends.
Drawing time series example • 9:57
Depict a time series of Apple stock from 2005 to 2009 with simple, arithmetic, and exponential moving averages on a two-panel plot.

Requirements

The student should have a basic understanding of the Python programming language.

Description

The course title is “Data analysis and visualization using Python” and it uses the pandas library.

It is divided into 7 chapters.

Chapter 1 talks about the creation of pandas objects such as Series, DataFrame and Index. This chapter includes basic arithmetic with pandas objects. It also describes other operations on pandas objects such as reindexing, deleting data from an axis, filtering, indexing and sorting.
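
To give a flavour of the Chapter 1 topics, here is a minimal sketch. It is not taken from the course; the values and the column names (state, year, population) are made up for illustration.

```python
import pandas as pd

# A Series with a custom index (illustrative values).
s = pd.Series([4, -7, 5, 3], index=["d", "b", "a", "c"])
positive = s[s > 0]  # boolean filtering keeps only positive values

# Arithmetic aligns on index labels; labels missing on one side give NaN.
t = pd.Series([10, 20], index=["a", "b"])
total = s + t

# A DataFrame built from a dictionary of columns.
df = pd.DataFrame(
    {
        "state": ["Ohio", "Ohio", "Nevada"],
        "year": [2000, 2001, 2001],
        "population": [1.5, 1.7, 2.4],
    },
    index=["one", "two", "three"],
)

reindexed = df.reindex(["one", "two", "three", "four"])  # new row "four" is all NaN
without_year = df.drop("year", axis=1)                   # delete a column
by_population = df.sort_values("population", ascending=False)  # sort by a column
```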

Chapter 2 describes statistical methods applied in pandas objects and manipulation with data inside pandas object. It describes pandas operations such as: unique values, value counting, manipulation with missing data, filtering and filling missing data.
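
The Chapter 2 operations can be sketched roughly as follows. The data is invented for illustration and is not the course's own example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.nan, 3.0, np.nan], "b": [2.0, 4.0, np.nan, np.nan]})

summary = df.describe()             # count, mean, std, min, quartiles, max (NaN is skipped)
col_sums = df.sum()                 # column sums, missing values ignored
kept = df.dropna(how="all")         # drop only rows where every value is NaN
filled = df.fillna({"a": 0.0, "b": df["b"].mean()})  # per-column fill values

letters = pd.Series(["c", "a", "d", "a", "a", "b", "c"])
distinct = letters.unique()         # distinct values in order of appearance
counts = letters.value_counts()     # how often each value occurs
mask = letters.isin(["b", "c"])     # boolean mask for membership
```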

Chapter 3 talks about reading and writing data from text file format and Microsoft Excel. Partial reading of large text files is also described with an example.
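
As a rough illustration of text-file reading, partial reading and writing, the sketch below uses an in-memory string in place of a real file so that it runs on its own. The column names key and value are assumptions made for the example, and Excel reading is not shown.

```python
import io

import pandas as pd

csv_text = "key,value\na,1\nb,2\na,3\nb,4\na,5\n"

# Read the whole "file" at once.
df = pd.read_csv(io.StringIO(csv_text))

# Partial reading: process two rows at a time, then combine the per-chunk counts.
parts = [
    chunk["key"].value_counts()
    for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2)
]
key_counts = pd.concat(parts).groupby(level=0).sum()

# Write the data back out as text with a different delimiter and no index column.
out = io.StringIO()
df.to_csv(out, sep="|", index=False)
```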

Chapter 4 describes data visualization using the matplotlib library. It has examples of the following graphs: line, scatter, bar and pie. Setting the title, legend and labels in a graph is also described with some practical examples. Drawing from pandas objects is also described.
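
A minimal sketch of the four graph types on one figure is shown below. The data is made up, and the non-interactive Agg backend is an assumption chosen so the example runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # assumption: render off-screen, no window needed
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(5)
y = x ** 2

fig, axes = plt.subplots(2, 2, figsize=(8, 6))

axes[0, 0].plot(x, y, color="tab:blue", marker="o", linestyle="--", label="y = x squared")
axes[0, 0].set_title("Line")
axes[0, 0].set_xlabel("x")
axes[0, 0].set_ylabel("y")
axes[0, 0].legend(loc="best")

axes[0, 1].scatter(x, y, s=40, alpha=0.6)
axes[0, 1].set_title("Scatter")

axes[1, 0].bar(x, y, edgecolor="black")
axes[1, 0].set_title("Bar")

axes[1, 1].pie([3, 2, 1], labels=["a", "b", "c"], autopct="%1.0f%%")
axes[1, 1].set_title("Pie")

fig.tight_layout()

# Render to an in-memory PNG instead of writing a file to disk.
buf = io.BytesIO()
fig.savefig(buf, format="png")
```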

Chapter 5 talks about data wrangling. Merging Series object and DataFrame object is described with practical examples. Combining pandas objects and merging them is part of this chapter.
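
Merging and combining can be sketched as follows. The frames and the column names key, lval and rval are illustrative and do not come from the course material.

```python
import pandas as pd

left = pd.DataFrame({"key": ["a", "b", "c"], "lval": [1, 2, 3]})
right = pd.DataFrame({"key": ["a", "b", "d"], "rval": [10, 20, 40]})

inner = pd.merge(left, right, on="key")               # keys present in both frames
outer = pd.merge(left, right, on="key", how="outer")  # union of keys, NaN where absent

s1 = pd.Series([0, 1], index=["a", "b"])
s2 = pd.Series([2, 3], index=["c", "d"])

stacked = pd.concat([s1, s2])                                    # one longer Series
side_by_side = pd.concat([s1, s2], axis=1, keys=["one", "two"])  # two columns, aligned on index
```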

Chapter 6 talks about various forms of data aggregation and grouping. Creating and using pivot tables is also described.
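
A small sketch of grouping, aggregation and pivot tables follows. The frame below is invented for illustration and only loosely modelled on a tips-style dataset.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "sex": ["F", "M", "M", "F", "M", "F"],
        "smoker": ["no", "no", "yes", "yes", "no", "no"],
        "tip": [1.0, 2.0, 4.0, 5.0, 6.0, 3.0],
    }
)

# Split by one key, apply mean, combine into a Series indexed by sex.
mean_tip = df.groupby("sex")["tip"].mean()

# Two keys and several aggregations give a multi-index result.
stats = df.groupby(["sex", "smoker"])["tip"].agg(["min", "max", "count"])

# The same kind of summary as a pivot table: rows by sex, columns by smoker.
table = df.pivot_table(values="tip", index="sex", columns="smoker", aggfunc="sum")
```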

Chapter 7 talks about time series creation and manipulation. The classes DatetimeIndex and Period are included in the description of the chapter. Indexing and selection are described with practical examples.
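
The time series ideas can be sketched briefly as below. The dates and values are arbitrary and chosen only for illustration.

```python
import pandas as pd

# Sixty daily timestamps starting on 1 January 2020 form a DatetimeIndex.
idx = pd.date_range("2020-01-01", periods=60, freq="D")
ts = pd.Series(range(60), index=idx)

january = ts.loc["2020-01"]            # partial-string selection of one month
shifted = ts.shift(1)                  # values move one step forward, index unchanged
weekly_mean = ts.resample("W").mean()  # downsample daily values to weekly means

p = pd.Period("2020-01", freq="M")
next_month = p + 1                     # Period arithmetic moves by whole periods
monthly = ts.to_period("M")            # timestamps converted to monthly periods
```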

Who this course is for:

Aspiring data analyst
Data analyst
Students who want to learn the pandas library


Course content

Introduction to pandas library • 16 lectures • 1hr 55min

Operations in pandas library • 12 lectures • 1hr 24min

Reading and writing data to the file • 8 lectures • 56min

Data visualization • 12 lectures • 1hr 19min

Data wrangling in pandas • 13 lectures • 1hr 27min

Data grouping and aggregation • 13 lectures • 1hr 20min

Time series • 18 lectures • 2hr 18min