Data Processing with Python
4.1 (1,337 ratings)
Course Ratings are calculated from individual students’ ratings and a variety of other signals, like age of rating and reliability, to ensure that they reflect course quality fairly and accurately.
9,980 students enrolled

Learn how to use Python and Pandas for cleaning and reorganizing huge amounts of data.
Created by Ardit Sulce
Last updated 10/2019
English
This course includes
  • 3.5 hours on-demand video
  • 17 articles
  • 17 downloadable resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What you'll learn
  • Build 10 advanced Python scripts which together make up a data analysis and visualization program.
  • Solve six exercises related to processing, analyzing and visualizing US income data with Python.
  • Learn the fundamental blocks of the Python programming language such as variables, datatypes, loops, conditionals, functions and more.
  • Use Python to batch download files from FTP sites, extract, rename and store remote files locally.
  • Import data into Python for analysis and visualization from various sources such as CSV and delimited TXT files.
  • Keep the data organized inside Python in easily manageable pandas dataframes.
  • Merge large datasets taken from various data file formats.
  • Create pivot tables in Python out of large datasets.
  • Perform various operations among data columns and rows.
  • Query data from Python pandas dataframes.
  • Export data from Python into various formats such as TXT, CSV, Excel, HTML and more.
  • Use Python to perform various visualizations such as time series, plots, heatmaps, and more.
  • Create KML Google Earth files out of CSV files.
Course content
50 lectures, 03:46:33
+ Getting Started
2 lectures 11:27

You will learn how to install Python through the Anaconda distribution, which installs not only Python but also the libraries needed for data analysis and visualization, such as pandas, matplotlib, NumPy, and SciPy.

Preview 08:06

You will learn how to use the Spyder environment to write Python scripts and how to use IPython, an enhanced interactive shell where you type in and execute Python code. IPython is tailored for data analysis applications.

Python editors - Spyder and iPython
03:21
+ Downloading Many Files with Python
7 lectures 38:16

Short lecture introducing you to this section of the course.

Section introduction
01:34

You will learn how to write Python code that establishes a connection to an FTP server and accesses the files of the FTP site.

Navigating through FTP directory trees with Python
07:00

You will learn how to use the Spyder editor for executing complete scripts of Python code.

Storing Python code
04:32

You will learn how to create a custom FTP function that logs in to an FTP site and generates a list of file names contained in the site.

Creating an FTP function
02:29
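
As a rough illustration of the kind of function this lecture describes (not the course's own code), the sketch below uses the standard-library `ftplib` module. The host name in the usage note, the `list_ftp_files` name, and the `ftp_factory` parameter are all assumptions made for this example; `ftp_factory` exists only so the function can be exercised without a real server.

```python
from ftplib import FTP


def list_ftp_files(host, directory="/", user="anonymous", password="",
                   ftp_factory=FTP):
    """Log in to an FTP site and return the names found in `directory`.

    `ftp_factory` is a hypothetical hook for substituting a stand-in
    connection class; by default the real ftplib.FTP is used.
    """
    ftp = ftp_factory(host)        # FTP(host) connects on construction
    try:
        ftp.login(user, password)  # anonymous unless credentials are given
        ftp.cwd(directory)         # move to the directory of interest
        return ftp.nlst()          # list of names in that directory
    finally:
        ftp.close()                # release the connection in every case
```

A call such as `list_ftp_files("ftp.example.com", "/pub")` would return a list of names, where `ftp.example.com` is a placeholder host.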

You will learn the Python code that downloads a single file from an FTP site.

Downloading an FTP file
08:32

Something to keep in mind for the next lecture.

About the next lecture
00:27

Here we start building our data analysis program.

In this lecture, we will build an FTP function that logs in to the FTP site and downloads a given range of files from the site.

Practice No.1: Creating an FTP File Downloader
13:42
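
A minimal sketch of a range downloader, assuming the caller already holds a logged-in `ftplib.FTP` connection and a list of remote file names. The function name `download_range` and its parameters are invented for this example and are not taken from the course.

```python
import os


def download_range(ftp, names, start, stop, target_dir="."):
    """Download names[start:stop] (after sorting) into `target_dir`.

    `ftp` is expected to behave like a logged-in ftplib.FTP object.
    Returns the list of local paths that were written.
    """
    os.makedirs(target_dir, exist_ok=True)
    saved = []
    for name in sorted(names)[start:stop]:
        local_path = os.path.join(target_dir, name)
        with open(local_path, "wb") as fh:
            # retrbinary streams the remote file to the callback in chunks
            ftp.retrbinary("RETR " + name, fh.write)
        saved.append(local_path)
    return saved
```

Sorting the names first makes the meaning of "range" predictable when the server returns files in arbitrary order.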
+ Extracting Data from Archive Files
3 lectures 11:30

You will learn how to extract various types of archive files using the patool library and a for loop.

Extracting ZIP, TAR, GZ and other archive file formats
03:41

You will learn how to extract RAR archive files.

Extracting RAR files
01:57

Here you will write a function that fetches the archive files downloaded by the FTP function and extracts them all into a local directory.

Practice No.2: Creating a Batch Archive Extractor
05:52
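
The course uses the patool library for this step. The sketch below swaps in the standard-library `shutil.unpack_archive` instead, so it runs without extra installs; it handles ZIP and TAR variants but not RAR or bare GZ files. The function name `extract_all` is an assumption for illustration.

```python
import shutil
from pathlib import Path


def extract_all(archive_dir, output_dir):
    """Extract every recognised archive in `archive_dir` into `output_dir`.

    Returns the names of the archives that were extracted.
    """
    output = Path(output_dir)
    output.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(Path(archive_dir).iterdir()):
        if not archive.is_file():
            continue
        try:
            # The format is inferred from the file extension
            shutil.unpack_archive(str(archive), str(output))
        except (shutil.ReadError, ValueError):
            continue  # skip files shutil cannot unpack
        extracted.append(archive.name)
    return extracted
```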
+ Working with TXT and CSV Files
8 lectures 20:10

Short lecture introducing you to this section of the course.

Section introduction
01:22

You will learn how to easily read CSV and delimited TXT files using the pandas library and use their data inside Python.

Reading delimited TXT and CSV files
10:06
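
For a sense of what reading delimited files with pandas looks like, here is a small sketch. The data is made up and held in memory through `io.StringIO` so the example needs no files on disk; with real files you would pass a path instead.

```python
import io

import pandas as pd

# Made-up sample data standing in for files on disk
csv_text = "station,temp\nA,21.5\nB,19.0\n"
txt_text = "station;temp\nA;21.5\nB;19.0\n"

# Comma is the default delimiter
df_csv = pd.read_csv(io.StringIO(csv_text))

# Any other delimiter is passed through `sep`
df_txt = pd.read_csv(io.StringIO(txt_text), sep=";")
```

Both calls produce the same two-column dataframe; only the delimiter differs.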
Reading Excel files
00:16

You will learn how to export data from Python to CSV and TXT files.

Exporting data from Python to files
04:14
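
A short sketch of exporting a dataframe with `DataFrame.to_csv`, using made-up data. When no path is given, `to_csv` returns the text instead of writing a file, which keeps the example self-contained.

```python
import pandas as pd

# Made-up sample data
df = pd.DataFrame({"station": ["A", "B"], "temp": [21.5, 19.0]})

# Comma-separated text, without the row index
csv_text = df.to_csv(index=False)

# Tab-delimited text, as is common for TXT exports
tsv_text = df.to_csv(index=False, sep="\t")

# To write a file instead, pass a path (hypothetical file name):
# df.to_csv("readings.csv", index=False)
```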

You will learn how to read data from TXT files whose columns are defined by fixed widths.

Reading fixed width TXT files
01:58
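
Fixed-width files can be read with `pandas.read_fwf`. In this sketch the sample text, the column widths, and the column names are all assumptions chosen for illustration.

```python
import io

import pandas as pd

# Made-up fixed-width data: 4 characters of station id, 5 of temperature
fwf_text = "A001 21.5\nB002 19.0\n"

df_fwf = pd.read_fwf(
    io.StringIO(fwf_text),
    widths=[4, 5],              # width of each column in characters
    names=["station", "temp"],  # the sample has no header row
)
```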

You will learn how to quickly export a pandas dataframe into an HTML file.

Exporting data back to HTML and other file formats
01:02
Data Analysis Exercise 1
01:09
Data Analysis Exercise 1: Solution
00:02
+ Getting Started with Pandas
4 lectures 12:13

We already used the pandas library in the previous section. Here you will be given a proper tour of the pandas data analysis library.

Get started with Pandas
06:16

You will create a function that grabs all the TXT files of a folder, opens each of them in Python as dataframes, adds a column in each dataframe and exports the updated dataframes back to CSV files.

Practice No.3: Calculating and Adding Columns to CSV Files
04:57
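
The steps in this practice could be sketched roughly as follows. The column names `temp_c` and `temp_f`, the Celsius-to-Fahrenheit calculation, and the function name are assumptions for the example; the course's own columns and formula may differ.

```python
from pathlib import Path

import pandas as pd


def add_column_to_files(folder, sep=","):
    """Read each .txt file in `folder`, add a column, and save it as .csv.

    Returns the list of CSV paths that were written.
    """
    written = []
    for txt_path in sorted(Path(folder).glob("*.txt")):
        df = pd.read_csv(txt_path, sep=sep)
        # Hypothetical derived column: Celsius converted to Fahrenheit
        df["temp_f"] = df["temp_c"] * 9 / 5 + 32
        csv_path = txt_path.with_suffix(".csv")
        df.to_csv(csv_path, index=False)
        written.append(csv_path)
    return written
```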
Data Analysis Exercise 2
00:56
Data Analysis Exercise 2: Solution
00:03
+ Merging Data
8 lectures 19:44

You will write a function that gets all the CSV files and concatenates them vertically with the pandas concat function, producing a single CSV file that contains everything.

Practice No. 4: Concatenating multiple CSV files
06:18
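
Vertical concatenation with `pandas.concat` might look like the sketch below. The two in-memory strings are made-up stand-ins for CSV files on disk.

```python
import io

import pandas as pd

# Made-up data standing in for two CSV files with the same columns
csv_a = "station,temp\nA,21.5\n"
csv_b = "station,temp\nB,19.0\n"

frames = [pd.read_csv(io.StringIO(text)) for text in (csv_a, csv_b)]

# Stack the rows; ignore_index renumbers them 0..n-1
combined = pd.concat(frames, ignore_index=True)
```

Without `ignore_index=True`, each frame would keep its own row labels and the result would contain duplicate index values.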
Data Analysis Exercise 3
00:40
Data Analysis Exercise 3: Solution
00:38

You will write a function that will join columns of a pandas dataframe to another dataframe.

Practice No. 5: Joining Data Based on a Matching Column
08:59
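
Joining on a matching column can be sketched with `DataFrame.merge`. The dataframes and column names here are invented for illustration.

```python
import pandas as pd

# Made-up data: readings per station, and a lookup table of station cities
readings = pd.DataFrame({"station": ["A", "B", "C"],
                         "temp": [21.5, 19.0, 17.2]})
stations = pd.DataFrame({"station": ["A", "B"],
                         "city": ["Oslo", "Rome"]})

# A left join keeps every reading; stations without a match get NaN
joined = readings.merge(stations, on="station", how="left")
```

Choosing `how="left"` preserves all rows of the first dataframe, which is usually what you want when enriching a large dataset from a smaller lookup table.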
Data Analysis Exercise 4: Solution
00:21
Data Analysis Exercise 5
01:14
Data Analysis Exercise 5: Solution
00:03
+ Data Aggregation
1 lecture 07:41

You will learn how to use the pandas pivot function to create a pivoted dataframe from a large CSV file, aggregating the data values.

Practice No. 6: Pivoting Large Amounts of Data
07:41
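
A small sketch of pivoting with aggregation. It uses `DataFrame.pivot_table` rather than `DataFrame.pivot`, because `pivot_table` accepts an aggregation function for duplicate index and column pairs; whether the course uses the same call is not stated. The data is made up.

```python
import pandas as pd

# Made-up long-format data: one row per reading
df_long = pd.DataFrame({
    "station": ["A", "A", "B", "B"],
    "year": [2000, 2001, 2000, 2000],
    "temp": [10.0, 12.0, 20.0, 22.0],
})

# One row per station, one column per year, mean temperature in each cell
pivoted = df_long.pivot_table(index="station", columns="year",
                              values="temp", aggfunc="mean")
```

Station B has two readings for 2000, so that cell holds their mean; it has none for 2001, so that cell is NaN.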
+ Visualizing Data
5 lectures 28:18

You will learn how to use the visualization features available in Python and generate graphs using the matplotlib and the seaborn libraries.

Data visualization with Python
11:31

You will expand your knowledge of creating different kinds of visualizations from pandas dataframes and adding labels and legends to the generated graphs.

Preview 12:23

You will learn to create a function that accesses the pivoted dataframe, generates a graph representing the data, and saves the graph to a PNG image file.

Practice No. 7: Producing Image Files
03:08
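
Saving a graph of a dataframe to a PNG file could be sketched as below, assuming matplotlib is installed. The function name `save_line_chart` and the axis labels are assumptions; the `Agg` backend is selected so the code runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen; no window is opened

import matplotlib.pyplot as plt
import pandas as pd


def save_line_chart(df, path):
    """Plot each column of `df` as a line and save the figure to `path`."""
    ax = df.plot(kind="line", marker="o")    # one line per column
    ax.set_xlabel(df.index.name or "index")  # label the x axis
    ax.set_ylabel("value")                   # placeholder y-axis label
    ax.figure.savefig(path, dpi=100)         # format follows the extension
    plt.close(ax.figure)                     # free the figure's memory
    return path
```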
Data Analysis Exercise 6
01:08
Data Analysis Exercise 6: Solution
00:08
+ Mapping Spatial Data
2 lectures 12:23

You will learn how to create a point KML file using the simplekml library and display the file in Google Earth.

Programmatically creating KML Google Earth files with Python
04:37

You will create a function that grabs the data from a pandas dataframe and creates a KML file using the latitude and the longitude information contained in the dataframe.

Practice No. 8: Creating KML Google Earth files from CSV data
07:46
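
The course builds KML files with the simplekml library. To show what such a file contains without requiring that install, the sketch below writes point placemarks with the standard-library `xml.etree.ElementTree` instead. The function name and the `name`, `lat`, and `lon` keys are assumptions for the example.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def rows_to_kml(rows, path, name_key="name", lat_key="lat", lon_key="lon"):
    """Write one KML point placemark per row (each row is a dict)."""
    ET.register_namespace("", KML_NS)  # emit KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for row in rows:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = str(row[name_key])
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML expects longitude first, then latitude
        coords = f"{row[lon_key]},{row[lat_key]}"
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = coords
    ET.ElementTree(kml).write(path, encoding="utf-8", xml_declaration=True)
```

Rows from a pandas dataframe can be passed in as `df.to_dict("records")`, provided the dataframe has the three expected columns.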
+ Putting everything together
6 lectures 22:37

You will learn how to make your script interact with a user who runs it.

User interaction
06:07
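
One simple way to let a script interact with its user is the built-in `input` function. This sketch asks for a range of years and repeats the question until the answer is valid. The prompt text, the function name, and the `ask` parameter are assumptions; `ask` defaults to `input` and exists so the function can be driven by something other than the keyboard.

```python
def ask_year_range(ask=input):
    """Prompt until the user supplies two years with start <= end."""
    while True:
        raw = ask("Enter start and end year separated by a space: ")
        parts = raw.split()
        if len(parts) == 2 and all(p.isdigit() for p in parts):
            start, end = int(parts[0]), int(parts[1])
            if start <= end:
                return start, end
        # Anything else falls through to a hint and another attempt
        print("Please enter two years, for example: 1990 2000")
```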
Exercise: User interaction
00:39
Exercise: User interaction: Solution
00:20

You will learn how to execute all the functions of the program in a single click.

Practice No. 9: Polishing the Program, I
05:00

You will learn how to make your program more user friendly by integrating the user input functionality.

Practice No. 10: Polishing the Program, II
05:30

You will learn how to convert your program into a Python module so you can import it in other scripts.

Practice No. 11: Creating Python Modules
05:00
Requirements
  • A working computer (Windows, Mac, or Linux)
  • No prior knowledge of Python is required
Description

Data scientists spend only 20 percent of their time on building machine learning algorithms and 80 percent of their time finding, cleaning, and reorganizing huge amounts of data. That mostly happens because many use graphical tools such as Excel to process their data. However, with a programming language such as Python you can drastically reduce the time it takes to process your data and get it ready for use in your project. This course will show how Python can be used to manage, clean, and organize huge amounts of data.

This course assumes you have basic knowledge of variables, functions, for loops, and conditionals. In the course you will be given access to a million records of raw historical weather data and you will use Python in every single step to deal with that dataset. That includes learning how to use Python to batch download and extract the data files, load thousands of files into Python via pandas, clean the data, concatenate and join data from different sources, convert between fields, aggregate, apply conditions, and perform many more data processing operations. On top of that, you will also learn how to calculate statistics and visualize the final data. The course also includes a series of exercises in which you are given sample data and practice what you learned by cleaning and reorganizing that data using Python.

Who this course is for:
  • Those who come from any technology field that deals with any kind of data.
  • Those who want to leverage the power of the Python programming language for handling data.
  • Those who need to learn Python basics and want to quickly advance their skills by learning how to perform data cleaning, analysis and visualization with Python - all in one single course.
  • Those who want to switch from programming languages such as Java, C, R, Matlab, etc. to Python.