
This video gives an overview of the entire course.
In this video, we will see how to load data into Python. We load data into a Pandas DataFrame from various sources, including CSV, Excel, XML, and JSON files, and a web API.
In this video, we will see how to connect to a (MySQL) database. We set up such a connection.
Once we have a database, we will see how to add data to it. We create a table from Python with SQL commands, then use a Pandas DataFrame to add data to that table.
In this video, we will see how to extract data from a database. We use SQL and Pandas to load a table's data into a DataFrame.
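As a rough sketch of the load, store, and extract steps described above: the example below reads CSV text into a DataFrame, creates a table with SQL, appends the DataFrame to it, and reads it back. It assumes an in-memory SQLite database as a stand-in for MySQL so that it runs without a server; the table name `inventory`, the column names, and the CSV content are all made up for illustration.

```python
import io
import sqlite3

import pandas as pd

# Made-up CSV content standing in for a real file or download.
CSV_TEXT = "item,quantity\nwidget,3\ngadget,5\nsprocket,2\n"


def load_csv(text):
    """Read CSV text into a DataFrame, as read_csv() would for a file path."""
    return pd.read_csv(io.StringIO(text))


def round_trip(df, conn):
    """Create a table with SQL, append the DataFrame, then read it back."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS inventory (item TEXT, quantity INTEGER)"
    )
    # if_exists="append" adds rows to the table created above.
    df.to_sql("inventory", conn, if_exists="append", index=False)
    return pd.read_sql_query(
        "SELECT item, quantity FROM inventory ORDER BY item", conn
    )


# SQLite is an assumption here; the course itself connects to MySQL.
conn = sqlite3.connect(":memory:")
inventory = load_csv(CSV_TEXT)
from_db = round_trip(inventory, conn)
conn.close()
print(from_db)
```

With a MySQL server, the connection object would come from a MySQL driver or an SQLAlchemy engine instead, while the Pandas calls would stay much the same.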
Different datasets may contain different information that we want to analyze together. In this video, we use joins to combine these datasets.
Datasets may not come in an appropriate format or shape. This video demonstrates how to manipulate a dataset's shape, altering what is represented in rows and what is represented in columns.
Columns in a dataset may not represent data in a preferred manner. In this video, we look at different data transformation techniques to remedy this.
Strings may contain useful data. In this video, we show how to manipulate string contents and extract data from them.
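The four ideas above (joining, reshaping, transforming, and string extraction) can be sketched on two small made-up tables. The table contents, column names, and the choice of an inner join and of melt() are assumptions for illustration, not necessarily the techniques shown in the videos.

```python
import pandas as pd

# Made-up tables for illustration only.
people = pd.DataFrame({"id": [1, 2, 3], "name": ["Ann Lee", "Bo Chan", "Cy Diaz"]})
scores = pd.DataFrame({"id": [1, 2, 4], "math": [90, 75, 60], "art": [85, 95, 70]})

# Join: keep only the ids present in both tables (inner join).
joined = people.merge(scores, on="id", how="inner")

# Reshape: one row per (person, subject) instead of one column per subject.
tidy = joined.melt(id_vars=["id", "name"], var_name="subject", value_name="score")

# Transform: express each score as a fraction of 100.
tidy["score_frac"] = tidy["score"] / 100

# Strings: pull the last word of the name out with a regular expression.
tidy["last_name"] = tidy["name"].str.extract(r"(\w+)$", expand=False)

print(tidy)
```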
In this video, we will show how to form and use groups. We will use the DataFrame method groupby() to create groups, then perform the group operations we want.
In this video, we will learn how to get numerical summaries of groups. We will use data aggregation methods, such as Series/DataFrame methods or agg().
In this video, we will see how to perform operations at the group level. We will use split-apply-combine procedures.
In this video, we will show how to perform cross-tabulation or create pivot tables. We will use crosstab() or pivot_table().
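A minimal sketch of the grouping tools named above, using a made-up sales table. The data and column names are assumptions, and transform() is used here as one possible way to carry out a split-apply-combine step.

```python
import pandas as pd

# Made-up data for illustration only.
sales = pd.DataFrame({
    "region": ["east", "east", "west", "west", "west"],
    "product": ["a", "b", "a", "a", "b"],
    "units": [10, 20, 30, 50, 40],
})

# Form groups with groupby(), then summarize them with agg().
grouped = sales.groupby("region")
summary = grouped["units"].agg(["mean", "sum"])

# Split-apply-combine: subtract each region's mean from that region's rows.
sales["centered"] = grouped["units"].transform(lambda s: s - s.mean())

# Cross-tabulation: how many rows fall in each region/product cell.
counts = pd.crosstab(sales["region"], sales["product"])

# Pivot table: total units per region/product cell.
totals = sales.pivot_table(
    index="region", columns="product", values="units", aggfunc="sum"
)

print(summary)
print(counts)
print(totals)
```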
In this video, we will show how to get messy data from web pages and how to write a web scraper.
In this video, we will see how to practice safe web scraping. We will also highlight some issues for you to consider.
In this video, we will see how to get data from a simple HTML document and how to use BeautifulSoup.
This video shows how to use BeautifulSoup in practice by working through an example.
In this video, we will see how to create a dataset from data spread across multiple web pages. We crawl the web with Python and BeautifulSoup.
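To illustrate the idea of collecting data spread across several pages with BeautifulSoup, here is a small sketch that runs offline. The two-page "site" is a made-up dictionary of HTML strings, and fetch() is a hypothetical stand-in for a real HTTP request; the tag names, classes, and ids are assumptions too.

```python
from bs4 import BeautifulSoup

# Hypothetical two-page "site" held in memory so the sketch runs offline.
PAGES = {
    "/page1": """<html><body>
        <ul><li class="item">alpha</li><li class="item">beta</li></ul>
        <a id="next" href="/page2">next</a></body></html>""",
    "/page2": """<html><body>
        <ul><li class="item">gamma</li></ul></body></html>""",
}


def fetch(url):
    """Stand-in for an HTTP request; a real scraper would download the page."""
    return PAGES[url]


def crawl(start_url):
    """Follow 'next' links from start_url, collecting the text of each item."""
    items, url, seen = [], start_url, set()
    while url is not None and url not in seen:
        seen.add(url)  # never visit the same page twice
        soup = BeautifulSoup(fetch(url), "html.parser")
        items.extend(
            li.get_text(strip=True) for li in soup.find_all("li", class_="item")
        )
        next_link = soup.find("a", id="next")
        url = next_link["href"] if next_link else None
    return items


items = crawl("/page1")
print(items)
```

A scraper pointed at a live site would also need to respect the site's terms and pause between requests, in line with the safe scraping practices mentioned above.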
This video explains what Selenium is: a package for automating a web browser.
This video explains how Selenium navigates the web. It does so much like a human would.
In this video, we will see how to get data from a web page's dynamic content. We load the web page with Selenium, let the dynamic content activate, then read the page.
In this video, we will show what Scrapy is. It is a software suite built on Python for web crawling and scraping.
In this video, we will see how to create a Scrapy spider. Scrapy provides an automated process for creating projects and their spiders.
In this video, we will show how to make Scrapy spiders collect data. We program them, focusing on their parse() method.
Once we have a Scrapy spider programmed, we will see how to use it. We run it with the crawl command.
Python, a multi-paradigm programming language, has become the language of choice for data scientists for data analysis, visualization, and machine learning.
In this course, you’ll start by learning how to acquire data from the web that is already in a “clean” format, such as a .csv file or a database. You’ll then learn to transform this data so it’s in its most useful format for analysis. After that, you’ll dive into data aggregation and grouping, where you’ll learn to group similar data for easier analysis. From there, you’ll be shown different methods of web scraping using Python. Finally, you’ll learn to extract large amounts of data using BeautifulSoup, as well as work with Selenium and Scrapy.
About the author
Curtis Miller is an Associate Instructor at the University of Utah and an MSTAT student. He is currently involved in research on data analysis from statistical and computer science perspectives. Curtis has published research on policy and economic issues.