Web Scraping In Python: Master The Fundamentals
4.1 (52 ratings)
Instead of using a simple lifetime average, Udemy calculates a course's star rating by considering a number of different factors such as the number of ratings, the age of ratings, and the likelihood of fraudulent ratings.
1,378 students enrolled

Learn how to extract data from websites
Last updated 4/2017
English
Current price: $10 Original price: $100 Discount: 90% off
30-Day Money-Back Guarantee
Includes:
  • 4 hours on-demand video
  • 2 Coding exercises
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Tackle new challenges by understanding the underlying method and approach to take
  • Scrape static webpages
  • Be able to scrape websites that use JavaScript
  • Extract all sorts of data from websites
  • Know what to look for and how to approach parsing a website
  • Gather data from all over the internet
  • Use recursion algorithms to search through website content
Requirements
  • Basic Python knowledge
  • A Python 3 Environment to Code in
Description

Web scraping is the art of picking out data from a website by looking at the HTML code and identifying patterns that can be used to locate your data. This data can then be gathered and later used for your own analysis.

In this course we will go over the basics of web scraping and learn how we can extract data from websites, guided throughout by a worked example.

By the end of the course you should be able to go off on your own, pick out most common websites, and extract all the relevant data you need using only Python code.


Who is the target audience?
  • Anyone interested in analyzing data
  • Anyone who doesn't know how to start gathering data
  • Anyone who wants to develop their ability to scrape data
  • Anyone interested in starting with web scraping
  • Anyone who is interested in expanding their Python knowledge
  • Anyone who wants to gather a wide array of data to play with
Curriculum For This Course
29 Lectures
04:12:14
Prerequisite knowledge
6 Lectures 24:27

Here we look at the general ideas behind web scraping and get introduced to what we will do.

Preview 03:20

Here we will quickly talk about the other way of scraping data from the web, namely APIs.

APIs
02:00

The main Python libraries that we will be using in this tutorial series.

Prerequisite Libraries
03:00

Here we will cover what the modulus operation does and what use that can be for us later on.

Introduction to The Modulus Operation
05:01
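
As a minimal sketch of why the modulus operation can be handy in a scraping loop, the snippet below uses `%` to pick out every n-th item, for example to print a progress message or to pause between batches of requests. The symbol list, the batch size and the `checkpoints` helper are made up for illustration and are not taken from the course.

```python
# Hypothetical ticker symbols, used only for illustration.
symbols = ["AAA", "BBB", "CCC", "DDD", "EEE", "FFF", "GGG"]
batch_size = 3  # assumed value

def checkpoints(items, every):
    """Return the 1-based positions at which a progress message would fire."""
    positions = []
    for index, _ in enumerate(items, start=1):
        # index % every is the remainder of index divided by every;
        # it is 0 exactly when index is a multiple of every.
        if index % every == 0:
            positions.append(index)
    return positions

print(7 % 3)                             # remainder of 7 divided by 3
print(checkpoints(symbols, batch_size))  # positions 3 and 6
```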

Here we will look at how we can deal with errors that appear in our code and how we can work around them, especially if we expect them.

Preview 04:25
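
One way to handle an error that we expect is a `try`/`except` block around the step that may fail. The sketch below assumes that scraped values arrive as strings and that some of them are placeholders rather than numbers; the `to_number` helper and the sample values are hypothetical.

```python
def to_number(raw):
    """Convert a scraped string such as '1,234.5' to a float, or None if it is not numeric."""
    try:
        # Remove thousands separators before converting.
        return float(raw.replace(",", ""))
    except ValueError:
        # Expected case: placeholders such as 'N/A' or '-' are not numbers.
        return None

# Hypothetical scraped values.
raw_values = ["12.50", "1,234.5", "N/A", "-"]
print([to_number(value) for value in raw_values])
```

Catching only `ValueError` keeps unexpected problems visible instead of silently hiding them.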

Here we will learn about the DataFrame data structure that pandas provides, to see the format that we want our final data to have.

Introduction to Pandas
06:41
Static Data Extraction/Web Scraping
11 Lectures 01:46:31

Here we will make our first HTTP request and cover the possible outcomes.

Preview 07:18
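
The possible outcomes of a request can be sketched with the standard library's `urllib`. This is an assumption made to keep the example dependency-free; the course may well use a different HTTP library. The `fetch` helper and its return convention are hypothetical.

```python
import urllib.error
import urllib.request

def fetch(url, timeout=10):
    """Request a URL and return (text, problem); exactly one of the two is None."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # Reaching this point means the request succeeded.
            return response.read().decode("utf-8", errors="replace"), None
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        return None, f"HTTP error {err.code}"
    except urllib.error.URLError as err:
        # No usable answer at all: unknown host, refused connection, and so on.
        return None, f"request failed: {err.reason}"
```

A call such as `text, problem = fetch("https://example.com")` would then give either the page text or a short description of what went wrong. `HTTPError` is caught first because it is a subclass of `URLError`.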

Here we will test whether you can correctly interpret HTTP error codes. Feel free to look up a list of error codes online; you don't need to memorize them all, just be able to understand one when you encounter it.

Error codes
2 questions
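
As a rough aid for reading status codes, the first digit gives the general category and Python's built-in `http.HTTPStatus` supplies the standard phrase. The `describe` helper below is a hypothetical illustration, not part of the course material.

```python
from http import HTTPStatus

def describe(code):
    """Return a short label for an HTTP status code."""
    categories = {1: "informational", 2: "success", 3: "redirection",
                  4: "client error", 5: "server error"}
    # The hundreds digit identifies the category.
    category = categories.get(code // 100, "unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # HTTPStatus only knows registered codes.
        phrase = "unrecognized code"
    return f"{code} {phrase} ({category})"

for code in (200, 301, 404, 500):
    print(describe(code))
```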

In this tutorial we will look at how we can read the text response that we get when we contact a website.

Reading The Response Text From Our Request
11:40

Here we're going to cover how we can start to use the text response to parse out the data that we're looking for.

First Approach at Parsing The Data
13:18
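
A first approach to parsing can be as simple as finding the text that sits between two known markers. The sketch below assumes a made-up HTML fragment and a hypothetical `text_between` helper; the markup of a real page will differ.

```python
# Hypothetical response text; real pages are much longer.
html = ('<div class="quote"><span id="price">123.45</span>'
        '<span id="volume">10,200</span></div>')

def text_between(source, start_marker, end_marker):
    """Return the text between two markers, or None if either marker is missing."""
    start = source.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)
    # Search for the end marker only after the start marker.
    end = source.find(end_marker, start)
    if end == -1:
        return None
    return source[start:end]

print(text_between(html, '<span id="price">', "</span>"))
print(text_between(html, '<span id="volume">', "</span>"))
print(text_between(html, '<span id="dividend">', "</span>"))  # marker not present
```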

Here we will look deeper into the exception cases and see how we should adapt our code to incorporate them.

Understanding the Exception Cases
06:39

Having considered the straightforward as well as the exception cases, we can now complete the data parse for one company.

Parsing Out All Data for One Company
09:33

Here we will see where we can get more ticker symbols from, and start by identifying and selecting the range of data that is of interest for us.

Determining Where We Can Get More Ticker Symbols
15:46

Here we will start our process of parsing out the ticker symbols, based on identifying patterns that we see in the website code.

Extracting Company Ticker Symbols Part 1
16:32

Here we will finish up our method of scraping out company ticker symbols, so that we have a complete, and much larger, set of company ticker symbols to scrape data for.

Extracting Company Ticker Symbols Part 2
10:41
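
Pattern-based extraction of ticker symbols might look like the sketch below. The table markup, the symbols and the regular expression are all assumptions chosen for illustration; the page used in the course is not shown here, and its patterns will differ.

```python
import re

# Hypothetical listing page fragment.
listing = """
<table>
  <tr><td class="symbol"><a href="/quote/AAA">AAA</a></td><td>Alpha Corp</td></tr>
  <tr><td class="symbol"><a href="/quote/BBB">BBB</a></td><td>Beta Inc</td></tr>
  <tr><td class="name">No symbol in this row</td></tr>
  <tr><td class="symbol"><a href="/quote/CC.D">CC.D</a></td><td>Gamma Ltd</td></tr>
</table>
"""

# Capture the link text of every cell marked as a symbol cell.
SYMBOL_PATTERN = re.compile(r'<td class="symbol"><a href="/quote/[^"]+">([A-Z.]+)</a>')

def extract_symbols(page_text):
    """Return the ticker symbols in page order, without duplicates."""
    seen = []
    for symbol in SYMBOL_PATTERN.findall(page_text):
        if symbol not in seen:
            seen.append(symbol)
    return seen

print(extract_symbols(listing))
```

Regular expressions are brittle against markup changes, so a pattern like this needs to be checked against the actual page source before it is relied on.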

Here we will quickly recap the process of extracting the ticker symbols, to make sure we understand why we did certain things.

Extracting Out Ticker Symbols
2 questions

Now that we have a complete set of symbols to scrape with, we can modify our code from before to incorporate these new companies.

Getting Data For All Parsed Companies
08:11

Here we will ensure that we have data for all companies and put it into a well-formatted pandas DataFrame.

Final Data For All Parsed Companies
05:13

Here we will go over what we got from our web scraping, and take a look at the final format of our data.

Final Result
01:40

Your own web scrap
1 question
Scraping Websites That Load Data With Javascript
11 Lectures 01:50:43

Here we will quickly cover the goal of this section as well as the extra libraries we're going to need to install.

Prerequisite Libraries
05:02

We'll go into a short review of what a recursive function is, using the Fibonacci sequence as our example.

Short review: Recursive Functions
07:43
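
A recursive function calls itself on smaller inputs until it reaches a base case. A minimal Fibonacci sketch follows; the `lru_cache` decorator is an addition made here so that repeated sub-calls are not recomputed, and may not match how the lecture writes it.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fibonacci(n):
    """Return the n-th Fibonacci number, with fibonacci(0) == 0 and fibonacci(1) == 1."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n < 2:
        # Base cases stop the recursion.
        return n
    # Recursive case: each number is the sum of the two before it.
    return fibonacci(n - 1) + fibonacci(n - 2)

print([fibonacci(i) for i in range(10)])
```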

We'll learn how to create a browser instance as well as the basic navigation that happens within it.

Getting started with Selenium
08:47

Here we'll start to look at the content of the website and become familiar with how it is structured.

View The Page Source
09:14

Here we'll learn how we can use elements and XPath to navigate our response data.

Website Elements and XPath
08:11
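
To give a feel for how an XPath expression walks from element to element, the sketch below uses the standard library's `xml.etree.ElementTree`, which supports only a limited subset of XPath and needs well-formed markup. This is a swapped-in stand-in so the example runs without a browser; the course itself works with Selenium against a live page, and the HTML fragment here is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment.
page = """
<html>
  <body>
    <div id="results">
      <ul>
        <li class="item"><a href="/a">First</a></li>
        <li class="item"><a href="/b">Second</a></li>
        <li class="ad"><a href="/x">Sponsored</a></li>
      </ul>
    </div>
  </body>
</html>
"""

root = ET.fromstring(page)

# Read the path left to right: any div with id "results", then its ul,
# then only the li elements with class "item", then the link inside each.
links = root.findall(".//div[@id='results']/ul/li[@class='item']/a")
print([(link.text, link.get("href")) for link in links])
```

The attribute filters in square brackets are what let the path skip the sponsored entry while keeping the two regular items.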

Here we'll use what we've learned so far to start parsing out the relevant data.

Navigating Deeper Into The Page Source
14:37

Now we're going to apply what we've learned so far to identify the direct path to our data.

Identifying The Path To Our Data
19:28

Now we will use the path that we've identified to navigate through the HTML to our data.

Using The XPath To Our Data
09:50

Here we will extract the data now that we've navigated to it.

Parsing Out Our Data
08:42

We'll combine everything we've produced so far to extract our data efficiently and put it into a nice format.

Getting Our Final Data
14:56

A recap of our approach and our results.

Final Results
04:13

Scraping a website that uses AJAX to generate content
1 question
APIs overview
1 Lecture 10:33

APIs are the other way of getting data from the web. They make the job a lot easier because the data is already nicely formatted, and all we really have to do is ask for the right data. Getting data from an API is usually easier than web scraping, as we don't need to identify patterns and deal with exception cases to extract valuable data.

Introduction To APIs
10:33
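
To illustrate what "formatted for us nicely" can mean in practice, many APIs return JSON, which Python turns into dictionaries and lists with a single call. The response body below is invented for the example and does not come from any particular API.

```python
import json

# Hypothetical API response body.
response_body = (
    '{"symbol": "AAA", "quotes": ['
    '{"date": "2017-04-03", "close": 10.5}, '
    '{"date": "2017-04-04", "close": 10.75}]}'
)

# One call converts the text into nested dicts and lists.
data = json.loads(response_body)

# No pattern matching is needed: the fields are addressed by name.
closes = [quote["close"] for quote in data["quotes"]]
average_close = sum(closes) / len(closes)
print(data["symbol"], closes, average_close)
```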

Here we will talk about what to do next with APIs.

APIs
2 questions
About the Instructor
Maximilian Schallwig
4.2 Average rating
178 Reviews
5,379 Students
3 Courses
Data Scientist

I've worked for over two years in physics research and mathematical analysis. I participated in two international physics competitions, where my two teammates and I won silver and gold. My thesis was in the field of Quantum Biology, focusing on analyzing the behavior of excitons at room temperature with electronic interaction. 

Due to my affinity for math and statistics from my studies in physics, I tend towards data mining, processing, and analysis, which are also the things that I find most exciting.

I enjoy learning new methods and developing my skills, and am constantly studying new literature and documentation to find exciting material that can be applied in the field of data analysis.

If you want to keep up with what else I'm doing in the fields of programming, data, and data science, you can check me out at codingwithmax.