Web Scraping and API Fundamentals in Python 2020
4.6 (420 ratings)
3,355 students enrolled

Learn Web Scraping with Beautiful Soup and requests-html; harness APIs whenever available; automate data collection!
Bestseller
Created by 365 Careers
Last updated 6/2020
English
English [Auto]
Current price: $139.99 Original price: $199.99 Discount: 30% off
30-Day Money-Back Guarantee
This course includes
  • 4 hours on-demand video
  • 11 articles
  • 8 downloadable resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What you'll learn
  • Learn the fundamentals of Web Scraping
  • Implement APIs into your applications
  • Master working with Beautiful Soup
  • Start using requests-html
  • Create functioning scrapers
  • Scrape JavaScript
  • Familiarize yourself with HTML
  • Get the hang of CSS Selectors
  • Make HTTP requests
  • Understand website cookies
  • Explore scraping content locked behind a log-in system
  • Limit the rate of requests
Requirements
  • Python 3 and the Anaconda distribution
  • Basic Python knowledge
  • Curiosity and enthusiasm to learn and practice
Description

Are you tired of manually copying and pasting values in a spreadsheet?

Do you want to learn how to obtain interesting, real-time and even rare information from the internet with a simple script?

Are you eager to acquire a valuable skill to stay ahead of the competition in this data-driven world?

If the answer is yes, then you have come to the right place at the right time!

Welcome to Web Scraping and API Fundamentals in Python!

The definitive course on data collection!

Web Scraping is a technique for obtaining information from web pages or other sources of data, such as APIs, through the use of intelligent automated programs. Web Scraping allows us to gather data from potentially hundreds or thousands of pages with a few lines of code.

From reporting to data science, automating the extraction of data from the web saves repetitive work. For example, if you have worked in a serious organization, you certainly know that reporting is a recurring topic. There are daily, weekly, monthly, quarterly, and yearly reports. Whether they organize website data, transactional data, customer data, or even more easy-going information like the weather forecast, reports are indispensable in the current world. And while sometimes it is the intern’s job to take care of that, very few tasks save more money than the automation of reports.

When it comes to data science, more and more data comes from external sources, like webpages, downloadable files, and APIs. Knowing how to extract and structure that data quickly is an essential skill that will set you apart in the job market.

Yes, it is time to up your game and learn how you can automate the use of APIs and the extraction of useful info from websites.

In the first part of the course, we start with APIs. APIs are specifically designed to provide data to developers, so they are the first place to check when searching for data. We will learn about GET requests, POST requests and the JSON format.

These concepts are all explored through interesting examples and in a straight-to-the-point manner.
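
To give a flavor of what a GET request with parameters and a JSON reply look like, here is a minimal sketch that runs offline. The endpoint `https://api.example.com/latest`, the parameter names, and the sample reply are all made up for illustration; a real API defines its own, and the course may work with different ones. Nothing is sent over the network here: the sketch only builds the URL a GET request would use and parses a reply of the kind such an API might return.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://api.example.com/latest"


def build_get_url(base_url, params):
    """Append query parameters to a URL, as a GET request would carry them."""
    return f"{base_url}?{urlencode(params)}"


def parse_rates(json_text):
    """Turn a JSON reply into a Python dict and return its 'rates' mapping."""
    reply = json.loads(json_text)
    return reply["rates"]


url = build_get_url(BASE_URL, {"base": "USD", "symbols": "EUR,GBP"})

# Made-up reply standing in for what such an API might send back.
sample_reply = '{"base": "USD", "rates": {"EUR": 0.5, "GBP": 0.25}}'
rates = parse_rates(sample_reply)

print(url)
print(rates["EUR"])
```

With a library such as requests, building the URL and decoding the JSON are usually handled for you; the sketch spells the two steps out so that each is visible.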

Sometimes, however, the information may not be available through the use of an API, but it is contained on a webpage. What can we do in this scenario? Visit the page and write down the data manually?

Please don’t ever do that!

We will learn how to leverage powerful libraries such as Beautiful Soup and requests-html to scrape websites, no matter what combination of HTML, CSS, and JavaScript they are built with.
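
As a rough idea of what extracting data from HTML involves, the sketch below pulls link text and addresses out of a small hand-written page. Note the swap: it uses Python's built-in `html.parser` module rather than Beautiful Soup, purely so that it runs without installing anything, and Beautiful Soup offers a much shorter route to the same result. The sample page, the `movie` class name, and the `MovieLinkParser` class are hypothetical.

```python
from html.parser import HTMLParser

# Small hand-written page standing in for a downloaded document.
SAMPLE_HTML = """
<html><body>
  <h1>Top movies</h1>
  <a class="movie" href="/m/first">First Movie</a>
  <a class="movie" href="/m/second">Second Movie</a>
  <a class="other" href="/about">About</a>
</body></html>
"""


class MovieLinkParser(HTMLParser):
    """Collect (text, href) pairs for <a> tags whose class is 'movie'."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current_href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Remember the address only for the links we care about.
        if tag == "a" and attrs.get("class") == "movie":
            self._current_href = attrs.get("href")

    def handle_data(self, data):
        # Text that follows a remembered <a> tag is that link's label.
        if self._current_href is not None:
            self.links.append((data.strip(), self._current_href))
            self._current_href = None

    def handle_endtag(self, tag):
        if tag == "a":
            self._current_href = None


parser = MovieLinkParser()
parser.feed(SAMPLE_HTML)
print(parser.links)
```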

Certainly, in order to scrape, you’ll need to know a thing or two about web development. That’s why we have also included an optional section that covers the basics of HTML. Consider that a bonus to all the knowledge you will acquire!

We will also explore several scraping projects. We will obtain and structure data about movies from a “Rotten Tomatoes” rank list, examining each step of the process in detail. This will help you develop a feel for what scraping is like in the real world.

We’ll also tackle how to scrape data from many webpages at once, an all-too-common need when it comes to data extraction.
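
One possible shape for a multi-page scraper is sketched below, together with a pause between requests to limit the request rate. It assumes the pages are numbered through a `?page=N` query parameter, which is only one of many schemes a site might use. The URL, `page_urls`, `scrape_pages`, and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for a real download so the sketch runs offline.

```python
import time


def page_urls(base_url, first_page, last_page):
    """Build one URL per page number, assuming a '?page=N' query parameter."""
    return [f"{base_url}?page={n}" for n in range(first_page, last_page + 1)]


def scrape_pages(urls, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Fetch each URL in turn, pausing between requests to limit the rate."""
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            # Wait before every request except the first one.
            sleep(delay_seconds)
        results.append(fetch(url))
    return results


def fake_fetch(url):
    """Stand-in for a real download so the sketch runs offline."""
    return f"<html>content of {url}</html>"


urls = page_urls("https://example.com/list", 1, 3)
pages = scrape_pages(urls, fake_fetch, delay_seconds=0.01)
print(len(pages))
```

Passing the download function in as `fetch` keeps the looping and rate limiting separate from the networking, so a real downloader can be swapped in later without touching the loop.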

And then it will be your turn to practice what you’ve learned with several projects we'll set out for you.

But there’s even more!

Web Scraping may not always go as planned (after all, that’s why you will be taking this course). Different websites are built in different ways and often our bots may be obstructed. Because of this, we will make an extra effort to explore common roadblocks that you may encounter while scraping and present you with ways to circumvent or deal with those problems. These include request headers and cookies, log-in systems, and JavaScript-generated content.
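
To make request headers and cookies a little more concrete, here is a small offline sketch using only the standard library. The URL, the `User-Agent` string, and the `Set-Cookie` value are invented for illustration, and `cookie_header` is a hypothetical helper. The request is only built, never sent, and real log-in systems typically involve more steps than this.

```python
from http.cookies import SimpleCookie
from urllib.request import Request

# Hypothetical URL and header values, used only for illustration.
URL = "https://example.com/account"
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; course-demo)"}


def cookie_header(set_cookie_value):
    """Convert a 'Set-Cookie' value into the 'Cookie' value to send back."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


# Made-up 'Set-Cookie' value such as a server might return after a log-in.
received = "sessionid=abc123; Path=/; HttpOnly"

request = Request(URL, headers=HEADERS)
request.add_header("Cookie", cookie_header(received))

# The request is only built here, not sent.
print(request.header_items())
```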

Don’t worry if you are familiar with few or none of these terms… We will start from the basics and build our way to proficiency. Moreover, we are firm believers that practice makes perfect, so this course leans less on theory and more on a hands-on approach. What’s more, it contains plenty of homework exercises, downloadable files and notebooks, as well as quiz questions and course notes.

We, the 365 Data Science Team, are committed to providing only the highest quality content to you – our students. And while we love creating our content in-house, this time we’ve decided to team up with a true industry expert: Andrew Treadway. Andrew is a Senior Data Scientist for the New York Life Insurance Company. He holds a Master’s degree in Computer Science with Machine Learning from the Georgia Institute of Technology and is an outstanding professional with more than 7 years of experience in data-related Python programming. He’s also the author of the ‘yahoo_fin’ package, widely used for scraping historical stock price data from Yahoo.

As with all of our courses, you have a 30-day money-back guarantee if at some point you decide that the training isn’t the best fit for you. So… you’ve got nothing to lose – and everything to gain!

So, what are you waiting for?

Click the ‘Buy now’ button and let’s start collecting data together!

Who this course is for:
  • You should take this course if you want to learn how to use APIs
  • This course is for you if you want to learn how to scrape websites
  • Anyone who wants to learn how to automate the boring and mundane everyday tasks
  • Individuals who are curious and passionate about data
  • The course is ideal for beginners to programming who want to learn Beautiful Soup and requests-html
Course content
62 lectures 03:51:35
+ Introduction to the course
4 lectures 10:40
What is Web Scraping?
2 questions
Ethics of Scraping
02:55
Ethics of Scraping
1 question
Download All Resources
00:15
+ Setting up the environment
6 lectures 17:47
Setting up the environment - Do not skip, please!
00:48
Why Python and why Jupyter?
04:48
Installing Anaconda
03:07
Jupyter Dashboard - Part 1
02:27
Jupyter Dashboard - Part 2
05:13
Installing the packages
01:24
+ Working with APIs
15 lectures 46:03
API overview
3 questions
HTTP requests: GET and POST requests
02:35
HTTP requests: GET and POST requests
2 questions
JSON: preferred data exchange format for APIs
02:24
JSON: preferred data exchange format for APIs
3 questions
Exchange rates API: GETting a JSON reply
04:57
Exchange rates API: GETting a JSON reply
2 questions
Incorporating parameters in a GET request
03:18
Incorporating parameters in a GET request
1 question
Additional API functionalities
04:39
Additional API functionalities
2 questions
Creating a simple currency converter
04:52
iTunes API
1 question
iTunes API: Exercise
00:12
iTunes API: Structuring and exporting the data
02:10
iTunes API: Structuring and exporting the data
1 question
APIs: Exercise
00:14
GitHub API: Pagination
04:21
GitHub API: Pagination
1 question
EDAMAM API: Initial setup and registration
03:14
EDAMAM API: Initial setup and registration
1 question
EDAMAM API: Sending a POST request
04:14
EDAMAM API: Sending a POST request
1 question
Downloading files with requests
00:48
+ HTML overview
8 lectures 38:51
What is HTML?
03:05
What is HTML?
1 question
Structure of HTML
02:36
Structure of HTML
1 question
Syntax of HTML. Tags
06:20
Syntax of HTML. Tags
3 questions
Tag attributes
06:00
Tag attributes
2 questions
Popular tags
06:27
Popular tags
2 questions
CSS and JavaScript
06:23
CSS and JavaScript
3 questions
Character encoding
06:12
Character encoding
1 question
XHTML and code style
01:48
XHTML and code style
1 question
+ Web Scraping with Beautiful Soup
11 lectures 47:43
Introduction to the Beautiful Soup package
02:04
Workflow of Web Scraping
06:27
Workflow of Web Scraping
2 questions
Setting up your first scraper
02:54
Searching and navigating the HTML tree
3 questions
Searching the HTML tree by attributes
03:30
Searching the HTML tree by attributes
1 question
Extracting data from the HTML tree
03:04
Extracting text from an HTML tag
04:40
Extracting text from an HTML tag
1 question
Practical example: dealing with links
05:36
Practical example: Exercise
00:26
Extracting data from nested HTML tags
04:35
Scraping multiple pages automatically
07:33
+ Practical project: Scraping Rotten Tomatoes
7 lectures 25:06
Setting up your scraper
04:16
Extracting the title and year of each movie
06:37
Extracting the score of each movie: Exercise
00:10
Extracting the rest of the information
05:58
Dealing with the cast of the movies
05:17
Extracting the rest of the information: Exercise
00:07
Storing and exporting the data in a structured form
02:41
+ Practical projects
2 lectures 00:57
Scraping Steam
00:31
Scraping YouTube
00:26
+ Common roadblocks when scraping
1 lecture 12:45
Common roadblocks when Web Scraping
12:45
Common roadblocks when Web Scraping
3 questions
+ The requests-html package
7 lectures 26:11
Introduction to the requests-html package
01:35
Exploring the capabilities of requests-html for Web Scraping
05:27
Searching for text
02:36
CSS selectors
09:20
CSS selectors
2 questions
Scraping JavaScript
06:13
Scraping JavaScript: Exercise
00:34
Completing 100%
00:26