Complete Python Web Scraping : Real Projects & Modern Tools

Rating: 4.4 (87 reviews)

Web Scraping with BeautifulSoup, Selenium, Scrapy and Scrapy-Playwright. 4 Project-like Exercises + 4 Real Life Projects

Created by Alp Can

Last updated 8/2024

English

What you'll learn

Scrapy
Web Automation with Selenium
Scrapy-Playwright
Scraping websites using Python
Python's most popular and effective web scraping libraries
Using the right method according to the structure of the website
Requests and Beautiful Soup
Reading and Analyzing HTML code
Saving scraped data
Downloading bulk images

Course content

15 sections • 97 lectures • 12h 53m total length

Section Introduction • 0:59
Course Material • 0:08
Checking If the Website is Static or Dynamic • 3:18
Deciding on the Method to be Used • 2:36
Reading and Analyzing HTML Code • 23:14
Installing Python and PyCharm (WINDOWS) • 5:17
Installing Python and PyCharm (LINUX) • 3:54
Installing Python and PyCharm (MACOS) • 4:56
Installing Jupyter Notebook on Windows, Linux and MacOS (OPTIONAL) • 6:17
Install Jupyter Notebook on Windows, Linux, and macOS with pip, launch it with python -m notebook, then create, save, and manage .ipynb notebooks in your chosen folder.
About Real Life Examples... • 2:02
Course Syllabus • 0:43

Section Introduction • 0:49
Installing Libraries and Making the First Request • 4:52
Selectors of Beautiful Soup • 21:50
Scraping 1 Book's Data • 20:38
Constructing the Outer Loop and Dealing with Pagination • 13:50
Handle pagination in Python web scraping with a for loop when the page count is known or a while loop when it is not, and extract each book from its article.product_pod element.
Constructing the Inner Loop and Scraping Every Book • 8:01
Saving the Data and Summary of the Project • 9:27
Downloading Images with Requests • 9:12
Using Proxies with Requests • 9:10
Learn to use both unauthenticated and authenticated proxies with the requests library: load a proxy list, set a user agent, handle errors with try/except, and test access to books.toscrape.com.
About Proxy Authentication • 0:15
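The pagination lecture in this section contrasts a known page count (for loop) with an unknown one (while loop). The sketch below shows both loops side by side. It is not the course's own code: the URL pattern, the function names, the markup assumed inside each article.product_pod, and the idea that a missing page answers with a non-200 status are all assumptions made for illustration.

```python
from typing import Callable, List, Optional

# Assumed URL pattern for the practice site; verify it in your browser first.
BASE_URL = "https://books.toscrape.com/catalogue/page-{}.html"


def fetch_html(url: str) -> Optional[str]:
    """Return the page HTML, or None when the server does not answer 200."""
    import requests  # third-party (pip install requests), imported lazily

    response = requests.get(url, timeout=10)
    return response.text if response.status_code == 200 else None


def extract_titles(html: str) -> List[str]:
    """Collect the title of every book card (article.product_pod) on a page."""
    from bs4 import BeautifulSoup  # third-party (pip install beautifulsoup4)

    soup = BeautifulSoup(html, "html.parser")
    # Assumes each card holds <h3><a title="...">; adjust to the real markup.
    return [pod.h3.a["title"] for pod in soup.select("article.product_pod")]


def scrape_known(
    fetch: Callable[[str], Optional[str]],
    extract: Callable[[str], List[str]],
    page_count: int,
) -> List[str]:
    """Known page count: a for loop over the page numbers."""
    results: List[str] = []
    for number in range(1, page_count + 1):
        html = fetch(BASE_URL.format(number))
        if html is not None:
            results.extend(extract(html))
    return results


def scrape_unknown(
    fetch: Callable[[str], Optional[str]],
    extract: Callable[[str], List[str]],
) -> List[str]:
    """Unknown page count: a while loop that stops at the first missing page."""
    results: List[str] = []
    number = 1
    while True:
        html = fetch(BASE_URL.format(number))
        if html is None:
            break  # no such page, so the previous one was the last
        results.extend(extract(html))
        number += 1
    return results
```

Against the live site this would be called as scrape_unknown(fetch_html, extract_titles). Passing the fetch and extract functions in as arguments is a design choice of this sketch; it lets the loops be tried on canned HTML without touching the network.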

Section Introduction • 0:47
Installing Selenium and Chromedriver • 3:01
Install Selenium and ChromeDriver: download the ChromeDriver build for your OS and add its path to the system environment variables. Then install Selenium in PyCharm and import webdriver to prepare a browser driver.
Creating Driver and Opening Browser • 2:08
CSS Selectors • 20:30
XPATH • 28:16
Login with Selenium • 6:43
Scraping the First Page • 10:01
Browsing Through Pages • 12:29
Learn to browse pages with Selenium by detecting the next button in a while loop, breaking on the last page, and saving the scraped quotes, authors, and tags to an Excel file.
Infinite Scroll • 9:22
Learn to build an infinite scroll workflow with Selenium that loads all items on a page, using Python, WebDriver, and a loop that scrolls to the bottom and waits between loads.
Waits • 9:53
Actions • 2:51
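The infinite scroll lecture in this section describes a loop that scrolls to the bottom and waits between loads. One common way to write such a loop, sketched here under assumptions, is to compare the page height before and after each scroll and stop once it no longer grows. The function name, the pause length, and the max_rounds safety cap are illustrative and not taken from the course.

```python
import time


def scroll_to_end(driver, pause: float = 2.0, max_rounds: int = 50) -> int:
    """Scroll until the page height stops growing; return the scroll count.

    `driver` is any object with Selenium's execute_script method, for
    example a selenium.webdriver.Chrome instance.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:  # safety cap so a broken page cannot loop forever
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the page time to load the next batch
        new_height = driver.execute_script("return document.body.scrollHeight")
        rounds += 1
        if new_height == last_height:
            break  # nothing new was loaded, so the end is reached
        last_height = new_height
    return rounds
```

With a real browser, this would be called after driver.get(...) on the page to scroll. A fixed sleep is the simplest form of waiting; an explicit wait on a newly loaded element is usually more reliable when the page exposes one.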

Section Introduction • 0:34
Creating Driver and Opening Browser • 3:33
Set up the project by installing selenium, pandas, and openpyxl; configure Chrome options to start maximized and keep the browser open with the detach option, create a driver with an implicit wait, and open imdb.com to begin automation.
Important Note • 0:34
Adjustment to the Next Video (Automation) • 7:12
Reaching the Page with Automation • 21:07
Automate browser actions with Selenium to reach the Oscar-nominated comedy movie list: handle cookies, navigate genres, and apply filters to scrape 1,047 titles.
Scrolling to Load All Items • 3:38
Write a while True loop that scrolls to the end and clicks the "see more" button until all items load, using a CSS selector and a two-second wait.
Scraping the Data • 19:27

Section Introduction • 0:40
Installing Scrapy, Creating Scrapy Project and Spider • 3:32
Scrapy Shell and Scrapy's Selectors • 13:24
Constructing Parse Method • 11:34
Items • 1:57
Constructing Parse Books Method • 17:11
Develop a parse_books method that creates and populates a book item from the scraped data (name, price excluding tax, UPC, availability, category, rating, image URL) and yields items for JSON export with Scrapy.
Pipelines • 8:49
Scrapy pipelines process scraped items before they are saved. Here an item adapter is used to uppercase names, extract availability, and convert pounds to dollars before saving to JSON.
Dropping Items with Pipelines • 5:17
Learn to drop items with a Scrapy pipeline that checks stock, converts the stock count to an int, and keeps only books with at least 10 in stock in the output JSON.
Saving Data to Excel with Pipelines • 9:28
Note about SQLite • 0:21
Saving Data to SQLite Database with Pipelines • 6:07
Middlewares • 17:13
Crawler • 9:09
Learn to build a Scrapy CrawlSpider with link extractors and rules to navigate catalog pages, category pages, and book pages, then scrape data into JSON.
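The two pipeline lectures in this section describe cleaning fields with an item adapter and dropping books with fewer than 10 in stock. A minimal sketch of how such pipelines might look follows. The class names, the field keys (name, price, availability), the exchange rate, and the format of the availability text are assumptions for illustration, not the course's code.

```python
import re

# Illustrative rate only; a real project would look up a current one.
POUND_TO_DOLLAR = 1.27
MIN_STOCK = 10  # threshold named in the lecture description


def pounds_to_dollars(price_text: str, rate: float = POUND_TO_DOLLAR) -> float:
    """Turn a string such as '£51.77' into a dollar amount."""
    pounds = float(price_text.replace("£", "").strip())
    return round(pounds * rate, 2)


def stock_count(availability: str) -> int:
    """Extract 22 from 'In stock (22 available)'; 0 when no number is found."""
    match = re.search(r"(\d+)", availability)
    return int(match.group(1)) if match else 0


class CleanBookPipeline:
    """Uppercase the name, convert the price, and reduce availability to an int."""

    def process_item(self, item, spider):
        # Imported here so the helpers above work without Scrapy installed.
        from itemadapter import ItemAdapter

        adapter = ItemAdapter(item)
        adapter["name"] = adapter["name"].upper()
        adapter["price"] = pounds_to_dollars(adapter["price"])
        adapter["availability"] = stock_count(adapter["availability"])
        return item


class DropLowStockPipeline:
    """Drop every book with fewer than MIN_STOCK copies available."""

    def process_item(self, item, spider):
        from itemadapter import ItemAdapter
        from scrapy.exceptions import DropItem

        adapter = ItemAdapter(item)
        if int(adapter["availability"]) < MIN_STOCK:
            raise DropItem(f"Low stock: {adapter['name']}")
        return item


# settings.py (hypothetical project name); lower numbers run first:
# ITEM_PIPELINES = {
#     "myproject.pipelines.CleanBookPipeline": 300,
#     "myproject.pipelines.DropLowStockPipeline": 400,
# }
```

The order in ITEM_PIPELINES matters in this sketch, because the drop pipeline expects availability to have already been converted to a number by the cleaning pipeline.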

Requirements

Just a PC with an internet connection.
No programming experience is needed. I will teach you everything you need to know.
Fundamental Python knowledge is nice to have but not a must.

Description

In today's data-driven world, web scraping is a powerful tool that enables you to gather data from websites efficiently.

I designed this course to be the most complete web scraping course on Udemy. It is practical and exercise-based, ensuring you learn by doing through exercises and real-life projects.

We'll start with the basics on books.toscrape.com and quotes.toscrape.com (sites designed to be scraped) to help you grasp the fundamentals of web scraping. After learning the basics, we'll dive deep into web scraping on real websites.

If you're new to Python, don't worry: an extra section covers the Python fundamentals you need for this course.

What You'll Learn:

Requests and BeautifulSoup: Parse and extract data from HTML using eBay as an example.
Selenium: Automate browser interactions with real projects from IMDb.
Scrapy: Build scalable web scrapers with real-life examples from Flying Tiger and Yelp.
Scrapy-Playwright: Learn how to scrape dynamic websites with Scrapy by integrating Playwright.

Why This Course?

Hands-on Learning: The course is packed with exercises and real-life projects to help you apply what you learn immediately.
Practical Approach: I will focus on teaching you practical skills that you can use in your own projects.
Support for Beginners: An extra section on Python fundamentals ensures that even those new to programming can follow along and succeed.

Join me in this journey to unlock the full potential of web scraping. With practical exercises and real-world examples, you'll be well-equipped to gather data from the web effectively. Let's get started!

Who this course is for:

Total beginners who want to learn Web Scraping
Developers who want to start data science from its foundations

Course content

INTRODUCTION • 11 lectures • 53min
BEAUTIFUL SOUP and REQUESTS 1 - BASICS • 10 lectures • 1hr 38min
BEAUTIFUL SOUP and REQUESTS 2 - EXERCISE 1 (QUOTES) • 3 lectures • 21min
BEAUTIFUL SOUP and REQUESTS 3 - REAL LIFE EXAMPLE 1 (EBAY) • 4 lectures • 39min
SELENIUM 1 - BASICS • 11 lectures • 1hr 46min
SELENIUM 2 - EXERCISE 2 (AUTHORS) • 3 lectures • 21min
SELENIUM 3 - REAL LIFE EXAMPLE 2 (IMDB) • 7 lectures • 56min
SELENIUM 4 - EXERCISE 3 (IMDB DIRECTORS) • 5 lectures • 36min
SCRAPY 1 - BASICS • 13 lectures • 1hr 45min
SCRAPY 2 - EXERCISE 4 (QUOTES) • 3 lectures • 22min
