
Install Jupyter Notebook with pip on Windows, Linux, and macOS and launch it with python -m notebook, then create, save, and manage .ipynb notebooks in your chosen folder.
Master pagination in Python web scraping by using a for loop when the page count is known or a while loop when it is not, and extract books from the article.product_pod elements.
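As a minimal sketch of the two loop shapes described above, the example below uses only the standard library and takes the page-fetching step as a parameter, so it can be tried without network access. The function names, the fetch callable, the "stop when a page has no books" rule, and the fake pages are assumptions for illustration; in a real scraper the fetch step would typically use Requests and the parsing step a library such as BeautifulSoup.

```python
from html.parser import HTMLParser


class ProductPodParser(HTMLParser):
    """Collect the title attribute of each link inside article.product_pod."""

    def __init__(self):
        super().__init__()
        self.in_pod = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "article" and "product_pod" in classes:
            self.in_pod = True
        elif self.in_pod and tag == "a" and attrs.get("title"):
            self.titles.append(attrs["title"])

    def handle_endtag(self, tag):
        if tag == "article":
            self.in_pod = False


def extract_titles(html):
    """Return the book titles found on one page of HTML."""
    parser = ProductPodParser()
    parser.feed(html)
    return parser.titles


def scrape_known_count(fetch, page_count):
    """Known page count: a for loop over the page numbers."""
    titles = []
    for page in range(1, page_count + 1):
        titles.extend(extract_titles(fetch(page)))
    return titles


def scrape_unknown_count(fetch):
    """Unknown page count: a while loop that stops at the first empty page."""
    titles, page = [], 1
    while True:
        html = fetch(page)
        found = extract_titles(html) if html else []
        if not found:
            break  # no page, or no books on it: assume we are past the last page
        titles.extend(found)
        page += 1
    return titles


# Stand-in for a real HTTP fetch: two fake pages, then nothing.
FAKE_PAGES = {
    1: '<article class="product_pod"><h3><a href="a.html" title="Book A">Book A</a></h3></article>',
    2: '<article class="product_pod"><h3><a href="b.html" title="Book B">Book B</a></h3></article>',
}


def fake_fetch(page):
    return FAKE_PAGES.get(page)


print(scrape_known_count(fake_fetch, 2))
print(scrape_unknown_count(fake_fetch))
```

Passing the fetch step in as a function keeps the loop logic separate from the HTTP details, so the same two functions work with a real request function or with canned pages.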
Learn to use proxies with the requests library, including unauthenticated and authenticated proxies, by loading a proxy list, setting a user agent, handling errors with try/except, and testing access to books.toscrape.com.
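The proxy workflow above might look roughly like the sketch below. The helper names, the host:port and host:port:user:password line format, the example addresses, and the user agent string are all assumptions for illustration, not the course's own code. The proxies, headers, and timeout arguments are standard parameters of requests.get.

```python
def build_proxies(host, port, username=None, password=None):
    """Return a proxies mapping in the form Requests expects."""
    # Authenticated proxies carry credentials inside the proxy URL.
    auth = f"{username}:{password}@" if username and password else ""
    proxy_url = f"http://{auth}{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


def load_proxy_list(lines):
    """Parse 'host:port' or 'host:port:user:password' lines (assumed format)."""
    proxies = []
    for line in lines:
        parts = line.strip().split(":")
        if len(parts) in (2, 4):  # skip blank or malformed lines
            proxies.append(build_proxies(*parts))
    return proxies


def first_working_proxy(proxy_list, url="https://books.toscrape.com", timeout=5):
    """Try each proxy in turn and return the first one that reaches the URL."""
    import requests  # imported here so the helpers above run without Requests

    headers = {"User-Agent": "Mozilla/5.0 (example user agent)"}
    for proxies in proxy_list:
        try:
            response = requests.get(
                url, proxies=proxies, headers=headers, timeout=timeout
            )
            response.raise_for_status()
            return proxies
        except requests.exceptions.RequestException:
            continue  # dead or blocked proxy: move on to the next one
    return None


# Example with made-up addresses from a documentation-only IP range.
sample_lines = ["203.0.113.10:8080", "203.0.113.11:3128:user:secret", ""]
for entry in load_proxy_list(sample_lines):
    print(entry)
```

Credentials that contain characters such as @ or : would need URL-encoding before being placed in the proxy URL; this sketch does not handle that case.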
Install Selenium and ChromeDriver by downloading ChromeDriver for your OS and adding its path to your system environment variables. Then install Selenium in PyCharm and import webdriver to prepare a browser driver.
Learn to browse pages with Selenium by detecting the next button in a while loop and breaking on the last page, then save the scraped quotes, authors, and tags to an Excel file.
Learn to build an infinite scroll workflow with Selenium to load all items on a page, using Python, WebDriver, and a loop that scrolls to the bottom and waits between loads.
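One common way to write the scroll loop described above is to compare the page height before and after each scroll and stop when it no longer grows. The sketch below assumes that stopping rule; the function name, the pause length, and the max_rounds safety limit are illustrative choices. It only calls execute_script, which Selenium's WebDriver provides, so any driver object can be passed in.

```python
import time


def scroll_to_end(driver, pause=2.0, max_rounds=100):
    """Scroll until the page height stops growing; return the number of scrolls."""
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        # Jump to the bottom so the page starts loading the next batch of items.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the new items time to load
        rounds += 1
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded, so this is the end of the list
        last_height = new_height
    return rounds
```

With a real browser this would be called after driver.get(...) on a driver created with webdriver.Chrome(). The max_rounds limit is a guard against pages that never stop growing.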
Set up the project by installing Selenium, pandas, and openpyxl; configure Chrome options to start maximized with detach enabled, create a driver with an implicit wait, and open imdb.com to begin automation.
Automate browser actions with Selenium to reach the Oscar-nominated comedy movie list: handle cookies, navigate genres, and apply filters so the 1,047 titles can be scraped.
Write a while True loop that scrolls to the end of the page and clicks the "see more" button until all items have loaded, using a CSS selector to find the button and a two-second wait.
Develop a parse_books method to create and populate a book item from scraped data (name, price excluding tax, UPC, availability, category, rating, image URL), yielding items for JSON export with Scrapy.
Scrapy pipelines process scraped items before they are saved. Use ItemAdapter to uppercase names, extract the availability count, and convert prices from pounds to dollars, then save the results to JSON.
Learn to drop items with a Scrapy pipeline by checking stock, converting the stock count to an int, and saving only books with at least 10 in stock to the output JSON.
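A dropping pipeline along the lines described above could be sketched as follows. The class name, the availability field name, and the "In stock (19 available)" text format are assumptions for illustration. ItemAdapter and DropItem are the real Scrapy-side names; the fallback classes exist only so the sketch can be run where Scrapy is not installed.

```python
import re

try:
    from itemadapter import ItemAdapter
    from scrapy.exceptions import DropItem
except ImportError:
    # Stand-ins so the sketch runs without Scrapy installed (illustration only).
    class DropItem(Exception):
        pass

    class ItemAdapter:
        def __init__(self, item):
            self.item = item

        def get(self, key, default=None):
            return self.item.get(key, default)

        def __setitem__(self, key, value):
            self.item[key] = value


class StockFilterPipeline:
    """Keep only books with at least `min_stock` copies available."""

    min_stock = 10

    def process_item(self, item, spider):
        adapter = ItemAdapter(item)
        # Pull the first number out of text such as "In stock (19 available)".
        match = re.search(r"\d+", str(adapter.get("availability", "")))
        stock = int(match.group()) if match else 0
        if stock < self.min_stock:
            raise DropItem(f"Only {stock} in stock")
        adapter["availability"] = stock  # store the count as an int
        return item


pipeline = StockFilterPipeline()
book = {"name": "Example Book", "availability": "In stock (19 available)"}
print(pipeline.process_item(book, spider=None))
```

In a Scrapy project the class would also need to be listed in the ITEM_PIPELINES setting before it takes effect. Items for which DropItem is raised are left out of the exported JSON.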
Learn to build a Scrapy CrawlSpider with link extractors and rules to navigate catalog pages, category pages, and book pages, then scrape data into JSON.
In today's data-driven world, web scraping is a powerful tool that enables you to gather data from websites efficiently.
I designed this course to be the most complete web scraping course on Udemy. It is practical and exercise-based, ensuring you learn by doing through exercises and real-life projects.
We'll start with the basics on books.toscrape.com and quotes.toscrape.com (sites designed to be scraped) to help you grasp the fundamentals of web scraping. After learning the basics, we'll dive deep into web scraping on real websites.
If you're new to Python, don't worry: we've got an extra section covering Python fundamentals to get you ready for this course.
What You'll Learn:
Requests and BeautifulSoup:
Parse and extract data from HTML using eBay as an example.
Selenium:
Automate browser interactions with real projects from IMDb.
Scrapy:
Build scalable web scrapers with real-life examples from Flying Tiger and Yelp.
Scrapy-Playwright:
Learn how to scrape dynamic websites with Scrapy by integrating Playwright.
Why This Course?
Hands-on Learning: The course is packed with exercises and real-life projects to help you apply what you learn immediately.
Practical Approach: I will focus on teaching you practical skills that you can use in your own projects.
Support for Beginners: An extra section on Python fundamentals ensures that even those new to programming can follow along and succeed.
Join me in this journey to unlock the full potential of web scraping. With practical exercises and real-world examples, you'll be well-equipped to gather data from the web effectively. Let's get started!