Udemy
Practical Web Scraping Course in Python, Scrapy and Selenium
Rating: 4.4 out of 5 (9 ratings)
143 students

The core of Python web scraping in less than 60 minutes + GitHub repo + Selenium, Scrapy real-life use-cases
Last updated 7/2022
English

What you'll learn

  • How to get data for your content, stock info, crypto exchange reserves, etc.
  • How to scrape sites with popular frameworks
  • Advanced techniques like scraping images, PDFs, graphics, etc.
  • Get more information in less time: save yourself hours of research
  • A working, tested tool that will get you data from 95% of sites

Coding Exercises

This course includes our updated coding exercises so you can practice your skills as you learn.

Course content

5 sections • 28 lectures • 52m total length
  • Hello and Welcome (1:04)

    Welcome. This course will teach you practical ways of capturing data from the internet.


    My name is Mykhailo Kushnir, and I currently work as an ML Engineer in Lviv, Ukraine.


    I need data for both my work and my pet projects. I’m sure most of you are here for similar reasons.


    Like me, you probably don’t want to spend all the time in the world on it. That’s why I’ll try to keep the tutorials as short and condensed as possible.


    Throughout this course, you’ll learn many ways to scrape data, store and version-control it efficiently, use Selenium for data capture, and much more.


    Finally, a small disclaimer: Udemy typically asks you to rate a course before you’ve gone through enough of it to form an opinion. If that happens, feel free to postpone the rating until you know whether this course gave you enough value for its price. Also, if you run into any obstacles while learning, please let me know, and we’ll see if I can help. Otherwise, enjoy the course!

  • How to use the course (1:25)

    Hi, everyone. In this video, I’ll explain how to get the most out of this course.


    First of all, I assume I know your problem. You either want to get data for your own pet project, or you’re looking for a side-gig skill, which scraping is.


    And you want it now.


    I’ve created the course that I would like to watch myself, and I don’t really like long-running stuff. I’ll supply you with reading materials, links and scripts that will help you immediately, but you’ll still have to google. On my end, I promise to pack the content with information and useful tools.


    Most of the code will be placed in this GitHub repository. You’ll find the link to it after the video. By the way, that will be a common pattern: whenever you see an external resource on the screen, you’ll find a link to it in the reading materials after the video.


    For the best results, follow the course in 3 steps:

    • Watch the videos

    • Reproduce the code from them

    • Extend this code to solve some real use cases. I’ll give you some ideas.


    If something goes wrong, reach out to our Slack community for a potential answer.


    Now you’re fully ready for your first tutorial. It won’t be a simple one, but you’ll make it. Good luck!


  • Development environment setup (2:01)

    Hey everyone, in this section I’ll introduce you to the course and give some tips on how to learn with the highest efficiency.


    After the initial overview, we will learn how to set up a programming environment for web scraping. When you complete the video part, you’ll find reading materials with links. Make sure you go through them, as there will be something to learn.


    For this initial setup, you will need Python, Docker, and your favourite IDE. I’d suggest VS Code.


    First, install Python. The best way is to go to Python’s official website and follow the instructions under the Downloads section.


    Next, we’ll install a virtual environment package and use it to create a new environment. You’ll use it to install each project’s requirements.txt throughout this course.


    The virtualenv package helps you avoid versioning issues, so it’s definitely a useful tool.


    If everything was done correctly, you will be able to create a virtual environment and install the packages listed in requirements.txt. Make sure you’ve pulled the source code for this course from GitHub.
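
    A minimal sketch of this step, assuming Python’s built-in venv module stands in for the virtualenv package mentioned in the video. The helper names create_env and install_requirements are illustrative and are not taken from the course repository.

```python
import subprocess
import sys
import venv
from pathlib import Path


def create_env(env_dir: Path, with_pip: bool = True) -> Path:
    """Create a virtual environment and return the path to its Python interpreter."""
    # Assumption: the stdlib venv module is used here instead of virtualenv.
    venv.EnvBuilder(with_pip=with_pip).create(env_dir)
    # The interpreter lives in Scripts\ on Windows and bin/ elsewhere.
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def install_requirements(python: Path, requirements: Path) -> None:
    """Install the listed packages with the environment's own pip (needs network access)."""
    subprocess.run(
        [str(python), "-m", "pip", "install", "-r", str(requirements)],
        check=True,  # raise CalledProcessError if pip reports a failure
    )
```

    Calling create_env(Path(".venv")) and passing the returned interpreter path to install_requirements together with a project’s requirements.txt covers both halves of the step. The .venv folder name is only a common convention, so any directory works.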


    Go to the Docker install page to see how you can set it up on your specific operating system.

    Once Docker is installed, all you need to begin with is to pull the Selenium standalone-chrome image for this course.

    Then start it with the run command.
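
    Once the container is running, a quick way to confirm it is up is to query the server’s status endpoint. This is a sketch under assumptions: it presumes the container’s port is published on localhost:4444 (the image’s documented default) and that the server answers the standard WebDriver /status request. The function names are illustrative.

```python
import json
import urllib.request


def is_ready(payload: object) -> bool:
    """Return True when a WebDriver /status payload reports the server as ready."""
    if not isinstance(payload, dict):
        return False
    value = payload.get("value")
    return isinstance(value, dict) and value.get("ready") is True


def selenium_ready(base_url: str = "http://localhost:4444", timeout: float = 3.0) -> bool:
    """Query the Selenium server status; any connection or parsing problem counts as not ready."""
    # Assumption: the server is reachable at base_url (port 4444 published by the container).
    try:
        with urllib.request.urlopen(f"{base_url}/status", timeout=timeout) as response:
            return is_ready(json.load(response))
    except (OSError, ValueError):
        # OSError covers refused connections and HTTP errors; ValueError covers invalid JSON.
        return False
```

    When selenium_ready() returns True, a client such as Selenium’s webdriver.Remote should be able to connect to the same address.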


    Here is a useful link for VS Code installation as well.


    Once again, if you face issues with this initial setup, make sure you’ve looked through the reading materials after the video section. You can also go to our Slack community to get help from other students.


  • Reading Materials (0:06)
  • Last Check Before We Start
  • [LEARNING TIP] Use captions created for this course (0:03)

Requirements

  • Basic knowledge of Python

Description

With the vast amount of data available on the internet, it's no wonder that web scraping has become such a popular tool for extracting information. Whether you're looking to gather data for research purposes or collect information from a competitor's website, web scraping can be a valuable skill in your toolkit. And with this practical web scraping course, you'll learn everything you need to know to start extracting data from any website. So if you're ready to start learning web scraping, this is the course for you.


Right now, the "Practical Web Scraping Course" is an ongoing project, so it will be updated often and will cover the most recent ways to parse data. You'll also get answers to your questions within a short period. Here's the list of all the topics you'll eventually learn in this course:

  • Tracking HTTP requests in practice

  • Basic scraping with BS4 and requests libraries

  • BS4 tools in detail

  • Efficient scraping with Selenium

  • Visual Intro to Selenium tools

  • Dealing with authentication and user sessions

  • Bypassing Captcha

  • Scraping dynamic websites

  • Selenium and pagination

  • Scraping HighCharts.JS

  • Use Heroku to host your spiders

  • Scrapy Introduction

  • Scrapy integration with DB

  • [Items below will be added in the next part of the course]

  • Hosting Scrapy spiders locally

  • Use schedulers to run Scrapy spiders locally

  • Ethical scraping tools

  • Avoid getting banned

  • Scraping images and PDFs

  • Real-time scraping


With this course you will be able to:

- Save time by learning modern methods of data scraping

- Get information about the most up-to-date scraping tools and techniques

- Avoid being scammed by others selling outdated courses

- Get your money's worth with a complete and comprehensive course


Who this course is for:

  • Data Scientists, Software Engineers, Open Internet Enthusiasts