What you'll learn

Use the Selenium module and scrape with Selenium
Find out how to set up a web driver
Perform debugging with the console and download files
Learn to work with nested selectors and regular expression basics
Discover how to perform parsing with BeautifulSoup
Understand authentication with Wireshark
Master the use of URL Query Strings and HTTP Requests (GET and POST)
Implement streamlining with a headless browser

Course content

3 sections • 12 lectures • 1h 35m total length

The Course Overview (2:44)
This video provides an overview of the entire course.
When to Web Scrape (2:56)
This video explains the course’s expected prerequisite knowledge and system requirements, then introduces the concept of web scraping, situations in which you may want to use it, and why it is a valuable skill to know.
What Makes up a Website (9:49)
Without understanding the foundations of web development, it is challenging to write efficient and robust web scraping scripts, so we will cover how a website is structured and how to locate data with precision.
How to Interact with a Website (8:31)
In order to query a website to scrape data from it, we need to see how the website is structured in its underlying code. We also need an application that will let us test our queries. To do this, we will learn about the element explorer and console of the Chrome Developer Tools.
Using the Selenium Module (12:11)
Now we know how to create CSS selectors and use the Chrome Developer Tools to look at HTML and construct a query, but how do we turn this into a Python script? We use the selenium module and a web driver.
Ethical Web Scraping (4:38)
Now that we know how to web scrape with Python, we need to be aware of the ethical and legal ramifications associated with web scraping. Mainly, the solution is to be considerate and use common sense.

Requesting HTML (9:13)
BeautifulSoup cannot work alone. Although it’s a great tool for parsing and organizing a website’s HTML, it doesn’t get the HTML for us, so we have to figure out another method to request a website’s HTML.
Using the BeautifulSoup Module (13:17)
So, now we have some HTML strings loaded in Python, but how can we use BeautifulSoup to intelligently start selecting important data from it?
Example: Parsing Wikipedia (11:21)
This video shows an example of how to parse a web page, using Wikipedia.
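
As a rough illustration of the "Requesting HTML" step, the sketch below builds a GET request with a URL query string and fetches the response body using only the Python 3 standard library (urllib.request and urllib.parse, which replace Python 2's urllib2). It is not taken from the course material: the function names build_request and fetch_html, the example.com URL, and the User-Agent string are all assumptions made for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_request(base_url, params=None, user_agent="example-scraper/0.1"):
    """Build a GET request, appending params as a URL query string.

    The default user_agent value is a hypothetical placeholder.
    """
    url = base_url
    if params:
        # urlencode turns {"q": "web scraping"} into "q=web+scraping"
        url = base_url + "?" + urlencode(params)
    return Request(url, headers={"User-Agent": user_agent})


def fetch_html(request, timeout=10):
    """Send the request and return the response body as text."""
    with urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling build_request("https://example.com/search", {"q": "web scraping"}) prepares the request without touching the network; passing the result to fetch_html performs the actual download and returns a string that a parser such as BeautifulSoup can then work on.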

Requirements

Prerequisites: Basic foundation in Python programming • Basic Python and pip knowledge suggested • Web development experience beneficial, but not mandatory
Software used: Python (3.3+) • pip package manager • Windows 10 • PhantomJS • Selenium

Description

Python is a high-level programming language used for general-purpose programming. It has a design philosophy which emphasizes code readability and a syntax which allows programmers to express concepts in fewer lines of code than possible in languages such as C++ or Java.

This video course is a rich collection of recipes that will come in handy when you are scraping a website using Python. It addresses the usual and unusual problems you meet while scraping websites by diving deep into the capabilities of Python’s web scraping tools such as Selenium, BeautifulSoup, and urllib2. The course starts by showing how to use the selenium module for scraping: setting up a web driver, debugging with the console, downloading files, and streamlining with a headless browser (PhantomJS). It then demonstrates parsing with BeautifulSoup, including an introduction to BeautifulSoup objects, nested selectors, regular expression basics, and UTF-8 encoding. It ends by showing how to fetch data with urllib2, covering the developer tools Network tab and how to bypass the browser and retrieve files.
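
To give a flavour of the parsing topics mentioned above, here is a minimal sketch of a nested selector and a regular expression filter with BeautifulSoup. It assumes the third-party bs4 package is installed, and the HTML snippet, ids, and class names are hypothetical stand-ins for a downloaded page rather than anything from the course.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a page that was already downloaded.
HTML = """
<div id="content">
  <ul class="links">
    <li><a href="/wiki/Python">Python</a></li>
    <li><a href="/wiki/Selenium">Selenium</a></li>
    <li><a href="https://example.com/other">Other</a></li>
  </ul>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")

# Nested CSS selector: anchors inside list items inside the "links" list.
link_texts = [a.get_text() for a in soup.select("div#content ul.links li a")]

# Regular expression: keep only the hrefs that start with /wiki/.
wiki_hrefs = [a["href"] for a in soup.find_all("a", href=re.compile(r"^/wiki/"))]

print(link_texts)
print(wiki_hrefs)
```

The selector narrows the search step by step (container, list, item, anchor), while the compiled pattern filters on an attribute value, which is useful when the markup offers no convenient class or id.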

By the end of this video, you will understand the in-depth capabilities of Python web scraping tools.

About the Author

Charles Clayton is the sole proprietor of crclayton technologies co and an independent web developer. He is an experienced developer who specializes in Python web scraping solutions and tools such as Selenium, BeautifulSoup, and urllib2. He has 2 years of experience as a Reliability Engineer with West frazweer.

Who this course is for:

This video is for Python developers and web analysts who want to improve their web scraping skills in Python. It is ideal for those who are looking for a reference guide they can use to solve any challenges encountered while web scraping in Python.

Course content

Scraping with Selenium: 6 lectures • 41min

Parsing with BeautifulSoup: 3 lectures • 34min

Fetching with urllib2 and APIs: 3 lectures • 21min
