Professional Web Scraping with Java

Learn how to scrape data from any static or dynamic / AJAX web page using Java in a short and concise way.
4.0 (23 ratings)
596 students enrolled
$19
$50
62% off
Take This Course
  • Lectures 14
  • Length 1 hour
  • Skill Level Intermediate Level
  • Languages English
  • Includes Lifetime access
    30 day money back guarantee!
    Available on iOS and Android
    Certificate of Completion

About This Course

Published 5/2016 English

Course Description

In this short and concise course you will learn everything you need to get started with web scraping using Java.

You will learn the concepts behind web scraping that you can apply to practically any web page (static AND dynamic / AJAX).


Course structure

We start with an overview of what web scraping is and what you can do with it. 

Then we explain the difference between scraping static pages and scraping dynamic / AJAX pages. You learn how to classify a website into one of the two categories and then apply the right concept to scrape the data you want.

Next you will learn how to export the scraped data as either CSV or JSON, two popular formats for further processing.

Unfortunately, many websites try to block scrapers, and sometimes you simply do not want to be detected. In the section "Going undercover" you will learn how to stay undetected and avoid getting blocked.

At the end of the course you can download the full source code of all the lectures, and we look ahead to some advanced topics (private proxies, cloud deployment, multithreading ...). Those advanced topics are covered in a follow-up course I am going to teach.


Why you should take this course

Stop imagining that you could scrape data from websites and use those skills in your next web project. You can do it now.

  • Stay ahead of your competition
  • Be more efficient and automate tedious, manual tasks
  • Increase your value by adding web scraping to your skill set


Enroll now!

What are the requirements?

  • You should already be familiar with Java and Maven at a basic to medium level (the course will not show you how to set up Java, Maven or an IDE)
  • You should be familiar with HTML/CSS and know how to use your browser's developer tools
  • You should know about CSS selectors, because we use them for scraping static web pages
  • Prior knowledge of jQuery helps you get started faster with Jsoup, though this is not required
  • You should know what a web API and AJAX are (basic level is enough)

What am I going to get from this course?

  • Have a solid understanding of web scraping with Java
  • Be able to scrape practically any web page (static AND dynamic / AJAX) because you learn the concepts behind web scraping
  • Download, parse and extract data from websites with Jsoup
  • Call web APIs in Java with Unirest
  • Export your data as CSV or JSON
  • Build web scrapers that stay undetected and do not get blocked or banned

What is the target audience?

  • Anyone with an interest in learning web scraping and understanding the concepts
  • Anyone who likes a short and concise course
  • This course is NOT an introduction to Java
  • This course will NOT show you how to set up your development environment
  • This course is intended to get you started with web scraping. Very advanced topics (e.g. private proxies, cloud deployment, multithreading) are discussed but not implemented in this course. I will cover them separately in an advanced / enterprise-level course.
  • Windows, Mac, or Linux PC

Curriculum

Section 1: Course Introduction
01:53

Get an overview of the course and of the requirements needed to proceed.

Section 2: Scraping static web pages
00:52

In this lecture we discuss what a static web page is.

02:00

In this lecture you learn the concept behind scraping static web pages. We look at the concrete steps needed to scrape practically any static page out there. A live example is provided at the end of this section...

05:45

We introduce the Jsoup library. It helps with downloading a page, parsing it, and extracting elements from it using CSS selectors. It has a lot in common with jQuery, so prior knowledge of jQuery is helpful but not necessary.

We also develop a simple example program using this library...

14:03

In this example we build a web scraper that gets the top 10 Google search results for any search query and prints the title and URL of each search result to the console. Later we store the results in a simple text file.

Section 3: Scraping dynamic / AJAX web pages
01:55

In this lecture we discuss what a dynamic / AJAX web page is and how it differs from a static one.

02:35

In this lecture you learn the concept behind scraping dynamic / AJAX web pages. Later we show you how to actually apply this concept to a concrete example.

11:20

In this lecture you learn how to make HTTP Requests with the Unirest Java library. We develop a simple live example where you can see the most important features for web scraping in action.
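
As a rough illustration of the idea (not the course's own code), the sketch below builds the kind of GET request a dynamic page might send in the background. It uses the JDK's built-in java.net.http client (Java 11+) as a stand-in for Unirest. The endpoint https://api.example.com/search and the parameter names are made up for this example.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Sketch only: the JDK HTTP client stands in for Unirest here.
class AjaxRequestSketch {

    // Turn parameters into a URL-encoded query string, e.g. "a=1&b=two+words".
    static String buildQuery(Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return query.toString();
    }

    // Build a GET request against the API endpoint that the page itself would call.
    static HttpRequest buildApiRequest(String endpoint, Map<String, String> params) {
        return HttpRequest.newBuilder(URI.create(endpoint + "?" + buildQuery(params)))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    // Send the request and return the raw response body (typically JSON).
    static String fetch(HttpRequest request) throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("name", "John Doe"); // hypothetical parameter
        params.put("state", "CA");      // hypothetical parameter

        // Hypothetical endpoint; find the real one in your browser's network tab.
        HttpRequest request = buildApiRequest("https://api.example.com/search", params);
        System.out.println(request.method() + " " + request.uri());
        // fetch(request) would perform the call; it is skipped because the endpoint is made up.
    }
}
```

The request is only built and printed, so the sketch runs without network access. Passing the request to fetch against a real endpoint would return the response body for further parsing.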

14:59

In this example we scrape the results from peoplefinders.com which are loaded dynamically via AJAX requests.

Section 4: Exporting your data
02:10

In this lecture we export the data from the Google top 10 search results example as CSV for further processing. You can open the file in Numbers, Excel or OpenOffice, where you can sort and filter the data, which is really useful.
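
A minimal sketch of what such a CSV export could look like, using only the JDK. The class name, the title and url columns, the sample rows and the results.csv file name are assumptions for illustration; the course's own code may differ.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Sketch only: all names here are illustrative, not taken from the course.
class CsvExportSketch {

    // One scraped search result: a title and a URL.
    static final class SearchResult {
        final String title;
        final String url;

        SearchResult(String title, String url) {
            this.title = title;
            this.url = url;
        }
    }

    // Wrap a field in double quotes and double any embedded quotes,
    // so commas and quotes inside a title do not break the columns.
    static String escape(String field) {
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Produce a header line followed by one CSV line per result.
    static List<String> toCsvLines(List<SearchResult> results) {
        List<String> lines = new ArrayList<>();
        lines.add("title,url");
        for (SearchResult r : results) {
            lines.add(escape(r.title) + "," + escape(r.url));
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data standing in for scraped results.
        List<SearchResult> results = List.of(
                new SearchResult("A \"quoted\" title, with a comma", "https://example.com/a"),
                new SearchResult("Second result", "https://example.com/b"));

        Path out = Paths.get("results.csv");
        Files.write(out, toCsvLines(results), StandardCharsets.UTF_8);
        System.out.println("Wrote " + results.size() + " rows to " + out.toAbsolutePath());
    }
}
```

Quoting every field keeps the writer simple and still produces a file that spreadsheet applications can open.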

04:22

In this lecture we export the data from the Google top 10 search results example as JSON for further processing.
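
The sketch below shows one way a JSON export could work with nothing but the JDK, writing the string escaping by hand. This is an assumption made to keep the example self-contained: the course description does not say how the JSON is produced, and a real project would more likely use a JSON library. The class name, field names and results.json file name are illustrative.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

// Sketch only: hand-rolled JSON output to avoid an external dependency.
class JsonExportSketch {

    // One scraped search result: a title and a URL.
    static final class SearchResult {
        final String title;
        final String url;

        SearchResult(String title, String url) {
            this.title = title;
            this.url = url;
        }
    }

    // Quote a string as a JSON string literal, escaping special characters.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become \\uXXXX escapes.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    // Serialize the results as a compact JSON array of objects.
    static String toJson(List<SearchResult> results) {
        return results.stream()
                .map(r -> "{\"title\":" + quote(r.title) + ",\"url\":" + quote(r.url) + "}")
                .collect(Collectors.joining(",", "[", "]"));
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data standing in for scraped results.
        List<SearchResult> results = List.of(
                new SearchResult("First \"result\"", "https://example.com/a"),
                new SearchResult("Second result", "https://example.com/b"));

        String json = toJson(results);
        Files.write(Paths.get("results.json"), json.getBytes(StandardCharsets.UTF_8));
        System.out.println(json);
    }
}
```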

Section 5: Going undercover
02:22

You will learn how to become invisible and hide the traces of being a web scraper. This will help you avoid getting blocked or banned.

Bonus: in the resource section you will find an undercover web scraper that builds upon the Google scraper from the previous lecture. You can use it as a foundation for creating your own scrapers.
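
This page does not spell out which techniques the lecture covers. As a sketch of two commonly used ones, the example below sends a browser-like User-Agent header and picks a random pause between requests. The user agent strings, the delay range and the example.com URLs are assumptions, and the JDK's java.net.http API (Java 11+) is used instead of the libraries from the course.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Sketch only: illustrative values, not the course's undercover scraper.
class UndercoverRequestSketch {

    // Example browser-like user agents (assumed values; use current ones in practice).
    static final List<String> USER_AGENTS = List.of(
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36",
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15");

    // Pick one of the user agents at random so requests do not all look identical.
    static String randomUserAgent() {
        return USER_AGENTS.get(ThreadLocalRandom.current().nextInt(USER_AGENTS.size()));
    }

    // Build a GET request that carries browser-like headers.
    static HttpRequest buildRequest(String url, String userAgent) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", userAgent)
                .header("Accept-Language", "en-US,en;q=0.9")
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
    }

    // Random delay in milliseconds, inclusive of both bounds.
    static long randomDelayMillis(long minMillis, long maxMillis) {
        return ThreadLocalRandom.current().nextLong(minMillis, maxMillis + 1);
    }

    public static void main(String[] args) {
        // Hypothetical target pages.
        List<String> urls = List.of("https://example.com/page1", "https://example.com/page2");
        for (String url : urls) {
            HttpRequest request = buildRequest(url, randomUserAgent());
            long delay = randomDelayMillis(1000, 3000);
            System.out.println(request.uri() + " as "
                    + request.headers().firstValue("User-Agent").orElse("?")
                    + ", then wait " + delay + " ms");
            // A real scraper would send the request and call Thread.sleep(delay) here.
        }
    }
}
```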

Section 6: Conclusion
01:20

Thank you for taking this online course. You can download the full source code of all lectures here. I will give you an overview of what's next...

Article

In this lecture you will find a mind map with the contents of the course, so you have a one-page overview of all the information.

Instructor Biography

Patrick Meier, Entrepreneur, Software Developer

I am an entrepreneur and software developer who really enjoys building and learning new things. I now have over 7 years of experience from working in different companies (big and small) and even founding my own startup.

I built several scalable backend systems in the cloud running on Java and Spring. Then I discovered JavaScript as a language for creating different kinds of things - from webapps to mobile apps and even the backend using NodeJs.

I love to share what I have learned with YOU so that you can be more effective and successful.
