Tips and tricks for web scraping and data extraction
4.0 (7 ratings)
1,998 students enrolled

A guide to the most interesting software and add-ons for getting started with web data extraction
Created by Valentina Porcu
Last updated 11/2015
English
Includes:
  • 2 hours on-demand video
  • 2 Supplemental Resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Extract data from many websites
  • Scrape data from difficult websites
  • Extract user comments from Amazon, YouTube, IMDb, Facebook and Disqus
  • Anonymize your searches
  • Extract data from Twitter
Requirements
  • A PC and an ADSL (broadband) internet connection. Mac or Linux users will need a virtual machine, because much of the software runs only on Windows.
Description

Please note: the course will be completely revised in the coming weeks! If you want to buy it, please come back later or be patient. Thanks :)

This course is designed to teach the basics of web scraping and data extraction. You will find many data mining techniques, from single-page extraction to more complex ones such as building a crawler for research.
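
As a rough illustration of what single-page extraction involves under the hood, here is a minimal Python sketch (the course itself relies on ready-made tools and requires no programming). It collects the links found on one page, which is also the step a crawler repeats for every page it visits. The names `LinkExtractor`, `extract_links` and `SAMPLE_PAGE`, and the example.com URLs, are illustrative assumptions and not part of the course material.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the absolute URL of every <a href="..."> on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # used to resolve relative links
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all links in `html`, resolved against `base_url`."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content; a real run would download the HTML first.
SAMPLE_PAGE = """
<html><body>
  <a href="/products/1">First product</a>
  <a href="https://example.com/about">About</a>
  <p>No link here</p>
</body></html>
"""

links = extract_links(SAMPLE_PAGE, "https://example.com")
print(links)
```

A crawler would take each returned link, download that page, and run the same extraction again, keeping track of the pages it has already visited.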

Some sections also focus on software tailored to extracting:

- images

- geolocated data

- SEO data

- user comments

You will also discover how to anonymize your extractions to prevent a block or a temporary ban.

No programming knowledge is required; all the software is really easy to use.

You will find many examples of data mining in marketing and SEO, along with advice on the best software to "tame" difficult sites.

PS: some tutorials cover paid software. I have no agreement with the companies that produce it; I just use it in my work. Whenever possible, I use free software.

The course is now updated for the new Import.io API.

Who is the target audience?
  • People who love challenges
  • Students who want to learn more about scraping
Curriculum For This Course
32 Lectures
01:56:45
Introduction
1 Lecture 04:49

An introductory lecture on website scraping.

Preview 04:49
The easiest tools to extract a single page from the web
4 Lectures 07:32

Some tools to screen scrape a single page: awesomescreenshot

Preview 02:33

Some tools to screen scrape a single page: url2png

Preview 00:51

Some tools to screen scrape a single page: printfriendly

Preview 01:27

Let's start with a plugin for data extraction: Convextra. It works with browsers like Chrome and Firefox

Preview 02:41
How to extract data by Import.io
7 Lectures 33:04

First steps with Import.io

Starting Import.io
01:49

How to install one of the best web scrapers: Import.io

How to install Import.io
01:22

How to create an extractor for mining a website with Import.io

Magic tool
03:00

Setting up an extractor manually with Import.io

Setting an extractor
03:39

Create a crawler for mining and crawl a product page

How to create a crawler
08:45

Scraping a website with Import.io through the connector

How to create a connector
10:29

How to extract data from multiple links

Bulk link on Import
04:00
How to import data on Google Drive and extract tables
2 Lectures 14:06

Extract some data with Google Sheets

Google Drive: IMPORTFEED, IMPORTDATA and IMPORTHTML
06:50

Google Drive: IMPORTXML
07:16
How to extract data from Twitter
5 Lectures 15:46

How to register a Twitter app

How to create a Twitter App
02:39

The scraping tool to extract data from Twitter

How to extract Tweets with TAGS
04:26

In this lesson, we will learn how to extract tweets with TAGS and Google Sheets

Extract tweets with Twitter Quick Extractor
01:26

Software for tweets extraction

Bonus lecture - 1
1 page

How to mine tweeters automatically

How to extract Tweeters
07:15
Scraping for SEO and Marketing
3 Lectures 18:30

ScrapeBox for keyword analysis

ScrapeBox for keyword extraction and analysis
14:29

Extract data from Yellow pages - with Datatool
01:22

Extract data from Yellow pages - with Import.io
02:39
Other tools
4 Lectures 12:41
Datatool
04:06

Scraper
01:02

OutWit Hub
05:15

File download with DownThemAll
02:18
Tools to automate some data extraction
5 Lectures 08:14
Some tools to automate data extraction
00:41

Extracting data from eBay
02:16

From Etsy
01:29

Geolocated images from Instagram
02:15

From an RSS feed
01:33
Conclusions
1 Lecture 01:03
Conclusions
01:03
About the Instructor
Valentina Porcu
4.4 Average rating
107 Reviews
2,496 Students
9 Courses
Data Scientist

I'm a computer geek, passionate about data mining and research, with a Ph.D. in communication and complex systems and years of experience teaching at universities in Italy, France and Morocco, and online, of course!

I work as a consultant in the field of data mining and machine learning, and I like writing about new technologies and data mining.

I have spent the last 9 years working as a freelancer and researcher in social media analysis, benchmark analysis and web scraping for database building, particularly buzz analysis and sentiment analysis, for universities, startups and web agencies across the UK, France, the US and Italy.