Tips and tricks for web scraping and data extraction

A guide to the most interesting software and add-ons for getting started with web data extraction
4.0 (7 ratings)
1,964 students enrolled
Instructed by Valentina Porcu Business / Strategy
$50
  • Lectures 32
  • Contents Video: 2 hours
    Other: 1 min
  • Skill Level All Levels
  • Languages English
  • Includes Lifetime access
    30 day money back guarantee!
    Available on iOS and Android
    Certificate of Completion

About This Course

Published 8/2014 English

Course Description

This course is designed to teach the basics of web scraping and data extraction. It covers a range of data mining techniques, from single-page extraction to more complex approaches such as building a crawler for research.

Some sections also focus on software tailored to extracting:

- images

- geolocated data

- SEO data

- user comments

You will also discover how to anonymize your extractions to prevent a block or a temporary ban.

No programming knowledge is required; all the software is really easy to use.

You will find many examples of data mining in marketing and SEO, along with advice on the best software to "tame" difficult sites.

PS: some tutorials cover paid software. I have no agreement with the software houses that produce it; I just use it in my work. Whenever possible, I use free software.

The course has been updated for the new Import.io API.

What are the requirements?

  • A PC and an ADSL connection. Mac or Linux users will need a virtual machine (much of the software runs only on Windows)

What am I going to get from this course?

  • Extract data from many websites
  • Scrape data from difficult websites
  • Extract user comments from Amazon, YouTube, IMDb, Facebook and Disqus
  • Anonymize your searches
  • Extract data from Twitter

What is the target audience?

  • People who love challenges
  • Students who want to learn more about scraping

Curriculum

Section 1: Introduction
04:49

An introductory lecture on website scraping.

Section 2: The easiest tools to extract a single page from the web
02:33

Some tools to screen scrape a single page: awesomescreenshot

00:51

Some tools to screen scrape a single page: url2png

01:27

Some tools to screen scrape a single page: printfriendly

02:41

Let's start with a plugin for data extraction: Convextra. It works with browsers like Chrome and Firefox

Section 3: How to extract data with Import.io
01:49

First steps with Import.io

01:22

How to install one of the best web scrapers: Import.io

03:00

How to create an extractor for mining a website with Import.io

03:39

Setting up an extractor manually with Import.io

08:45

Create a crawler for mining and crawl a product page

10:29

Scraping a website with Import.io through the connector

04:00

How to extract data from multiple links

Section 4: How to import data into Google Drive and extract tables
06:50

Extract some data with Google Spreadsheet

Google Drive: IMPORTXML
07:16
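
For readers curious about what an IMPORTXML-style lookup does under the hood, here is a minimal Python sketch. In Google Sheets the function takes a URL and an XPath query; this sketch skips the download step and runs a similar query against an inline sample document. The sample data and the helper name import_xml are assumptions made for illustration and are not part of the course material.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical sample document standing in for a downloaded page.
SAMPLE_XML = """<catalog>
  <product><name>Lamp</name><price>19.90</price></product>
  <product><name>Chair</name><price>45.00</price></product>
</catalog>"""


def import_xml(document: str, path: str) -> List[str]:
    """Return the text of every element matching a simple XPath-style path.

    ElementTree supports only a subset of XPath, so this is a rough
    analogue of IMPORTXML, not a full reimplementation.
    """
    root = ET.fromstring(document)
    return [(element.text or "").strip() for element in root.findall(path)]


# One query per "column", much like one IMPORTXML formula per spreadsheet column.
names = import_xml(SAMPLE_XML, "./product/name")
prices = import_xml(SAMPLE_XML, ".//price")
print(list(zip(names, prices)))
```

Each call returns one column of values, which mirrors how a spreadsheet would fill a column from a single formula.
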
Section 5: How to extract data from Twitter
02:39

How to register a Twitter app

04:26

The scraping tool to extract data from Twitter

01:26

In this lesson, we will learn how to extract tweets with TAGS and Google Sheets

1 page

Software for tweets extraction

07:15

How to mine tweets automatically

Section 6: Scraping for SEO and Marketing
14:29

ScrapeBox for keyword analysis

Extract data from Yellow Pages - with Datatool
01:22
Extract data from Yellow Pages - with Import.io
02:39
Section 7: Other tools
Datatool
04:06
Scraper
01:02
OutWit Hub
05:15
File download with DownThemAll
02:18
Section 8: Tools to automate some data extraction
Some tools to automate data extraction
00:41
Extracting data from eBay
02:16
From Etsy
01:29
Geolocated images from Instagram
02:15
From an RSS feed
01:33
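
The course relies on ready-made tools, but the idea behind pulling data from an RSS feed can be sketched in a few lines of Python. The feed content and the function name parse_rss_items below are made up for illustration; a real feed would be downloaded first and may include extra fields or XML namespaces that this sketch ignores.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical RSS 2.0 feed used in place of a downloaded one.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First post</title><link>http://example.com/1</link></item>
    <item><title>Second post</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def parse_rss_items(feed_xml: str) -> List[Dict[str, str]]:
    """Collect the title and link of every <item> element in the feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        })
    return items


feed_items = parse_rss_items(SAMPLE_FEED)
for entry in feed_items:
    print(entry["title"], entry["link"])
```

Because a feed is already structured XML, no page-specific selectors are needed, which is why feeds are often the simplest source to extract from.
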
Section 9: Conclusions
Conclusions
01:03

Instructor Biography

Valentina Porcu, Marketing Strategist and Data Analyst

Valentina is a computer geek who is passionate about data mining and research, with a Ph.D. in communication and complex systems and years of experience teaching at universities in Italy, France and Morocco, and online, of course!

She works as a consultant in the field of data mining and machine learning, and she likes writing about new technologies and data mining.

She has spent the last 9 years working as a freelancer in social media analysis, benchmark analysis and web scraping for database building, particularly buzz analysis and sentiment analysis for startups and web agencies across the UK, France, the US and Italy.
