Udemy
30-Day Money-Back Guarantee

Web Crawling with Nodejs (H&M, Amazon, LinkedIn, AliExpress)

Learn how to create a web crawler using various methods on popular sites like H&M, Amazon, LinkedIn, AliExpress!
Highest Rated
Rating: 4.8 out of 5 (45 ratings)
298 students
Created by Stefan Hyltoft
Last updated 11/2020
English
English [Auto]

What you'll learn

  • Differences between web crawling and web scraping in Nodejs
  • The 3 main methods used in web crawling, and when to use each one
  • How to get data from sites like H&M and AliExpress quickly and easily using their hidden APIs
  • How to build a web crawler for server-rendered sites like Amazon to crawl all their products
  • How to build a Puppeteer-based web crawler for a site that requires JavaScript, like LinkedIn

Course content

5 sections • 38 lectures • 2h 38m total length

  • Preview
    03:42
  • Preview
    04:56
  • Preview
    01:18
  • Preview
    02:32
  • Preview
    05:21

  • Preview
    06:12
  • Testing hidden API inside Postman, and finding other section API endpoints
    07:34
  • Initializing NPM + some info about Nodejs Request and Needle
    01:45
  • Creating our HTTP request with needle inside Nodejs
    02:20
  • Adding a User-Agent header to get past request denial in Nodejs
    03:24
  • Creating MongoDB cluster for saving data
    04:05
  • Connecting to MongoDB cluster from Nodejs
    05:20
  • Saving data to MongoDB
    04:53
  • Getting all products into MongoDB using a loop with an offset variable and page size
    07:35
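
The offset-and-page-size loop named in the last lecture above can be sketched roughly as follows. This is a minimal sketch, not the course's code: `fetchPage` is a hypothetical callback standing in for the HTTP call to the hidden API, and the in-memory `fakeCatalog` is an assumption that replaces both the real endpoint and MongoDB so the example runs on its own.

```javascript
// Collect every item from a paginated source by advancing an offset.
// `fetchPage(offset, pageSize)` is a hypothetical callback that resolves
// to an array of items; a page shorter than pageSize marks the end.
async function fetchAllProducts(fetchPage, pageSize) {
  const all = [];
  let offset = 0;
  while (true) {
    const page = await fetchPage(offset, pageSize);
    all.push(...page);
    if (page.length < pageSize) break; // last (or empty) page reached
    offset += pageSize;
  }
  return all;
}

// Stand-in data source (assumption): 95 fake products held in memory.
const fakeCatalog = Array.from({ length: 95 }, (_, i) => ({ id: i + 1 }));
const fakeFetchPage = async (offset, pageSize) =>
  fakeCatalog.slice(offset, offset + pageSize);

fetchAllProducts(fakeFetchPage, 30).then((items) =>
  console.log(`fetched ${items.length} products`)
);
```

In a real crawler, the body of `fetchPage` would issue the request and each page would be saved to the database before the offset advances.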

  • Finding hidden API using Chrome Dev Tools
    05:24
  • Making API request from Postman with correct headers
    04:25
  • Making API request from Nodejs using Fetch API
    05:49
  • Getting many items using a for loop and sleep function
    04:58
  • Preview
    03:03
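
The "for loop and sleep function" idea from this section can be illustrated with a short sketch. The names `fetchSequentially`, `fetchItem`, and `fakeFetchItem` are hypothetical, and the stand-in fetcher is an assumption that replaces the real API request so the example runs offline.

```javascript
// Promise-based sleep: resolves after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Request items one at a time, pausing between requests.
// `fetchItem(id)` is a hypothetical callback for the real API call.
async function fetchSequentially(ids, fetchItem, delayMs) {
  const results = [];
  for (let i = 0; i < ids.length; i++) {
    results.push(await fetchItem(ids[i]));
    if (i < ids.length - 1) await sleep(delayMs); // no pause after the last item
  }
  return results;
}

// Stand-in fetcher (assumption): returns a fake item instead of calling an API.
const fakeFetchItem = async (id) => ({ id, title: `Item ${id}` });

fetchSequentially([1, 2, 3], fakeFetchItem, 50).then((items) =>
  console.log(items.map((item) => item.title).join(', '))
);
```

Spacing requests out this way keeps the request rate low, which is the usual reason for adding a sleep between iterations.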

  • Intro to project
    01:08
  • Why are we using HTTP requests and not Puppeteer?
    04:12
  • Initializing NPM + installing jest, cheerio and needle npm packages
    01:26
  • Writing our reusable httpRequest module for our testing and crawling
    02:39
  • Creating our test HTML file (check resources for URL)
    05:11
  • Setting up testing and intro to testing
    03:26
  • Writing our first test for our HTML parser
    04:16
  • Getting title from product page and making our test pass
    05:37
  • Preview
    02:50
  • Making our second test and getting product links from page
    06:40
  • Writing our actual web crawling code in 6 minutes!
    08:38
  • Setting up so we crawl only unique product IDs
    03:07
  • Adding a new test case for a different layout + outro
    05:21
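
Crawling only unique product IDs, as mentioned in this section, might look something like the sketch below. It is not the course's implementation: the `/dp/<10 characters>` link pattern is an assumption about the URL layout, `getProductLinks` is a hypothetical callback that would download and parse a product page, and `fakeLinks` is a stand-in link graph so the example runs without network access.

```javascript
// Pull a product ID out of a link. The `/dp/<10 characters>` pattern is an
// assumption about the URL layout, not something taken from the course.
function extractProductId(url) {
  const match = url.match(/\/dp\/([A-Z0-9]{10})/);
  return match ? match[1] : null;
}

// Breadth-first crawl that visits each product ID at most once.
// `getProductLinks(id)` is a hypothetical callback resolving to the
// product links found on that product's page.
async function crawlUnique(startId, getProductLinks, maxPages) {
  const seen = new Set([startId]);
  const queue = [startId];
  const visited = [];
  while (queue.length > 0 && visited.length < maxPages) {
    const id = queue.shift();
    visited.push(id);
    for (const link of await getProductLinks(id)) {
      const linkedId = extractProductId(link);
      if (linkedId && !seen.has(linkedId)) {
        seen.add(linkedId); // mark before queueing so duplicates are skipped
        queue.push(linkedId);
      }
    }
  }
  return visited;
}

// Stand-in link graph (assumption) with duplicate and cyclic links.
const fakeLinks = {
  B000000001: ['/dp/B000000002', '/x/dp/B000000003?ref=a', '/dp/B000000002'],
  B000000002: ['/dp/B000000001', '/dp/B000000003'],
  B000000003: ['/help/contact'],
};
const fakeGetProductLinks = async (id) => fakeLinks[id] || [];

crawlUnique('B000000001', fakeGetProductLinks, 100).then((ids) =>
  console.log(ids.join(' '))
);
```

The `Set` is what prevents the crawler from looping forever when product pages link back to each other.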

  • Intro to project
    00:31
  • Initializing project with puppeteer and cheerio packages
    00:33
  • Opening Puppeteer browser and navigating to URL
    02:43
  • Preview
    06:04
  • Getting profile links on a LinkedIn profile
    04:47
  • Building web crawler loop for Puppeteer
    04:32

Requirements

  • Basic JavaScript

Description

Do you want to build a web crawler in Nodejs?

In this course, you will learn how to build a web crawler using the newest JavaScript syntax, working with popular sites like H&M, Amazon, LinkedIn and AliExpress!

You'll learn how to find hidden APIs on sites like H&M and AliExpress, and see how you can sometimes avoid building a web crawler in the first place. You can save a lot of time this way!

Then I show how to build a web crawler for Amazon the test-driven way, by writing tests for the various product page layouts that exist on Amazon.

After that, we'll take a look at how to automate logging in and scraping profiles on LinkedIn using Puppeteer, the automated Chromium browser!

Who this course is for:

  • Students looking to learn web crawling with Nodejs
  • Students looking to learn web scraping with Nodejs

Instructor

Stefan Hyltoft
B.Eng Software Engineer
  • 4.3 Instructor Rating
  • 1,556 Reviews
  • 26,977 Students
  • 8 Courses

Stefan has been building software since elementary school, starting out with Visual Basic 6.0. Since then he has dabbled in Python (Pygame), PHP & MySQL, and, during university, Java. After discovering the JavaScript world, he developed an intense interest in web development, especially using ReactJs/React Native on the frontend and NodeJs for the backend.

© 2021 Udemy, Inc.