50 Hrs Big Data Mastery: PySpark, AWS, Scala & Data Scraping
Rating: 4.1 out of 5 (228 ratings)
2,190 students

Comprehensive Big Data Mastery: Scala, Spark, PySpark, AWS, Data Scraping & Data Mining with Python, and MongoDB
Last updated 12/2025
English

What you'll learn

  • Introduction and importance of this course in this day and age
  • Approach all essential concepts from the beginning
  • Clear unfolding of concepts with examples in Python, Scrapy, Scala, PySpark and MongoDB
  • All theoretical explanations followed by practical implementations
  • Data Scraping & Data Mining for Beginners to Pro with Python
  • Master Big Data with Scala and Spark
  • Master Big Data With PySpark and AWS
  • Mastering MongoDB for Beginners
  • Building your own AI applications

Course content

4 sections • 623 lectures • 54h 39m total length
  • Introduction: Why Data Scraping 2:42

    Explore how data scraping extracts internet data for research, analysis, and machine learning, and why it’s a high-demand, high-pay skill with freelance and professional opportunities.

  • Introduction: Applications of Data Scraping 7:09
  • Introduction: Introduction of Instructor 0:40
  • Introduction: Introduction to Course, Scraping, Tools 1:39
  • Introduction: Projects Overview 3:42
  • Introduction: Request for Your Honest Review 1:18

    Explore the remaining sections to judge how the concepts are presented and whether the content merits a five-star rating in the Udemy review system. The course is updated in response to reviews.

  • Requests: Introduction to Python Requests 3:57
  • Requests: Hands-on with Requests 8:28

    Use the Python requests module to fetch a web page, inspect the HTML response, and use status codes to validate the request for effective web scraping and data extraction.

  • Requests: Extracting Quotes Manually 10:05

    Practice using the requests module to fetch a server response, extract text and emails, parse HTML to pull quotes, and save the results to a file.

  • Requests: Quiz (Extracting Authors) 0:40

    Participate in a quiz that requires extracting author names from a quote div and saving them in a file, noting the two quotes and their order.

  • Requests: Solution (Extracting Authors) 6:11
  • Requests: Pagination 9:46
  • Requests: Quiz (Extracting Author and Quotes) 0:58
  • Requests: Solution 01 (Extracting Author and Quotes) 6:27
  • Requests: Solution 02 (Extracting Author and Quotes) 5:52

    Extract quotes and author names from a structured response by iterating over its lines, saving each quote, and pairing it with the author name that follows, then write the results to a file.
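
The line-by-line pairing described here can be sketched with the standard library alone. This is a minimal sketch, not the course's solution: the sample markup, the `text` and `author` class names, and the output file name are assumptions made for illustration, and in the lectures the response text would come from the requests module.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical response body; the markup and class names are assumptions.
SAMPLE_HTML = """\
<div class="quote">
  <span class="text">First sample quote.</span>
  <small class="author">Author One</small>
</div>
<div class="quote">
  <span class="text">Second sample quote.</span>
  <small class="author">Author Two</small>
</div>
"""

TAG = re.compile(r"<[^>]+>")  # matches any HTML tag so it can be stripped


def pair_quotes_with_authors(html: str) -> list[tuple[str, str]]:
    """Walk the response line by line, pairing each quote with the next author."""
    pairs = []
    pending_quote = None
    for line in html.splitlines():
        if 'class="text"' in line:
            # Remember the quote until its author line shows up.
            pending_quote = TAG.sub("", line).strip()
        elif 'class="author"' in line and pending_quote is not None:
            pairs.append((pending_quote, TAG.sub("", line).strip()))
            pending_quote = None
    return pairs


def save_pairs(pairs, path: Path) -> None:
    """Write one 'quote - author' line per pair."""
    with path.open("w", encoding="utf-8") as fh:
        for quote, author in pairs:
            fh.write(f"{quote} - {author}\n")


pairs = pair_quotes_with_authors(SAMPLE_HTML)
out_file = Path(tempfile.gettempdir()) / "quotes_and_authors.txt"
save_pairs(pairs, out_file)
```

String matching like this is brittle against markup changes, which is one reason the later lectures move to a real HTML parser.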

  • Requests: Ajax Requests 6:36
  • Requests: Ajax Requests for Cricinfo 8:25

    Learn to fetch Cricinfo data with the requests module, parse JSON with json.loads, and extract authors and news summaries from a list of dictionaries, including pagination across pages.
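
The `json.loads` step and the page-by-page loop might look roughly like the sketch below. The payloads, the field names (`results`, `author`, `summary`), and the `fetch_page` helper are all hypothetical stand-ins; the real Cricinfo endpoints and schema are not given on this page, and in practice the text would come from a requests call.

```python
import json

# Hypothetical Ajax payloads, one JSON string per page. The field names are
# assumptions for illustration, not the real API schema.
PAGES = {
    1: '{"results": [{"author": "Author One", "summary": "First summary"},'
       ' {"author": "Author Two", "summary": "Second summary"}]}',
    2: '{"results": [{"author": "Author Three", "summary": "Third summary"}]}',
}


def fetch_page(page: int) -> str:
    """Stand-in for an HTTP call such as requests.get(url, params={"page": page}).text."""
    return PAGES.get(page, '{"results": []}')


def collect_stories() -> list[tuple[str, str]]:
    """Request page after page until one comes back empty."""
    stories = []
    page = 1
    while True:
        payload = json.loads(fetch_page(page))  # JSON text -> Python dict
        items = payload["results"]              # a list of dictionaries
        if not items:
            break
        for item in items:
            stories.append((item["author"], item["summary"]))
        page += 1
    return stories


stories = collect_stories()
```

Stopping on the first empty page is only one possible termination rule; some APIs expose a total page count or a "next" link instead.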

  • Requests: Ajax Requests Pagination 3:53
  • Requests: Quiz (Extracting Top Stats from Cricinfo) 1:22
  • Requests: Solution 01 (Extracting Top Stats from Cricinfo) 7:16

    Inspect Ajax requests with the browser network panel to identify the API endpoints that fetch data as you scroll, then replicate these requests locally to extract top stats from Cricinfo.

  • Requests: Solution 02 (Extracting Top Stats from Cricinfo) 9:17
  • Beautiful Soup 4 (BS4): Introduction to BS4 3:02
  • Beautiful Soup 4 (BS4): Quiz (Difference between Requests and BS4) 0:25
  • Beautiful Soup 4 (BS4): Solution (Difference between Requests and BS4) 1:04
  • Beautiful Soup 4 (BS4): Hands-on with BS4 5:54
  • Beautiful Soup 4 (BS4): Extracting Data from Tree 8:50
  • Beautiful Soup 4 (BS4): Extracting Quotes from the Website 7:33
  • Beautiful Soup 4 (BS4): Quiz (Extracting Author Names) 0:38
  • Beautiful Soup 4 (BS4): Solution (Extracting Author Names) 5:28

    Use Python with requests and BeautifulSoup to extract author names from HTML by targeting small tags with class author, then write the names to a CSV file.
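
As a rough illustration of that extraction, the sketch below collects the text of every `<small class="author">` element and writes it out as CSV. To stay dependency-free it swaps in the standard library's `html.parser` for BeautifulSoup; with BeautifulSoup the same lookup is typically written as `soup.find_all("small", class_="author")`. The sample markup is an assumption.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup; in the lecture this would be the fetched page.
SAMPLE_HTML = """
<div class="quote"><span class="text">First sample quote.</span>
  <small class="author">Author One</small></div>
<div class="quote"><span class="text">Second sample quote.</span>
  <small class="author">Author Two</small></div>
"""


class AuthorParser(HTMLParser):
    """Collects the text of every <small class="author"> element."""

    def __init__(self):
        super().__init__()
        self.authors = []
        self._in_author = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "small" and "author" in classes:
            self._in_author = True

    def handle_endtag(self, tag):
        if tag == "small":
            self._in_author = False

    def handle_data(self, data):
        if self._in_author and data.strip():
            self.authors.append(data.strip())


parser = AuthorParser()
parser.feed(SAMPLE_HTML)

# Write to an in-memory buffer here; for a real file, use
# open("authors.csv", "w", newline="") in place of io.StringIO().
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["author"])
writer.writerows([name] for name in parser.authors)
csv_text = buffer.getvalue()
```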

  • Beautiful Soup 4 (BS4): Attributes of Tags in BS4 9:10
  • Beautiful Soup 4 (BS4): Multi Valued Attributes of Tags in BS4 3:55
  • Beautiful Soup 4 (BS4): Scraping Movie Names from IMDB 19:31
  • Beautiful Soup 4 (BS4): Quiz (Getting the Ratings, Year, Name of the Movie) 0:55
  • Beautiful Soup 4 (BS4): Solution 01 (Getting the Ratings, Year, Name of the Movie) 7:00

    Fetch movie name, year, and IMDb rating by scraping HTML with requests and BeautifulSoup, then parse the table body and rows to extract data safely.

  • Beautiful Soup 4 (BS4): Solution 02 (Getting the Ratings, Year, Name of the Movie) 7:08
  • Beautiful Soup 4 (BS4): Scraping Time, Genre and Releasing Date from IMDB 01 6:56
  • Beautiful Soup 4 (BS4): Scraping Time, Genre and Releasing Date from IMDB 02 17:21
  • Beautiful Soup 4 (BS4): Combining Two Requests Data for IMDB 6:50
  • Beautiful Soup 4 (BS4): Movies Recommender System (Creating Movie Url) 6:26
  • Beautiful Soup 4 (BS4): Movies Recommender System (Creating Director Url) 6:10
  • Beautiful Soup 4 (BS4): Movies Recommender System using BS4 (Getting Top 4 Movies) 8:55
  • Beautiful Soup 4 (BS4): Movies Recommender System using BS4 (Merge All Requests Together) 4:02
  • CSS Selectors: Introduction to CSS Selectors 2:49

    Explore the basics of CSS selectors, how they target specific DOM elements, and how to inspect, highlight, and extract text or attributes from targeted regions.

  • CSS Selectors: CSS Selectors Hands-on (Tags) 5:17
  • CSS Selectors: Quiz (Tags) 1:08

    Explore CSS selectors through a quiz that focuses on extracting specific tags like span and paragraphs from a sample page. Apply real-world tag patterns to precise selectors.

  • CSS Selectors: Solution (Tags) 2:15
  • CSS Selectors: CSS Selectors Hands-on (Descendants, Id, Class) 7:04
  • CSS Selectors: Quiz (Descendants) 0:49

    Practice writing a CSS selector to target the two nested span elements inside a div, using descendant selectors, in this quick quiz.

  • CSS Selectors: Solution (Descendants) 1:50
  • CSS Selectors: Quiz (ID) 0:44
  • CSS Selectors: Solution (ID) 1:46
  • CSS Selectors: Solution (Class) 1:00
  • CSS Selectors: Solution (Class) 3:16
  • CSS Selectors: CSS Selectors Hands-on (Nested Tags, ID Tags, Class Tags) 4:32
  • CSS Selectors: Quiz (Class with Tag) 0:40
  • CSS Selectors: Solution (Class with Tag) 2:26

    Explore CSS selectors to target a specific element by combining its tag name with its class. Use tag and class combinations to limit selections to the desired element.

  • CSS Selectors: CSS Selectors Hands-on (Comma Separator, Universal Selectors) 6:31
  • CSS Selectors: Quiz (Combining Two Selectors) 0:46
  • CSS Selectors: Solution (Combining Two Selectors) 2:48
  • CSS Selectors: CSS Selectors Hands-on (Sibling Notations and Direct Child) 7:24

    Learn adjacent and general sibling selectors in CSS, using plus and tilde, and master direct child selectors for precise, immediate element targeting.

  • CSS Selectors: Quiz (Adjacent Sibling) 0:45
  • CSS Selectors: Solution (Adjacent Sibling) 2:38

    Explore how to correctly apply the adjacent sibling selector by first identifying a unique element, then selecting its adjacent sibling to avoid unintended matches in CSS.

  • CSS Selectors: Quiz (General Sibling) 0:57
  • CSS Selectors: Solution (General Sibling) 2:59
  • CSS Selectors: CSS Selectors Hands-on (Child Selectors) 7:19
  • CSS Selectors: Quiz (First Child) 0:40

    Practice writing a CSS selector to target only a specific div in a nested structure, focusing on the first-child concept, and review the solution in the next video.

  • CSS Selectors: Solution (First Child) 3:49

    Master CSS selectors for reliably selecting the first child, explain why first-child can select multiple elements, and show how to use an id-based path to target a specific first child.

  • CSS Selectors: Quiz (Only Child) 0:40
  • CSS Selectors: Solution (Only Child) 2:58

    Master CSS selectors by applying first-child and only-child approaches to locate a specific element in a DOM, as demonstrated with practical browser inspection.

  • CSS Selectors: Quiz (Last Child) 0:44
  • CSS Selectors: Solution (Last Child) 3:10
  • CSS Selectors: CSS Selectors Hands-on (Negations, Attributes) 6:36
  • CSS Selectors: Quiz (Negation) 0:41

    Explore negation in CSS selectors by identifying a selector that targets all child divs except the first one inside a container; practice with a quick quiz.

  • CSS Selectors: Solution (Negation) 2:06
  • CSS Selectors: CSS Selectors Hands-on (Attributes, Attributes Values) 3:51
  • CSS Selectors: Quiz (Attributes Values) 0:39
  • CSS Selectors: Solution (Attributes Values) 3:26

    Explore how to use CSS selectors to filter elements by attribute values, focusing on random attributes and narrowing with span to select specific elements.

  • CSS Selectors: CSS Selectors Hands-on (Attributes Wild Cards Values) 6:25

    Discover how to use CSS selectors to match attribute values with starts with, ends with, contains, and wildcards, including case sensitivity, for precise element selection.

  • CSS Selectors: Quiz (Attributes Wild Card) 0:50
  • CSS Selectors: Solution (Attributes Wild Card) 2:49
  • Scrapy: Introduction to Scrapy 4:10

    Explore Scrapy as a fast, powerful Python framework for crawling websites, extracting structured data, and enabling asynchronous data pipelines with easy extensibility and cross-platform support.

  • Scrapy: Comparison of Scrapy and Requests 3:40
  • Scrapy: Scrapy at a Glance Documentation 8:31

    Learn how to use the Scrapy framework to crawl websites, extract structured data, and build spiders with Python, requests, callbacks, and CSS selectors.

  • Scrapy: Getting Started with Scrapy 11:04
  • Scrapy: Running Documentation Spider 1 3:25
  • Scrapy: Running Documentation Spider 2 12:00
  • Scrapy: Writing a Spider from Scratch 7:23

    Create a new Scrapy project from scratch, organize it in a dedicated folder, and define a class inheriting from Scrapy's Spider class to use start URLs and handle responses.

  • Scrapy: Understanding the Response (url, Status) 7:09
  • Scrapy: Understanding the Response (headers) 4:12
  • Scrapy: Understanding the Response (values in headers) 6:51
  • Scrapy: Understanding the Response (body) 6:04
  • Scrapy: Understanding the Response (request) 4:41
  • Scrapy: Understanding the Response (meta) 8:29

    Learn how Scrapy uses the response meta to transfer data between callbacks, by passing a dictionary through requests across redirects to combine extracted information.

  • Scrapy: Understanding the Response (flags, certificate, ip_address, copy) 5:16

    Learn how Scrapy exposes flags, certificate information, server IP address, and the ability to copy a response for testing, logging, and debugging, with emphasis on response status such as 200.

  • Scrapy: Understanding the Response (replace, urljoin, follow, follow_all) 8:07

    Learn how to manipulate Scrapy responses with replace and urljoin, and use response.follow and response.follow_all to follow links, handle relative URLs, and chain requests with callbacks.
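
Relative URL handling is the part of this lecture that is easiest to try outside a spider. Scrapy documents `response.urljoin(url)` as a wrapper over the standard library's `urllib.parse.urljoin` with the response URL as the base, so the resolution rules can be sketched without Scrapy installed. The page URL and hrefs below are hypothetical.

```python
from urllib.parse import urljoin

# Hypothetical page URL and hrefs, as they might appear in <a> tags.
page_url = "https://example.com/catalogue/page-1.html"
hrefs = [
    "page-2.html",                   # relative to the current directory
    "/authors/",                     # relative to the site root
    "../about.html",                 # one directory up
    "https://example.org/external",  # already absolute, left unchanged
]

# Resolve every href against the URL of the page it was found on.
absolute_urls = [urljoin(page_url, href) for href in hrefs]

for href, absolute in zip(hrefs, absolute_urls):
    print(f"{href!r:35} -> {absolute}")
```

Inside a spider, `response.follow` accepts relative URLs directly, so an explicit join is mainly needed when building a plain `scrapy.Request`.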

  • Scrapy: Response CSS and Scrapy Shell 9:26
  • Scrapy: Extracting quotes 5:47
  • Scrapy: Understanding Nested selectors 10:02
  • Scrapy: Extracting the Author and Quotes 10:05
  • Scrapy: Checking for Next Page 7:36
  • Scrapy: Checking for Next Page in Spider 5:36
  • Scrapy: Checking for Next Page URL 8:16
  • Scrapy: Scraping Quotes from Next Pages 11:07
  • Scrapy: Exporting Extracted Data 3:26

    Learn how to export scraped data to a CSV file with scrapy crawl, specifying the output file, ensuring the spider name matches the file, and cleaning the file before export.

  • Scrapy: Quiz (Get The Tags) 0:58

    Write a Scrapy spider to extract the quote, author, and associated tags from a page, then output the author and comma-separated tag values.

  • Scrapy: Solution (Get The Tags) 7:30
  • Scrapy: Next Website 1:57
  • Scrapy: CSS Selectors for Movie Names and URLs 12:09

    Learn to build a Scrapy project, create a spider, and use CSS selectors to extract movie names and URLs from IMDb pages, including anchor text and href attributes.

  • Scrapy: Combined CSS Selectors for Movie Names and URLs 3:22
  • Scrapy: Sending Request to the Film Info Page 4:35
  • Scrapy: Merge Data from Two Callbacks 8:44
  • Scrapy: Extracting Movie Duration and Genres 6:59
  • Scrapy: Exporting the Extracted Data 5:34

    Export scraped IMDb data with Scrapy by building and yielding dictionaries of movie name, duration, and genres, and save output with -o while tuning concurrency for parallel requests.

  • Scrapy: Quiz (Extracting the Year) 1:18
  • Scrapy: Solution (Extracting the Year) 10:08

    Learn to build a Scrapy spider that scrapes IMDb to extract movie names and release dates, using anchor tags and CSS selectors to navigate pages and export data.

  • Scrapy: Getting Director Name and Url 8:21
  • Scrapy: Getting Top Four Movies of Directors 9:28
  • Scrapy: Extracting Data Anomaly (dont_filter Flag) 9:50
  • Scrapy Project: Hugo Boss website for scraping 2:30
  • Scrapy Project: Understanding Site Structure 7:11
  • Scrapy Project: Writing CSS Selectors for Listings 7:43

    Learn to craft a CSS selector to extract listings from a website, selecting the relevant anchor tags and handling mobile and desktop variants with unique classes.

  • Scrapy Project: Listings in Scrapy Shell 4:20
  • Scrapy Project: Sending Request to Listings Urls 7:23

    Master sending requests to listing URLs with Scrapy by switching from response.follow to Scrapy's Request, iterating over category pages, and printing product listings for each category.

  • Scrapy Project: Extracting Products Url from the Listings 11:02
  • Scrapy Project: Sending Requests to Products of the Listings 5:02
  • Scrapy Project: Writing CSS for getting the Product Info 16:55
  • Scrapy Project: Getting the bigger Images of the Product 7:54

    Learn to fetch bigger product images in a Scrapy project by swapping URL parameters and using Python to split on the question mark and assemble bigger image URLs.
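
The split-and-reassemble idea can be sketched in a few lines of plain Python. The thumbnail URL and the `wid`/`hei` size parameters below are assumptions for illustration; the real site's parameter names are not given on this page.

```python
# Hypothetical thumbnail URL; the "wid" and "hei" query parameters are
# assumptions for illustration, not the real site's parameter names.
THUMBNAIL_URL = "https://images.example.com/products/12345.jpg?wid=100&hei=100"


def bigger_image_url(url: str, width: int = 1200, height: int = 1200) -> str:
    """Keep everything before the '?' and attach larger size parameters."""
    base = url.split("?")[0]  # drop the original query string
    return f"{base}?wid={width}&hei={height}"


big_url = bigger_image_url(THUMBNAIL_URL)
print(big_url)
```

If the query string carried parameters worth preserving, `urllib.parse.urlsplit` and `parse_qs` would be a more careful alternative to a plain split.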

  • Scrapy Project: Checking Next Page Url 13:57
  • Scrapy Project: Adding Pagination to Spider and Running it 9:40

    Master Scrapy pagination by teaching a spider to detect next page buttons, issue requests to subsequent pages, and reuse the same callback to extract products across categories.

  • Scrapy Project: Output of the Spider 4:36
  • Selenium: Introduction To Selenium 2:12
  • Selenium: Getting Started with Selenium 3:36
  • Selenium: Configuring the Webdriver 3:40
  • Selenium: Extracting Quotes 10:16
  • Selenium: Extracting Quotes and Author Names 7:17
  • Selenium: Quiz (Extracting Quotes) 0:41
  • Selenium: Solution (Extracting Quotes) 7:22
  • Selenium: Clicking on Button 5:01
  • Selenium: Pagination and Extracting Data 8:06
  • Selenium: Exception Handling for Unavailable Element 5:41

    Learn how to use Selenium to handle unavailable elements with try-except blocks during pagination, preventing script termination while extracting quotes and authors from successive pages.

  • Selenium: Navigating the Website for Login 9:37
  • Selenium: Quiz (Log in and Extract Quote) 0:43
  • Selenium: Solution (Log in and Extract Quote) 7:03
  • Project Selenium: Overview of Project 1:28
  • Project Selenium: Closing the Cookie Button 3:26
  • Project Selenium: Setting the Language for Translation 5:50
  • Project Selenium: Sending the Text for Translation 3:46
  • Project Selenium: Downloading the Translation 3:55

    Automate a translation workflow with Selenium by entering text, waiting for translation, and triggering a file download using element selectors and a deliberate delay.

  • Project Selenium: Reading Data from File for Translation 3:44

    Read text from a local file and automate a Selenium-based translation workflow, sending text to a website, waiting for translation, and downloading the translation to the local machine.

  • Project Selenium: THANK YOU Bonus Video 1:20
  • Link for the Course's Materials and Code 0:09

Requirements

  • Basic understanding of HTML tags, Python, SQL, and Node.js
  • No prior knowledge of data scraping and Scala is needed. You start right from the basics and then gradually build your knowledge of the subject.
  • Basic understanding of programming.
  • A willingness to learn and practice.
  • Since we teach through practical implementations, practice is a must.

Description

Welcome to the comprehensive Big Data and Data Science bundle, where you'll embark on an educational journey covering a wide range of essential skills and technologies. This course equips you with expertise in Scala, PySpark, AWS, Data Scraping, Data Mining, and MongoDB. Whether you're an absolute beginner or possess some programming knowledge, this course provides in-depth coverage of these critical topics.


I. Scala:

Scala may not be the most popular coding language, but it's undeniably one of the most sought-after skills for data scientists and data engineers. This course is meticulously designed to make Scala simple to grasp and implement. You'll engage with quizzes and mini-projects to reinforce your learning, making your Scala experience seamless. 


Key Highlights:

  • High Demand Skill: Scala is in high demand in the industry, and this course ensures you acquire essential skills

  • Practical Learning: Quizzes and mini-projects serve as building blocks for a comprehensive understanding of Scala

  • Hands-on Experience: Gain practical experience by working on a Scala Spark project

  • Versatility: Scala is a powerful language suitable for a wide range of applications, from web development to machine learning


Learning Materials:

  • Comprehensive Scala tutorials

  • Scala quizzes and assessments

  • Hands-on Scala Spark project

  • Scala code examples and exercises


II. PySpark and AWS:

Python and Apache Spark are at the forefront of Big Data analytics, and PySpark bridges the gap between them. In this section, you'll start with the basics and progress to advanced data analysis. You'll work with PySpark for data analysis, explore Spark RDDs, DataFrames, and Spark SQL queries, and delve into the Spark and Hadoop ecosystems. Additionally, you'll discover how to leverage AWS cloud services with Spark.


Key Highlights:

  • Python and Spark Integration: Master the art of using Python and Spark together for effective Big Data analysis

  • Comprehensive Coverage: Explore Spark RDDs, DataFrames, and Spark SQL queries, and integrate them with AWS

  • Hands-on Practice: Apply your knowledge through practical exercises and projects


Learning Materials:

  • In-depth PySpark and AWS tutorials

  • PySpark quizzes and assessments

  • AWS integration guides and examples

  • PySpark code samples and hands-on projects


III. Data Scraping and Data Mining:

Data scraping involves extracting data from websites and APIs, making it a valuable skill for data professionals. This section is tailored for beginners, starting with foundational concepts and gradually delving into advanced techniques through practical implementations. Hands-on projects are a pivotal part of this segment, allowing you to learn through experimentation and real-world applications. 


Key Highlights:

  • Beginner-Friendly: Perfect for individuals new to data scraping and mining

  • Practical Implementation: Gain deep insights through hands-on projects and real-world examples

  • Lucrative Career: Data scraping offers rewarding career prospects and competitive salaries


Learning Materials:

  • Comprehensive Data Scraping and Mining tutorials

  • Hands-on data extraction projects

  • Data scraping and mining quizzes and assessments

  • Data scraping code samples and automation scripts


IV. MongoDB:

This section introduces you to MongoDB, a popular NoSQL database. You'll learn the fundamentals of MongoDB, including Create, Read, Update, and Delete operations. Dive deep into MongoDB query and project operators, enhancing your understanding of NoSQL databases. Two comprehensive projects will provide you with practical experience using MongoDB in Django and implementing an ETL (Extract, Transform, Load) pipeline with PySpark. 


Key Highlights:

  • NoSQL Proficiency: Develop expertise in MongoDB, a highly sought-after NoSQL database

  • Hands-on Projects: Apply your knowledge to real-world scenarios and gain practical skills

  • Versatile Skills: MongoDB is invaluable for data management and analytics


Learning Materials:

  • MongoDB fundamentals and advanced tutorials

  • Hands-on MongoDB projects, including Django integration and ETL pipeline development

  • MongoDB quizzes and assessments

  • MongoDB code examples and best practices



Course Benefits:

Upon successfully completing this course, you will be able to implement projects from scratch that require expertise in Data Scraping, Data Mining, Scala, PySpark, AWS, and MongoDB. You'll be adept at connecting theoretical concepts to real-world problem-solving and at efficiently extracting data from websites, and you'll be well prepared for various data-related roles.


Learning Materials:

  • Video lectures and tutorials.

  • Quizzes, assessments, and solutions.

  • Hands-on projects with step-by-step guidance.

  • Code examples and templates.

  • Reference materials and best practices. 



Enroll now to embark on your journey toward mastering Big Data and Data Science comprehensively!


Who Should Enroll:

  • Ideal for beginners or those looking to apply theoretical knowledge in practical scenarios

  • Aspiring data scientists and machine learning experts

  • Individuals aiming to excel in the realm of Big Data and Data Science


What You'll Learn:

  • Proficiency in implementing projects requiring expertise in Data Scraping, Data Mining, Scala, PySpark, AWS, and MongoDB

  • Efficient data extraction from websites

  • Skills applicable to various data-related roles


Why This Course:

  • High demand for Scala skills in the industry

  • Comprehensive coverage of PySpark, AWS, Data Scraping, Data Mining, and MongoDB

  • Hands-on experience through projects and practical exercises

  • Versatile skills for a wide range of applications



List of Keywords:

  • Big Data

  • Data Science

  • Scala

  • PySpark

  • AWS

  • Data Scraping

  • Data Mining

  • MongoDB

  • NoSQL Database

  • Data Extraction

  • Data Analysis


Who this course is for:

  • People who are absolute beginners.
  • People who want to make smart solutions.
  • People who want to learn with real data.
  • People who love to learn theory and then implement it practically.
  • Data Scientists, Machine learning experts and Drop Shippers.