PySpark Foundation for Data Analysis | Beginners
Rating: 4.1 out of 5(45 ratings)
907 students

Data Engineering, PySpark, Data Analysis, Coding exercise, Data Analytics
Last updated 1/2025
English

What you'll learn

  • Fundamentals of PySpark
  • Hands on experience in PySpark
  • Understanding of data using PySpark
  • Performing various data analysis operations
  • Data Analytics
  • Analysis of data

Course content

1 section • 28 lectures • 56m total length
  • Introduction to Course (0:45)

    Explore data analysis fundamentals in the second series of the foundation course, focusing on data analytics workflows, core transformations, and common operations like grouping, joins, and filters.

  • What is Data Analysis (0:43)
  • Data analysis in Elections (0:40)
  • Data Analysis in Cricket (1:37)

    Data analysis drives decision making in cricket by providing data-driven insights, including signals sent from the dressing room. England captain Eoin Morgan and Virat Kohli are cited as examples of data-driven strategy.

  • Learning Outcomes (1:05)
  • Insights from data (2:20)
  • Upload the data (1:01)
  • Read the data (0:40)
  • Understanding the data (6:17)
  • Cleaning the data (1:36)

    Clean the data by filtering a data frame to remove unwanted values using not equal conditions, and remove discontinuities to produce a clean dataset.

  • Understanding data part 2 (1:38)
  • Aggregation Assignment 0 (4:05)
  • Aggregation Assignment 1 (2:54)
  • Aggregation Assignment 2 (0:50)
  • Validation of results (0:50)
  • Aggregation Assignment 3 (2:27)
  • Aggregation Assignment 4 (1:29)
  • Aggregation Assignment 5 (3:14)

    Use aggregation and grouping by batsman to compare runs when chasing versus batting first. Identify scorers like Kohli and de Villiers and consolidate inning results into a single data frame.

  • Buckets creation (2:42)
  • Assignment 6 (1:19)
  • Assignment 7 (3:13)
  • Assignment 8 (1:57)
  • Assignment 9 (1:18)
  • Understanding Bowlers data (3:12)
  • Assignment 10 (2:44)
  • Assignment 11 (1:50)
  • Assignment 12 (1:27)
  • Recap and Summary (2:53)

Requirements

  • There are no pre-requisites for the course. We will learn and practice together.
  • Basic Python knowledge is a plus
  • It helps to have watched the first part of this course

Description

Have you ever wondered how big data is helping teams win big at the T20 World Cup and the IPL?

In this course we will focus on basic data analysis, using PySpark to draw useful insights from an IPL dataset.

Learn to code PySpark like a real-world developer. Our main focus is on practical applications of PySpark, bridging the gap between academic knowledge and practical skill.


About PySpark:

Learn the latest Big Data Technology - Spark! And learn to use it with one of the most popular programming languages, Python!

One of the most valuable technology skills is the ability to analyze huge data sets, and this course is specifically designed to bring you up to speed on one of the best technologies for this task, Apache Spark! The top technology companies like Google, Facebook, Netflix, Airbnb, Amazon, NASA, and more are all using Spark to solve their big data problems!

Spark can run some in-memory workloads up to 100x faster than Hadoop MapReduce, which has caused an explosion in demand for this skill! Because the Spark 2.0 DataFrame framework is so new, you now have the ability to quickly become one of the most knowledgeable people in the job market!


What you will learn :

  • What is Data Analysis

  • Data analysis in Elections

  • Data Analysis in Cricket

  • Big Data Cleaning

  • Calculating Averages

  • Manipulating Data

  • GROUPBY

  • Aggregations

  • Sorting

  • Joins in PySpark

Prerequisites :

  • Some basic programming skills (Not Mandatory)

  • Willingness to put theoretical knowledge into practice.


Who this course is for:

  • Beginners who want to learn Big Data or experienced people who want to transition to a Big Data role

  • Big data beginners who want to learn how to code in the real world

  • Aspiring candidates for data analytics or data engineering role

  • Anyone with an interest in Data engineering and data analysis