Apache Spark Project World Development Indicators Analytics

Rating: 3.9 (140 reviews)

A World Development Indicators analytics project in Apache Spark for beginners, using Apache Zeppelin and Databricks

Created by Bigdata Engineer

Last updated 2/2026

English

What you'll learn

Understand and explore the World Development Indicators dataset from the World Bank.
Set up and configure a free Databricks account and Spark cluster for analytics.
Work with Spark DataFrames to load, transform, and analyze real-world datasets.
Apply Spark SQL and DataFrame operations to generate development insights.
Analyze GINI Index, GDP per capita, literacy rates, life expectancy, poverty rates, infant mortality, and trade statistics across countries.
Compare development trends between rich vs. poor countries over decades.
Perform country-level and global analytics using Spark.
Visualize global development metrics and publish Spark notebooks for sharing insights.
Build confidence in handling real-world Spark projects with practical datasets.

Course content

10 sections • 89 lectures • 5h 31m total length

Welcome to the Course • 4:07
What are World Development Indicators (WDI)? • 3:04
What You Will Learn • 2:20
Tools We’ll Use: Apache Spark, Spark SQL, Apache Zeppelin • 3:06

WDI Dataset Overview – Countries, Regions, and Indicators • 3:24
Explore the World Development Indicators dataset: country data (region and income group) in country.csv and indicator data in indicator.csv, which together support Spark analytics and time-series dashboards.
Types of Indicators in WDI (Economic, Social, Demographic, Health, Environment) • 3:45
Explore the WDI indicator categories (economic, social, demographic, health, and environmental) and use the more than 1,400 indicators to build Spark visualizations and comparisons.
Understanding the WDI Dataset Structure (Attributes, Years, Metadata) • 3:04
Understand the WDI data structure, including attributes such as country name, region, ISO codes, indicator names and codes, year, and value. Metadata adds context for time-series analysis in Spark.
Data Sources of WDI – World Bank & Other International Organizations • 2:54
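
As a rough illustration of the dataset structure described above, one indicator observation can be modeled as a small typed record. This is a minimal sketch in plain Python, not course material: the column names (CountryName, CountryCode, IndicatorName, IndicatorCode, Year, Value) are assumptions about the indicator file's header, and the sample row is a placeholder rather than a real WDI figure.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class IndicatorRow:
    """One observation: a country, an indicator, a year, and a value."""
    country_name: str
    country_code: str      # ISO-style country code
    indicator_name: str
    indicator_code: str
    year: int
    value: float


def parse_indicator_rows(text):
    """Parse CSV text with a header row into IndicatorRow records."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        IndicatorRow(
            country_name=r["CountryName"],
            country_code=r["CountryCode"],
            indicator_name=r["IndicatorName"],
            indicator_code=r["IndicatorCode"],
            year=int(r["Year"]),
            value=float(r["Value"]),
        )
        for r in reader
    ]


# Placeholder row for illustration only; the value is not a real WDI figure.
SAMPLE = (
    "CountryName,CountryCode,IndicatorName,IndicatorCode,Year,Value\n"
    "Exampleland,EXL,Example indicator,EX.IND.CODE,1990,12.5\n"
)
rows = parse_indicator_rows(SAMPLE)
```

Each record carries one (country, indicator, year) combination, which is why the later analyses filter by indicator code and year before comparing countries.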

Requirements • 0:05
(Hands On) Installing JAVA • 4:55
Steps for Installing JAVA • 0:26
(Hands On) Setting JAVA environments • 2:08
Set up the Java environment by editing /etc/profile to configure the Java home, Java path, and JRE settings, then verify with echo $JAVA_HOME.
Steps for Setting JAVA environments • 0:25
(Hands On) Apache Zeppelin Installation Steps on Ubuntu machine • 5:01
Install and run Apache Zeppelin on Ubuntu by downloading Zeppelin 0.12.0, extracting the tarball, starting the daemon, and accessing the notebook interface on localhost:8080.
Steps for Installing Apache Zeppelin on Ubuntu machine • 0:20
(Hands On) Installing Docker Desktop on Windows 10/11 • 1:54
Steps for Installing Docker on Windows • 0:07
(Hands On) Running Apache Zeppelin on Docker (Windows) • 3:58
Steps for Running Apache Zeppelin on Docker • 1:23
(Hands On) Configure and Connect to Spark interpreter • 10:46
Steps for Configure and Connect to Spark Interpreter • 1:04

What is Apache Zeppelin • 4:08
Features & Benefits • 7:18
Explore Apache Zeppelin's features and benefits for big data analytics, including multi-language support, real-time execution, and native integration with Spark, Hive, HDFS, and Hadoop for collaboration and sharing across teams.
Notebook UI Overview • 11:21
Markdown and text formatting • 10:34
Creating and Running Paragraphs • 5:32
Hands on Creating and Running paragraphs • 12:19
Visualization Options (Tables, Bar chart, Pie chart, etc.) • 4:27
Explore how Apache Zeppelin converts data into visual insights with tables and charts such as bar, line, pie, and scatter plots, and learn to label axes for dashboards.
Hands On - Types of Default Chart in Zeppelin • 4:12
Explore the five default charts in Zeppelin by creating bar, pie, area, line, and scatter charts from an employee DataFrame, highlighting interactive visuals for business decisions.

Spark interpreter details • 7:06
Working with RDDs and DataFrames • 7:17
Spark SQL queries and caching • 9:09
Run Spark SQL queries on DataFrames within Apache Zeppelin and visualize results with built-in charts. Cache DataFrames to speed up repeated queries and enable interactive analytics.
Visualizing Spark outputs • 9:00
Visualize Spark outputs in Apache Zeppelin by turning Spark DataFrames into interactive tables and charts using temp views and the %sql interpreter, with bar, line, and pie charts.
Job tracking and performance tuning basics • 11:00
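
The caching and temp-view workflow mentioned in this section can be sketched roughly as follows. This is an illustrative PySpark-style helper, not code from the course: the function name `query_cached_view` is hypothetical, and it assumes an existing SparkSession (Zeppelin's Spark interpreter typically exposes one as `spark`) and an already loaded DataFrame.

```python
def query_cached_view(spark, df, view_name, sql_text):
    """Cache a DataFrame, expose it to Spark SQL by name, and run a query.

    spark     -- an existing SparkSession (assumed to be provided by the notebook)
    df        -- a Spark DataFrame that has already been loaded
    view_name -- name under which SQL can refer to df
    sql_text  -- a Spark SQL statement that selects from view_name
    """
    # Mark df for caching; Spark materializes the cache on the first action.
    df.cache()
    # Register df as a session-scoped temporary view so SQL can query it.
    df.createOrReplaceTempView(view_name)
    # Run the query; the result is itself a DataFrame.
    return spark.sql(sql_text)
```

Once a view is registered this way, a Zeppelin paragraph that starts with %sql can select from the same view name and render the result with the built-in table and chart options.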

Loading CSV Data into Spark DataFrames (Country Dataset) • 10:16
Load CSV data into Spark DataFrames for the country and indicator datasets of the World Development Indicators project, with the header and inferSchema options set to true, and show the first 20 rows.
Creating Temporary Views for Spark SQL Queries • 3:58
Analyzing Income Inequality with Gini Index in WDI • 4:46
Analyzing Youth Literacy Rates with Spark SQL • 5:32
Analyze the youth literacy rate using Spark SQL on the World Development Indicators dataset, ranking countries by literacy in 1990 and 2010, and visualizing results for policy insights.
Comparing Trade (% of GDP) for India and China • 4:32
Analyzing Exports of Goods and Services for India and China • 4:11
Analyzing Imports of Goods and Services for India and China • 5:00
Analyzing GDP per Capita (PPP) for India and China • 4:00
Analyzing Poverty Alleviation in India and China • 3:52
Analyzing Life Expectancy at Birth in India and China • 4:17
Analyzing Urban Population Growth in India and China • 3:41
Analyze urban population growth in India and China using Spark SQL and the World Development Indicators dataset to compare urbanization trends and their impact on infrastructure, housing, and SDG 11.
Analyzing Infant Mortality as a Measure of Healthcare in India and China • 4:11
The 10 Poorest Countries in 1962 vs 2014 • 5:42
The 10 Richest Countries in 1962 vs 2014 • 4:58
Analyzing Average Income in Rich Countries (1960–2014) • 4:28
Average Income in Poor Countries (1960–2014) • 4:16
Analyze income trends in four poor countries from 1960 to 2014 using Spark SQL and the World Development Indicators dataset, measuring gross national income per capita in US dollars.
Comparing Average Incomes in 1962 (Case Study: 4 Countries) • 3:59
Comparing Average Incomes in 2014 (Case Study: 4 Countries) • 4:07
Compare 2014 average income across Malawi, China, Luxembourg, and the United States using GNP per capita, ranked by income with Spark SQL, highlighting global inequality and China's rise.
Life Expectancy in France (1960–2013) • 3:57
Birth Rates in G7 Countries (1960–2013) • 5:04
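
The loading and ranking steps in this section might look roughly like the sketch below. It is an illustration under assumptions, not the course's notebook code: the helper names are hypothetical, the column names (CountryName, IndicatorCode, Year, Value) are guesses about the indicator file's header, and the indicator code has to be looked up in the dataset itself.

```python
def load_csv(spark, path):
    """Read a CSV file that has a header row, letting Spark infer column types."""
    return (
        spark.read
        .option("header", True)        # first line holds the column names
        .option("inferSchema", True)   # sample the data to pick column types
        .csv(path)
    )


def ranking_query(view_name, indicator_code, year, limit=10):
    """Build a Spark SQL query that ranks countries by one indicator in one year.

    The values are interpolated directly into the SQL text, so this is only
    suitable for trusted values typed into a notebook.
    Column names are assumptions about the dataset's header.
    """
    return (
        f"SELECT CountryName, Value FROM {view_name} "
        f"WHERE IndicatorCode = '{indicator_code}' AND Year = {int(year)} "
        f"ORDER BY Value DESC LIMIT {int(limit)}"
    )


# Typical notebook usage (requires a running SparkSession named `spark`):
#   indicators = load_csv(spark, "indicator.csv")
#   indicators.show()                                  # prints the first 20 rows
#   indicators.createOrReplaceTempView("indicators")
#   spark.sql(ranking_query("indicators", "<indicator code>", 1990)).show()
```

Running the same query for 1990 and for 2010 gives the two rankings that the youth literacy lecture compares.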

File level details • 0:53
(Old) Free Account creation in Databricks • 1:51
Learn how to create a free Databricks account by navigating the sign-up flow, entering a work email, confirming your registration, and signing in to access the platform.
(New) Free Account creation in Databricks • 1:50
Tips to Improve Your Course Taking Experience • 1:35
Importing Databricks Notebook • 2:03
Overview and Project Objective • 1:51
Explore the World Development Indicators analytics project using Apache Spark to load World Bank data, analyze global indicators, and visualize the results in Spark.
File Content Explanation • 2:32
Launch Spark Cluster • 2:14
Spark Notebook Basics • 6:30
Explore Spark notebook basics by connecting to a cluster, executing code on the cluster, and using notebook cells to insert, edit, and document workflows with magic commands.
Loading data into Spark Dataframe • 8:50
Learn to load data into a Spark DataFrame for World Development Indicators analytics by enabling header and inferred schema, uploading files, and previewing schema and data for analysis.
GINI Index • 6:14
Youth Literacy Rate • 4:38
Learn to extract and plot the youth literacy rate from the World Development Indicators by joining the country and indicators tables, filtering by indicator code, and querying for 1990 and 2010.
Trade as a percentage of GDP for China and India • 2:43
Exports of goods and services • 3:01
Import of goods and services • 1:36
GDP per capita • 1:52
Poverty Alleviation • 1:04
Life Expectancy at birth, total (years) • 1:38
Urban Population growth • 1:32
Infant Mortality • 1:15
The 10 Countries with Lowest Average Income in 1962/2014 • 2:08
Examine the ten countries with the lowest average income in 1962 and 2014, comparing income values across years using World Development Indicators data.
The 10 Countries with Highest Average Income • 1:10
Average Income from 1960-2014 in Rich Countries • 1:58
Average Income from 1960-2014 in Poor Countries • 1:32
Average Income in 1962 • 1:10
Average Income in 2014 • 1:20
Life Expectancy in France 1960-2013 • 1:08
Explore life expectancy in France from 1960 to 2000 using a line chart to visualize indicators and plot the x values for the selected country data.
G-7 Country Birth Rates 1960-2013 • 1:14
World Per Capita Income in 2013 • 1:14
Explore how to view world per capita income in 2013 by selecting indicators and a country, then display the data on a world map with the appropriate plotting options.
Publish Notebook to the Web • 1:16
Bonus Lecture • 1:05
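
The join-and-filter pattern described for the youth literacy lecture can be tried without a cluster. The sketch below uses Python's built-in sqlite3 module as a stand-in for Spark SQL, so it shows only the shape of the query, not the course's notebook. Table names, column names, and the indicator code LIT.YOUTH are hypothetical, and every row is a placeholder rather than a real WDI figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (CountryCode TEXT, ShortName TEXT, Region TEXT);
CREATE TABLE indicators (CountryCode TEXT, IndicatorCode TEXT, Year INTEGER, Value REAL);
""")

# Placeholder rows for illustration only; these are not real WDI values.
conn.executemany("INSERT INTO country VALUES (?, ?, ?)", [
    ("AAA", "Country A", "Region 1"),
    ("BBB", "Country B", "Region 2"),
])
conn.executemany("INSERT INTO indicators VALUES (?, ?, ?, ?)", [
    ("AAA", "LIT.YOUTH", 1990, 60.0),
    ("AAA", "LIT.YOUTH", 2010, 80.0),
    ("BBB", "LIT.YOUTH", 1990, 90.0),
    ("BBB", "LIT.YOUTH", 2010, 95.0),
    ("AAA", "OTHER.IND", 1990, 1.0),   # different indicator, filtered out
    ("BBB", "LIT.YOUTH", 2000, 93.0),  # different year, filtered out
])

# Join indicator values to country names, keep one indicator and two years.
QUERY = """
SELECT c.ShortName, i.Year, i.Value
FROM indicators i
JOIN country c ON c.CountryCode = i.CountryCode
WHERE i.IndicatorCode = ? AND i.Year IN (1990, 2010)
ORDER BY i.Year, i.Value DESC
"""
result = conn.execute(QUERY, ("LIT.YOUTH",)).fetchall()
```

In a Spark notebook the same SELECT would run against the registered country and indicators views, with the indicator code written as a literal in place of the ? placeholder.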

Requirements

No prior Spark experience required — this is a beginner-friendly project.
Basic understanding of Python or SQL will be helpful (but not mandatory).
Access to a computer with internet connection.
A free Databricks account (setup is covered step by step in the course).
Curiosity to learn how data engineering and analytics can provide insights into global development.

Description

Apache Spark Project: World Development Indicators Analytics

Are you ready to take your Apache Spark and Big Data skills to the next level by working on a real-world analytics project?

In this hands-on course, we’ll use Apache Spark, Spark SQL, and Apache Zeppelin to analyze one of the most important and widely used datasets in the world — the World Bank’s World Development Indicators (WDI). Covering over 200 countries, 50+ years of data, and hundreds of economic, social, demographic, health, and environmental indicators, this project is the perfect way to apply your Spark skills to real-world problems.

You’ll learn step by step how to:

Set up Spark and Zeppelin on your system (Windows, Ubuntu, or Docker)
Load and explore massive datasets with Spark DataFrames
Write Spark SQL queries to analyze GDP, literacy, poverty, trade, population, life expectancy, urbanization, and more
Build interactive visualizations and dashboards in Zeppelin
Compare economic and social development patterns across countries, regions, and decades
Deliver a resume-ready Spark project that you can showcase in interviews

What makes this course different?

Practical, project-based approach: Learn Spark by solving real-world questions.
Step-by-step guidance: Easy to follow, even if you’re new to Spark.
Comprehensive coverage: From environment setup → to data exploration → to insights.
Portfolio-ready project: By the end, you’ll have a complete Spark + Zeppelin project to demonstrate your skills.

Who is this course for?

Beginners who want to break into Big Data and Analytics with a hands-on project.
Data engineers & data analysts looking to strengthen their Spark SQL and Zeppelin skills.
Job seekers & interview candidates who need a portfolio project to stand out.
Anyone interested in exploring global development trends through the power of big data.

Real-World Case Studies Covered

Gini Index (Income Inequality)
Youth Literacy Rates
GDP per Capita (PPP) for India & China
Trade, Imports & Exports Analysis
Poverty Alleviation Trends
Life Expectancy in India, China & France
Urbanization & Infant Mortality Studies
Richest vs Poorest Countries (1962 vs 2014)
Birth Rates in G7 Countries
Global Per Capita Income in 2013

By the end of this course, you will be able to:

Confidently work with Apache Spark, Spark SQL, and Zeppelin.
Perform advanced data analysis on large, real-world datasets.
Build interactive notebooks and dashboards for visualization.
Showcase your Spark project in interviews and on your resume.

This is not just another Spark course — it’s a career-boosting project that prepares you for the real-world challenges of data engineering and analytics.

Who this course is for:

Data Engineers and Data Analysts who want hands-on experience with Apache Spark.
Students and Beginners in Big Data who want a guided project to apply Spark in the real world.
Aspiring Data Scientists looking to practice analytics on meaningful, real-world datasets.
Researchers and Analysts interested in exploring global development metrics with Spark.
Professionals preparing for Spark-related roles who want practical project experience to showcase.
Anyone curious about using data and analytics to understand world development trends.


Course content

Introduction to the Course • 4 lectures • 13min

World Development Indicators Deep Dive • 4 lectures • 13min

Setting Up the Environment • 13 lectures • 33min

Download Resources • 2 lectures • 2min

Zeppelin Basics • 8 lectures • 1hr

Zeppelin with Apache Spark • 5 lectures • 44min

Data Exploration with Spark • 20 lectures • 1hr 35min

Introduction • 1 lecture • 2min

Download Resources • 1 lecture • 1min

Project Begins • 31 lectures • 1hr 11min
