Learn Big Data Technologies for Complete Beginners

Name: Learn Big Data Technologies for Complete Beginners
Rating: 3.9 (7 reviews)

Hands-on course to learn MapReduce, MongoDB, and Spark and become a Data Engineer.

Created by Sachin Kafle

Last updated 5/2024

English

What you'll learn

Learn how MapReduce can be used to analyze big data sets
Create your own MapReduce jobs using Python and MRJob
Understand what Big Data technologies are for and how they work
Learn to write queries in MongoDB
Learn Data Engineering technologies
Learn to write code with Apache Spark
Learn about Spark SQL, DataFrames, etc.

Course content

4 sections • 53 lectures • 7h 20m total length

Introduction 2:03

Big Data and its Characteristics 6:09
Hadoop 6:38
Core Components of Hadoop 4:26
Introduction to MapReduce 5:51
MapReduce in detail 10:50
Difference between return and yield in Python 5:55
Explore the difference between return and yield in Python: return exits a function with a single result, while yield turns the function into a generator that produces values over time. Includes practical contrasts for MapReduce coding.
Use Google Colab 1:27
MapReduce: Count Ratings 16:03
Learn how MapReduce counts movies by rating from 1 to 5 using a Python MRJob job, with a mapper that emits each rating with the value one and a reducer that sums the counts.
MapReduce: Sum of Order Amounts per customer 8:14
MapReduce: Average per age 7:38
MapReduce: More examples 5:55
MapReduce: Sorting 11:39
MapReduce: Word Count Program 6:08
MapReduce: Improve using Regex 1:50
MapReduce: Sort Word Count Program 10:31
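
The rating-count lecture above describes a mapper that emits each rating with the value one and a reducer that sums the counts. The sketch below illustrates that idea in plain Python, using yield the way the return-versus-yield lecture describes. It is not the course's code and does not use MRJob: the `run_job` driver is a hypothetical stand-in for the shuffle step a real framework performs, and the tab-separated input format (user, movie, rating, timestamp) is an assumption.

```python
from collections import defaultdict


def mapper(line):
    # Assumed input format: user_id <TAB> movie_id <TAB> rating <TAB> timestamp
    user_id, movie_id, rating, timestamp = line.split("\t")
    # yield makes this a generator: it can emit zero, one, or many pairs
    yield rating, 1


def reducer(rating, counts):
    # Receives every value emitted for one key and emits the total
    yield rating, sum(counts)


def run_job(lines):
    # Hypothetical local driver: group mapper output by key, then reduce
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)
    results = {}
    for key in sorted(grouped):
        for out_key, out_value in reducer(key, grouped[key]):
            results[out_key] = out_value
    return results


# Made-up sample records, for illustration only
sample = ["1\t10\t5\t0", "2\t10\t3\t0", "3\t11\t5\t0", "4\t12\t1\t0"]
rating_counts = run_job(sample)
print(rating_counts)
```

Because `mapper` uses yield rather than return, calling it produces a generator, which is what lets a single input line contribute any number of key-value pairs to the job.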

Spark Environment 2:49
Learn how to set up an Apache Spark environment for cloud and on-premises projects, choosing between notebook and Python IDE workflows, with hands-on work in Databricks cloud and local development.
Databricks Cloud 3:23
Confirm Databricks account 0:52
Databricks cloud Introduction 24:44
Introduction to Dataframes 14:23
Apache Spark Project 0:44
Learn Apache Spark by loading a data set, creating a Spark DataFrame, and solving questions with the Spark DataFrame API and Spark SQL.
Databases 4:11
Spark Dataframe vs Spark Table 9:12
Spark Dataframes 18:48
Spark Table 11:30
Insert data into Spark table 5:11
Insert data into table from view 2:46
Copy data from a global temporary view into the demo table using INSERT INTO, then verify that the table now contains the view's data for later SQL queries.
Spark SQL: Solve first question 6:33
Spark SQL examples 9:45
Spark SQL examples 7:48
Spark DataFrame example 13:22
Spark SQL with DataFrame 3:17
Spark Architecture 3:32
Spark Architecture Continued 8:25
Spark Transformations and Actions 15:09
Explore Spark data processing concepts, including transformations and actions, immutable DataFrames, lazy evaluation, and the DAG of operations that drives narrow and wide dependencies.
Spark DataFrame Examples 14:22
Spark APIs 6:23
RDD and Spark SQL 2:28
Spark Optimization 4:22
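
The transformations-and-actions lecture above mentions immutability and lazy evaluation. The toy model below sketches those two ideas in plain Python. It is not the Spark API: `LazyFrame` and its methods are hypothetical names, the recorded plan is a simple linear list rather than a DAG, and narrow versus wide dependencies are not modeled.

```python
class LazyFrame:
    """Toy stand-in for an immutable, lazily evaluated dataset (not Spark)."""

    def __init__(self, rows, plan=()):
        self._rows = rows  # source data, never modified
        self._plan = plan  # recorded transformations, in order

    def filter(self, predicate):
        # Transformation: returns a NEW object with a longer plan; nothing runs
        return LazyFrame(self._rows, self._plan + (("filter", predicate),))

    def map(self, func):
        # Transformation: also only recorded, not executed
        return LazyFrame(self._rows, self._plan + (("map", func),))

    def collect(self):
        # Action: only now is the recorded plan actually executed
        rows = iter(self._rows)
        for kind, fn in self._plan:
            rows = filter(fn, rows) if kind == "filter" else map(fn, rows)
        return list(rows)


calls = []


def is_even(n):
    calls.append(n)  # record each evaluation so laziness is observable
    return n % 2 == 0


base = LazyFrame([1, 2, 3, 4])
evens_squared = base.filter(is_even).map(lambda n: n * n)
calls_before_action = len(calls)  # still 0: no predicate has run yet
result = evens_squared.collect()
print(calls_before_action, result)
```

Each transformation returns a new object and leaves `base` untouched, which mirrors the immutability idea, and `is_even` is not called until `collect` runs, which mirrors lazy evaluation.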

Introduction to MongoDB 7:40
What are collections and documents? 6:40
Introduction to MongoDB Compass 18:36
Explore MongoDB Compass to view databases, collections, and documents, and learn to connect to a local MongoDB service, create a bookstore database, and insert or update data.
Insert data using MongoShell 11:57
Find Documents 13:27
Chaining of Functions 4:58
Chain MongoDB queries with find, count, limit, and sort to filter by author, limit results to three, and sort titles in ascending or descending order.
Nested Documents 9:42
Comparison operators 10:58
Several useful operators 6:15
Apply filters on arrays 13:09
Delete Documents 8:01
Update Documents 15:12
Indexing in MongoDB 8:25
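
The chaining lecture above describes filtering by author, limiting results to three, and sorting titles. The course itself works in the MongoDB shell, so the Python sketch below only shows how the same chain might look through PyMongo's cursor methods (`find`, `sort`, `limit`). The `books` collection name, the `author` and `title` field names, and the helper function are assumptions made for illustration.

```python
def first_titles_by_author(books, author, limit=3, descending=False):
    """Return up to `limit` titles by `author`, sorted by title.

    `books` is expected to behave like a PyMongo collection.
    """
    # PyMongo uses 1 for ascending and -1 for descending sort order
    direction = -1 if descending else 1
    cursor = books.find({"author": author}).sort("title", direction).limit(limit)
    return [doc["title"] for doc in cursor]


# Usage against a local MongoDB service (assumed URI and collection name):
#
#   from pymongo import MongoClient
#   books = MongoClient("mongodb://localhost:27017")["bookstore"]["books"]
#   print(first_titles_by_author(books, "Some Author"))
```

Each method in the chain returns the cursor itself, so the calls compose left to right and no query is sent until the cursor is iterated.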

Requirements

Access to a personal computer. This course uses Windows OS.
Basic understanding of Python programming concepts such as variables and loops

Description

Dive into the world of Big Data with "Learn Big Data Technologies for Complete Beginners", a course designed to give you the knowledge and skills needed to work with large datasets effectively. In today's data-driven world, the ability to process and analyze large volumes of data is crucial for making informed business decisions, driving innovation, and gaining a competitive edge. The course provides a solid foundation in the key technologies and methodologies used to handle Big Data, with a focus on MapReduce, MongoDB, and Apache Spark.

Key Topics:

Introduction to Big Data:
- Understanding the concept of Big Data
- The importance and impact of Big Data in various industries
MapReduce:
- Fundamentals of the MapReduce programming model
- Developing and executing MapReduce programs
- Real-world use cases
MongoDB:
- Basics of NoSQL databases and the need for MongoDB
- MongoDB architecture and data modeling
- CRUD operations
- Indexing for scalability and performance
Apache Spark:
- Introduction to Apache Spark and its ecosystem
- Spark architecture and components
- Spark SQL and DataFrames
- Hands-on projects to solidify your understanding

How This Course Can Be Useful:

This course is essential for beginners seeking to advance their careers in data science and engineering. By learning these powerful Big Data technologies, you will gain practical skills that are highly valued in the job market, making you a competitive candidate for data-related roles. The hands-on projects and real-world applications covered in this course will enable you to tackle complex data challenges and drive data-driven decision-making in your organization.

For businesses, this course offers a pathway to harness the power of Big Data to improve operational efficiency, enhance customer experiences, and foster innovation. By understanding how to process and analyze large datasets, you can uncover valuable insights that lead to better strategies and outcomes.

Academics and researchers will benefit from the course by gaining the ability to handle large-scale data, which is crucial for conducting cutting-edge research and contributing to advancements in various fields. The skills learned here will be foundational for any further studies or research projects in data science and related areas.

Who this course is for:

Beginners who want to become Big Data Engineers
Beginners who want to learn about Big Data technologies

Course content by section:

Introduction: 1 lecture • 2min
Learn about Big Data and MapReduce: 15 lectures • 1hr 49min
Learn about Apache Spark: 24 lectures • 3hr 14min
Learn fundamentals of MongoDB: 13 lectures • 2hr 15min
