Big Data Foundation for Developers
Rating: 4.3 out of 5 (13 ratings)
136 students

A hands-on course for developers on the popular big data tools Hadoop, Hive and Spark, including machine learning with Spark
Last updated 11/2022
English

What you'll learn

  • Apache Hadoop, Hive and Spark are very popular big data tools used by many organizations. Don't let your skills become obsolete.
  • Upskill yourself with in-demand big data and machine learning skills
  • Practice with 20 demos and more than 50 practice activities that push you beyond what you learn in the class to become a big data developer
  • You will implement machine learning techniques using Spark to solve business problems such as prediction, recommendation engines and anomaly detection.
  • By the end of this course, you will be able to set up a big data cluster, copy data to it and process it with big data tools
  • Query big data using Hive, and process big data through dataframes in Spark
  • Store data in Parquet format to take advantage of predicate pushdown, and chain multiple data transformations, including windowing and pivoting
  • Includes introduction to Scala for use with Spark

Course content

9 sections • 183 lectures • 9h 4m total length
  • Introduction (2:08)

    Introduction to the course

  • Introduction continued (0:51)
  • Course prerequisites (0:53)
  • Course Structure (0:59)
  • Data Sizes in Big Data (1:59)
  • How is big data technology different? (7:21)
  • 3 Vs of Big Data (2:47)
  • Big data case study (2:06)
  • Big Data Solution (2:33)
  • Big data solution stages (2:41)
  • Apache Hadoop (4:17)
  • Yarn (0:59)
  • Hive (1:29)
  • Spark (1:22)
  • Practice Activity
  • Things to remember (1:02)

Requirements

  • Development experience with Java or C++, plus database experience.
  • Access to a Linux environment, or to a Linux virtual machine running on Windows.

Description

Apache Hadoop, Yarn, Hive and Spark are popular big data tools that many organizations use to build big data analytics solutions. In this course, students develop big data applications with these tools to process data and derive valuable insights from it. By the end of the course, students will be able to set up a personal big data development environment; master the fundamental concepts of Hadoop, Yarn, Hive and Spark; copy data into and out of a big data cluster; process data using the Map/Reduce paradigm; run Map/Reduce and Spark jobs on Yarn; process big data in Spark using the Scala programming language; use RDDs and dataframes to process big data; store data in Parquet format; and use Spark's machine learning libraries to develop machine learning solutions such as decision trees, recommendation engines, linear regression and anomaly detection.

This is a hands-on development course, and you will work through more than 50 practice activities. While Java knowledge is assumed, the fundamentals of Scala are taught so that you can write Scala code to process data in Spark. The course provides a foundation for developers who want to join big data development teams in their organization.
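
For readers new to the Map/Reduce paradigm mentioned in the description, the idea can be sketched in a few lines of plain Python. This is an illustrative single-machine word count, not course material and not Hadoop code: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical, and a real cluster would spread each phase across many nodes.

```python
from collections import defaultdict


def map_phase(lines):
    """Map step: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: collapse each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Run the three phases in order on an in-memory list of lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["big data tools", "big data cluster"]
    print(word_count(sample))
```

In a framework such as Hadoop, the developer typically supplies only the map and reduce logic, while the framework handles the shuffle and the distribution of work across the cluster.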

Who this course is for:

  • Beginners who want to learn big data tools
  • Talented professionals who want to practice with the popular big data tool Spark and with machine learning
  • IT professionals keen to upskill through plenty of big data practice