Udemy
System Design for Big Data Pipelines
Rating: 4.2 out of 5 (61 ratings)
632 students

Analyze, Design and Build scalable, resilient and cost-effective Big Data pipelines with a methodical process
Last updated 4/2023
English

What you'll learn

  • Learn about the building blocks of a big data pipeline, their functions and challenges
  • Adopt an end-to-end methodical approach to designing a big data pipeline
  • Explore techniques to ensure overall scaling of a big data pipeline
  • Study design patterns for building blocks, their advantages, shortcomings, applications and available technologies
  • Focus additionally on Infrastructure, Operations and Security for Big Data deployments
  • Apply what you learn in the course to a batch and a real-time use case study

Course content

15 sections • 90 lectures • 6h 32m total length
  • Need for Quality Pipeline Design (3:46)

    Discuss why big data pipelines need a quality design, and explore the key activities in building such a design.

  • Course Coverage and Pre-requisites (4:16)

    Familiarize yourself with the topics covered, the out-of-scope topics and the prerequisites for the course.

  • Cloud Serverless Technologies (1:50)

    Discuss how serverless technologies from cloud providers relate to the contents of this course.

Requirements

  • Big Data Technology Concepts
  • Familiarity with Big Data Technologies like Apache Spark, Apache Kafka and NoSQL
  • Development / Deployment Experience with Big Data Technologies and Pipelines
  • Software Design and Development Experience including Cloud & Microservices

Description

Big data technologies have grown rapidly over the past few years and have penetrated every domain and industry in software development. Working with them has become a core skill for a software engineer. Robust and effective big data pipelines are needed to support the growing volume of data and applications in the big data world. These pipelines have become business critical and help increase revenue and reduce cost.

Do quality big data pipelines happen by magic? High quality designs that are scalable, reliable and cost effective are needed to build and maintain these pipelines.

How do you build an end-to-end big data pipeline that leverages big data technologies and practices effectively to solve business problems? How do you integrate them in a scalable and reliable manner? How do you deploy, secure and operate them? How do you look at the overall forest and not just the individual trees? This course focuses on this skill gap.

What are the topics covered in this course?

We start off by discussing the building blocks of big data pipelines, their functions and challenges.

We introduce a structured design process for building big data pipelines.

We then discuss individual building blocks, focusing on the design patterns available, their advantages, shortcomings, use cases and available technologies.

We recommend several best practices across the course.

Finally, we implement two use cases to illustrate how to apply what the course teaches to a real-world problem. One is a batch use case and the other is a real-time use case.


Who this course is for:

  • Big Data Pipeline Designers & Architects
  • Big Data Developers looking to move into Design/Architecture roles
  • Software Architects looking to gain Big Data Experience