Fast Data Processing Systems with SMACK stack
4.0 (1 rating)
7 students enrolled

Build data processing platforms that can take on even the hardest of your data troubles!
Created by Packt Publishing
Last updated 7/2017
English
Includes:
  • 6.5 hours on-demand video
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Design and implement a fast data pipeline architecture
  • Think about and solve programming challenges in a functional way with Scala
  • Learn to use Akka, an actor model implementation for the JVM
  • Perform in-memory processing and data analysis with Spark to meet modern business demands
  • Build a powerful and effective cluster infrastructure with Mesos and Docker
  • Manage and consume unstructured and NoSQL data sources with Cassandra
  • Produce and consume messages at scale with Kafka
Requirements
  • You will learn about the full stack of big data architecture, covering the important aspects of every technology. Rather than getting incomplete information on single technologies, you will learn how to integrate them to build effective systems, and how various open source technologies can be used to build cheap and fast data processing systems, all with the help of various industry examples.
Description

SMACK is an open source full stack for big data architecture. It combines Spark, Mesos, Akka, Cassandra, and Kafka. Developers have recently begun using this stack to tackle critical real-time analytics for big data. This highly practical tutorial will teach you how to integrate these technologies to create a highly efficient data analysis system for fast data processing.

We’ll start off with an introduction to SMACK and show you when to use it. First, you’ll get to grips with functional thinking and problem solving using Scala. Next, you’ll come to understand the Akka architecture. Then you’ll learn how to improve the data structure architecture and optimize resources using Apache Spark. Moving forward, you’ll learn how to achieve linear scalability in databases with Apache Cassandra and how to work with high-throughput distributed messaging systems using Apache Kafka. We’ll show you how to build a cheap but effective cluster infrastructure with Apache Mesos. Finally, you will dive deep into the different aspects of SMACK using two practical case studies.

By the end of the video, you will be able to integrate all the components of the SMACK stack and use them together to achieve highly effective and fast data processing.

About The Author

Raúl Estrada Aparicio has been a programmer since 1996 and a Java developer since 2001. He loves functional languages such as Scala, Elixir, Clojure, and Haskell, as well as all topics related to computer science. With more than 12 years of experience in high availability and enterprise software, he has designed and implemented architectures since 2003.

He specializes in systems integration and has participated in projects mainly related to the financial sector. He has been an enterprise architect for BEA Systems and Oracle Inc., but he also enjoys mobile programming and game development. He considers himself a programmer before an architect, engineer, or developer.

He is also a Crossfitter in the San Francisco Bay Area, now focused on open source projects related to data pipelining such as Apache Flink, Apache Kafka, and Apache Beam.

Raúl is a supporter of free software and enjoys experimenting with new technologies, frameworks, languages, and methods.


Who is the target audience?
  • If you are a developer, data architect, or a data scientist looking for information on how to integrate the Big Data stack architecture and how to choose the correct technology in every layer, this video is what you are looking for.
Curriculum For This Course
42 Lectures
06:30:09
An Introduction to SMACK
5 Lectures 34:36

       This video gives an overview of the entire course.      

Preview 05:19

 To find an efficient solution, we need to learn about the data processing challenges first.       

Modern Data-Processing Challenges
06:28

 It is important to know the process or pipeline of SMACK to use it better.       

The Data-Processing Pipeline Architecture
07:09

To use the SMACK technologies well, you first need to understand what each one does.

SMACK Technologies
07:04

Learn about data expert profiles and how data processing is changing data center operations.

Understanding Data Expert Profiles and Changing the Data Center Operations
08:36
The Language – Scala
3 Lectures 30:19

To work with Scala, we need to understand the Scala collections hierarchy and how to select the right collection. This video will teach you that.

Preview 07:44

 Iterators are an important part of Scala. This video uses iterators and shows their importance.       

Iterators in Scala
03:43

This video shows a host of Scala functions, including filtering, merging, and sorting, and also covers sets, arrays, queues, and stacks.

More Functions with Scala
18:52
The Model – Akka
2 Lectures 23:11

This video compares the Actor Model with traditional OOP, then describes the actor system and actor references.

Preview 13:32

Here, we will be learning about the functioning of actors using various katas.

Working with Actors
09:39
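
As a rough illustration of the actor idea described in this section, the sketch below models an actor in plain Python rather than with Akka itself: private state, a mailbox, and messages handled one at a time. The class name CounterActor, the "increment" and "get" messages, and the stop sentinel are all hypothetical names chosen for this example; Akka's real API differs.

```python
import queue
import threading


class CounterActor:
    """A toy actor: private state, a mailbox, and one message handled at a time."""

    _STOP = object()  # hypothetical sentinel used to shut the actor down

    def __init__(self):
        self._mailbox = queue.Queue()  # messages wait here until processed
        self._count = 0                # touched only by the actor's own thread
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def tell(self, message):
        """Fire-and-forget send: the caller never touches the actor's state."""
        self._mailbox.put(message)

    def stop(self):
        """Ask the actor to drain its mailbox, then wait for it to exit."""
        self._mailbox.put(self._STOP)
        self._thread.join()

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                break
            self._receive(message)

    def _receive(self, message):
        # Behaviour is selected by the content of the message.
        if message == "increment":
            self._count += 1
        elif isinstance(message, tuple) and message[0] == "get":
            message[1].put(self._count)  # reply via a queue supplied by the sender


def run_demo():
    actor = CounterActor()

    def send_many():
        for _ in range(100):
            actor.tell("increment")

    # Four threads send concurrently; no lock is needed around the counter
    # because only the actor's thread ever modifies it.
    senders = [threading.Thread(target=send_many) for _ in range(4)]
    for sender in senders:
        sender.start()
    for sender in senders:
        sender.join()

    reply = queue.Queue()
    actor.tell(("get", reply))
    total = reply.get()
    actor.stop()
    return total


if __name__ == "__main__":
    print(run_demo())
```

The contrast with traditional OOP is that callers never invoke a method that mutates the counter directly; they only enqueue messages, so the state needs no explicit locking.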
The Engine – Apache Spark
4 Lectures 01:09:12

Apache Spark cluster-based installations can become a complex task when we integrate Mesos, Kafka, and Cassandra, because the work draws on several fields: databases, telecommunications, operating systems, and infrastructure.

Preview 06:44

Spark has four design goals: store data in memory (Hadoop is not in-memory), distribute it across a cluster, be fault tolerant, and be fast and efficient.

Resilient Distributed Datasets
22:00
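
To make the in-memory and fault-tolerance goals above more concrete, here is a minimal plain-Python sketch of the idea behind a resilient distributed dataset: transformations are recorded as lineage instead of being run immediately, results are kept in memory per partition, and a lost partition is rebuilt by replaying the lineage. MiniRDD and its methods are hypothetical names for this illustration, not Spark's API, and the "partitions" are just lists in one process.

```python
class MiniRDD:
    """Toy model of an RDD: partitioned data plus the lineage needed to rebuild it."""

    def __init__(self, source_partitions, lineage=()):
        self._source = source_partitions  # stands in for data on stable storage
        self._lineage = tuple(lineage)    # recorded transformations, not results
        self._cache = {}                  # partition index -> rows held in memory

    def map(self, fn):
        # Lazy: only extends the lineage, nothing is computed yet.
        return MiniRDD(self._source, self._lineage + (("map", fn),))

    def filter(self, predicate):
        return MiniRDD(self._source, self._lineage + (("filter", predicate),))

    def _compute(self, index):
        rows = list(self._source[index])
        for kind, fn in self._lineage:
            if kind == "map":
                rows = [fn(row) for row in rows]
            else:
                rows = [row for row in rows if fn(row)]
        return rows

    def partition(self, index):
        # Compute on first use (or after a loss), then serve from memory.
        if index not in self._cache:
            self._cache[index] = self._compute(index)
        return self._cache[index]

    def collect(self):
        result = []
        for index in range(len(self._source)):
            result.extend(self.partition(index))
        return result

    def lose_partition(self, index):
        """Simulate a node failure that drops one in-memory partition."""
        self._cache.pop(index, None)


numbers = MiniRDD([[1, 2, 3], [4, 5, 6]])
even_squares = numbers.filter(lambda n: n % 2 == 0).map(lambda n: n * n)

before_failure = even_squares.collect()
even_squares.lose_partition(1)           # only partition 1 must be recomputed
after_failure = even_squares.collect()
print(before_failure == after_failure)
```

Only the lost partition is recomputed; the surviving partition is still served from the in-memory cache.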

Apache Spark has its own built-in standalone cluster manager, but it can also run on other cluster managers, including Apache Mesos and Hadoop YARN, and on Amazon EC2.

Spark in Cluster Mode
20:26
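
In practice, the cluster manager is usually selected through the master URL given to Spark. The small helper below is only a sketch that classifies the common URL forms (local, spark://, mesos://, yarn); the function name and the host names in the usage lines are hypothetical, and older Spark releases accepted additional forms that are not handled here.

```python
def cluster_manager_for(master_url):
    """Return which cluster manager a Spark master URL points at (common forms only)."""
    if master_url == "local" or master_url.startswith("local["):
        return "local mode (no cluster manager)"
    if master_url.startswith("spark://"):
        return "Spark standalone"
    if master_url.startswith("mesos://"):
        return "Apache Mesos"
    if master_url == "yarn":
        return "Hadoop YARN"
    raise ValueError(f"unrecognized master URL: {master_url!r}")


# Host names below are placeholders for illustration.
for url in ["local[*]", "spark://master-host:7077", "mesos://mesos-host:5050", "yarn"]:
    print(url, "->", cluster_manager_for(url))
```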

Spark Streaming is the module for managing data flows. Much of Spark is built on the concept of the RDD; Spark Streaming adds the concept of DStreams, or Discretized Streams.

Spark Streaming
20:02
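
The "discretized" part of a DStream can be pictured as chopping a continuous stream into small batches, one per time interval, and then running an ordinary batch computation on each. The sketch below shows that idea in plain Python; discretize, word_counts_per_batch, and the sample events are hypothetical and do not use the Spark Streaming API.

```python
from collections import Counter


def discretize(events, batch_seconds):
    """Group (timestamp, value) events into consecutive fixed-width batches."""
    batches = {}
    for timestamp, value in events:
        batch_index = int(timestamp // batch_seconds)
        batches.setdefault(batch_index, []).append(value)
    if not batches:
        return []
    # Emit every interval in order, including empty ones, like a ticking stream.
    return [batches.get(i, []) for i in range(min(batches), max(batches) + 1)]


def word_counts_per_batch(batches):
    """Run the same batch computation (a word count) on every micro-batch."""
    return [
        dict(Counter(word for line in batch for word in line.split()))
        for batch in batches
    ]


events = [
    (0.2, "spark kafka"),
    (0.9, "spark"),
    (1.5, "cassandra"),
    (3.1, "kafka kafka"),
]
micro_batches = discretize(events, batch_seconds=1)
counts = word_counts_per_batch(micro_batches)
print(counts)
```

With a one-second interval, the four events fall into four intervals, the third of which is empty, and each interval gets its own word count.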
The Storage – Apache Cassandra
6 Lectures 42:07

NoSQL databases are distributed databases with an emphasis on scalability, high availability, and ease of administration, in contrast to established relational databases.

Preview 04:33

The task was to create a scalable, massively decentralized database, optimized for read operations, whose data structures can be modified painlessly. The solution was found by combining two existing technologies: Google's BigTable and Amazon's Dynamo.

Apache Cassandra Installation
09:50
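
One of the ideas Cassandra inherits from Dynamo is a decentralized token ring: every node owns a range of hash values, and a row's partition key decides which nodes store it. The following is a simplified sketch of that placement scheme. TokenRing, the node names, and the use of MD5 as the hash are assumptions for illustration only; Cassandra's actual partitioners and replication strategies are more involved.

```python
import bisect
import hashlib


def token_for(key):
    """Map a key to a position on the ring (MD5 is used purely for illustration)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class TokenRing:
    """Toy consistent-hash ring: each node owns the range ending at its token."""

    def __init__(self, nodes, replication_factor=2):
        self._ring = sorted((token_for(node), node) for node in nodes)
        self._tokens = [token for token, _ in self._ring]
        self._rf = min(replication_factor, len(self._ring))

    def replicas_for(self, partition_key):
        # The first node clockwise from the key's token is the primary replica;
        # the following nodes on the ring hold the additional copies.
        start = bisect.bisect_left(self._tokens, token_for(partition_key))
        start %= len(self._ring)  # wrap around past the highest token
        return [
            self._ring[(start + step) % len(self._ring)][1]
            for step in range(self._rf)
        ]


ring = TokenRing(["node-a", "node-b", "node-c"], replication_factor=2)
print(ring.replicas_for("user:42"))
```

Because no node is special, any node can work out where a key lives, and adding a node only takes over part of one neighbour's range instead of reshuffling every key.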

Cassandra can create a backup on the local machine by taking a snapshot, which is a copy of the database. It is possible to take a snapshot of all the keyspaces. Compression reduces the size of the data on disk, which increases the capacity of the cluster nodes.

Backup and Compression
04:17

If you use incremental backups, a recovery also needs the incremental backups created after the snapshot. There are multiple ways to perform a recovery from a snapshot.

Recovery Techniques
03:32

 Work with DBMS optimization       

Recovery Techniques EDBMS Optimization, Bloom Filter, and More
15:08
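
A Bloom filter, named in the lecture title above, is a compact structure that answers "definitely not present" or "possibly present" for a key, which lets a database skip reading files that cannot contain it. Below is a minimal sketch under simple assumptions: the BloomFilter class, its default sizes, and the SHA-256-based hashing are illustrative choices, not Cassandra's implementation.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self._size = size_bits
        self._num_hashes = num_hashes
        self._bits = bytearray(size_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive several bit positions per key by salting one hash function.
        for seed in range(self._num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self._size

    def add(self, key):
        for position in self._positions(key):
            self._bits[position] = 1

    def might_contain(self, key):
        # False means the key was never added; True means it may have been.
        return all(self._bits[position] for position in self._positions(key))


row_keys = BloomFilter()
for key in ["user:1", "user:2", "user:3"]:
    row_keys.add(key)

print(row_keys.might_contain("user:2"))
```

A lookup that returns False can skip the expensive disk read entirely; a True still requires checking the real data, because it may be a false positive.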

The Spark Cassandra connector is a client used to connect Spark to Cassandra. This client is special because it has been designed specifically for Spark and not for a specific language.

The Spark Cassandra Connector
04:47
Connectors – Spark, Cassandra, and Akka
4 Lectures 31:46

 In this video, you will learn the basics of the Spark Cassandra connector       

Preview 05:20

Spark Streaming allows for handling and processing high-throughput, fault-tolerant live data streams. In this video, you will learn about Spark Cassandra streaming and create a stream.

Cassandra and Spark Streaming Basics
03:35

Once the Spark Cassandra connector is set up, we'll look at the different operations we can perform with Cassandra.

Functions with Cassandra
11:57

In this video, we will use the Akka Cassandra connector to build a simple Akka application, make HTTP requests, and store the data in Cassandra.       

Akka and Cassandra
10:54
The Broker – Apache Kafka
7 Lectures 01:03:30

Growing data volumes require better data processing systems, and this is where Kafka comes into the picture. In this video, you will learn about the features and basics of Kafka.

Preview 10:46

 We need to install Kafka to work with it. This video will enable you to do that.       

Installation
02:16

A Kafka cluster works as a publish-subscribe messaging system. In this video, you will learn to program with one.

Cluster
13:14

 In this video, we will look at how the Kafka architecture is designed and understand the components that make it what it is.       

Architecture
09:55

Producers are applications that create messages and publish them to the broker. You need to understand the working of producers.       

Producers
05:59

Consumers are applications that consume the messages published to the broker. So they are the next step in the Kafka architecture.

Consumers
07:19
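
The producer and consumer roles described above can be sketched with an in-memory stand-in for a broker: a topic is a set of append-only partition logs, a producer appends messages chosen by key, and each consumer keeps its own read offset. MiniTopic, MiniConsumer, and the CRC32-based partition choice are hypothetical names and simplifications for this example; they are not the Kafka client API.

```python
import zlib


class MiniTopic:
    """Toy broker-side topic: one append-only log per partition."""

    def __init__(self, name, num_partitions=2):
        self.name = name
        self._partitions = [[] for _ in range(num_partitions)]

    def publish(self, key, value):
        """Producer side: append to the partition selected by the key's hash."""
        partition = zlib.crc32(key.encode("utf-8")) % len(self._partitions)
        log = self._partitions[partition]
        log.append((key, value))
        return partition, len(log) - 1  # (partition, offset of the new message)

    def read(self, partition, offset, max_messages=10):
        """Messages stay in the log after being read; reading is just slicing."""
        return self._partitions[partition][offset:offset + max_messages]


class MiniConsumer:
    """Toy consumer: pulls from one partition and tracks its own offset."""

    def __init__(self, topic, partition):
        self._topic = topic
        self._partition = partition
        self.offset = 0

    def poll(self, max_messages=10):
        messages = self._topic.read(self._partition, self.offset, max_messages)
        self.offset += len(messages)  # advance only past what was returned
        return messages


clicks = MiniTopic("clicks", num_partitions=2)
partition, _ = clicks.publish("sensor-1", "on")
clicks.publish("sensor-1", "off")

reader = MiniConsumer(clicks, partition)
print(reader.poll())
```

Messages with the same key land in the same partition and keep their order, and because the log is not emptied on read, a second consumer can start from offset 0 and see the same messages independently.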

To process large volumes of data, we need to integrate Kafka with other big data tools, which is what the integration part covers. Kafka also provides numerous tools to manage its features, and we will learn about those in the administration part.

Integration and Administration
14:01
Connectors – Akka, Spark, Kafka, and Cassandra
2 Lectures 11:00

In this video, we will look at the relationship between Akka and Spark, and between Kafka and Akka.

Preview 08:52

In this video, we will review the connectors between Kafka and Cassandra.

Kafka and Cassandra
02:08
The Manager – Apache Mesos
9 Lectures 01:24:28

 In this video, you will be introduced to Mesos and learn about the Mesos architecture.       

Preview 16:28

The resource allocation module of Mesos decides the quantity of resources allocated to each framework, so it is important to understand how resource allocation works in Mesos.

Resource Allocation
20:34
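
The default Mesos allocator is based on Dominant Resource Fairness (DRF): the next allocation goes to the framework whose largest share of any single resource is currently the smallest. The sketch below is a simplified, task-level version of that policy, assuming fixed per-task demands. The function name and the two example frameworks are hypothetical, and real Mesos sends resource offers to frameworks rather than launching tasks itself.

```python
def drf_allocate(total, demands):
    """Simplified DRF.

    total:   dict resource -> cluster capacity
    demands: dict framework -> dict resource -> amount needed per task
    Returns a dict framework -> number of tasks granted.
    """
    for framework, demand in demands.items():
        if not any(demand[r] > 0 for r in total):
            raise ValueError(f"{framework} must demand at least one resource")

    used = {r: 0 for r in total}
    allocated = {f: {r: 0 for r in total} for f in demands}
    tasks = {f: 0 for f in demands}

    def dominant_share(framework):
        # The framework's largest fraction of any single resource.
        return max(allocated[framework][r] / total[r] for r in total)

    def fits(framework):
        return all(used[r] + demands[framework][r] <= total[r] for r in total)

    while True:
        candidates = [f for f in demands if fits(f)]
        if not candidates:
            return tasks
        # Lowest dominant share wins; ties are broken by framework name.
        chosen = min(candidates, key=lambda f: (dominant_share(f), f))
        for r in total:
            used[r] += demands[chosen][r]
            allocated[chosen][r] += demands[chosen][r]
        tasks[chosen] += 1


cluster = {"cpu": 9, "mem_gb": 18}
per_task = {
    "analytics": {"cpu": 1, "mem_gb": 4},  # memory-heavy framework
    "web": {"cpu": 3, "mem_gb": 1},        # CPU-heavy framework
}
print(drf_allocate(cluster, per_task))
```

In this example, the memory-heavy framework ends up with three tasks and the CPU-heavy one with two, so both hold about two thirds of their dominant resource.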

If you don’t want to use cloud services from Amazon, Google, or Microsoft, you can set up your cluster in a private data center. This video will teach you how to do that.

Running a Mesos Cluster on a Private Data Center
10:01

We need frameworks to deploy and discover services, balance load, and handle service failures. In this video, we will look at the frameworks that are used for service management.

Scheduling and Managing the Frameworks
15:15

Aurora is a Mesos framework for long-running services and cron jobs. Learn about job scheduling with Aurora.

Apache Aurora
04:53

Singularity is a platform that enables deploying and running services and scheduled jobs in the cloud or in data centers. Combined with Apache Mesos, it provides efficient management of the life cycle of the underlying processes and effective use of cluster resources. Let's see what it is all about.

Singularity
03:42

 In this video, you will learn how to run Apache Spark on Mesos.       

Apache Spark on Apache Mesos
04:56

 In this video, we will deploy Apache Cassandra on Apache Mesos with the help of Marathon.       

Apache Cassandra on Apache Mesos
02:12

 In this video, we will deploy Apache Kafka on Apache Mesos.       

Apache Kafka on Apache Mesos
06:27
About the Instructor
Packt Publishing
3.9 Average rating
7,336 Reviews
52,330 Students
616 Courses
Tech Knowledge in Motion

Packt has been committed to developer learning since 2004. A lot has changed in software since then, but Packt has remained responsive to these changes, continuing to look forward at the trends and tools defining the way we work and live, and at how to put them to work.

With an extensive library of more than 4000 books and video courses, Packt's mission is to help developers stay relevant in a rapidly changing world. From new web frameworks and programming languages to cutting-edge data analytics and DevOps, Packt takes software professionals in every field to what's important to them now.

From skills that will help you develop and future-proof your career to immediate solutions to everyday tech challenges, Packt is a go-to resource to make you a better, smarter developer.

Packt Udemy courses continue this tradition, bringing you comprehensive yet concise video courses straight from the experts.