
We'll start with an introduction: what the course covers and who will benefit from it.
Understand how stream processing is different from batch processing
Stream processing is great for certain applications, but performance can be an issue at large scale. How do we solve this?
Understand Spouts and Bolts, which make up a Storm topology
Understand how a Storm topology allows parallelism across components
A Storm topology runs on a cluster. Understand the different services which run on the cluster
Storm is to real-time stream processing what Hadoop is to batch processing. Using Storm, you can build applications that must respond to the latest data within seconds or minutes, such as finding the latest trending topics on Twitter or monitoring spikes in payment gateway failures. From simple data transformations to applying machine learning algorithms on the fly, Storm can do it all.
This course has 25 solved examples on building Storm applications.
What's covered?
1) Understanding Spouts and Bolts, which are the building blocks of every Storm topology.
2) Running a Storm topology in local mode and in remote mode
3) Parallelizing data processing within a topology using different grouping strategies: shuffle grouping, fields grouping, direct grouping, all grouping, and custom grouping
4) Managing reliability and fault-tolerance within Spouts and Bolts
5) Performing complex transformations on the fly using the Trident topology: map, filter, windowing, and partitioning operations
6) Applying ML algorithms on the fly using libraries like Trident-ML and Storm-R.
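
To give a feel for the grouping strategies listed in item 3, here is a minimal plain-Java sketch of the idea behind shuffle grouping and fields grouping. It deliberately does not use the Storm API: the class GroupingSketch and both method names are hypothetical, the hash-modulo routing is an illustrative assumption rather than Storm's actual internal implementation, and shuffle grouping is approximated by a uniform random choice.

```java
import java.util.Random;

// Hypothetical illustration only; this is not Storm's API or its internal routing code.
public class GroupingSketch {

    // Fields grouping (idea): tuples with the same key always go to the same task,
    // modeled here as the key's hash modulo the number of tasks.
    public static int fieldsGrouping(String key, int numTasks) {
        return Math.floorMod(key.hashCode(), numTasks);
    }

    // Shuffle grouping (idea): tuples are spread across tasks regardless of content,
    // modeled here as a uniform random pick.
    public static int shuffleGrouping(Random rng, int numTasks) {
        return rng.nextInt(numTasks);
    }

    public static void main(String[] args) {
        String[] words = {"storm", "spout", "bolt", "storm", "bolt", "storm"};
        int numTasks = 3;
        Random rng = new Random();
        for (String word : words) {
            // The fields-grouping task is stable per word; the shuffle task may vary.
            System.out.println(word
                    + " fields->task " + fieldsGrouping(word, numTasks)
                    + " shuffle->task " + shuffleGrouping(rng, numTasks));
        }
    }
}
```

The point of the contrast: fields grouping suits stateful steps such as per-word counting, because every occurrence of a word reaches the same task, while shuffle grouping suits stateless steps where an even spread of load matters more than where a particular tuple lands.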