
Explore Apache Druid, a high-performance real-time analytics database that enables fast ad-hoc analytics, instant data visibility, high concurrency, and streaming and batch data ingestion with sub-second latency.
Install Apache Druid, explore the Druid console, load data from multiple sources, run queries to explore Druid features, and get an introduction to Druid internals and industry use.
Install Apache Druid on a local machine via the single-server quickstart, meet Linux or macOS requirements and Java 8 update 92 or later, then start micro-quickstart.
Explore the Druid console through its tiles, data sources, segments, and services, learn how to load data, manage ingestion tasks, and run queries with real-time insights.
Load data from a local disk file into a Druid data source via the console, create and submit an ingestion spec, then query with SQL syntax and apply a countryName filter.
Explore multiple methods to load local disk files into Druid, including the console's Load data flow, JSON tasks submitted from the ingestion tab, the bin/post-index-task script, and curl calls.
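As a rough illustration of the curl route, the sketch below builds the same kind of HTTP request in Python. It assumes the single-server quickstart, where the Router listens on localhost:8888 and forwards the task endpoint /druid/indexer/v1/task. The helper name build_task_request and the trimmed example spec are hypothetical, and nothing is sent until the request is passed to urlopen.

```python
import json
import urllib.request

# Assumption: quickstart Router on localhost:8888 forwarding the task API.
DRUID_TASK_URL = "http://localhost:8888/druid/indexer/v1/task"


def build_task_request(spec: dict, url: str = DRUID_TASK_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying an ingestion spec as JSON."""
    body = json.dumps(spec).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical, heavily trimmed spec; a real one also needs dataSchema, ioConfig, and so on.
example_spec = {"type": "index_parallel", "spec": {}}
request = build_task_request(example_spec)

# To submit against a running Druid instance:
#     urllib.request.urlopen(request)
```

Keeping the request construction separate from the network call makes it easy to inspect the payload before submitting it.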
Transform data during load in Apache Druid by adding transformed columns, such as uppercasing country names and computing comment lengths, using transform expressions like upper and strlen.
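A minimal sketch of what such transforms could look like in the transformSpec section of an ingestion spec, built here as a Python dictionary and printed as JSON. The output column names countryUpper and commentLength are hypothetical, and the input columns countryName and comment are assumed to exist in the sample data.

```python
import json


def expression_transform(name: str, expression: str) -> dict:
    """Return one expression-type transform entry for a Druid transformSpec."""
    return {"type": "expression", "name": name, "expression": expression}


transform_spec = {
    "transforms": [
        # Uppercase the country name into a new column (hypothetical output name).
        expression_transform("countryUpper", 'upper("countryName")'),
        # Store the length of the comment text (hypothetical output name).
        expression_transform("commentLength", 'strlen("comment")'),
    ]
}

print(json.dumps(transform_spec, indent=2))
```

Column references inside Druid expressions are wrapped in double quotes, which is why the Python strings use single quotes on the outside.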
Apply a pre-row filter at load time with a like condition on countryName to pass only Australia; the sample confirms Germany yields no records.
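The filter described here might be expressed roughly as follows. The dictionary mirrors a like filter inside a transformSpec. The passes function is only a local stand-in for a pattern with no wildcards and is not how Druid evaluates filters; the sample rows are made up for illustration.

```python
import json

# A like filter on countryName; with no % wildcard it behaves as an exact match.
like_filter = {"type": "like", "dimension": "countryName", "pattern": "Australia"}
transform_spec = {"transforms": [], "filter": like_filter}


def passes(row: dict, flt: dict) -> bool:
    """Local stand-in for a wildcard-free like filter: exact match on one dimension."""
    return row.get(flt["dimension"]) == flt["pattern"]


# Hypothetical sample rows, not taken from the course dataset.
rows = [
    {"countryName": "Australia"},
    {"countryName": "Germany"},
    {"countryName": None},
]
kept = [row for row in rows if passes(row, like_filter)]

print(json.dumps(transform_spec, indent=2))
print(kept)
```

Swapping the pattern for Germany would leave kept with only the Germany row in this local sample, while in the lecture's data sample the same change yields no records.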
Load a nested JSON file, flatten the address fields into city, state, and pincode, then set the parse time, disable rollup, set day granularity, and query to confirm five rows.
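The flattening and granularity settings above could be sketched as the following spec fragments, again built as Python dictionaries. This assumes the nested object is named address and that "day granularity" refers to segment granularity; both are assumptions, since the lecture summary does not say.

```python
import json

# Assumption: each record nests city, state, and pincode under an "address" object.
input_format = {
    "type": "json",
    "flattenSpec": {
        "useFieldDiscovery": True,
        "fields": [
            {"type": "path", "name": name, "expr": f"$.address.{name}"}
            for name in ("city", "state", "pincode")
        ],
    },
}

# Assumption: day applies to segment granularity; rollup is disabled as in the lecture.
granularity_spec = {
    "segmentGranularity": "day",
    "queryGranularity": "none",
    "rollup": False,
}

print(json.dumps({"inputFormat": input_format, "granularitySpec": granularity_spec}, indent=2))
```

With rollup disabled, every input record is stored as its own row, which is what makes a simple row count a useful check after ingestion.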
Learn how to load data from Kafka into Druid, including installing Kafka, creating a wikipedia topic, producing JSON records, and configuring the Druid load spec for Kafka stream ingestion.
Load CSV data over HTTPS into Apache Druid by connecting to the data, parsing headers, and publishing a real-estate data source.
Learn how Druid SQL translates into native queries, explore JSON native query formats, and use intervals, data sources, and result formats to query data efficiently.
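To make the SQL-to-native relationship concrete, here is a sketch that pairs a simple SQL count with a native timeseries query that expresses roughly the same thing. The data source name wikipedia, the interval, and the cnt alias are assumptions for illustration, and the native query Druid actually plans for a given statement may differ; running EXPLAIN PLAN FOR on the statement shows the real one.

```python
import json

# SQL form, sent as a JSON payload to /druid/v2/sql (data source name assumed).
sql = (
    'SELECT COUNT(*) AS "cnt" FROM "wikipedia" '
    "WHERE __time >= TIMESTAMP '2015-09-12 00:00:00' "
    "AND __time < TIMESTAMP '2015-09-13 00:00:00'"
)
sql_payload = {"query": sql, "resultFormat": "object"}

# Roughly equivalent native query, sent as JSON to /druid/v2.
native_query = {
    "queryType": "timeseries",
    "dataSource": "wikipedia",
    "intervals": ["2015-09-12T00:00:00.000Z/2015-09-13T00:00:00.000Z"],
    "granularity": "all",
    "aggregations": [{"type": "count", "name": "cnt"}],
}

print(json.dumps(sql_payload, indent=2))
print(json.dumps(native_query, indent=2))
```

The time range in the SQL WHERE clause becomes the intervals field of the native query, and the COUNT(*) becomes a count aggregator.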
Explore Druid architecture and how coordinators, overlords, brokers, middle managers, and historical processes coordinate ingestion, segment management, and query serving with ZooKeeper, metadata storage, and deep storage.
Learn to integrate Apache Druid with Superset, configure security and host settings, start Superset via Docker, connect to Druid, and create datasets and time series visualizations.
See how Netflix uses Apache Druid for real-time insights at scale, ingesting two million events per second and delivering queries in sub-second to a few seconds with rollup and compaction.
Apache Druid is a recent addition to the big data ecosystem and is rapidly gaining momentum in the market. It plays a crucial role in real-time analytics pipelines.
Demand for Druid is already growing. Large companies such as Netflix, Airbnb, Google, and Walmart already use Apache Druid to process their real-time big data, and thousands of others are adopting it.
Apache Druid combines the best features of search platforms, time series databases, and OLAP systems. So if your data is organized around time, if you slice and dice that data for user-facing analytics, or if you run full-text search, these are all markers of a good use case for Druid.
What's included in the course?
A complete course on Apache Druid concepts and capabilities, explained from scratch through to production use cases.
Every Apache Druid concept is explained with a hands-on demonstration.
Includes even those concepts that are not explained clearly in the official Druid documentation.
The commands and datasets used in the lectures are attached to the course for your convenience.