From 0 to 1: The Cassandra Distributed Database
4.2 (1,085 ratings)
Course Ratings are calculated from individual students’ ratings and a variety of other signals, like age of rating and reliability, to ensure that they reflect course quality fairly and accurately.
8,230 students enrolled

A complete guide to getting started with cluster management and queries on Cassandra
Created by Loony Corn
Last updated 10/2016
English
English [Auto-generated]
This course includes
  • 6 hours on-demand video
  • 93 downloadable resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What you'll learn
  • Set up a cluster, keyspaces, column families and manage them
  • Run queries using the CQL command shell
  • Design primary keys and secondary indexes with partitioning and clustering considerations
  • Use the Cassandra Java driver to connect and run queries on the cluster
Course content
46 lectures 05:54:35
+ Introduction: Cassandra as a distributed, decentralized, columnar store
4 lectures 31:56

Cassandra manages huge datasets using its columnar layout, which is more efficient and saves space.

Preview 10:39

What are the requirements for a product catalog system, and why do we need a distributed, columnar, decentralized database to manage it?

Requirements For A Product Catalog System
08:07

What use cases does Cassandra work with? When would you use Cassandra over other databases?

What Is Cassandra?
08:33

How does Cassandra stack up against HBase? HBase is the columnar store available in the Hadoop ecosystem.

Cassandra Vs HBase
04:37
+ Install And Set Up
4 lectures 17:39

Install and set up Cassandra on your machine.

Install Cassandra (Mac and Unix based systems)
04:34
Install the Cassandra Cluster Manager (Mac and Unix)
02:20
Install Maven On Your Machine
02:20

If you are unfamiliar with software that requires working in a shell/command-line environment, this video will be helpful for you. It explains how to update the PATH environment variable, which is needed to set up most Linux/Mac shell-based software.

[For Linux/Mac OS Shell Newbies] Path and other Environment Variables
08:25
+ The Cassandra Cluster Manager
2 lectures 18:58

Get started using the Cassandra Cluster Manager

Preview 11:54
Basic CCM Commands
07:04
+ The Cassandra Data Model
3 lectures 19:38

Cassandra does not have tables; it has column families instead!

Preview 08:02
Super Column Family And Keyspace
07:17
Comparing Cassandra With A Relational Database
04:19
+ Shell Commands
7 lectures 01:00:21

All the configuration options available on a column family.

Column Families And Their Properties
12:02
Modify Column Families
02:42
Insert Data Into A Column Family
06:52

Collections and counters allow you to store rich data in your column family

Advanced Data Types: Collections And Counters
10:56
Update Simple And Collection Data Types
15:54
Manage Cluster Roles
05:01
+ Keys And Indexes: Primary Keys, Partition Keys, Clustering Keys, Secondary Indexes
8 lectures 01:04:39

Primary keys are made up of partition and clustering keys. Partition keys determine how data is distributed across a cluster.

Partition Keys: Distributing Data Across Cluster Nodes
12:14

Primary keys are made up of partition and clustering keys. Clustering keys determine how data is laid out on a single node.

Clustering Keys: Data Layout On A Node
03:36

The design of partition keys determines what queries are valid in your cluster. See the restrictions on queries based on partition keys.

Restrictions On Partition Keys
14:38

The design of clustering keys determines what queries are valid in your cluster. See the restrictions on queries based on clustering keys.

Restrictions On Clustering Keys
09:12

Allow querying on additional columns by enabling secondary indexes. There are trade-offs when using them, though!

Secondary Indexes
08:32
Restrictions On Secondary Indexes
08:52
Allow Filtering
02:27
+ Tunable Consistency
3 lectures 31:50
Write Consistency Levels And Hinted Handoff
12:18
Replication Factors And Quorum Value
08:14
+ Storage Systems
5 lectures 35:33
Overview Of Cassandra Storage Components
06:38
The SSTable And Its Components
09:44
Anatomy Of A Write Request
08:32
Anatomy Of A Read Request And The Gossip Protocol
07:25
+ A Mini-Project: A Miniature Catalog Management System In Java
9 lectures 01:12:16
Create A Session And Execute Our First Query
07:39
Create A Column Family
03:27
Check If A Column Family Has Been Created
04:59
Insert Data Into The Products Column Family
09:59
Search For Products
13:32
Delete A Listing
04:17
Update Multiple Column Families Using Logged Batch
14:42
Requirements
  • The basics of SQL and traditional relational databases
  • The basics of Java in order to use the Cassandra Java library
Description

Taught by a team that includes 2 Stanford-educated ex-Googlers and 2 ex-Flipkart Lead Analysts. This team has decades of practical experience in working with large-scale data processing.

Has your data gotten huge, unwieldy and hard to manage with a traditional database? Is your data unstructured with an expanding list of attributes? Do you want to ensure your data is always available even with server crashes? Look beyond Hadoop - the Cassandra distributed database is the solution to your problems.

Let's parse that.

  • Huge, unwieldy data: This course helps you set up a cluster with multiple nodes to distribute data across machines
  • Unstructured: Cassandra is a columnar store. There are no empty cells or space wasted when you store data with variable and expanding attributes
  • Always available: Cassandra uses partitioning and replication to ensure that your data is available even when nodes in a cluster go down
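
To make the "always available" point concrete, here is a minimal Java sketch of partitioning combined with replication. It is a toy model under stated assumptions, not Cassandra's implementation: the class name `RingSketch`, the node names, and the use of `String.hashCode()` in place of a real partitioner are all hypothetical and chosen only for illustration. Each partition key hashes to a starting node, and the row is copied to the next nodes on the ring until the replication factor is met, so the key can still be served when one replica is down.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of partitioning plus replication (illustrative only).
final class RingSketch {
    private final List<String> nodes;
    private final int replicationFactor;

    RingSketch(List<String> nodes, int replicationFactor) {
        if (replicationFactor < 1 || replicationFactor > nodes.size()) {
            throw new IllegalArgumentException(
                "replication factor must be between 1 and the node count");
        }
        this.nodes = new ArrayList<>(nodes);
        this.replicationFactor = replicationFactor;
    }

    // Replicas for a key: the node the key hashes to, then the next nodes on the ring.
    List<String> replicasFor(String partitionKey) {
        // Assumption: hashCode() stands in for a real partitioner's hash function.
        int start = Math.floorMod(partitionKey.hashCode(), nodes.size());
        List<String> replicas = new ArrayList<>();
        for (int i = 0; i < replicationFactor; i++) {
            replicas.add(nodes.get((start + i) % nodes.size()));
        }
        return replicas;
    }

    // In this model a key stays readable while at least one of its replicas is up.
    boolean isAvailable(String partitionKey, List<String> downNodes) {
        for (String replica : replicasFor(partitionKey)) {
            if (!downNodes.contains(replica)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        RingSketch ring = new RingSketch(List.of("node1", "node2", "node3", "node4"), 3);
        List<String> replicas = ring.replicasFor("product-42");
        System.out.println("Replicas: " + replicas);
        // Take the first replica down; the remaining replicas can still serve the key.
        System.out.println("Available: " + ring.isAvailable("product-42", List.of(replicas.get(0))));
    }
}
```

With a replication factor of 3, a key becomes unavailable in this model only when all three of its replicas are down at the same time.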


What's included in this course:

  • The Cassandra Cluster Manager (CCM) to set up and manage your cluster
  • The Cassandra Query Language (CQL) to create keyspaces, column families, perform CRUD operations on column families and other administrative tasks
  • Designing primary keys and secondary indexes, partitioning and clustering keys
  • Restrictions on queries based on primary and secondary key design
  • Tunable consistency using quorum and local quorum. Read and write consistency in a node
  • Architecture and Storage components: Commit Log, MemTable, SSTables, Bloom Filters, Index File, Summary File and Data File
  • A real-world project: A Miniature Catalog Management System using the Cassandra Java driver
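
The quorum arithmetic behind the tunable consistency item above can be sketched in a few lines of Java. A quorum is a majority of the replicas, floor(RF / 2) + 1, and a read is guaranteed to overlap the latest acknowledged write when the number of replicas read plus the number of replicas written exceeds the replication factor. The class and method names below are hypothetical helpers for illustration; they are not part of the Cassandra Java driver.

```java
// Quorum arithmetic for tunable consistency (illustrative helper, not a driver API).
final class QuorumSketch {

    // A quorum is a strict majority of the replicas: floor(RF / 2) + 1.
    static int quorum(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replication factor must be at least 1");
        }
        return replicationFactor / 2 + 1;
    }

    // Reads overlap the latest acknowledged write when R + W > RF.
    static boolean readsSeeLatestWrite(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3;
        int q = quorum(rf);
        System.out.println("Quorum for RF=3: " + q);                                   // 2
        System.out.println("QUORUM read + QUORUM write: " + readsSeeLatestWrite(q, q, rf)); // true
        System.out.println("ONE read + ONE write: " + readsSeeLatestWrite(1, 1, rf));       // false
    }
}
```

For local quorum the same formula would apply, using the replication factor of the local data center rather than the whole cluster.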
Who this course is for:
  • Yup! Engineers and analysts who understand traditional, relational databases and want to move to big data storage systems
  • Nope! Students who are just starting out understanding databases and have no prior experience with one