
Explore Apache Cassandra, a masterless distributed wide-column store designed for high availability across commodity servers, offering native language drivers, fault detection and recovery, and easy operations.
Outline the prerequisites for mastering Apache Cassandra, noting familiarity with keys, indexes, and distributed systems can be advantageous and that you will learn them during the course.
Explore the rise of NoSQL databases, their schema-less, highly scalable design for big data and real-time apps, and the four types—key-value, wide-column, document, and graph—using Cassandra as a key example.
The CAP theorem explains the tradeoffs among consistency, availability, and partition tolerance, detailing the CP, AP, and CA combinations and contrasting them with ACID and BASE properties in Cassandra.
Explore Cassandra, an open source distributed, decentralized database inspired by Bigtable, designed for high availability, no single point of failure, elasticity, and tunable consistency.
Download Apache Cassandra from the official site, select the latest version, and use community or enterprise options via Planet Cassandra and mirror links for tarball downloads.
Install Oracle JDK 7 or higher, verify its presence with java -version, then install Cassandra, ensuring the environment uses Oracle Java rather than OpenJDK.
Install Cassandra across platforms, review the cassandra.yaml configuration file, inspect the data and log directories, and set up directories and permissions while examining the partitioner settings.
Learn to start Cassandra from the command line, locate its running process ID in a new terminal, verify the process is active, and safely stop it with Control-C.
Cassandra distributes data across multiple nodes to enable a distributed database with horizontal scalability. Its peer-to-peer architecture avoids a single point of failure and ensures data availability across data centers.
Explore the system keyspace in Cassandra, which stores cluster operation data such as space usage and system settings, plus notes on node bootstrap, migrations, and dynamic loading.
Explore how the gossip protocol powers Cassandra's peer-to-peer communication, enabling failure detection and replication through epidemic-style rounds where nodes exchange digests, acks, and state information.
Explore anti-entropy and read repair in Cassandra, using the gossip protocol to update replicas to the newest version and a hash tree to summarize blocks for background repairs.
Discover how Cassandra uses memtables and commit logs to capture writes in memory, then flushes to SSTables on disk, where immutable files are compacted to support crash recovery and reads.
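The write path described above can be sketched as a toy model. This is only an illustration under simplifying assumptions, not Cassandra's implementation: the class name ToyNode, the flush_threshold parameter, and the in-memory lists standing in for on-disk files are all hypothetical.

```python
import json


class ToyNode:
    """Toy model of a write path: commit log, memtable, immutable SSTables."""

    def __init__(self, flush_threshold=3):
        self.commit_log = []   # append-only record of unflushed writes
        self.memtable = {}     # in-memory view of recent writes
        self.sstables = []     # flushed, sorted, immutable tables (newest last)
        self.flush_threshold = flush_threshold  # hypothetical tuning knob

    def write(self, key, value):
        # Record the write in the commit log first, then apply it in memory.
        self.commit_log.append(json.dumps({"key": key, "value": value}))
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Freeze the memtable as a sorted, immutable table and start fresh.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log = []   # flushed writes no longer need replaying

    def read(self, key):
        # Check memory first, then tables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for stored_key, stored_value in table:
                if stored_key == key:
                    return stored_value
        return None

    def recover(self):
        # After a crash, rebuild the memtable by replaying the commit log.
        self.memtable = {}
        for line in self.commit_log:
            record = json.loads(line)
            self.memtable[record["key"]] = record["value"]
```

For example, with flush_threshold=2, writing keys "a" and "b" triggers a flush; a later write to "a" lives only in the memtable and commit log, and calling recover() after clearing the memtable restores it from the log.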
Explore how compaction frees space by merging sstables, how tombstones mark deletions until compaction, and how bloom filters reduce disk reads to boost Cassandra performance.
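To see why a bloom filter can save disk reads, here is a minimal sketch of the data structure itself. The class name, the default sizes, and the use of SHA-256 to derive bit positions are assumptions made for illustration and are not how Cassandra builds its filters.

```python
import hashlib


class BloomFilter:
    """Minimal bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for position in self._positions(key):
            self.bits[position] = 1

    def might_contain(self, key):
        # False means the key was definitely never added, so the table
        # holding this filter can be skipped without touching the disk.
        return all(self.bits[position] for position in self._positions(key))
```

A reader would consult might_contain(key) before opening a table: a False answer is always safe to trust, while a True answer still requires the actual lookup.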
Explore how to configure Cassandra from default to customized setups, focusing on keyspaces, replicas, replica placement strategy, replication factors, virtual nodes, partitioners, and snitches.
Explore keyspaces, the basic unit in Cassandra, created using the create keyspace command, and how a keyspace defines a database-like hierarchy with tables, column names, and values.
Explore how Cassandra handles replicas, replication factor, and replica placement within the ring, including tokens, partitioners, and replication strategies.
Explore replica placement strategies in Cassandra, including SimpleStrategy, NetworkTopologyStrategy, and rack placement, to optimize data distribution across data centers and racks.
Learn how the replication factor determines how many copies of each piece of data are stored and distributed across a Cassandra cluster, and how consistency levels relate to the replication factor.
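The relationship between consistency levels and replication factor comes down to simple arithmetic, sketched below. The function names are hypothetical helpers; the quorum formula (half the replication factor, rounded down, plus one) and the overlap rule (read replicas plus write replicas must exceed the replication factor) are the standard ones.

```python
def quorum(replication_factor):
    """Number of replicas that form a quorum for a given replication factor."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor, write_replicas, read_replicas):
    """True when every read set must overlap every write set.

    If write_replicas + read_replicas > replication_factor, at least one
    replica contacted by the read also acknowledged the latest write.
    """
    return write_replicas + read_replicas > replication_factor
```

With a replication factor of 3, quorum(3) is 2, so quorum writes combined with quorum reads overlap (2 + 2 > 3), whereas writing and reading at a single replica each do not (1 + 1 is not greater than 3).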
Explore virtual nodes in Cassandra, where each node handles many token ranges or slices, defaulting to 256, to ease adding nodes and keep the cluster balanced.
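A small sketch can show how giving each node many tokens spreads ownership around the ring. It is an assumption-laden illustration: the Ring class, the token_for helper, and the use of MD5 as the hash are stand-ins, and replication is ignored entirely.

```python
import bisect
import hashlib


def token_for(value):
    # Stand-in hash for illustration; maps any string to a 64-bit token.
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class Ring:
    """Token ring in which every node owns many small slices (virtual nodes)."""

    def __init__(self, vnodes_per_node=256):
        self.vnodes_per_node = vnodes_per_node
        self.tokens = []  # sorted list of (token, node_name)

    def add_node(self, node):
        # Each node is placed on the ring once per virtual node.
        for i in range(self.vnodes_per_node):
            bisect.insort(self.tokens, (token_for(f"{node}#{i}"), node))

    def owner(self, partition_key):
        # The owner is the first token at or after the key's token,
        # wrapping around to the start of the ring if necessary.
        # Assumes at least one node has been added.
        token = token_for(partition_key)
        index = bisect.bisect_left(self.tokens, (token, ""))
        return self.tokens[index % len(self.tokens)][1]
```

When a new node joins, it takes over slices scattered around the ring, so any key that changes owner moves to the new node only; the existing nodes do not exchange data with one another.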
Mastering Apache Cassandra: partitioners determine how keys are sorted and distributed across nodes, shaping range queries and performance; Cassandra offers random, order-preserving, and byte-order partitioners.
This lecture explains how Cassandra snitches determine node proximity and cluster topology, guiding routing and replication across data centers and racks, and reviews simple, dynamic, and property file snitches.
Discover how to communicate with Cassandra using CQL, compare it with the legacy Thrift API, and practice basic select queries and administrative tools.
Learn to start Cassandra, access cqlsh from the terminal, connect to your cluster, and run core CQL commands like describe cluster, create table, and create user, using semicolons.
Describe how Cassandra stores its data: a cluster contains keyspaces, each keyspace holds column families, and each column family contains rows with multiple column names and their values.
Explore how a Cassandra cluster distributes data across multiple nodes, delivering a single logical view. Learn about nodes, replicas, and the replication factor that keeps data available.
Explains keyspaces as the outer container for data and column families as ordered collections of rows and columns. Shows columns as basic units with a name, value, and clock.
Discover what Cassandra is by examining keyspaces, including the system and system_traces keyspaces. Learn to list keyspaces and describe a keyspace to view its tables and definitions in the cluster.
Create keyspaces with the create keyspace command, choose a replication strategy such as network topology strategy or simple strategy, and set replication factor per data center; verify with describe keyspace.
Mastering Apache Cassandra from scratch reveals a wide range of data types, including text, 64-bit long, blob, boolean, counter, inet, timestamp, and uuid, plus the list, map, and set collection types.
Create a Cassandra table by defining columns like id, date_time, and text, with a primary key and clustering by date. Use the keyspace and drop tables as needed.
Specify the clustering order for a Cassandra table by selecting the column and choosing ascending or descending. Understand defaults and when you must redefine the table to change the order.
Learn to delete a keyspace in Cassandra using the drop keyspace command and verify deletion with describe keyspace to confirm removal.
Mastering Apache Cassandra: use insert into to add a single row with specified id, date time, and event, and use select to retrieve specific columns from the activity table.
Unpack the Cassandra copy command, showing how to copy data between tables by listing columns and matching order, then run a select to verify results.
Discover how Cassandra stores data, highlighting the partition key as the internal factor that uniquely identifies rows alongside the primary key, and inspect table contents with examples.
Discover how Cassandra differs from RDBMS: no standard query language but an API, range-based sorting instead of order by, secondary indexes, normalization, and no referential integrity or joins.
Explore Cassandra design patterns such as materialized views built on a second column family for denormalization, valueless columns, and aggregate keys for efficient lookups.
Learn how the where clause in Cassandra retrieves data by specifying the partition key or primary key, and why queries on non-key columns fail without secondary indexes, with practical examples.
Learn how Cassandra uses secondary indexes for non-primary-key lookups, backed by a hidden per-node index, and why they do not speed up queries; consider a query-specific table instead.
Explore how a composite partition key uses multiple columns to prevent endless partition growth in Cassandra, with Wakil ID, date, and time.
In this course you will learn about Apache Cassandra, a NoSQL database, and how it is used to store big data.