Nitendra Gautam

Introduction to Apache Cassandra

Apache Cassandra is a free and open-source distributed NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure.

Replication Strategy in Cassandra

A replication strategy determines the nodes where replicas are placed. Cassandra stores replicas on multiple nodes to ensure reliability and fault tolerance.

Two replication strategies are available.

  • SimpleStrategy: Use only for a single data center and one rack. If you ever intend to use more than one data center, use NetworkTopologyStrategy instead. SimpleStrategy places the first replica on a node determined by the partitioner; additional replicas are placed on the next nodes clockwise in the ring without considering topology (rack or data center location).

  • NetworkTopologyStrategy: Use this strategy when your cluster is (or will be) deployed across multiple data centers. It lets you specify how many replicas you want in each data center. NetworkTopologyStrategy places replicas within a data center by walking the ring clockwise until it reaches the first node in another rack. It attempts to place replicas on distinct racks because nodes in the same rack (or similar physical grouping) often fail at the same time due to power, cooling, or network issues.
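The clockwise walk that SimpleStrategy performs can be sketched roughly as follows. This is an illustrative model, not Cassandra's actual code; the ring layout and node names are made up.

```python
# Illustrative sketch of SimpleStrategy replica placement: start at the
# node owning the key's token, then walk the ring clockwise, ignoring
# rack and data center placement entirely.

def simple_strategy_replicas(ring, key_token, replication_factor):
    """ring: list of (token, node) pairs sorted by token."""
    # First node whose token is >= the key's token (wrap around to 0).
    start = next((i for i, (t, _) in enumerate(ring) if t >= key_token), 0)
    replicas = []
    i = start
    while len(replicas) < replication_factor:
        node = ring[i % len(ring)][1]
        if node not in replicas:
            replicas.append(node)
        i += 1
    return replicas

ring = [(0, "node-A"), (25, "node-B"), (50, "node-C"), (75, "node-D")]
print(simple_strategy_replicas(ring, 30, 3))  # ['node-C', 'node-D', 'node-A']
```

Note that the three replicas are simply the next three distinct nodes clockwise from the key's position, regardless of which racks they sit on.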

Create KeySpace in Cassandra for Single DataCenter

Example of creating a keyspace for a single data center with a replication factor of 3 (the keyspace name my_keyspace is illustrative):

    CREATE KEYSPACE my_keyspace
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3};

Create KeySpace in Cassandra for Multiple DataCenter

Example of creating a keyspace spanning two data centers, with one replica in DC1 and three in DC2 (the keyspace and data center names are illustrative):

    CREATE KEYSPACE my_keyspace
    WITH replication = {'class': 'NetworkTopologyStrategy',
          'DC1' : 1, 'DC2' : 3}
    AND durable_writes = false;

The durable_writes option tells Cassandra whether to use the commit log for updates in the keyspace. Its default value is true, and best practice is to leave it that way. When durable_writes is set to false, data written to the keyspace bypasses the commit log. Be careful using this option because you risk losing data. Do not set this attribute on a keyspace using SimpleStrategy.

Bloom Filters

Cassandra uses Bloom filters to determine whether an SSTable has data for a particular row. Bloom filters are not used for range scans, but are used for index scans. A Bloom filter is a fast, probabilistic data structure for testing whether an element is a member of a set: it can return false positives but never false negatives. Because Bloom filters reside in memory, they serve as a special kind of cache that allows quick lookups before touching disk. Cassandra uses them to save I/O when performing a key lookup: each SSTable has a Bloom filter associated with it that Cassandra checks before doing any disk seeks, making queries for keys that don't exist almost free. The main drawbacks of Bloom filters are their false positives, which cause occasional unnecessary disk reads, and the memory they consume, which grows with the number of rows per SSTable.
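The "false positives but never false negatives" property can be demonstrated with a minimal Bloom filter sketch. The sizes, hash scheme, and class name below are illustrative, not Cassandra's implementation.

```python
import hashlib

# Minimal Bloom filter sketch: a key that was added is always reported as
# possibly present (no false negatives); a key that was never added may
# occasionally be reported as present (false positive).

class BloomFilter:
    def __init__(self, size=64, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(key))

bf = BloomFilter()
bf.add("row-1")
print(bf.might_contain("row-1"))   # True: added keys are never missed
```

A "might contain" answer of True still requires checking the SSTable itself; only a False answer lets Cassandra skip the disk seek entirely.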


Tombstones in Cassandra

Tombstones are analogous to soft deletes in the traditional RDBMS world. A tombstone is a deletion marker that is required to suppress older data in SSTables until compaction can run. Cassandra uses tombstones to provide soft-delete functionality.
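The suppression effect can be sketched as a timestamp-based merge: the newest cell wins, and if the winner is a tombstone the key disappears from the compacted output. This is a conceptual model (the real process also honors gc_grace_seconds, ignored here), and the data shapes are illustrative.

```python
# Illustrative sketch: during compaction the cell with the newest write
# timestamp wins; a tombstone (value None) with a newer timestamp
# suppresses older data for that key.

TOMBSTONE = None

def merge_cells(*sstables):
    """Each sstable is a dict: key -> (timestamp, value)."""
    merged = {}
    for table in sstables:
        for key, (ts, value) in table.items():
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    # Drop keys whose winning cell is a tombstone.
    return {k: v for k, (ts, v) in merged.items() if v is not TOMBSTONE}

old = {"user:1": (100, "alice"), "user:2": (100, "bob")}
new = {"user:1": (200, TOMBSTONE)}        # delete marker for user:1
print(merge_cells(old, new))              # {'user:2': 'bob'}
```

Until that merge runs, both the old value and the tombstone coexist on disk, which is why deletes in Cassandra temporarily increase, rather than decrease, storage usage.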

Memtable in Cassandra

Memtables provide in-memory storage that is later flushed to SSTables. A memtable stores incoming data for a column family temporarily in Cassandra's JVM heap before that data is written or flushed to disk. Writes to Cassandra first hit this in-memory data structure. After the memtable gets full, Cassandra flushes it to disk and creates an SSTable. SSTables are immutable on-disk data structures, with N SSTables per column family. As writes, row updates, and deletions (mutations) continue to be incorporated into SSTables, Cassandra merges these incoming changes by compacting the SSTables. The compaction process merges old SSTables and minimizes the data footprint on disk by consolidating row updates and clearing tombstones, making reads more efficient in the process.
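The write path above can be sketched as a tiny in-memory model. The class, the flush threshold, and the trigger condition are illustrative simplifications, not Cassandra internals.

```python
# Sketch of the write path: writes land in an in-memory memtable; when it
# fills past a threshold, it is flushed to disk as an immutable SSTable.

class ColumnFamilyStore:
    def __init__(self, flush_threshold=2):
        self.memtable = {}
        self.sstables = []              # list of immutable dicts on "disk"
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        self.sstables.append(dict(self.memtable))  # immutable snapshot
        self.memtable.clear()

store = ColumnFamilyStore()
store.write("a", 1)
store.write("b", 2)   # hits the threshold -> flushed as the first SSTable
store.write("a", 3)   # newer value now sits only in the memtable
print(len(store.sstables))   # 1
```

After the third write, the stale value for "a" lives in the flushed SSTable while the fresh one is in the memtable; compaction is what eventually reconciles such duplicates.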

Compaction Types In Cassandra

There are three different compaction strategies: Size-Tiered, Leveled, and Date-Tiered. Which one to use depends on the expected workload in the cluster.

  • Size-Tiered Compaction Strategy: Good for write-heavy, low-update, or I/O-limited workloads.
  • Leveled Compaction Strategy: Good for read-heavy or update-heavy workloads.
  • Date-Tiered Compaction Strategy: Mainly used for time-series data.
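The core idea behind size-tiered compaction is to group SSTables of similar size and compact a group once it is large enough. The bucketing sketch below is a rough illustration of that idea; the ratio and the real strategy's thresholds are configurable and more nuanced in practice.

```python
# Rough sketch of size-tiered bucketing: SSTables whose sizes are within
# a factor of `ratio` of the bucket's smallest member land in the same
# bucket; a bucket with enough members becomes a compaction candidate.

def bucket_by_size(sstable_sizes, ratio=2.0):
    buckets = []
    for size in sorted(sstable_sizes):
        for bucket in buckets:
            if size <= bucket[0] * ratio:
                bucket.append(size)
                break
        else:
            buckets.append([size])
    return buckets

sizes = [10, 12, 11, 100, 95, 1000]
print(bucket_by_size(sizes))   # [[10, 11, 12], [95, 100], [1000]]
```

Compacting the small-table bucket produces one medium SSTable, which then competes in the next tier up, which is why this strategy suits write-heavy workloads that produce many similarly sized flushes.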

SSTable in Cassandra

An SSTable is the physical storage on disk. SSTables are immutable on-disk data structures, with N SSTables per column family.

Coordinator Node

A client connects to a coordinator node first; the coordinator then communicates with the rest of the cluster on the client's behalf.

Write Request in Cassandra

The coordinator sends a write request to all replicas that own the row being written. As long as all replica nodes are up and available, they will get the write regardless of the consistency level specified by the client. The write consistency level determines how many replica nodes must respond with a success acknowledgment in order for the write to be considered successful. Success means that the data was written to the commit log and the memtable.
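The acknowledgment rule can be captured in a few lines. The function name and the subset of consistency levels shown are illustrative.

```python
# Sketch of the rule above: the coordinator sends the write to all
# replicas, but the write succeeds once the number of acknowledgments
# meets the requested consistency level.

def write_succeeds(replica_acks, consistency_level, replication_factor):
    required = {
        "ONE": 1,
        "QUORUM": replication_factor // 2 + 1,
        "ALL": replication_factor,
    }[consistency_level]
    return replica_acks >= required

# RF = 3 and one replica is down, so only two acknowledge:
print(write_succeeds(2, "QUORUM", 3))  # True  (2 >= 2)
print(write_succeeds(2, "ALL", 3))     # False (2 < 3)
```

This is why a single down replica fails writes at ALL but not at QUORUM or ONE.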

Handling failover in Cassandra when node goes down

Cassandra handles failover using hinted handoff: when a replica node is down, the coordinator stores a hint for it and replays the missed writes once the node comes back up. Hinted handoff can be enabled in cassandra.yaml.
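A conceptual sketch of hinted handoff (not Cassandra's implementation; the class and method names are made up):

```python
# Illustrative sketch: the coordinator stores a hint for each write that
# a down replica missed, and replays those hints when the replica returns.

class Coordinator:
    def __init__(self):
        self.hints = {}     # node -> list of pending mutations

    def write(self, replicas, up_nodes, mutation):
        for node in replicas:
            if node in up_nodes:
                pass        # deliver directly to the live replica (omitted)
            else:
                self.hints.setdefault(node, []).append(mutation)

    def on_node_up(self, node):
        # Replay and clear any stored hints for the recovered node.
        return self.hints.pop(node, [])

coord = Coordinator()
coord.write(["n1", "n2", "n3"], up_nodes={"n1", "n2"}, mutation="INSERT ...")
print(coord.on_node_up("n3"))   # ['INSERT ...']
```

In real deployments hints are only kept for a bounded window (max_hint_window in cassandra.yaml), after which a returning node must be repaired instead.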

How Cassandra Node Join with Each Other?

Virtual nodes (vnodes) greatly simplify adding nodes to an existing cluster: Calculating tokens and assigning them to each node is no longer required. Rebalancing a cluster is no longer necessary because a node joining the cluster assumes responsibility for an even portion of the data.
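The even-distribution effect of vnodes can be sketched by giving each node several randomly placed tokens instead of one. The token count and node names below are illustrative (production clusters typically use a larger num_tokens).

```python
import random

# Sketch of vnodes: each node owns many small, randomly placed token
# ranges, so a joining node naturally takes a thin slice from every
# existing node rather than splitting one neighbor's range in half.

def assign_vnode_tokens(node, num_tokens=8, token_space=2**64, seed=None):
    rng = random.Random(seed)
    return [(rng.randrange(token_space), node) for _ in range(num_tokens)]

ring = []
for name in ["node-A", "node-B", "node-C"]:
    ring.extend(assign_vnode_tokens(name, seed=name))
ring.sort()
# Each node contributes 8 scattered ranges instead of one contiguous arc.
print(len(ring))  # 24
```

With single tokens, adding a node would require recalculating and moving one large contiguous range; with vnodes the new node's random tokens interleave across the whole ring automatically.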

Seed Node in Cassandra

Seed nodes are used only for finding the way into the cluster on node startup, so they do not become a bottleneck. Seeds are consulted during startup to discover the cluster, and new nodes refer to them on bootstrap to learn about the other nodes in the ring. When you add a new node to the ring, you must specify at least one live seed for it to contact. Once a node joins the ring, it learns about the other nodes itself, so it does not need a seed on subsequent boots.
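Seeds are configured in cassandra.yaml via a seed provider; a typical snippet looks like the following (the IP addresses are placeholders):

```yaml
# cassandra.yaml (excerpt) - addresses below are illustrative
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "192.168.1.10,192.168.1.11"
```

Every node in the cluster should list the same seeds, and it is common practice to designate a small, fixed subset of nodes (for example, two or three per data center) as seeds.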

Consistency in Cassandra

Consistency refers to how many replica nodes must acknowledge a read or write request before it is considered successful. In Cassandra, consistency can be of two types:

  • Eventual consistency
  • Strong consistency.

Eventual consistency means the write is acknowledged to the client as soon as the cluster accepts it, with the remaining replicas converging afterwards.

Strong consistency means that any update is propagated to all machines, or to all the nodes where the particular data is situated, before the request is acknowledged. You also have the freedom to blend both eventual and strong consistency. For instance, you can opt for eventual consistency for remote data centers where latency is quite high, and strong consistency for local data centers where latency is low.

  • Tunable Consistency Because different reads/writes may have different needs in terms of consistency, you can specify the consistency at read/write-time.

The common consistency levels (CL) are:

  • ANY: For writes only; ensures that the write will persist on any server in the cluster.
  • ONE: Ensures that at least one server within the replica set persists the write or responds to the read; this is the minimum consistency level for reads.
  • QUORUM: The read/write must be acknowledged by half of the nodes in the replica set plus one (RF/2 + 1).
  • LOCAL_QUORUM: Like QUORUM, but applies only to nodes within the same data center.
  • EACH_QUORUM: Like QUORUM, but ensures a quorum read/write in each data center.
  • ALL: Ensures that all nodes in a replica set receive the read/write.
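The quorum arithmetic above is just integer division. A quick sketch:

```python
# QUORUM = RF // 2 + 1 replicas must respond. LOCAL_QUORUM applies the
# same formula to the replication factor of the local data center only.

def quorum(replication_factor):
    return replication_factor // 2 + 1

print(quorum(3))   # 2
print(quorum(5))   # 3

# With NetworkTopologyStrategy {'DC1': 3, 'DC2': 3}, LOCAL_QUORUM in DC1
# needs quorum(3) = 2 local replicas; EACH_QUORUM needs 2 in every DC.
```

A useful consequence: with RF = 3, reading and writing at QUORUM (2 + 2 > 3) guarantees that every read overlaps the most recent write on at least one replica.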

Snitch in Cassandra

A snitch determines which data centers and racks nodes belong to, and therefore which data centers and racks are written to and read from. Snitches inform Cassandra about the network topology so that requests are routed efficiently, and they allow Cassandra to distribute replicas by grouping machines into data centers and racks. All nodes must have exactly the same snitch configuration. Cassandra does its best not to place more than one replica on the same rack (which is not necessarily a physical location).
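With GossipingPropertyFileSnitch (a commonly used snitch), each node declares its own data center and rack in cassandra-rackdc.properties; the values below are illustrative placeholders:

```
# cassandra-rackdc.properties (read by GossipingPropertyFileSnitch)
dc=DC1
rack=RAC1
```

The snitch itself is selected via the endpoint_snitch setting in cassandra.yaml, and the dc names declared here must match the data center names used in NetworkTopologyStrategy keyspace definitions.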

