Apache HBase vs Apache Cassandra
This comparative study was done by Larry Thomas and me in May 2012. The Cassandra material was prepared by Larry Thomas.
This information is NOT intended to be a tutorial for either Apache Cassandra or Apache HBase. We tried our best to provide the most accurate information. Please comment or email me if you find any corrections; I would be happy to keep this list accurate and up to date.
Heritage

HBase: HBase is based on Bigtable (Google).

Cassandra: Cassandra is based on Dynamo (Amazon). It was initially developed at Facebook by former Amazon engineers, which is one reason Cassandra supports multiple data centers. Rackspace is a big contributor to Cassandra because of its multi-data-center support.
Infrastructure

HBase: HBase uses the Hadoop infrastructure (ZooKeeper, NameNode, HDFS). Organizations that will deploy Hadoop anyway may be comfortable leveraging their Hadoop knowledge by using HBase.

Cassandra: Cassandra started and evolved separately from Hadoop, and its infrastructure and operational knowledge requirements are different from Hadoop's. However, for analytics, many Cassandra deployments use Cassandra + Storm (which uses ZooKeeper) and/or Cassandra + Hadoop.
Infrastructure Simplicity and SPOF

HBase: The HBase/Hadoop infrastructure has several "moving parts": ZooKeeper, NameNode, HBase Master, and Data Nodes. ZooKeeper is clustered and naturally fault tolerant; the NameNode needs to be clustered to be fault tolerant.

Cassandra: Cassandra uses a single node type. All nodes are equal and perform all functions, and any node can act as a coordinator, so there is no SPOF. Adding Storm or Hadoop, of course, adds complexity to the infrastructure.
Read Intensive Use Cases

HBase: HBase is optimized for reads, supported by its single-write-master design and the resulting strict consistency model, as well as its use of Ordered Partitioning, which supports row scans. HBase is well suited for range-based scans.

Cassandra: Cassandra has excellent single-row read performance as long as eventual consistency semantics are sufficient for the use case. Cassandra quorum reads, which are required for strict consistency, will naturally be slower than HBase reads. Cassandra does not support range-based row scans, which may be limiting in certain use cases. Cassandra is well suited for single-row queries, or for selecting multiple rows based on a column-value index.
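As a concrete illustration of the single-row read path, here is a minimal sketch using the HBase Java client API of that era (the table, column family, and row key below are hypothetical). Because any given row is served by exactly one region server, the read is strongly consistent without quorum coordination.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class SingleRowRead {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // "users", "info:email", and the row key are hypothetical names for illustration.
        HTable table = new HTable(conf, "users");
        Get get = new Get(Bytes.toBytes("user#1001"));
        get.addColumn(Bytes.toBytes("info"), Bytes.toBytes("email"));
        Result result = table.get(get);   // served by the single region server owning this row
        System.out.println(Bytes.toString(
                result.getValue(Bytes.toBytes("info"), Bytes.toBytes("email"))));
        table.close();
    }
}
```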
Multi-Data Center Support and Disaster Recovery

HBase: HBase provides asynchronous replication of an HBase cluster across a WAN. HBase clusters cannot be set up to achieve zero RPO, but in steady state HBase should be roughly failover-equivalent to any other DBMS that relies on asynchronous replication over a WAN. Fall-back processes and procedures (e.g. after failover) are TBD.

Cassandra: Cassandra's Random Partitioning provides row replication of a single row across a WAN, either asynchronously (write.ONE, write.LOCAL_QUORUM) or synchronously (write.QUORUM, write.ALL). Cassandra clusters can therefore be set up to achieve zero RPO, but each write will require at least one WAN ACK back to the coordinator to achieve this capability.
Write.ONE Durability

HBase: Writes are replicated in a pipeline fashion: the first data node for the region persists the write and then sends it to the next natural endpoint, and so on down the pipeline. HBase "acks" a write only after *all* of the nodes in the pipeline have written the data to their OS buffers, and the region server in the pipeline must also have persisted the write to its WAL.

Cassandra: Cassandra's coordinator sends parallel write requests to all natural endpoints. The coordinator "acks" the write after exactly one natural endpoint has "acked" the write, which means that node has also persisted the write to its commit log. The write may or may not have been committed to any other natural endpoint.
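The HBase side of this is visible in an ordinary client write: the put call returns only after the WAL append has been pushed through the HDFS pipeline described above. A minimal sketch with the HBase client API of that era (table and column names are hypothetical):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class DurableWrite {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "users");            // hypothetical table
        Put put = new Put(Bytes.toBytes("user#1001"));
        put.add(Bytes.toBytes("info"), Bytes.toBytes("email"),
                Bytes.toBytes("a@example.com"));
        put.setWriteToWAL(true);  // default: the write is appended to the WAL,
                                  // whose blocks are replicated via the HDFS pipeline
        table.put(put);           // returns after the WAL append is acked
        table.close();
    }
}
```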
Ordered Partitioning

HBase: HBase only supports Ordered Partitioning. This means that rows for a CF are stored in RowKey order in HFiles, where each HFile contains a "block" or "shard" of all the rows in a CF. HFiles are distributed across all data nodes in the cluster.

Cassandra: Cassandra officially supports Ordered Partitioning, but no production user of Cassandra uses Ordered Partitioning because of the "hot spots" it creates and the operational difficulties such hot spots cause. Random Partitioning is the only recommended Cassandra partitioning scheme, and rows are distributed across all nodes in the cluster.
RowKey Range Scans

HBase: Because of Ordered Partitioning, HBase queries can be formulated with partial start and end row keys, and can locate rows inclusive of, or exclusive of, these partial row keys. The start and end row keys in a range scan need not even exist in HBase.

Cassandra: Because of Random Partitioning, partial row keys cannot be used with Cassandra; row keys must be known exactly. Even counting rows in a CF is complicated. For these types of use cases, it is highly recommended that data be stored in columns in Cassandra, not in rows.
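A minimal sketch of such a range scan with the HBase Java client (table name, column family, and key format are hypothetical). Note that neither the start nor the stop key needs to correspond to an existing row:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class RangeScan {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "events");            // hypothetical table
        Scan scan = new Scan();
        scan.setStartRow(Bytes.toBytes("2012-05-01"));        // inclusive; need not exist
        scan.setStopRow(Bytes.toBytes("2012-05-08"));         // exclusive; need not exist
        ResultScanner scanner = table.getScanner(scan);
        for (Result row : scanner) {
            System.out.println(Bytes.toString(row.getRow()));
        }
        scanner.close();
        table.close();
    }
}
```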
Linear Scalability for Large Tables and Range Scans

HBase: Due to Ordered Partitioning, HBase will easily scale horizontally while still supporting rowkey range scans.

Cassandra: If data is stored in columns in Cassandra to support range scans, the practical limit on row size in Cassandra is in the tens of megabytes. Rows larger than that cause problems with compaction overhead and time.
Atomic Compare and Set

HBase: HBase supports Atomic Compare and Set, and supports transactions within a single row.

Cassandra: Cassandra does not support Atomic Compare and Set. Counters require dedicated counter column families, which, because of eventual consistency, require that all replicas at all natural endpoints be read and updated with an ACK. However, hinted-handoff mechanisms can make even these built-in counters suspect for accuracy. FIFO queues are difficult (if not impossible) to implement with Cassandra.
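A minimal sketch of HBase's compare-and-set using checkAndPut (table, column, and values are hypothetical): the put is applied only if the current cell value still matches the expected value, atomically within the row.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class CompareAndSet {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "accounts");           // hypothetical table
        byte[] row  = Bytes.toBytes("acct#42");
        byte[] cf   = Bytes.toBytes("info");
        byte[] qual = Bytes.toBytes("status");

        Put put = new Put(row);
        put.add(cf, qual, Bytes.toBytes("LOCKED"));

        // Atomically set status=LOCKED only if the current value is still "OPEN".
        boolean applied = table.checkAndPut(row, cf, qual, Bytes.toBytes("OPEN"), put);
        System.out.println(applied ? "CAS succeeded" : "CAS failed: value changed concurrently");
        table.close();
    }
}
```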
Read Load Balancing - Single Row

HBase: HBase does not support read load balancing against a single row. A single row is served by exactly one region server at a time; other replicas are used only in case of a node failure. Scalability is primarily supported by partitioning, which statistically distributes reads of different rows across multiple data nodes.

Cassandra: Cassandra supports read load balancing against a single row. However, this is primarily supported by Read.ONE, and eventual consistency must be taken into consideration. Scalability is primarily supported by partitioning, which distributes reads of different rows across multiple data nodes.
Bloom Filters

HBase: Bloom Filters can be used in HBase as another form of indexing. They work on the basis of RowKey or RowKey+ColumnName to reduce the number of data blocks that HBase has to read to satisfy a query. (Bloom Filters may exhibit false positives, reading too much data, but never false negatives, reading not enough data.)

Cassandra: Cassandra uses Bloom Filters for key lookup.
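In HBase, the bloom filter type is chosen per column family at table-creation time. A sketch using the admin API of the 0.92/0.94 era (table and family names are hypothetical; the BloomType enum moved packages in later releases):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.regionserver.StoreFile;

public class CreateTableWithBloom {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);

        HTableDescriptor desc = new HTableDescriptor("events");   // hypothetical table
        HColumnDescriptor cf = new HColumnDescriptor("d");        // hypothetical family
        // ROW = filter on RowKey only; ROWCOL = filter on RowKey + ColumnName
        cf.setBloomFilterType(StoreFile.BloomType.ROWCOL);
        desc.addFamily(cf);
        admin.createTable(desc);
    }
}
```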
Triggers

HBase: Triggers are supported by the Coprocessor capability in HBase. Coprocessors allow HBase to observe the get/put/delete events on a table (CF) and then execute the trigger logic. Triggers are coded as Java classes.

Cassandra: Cassandra does not support coprocessor-like functionality (as far as we know).
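A minimal trigger-style sketch: a RegionObserver whose postPut hook runs after every put on the table it is attached to. The hook signature below is from the 0.92/0.94 coprocessor API and changed in later HBase releases; the logic inside is purely hypothetical.

```java
import java.io.IOException;

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.wal.WALEdit;

// Trigger-like observer: fires after each successful put on the attached table.
public class AuditTrigger extends BaseRegionObserver {
    @Override
    public void postPut(ObserverContext<RegionCoprocessorEnvironment> ctx,
                        Put put, WALEdit edit, boolean writeToWAL) throws IOException {
        // Hypothetical trigger logic: log the row key that was written.
        System.out.println("Row written: " + new String(put.getRow()));
    }
}
```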
Secondary Indexes

HBase: HBase does not natively support secondary indexes, but one use case for Triggers is that a trigger on a "put" can automatically keep a secondary index up to date, and therefore not put the burden on the application (client).

Cassandra: Cassandra supports secondary indexes on column families where the column name is known (not on dynamic columns).
Simple Aggregation

HBase: HBase Coprocessors support simple aggregations out of the box: SUM, MIN, MAX, AVG, STD. Other aggregations can be built by defining Java classes to perform the aggregation.

Cassandra: Aggregations in Cassandra are not supported by the Cassandra nodes - the client must perform the aggregation. When the aggregation requirement spans multiple rows, Random Partitioning makes aggregation very difficult for the client. The recommendation is to use Storm or Hadoop for aggregations.
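A sketch of a server-side SUM using the AggregationClient that ships with HBase's aggregation coprocessor (table, column family, and column names are hypothetical, and the AggregateImplementation coprocessor must already be loaded on the table):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.coprocessor.AggregationClient;
import org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter;
import org.apache.hadoop.hbase.util.Bytes;

public class ServerSideSum {
    public static void main(String[] args) throws Throwable {
        Configuration conf = HBaseConfiguration.create();
        AggregationClient aggClient = new AggregationClient(conf);
        Scan scan = new Scan();
        scan.addColumn(Bytes.toBytes("d"), Bytes.toBytes("amount"));   // hypothetical CF/column
        // Each region computes its partial sum; the client only combines the partial results.
        long total = aggClient.sum(Bytes.toBytes("events"),            // hypothetical table
                                   new LongColumnInterpreter(), scan);
        System.out.println("SUM = " + total);
    }
}
```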
HIVE Integration

HBase: HIVE can access HBase tables directly (it uses de-serialization under the hood that is aware of the HBase file format).

Cassandra: Work in progress (https://issues.apache.org/jira/browse/CASSANDRA-4131).
PIG Integration

HBase: PIG has native support for writing into/reading from HBase.

Cassandra: Supported since Cassandra 0.7.4+.
Point | HBase | Cassandra
CAP Theorem Focus | Consistency, Availability | Availability, Partition Tolerance
Consistency | Strong | Eventual (strong is optional)
Single Write Master | Yes | No (R + W > N to get strong consistency)
Optimized For | Reads | Writes
Main Data Structure | CF, RowKey, name-value pair set | CF, RowKey, name-value pair set
Dynamic Columns | Yes | Yes
Column Names as Data | Yes | Yes
Static Columns | No | Yes
RowKey Slices | Yes | No
Static Column Value Indexes | No | Yes
Sorted Column Names | Yes | Yes
Cell Versioning Support | Yes | No
Bloom Filters | Yes | Yes (key only)
CoProcessors | Yes | No
Triggers | Yes (part of Coprocessor) | No
Push Down Predicates | Yes (part of Coprocessor) | No
Atomic Compare and Set | Yes | No
Explicit Row Locks | Yes | No
Row Key Caching | Yes | Yes
Partitioning Strategy | Ordered Partitioning | Random Partitioning recommended
Rebalancing | Automatic | Not needed with Random Partitioning
Availability | N replicas across nodes | N replicas across nodes
Data Node Failure | Graceful degradation | Graceful degradation
Data Node Failure - Replication | N replicas preserved | (N-1) replicas preserved + hinted handoff
Data Node Restoration | Same as node addition | Requires node-repair admin action
Data Node Addition | Rebalancing is automatic | Rebalancing requires token-assignment adjustment
Data Node Management | Simple (roll in, roll out) | Human admin action required
Cluster Admin Nodes | ZooKeeper, NameNode, HMaster | All nodes are equal
SPOF | All the admin nodes are now fault tolerant | All nodes are equal
Write.ANY | No, but replicas are node-agnostic | Yes (writes never fail when this option is used)
Write.ONE | Standard; HA, strong consistency | Yes (often used); HA, weak consistency
Write.QUORUM | No (not required) | Yes (often used with Read.QUORUM for strong consistency)
Write.ALL | Yes (performance penalty) | Yes (performance penalty, not HA)
Asynchronous WAN Replication | Yes, but needs testing for corner cases | Yes (replicas can span data centers)
Synchronous WAN Replication | No | Yes, with Write.QUORUM or Write.EACH_QUORUM
Compression Support | Yes | Yes