NoSQL Database Showdown: A CAP Theorem Perspective

This technical overview compares seven prominent databases across various dimensions including their genres, data types, transaction support, and key differentiators. The analysis extends to how each database adheres to the CAP theorem, which states that distributed databases can only achieve two out of three properties: consistency, availability, and partition tolerance. The comparison highlights how different databases make different trade-offs, with some focusing on consistency and availability (Redis, PostgreSQL, Neo4J), others on consistency and partition tolerance (MongoDB, HBase), and some choosing availability and partition tolerance (CouchDB). This understanding is crucial for selecting the right database for specific use cases in distributed systems.

2 Minutes reading time

Behold the masterpiece that AI hallucinated while reading this post:

"The Three Database Wishes: Why You Can't Have Them All"

(after I fed it way too many marketing blogs and memes)

Created using DALL-E 3

AI-Generated: The Three Database Wishes: Why You Can't Have Them All

This summary is taken from the book “Seven Databases in Seven Weeks”. See the Books section for details.

Database

Version

Genre

Data Types

Data Relations

Standard Object

Written in

Transactions

Triggers

Main Differentiator

Weaknesses

MongoDB

2.0

Document

Typed

None

JSON

C++

No

No

Easily query Big Data

Embed-ability

CouchDB

1.1

Document

Typed

None

JSON

Erlang

No

Update validation or Changes API

Durable and embeddable clusters

Query-ability

Riak

1.0

Key-value

Blob

Ad hoc(Links)

Text

Erlang

No

Ore/post commits

High available

Query-ability

Redis

2.4

Key-value

Semi-typed

None

String

C/C++

Multi operation queries

No

Very, very fast

Complex data

PostgreSQL

9.1

Relational

Predefined and typed

Predefined

Table

C

ACID

Yes

Best of OSS RDBMS model

distributed availability

Neo4J

1.7

Graph

Untyped

Ad hoc(Edges)

Hash

Java

ACID

Transaction event handlers

Flexible graph

Blobs or terabyte scale

HBase

0.90.3

Columnar

Predefined and typed

None

Columns

Java

Yes

No

Very large-scale, Hadoop infrastructure

Flexible growth, query-ability

The CAP Theorem and how it matters in distributed systems

Brewer’s CAP Theorem proves that you can create a distributed database that is consistent(writes are atomic and all subsequent requests retrieve the new value), available(the database will always return a value as long as a single server is running), or partition tolerant(the system will still function even if server communication is temporarily lost), but you can only have two at one, so either CA, AP or CP.

Redis, PostgreSQL and Neo4J are consistent and available(CA); they do not distribute data

MongoDB and HBase are generally consistent and partition tolerant(CP)

CouchDB is available and partition tolerant (AP), but it doesn’t guarantee consistency between distributed nodes.

Git revision: 2e692ad