Database.Advanced.How do you implement eventual consistency in a distributed database?

🧠 1. Understand the Use Case

Eventual consistency is useful when:

  • High availability is critical.
  • Temporary stale reads are acceptable (e.g., social media likes, shopping carts).
  • You need to keep accepting writes during network partitions.

⚙️ 2. Key Techniques for Eventual Consistency

🔁 a) Asynchronous Replication

  • Writes are sent to one or more nodes.
  • The system acknowledges the write before all replicas are updated.
  • Background processes replicate the data eventually.

Example:
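In Cassandra, the coordinator sends each write to all replicas for the key but waits for acknowledgements only from as many as the consistency level requires (ONE, QUORUM, etc.). Replicas that missed the write catch up later through hinted handoff, read repair, or anti-entropy repair.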
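
A minimal sketch of the acknowledge-first, replicate-later idea. The `Primary` and `Replica` classes and the `replicate_pending` method are hypothetical names for illustration, and the background process is replaced by an explicit call so the stale window is easy to see.

```python
from collections import deque


class Replica:
    """A follower that only changes state when replication reaches it."""

    def __init__(self):
        self.data = {}


class Primary:
    """Accepts writes, acknowledges at once, and replicates later."""

    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.pending = deque()  # writes not yet shipped to the replicas

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))
        return "ack"  # returned before any replica has the value

    def replicate_pending(self):
        # Stand-in for the background process; a real system would run this
        # continuously and retry failed deliveries.
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica.data[key] = value


replica = Replica()
primary = Primary([replica])
primary.write("likes:42", 1)
stale = replica.data.get("likes:42")  # replica has not caught up yet
primary.replicate_pending()
fresh = replica.data.get("likes:42")  # replica has converged
```

Between `write` and `replicate_pending`, a read served by the replica misses the new value. That gap is the staleness the use case has to tolerate.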

📦 b) Versioning with Vector Clocks or Timestamps

  • Each version of data is tagged with a vector clock or timestamp.
  • When conflicting writes happen, the system can:
    • Auto-resolve (e.g., “last-write-wins”, which silently discards the losing write)
    • Expose conflicts to the application to resolve

Used in: Amazon Dynamo, Riak.
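
To make the conflict detection concrete, here is a small vector clock sketch. It assumes clocks are plain dicts of `{node_id: counter}`; the function names `increment`, `compare`, and `merge` are illustrative and not taken from Dynamo or Riak.

```python
def increment(clock, node_id):
    """Return a copy of the clock with node_id's counter advanced by one."""
    new = dict(clock)
    new[node_id] = new.get(node_id, 0) + 1
    return new


def compare(a, b):
    """Classify clock a relative to clock b."""
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"  # neither write saw the other: a real conflict
    if a_ahead:
        return "after"
    if b_ahead:
        return "before"
    return "equal"


def merge(a, b):
    """Element-wise maximum; the clock of a value that resolves both versions."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


base = increment({}, "A")        # first write, handled by node A
via_a = increment(base, "A")     # update accepted by node A
via_b = increment(base, "B")     # concurrent update accepted by node B
relation = compare(via_a, via_b)
resolved = merge(via_a, via_b)
```

Only the "concurrent" case needs conflict resolution. When one clock is strictly "after" the other, the newer version can replace the older one safely.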

📬 c) Read Repair

  • When a client reads a key, the node handling the read may fetch it from several replicas and compare their versions.
  • If it detects inconsistencies, it returns the newest version and updates the out-of-date replicas in the background.

Example: Cassandra performs read repair behind the scenes.
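
A rough sketch of read repair, assuming each replica is a dict mapping a key to a `(timestamp, value)` pair and that the highest timestamp wins. The function name `read_with_repair` is hypothetical, and the repair runs inline here rather than in the background.

```python
def read_with_repair(key, replicas):
    """Return the newest value for key and overwrite stale or missing copies."""
    versions = [r[key] for r in replicas if key in r]
    if not versions:
        return None
    newest = max(versions, key=lambda v: v[0])  # highest timestamp wins
    for r in replicas:
        if r.get(key) != newest:
            r[key] = newest  # repair the out-of-date replica
    return newest[1]


replica_a = {"profile:1": (2, "new bio")}
replica_b = {"profile:1": (1, "old bio")}
replica_c = {}  # never received the write
result = read_with_repair("profile:1", [replica_a, replica_b, replica_c])
```

After the call, all three replicas hold the timestamp-2 version. Keys that are rarely read never get repaired this way, which is why systems pair read repair with a periodic anti-entropy pass.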

🔎 d) Anti-Entropy Mechanisms (e.g., Merkle Trees)

  • Periodically compare data across nodes to detect divergence.
  • Only synchronize what’s different to reduce bandwidth.

Used in: Dynamo, Cassandra.
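
The comparison step can be sketched as follows. This is an illustration under simplifying assumptions, not how Dynamo or Cassandra lay out their trees: keys are hashed into a fixed, power-of-two number of buckets, each bucket is a leaf, and `build_tree`, `bucket_of`, and `differing_buckets` are hypothetical names.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def bucket_of(key: str, num_buckets: int) -> int:
    """Assign a key to a leaf bucket by hashing it."""
    return int.from_bytes(_h(key.encode())[:4], "big") % num_buckets


def build_tree(store: dict, num_buckets: int = 8):
    """Return tree levels: levels[0] holds leaf hashes, levels[-1] the root.

    num_buckets must be a power of two. The "key=value" serialization is a
    simplification that assumes keys and values contain no newlines.
    """
    buckets = [[] for _ in range(num_buckets)]
    for key in sorted(store):
        buckets[bucket_of(key, num_buckets)].append(f"{key}={store[key]}")
    level = [_h("\n".join(b).encode()) for b in buckets]
    levels = [level]
    while len(level) > 1:
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_buckets(a, b):
    """Walk down from the root, following only subtrees whose hashes differ."""
    top = len(a) - 1
    suspects = [0] if a[top][0] != b[top][0] else []
    for depth in range(top - 1, -1, -1):
        suspects = [
            child
            for i in suspects
            for child in (2 * i, 2 * i + 1)
            if a[depth][child] != b[depth][child]
        ]
    return suspects


node_a = {f"user:{i}": i for i in range(100)}
node_b = dict(node_a)
node_b["user:7"] = -1  # one divergent key
to_sync = differing_buckets(build_tree(node_a), build_tree(node_b))
```

Here `to_sync` contains only the bucket that holds `user:7`, so the nodes exchange the keys of one bucket instead of all 100 entries. If the roots match, nothing is transferred at all.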

⚖️ e) Conflict-Free Replicated Data Types (CRDTs)

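  • CRDTs are data structures whose merge operation resolves concurrent updates deterministically.
  • They guarantee convergence without coordination: replicas that have seen the same set of updates end up in the same state.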

Used in: Redis CRDTs, Riak.
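
One of the simplest CRDTs is a grow-only counter (G-Counter). The sketch below is a generic textbook version, not the Redis or Riak implementation. Each node increments only its own slot, and merging takes the per-node maximum.

```python
class GCounter:
    """Grow-only counter: one slot per node, merged by element-wise max."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}

    def increment(self, amount=1):
        # A node only ever advances its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Commutative, associative, and idempotent, so replicas converge
        # regardless of the order or number of merges.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


likes_a = GCounter("A")
likes_b = GCounter("B")
likes_a.increment()
likes_a.increment()
likes_b.increment()
likes_a.merge(likes_b)
likes_b.merge(likes_a)
likes_a.merge(likes_b)  # merging again changes nothing
```

Both replicas report the same total after exchanging state, and no increment is lost, unlike a last-write-wins register holding the count.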

💡 3. Practical Implementation Sketch

Let’s say you’re building a custom distributed key-value store:

  1. Write Process
    • Client sends a write to Node A.
    • Node A stores the data with a timestamp and returns success.
    • Node A asynchronously replicates the write to Nodes B and C.
  2. Read Process
    • Client reads from Node C.
    • Node C compares version with A and B (if possible).
    • If Node C's copy is stale, it returns the newest value it found and triggers a background sync (read repair).
  3. Background Anti-Entropy
    • Nodes periodically compare data via Merkle Trees.
    • Only update changed keys.
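
The three steps above can be tied together in a small in-memory sketch. It assumes last-write-wins on `(timestamp, value)` pairs, caller-supplied timestamps, and a full key exchange in place of Merkle trees. The `Node` class and its method names are hypothetical.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.store = {}   # key -> (timestamp, value)
        self.outbox = []  # (peer, key, version) entries not yet delivered

    def apply(self, key, version):
        """Keep the newer version; ties on timestamp fall back to the value."""
        current = self.store.get(key)
        if current is None or version > current:
            self.store[key] = version

    def write(self, key, value, timestamp, peers):
        version = (timestamp, value)
        self.apply(key, version)
        self.outbox.extend((peer, key, version) for peer in peers)
        return "ok"  # success is reported before peers have the write

    def deliver_outbox(self):
        """Stand-in for asynchronous replication."""
        for peer, key, version in self.outbox:
            peer.apply(key, version)
        self.outbox.clear()

    def read(self, key, peers):
        """Pull newer versions from peers, then push the winner back (read repair)."""
        for peer in peers:
            theirs = peer.store.get(key)
            if theirs is not None:
                self.apply(key, theirs)
        mine = self.store.get(key)
        if mine is None:
            return None
        for peer in peers:
            peer.apply(key, mine)
        return mine[1]

    def anti_entropy(self, peer):
        """Exchange every key both ways; a real system would narrow this down first."""
        for key, version in list(self.store.items()):
            peer.apply(key, version)
        for key, version in list(peer.store.items()):
            self.apply(key, version)


a, b, c = Node("A"), Node("B"), Node("C")
a.write("cart:1", "book", timestamp=1, peers=[b, c])
before = c.store.get("cart:1")            # replication has not run yet
value = c.read("cart:1", peers=[a, b])    # read repair fills in C and B
a.write("cart:1", "pen", timestamp=2, peers=[b])  # C unreachable this time
a.deliver_outbox()
c.anti_entropy(b)                         # periodic sync brings C up to date
```

Every path (replication, read repair, anti-entropy) goes through the same `apply` merge. Because that merge is idempotent and order-independent, duplicate or late deliveries are harmless.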

✅ Real-World Systems

System      Strategy
Cassandra   Tunable consistency, read repair, hinted handoff
DynamoDB    Last-writer-wins in global tables (vector clocks come from the original Dynamo design)
Couchbase   Conflict resolution via timestamps
Riak        Vector clocks, CRDTs
MongoDB     Replica sets with tunable read/write concern

⚠️ Challenges

  • Conflict resolution logic can be complex.
  • Data staleness needs to be acceptable for your use case.
  • Testing an eventually consistent system is harder than testing a strongly consistent one.