QUICK REVIEW

[Paper Review] CRDTs: Consistency without concurrency control

Mihai Leţia, Nuno Preguiça|ArXiv.org|Jul 6, 2009
Distributed systems and fault tolerance | 13 references | 36 citations
TL;DR

This paper introduces CRDTs (Commutative Replicated Data Types) as a scalable alternative to traditional concurrency control in distributed systems, enabling eventual consistency without consensus. It presents Treedoc, a practical CRDT for concurrent document editing that uses a naming tree to ensure commutative operations, compact identifiers, and efficient causal ordering, achieving high performance and scalability in large-scale, dynamic environments with minimal coordination overhead.

ABSTRACT

A CRDT is a data type whose operations commute when they are concurrent. Replicas of a CRDT eventually converge without any complex concurrency control. As an existence proof, we exhibit a non-trivial CRDT: a shared edit buffer called Treedoc. We outline the design, implementation and performance of Treedoc. We discuss how the CRDT concept can be generalised, and its limitations.
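The convergence claim in the abstract can be illustrated with a toy replicated map. This is only a sketch under stated assumptions, not the paper's Treedoc: the `(counter, site)` identifiers, the `apply_op` helper, and the sample operations are all hypothetical. It replays the same set of concurrent operations in every possible order and collects the resulting states.

```python
from itertools import permutations

def apply_op(state, op):
    """Apply one operation to a replica state (dict: identifier -> atom)."""
    kind, ident, atom = op
    new_state = dict(state)
    if kind == "insert":
        new_state[ident] = atom          # identifiers are unique, so no overwrite conflicts
    elif kind == "delete":
        new_state.pop(ident, None)       # deleting twice is harmless (idempotent)
    return new_state

def contents(state):
    """Visible sequence: atoms sorted by their (hypothetical) identifiers."""
    return tuple(state[i] for i in sorted(state))

# State every replica already agrees on before the concurrent operations.
initial = {(0, "siteA"): "w"}

# Concurrent operations issued at different sites (illustrative data).
concurrent_ops = [
    ("delete", (0, "siteA"), None),
    ("insert", (1, "siteA"), "x"),
    ("insert", (2, "siteB"), "y"),
    ("insert", (1, "siteB"), "z"),
]

# Replay the operations in all 24 delivery orders.
results = set()
for order in permutations(concurrent_ops):
    state = initial
    for op in order:
        state = apply_op(state, op)
    results.add(contents(state))

print(len(results))  # a single distinct outcome means every order converged
```

Because each operation names its target by a unique identifier rather than by a position, no delivery order can change the outcome, which is the property the abstract relies on.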

Motivation & Objective

  • To address the scalability and consistency challenges in large-scale distributed systems with mutable shared data.
  • To eliminate the need for complex concurrency control mechanisms, such as consensus or vector clocks, on the critical path of updates to replicated data types.
  • To design a practical, efficient, and scalable CRDT that supports concurrent insertions and deletions in an ordered sequence.
  • To overcome practical limitations such as indefinite growth, identifier bloat, and garbage collection in CRDTs.
  • To demonstrate that CRDTs can be used in real-world systems like collaborative editing with performance comparable to production systems.

Proposed method

  • Designing a CRDT abstraction for an ordered set with insert-at-position and delete operations using unique, stable identifiers.
  • Using a naming tree structure to assign compact, globally unique identifiers to atoms, ensuring commutativity across replicas.
  • Implementing a two-tier architecture with core and nebula sites: the core maintains the global state, while nebula sites handle local updates and propagate them via a reliable broadcast protocol.
  • Applying a tree restructuring algorithm that preserves commutativity by transforming subtrees into major nodes when needed.
  • Generating update operations (insert/delete) only for black nodes and tombstones in the final tree, ensuring idempotent and commutative propagation.
  • Using implicit causal ordering encoded in the tree structure to avoid the need for precise vector clocks, enabling scalable approximate causal tracking.
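
The naming-tree idea in the list above can be sketched in a few lines. This is a simplified illustration, not the paper's implementation: every path element carries a site disambiguator (the paper adds them only where needed), and the names `Replica`, `sort_key`, `local_insert`, and `local_delete` are assumptions made for this example. An identifier is a path of (bit, site) steps in a binary tree, atoms are read in infix order, and deletes leave tombstones.

```python
END = (0, "")  # marks "the node itself", between its left (-1) and right (+1) subtrees

def sort_key(path):
    """Infix order: left descendants < node < right descendants."""
    return tuple((1 if bit else -1, site) for bit, site in path) + (END,)

class Replica:
    def __init__(self, site):
        self.site = site
        self.atoms = {}  # path -> [atom, visible]; invisible entries are tombstones

    def _ordered(self):
        return sorted(self.atoms, key=sort_key)

    def text(self):
        return "".join(self.atoms[p][0] for p in self._ordered() if self.atoms[p][1])

    def apply(self, op):
        """Apply a local or remote operation; safe to replay."""
        kind, path, atom = op
        entry = self.atoms.setdefault(path, [atom, True])
        if kind == "delete":
            entry[1] = False  # keep the node as a tombstone

    def local_insert(self, index, atom):
        ordered = self._ordered()
        visible = [p for p in ordered if self.atoms[p][1]]
        pos = 0 if index == 0 else ordered.index(visible[index - 1]) + 1
        before = ordered[pos - 1] if pos > 0 else ()  # () stands for a virtual root
        after = ordered[pos] if pos < len(ordered) else None
        if after is not None and after[:len(before)] == before:
            path = after + ((0, self.site),)   # free left child of the successor
        else:
            path = before + ((1, self.site),)  # free right child of the predecessor
        op = ("insert", path, atom)
        self.apply(op)
        return op

    def local_delete(self, index):
        visible = [p for p in self._ordered() if self.atoms[p][1]]
        op = ("delete", visible[index], None)
        self.apply(op)
        return op

a, b = Replica("A"), Replica("B")
setup = [a.local_insert(i, ch) for i, ch in enumerate("abc")]
for op in setup:
    b.apply(op)

# Concurrent edits: both sites insert at index 1, and B also deletes "c".
op_a = a.local_insert(1, "X")
op_b = b.local_insert(1, "Y")
op_d = b.local_delete(3)

# Exchange operations; each replica sees them in a different order.
a.apply(op_b); a.apply(op_d)
b.apply(op_a)
print(a.text(), b.text())  # aXYb aXYb
```

Both sites pick the same tree position for their concurrent inserts, and the site disambiguator alone decides that "X" precedes "Y" on every replica. The sketch omits the restructuring (flattening) step, so tombstones and paths only ever grow here.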

Experimental results

Research questions

  • RQ1: Can a non-trivial, practical, and efficient CRDT be designed for concurrent editing of ordered sequences?
  • RQ2: How can CRDTs be made scalable in systems with dynamic, varying numbers of writable replicas?
  • RQ3: What mechanisms ensure commutativity of operations without relying on consensus or vector clocks?
  • RQ4: How can identifier size and data structure bloat be minimized in CRDTs while preserving correctness?
  • RQ5: Can garbage collection be decoupled from the critical path of application updates in CRDTs?

Key findings

  • Treedoc successfully implements a CRDT for concurrent document editing that ensures eventual convergence across replicas without consensus or complex conflict resolution.
  • The naming tree yields compact, globally unique identifiers, and periodic flattening of the tree keeps identifier size from growing without bound.
  • Tree restructuring is performed in a way that maintains commutativity, allowing dynamic scaling without breaking consistency.
  • Causal ordering is implicitly encoded in the tree structure, reducing reliance on expensive vector clocks and enabling scalable approximate tracking.
  • Garbage collection is feasible and non-blocking, as it operates outside the critical path of application updates, though it requires consensus.
  • Performance evaluation using Wikipedia traces shows that Treedoc scales well and performs competitively with production systems, even under high update loads.


This review was created by AI and reviewed by human editors.