QUICK REVIEW

[Paper Review] Efficient Single Writer Concurrency.

Naama Ben-David, Guy E. Blelloch|arXiv (Cornell University)|Mar 23, 2018

Distributed systems and fault tolerance|43 references|2 citations

TL;DR

This paper presents a lock-free, single-writer concurrency control framework based on multiversion concurrency control that ensures strict serializability, no aborts, wait-freedom with bounded step complexity, and precise garbage collection. By enforcing purely functional access patterns and using efficient version maintenance, it achieves low overhead for writers and incremental version propagation to readers in systems like search indices and HTAP databases.

ABSTRACT

In this paper we consider single writer multiple reader concurrency - any number of transactions can atomically access shared data structures, but only one thread can atomically update the data. Limiting ourselves to a single writer allows us to achieve strong properties, some of which are not achievable with multiple writers. In particular, we guarantee strict serializability, no aborts, wait-freedom with strong bounds on step complexity, and precise garbage collection. Our approach is based on a variant of multiversion concurrency control. The user code accesses shared data in a purely functional manner. This allows each read to get a snapshot of the current database. The writer simply swaps in a new root to commit a new version. For garbage collection, we define the version maintenance problem and present efficient algorithms that solve it. This framework allows for very low overhead for maintaining and collecting multiple versions. The single-writer setup is particularly applicable to search indices, online graph analysis, and hybrid transactional/analytical processing (HTAP) databases, in which the bulk of the work is done by transactions that analyze the data. We have implemented the approach in C++ based on a parallel library PAM on balanced binary trees. We present results for two use cases: concurrent range-sum queries and a search index on a corpus. Both support adding new elements to the database. Experiments show that in both cases there is very little overhead when adding a single writer running in the background, while the queries can gradually get newly updated versions. Also, using our lock-free algorithm for version maintenance, the maximum live versions is bounded by the number of working transactions at the same time.

Motivation & Objective

To design a concurrency control mechanism that guarantees strong consistency and performance properties in single-writer, multiple-reader scenarios.
To eliminate transaction aborts and ensure wait-freedom with bounded step complexity in concurrent access to shared data structures.
To enable precise garbage collection of old data versions by solving the version maintenance problem efficiently.
To support low-overhead updates in read-heavy systems, such as search indices and analytical databases, where transactions that analyze the data do most of the work.
To demonstrate practical viability through implementation and evaluation on range-sum queries and search index workloads.

Proposed method

The system uses a variant of multiversion concurrency control where readers access consistent snapshots of the data structure.
The writer commits updates by atomically swapping in a new root node, ensuring version consistency without locks.
All user code operates in a purely functional style, preventing side effects and enabling safe snapshot reads.
A lock-free algorithm manages version maintenance, tracking live versions and enabling efficient garbage collection.
The approach bounds the maximum number of live versions by the number of transactions working at the same time.
The implementation is written in C++ on top of PAM, a parallel library for balanced binary trees, to support efficient updates and queries.
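The snapshot-and-swap idea described above can be sketched in a few lines of C++. This is a minimal illustration and not the paper's code or the PAM API: the names `Node`, `insert`, `range_sum`, and `Database` are hypothetical, the tree is an unbalanced binary search tree rather than a balanced one, and `std::shared_ptr` reference counting stands in for the paper's version maintenance algorithm. The sketch also assumes the standard library's `std::atomic_load` and `std::atomic_store` overloads for `shared_ptr`, which may use locks internally and are deprecated since C++20.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <utility>

// Immutable tree node: once built, a node is never modified.
struct Node {
    int key;
    std::shared_ptr<const Node> left, right;
};
using Tree = std::shared_ptr<const Node>;

Tree make(int key, Tree l, Tree r) {
    return std::make_shared<Node>(Node{key, std::move(l), std::move(r)});
}

// Purely functional insert: copies only the nodes on the search path
// and shares every untouched subtree with the old version.
Tree insert(const Tree& t, int key) {
    if (!t) return make(key, nullptr, nullptr);
    if (key < t->key) return make(t->key, insert(t->left, key), t->right);
    if (key > t->key) return make(t->key, t->left, insert(t->right, key));
    return t;  // key already present, reuse the old version
}

// Sum of all keys in [lo, hi], computed on one fixed snapshot.
long range_sum(const Tree& t, int lo, int hi) {
    if (!t || lo > hi) return 0;
    if (t->key < lo) return range_sum(t->right, lo, hi);
    if (t->key > hi) return range_sum(t->left, lo, hi);
    return t->key + range_sum(t->left, lo, hi) + range_sum(t->right, lo, hi);
}

// Hypothetical wrapper: readers take a snapshot of the root,
// the single writer commits by swapping in a new root.
class Database {
    Tree root_;
public:
    Tree snapshot() const { return std::atomic_load(&root_); }
    void commit(Tree next) { std::atomic_store(&root_, std::move(next)); }
};
```

A reader that calls `snapshot()` keeps seeing the same version for the whole query, even if the writer later calls `commit(insert(db.snapshot(), k))`. Because there is only one writer, the commit needs no compare-and-swap retry loop in this sketch.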

Experimental results

Research questions

RQ1: Can a single-writer concurrency model achieve strict serializability without transaction aborts?
RQ2: How can version maintenance be performed efficiently to support precise garbage collection in a concurrent setting?
RQ3: What is the performance overhead of maintaining multiple versions under high reader and single writer workloads?
RQ4: Can the system ensure wait-freedom with bounded step complexity in practice?
RQ5: How does the approach scale for real-world workloads such as range-sum queries and search index updates?

Key findings

The system achieves strict serializability and guarantees no transaction aborts, even under high contention.
The maximum number of live versions is bounded by the number of concurrent working transactions, enabling predictable memory usage.
Experiments show minimal overhead for the background writer, with queries gradually receiving updated data versions.
The lock-free version maintenance algorithm ensures efficient garbage collection without blocking or excessive memory retention.
In both range-sum query and search index workloads, the system maintains high throughput with low latency for readers and low cost for updates.
The purely functional access pattern enables safe, lock-free snapshot reads without data races or consistency violations.
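The relationship between live versions and working transactions noted in the findings above can be illustrated with a small C++ sketch. It is deliberately single-threaded and is not the paper's lock-free algorithm: `VersionTracker`, `acquire`, `set`, and `live_versions` are hypothetical names, and plain `std::shared_ptr` reference counting stands in for version maintenance. Under these assumptions a version stays alive only while it is the current one or some reader still holds it, so the count never exceeds the number of active readers plus one.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Version { int id; };

// Single-threaded illustration only; not safe for concurrent use.
class VersionTracker {
    std::shared_ptr<const Version> current_;
    // Weak references let us observe which versions are still alive
    // without keeping them alive ourselves.
    std::vector<std::weak_ptr<const Version>> history_;
public:
    // A reader acquires the current version; dropping the returned
    // pointer plays the role of releasing it.
    std::shared_ptr<const Version> acquire() const { return current_; }

    // The writer installs a new current version. The previous one is
    // freed immediately unless a reader still holds it.
    void set(int id) {
        current_ = std::make_shared<Version>(Version{id});
        history_.push_back(current_);
    }

    // Number of versions that have not been collected yet.
    std::size_t live_versions() const {
        std::size_t n = 0;
        for (const auto& w : history_)
            if (!w.expired()) ++n;
        return n;
    }
};
```

For example, if two readers each hold a different old version while the writer installs a third, `live_versions()` reports three; as soon as both readers drop their pointers it falls back to one.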

This review was created by AI and reviewed by human editors.