QUICK REVIEW

[Paper Review] Durable Algorithms for Writable LL/SC and CAS with Dynamic Joining

Prasad Jayanti, Siddhartha Jayanti | arXiv (Cornell University) | Jan 1, 2023

Distributed systems and fault tolerance | Cited by 3

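One-Sentence Summary

This paper presents DuraCAS and DuraLL, the first durable, writable implementations of CAS and of its ABA-free counterpart LLSC that support dynamic joining and adaptive space complexity. Building on durable ECSC objects (the External Context variant of LLSC) and handle-based coordination, the algorithms run in constant time per operation. DuraCAS and the EC Writable-LLSC implementation need O(m + n) space, where m is the number of objects and n is the number of processes that actually access them, a substantial improvement on the O(m + N²) bound of earlier protocols designed for a fixed maximum of N processes.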

ABSTRACT

We present durable implementations for two well known universal primitives -- CAS (compare-and-swap), and its ABA-free counter-part LLSC (load-linked, store-conditional). All our implementations are: writable, meaning they support a Write() operation; have constant time complexity per operation; allow for dynamic joining, meaning newly created processes (a.k.a. threads) of arbitrary names can join a protocol and access our implementations; and have adaptive space complexities, meaning the space use scales in the number of processes $n$ that actually use the objects, as opposed to previous protocols which are designed for a maximum number of processes $N$. Our durable Writable-CAS implementation, DuraCAS, requires $O(m + n)$ space to support $m$ objects that get accessed by $n$ processes, improving on the state-of-the-art $O(m + N^2)$. By definition, LLSC objects must store "contexts" in addition to object values. Our Writable-LLSC implementation, DuraLL, requires $O(m + n + C)$ space, where $C$ is the number of "contexts" stored across all the objects. While LLSC has an advantage over CAS due to being ABA-free, the object definition seems to require additional space usage. To address this trade-off, we define an External Context (EC) variant of LLSC. Our EC Writable-LLSC implementation is ABA-free and has a space complexity of just $O(m + n)$. To our knowledge, we are the first to present durable CAS algorithms that allow for dynamic joining, and our algorithms are the first to exhibit adaptive space complexities. To our knowledge, we are the first to implement any type of durable LLSC objects.
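
To make the "ABA-free" distinction in the abstract concrete, here is a minimal Python sketch. It is not taken from the paper: the class names, the version counter, and the lock that stands in for hardware atomicity are all assumptions chosen for illustration, and nothing here is durable. It shows a CAS succeeding after a value goes from A to B and back to A, while an SC on a versioned cell fails in the same situation.

```python
import threading


class CasCell:
    """Toy compare-and-swap cell (volatile, illustrative only)."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()  # stands in for hardware atomicity

    def read(self):
        with self._lock:
            return self._value

    def cas(self, expected, new):
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False


class LlscCell:
    """Toy LL/SC cell: a version counter exposes every successful change."""

    def __init__(self, value):
        self._value = value
        self._version = 0
        self._lock = threading.Lock()

    def ll(self):
        # Return the value plus a link (the version) that SC must present.
        with self._lock:
            return self._value, self._version

    def sc(self, link, new):
        with self._lock:
            if self._version != link:
                return False  # the cell changed since the matching LL
            self._value = new
            self._version += 1
            return True


def aba_demo():
    cas_cell = CasCell("A")
    seen = cas_cell.read()             # process p reads "A"
    cas_cell.cas("A", "B")             # process q changes A to B
    cas_cell.cas("B", "A")             # process q changes B back to A
    cas_ok = cas_cell.cas(seen, "C")   # p's CAS succeeds despite two changes

    llsc_cell = LlscCell("A")
    _, p_link = llsc_cell.ll()         # process p load-links "A"
    _, q_link = llsc_cell.ll()
    llsc_cell.sc(q_link, "B")          # process q changes A to B
    _, q_link = llsc_cell.ll()
    llsc_cell.sc(q_link, "A")          # process q changes B back to A
    sc_ok = llsc_cell.sc(p_link, "C")  # p's SC fails: the cell changed
    return cas_ok, sc_ok
```

Calling `aba_demo()` returns `(True, False)`: the CAS cannot tell that the value changed twice, while the SC can.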

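Motivation and Objectives

Design the first durable, writable CAS and LLSC algorithms that let processes with arbitrary names join dynamically.
Achieve constant time per operation while keeping space low, with space complexity that adapts to the number of processes actually using the objects rather than to a preset maximum.
Address the space overhead of ABA-free LLSC by introducing an External Context (EC) variant, which brings space complexity down to O(m + n) without giving up durability or ABA-freedom.
Provide a practical foundation for durable concurrent data structures on non-volatile memory (NVM) by supporting crash-restart fault tolerance together with detectability and recoverability.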

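Proposed Method

The DuraCAS algorithm uses two durable ECSC objects (W and Z) for each DuraCAS object and distinguishes critical from non-critical operations through two handle fields, h.Critical and h.Casual.
Under this handle-based approach, h.Critical is used only to install writes and to mark CAS operations that affect the object's state, which is what makes those operations detectable.
A detector value tracked through h.Critical separates visible (state-affecting) operations from operations that are safe to repeat, ensuring detectability.
For LLSC, the DuraLL implementation takes a context-aware design that manages contexts through ECSC objects rather than storing them inside the object, which reduces space overhead.
The EC variant of LLSC externalizes context storage using durable ECSC objects, lowering space complexity from the O(m + n + C) of standard LLSC to O(m + n).
Durable ECSC primitives coordinate state changes and post-crash recovery, which keeps the algorithms wait-free and linearizable.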
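
The external-context idea described above can be sketched in a few lines of Python. This is a volatile, simplified illustration under stated assumptions, not the paper's algorithm: the names `Handle`, `EcLlscObject`, `ec_ll`, `ec_sc`, and `write` are hypothetical, the unbounded sequence number is a shortcut, and a lock stands in for an atomic hardware step. The point it shows is that the context lives in the caller's handle, so each object stores only a value and one counter regardless of how many processes have linked to it.

```python
import threading
from dataclasses import dataclass


@dataclass
class Handle:
    """Per-process handle (hypothetical layout); it owns the context."""
    context: int = -1  # sequence number observed by the last ec_ll


class EcLlscObject:
    """Toy external-context LL/SC object with a write operation (volatile)."""

    def __init__(self, value):
        self._value = value
        self._seq = 0                  # assumption: unbounded counter
        self._lock = threading.Lock()  # stands in for an atomic step

    def ec_ll(self, handle):
        # The context is written to the caller's handle, not to the object.
        with self._lock:
            handle.context = self._seq
            return self._value

    def ec_sc(self, handle, new):
        with self._lock:
            if handle.context != self._seq:
                return False           # object changed since the handle's ec_ll
            self._value = new
            self._seq += 1
            return True

    def write(self, new):
        with self._lock:
            self._value = new
            self._seq += 1             # invalidates every outstanding context


def ec_demo():
    obj = EcLlscObject(10)
    p, q = Handle(), Handle()      # two processes join by creating handles
    obj.ec_ll(p)
    obj.ec_ll(q)
    first = obj.ec_sc(p, 11)       # succeeds: nothing changed since p's ec_ll
    second = obj.ec_sc(q, 12)      # fails: p's ec_sc invalidated q's context
    obj.ec_ll(q)
    obj.write(20)                  # a plain write also invalidates contexts
    third = obj.ec_sc(q, 21)       # fails: the write came after q's ec_ll
    return first, second, third, obj.ec_ll(q)
```

In this sketch a new process joins simply by constructing a `Handle`, which loosely mirrors the handle-based dynamic joining described in the summary; durability, recovery, and the W/Z pairing used by DuraCAS are deliberately left out.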

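Results

Research Questions

RQ1: Can durable, writable CAS objects be implemented with support for dynamic joining and adaptive space complexity?
RQ2: Can durable, ABA-free LLSC objects be implemented for the first time with minimal space overhead?
RQ3: How can detectability be achieved for durable CAS and LLSC objects under crash-restart semantics?
RQ4: What is the smallest space complexity achievable for durable, writable, ABA-free concurrent objects?
RQ5: Can the space cost of storing contexts in LLSC be reduced without compromising durability or ABA-freedom?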

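Key Findings

DuraCAS achieves O(m + n) space, where m is the number of DuraCAS objects and n is the number of processes that access them, improving on the O(m + N²) bound of earlier protocols built for a fixed N.
DuraLL, the Writable-LLSC algorithm, needs O(m + n + C) space, where C is the total number of stored contexts; its EC variant externalizes context storage and lowers this to O(m + n).
DuraCAS is detectable: an operation that affects the object's state increases the detector value, whereas an operation that is safe to repeat does not, so a process can tell after a crash whether its operation took effect.
All operations, including Recover, Detect, Constructor, and CreateHandle, are wait-free and run in constant time, guaranteeing progress even when other processes fail.
The algorithms support dynamic joining: a new process can join at any time by creating a handle, with no prior coordination, and can then access existing or new objects.
This work is the first to implement durable, ABA-free LLSC objects, filling a gap in the design of durable concurrent objects.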
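
The detectability finding can be illustrated with a small Python sketch. It is a toy under explicit assumptions and not the DuraCAS algorithm: the handle is a plain dictionary holding only a detector counter, memory is ordinary volatile memory, the code is single-threaded, and the state change and the detector increment are treated as one step. The names `DetectableCell`, `create_handle`, and `detect` are hypothetical. It shows the calling pattern: record the detector value before an operation, then compare afterwards to decide whether the operation took effect or is safe to run again.

```python
class DetectableCell:
    """Toy writable CAS cell with a per-handle detector counter."""

    def __init__(self, value):
        self.value = value

    def create_handle(self):
        # Hypothetical handle layout: only a detector counter.
        return {"detector": 0}

    def cas(self, handle, expected, new):
        if self.value != expected:
            return False          # no effect: safe to repeat, detector unchanged
        # Assumption: state change and detector bump happen as one step.
        self.value = new
        handle["detector"] += 1
        return True

    def write(self, handle, new):
        self.value = new          # a write always affects the state
        handle["detector"] += 1

    def read(self):
        return self.value         # reads are always safe to repeat

    def detect(self, handle):
        return handle["detector"]


def run_with_crash_check():
    cell = DetectableCell(0)
    h = cell.create_handle()

    before = cell.detect(h)
    cell.cas(h, 0, 1)             # imagine a crash right after this line
    took_effect = cell.detect(h) != before     # detector moved: do not re-run

    before = cell.detect(h)
    cell.cas(h, 0, 2)             # fails, since the value is now 1
    failed_effect = cell.detect(h) != before   # detector unchanged: safe to re-run

    return took_effect, failed_effect, cell.read()
```

Here `run_with_crash_check()` returns `(True, False, 1)`. A real durable implementation would additionally have to persist the detector and the object state so that the comparison survives a crash, which this sketch does not attempt.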

This review was generated by AI and checked by human editors.