QUICK REVIEW

[Paper Review] Nemo: A Low-Write-Amplification Cache for Tiny Objects on Log-Structured Flash Devices

Xufeng Yang, Tingting Tan | arXiv (Cornell University) | Mar 10, 2026

Advanced Data Storage Technologies | Cited by 0

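One-Sentence Summary

Nemo redesigns set-associative flash caching for tiny objects, using a small hash space and set-group (SG) level grouping to minimize application-level write amplification, while relying on approximate indexing and selective metadata offloading to preserve memory efficiency.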

ABSTRACT

Modern storage systems predominantly use flash-based SSDs as a cache layer due to their favorable performance and cost efficiency. However, in tiny-object workloads, existing flash cache designs still suffer from high write amplification. Even when deploying advanced log-structured flash devices (e.g., Zoned Namespace SSDs and Flexible Data Placement SSDs) with low device-level write amplification, application-level write amplification still dominates. This work proposes Nemo, which enhances set-associative cache design by increasing hash collision probability to improve set fill rate, thereby reducing application-level write amplification. To satisfy caching requirements, including high memory efficiency and low miss ratio, we introduce a bloom filter-based indexing mechanism that significantly reduces memory overhead, and adopt a hybrid hotness tracking to achieve low miss ratio without losing memory efficiency. Experimental results show that Nemo simultaneously achieves three key objectives for flash cache: low write amplification, high memory efficiency, and low miss ratio.

Motivation and Objectives

Motivate and quantify the high application-level write amplification (ALWA) in tiny-object flash caches and identify root causes.
Propose Nemo as a cache architecture that achieves near-ideal WA while preserving memory efficiency and caching performance.
Design Nemo components: small hash-space set-group (SG), SG-level flushing/eviction, PBFG-based approximate indexing, and selective in-memory/offloaded metadata.
Demonstrate that Nemo reduces both ALWA and device-level write amplification (DLWA) while maintaining acceptable memory overhead and performance.

Proposed Method

Analyze the sources of write amplification in hierarchical flash caches and model L2SWA (log-to-set write amplification) components.
Introduce Nemo’s architecture with a small in-memory hash space organized as Set-Groups (SGs) and SG-level batched writes to on-flash SG pools.
Employ Parallel Bloom Filter Group (PBFG) for approximate object indexing to reduce memory footprint and facilitate parallel lookups.
Use selective metadata offloading (keeping hot metadata in memory, offloading stable metadata to flash) and a hybrid 1-bit hotness tracker to minimize memory overhead.
Define write amplification of Nemo as WA(Nemo) = 1 / E[FR_SG] and provide rationale that SG-level batching yields high fill rates.
Outline three design challenges (C1–C3) and corresponding solutions to ensure well-filled SGs, efficient offloading, and effective eviction.
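
The definition WA(Nemo) = 1 / E[FR_SG] listed above can be illustrated with a small calculation. This is a minimal sketch, not the paper's code: it assumes FR_SG is the fraction of a flushed set-group that holds object data, and the function name and sample fill rates are hypothetical.

```python
def nemo_write_amplification(fill_rates):
    """Estimate WA as 1 / E[FR_SG] from a sample of set-group fill rates.

    Assumption: each fill rate is the fraction of a flushed set-group
    occupied by object data, so it lies in (0, 1].
    """
    if not fill_rates:
        raise ValueError("need at least one fill rate")
    if any(not 0.0 < fr <= 1.0 for fr in fill_rates):
        raise ValueError("fill rates must lie in (0, 1]")
    mean_fill = sum(fill_rates) / len(fill_rates)
    return 1.0 / mean_fill


# Hypothetical fill rates, not measurements from the paper.
sample = [0.5, 0.75, 1.0]
wa = nemo_write_amplification(sample)  # mean fill 0.75 -> WA about 1.33
```

Under this reading, a set-group that is always completely full gives the ideal value of 1, and every drop in the average fill rate raises the write amplification in inverse proportion.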

Figure 1. Application-level write amplification comparison. “S” denotes Set, “SG” denotes Set-Group.

Experimental Results

Research Questions

RQ1: What are the major sources of write amplification in existing tiny-object flash caches (e.g., FairyWREN, Kangaroo), and how can they be mitigated?
RQ2: Can a set-group-based, small-hash-space design with SG-level batching reduce ALWA to near-ideal levels without sacrificing cache performance or memory overhead?
RQ3: How can approximate indexing (PBFG) and selective metadata offloading sustain memory efficiency while supporting fast lookups for tiny objects?
RQ4: What is the impact of the hotness tracking strategy on memory footprint and cache effectiveness in Nemo?

Key Findings

Nemo can achieve ALWA close to the theoretical optimum (1.56) with DLWA as low as 1 on existing log-structured SSDs.
Compared to prior designs, Nemo reduces flash writes by up to 90% while keeping metadata memory overhead low (about 8.3 bits per object).
PBFG indexing enables in-memory indexing with around 9.6 bits per object at 1% false positive rate, significantly reducing memory cost.
SG-level batching and a FIFO on-flash SG pool align flush/eviction with SSD log-structured behavior to minimize write amplification.
Approximate indexing and hybrid hotness tracking preserve caching performance and memory efficiency, enabling practical deployment within CacheLib.
Compared to log-structured front tiers or traditional set-associative designs, Nemo eliminates log-to-set migration overhead and mitigates amplification caused by read-modify-write (RMW) operations.
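
The figure of roughly 9.6 bits per object at a 1% false positive rate matches the standard sizing rule for an optimally configured Bloom filter, m/n = -ln(p) / (ln 2)^2. The sketch below reproduces that number and shows one way a group of filters could serve as an approximate index. It is an illustration under stated assumptions, not Nemo's PBFG implementation: the one-filter-per-set-group layout and the names `BloomFilter`, `BloomFilterGroup`, and `candidates` are hypothetical.

```python
import hashlib
import math


def bloom_bits_per_object(false_positive_rate):
    """Bits per element of an optimally sized Bloom filter: -ln(p) / (ln 2)^2."""
    return -math.log(false_positive_rate) / (math.log(2) ** 2)


class BloomFilter:
    def __init__(self, capacity, false_positive_rate=0.01):
        bits = bloom_bits_per_object(false_positive_rate)
        self.num_bits = max(1, math.ceil(capacity * bits))
        # The optimal number of hash functions is (m / n) * ln 2.
        self.num_hashes = max(1, round(bits * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key):
        # Double hashing: derive all positions from two halves of one digest.
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:16], "little") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


class BloomFilterGroup:
    """Hypothetical filter group: one Bloom filter per set-group (an assumption)."""

    def __init__(self, num_groups, objects_per_group):
        self.filters = [BloomFilter(objects_per_group) for _ in range(num_groups)]

    def insert(self, group_id, key):
        self.filters[group_id].add(key)

    def candidates(self, key):
        # The filters are independent, so these probes could run in parallel.
        return [gid for gid, f in enumerate(self.filters) if key in f]


bits_at_one_percent = round(bloom_bits_per_object(0.01), 1)  # 9.6

group = BloomFilterGroup(num_groups=4, objects_per_group=1000)
group.insert(1, b"key-42")
hits = group.candidates(b"key-42")  # always includes group 1; may include false positives
```

A Bloom filter never reports a false negative, so a lookup in this sketch can only return extra candidate groups, each of which would cost an unnecessary flash read. That is the trade-off an approximate index makes in exchange for its small memory footprint.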

This review was generated by AI and reviewed by a human editor.