QUICK REVIEW

[Paper Review] Nemo: A Low-Write-Amplification Cache for Tiny Objects on Log-Structured Flash Devices

Xufeng Yang, Tingting Tan|arXiv (Cornell University)|Mar 10, 2026
Advanced Data Storage Technologies | Citations: 0
One-Line Summary

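Nemo is a set-associative flash cache for tiny objects that minimizes application-level write amplification by using a small hash space and SG-level batching, and it preserves memory efficiency by combining approximate indexing with selective metadata offloading.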

ABSTRACT

Modern storage systems predominantly use flash-based SSDs as a cache layer due to their favorable performance and cost efficiency. However, in tiny-object workloads, existing flash cache designs still suffer from high write amplification. Even when deploying advanced log-structured flash devices (e.g., Zoned Namespace SSDs and Flexible Data Placement SSDs) with low device-level write amplification, application-level write amplification still dominates. This work proposes Nemo, which enhances set-associative cache design by increasing hash collision probability to improve set fill rate, thereby reducing application-level write amplification. To satisfy caching requirements, including high memory efficiency and low miss ratio, we introduce a bloom filter-based indexing mechanism that significantly reduces memory overhead, and adopt a hybrid hotness tracking to achieve low miss ratio without losing memory efficiency. Experimental results show that Nemo simultaneously achieves three key objectives for flash cache: low write amplification, high memory efficiency, and low miss ratio.

Research Motivation and Objectives

  • Motivate and quantify the high application-level write amplification (ALWA) in tiny-object flash caches and identify root causes.
  • Propose Nemo as a cache architecture that achieves near-ideal WA while preserving memory efficiency and caching performance.
  • Design Nemo components: small hash-space set-group (SG), SG-level flushing/eviction, PBFG-based approximate indexing, and selective in-memory/offloaded metadata.
  • Demonstrate that Nemo reduces ALWA and DLWA while maintaining acceptable memory overhead and performance.

Proposed Method

  • Analyze the sources of write amplification in hierarchical flash caches and model L2SWA (log-to-set write amplification) components.
  • Introduce Nemo’s architecture with a small in-memory hash space organized as Set-Groups (SGs) and SG-level batched writes to on-flash SG pools.
  • Employ Parallel Bloom Filter Group (PBFG) for approximate object indexing to reduce memory footprint and facilitate parallel lookups.
  • Use selective metadata offloading (keeping hot metadata in memory, offloading stable metadata to flash) and a hybrid 1-bit hotness tracker to minimize memory overhead.
  • Define write amplification of Nemo as WA(Nemo) = 1 / E[FR_SG] and provide rationale that SG-level batching yields high fill rates.
  • Outline three design challenges (C1–C3) and corresponding solutions to ensure well-filled SGs, efficient offloading, and effective eviction.
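
The relation WA(Nemo) = 1 / E[FR_SG] in the list above can be illustrated with a short calculation. This is a minimal sketch, not the authors' code: the function name `write_amplification` and the sample fill rates are hypothetical, chosen only so that the mean fill rate is 0.64, which would correspond to a WA of about 1.56.

```python
from statistics import mean


def write_amplification(fill_rates):
    """Estimate WA as 1 / E[FR_SG] from per-flush SG fill rates.

    Each fill rate is assumed to be the fraction of a flushed
    Set-Group occupied by newly written objects, in (0, 1].
    """
    if not fill_rates:
        raise ValueError("need at least one fill-rate sample")
    if any(not 0.0 < fr <= 1.0 for fr in fill_rates):
        raise ValueError("fill rates must lie in (0, 1]")
    return 1.0 / mean(fill_rates)


# Hypothetical samples, not measurements from the paper.
samples = [0.60, 0.64, 0.68]
wa = write_amplification(samples)
print(f"E[FR_SG] = {mean(samples):.2f}, WA = {wa:.2f}")
# prints "E[FR_SG] = 0.64, WA = 1.56"
```

Under this formula a perfectly filled SG (fill rate 1.0) gives WA = 1, so any design choice that raises the expected fill rate moves WA toward that bound.
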
Figure 1. Application-level write amplification comparison. “S” denotes Set, “SG” denotes Set-Group.

Experimental Results

Research Questions

  • RQ1: What are the major sources of write amplification in existing tiny-object flash caches (e.g., FairyWREN, Kangaroo) and how can they be mitigated?
  • RQ2: Can a set-group based, small-hash-space design with SG-level batching reduce ALWA to near-ideal levels without sacrificing cache performance or memory overhead?
  • RQ3: How can approximate indexing (PBFG) and selective metadata offloading sustain memory efficiency while supporting fast lookups for tiny objects?
  • RQ4: What is the impact of hotness tracking strategy on memory footprint and cache effectiveness in Nemo?

Key Findings

  • Nemo can achieve ALWA close to the theoretical optimum (1.56) with DLWA as low as 1 on existing log-structured SSDs.
  • Compared to prior designs, Nemo reduces flash writes by up to 90% while keeping memory overhead low (about 8.3 bits per object for metadata).
  • PBFG indexing enables in-memory indexing at around 9.6 bits per object for a 1% false-positive rate, significantly reducing memory cost.
  • SG-level batching and a FIFO on-flash SG pool align flush/eviction with SSD log-structured behavior to minimize write amplification.
  • Approximate indexing and hybrid hotness tracking preserve caching performance and memory efficiency, enabling practical deployment within CacheLib.
  • Compared to log-structured front tiers or traditional set-associative designs, Nemo eliminates log-to-set migration overhead and mitigates RMW-related amplification.
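
The figure of roughly 9.6 bits per object at a 1% false-positive rate matches the textbook sizing of a space-optimal Bloom filter, m/n = -ln(p) / (ln 2)^2. The sketch below checks that arithmetic. It assumes PBFG filters follow the standard Bloom filter formula, which this review does not confirm, and the helper names are hypothetical.

```python
import math


def bloom_bits_per_object(fpr):
    """Bits per object for a space-optimal Bloom filter at rate fpr."""
    if not 0.0 < fpr < 1.0:
        raise ValueError("false-positive rate must lie in (0, 1)")
    return -math.log(fpr) / (math.log(2) ** 2)


def optimal_hash_count(bits_per_object):
    """Number of hash functions that minimizes the false-positive rate."""
    return max(1, round(bits_per_object * math.log(2)))


bits = bloom_bits_per_object(0.01)
k = optimal_hash_count(bits)
print(f"{bits:.2f} bits/object, k = {k}")
# prints "9.59 bits/object, k = 7"
```

Halving the target false-positive rate costs about 1.44 extra bits per object under this formula, which is why approximate indexing trades a small, tunable lookup error for a large memory saving over exact per-object indexes.
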
Figure 2. Hierarchical cache.

This review was generated by AI and checked by a human editor.