QUICK REVIEW

[Paper Review] Eva-CiM: A System-Level Energy Evaluation Framework for Computing-in-Memory Architectures.

Di Gao, Dayane Reis|arXiv (Cornell University)|Jan 27, 2019

Parallel Computing and Optimization Techniques | Cited by 3

One-Sentence Summary

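Eva-CiM is a system-level energy evaluation framework for computing-in-memory (CiM) architectures. It integrates GEM5, McPAT, and DESTINY into a multi-level tool chain, from device to architecture, to produce high-confidence energy estimates for the whole system and to support rapid design space exploration. Exploration with the framework across configurations and device technologies shows 1.3–6.0× energy improvement for SRAM-based CiM and 2.0–7.9× for FeFET-RAM-based CiM.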

ABSTRACT

Computing-in-Memory (CiM) architectures aim to reduce costly data transfers by performing arithmetic and logic operations in memory and hence relieve the pressure due to the memory wall. However, determining whether a given workload can really benefit from CiM, which memory hierarchy and what device technology should be adopted by a CiM architecture requires in-depth study that is not only time consuming but also demands significant expertise in architectures and compilers. This paper presents an energy evaluation framework, Eva-CiM, for systems based on CiM architectures. Eva-CiM encompasses a multi-level (from device to architecture) comprehensive tool chain by leveraging existing modeling and simulation tools such as GEM5, McPAT [2] and DESTINY [3]. To support high-confidence prediction, rapid design space exploration and ease of use, Eva-CiM introduces several novel modeling/analysis approaches including models for capturing memory access and dependency-aware ISA traces, and for quantifying interactions between the host CPU and CiM modules. Eva-CiM can readily produce energy estimates of the entire system for a given program, a processor architecture, and the CiM array and technology specifications. Eva-CiM is validated by comparing with DESTINY [3] and [4], and enables findings including practical contributions from CiM-supported accesses, CiM-sensitive benchmarking as well as the pros and cons of increased memory size for CiM. Eva-CiM also enables exploration over different configurations and device technologies, showing 1.3-6.0X energy improvement for SRAM and 2.0-7.9X for FeFET-RAM, respectively.

Motivation and Objectives

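Address the difficulty of identifying which workloads truly benefit from CiM architectures, a difficulty that stems from complex system-level energy trade-offs.
Reduce the time and expertise needed to evaluate CiM system designs by providing an integrated, automated evaluation framework.
Enable accurate energy estimation across diverse CiM configurations, memory hierarchies, and device technologies.
Support rapid design space exploration to guide the choice of CiM architecture and technology for a specific workload.
Provide insight into how memory size and access patterns affect the energy of CiM systems.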

Proposed Method

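Integrate existing tools, GEM5 (processor simulation), McPAT (power modeling), and DESTINY (device-level memory modeling), into a unified multi-level simulation stack.
Develop models that capture memory accesses and dependency-aware instruction set architecture (ISA) traces, exposing the memory access patterns and data dependencies in CiM workloads.
Model the interactions between the host CPU and the CiM modules to quantify how offloading operations affects performance and energy.
Automate energy estimation for a complete system configuration: processor, memory hierarchy, CiM array, and device technology.
Support rapid exploration of device technologies (e.g., SRAM, FeFET-RAM) and memory sizes through configurable simulation parameters.
Validate the energy estimates against DESTINY and prior work to support high-confidence predictions.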
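
The flow described in this list can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Eva-CiM's actual implementation: the `Instr` trace format, the `find_cim_candidates` helper, the set of CiM-capable operations, and every per-operation energy value are hypothetical placeholders for what GEM5 traces, McPAT, and DESTINY would supply in the real tool chain. The sketch also assumes that loaded operands are not reused by other host instructions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    op: str       # "load", "store", "add", "mul", ...
    dst: str      # destination register (memory address for a store)
    srcs: tuple   # source registers (memory address for a load)


# Hypothetical: operations the CiM array is assumed to support.
CIM_OPS = {"add", "and", "or", "xor"}

# Hypothetical per-operation energies in pJ (placeholders, not measured data).
ENERGY_PJ = {"load": 10.0, "store": 10.0, "alu": 1.0, "cim_op": 12.0}


def find_cim_candidates(trace):
    """Find load-load-op-store groups that could run inside the memory.

    A group qualifies when every operand of a CiM-capable op comes straight
    from a load and the result is written back by the very next instruction.
    Returns a list of index sets, one set per offloadable group.
    """
    loaded = {}   # register -> index of the load that last wrote it
    groups = []
    for i, ins in enumerate(trace):
        if ins.op == "load":
            loaded[ins.dst] = i
            continue
        nxt = trace[i + 1] if i + 1 < len(trace) else None
        if (ins.op in CIM_OPS
                and all(s in loaded for s in ins.srcs)
                and nxt is not None
                and nxt.op == "store"
                and ins.dst in nxt.srcs):
            groups.append({loaded[s] for s in ins.srcs} | {i, i + 1})
        # The destination no longer holds a freshly loaded value.
        loaded.pop(ins.dst, None)
    return groups


def energy(trace, skip=frozenset()):
    """Sum host-side energy over all instructions whose index is not skipped."""
    return sum(ENERGY_PJ.get(ins.op, ENERGY_PJ["alu"])
               for i, ins in enumerate(trace) if i not in skip)


def cim_system_energy(trace, groups):
    """Energy when each group is replaced by a single in-memory operation."""
    removed = set().union(*groups)
    return energy(trace, removed) + len(groups) * ENERGY_PJ["cim_op"]


trace = [
    Instr("load", "r1", ("A",)),
    Instr("load", "r2", ("B",)),
    Instr("add", "r3", ("r1", "r2")),
    Instr("store", "C", ("r3",)),
    Instr("load", "r4", ("D",)),
    Instr("mul", "r5", ("r4", "r4")),   # not CiM-capable in this sketch
    Instr("store", "E", ("r5",)),
]

groups = find_cim_candidates(trace)
baseline = energy(trace)
with_cim = cim_system_energy(trace, groups)
print(f"offloadable groups: {groups}")
print(f"baseline {baseline} pJ, with CiM {with_cim} pJ, "
      f"ratio {baseline / with_cim:.2f}x")
```

Only the `add` group is offloaded here; the `mul` sequence stays on the host, so the system-level ratio is well below the local saving on the offloaded group.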

Experimental Results

Research Questions

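RQ1: Which workloads and application characteristics yield the largest energy savings when accelerated by a CiM architecture?
RQ2: How do different memory hierarchy configurations and memory sizes affect the energy of a CiM system?
RQ3: What are the relative energy advantages of SRAM and FeFET-RAM in CiM arrays across diverse workloads?
RQ4: How do data dependencies and memory access patterns affect the effectiveness of CiM offloading?
RQ5: To what extent do host-CiM interaction and communication overhead limit the energy gains of CiM architectures?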

Key Findings

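Using Eva-CiM, SRAM-based CiM architectures show a 1.3–6.0× energy improvement across the evaluated workloads.
For FeFET-RAM-based CiM, the energy improvement ranges from 2.0× to 7.9×, indicating strong potential for low-power applications.
The framework identifies the practical energy contribution of CiM-supported memory accesses, showing that not all workloads benefit equally.
Increasing the memory size of a CiM array improves energy only up to a point; beyond the optimal configuration, returns diminish.
CiM-sensitive benchmarking indicates that workloads with high data reuse and regular access patterns see the largest energy gains.
Interactions between the host CPU and the CiM modules significantly affect overall system energy, underscoring the need for co-optimized design.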
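
One way to see why workloads benefit unequally, and why host-CiM interaction matters, is a simple Amdahl-style estimate. The model below is an illustrative assumption introduced here, not a formula taken from the paper: `f` is the fraction of baseline energy spent in CiM-offloadable accesses, `k` is the local energy reduction on that fraction, and `overhead` is the extra host-CiM interaction energy expressed as a fraction of baseline energy.

```python
def system_improvement(f, k, overhead=0.0):
    """Estimate the system-level energy improvement ratio (baseline / CiM).

    f        -- fraction of baseline energy that is offloadable (0..1)
    k        -- energy reduction factor on the offloaded fraction (> 0)
    overhead -- added host-CiM interaction energy, as a fraction of baseline
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if k <= 0.0:
        raise ValueError("k must be positive")
    if overhead < 0.0:
        raise ValueError("overhead must be non-negative")
    return 1.0 / ((1.0 - f) + f / k + overhead)


# A workload with few offloadable accesses gains little, even for large k.
print(system_improvement(f=0.1, k=8.0))
# A workload dominated by offloadable accesses approaches the local factor k.
print(system_improvement(f=0.9, k=8.0))
# Interaction overhead can erase the gain entirely.
print(system_improvement(f=0.5, k=2.0, overhead=0.25))
```

Under this model, the offloadable fraction `f` caps the achievable gain at `1 / (1 - f)` regardless of how efficient the CiM array is, which is consistent with the observation that the contribution of CiM-supported accesses determines how much a workload benefits.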

This review was generated by AI and reviewed by human editors.