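[Paper Explained] The Blacklisting Memory Scheduler: Balancing Performance, Fairness and Complexity
The Blacklisting Memory Scheduler (BLISS) proposes a two-group memory scheduling approach: it dynamically classifies each application as either "vulnerable to interference" or "interference-causing" based on the number of consecutive requests served from it, and prioritizes the former. Compared with the best-performing previous scheduler, BLISS delivers 5% higher system performance and 25% better fairness while reducing critical path latency by 79% and hardware area by 43%.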
In a multicore system, applications running on different cores interfere at main memory. This inter-application interference degrades overall system performance and unfairly slows down applications. Prior works have developed application-aware memory schedulers to tackle this problem. State-of-the-art application-aware memory schedulers prioritize requests of applications that are vulnerable to interference, by ranking individual applications based on their memory access characteristics and enforcing a total rank order. In this paper, we observe that state-of-the-art application-aware memory schedulers have two major shortcomings. First, such schedulers trade off hardware complexity in order to achieve high performance or fairness, since ranking applications with a total order leads to high hardware complexity. Second, ranking can unfairly slow down applications that are at the bottom of the ranking stack. To overcome these shortcomings, we propose the Blacklisting Memory Scheduler (BLISS), which achieves high system performance and fairness while incurring low hardware complexity, based on two observations. First, we find that, to mitigate interference, it is sufficient to separate applications into only two groups. Second, we show that this grouping can be efficiently performed by simply counting the number of consecutive requests served from each application. We evaluate BLISS across a wide variety of workloads/system configurations and compare its performance and hardware complexity, with five state-of-the-art memory schedulers. Our evaluations show that BLISS achieves 5% better system performance and 25% better fairness than the best-performing previous scheduler while greatly reducing critical path latency and hardware area cost of the memory scheduler (by 79% and 43%, respectively), thereby achieving a good trade-off between performance, fairness and hardware complexity.
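Motivation and Goals
- Address the high hardware complexity and unfair slowdowns that per-application ranking causes in existing application-aware memory schedulers.
- Reduce scheduling latency and hardware cost while maintaining or improving system performance and fairness.
- Investigate whether a simplified two-group classification can effectively mitigate inter-application memory interference in multicore systems.
- Design a scheduler that meets the strict timing constraints of modern DDR memory protocols.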
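Proposed Method
- Classify applications into two groups, vulnerable to interference and interference-causing, based on the number of consecutive memory requests served from the same application.
- Apply a threshold to the consecutive-request count; an application that exceeds it is blacklisted and deprioritized.
- During scheduling, prioritize requests from non-blacklisted (vulnerable) applications over requests from blacklisted (interference-causing) applications.
- Use only two priority levels and avoid per-application ranking, which significantly reduces hardware complexity.
- Classify dynamically, adapting to runtime access patterns without long-term profiling.
- Integrate with existing memory subsystems and remain compatible with techniques such as subrow interleaving and source throttling.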
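The blacklisting steps listed above can be illustrated with a small software model. This is a minimal sketch, not the paper's hardware design. The names `Request`, `BlacklistTracker`, `on_served`, `clear` and `pick` are hypothetical, the threshold value of 4 is an assumed placeholder (this summary gives no number), and periodic clearing of the blacklist is an assumption about how the classification stays dynamic. The tie-breaking order among requests in the same group (row-buffer hits first, then older requests) is likewise assumed.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

# Assumed placeholder value; the summary above does not state a threshold.
BLACKLIST_THRESHOLD = 4


@dataclass
class Request:
    app_id: int    # application (core) that issued the request
    row_hit: bool  # True if the request hits the open row buffer
    arrival: int   # arrival time in cycles; smaller means older


@dataclass
class BlacklistTracker:
    threshold: int = BLACKLIST_THRESHOLD
    last_app: Optional[int] = None  # application served most recently
    streak: int = 0                 # consecutive requests served from last_app
    blacklisted: Set[int] = field(default_factory=set)

    def on_served(self, app_id: int) -> None:
        """Update the consecutive-request count after a request is served."""
        if app_id == self.last_app:
            self.streak += 1
        else:
            self.last_app = app_id
            self.streak = 1
        # Exceeding the threshold marks the application as interference-causing.
        if self.streak > self.threshold:
            self.blacklisted.add(app_id)
            self.streak = 0

    def clear(self) -> None:
        """Forget past classifications (assumed to be called periodically)."""
        self.blacklisted.clear()

    def pick(self, queue: List[Request]) -> Request:
        """Pick the next request using two priority levels only.

        Sort key: non-blacklisted first, then row hits, then older requests.
        """
        return min(
            queue,
            key=lambda r: (r.app_id in self.blacklisted, not r.row_hit, r.arrival),
        )


if __name__ == "__main__":
    tracker = BlacklistTracker()
    for _ in range(5):          # app 0 is served five times in a row
        tracker.on_served(0)
    pending = [Request(0, True, 1), Request(1, False, 5)]
    chosen = tracker.pick(pending)
    print("blacklisted:", tracker.blacklisted, "next app:", chosen.app_id)
```

The only per-application state in this model is one blacklist bit, plus a single shared counter and last-served ID. That is the intuition behind the lower hardware cost compared with schedulers that maintain a total rank order over all applications.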
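Experimental Results
Research Questions
- RQ1: Can a two-group classification of applications, based on consecutive-request patterns, effectively reduce memory interference without per-application ranking?
- RQ2: Does eliminating the total rank order reduce hardware complexity and scheduling latency while maintaining or improving performance?
- RQ3: Can such a simplified scheduler achieve better fairness than ranking-based approaches, which heavily deprioritize low-ranked applications?
- RQ4: How does BLISS compare with state-of-the-art schedulers across diverse workloads and system configurations?
- RQ5: What is the effect of combining BLISS with other interference mitigation techniques, such as subrow interleaving or source throttling?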
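Key Findings
- Across diverse workloads, BLISS achieves 5% higher system performance and 25% better fairness than the best-performing previous scheduler (TCM).
- Compared with TCM, a state-of-the-art ranking-based scheduler, BLISS reduces critical path latency by 79% and hardware area by 43%.
- A two-group classification based on consecutive-request counts identifies interference-causing applications effectively and at very low overhead.
- BLISS outperforms FRFCFS and other application-unaware schedulers in both performance and fairness while keeping complexity low.
- The interaction between BLISS and subrow interleaving can worsen unfairness for applications with high row-buffer locality, which suggests the two need to be co-designed.
- BLISS is compatible with complementary techniques such as source throttling and bank partitioning, which can further strengthen interference mitigation.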
This explanation was generated by AI and reviewed by a human editor.