QUICK REVIEW

[Paper Review] A Survey on Graph Neural Network Acceleration: Algorithms, Systems, and Customized Hardware

Shichang Zhang, Atefeh Sohrabizadeh|arXiv (Cornell University)|Jun 24, 2023

Advanced Graph Neural Networks | Cited by 10

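ONE-SENTENCE SUMMARY

This survey proposes a unified taxonomy of GNN acceleration spanning three categories (algorithms, commercial off-the-shelf (COTS) systems, and customized hardware), reviews acceleration techniques for both training and inference, and discusses future directions.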

ABSTRACT

Graph neural networks (GNNs) are emerging for machine learning research on graph-structured data. GNNs achieve state-of-the-art performance on many tasks, but they face scalability challenges when it comes to real-world applications that have numerous data and strict latency requirements. Many studies have been conducted on how to accelerate GNNs in an effort to address these challenges. These acceleration techniques touch on various aspects of the GNN pipeline, from smart training and inference algorithms to efficient systems and customized hardware. As the amount of research on GNN acceleration has grown rapidly, there lacks a systematic treatment to provide a unified view and address the complexity of relevant works. In this survey, we provide a taxonomy of GNN acceleration, review the existing approaches, and suggest future research directions. Our taxonomic treatment of GNN acceleration connects the existing works and sets the stage for further development in this area.

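MOTIVATION AND OBJECTIVES

Motivate the work with the scalability challenges and latency constraints that GNNs face on large-scale graphs.
Provide a unified taxonomy of GNN acceleration techniques.
Review existing algorithmic, system, and hardware approaches to GNN acceleration.
Discuss the limitations, applicability, and future research directions of GNN acceleration.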

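PROPOSED APPROACH

Propose a three-category taxonomy of GNN acceleration: algorithms, COTS systems, and customized hardware.
Review training acceleration methods that shrink the computation graph, either by modifying the graph or by sampling the computation.
Survey inference acceleration techniques, including pruning, quantization, and distillation.
Discuss optimizations for COTS systems, such as GPU kernel acceleration for sparse matrix operations and code generation for target hardware.
Examine customized hardware designs, including accelerators with different levels of per-layer customization and sparsity support.
Address acceleration for special graphs, such as heterogeneous and dynamic graphs.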
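
To make the sampling idea above concrete, here is a minimal sketch of node-wise neighbor sampling, one common way to bound the size of a mini-batch computation graph. The summary does not say which sampling schemes the survey favors, so the function names, the `fanouts` parameter, and the toy graph are all illustrative assumptions rather than the survey's own method.

```python
import random
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency list from (u, v) pairs."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def sample_computation_graph(adj, seeds, fanouts, rng):
    """Return one {node: sampled_neighbors} map per layer (hypothetical helper).

    fanouts[i] caps how many neighbors each frontier node keeps at hop i,
    so each frontier node contributes at most that many nodes to the next hop.
    """
    layers = []
    frontier = set(seeds)
    for fanout in fanouts:
        sampled = {}
        next_frontier = set()
        for node in sorted(frontier):  # sorted for reproducibility
            neighbors = adj.get(node, [])
            if len(neighbors) > fanout:
                neighbors = rng.sample(neighbors, fanout)
            sampled[node] = list(neighbors)
            next_frontier.update(neighbors)
        layers.append(sampled)
        frontier = next_frontier
    return layers


# Toy graph (assumed): node 0 has 10 neighbors, node 1 has 11.
edges = [(0, i) for i in range(1, 11)] + [(1, i) for i in range(11, 21)]
adj = build_adjacency(edges)
layers = sample_computation_graph(adj, seeds=[0], fanouts=[3, 2],
                                  rng=random.Random(0))
```

With fanouts of 3 and 2, the two-layer computation graph for seed node 0 touches at most 3 + 3 × 2 nodes beyond the seed, regardless of how large the true neighborhoods are.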

Figure 1. A taxonomy of GNN acceleration. Specifically, we discuss training algorithms in Section 3, inference algorithms in Section 4, COTS systems in Section 5, customized hardware in Section 6, and special graphs and GNNs in Section 7.

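RESULTS

Research Questions

RQ1: What GNN acceleration methods exist for training and for inference?
RQ2: How do graph modification and sampling strategies reduce the computation graph and latency in GNN training?
RQ3: What roles do COTS systems and customized hardware play in GNN acceleration, and what are the trade-offs between them?
RQ4: What special considerations do heterogeneous or dynamic graphs introduce for GNN acceleration?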

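Key Findings

The survey offers a unified view of GNN acceleration by grouping techniques into algorithms, COTS systems, and customized hardware.
It details training acceleration through graph modification (coarsening, sparsification, condensation) and sampling, both of which shrink the computation graph.
It covers inference acceleration methods such as pruning, quantization, and distillation.
It discusses optimizations for COTS systems, such as GPU kernel acceleration for sparse matrix operations and code generation for target hardware.
It surveys customized hardware designs with different levels of flexibility and sparsity support, including accelerators and FPGAs.
It emphasizes that acceleration research needs to account for special graphs, such as heterogeneous and dynamic graphs, and outlines future directions.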
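
The sparse matrix operations mentioned above typically come from neighbor aggregation, which can be written as a sparse-dense matrix product. The sketch below shows that operation over a compressed sparse row (CSR) layout in plain Python, only to illustrate the irregular access pattern that GPU kernels and sparsity-aware hardware target. The helper names and the toy data are assumptions, and a real system would use an optimized kernel rather than Python loops.

```python
def to_csr(num_nodes, edges):
    """Convert directed (src, dst) edges into CSR arrays indexed by dst.

    Row i of the resulting structure lists the in-neighbors of node i.
    """
    counts = [0] * num_nodes
    for _, dst in edges:
        counts[dst] += 1
    indptr = [0] * (num_nodes + 1)
    for i in range(num_nodes):
        indptr[i + 1] = indptr[i] + counts[i]
    indices = [0] * len(edges)
    cursor = indptr[:-1].copy()  # next free slot in each row
    for src, dst in edges:
        indices[cursor[dst]] = src
        cursor[dst] += 1
    return indptr, indices


def aggregate_sum(indptr, indices, features):
    """Sum-aggregate: out[i] is the sum of features[j] over in-neighbors j of i."""
    dim = len(features[0])
    num_nodes = len(indptr) - 1
    out = [[0.0] * dim for _ in range(num_nodes)]
    for i in range(num_nodes):
        # Row lengths differ per node, which is the source of irregularity.
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            for d in range(dim):
                out[i][d] += features[j][d]
    return out


# Toy example (assumed data): 3 nodes, 2-dimensional features.
edges = [(0, 1), (2, 1), (1, 0)]
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
indptr, indices = to_csr(3, edges)
out = aggregate_sum(indptr, indices, features)
```

Here node 1 receives the sum of the features of nodes 0 and 2, node 0 receives the features of node 1, and node 2, which has no in-neighbors, stays at zero.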

Figure 2. Illustration of graph modification methods: Graph Coarsening methods perform graph clustering and merge clusters of nodes into a super-node. Graph Sparsification methods remove less important edges. Graph Condensation methods generate a new condensed graph using a randomly initialized gene
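
The first two operations in the Figure 2 caption can be sketched in a few lines. This is a toy illustration under stated assumptions: the cluster assignment and the edge importance scores are taken as given inputs (how to compute them is exactly what individual methods differ on), and the function names are hypothetical. Graph condensation is left out because it involves learning a synthetic graph.

```python
def coarsen(edges, cluster_of):
    """Merge each cluster into one super-node and return weighted super-edges.

    cluster_of maps node -> cluster id (assumed to come from some clustering
    step). Edges inside a cluster are dropped; edges between two clusters are
    counted to give the super-edge weight.
    """
    weights = {}
    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        if cu == cv:
            continue
        key = (min(cu, cv), max(cu, cv))
        weights[key] = weights.get(key, 0) + 1
    return weights


def sparsify(scored_edges, keep_ratio):
    """Keep the top keep_ratio fraction of edges, ranked by importance score."""
    ranked = sorted(scored_edges.items(), key=lambda item: (-item[1], item[0]))
    keep = max(1, int(len(ranked) * keep_ratio))
    return dict(ranked[:keep])


# Coarsening: six nodes in two assumed clusters collapse to two super-nodes.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 2), (3, 5), (1, 3)]
cluster_of = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
super_edges = coarsen(edges, cluster_of)

# Sparsification: the scores below are made-up importance values.
scored = {(0, 1): 0.9, (1, 2): 0.1, (2, 3): 0.5, (3, 4): 0.7}
kept = sparsify(scored, keep_ratio=0.5)
```

In this toy case the eight original edges reduce to a single super-edge of weight 2 between clusters `a` and `b`, and sparsification keeps the two highest-scoring of the four scored edges.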

This review was generated by AI and reviewed by a human editor.