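[Paper Review] Deep Graph Library: A Graph-Centric, Highly-Performant Package for Graph Neural Networks
DGL presents a graph-centric framework that abstracts GNN computation into the generalized g-SpMM and g-SDDMM primitives. This enables framework-neutral deployment (PyTorch, TensorFlow, MXNet) and delivers higher speed and better memory efficiency across benchmarks.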
Advancing research in the emerging field of deep graph learning requires new tools to support tensor computation over graphs. In this paper, we present the design principles and implementation of Deep Graph Library (DGL). DGL distills the computational patterns of GNNs into a few generalized sparse tensor operations suitable for extensive parallelization. By advocating graph as the central programming abstraction, DGL can perform optimizations transparently. By cautiously adopting a framework-neutral design, DGL allows users to easily port and leverage the existing components across multiple deep learning frameworks. Our evaluation shows that DGL significantly outperforms other popular GNN-oriented frameworks in both speed and memory consumption over a variety of benchmarks and has little overhead for small scale workloads.
Research Motivation and Goals
- Distill GNN computations into a small set of generic, optimizable primitives (g-SpMM and g-SDDMM).
- Make graph the central programming abstraction to simplify user code and enable transparent optimization.
- Design for framework neutrality to ease porting across PyTorch, TensorFlow, and MXNet while preserving performance.
- Achieve high speed and memory efficiency through optimized parallelization strategies and fused computations.
Proposed Method
- Formalize GNN message passing as generalized SpMM (g-SpMM) and generalized SDDMM (g-SDDMM) operations.
- Develop parallelization strategies (node-parallel for g-SpMM, edge-parallel for g-SDDMM) and discuss how the choice of sparse format (CSR, CSC, or COO) affects performance.
- Implement DGLGraph as the central data structure with framework shims and automatic format switching to optimize forward and backward passes.
- Provide framework-neutral design by minimizing framework changes; use DLPack for tensor sharing and create differentiable operators for backpropagation.
- Expose graph-centric APIs (g.update_all, g.apply_edges) to compose message-passing with user-defined functions without materializing intermediate tensors.
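The two primitives and their parallelization strategies can be illustrated with a minimal pure-Python sketch. This is not DGL's implementation or API: the function names (`g_sddmm`, `g_spmm`, `in_edges_by_destination`), the fixed sum reducer, and the toy graph are assumptions made for illustration. The edge-wise loop mirrors an edge-parallel schedule, and the loop over destination nodes mirrors a node-parallel schedule in which each output row has a single writer.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Edge = Tuple[int, int]  # (source node, destination node)


def g_sddmm(edges: List[Edge], x: List[Vector],
            phi: Callable[[Vector, Vector], Vector]) -> List[Vector]:
    """Edge-wise primitive: one output per edge, computed from its endpoints.

    Every iteration is independent of the others, which is why an
    edge-parallel schedule is a natural fit.
    """
    return [phi(x[u], x[v]) for u, v in edges]


def in_edges_by_destination(edges: List[Edge], num_nodes: int) -> List[List[int]]:
    """Group edge ids by destination node (a CSC-like view of the graph)."""
    grouped: List[List[int]] = [[] for _ in range(num_nodes)]
    for eid, (_, v) in enumerate(edges):
        grouped[v].append(eid)
    return grouped


def g_spmm(edges: List[Edge], x: List[Vector], w: List[Vector],
           phi: Callable[[Vector, Vector], Vector],
           num_nodes: int, out_dim: int) -> List[Vector]:
    """Node-wise primitive: sum phi(x[u], w[e]) over each node's in-edges.

    The outer loop runs over destination nodes (node-parallel style).
    Messages are folded into the accumulator as they are produced, so no
    per-edge message list is stored. The reducer is fixed to sum here
    (an assumption of this sketch).
    """
    out: List[Vector] = []
    for eids in in_edges_by_destination(edges, num_nodes):
        acc = [0.0] * out_dim
        for eid in eids:
            u, _ = edges[eid]
            msg = phi(x[u], w[eid])
            acc = [a + m for a, m in zip(acc, msg)]
        out.append(acc)
    return out


# Toy graph (hypothetical): 3 nodes, 3 directed edges.
edges = [(0, 2), (1, 2), (0, 1)]
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # node features
w = [[1.0], [0.5], [2.0]]                 # one scalar weight per edge

# g-SDDMM: dot product of the two endpoint features on every edge.
scores = g_sddmm(edges, x, lambda xu, xv: [sum(a * b for a, b in zip(xu, xv))])

# g-SpMM: scale the source feature by the edge weight, then sum per destination.
aggregated = g_spmm(edges, x, w, lambda xu, we: [we[0] * a for a in xu],
                    num_nodes=3, out_dim=2)
```

With a dot-product message, `g_sddmm` behaves like the edge scoring step in attention-style models; with a weighted-copy message and sum reducer, `g_spmm` reduces to an ordinary sparse-dense matrix product.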
Experimental Results
Research Questions
- RQ1: Can graph neural network computations be effectively captured by two generalized primitives (g-SpMM and g-SDDMM)?
- RQ2: Does a graph-centric, framework-neutral package deliver superior speed and memory efficiency across a diverse set of GNN models and datasets?
- RQ3: What parallelization strategies best exploit GPU/CPU hardware for g-SpMM and g-SDDMM?
- RQ4: How much porting effort is required to move GNN models between PyTorch, TensorFlow, and MXNet within a single framework-neutral library?
Key Findings
- DGL significantly outperforms other popular GNN-oriented frameworks in both speed and memory consumption across benchmarks.
- g-SpMM and g-SDDMM kernels fuse message computation with aggregation, which reduces memory traffic and enables training on graphs too large for PyG (e.g., PyG runs out of memory on ML-10m).
- On CPU, DGL achieves substantial speedups (1.9x to 64x) over PyG due to higher CPU utilization of its kernels.
- Mini-batch training with cluster sampling (CS) yields about 1.56x speedup for GAT when using DGL.
- DGL provides framework-neutral backends with low overhead, achieving competitive performance relative to framework-specific implementations.
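The memory effect of fusing message computation with aggregation can be estimated with a rough back-of-the-envelope sketch. The function name `message_passing_bytes` and the graph sizes below are hypothetical and are not figures from the paper. The estimate counts only the per-edge message buffer and the per-node output, and ignores input features, gradients, and framework overhead.

```python
def message_passing_bytes(num_nodes: int, num_edges: int, feat_dim: int,
                          bytes_per_value: int = 4) -> dict:
    """Estimate buffer sizes for one message-passing step (float32 by default)."""
    # An unfused implementation stores one message vector per edge
    # before reducing them into one output vector per node.
    message_buffer = num_edges * feat_dim * bytes_per_value
    output_buffer = num_nodes * feat_dim * bytes_per_value
    return {
        "unfused": message_buffer + output_buffer,
        "fused": output_buffer,  # messages are reduced as they are produced
    }


# Hypothetical graph: 1,000 nodes, 50,000 edges, 64-dimensional features.
est = message_passing_bytes(num_nodes=1_000, num_edges=50_000, feat_dim=64)
ratio = est["unfused"] / est["fused"]
```

Under these assumptions the unfused buffers grow with the number of edges while the fused output grows only with the number of nodes, so the gap widens on denser graphs.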
This review was generated by AI and reviewed by a human editor.