[Paper Review] Distributed Parallel Inference on Large Factor Graphs
This paper proposes DBRSplash, a distributed parallel inference algorithm for large factor graphs that combines over-partitioned graph cuts, belief residual scheduling, and uniform work Splash operations to achieve linear to super-linear speedup on a 120-processor cluster. It improves load balancing and convergence on irregular graphs by decoupling scheduling across processors and by prioritizing updates according to belief residuals rather than message residuals, enabling efficient large-scale inference in distributed memory systems.
As computer clusters become more common and the size of the problems encountered in the field of AI grows, there is an increasing demand for efficient parallel inference algorithms. We consider the problem of parallel inference on large factor graphs in the distributed memory setting of computer clusters. We develop a new efficient parallel inference algorithm, DBRSplash, which incorporates over-segmented graph partitioning, belief residual scheduling, and uniform work Splash operations. We empirically evaluate the DBRSplash algorithm on a 120 processor cluster and demonstrate linear to super-linear performance gains on large factor graph models.
Motivation & Objective
- To address the challenge of efficient distributed inference on large, irregular factor graphs in cluster environments.
- To improve load balancing and reduce communication overhead in distributed belief propagation through over-partitioned graph cuts.
- To enhance scheduling efficiency on irregular graphs by replacing message-based with belief-based residual scheduling.
- To maintain parallel optimality while scaling to large clusters in a distributed memory setting.
- To demonstrate linear to super-linear performance scaling on real-world AI workloads using a 120-processor cluster.
Proposed method
- Formalizes state partitioning as a weighted graph cut problem, using over-partitioning to improve load balance at the cost of increased communication.
- Introduces belief residual scheduling, which uses changes in belief estimates to prioritize vertex updates, improving convergence uniformity.
- Employs uniform work Splash operations that apply fixed-size BFS-based update sequences to prevent high-degree vertices from dominating computation.
- Decouples scheduling across processors using distributed queues, enabling scalable and asynchronous execution in a message-passing model.
- Adapts the ResidualSplash algorithm from shared memory to distributed memory by rearchitecting scheduling and partitioning for scalability.
- Uses a message-passing model in which processors communicate only via messages, avoiding shared memory bottlenecks.
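The scheduling and Splash ideas listed above can be illustrated with a minimal single-process sketch. It is not the paper's implementation: the function names, the use of vertex degree as the per-vertex work estimate, and the L1 form of the belief residual are assumptions made for illustration, and the distributed queues and message exchange are left out.

```python
from collections import deque


def belief_residual(old_belief, new_belief):
    """L1 change between two belief vectors (assumed residual definition)."""
    return sum(abs(a - b) for a, b in zip(old_belief, new_belief))


def next_root(residuals):
    """Pick the vertex whose belief residual is currently largest."""
    return max(residuals, key=residuals.get)


def build_splash(adjacency, root, work_budget):
    """Grow a BFS ordering from `root` under a fixed work budget.

    The work of a vertex is assumed to be its degree, a stand-in for the
    cost of recomputing its outgoing messages. The root is always included,
    even if its own cost exceeds the budget.
    """
    splash, visited = [root], {root}
    work = len(adjacency[root])
    frontier = deque([root])
    while frontier:
        u = frontier.popleft()
        for v in adjacency[u]:
            if v in visited:
                continue
            cost = len(adjacency[v])
            if work + cost > work_budget:
                continue  # skip vertices that would exceed the budget
            visited.add(v)
            work += cost
            splash.append(v)
            frontier.append(v)
    return splash, work


# Hypothetical toy graph: hub 0 with five neighbours, plus a chain 1-6-7.
adjacency = {
    0: [1, 2, 3, 4, 5],
    1: [0, 6],
    2: [0],
    3: [0],
    4: [0],
    5: [0],
    6: [1, 7],
    7: [6],
}

root = next_root({0: 0.1, 6: 0.7, 3: 0.2})
splash, work = build_splash(adjacency, root, work_budget=8)
```

Bounding a Splash by work rather than by vertex count means that a Splash touching the high-degree hub contains fewer vertices than one grown along the chain, so each Splash costs roughly the same amount of computation.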
Experimental results
Research questions
- RQ1: Can over-partitioned graph cuts improve load balancing in distributed belief propagation without incurring prohibitive communication costs?
- RQ2: Does belief residual scheduling outperform message residual scheduling in convergence speed and accuracy on irregular, large factor graphs?
- RQ3: Can uniform work Splash operations prevent high-degree vertices from dominating computation and improve scheduling fairness?
- RQ4: Is it possible to achieve super-linear speedup in distributed belief propagation on large factor graphs using a scalable, message-passing algorithm?
- RQ5: How does DBRSplash scale across a 120-processor cluster on real-world AI workloads compared to prior methods?
Key findings
- DBRSplash achieved linear to super-linear speedup on a 120-processor cluster for large factor graph models.
- On the uw-systems MLN, DBRSplash with belief residual scheduling achieved faster convergence and lower average L1 error than message-based scheduling.
- For the cora-1 MLN, belief residual scheduling converged where message-based scheduling failed; the failure is attributed to high-degree variables.
- Over-partitioning by a factor of 10 reduced load imbalance and improved overall running time, despite increased communication costs.
- On small graphs like uw-languages, performance degraded beyond 20 processors due to increased communication and reduced accuracy, highlighting the importance of graph size relative to cluster size.
- Cumulative edge update counts showed that belief residual scheduling reduced total work by up to 30% compared to message-based scheduling on the cora-1 MLN.
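The load-balancing effect of over-partitioning reported above can be illustrated with a small sketch. Everything in it is hypothetical: the per-vertex work values are made up, pieces are contiguous index ranges rather than graph cuts, pieces are assigned round-robin rather than by the paper's assignment scheme, and communication cost is not modeled at all.

```python
def imbalance(work, num_procs, factor):
    """Maximum processor load divided by mean load (1.0 is perfectly balanced).

    The vertices are split into num_procs * factor contiguous pieces, and
    the pieces are dealt to processors round-robin (an assumed scheme).
    """
    num_pieces = num_procs * factor
    piece_size = len(work) // num_pieces  # assumes an even split
    loads = [0.0] * num_procs
    for piece in range(num_pieces):
        start = piece * piece_size
        loads[piece % num_procs] += sum(work[start:start + piece_size])
    return max(loads) / (sum(loads) / num_procs)


# Hypothetical workload: 8 "hot" vertices followed by 24 cheap ones.
work = [10] * 8 + [1] * 24

no_over_partitioning = imbalance(work, num_procs=4, factor=1)
with_over_partitioning = imbalance(work, num_procs=4, factor=8)
```

With one piece per processor the hot region lands on a single processor, whereas many small pieces spread it across all of them. In a real run the extra pieces also cut more edges, which is the communication cost the review mentions.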
This review was created by AI and reviewed by human editors.