QUICK REVIEW

[Paper Review] Orchestrating Multimodal DNN Workloads in Wireless Neural Processing

Sai Xu, Kai-Kit Wong | arXiv (Cornell University) | Mar 2, 2026
Advanced Neural Network Applications | Cited by 0
One-Sentence Summary

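This paper proposes O-WiN, a framework that jointly optimizes wireless transmission and multi-core DNN execution for multimodal inference, and introduces RTFS and PACS to compare sequential scheduling against pipelined scheduling. On heterogeneous multimodal workloads, PACS significantly outperforms RTFS by overlapping communication with computation.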

ABSTRACT

In edge inference, wireless resource allocation and accelerator-level deep neural network (DNN) scheduling have yet to be co-optimized in an end-to-end manner. The lack of coordination between wireless transmission and accelerator-level DNN execution prevents efficient overlap, leading to higher end-to-end inference latency. To address this issue, this paper investigates multimodal DNN workload orchestration in wireless neural processing (WNP), a paradigm that integrates wireless transmission and multi-core accelerator execution into a unified end-to-end pipeline. First, we develop a unified communication-computation model for multimodal DNN execution and formulate the corresponding optimization problem. Second, we propose O-WiN, a framework that orchestrates DNN workloads in WNP through two tightly coupled stages: simulation-based optimization and runtime execution. Third, we develop two algorithms, RTFS and PACS. RTFS schedules communication and computation sequentially, whereas PACS interleaves them to enable pipeline parallelism by overlapping wireless data transfer with accelerator-level DNN execution. Simulation results demonstrate that PACS significantly outperforms RTFS under high modality heterogeneity by better masking wireless latency through communication-computation overlap, thereby highlighting the effectiveness of communication-computation pipelining in accelerating multimodal DNN execution in WNP.

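Research Motivation and Goals

  • Advance end-to-end optimization of wireless data transmission and accelerator-level DNN execution for multimodal workloads.
  • Model a unified communication-computation pipeline in wireless neural processing (WNP).
  • Develop an orchestration framework (O-WiN) and two scheduling algorithms that minimize end-to-end completion time.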

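Proposed Method

  • Build a unified model that links OFDMA-based uplink transmission to precedence-constrained parallel-machine scheduling on a multi-core accelerator.
  • Define jobs as modality-specific DNN operators with DAG dependencies, and map them to cores under network-on-chip (NoC) bandwidth constraints.
  • Introduce O-WiN, which consists of four modular components: communication system, computing platform, optimization algorithm, and performance evaluation.
  • Develop two heuristic algorithms: RTFS (sequential: transmit first, then compute) and PACS (pipeline-aware co-scheduling).
  • Use simulation to measure completion time and per-core NoC bandwidth across different core counts, subcarrier counts, and compression factors.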
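
The difference between sequential and pipelined scheduling can be illustrated with a toy model. The sketch below is not the paper's RTFS or PACS algorithm. It assumes each modality is a single job with a fixed upload finish time and a fixed compute time, ignores DAG dependencies and NoC bandwidth, and assigns jobs to cores greedily. The names `Job`, `list_schedule`, `sequential_makespan`, and `pipelined_makespan` are hypothetical and chosen only for illustration.

```python
import heapq
from typing import List, Tuple

# Hypothetical job record: (modality, upload_finish_time, compute_time).
Job = Tuple[str, float, float]


def list_schedule(jobs: List[Tuple[float, float]], cores: int) -> float:
    """Greedy list scheduling of (release_time, compute_time) jobs on `cores` cores.

    Jobs are taken in order of release time (ties keep input order) and each
    one is placed on the core that becomes free earliest.
    """
    free_at = [0.0] * cores
    heapq.heapify(free_at)
    makespan = 0.0
    for release, cost in sorted(jobs, key=lambda rc: rc[0]):
        core_free = heapq.heappop(free_at)
        finish = max(core_free, release) + cost  # cannot start before data arrives
        heapq.heappush(free_at, finish)
        makespan = max(makespan, finish)
    return makespan


def sequential_makespan(jobs: List[Job], cores: int) -> float:
    """Transmit-then-compute: no job starts until every upload has finished."""
    barrier = max(upload for _, upload, _ in jobs)
    return list_schedule([(barrier, cost) for _, _, cost in jobs], cores)


def pipelined_makespan(jobs: List[Job], cores: int) -> float:
    """Overlapped: each job may start as soon as its own upload has finished."""
    return list_schedule([(upload, cost) for _, upload, cost in jobs], cores)


# Made-up numbers: heterogeneous modalities with very different upload times.
heterogeneous = [("audio", 1.0, 2.0), ("image", 4.0, 3.0), ("video", 9.0, 1.0)]
# Same compute costs, but every upload finishes at the same time.
homogeneous = [("audio", 4.0, 2.0), ("image", 4.0, 3.0), ("video", 4.0, 1.0)]

print(sequential_makespan(heterogeneous, cores=2))  # 12.0
print(pipelined_makespan(heterogeneous, cores=2))   # 10.0
print(sequential_makespan(homogeneous, cores=2))    # 7.0
print(pipelined_makespan(homogeneous, cores=2))     # 7.0
```

With these made-up numbers, the overlapped schedule finishes earlier only when upload times differ across modalities. When all uploads finish together the two schedules coincide, which is consistent with the summary's claim that pipelining helps most under high modality heterogeneity.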

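Experimental Results

Research Questions

  • RQ1: How can wireless resource allocation and accelerator-level DNN scheduling be jointly optimized in multimodal inference to minimize end-to-end latency?
  • RQ2: What benefits does an overlapped communication-computation pipeline in WNP offer over approaches that wait for all transfers to complete?
  • RQ3: How do modality heterogeneity and system parameters (core count, NoC budget, OFDMA subcarriers) affect end-to-end performance?
  • RQ4: Does PACS outperform RTFS by overlapping data transfer and computation across subgraphs?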

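Key Findings

  • Under high modality heterogeneity, PACS significantly outperforms RTFS because communication-computation overlap better masks wireless latency.
  • A unified pipeline framework (O-WiN) enables end-to-end optimization spanning wireless transmission and DNN execution.
  • Simulation results show gains from pipeline parallelism and stage overlap; a sensitivity analysis over core count, subcarriers, delay factor, and compression is also provided.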


This review was generated by AI and reviewed by human editors.