QUICK REVIEW

[Paper Review] Reinforcement Learning With LLMs Interaction For Distributed Diffusion Model Services

Hongyang Du, Ruichen Zhang|arXiv (Cornell University)|Nov 18, 2023
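Recommender Systems and Techniques | Cited by 13
One-Sentence Summary

This paper proposes a user-centric interactive AI framework for distributed diffusion-model AIGC. It uses RL with LLM-driven feedback and a GDM-based edge inference scheme to optimize QoE and energy efficiency.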

ABSTRACT

Distributed Artificial Intelligence-Generated Content (AIGC) has attracted significant attention, but two key challenges remain: maximizing subjective Quality of Experience (QoE) and improving energy efficiency, which are particularly pronounced in widely adopted Generative Diffusion Model (GDM)-based image generation services. In this paper, we propose a novel user-centric Interactive AI (IAI) approach for service management, with a distributed GDM-based AIGC framework that emphasizes efficient and cooperative deployment. The proposed method restructures the GDM inference process by allowing users with semantically similar prompts to share parts of the denoising chain. Furthermore, to maximize the users' subjective QoE, we propose an IAI approach, i.e., Reinforcement Learning With Large Language Models Interaction (RLLI), which utilizes Large Language Model (LLM)-empowered generative agents to replicate user interaction, providing real-time and subjective QoE feedback aligned with diverse user personalities. Lastly, we present the GDM-based Deep Deterministic Policy Gradient (G-DDPG) algorithm, adapted to the proposed RLLI framework, to allocate communication and computing resources effectively while accounting for subjective user traits and dynamic wireless conditions. Simulation results demonstrate that G-DDPG improves total QoE by 15% compared with the standard DDPG algorithm.

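Motivation and Objectives

  • Elicit and maximize subjective QoE in AIGC services whose users have diverse personalities.
  • Develop a distributed GDM-based AIGC framework that reduces energy consumption and latency.
  • Introduce RLLI (Reinforcement Learning With LLMs Interaction), which uses LLM agents to provide real-time QoE feedback.
  • Formulate and solve a resource allocation problem that jointly optimizes the number of denoising steps and the transmission power under energy and QoE constraints.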

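Proposed Method

  • Reformulate GDM inference as a distributed, multi-device denoising process in which semantically similar prompts share diffusion steps.
  • Condition a latent GDM on text prompts to enable shared denoising paths (Algorithm 1).
  • Propose RLLI, which combines deep reinforcement learning with LLM-generated QoE feedback to guide resource allocation.
  • Model QoE with user personalities u_k that follow the Big Five personality traits, so that the score reflects aesthetic preferences.
  • Develop G-DDPG (GDM-based DDPG) to jointly optimize the server/device denoising steps t and the transmission power P under energy and QoE constraints.
  • Provide an edge-centric deployment scenario (edge server to multiple devices) and analyze the trade-off between energy consumption and time.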
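The shared-denoising idea in the list above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the paper's Algorithm 1: the bag-of-words `embed` function stands in for a real text encoder, the greedy `group_prompts` rule and its `threshold` are hypothetical, and the step counts (`total=50`, `shared=20`) are arbitrary. The sketch only counts how many denoising steps are executed when each group of similar prompts runs the shared part of the chain once.

```python
import math
from collections import Counter

def embed(prompt):
    """Toy bag-of-words embedding (assumption); a real system would use a text encoder."""
    return Counter(prompt.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def group_prompts(prompts, threshold=0.5):
    """Greedy grouping (hypothetical rule): a prompt joins the first group
    whose first member is at least `threshold` similar to it."""
    groups = []
    for p in prompts:
        for g in groups:
            if cosine(embed(p), embed(g[0])) >= threshold:
                g.append(p)
                break
        else:
            groups.append([p])
    return groups

def total_steps(groups, total=50, shared=20):
    """Steps executed when each group runs `shared` steps once and every
    member then finishes its own `total - shared` steps."""
    return sum(shared + len(g) * (total - shared) for g in groups)

prompts = [
    "a man sits in the street",
    "a man sits in the park",
    "a red sports car at night",
]
groups = group_prompts(prompts)
independent = len(prompts) * 50      # every user runs the full chain alone
cooperative = total_steps(groups)    # similar prompts share the first 20 steps
print(groups)
print(independent, cooperative)
```

In this toy setting the two "man sits" prompts form one group, so the shared 20 steps run once for both of them and the executed step count drops from 150 to 130. How many steps can be shared before image quality suffers is exactly the kind of decision the paper assigns to the resource allocation policy.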
Figure 1: The basic framework of interactive AI and four images generated with the prompt “A man sits in the street”. Part A, a man engrossed in a book against vibrant street art, appeals to users with high openness. Part B is a formally dressed man on a clean street, resonating with users high i

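Experimental Results

Research Questions

  • RQ1 (C1): How can human-aware, subjective QoE feedback be obtained efficiently to guide resource allocation?
  • RQ2 (C2): How can network capabilities be used to achieve energy-efficient, low-latency GDM inference while accounting for user personality?
  • RQ3: Given the semantic similarity of prompts, how can distributed GDM inference maximize total QoE under energy, latency, and QoE constraints?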

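Key Findings

  • Simulation results show that G-DDPG improves total QoE by 15% compared with the standard DDPG algorithm.
  • Distributed GDM inference reduces energy and time by sharing denoising steps among semantically similar prompts.
  • LLM-driven generative agents simulate diverse user personalities to provide QoE feedback, reducing the need for human feedback.
  • Edge-based collaborative inference improves privacy by keeping the final content on edge devices while still enabling energy-efficient cooperation.
  • The RLLI framework combines deep reinforcement learning with LLM feedback to adapt to dynamic wireless environments and user traits.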
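The feedback loop behind these findings can be sketched in a few lines. This is a toy under loudly stated assumptions, not the paper's method: `simulated_agent_qoe` is a hand-written rule standing in for an LLM-empowered generative agent, `energy` is a made-up cost model, and plain random search stands in for G-DDPG. The sketch shows only the shape of the loop: propose an allocation of denoising steps and transmission power per user, discard it if it exceeds the energy budget, and otherwise score it with the simulated users.

```python
import random

def simulated_agent_qoe(openness, steps, power):
    """Stand-in for an LLM generative agent (hypothetical scoring rule):
    more denoising steps raise quality, higher power lowers latency."""
    quality = steps / 50.0
    latency_penalty = 0.2 / power
    return openness * quality - latency_penalty

def energy(steps, power):
    """Toy energy model (assumption): compute cost per step plus transmit power."""
    return 0.1 * steps + power

def random_search(users, budget, trials=200, seed=0):
    """Random search over (steps, power) per user; a simple substitute for G-DDPG."""
    rng = random.Random(seed)
    best_alloc, best_qoe = None, float("-inf")
    for _ in range(trials):
        alloc = [(rng.randint(10, 50), rng.uniform(0.5, 2.0)) for _ in users]
        if sum(energy(t, p) for t, p in alloc) > budget:
            continue  # infeasible: violates the energy constraint
        qoe = sum(simulated_agent_qoe(u, t, p) for u, (t, p) in zip(users, alloc))
        if qoe > best_qoe:
            best_alloc, best_qoe = alloc, qoe
    return best_alloc, best_qoe

users = [0.9, 0.5, 0.2]  # hypothetical openness score per user
best_alloc, best_qoe = random_search(users, budget=15.0)
print(best_alloc, best_qoe)
```

Under this scoring rule a feasible allocation tends to give more denoising steps to users whose simulated score rewards quality most. A learned policy such as G-DDPG would replace the random proposals with actions conditioned on the wireless state and user traits.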
Figure 2: The working principle of the GDM and motivations behind the distributed denoising inference process. Part A depicts the cooperative inference process across devices where, starting with Gaussian noise on Device 2, it denoises using Prompt 2 before Devices 1 and 3 continue in succession towa

This review was generated by AI and reviewed by human editors.