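[Paper Summary] Risk-Based Optimization of Virtual Reality over Terahertz Reconfigurable Intelligent Surfaces
This paper proposes a risk-aware user association and scheduling framework for virtual reality (VR) over terahertz (THz) reconfigurable intelligent surfaces (RISs). It quantifies risk with the entropic value-at-risk (EVaR), applies Lyapunov optimization, and uses deep policy learning to cope with the stochastic channel.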
In this paper, the problem of associating reconfigurable intelligent surfaces (RISs) to virtual reality (VR) users is studied for a wireless VR network. In particular, this problem is considered within a cellular network that employs terahertz (THz) operated RISs acting as base stations. To provide a seamless VR experience, high data rates and reliable low latency need to be continuously guaranteed. To address these challenges, a novel risk-based framework based on the entropic value-at-risk is proposed for rate optimization and reliability performance. Furthermore, a Lyapunov optimization technique is used to reformulate the problem as a linear weighted function, while ensuring that higher order statistics of the queue length are maintained under a threshold. To address this problem, given the stochastic nature of the channel, a policy-based reinforcement learning (RL) algorithm is proposed. Since the state space is extremely large, the policy is learned through a deep-RL algorithm. In particular, a recurrent neural network (RNN) RL framework is proposed to capture the dynamic channel behavior and improve the speed of conventional RL policy-search algorithms. Simulation results demonstrate that the maximal queue length resulting from the proposed approach is only within 1% of the optimal solution. The results show a high accuracy and fast convergence for the RNN with a validation accuracy of 91.92%.
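For reference, the entropic value-at-risk named in the abstract has a standard definition, reproduced below. The abstract does not say which random variable the authors apply it to, nor their notation, so the symbols X, z, and α here are assumptions chosen for illustration.

```latex
% Standard EVaR definition (not the paper's exact formulation).
% X: a loss-type random variable, e.g. delay or queue length (assumed).
% M_X(z): moment-generating function of X; 1 - \alpha: confidence level.
\begin{align}
  \mathrm{EVaR}_{1-\alpha}(X)
    &= \inf_{z > 0} \left\{ \frac{1}{z} \ln \frac{M_X(z)}{\alpha} \right\},
    \qquad M_X(z) = \mathbb{E}\!\left[ e^{zX} \right], \quad \alpha \in (0, 1], \\
  % Gaussian special case, X \sim \mathcal{N}(\mu, \sigma^2):
  \mathrm{EVaR}_{1-\alpha}(X)
    &= \mu + \sigma \sqrt{-2 \ln \alpha}.
\end{align}
```

Because ln M_X(z) is the cumulant-generating function of X, the measure depends on all moments of X rather than on the mean alone, and it upper-bounds both VaR and CVaR at the same confidence level.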
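Research Motivation and Objectives
- Address the challenges of delivering high-rate, low-latency VR in THz RIS-enabled networks.
- Propose a risk-based framework for rate and reliability optimization, built on the entropic value-at-risk (EVaR).
- Use Lyapunov optimization to guarantee queue stability and keep delay statistics under control.
- Develop a policy-based algorithm that uses deep learning to learn user association over a large state space.
- Demonstrate near-optimal performance and fast convergence through simulation.
Proposed Method
- Model the downlink VR service as a scenario in which RISs reflect signals to serve mobile users.
- Introduce an EVaR-based risk measure to capture higher-order delay statistics.
- Apply Lyapunov optimization to recast the problem as a linear weighted objective with queue-stability guarantees.
- Propose a policy-based algorithm that performs user association under stochastic channel conditions.
- Learn the policy with a deep learning framework to handle the large state space and the dynamic channel behavior.
- Demonstrate the efficiency and convergence of the framework in simulation.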
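The Lyapunov step in the list above can be illustrated with a queue-weighted (max-weight) association rule, which is the usual way a drift bound turns into a linear weighted objective. This is a minimal sketch under stated assumptions, not the authors' algorithm: the function names, the one-user-per-RIS constraint, the requirement that there are at least as many users as RISs, and the toy numbers are all hypothetical, and the exhaustive search only suits very small instances.

```python
import itertools
import numpy as np

def max_weight_association(queues, rates):
    """Pick the user served by each RIS so that sum(queue * rate) is maximal.

    queues[u]   : current backlog of user u (assumed model)
    rates[u, r] : rate user u would get from RIS r in this slot (assumed known)
    Returns a tuple `assign` with assign[r] = user served by RIS r.
    Assumes one user per RIS and n_users >= n_ris.
    """
    n_users, n_ris = rates.shape
    best_assign, best_value = None, -np.inf
    # Exhaustive search over one-to-one assignments (small instances only).
    for users in itertools.permutations(range(n_users), n_ris):
        value = sum(queues[u] * rates[u, r] for r, u in enumerate(users))
        if value > best_value:
            best_assign, best_value = users, value
    return best_assign

def step_queues(queues, arrivals, rates, assign):
    """Queue update: Q(t+1) = max(Q(t) - service, 0) + arrivals."""
    service = np.zeros_like(queues)
    for r, u in enumerate(assign):
        service[u] = rates[u, r]
    return np.maximum(queues - service, 0.0) + arrivals

# Toy slot with 3 users and 2 RISs (hypothetical numbers).
queues = np.array([10.0, 1.0, 5.0])
rates = np.array([[2.0, 1.0], [2.0, 1.0], [1.0, 3.0]])
assign = max_weight_association(queues, rates)
next_queues = step_queues(queues, np.ones(3), rates, assign)
print(assign, next_queues)
```

Weighting each candidate rate by the current backlog favors users whose queues are long, which is what keeps queue lengths bounded in this family of schemes. In the paper, the per-slot decision is instead learned by a policy, since the channel is stochastic and the state space is large.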
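Experimental Results
Research Questions
- RQ1: Under stochastic channel conditions, how can the rate and reliability of VR be maximized in an RIS-enabled network?
- RQ2: Can EVaR effectively capture higher-order delay statistics in RIS-assisted VR systems?
- RQ3: Can the Lyapunov reformulation combined with deep policy learning achieve near-optimal user association with scalable convergence?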
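To make RQ2 concrete, the sketch below estimates EVaR from samples and compares two synthetic "delay" distributions with the same mean but different spread. It uses the standard definition EVaR = inf over z > 0 of (ln M(z) - ln α) / z with an empirical moment-generating function. The function name `empirical_evar`, the grid of z values, and the Gaussian test data are assumptions for illustration and are not taken from the paper.

```python
import math
import numpy as np

def empirical_evar(samples, alpha, z_grid=None):
    """Sample-based EVaR at confidence level 1 - alpha.

    Minimizes (log M(z) - log(alpha)) / z over a finite grid of z > 0,
    where M(z) is the empirical moment-generating function.
    """
    x = np.asarray(samples, dtype=float)
    if z_grid is None:
        z_grid = np.logspace(-3, 1, 200)  # assumed search range for z
    best = math.inf
    for z in z_grid:
        zx = z * x
        m = zx.max()
        # log of the empirical MGF, computed stably via log-sum-exp
        log_mgf = m + math.log(np.exp(zx - m).sum()) - math.log(x.size)
        best = min(best, (log_mgf - math.log(alpha)) / z)
    return best

# Two synthetic samples with identical mean and different spread.
rng = np.random.default_rng(0)
narrow = rng.normal(0.0, 1.0, 100_000)
wide = 2.0 * narrow
evar_narrow = empirical_evar(narrow, alpha=0.05)
evar_wide = empirical_evar(wide, alpha=0.05)
print(f"EVaR narrow: {evar_narrow:.3f}, EVaR wide: {evar_wide:.3f}")
```

A mean-only objective would not distinguish the two samples, whereas EVaR grows with the spread of the distribution. The sample estimate becomes noisy for large z or heavy tails, so the grid range is a tuning choice rather than a fixed rule.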
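Key Findings
- With the proposed approach, the maximal queue length is within 1% of the optimal solution.
- The RNN reaches a validation accuracy of 91.92%.
- In simulation, the framework achieves high accuracy and fast convergence.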
This summary was generated by AI and reviewed by human editors.