QUICK REVIEW

[Paper Review] JCAS-MARL: Joint Communication and Sensing UAV Networks via Resource-Constrained Multi-Agent Reinforcement Learning

İslam Güven, Mehmet Parlak|arXiv (Cornell University)|Mar 13, 2026

UAV Applications and Optimization | Citations: 0

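One-Line Summary

JCAS-MARL is a sustainability-aware multi-agent RL framework in which UAVs jointly optimize their motion and OFDM pilot density to balance sensing, communication, and energy/CO2 constraints during waste hotspot detection.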

ABSTRACT

Multi-UAV networks are increasingly deployed for large-scale inspection and monitoring missions, where operational performance depends on the coordination of sensing reliability, communication quality, and energy constraints. In particular, the rapid increase in overflowing waste bins and illegal dumping sites has created a need for efficient detection of waste hotspots. In this work, we introduce JCAS-MARL, a resource-aware multi-agent reinforcement learning (MARL) framework for joint communication and sensing (JCAS)-enabled UAV networks. Within this framework, multiple UAVs operate in a shared environment where each agent jointly controls its trajectory and the resource allocation of an OFDM waveform used simultaneously for sensing and communication. Battery consumption, charging behavior, and associated CO$_2$ emissions are incorporated into the system state to model realistic operational constraints. Information sharing occurs over a dynamic communication graph determined by UAV positions and wireless channel conditions. Waste hotspot detection requires consensus among multiple UAVs to improve reliability. Using this environment, we investigate how MARL policies exploit the sensing-communication-energy trade-off in JCAS-enabled UAV networks. Simulation results demonstrate that adaptive pilot-density control learned by the agents can outperform static configurations, particularly in scenarios where sensing accuracy and communication connectivity vary across the environment.

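Research Motivation and Objectives

Motivate the joint optimization of UAV mobility, sensing, and communication under energy and carbon constraints for environmental monitoring tasks.
Develop a partially observable MARL environment that coordinates trajectories and OFDM pilot density.
Incorporate battery dynamics, renewable-energy charging, and CO2 emissions into the decision-making process.
Enable knowledge propagation among UAVs to support consensus-based sharing of hotspot detections.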
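
The knowledge-propagation objective above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's model: the fixed communication radius `comm_range` (the paper's graph also depends on wireless channel conditions), the function names, and the rule that a hotspot counts as confirmed once `k_confirm` distinct UAVs reachable over the graph have detected it.

```python
from collections import deque
from math import dist

def build_graph(positions, comm_range):
    """Link two UAVs when they are within comm_range (assumed disk model)."""
    n = len(positions)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if dist(positions[i], positions[j]) <= comm_range:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def propagate(adj, local_detections):
    """Multi-hop flooding: for each UAV, collect per hotspot the set of
    reachable UAVs (including itself) that detected that hotspot."""
    knowledge = {}
    for src in adj:
        # Breadth-first search to find every UAV reachable from src.
        seen = {src}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        merged = {}
        for u in seen:
            for hotspot in local_detections.get(u, ()):
                merged.setdefault(hotspot, set()).add(u)
        knowledge[src] = merged
    return knowledge

def confirmed(knowledge, uav, k_confirm):
    """Hotspots that this UAV knows were detected by at least k_confirm UAVs."""
    return {h for h, who in knowledge[uav].items() if len(who) >= k_confirm}

# Hypothetical scenario: UAVs 0-1-2 form a chain, UAV 3 is isolated.
positions = [(0, 0), (1, 0), (2, 0), (10, 10)]
detections = {0: {"A"}, 2: {"A", "B"}, 3: {"B"}}
knowledge = propagate(build_graph(positions, comm_range=1.5), detections)
print(confirmed(knowledge, 0, k_confirm=2))  # {'A'}
```

In this toy scenario UAV 0 and UAV 2 are not direct neighbours, yet hotspot "A" is confirmed because their detections meet through the relay UAV 1. Hotspot "B" stays unconfirmed because its second observer is out of range.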

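Proposed Method

Agents control both their 2D motion and their OFDM pilot density to balance sensing reliability against communication throughput.
A monostatic OFDM JCAS model is used, in which the sensing SNR falls as the data load reduces pilot density, and the SNR is mapped to a detection probability.
The problem is formulated as a Dec-POMDP and solved with PPO under centralized training with decentralized execution (CTDE).
The reward function incorporates energy cost, including the cost of grid charging and the renewable share.
Multi-hop knowledge propagation is implemented through a consensus-like process on the communication graph so that hotspot detections are shared.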
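
The pilot-density trade-off described above can be made concrete with a small numerical sketch. The formulas here are illustrative assumptions, not the paper's equations: sensing SNR is taken to scale linearly with the pilot fraction, the SNR-to-detection mapping is an arbitrary logistic curve with hypothetical `threshold_db` and `slope` parameters, and throughput is a Shannon-style rate on the resource elements left for data.

```python
import math

def sensing_snr_db(pilot_density, full_pilot_snr_db):
    """Sensing SNR when only a fraction of resource elements are pilots.
    Assumption: SNR scales linearly with pilot density."""
    if not 0.0 < pilot_density <= 1.0:
        raise ValueError("pilot_density must be in (0, 1]")
    return full_pilot_snr_db + 10.0 * math.log10(pilot_density)

def detection_probability(snr_db, threshold_db=0.0, slope=0.5):
    """Assumed logistic mapping from sensing SNR (dB) to detection probability."""
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - threshold_db)))

def throughput_bps(pilot_density, bandwidth_hz, comm_snr_db):
    """Shannon-style rate on the resource elements not used as pilots."""
    snr_linear = 10.0 ** (comm_snr_db / 10.0)
    return (1.0 - pilot_density) * bandwidth_hz * math.log2(1.0 + snr_linear)

def sweep(densities, full_pilot_snr_db, bandwidth_hz, comm_snr_db):
    """Return (pilot_density, detection_probability, throughput) per setting."""
    rows = []
    for rho in densities:
        p_detect = detection_probability(sensing_snr_db(rho, full_pilot_snr_db))
        rate = throughput_bps(rho, bandwidth_hz, comm_snr_db)
        rows.append((rho, p_detect, rate))
    return rows

# Hypothetical link budget: 6 dB sensing SNR at full pilot load, 20 MHz, 10 dB link.
for rho, p_detect, rate in sweep([0.1, 0.25, 0.5], 6.0, 20e6, 10.0):
    print(f"pilot density {rho:.2f}: P_detect={p_detect:.2f}, rate={rate / 1e6:.1f} Mbit/s")
```

Under these assumptions, raising the pilot density increases detection probability and lowers throughput monotonically. That tension is what an adaptive policy would exploit by picking a different density depending on whether sensing or connectivity matters more at the UAV's current position.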

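Experimental Results

Research Questions

RQ1: How should a UAV fleet adapt its trajectories and pilot density to maximize reliable hotspot detection under energy and carbon constraints?
RQ2: Do JCAS-enabled MARL policies outperform fixed pilot configurations across different hotspot densities and fleet sizes?
RQ3: How much does fleet size affect detection success, mission time, energy use, and communication throughput?
RQ4: How does knowledge propagation affect consensus on detected hotspots over a dynamic communication graph?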

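Key Findings

The PPO policy achieves a high mission success rate, for example about 0.73 with 10 UAVs and about 97% against the hotspot density target.
Mission time decreases as the fleet grows and saturates once coverage is sufficient.
Total energy consumption rises with fleet size, but only gradually, because improved spatial coverage reduces redundant movement.
Adaptive pilot-density control maintains communication throughput while still allowing hotspots to be confirmed by multiple sensors, and it outperforms fixed pilot configurations in some scenarios.
The adaptive policy keeps throughput high near hotspots and avoids unnecessary sensing, which saves bandwidth for coordination.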

This review was generated by AI and checked by a human editor.