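[Paper Review] Language-Driven Interactive Traffic Trajectory Generation
InteractTraj introduces a language-to-code encoder and a code-to-trajectory decoder and, by modeling vehicle interactions, achieves state-of-the-art realism on WOMD and nuPlan.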
Realistic trajectory generation with natural language control is pivotal for advancing autonomous vehicle technology. However, previous methods focus on individual traffic participant trajectory generation, thus failing to account for the complexity of interactive traffic dynamics. In this work, we propose InteractTraj, the first language-driven traffic trajectory generator that can generate interactive traffic trajectories. InteractTraj interprets abstract trajectory descriptions into concrete formatted interaction-aware numerical codes and learns a mapping between these formatted codes and the final interactive trajectories. To interpret language descriptions, we propose a language-to-code encoder with a novel interaction-aware encoding strategy. To produce interactive traffic trajectories, we propose a code-to-trajectory decoder with interaction-aware feature aggregation that synergizes vehicle interactions with the environmental map and the vehicle moves. Extensive experiments show our method demonstrates superior performance over previous SoTA methods, offering a more realistic generation of interactive traffic trajectories with high controllability via diverse natural language commands. Our code is available at https://github.com/X1a-jk/InteractTraj.git
Motivation and Objectives
- Motivate realistic, controllable traffic trajectory generation with language input.
- Bridge abstract language descriptions to concrete interaction-aware representations.
- Leverage interaction-aware coding and aggregation to generate coherent multi-vehicle trajectories.
- Evaluate on real-world benchmarks (WOMD, nuPlan) against state-of-the-art baselines.
Proposed Method
- Propose InteractTraj with a two-module architecture: a language-to-code encoder and a code-to-trajectory decoder.
- Encode language into three types of interaction-aware numerical codes: interaction codes, vehicle codes, and map codes.
- Use prompts for GPT-4 to produce codes that capture relative positions, distances, vehicle states, and map features.
- Decode codes into trajectories via a two-step interaction-aware feature aggregation that fuses map, vehicle, and interaction information.
- Train by reconstructing ground-truth trajectories from extracted codes and minimizing trajectory and relative-distance losses.
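The training objective in the last bullet combines a trajectory term with a relative-distance term. The following is a minimal sketch of that idea, not the authors' implementation: the function names, the use of a plain Euclidean error, the absolute-error form of the relative-distance term, and the `weight` parameter are all assumptions made for illustration.

```python
import math
from itertools import combinations

# A scene is a list of vehicles; each vehicle is a list of (x, y) points,
# one per timestep. All names below are hypothetical.

def trajectory_loss(pred, gt):
    """Mean Euclidean error between predicted and ground-truth points."""
    total, count = 0.0, 0
    for pred_traj, gt_traj in zip(pred, gt):
        for (px, py), (gx, gy) in zip(pred_traj, gt_traj):
            total += math.hypot(px - gx, py - gy)
            count += 1
    return total / count

def relative_distance_loss(pred, gt):
    """Mean absolute error of pairwise inter-vehicle distances per timestep."""
    total, count = 0.0, 0
    for i, j in combinations(range(len(pred)), 2):
        for t in range(len(pred[i])):
            d_pred = math.dist(pred[i][t], pred[j][t])
            d_gt = math.dist(gt[i][t], gt[j][t])
            total += abs(d_pred - d_gt)
            count += 1
    return total / count if count else 0.0

def total_loss(pred, gt, weight=1.0):
    """Assumed combination: trajectory term plus weighted relative-distance term."""
    return trajectory_loss(pred, gt) + weight * relative_distance_loss(pred, gt)

# Two vehicles, two timesteps each.
gt_scene = [[(0.0, 0.0), (1.0, 0.0)], [(0.0, 2.0), (1.0, 2.0)]]
# Every predicted point is shifted by (3, 4): positions are wrong,
# but the spacing between the two vehicles is preserved.
shifted_scene = [[(x + 3.0, y + 4.0) for x, y in traj] for traj in gt_scene]
```

In this toy scene the shifted prediction incurs a trajectory error but no relative-distance error, which shows why the two terms are complementary: the first anchors each vehicle to its ground-truth path, while the second penalizes only distortions of the spacing between vehicles.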
Experimental Results
Research Questions
- RQ1: How can natural language commands be converted into interaction-aware codes that reflect multi-vehicle dynamics?
- RQ2: Can a code-to-trajectory decoder leverage these codes to generate realistic, interactive traffic trajectories?
- RQ3: Do language-conditioned interactive trajectories outperform prior language-driven or non-interactive baselines on realism and controllability?
- RQ4: What is the contribution of interaction codes and aggregation strategies to generation quality?
Key Findings
| Dataset | Method | mADE ↓ | minADE ↓ | mFDE ↓ | minFDE ↓ | SCR ↓ | HD ↓ |
|---|---|---|---|---|---|---|---|
| WOMD | TrafficGen | 9.531 | 1.440 | 20.106 | 3.690 | 0.086 | 5.733 |
| WOMD | LCTGen | 1.262 | 0.224 | 2.696 | 0.463 | 0.072 | 1.295 |
| WOMD | InteractTraj(w/o I) | 1.205 | 0.207 | 2.479 | 0.346 | 0.090 | 1.210 |
| WOMD | InteractTraj | 1.067 | 0.181 | 2.190 | 0.320 | 0.070 | 1.076 |
| nuPlan | TrafficGen | 9.418 | 1.416 | 19.686 | 3.627 | 0.082 | 5.874 |
| nuPlan | LCTGen | 1.161 | 0.218 | 2.497 | 0.448 | 0.074 | 1.301 |
| nuPlan | InteractTraj(w/o I) | 1.108 | 0.181 | 2.277 | 0.323 | 0.070 | 1.150 |
| nuPlan | InteractTraj | 0.962 | 0.160 | 1.987 | 0.321 | 0.067 | 1.129 |
- InteractTraj achieves SoTA realism on WOMD and nuPlan, with lower errors than TrafficGen and LCTGen on every reported metric.
- On WOMD, InteractTraj attains mADE 1.067, minADE 0.181, mFDE 2.190, minFDE 0.320, SCR 0.070, HD 1.076.
- On nuPlan, InteractTraj attains mADE 0.962, minADE 0.160, mFDE 1.987, minFDE 0.321, SCR 0.067, HD 1.129.
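- The ablated version without interaction codes, InteractTraj(w/o I), performs worse on every metric on both datasets (e.g., mADE rises from 1.067 to 1.205 on WOMD), confirming the effectiveness of interaction-aware inputs.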
- User studies show higher preference for InteractTraj-generated scenarios over LCTGen across interaction types.
- Ablation studies demonstrate benefits from all proposed components and discretization choices.
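For readers unfamiliar with the displacement metrics in the table, the sketch below shows how ADE and FDE are conventionally computed for a single trajectory and then aggregated. It assumes that the "m" and "min" prefixes denote the mean and the minimum over a set of generated candidates, which is the common convention in the trajectory literature; the paper's exact aggregation may differ, and all function names here are hypothetical.

```python
import math

def ade(pred, gt):
    """Average displacement error: mean point-wise distance over all timesteps."""
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)

def fde(pred, gt):
    """Final displacement error: distance between the last points."""
    return math.dist(pred[-1], gt[-1])

def mean_and_min(candidates, gt, metric):
    """Return (mean, min) of a metric over several candidate trajectories.

    Assumption: this mirrors the m*/min* columns; it is not taken from the paper.
    """
    values = [metric(c, gt) for c in candidates]
    return sum(values) / len(values), min(values)

# Ground truth moves along the x-axis; candidate B is offset by 3 units in y.
gt_traj = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
cand_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
cand_b = [(0.0, 3.0), (1.0, 3.0), (2.0, 3.0)]

m_ade, min_ade = mean_and_min([cand_a, cand_b], gt_traj, ade)
m_fde, min_fde = mean_and_min([cand_a, cand_b], gt_traj, fde)
```

Under this reading, the mean variants reward consistently accurate generation, while the min variants reward having at least one close candidate.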
This review was generated by AI and reviewed by a human editor.