QUICK REVIEW

[Paper Digest] Primary-Fine Decoupling for Action Generation in Robotic Imitation

Xiaohan Lei, Min Wang | arXiv (Cornell University) | Feb 25, 2026

Robot Manipulation and Learning | Cited by 0

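One-Sentence Summary

PF-DAG is a two-stage action generation framework: it first discretizes action chunks into a small set of modes for coarse control, then uses a mode-conditioned MeanFlow policy to generate fine-grained continuous actions. The authors prove a strictly lower mean squared error (MSE) bound than single-stage generative policies and report strong results across 56 tasks.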

ABSTRACT

Multi-modal distribution in robotic manipulation action sequences poses critical challenges for imitation learning. To this end, existing approaches often model the action space as either a discrete set of tokens or a continuous, latent-variable distribution. However, both approaches present trade-offs: some methods discretize actions into tokens and therefore lose fine-grained action variations, while others that generate continuous actions in a single stage tend to produce unstable mode transitions. To address these limitations, we propose Primary-Fine Decoupling for Action Generation (PF-DAG), a two-stage framework that decouples coarse action consistency from fine-grained variations. First, we compress action chunks into a small set of discrete modes, enabling a lightweight policy to select consistent coarse modes and avoid mode bouncing. Second, a mode-conditioned MeanFlow policy is learned to generate high-fidelity continuous actions. Theoretically, we prove PF-DAG's two-stage design achieves a strictly lower MSE bound than single-stage generative policies. Empirically, PF-DAG outperforms state-of-the-art baselines across 56 tasks from Adroit, DexArt, and MetaWorld benchmarks. It further generalizes to real-world tactile dexterous manipulation tasks. Our work demonstrates that explicit mode-level decoupling enables both robust multi-modal modeling and reactive closed-loop control for robotic manipulation.

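Motivation and Objectives

Address the multi-modal distribution of action sequences in robotic manipulation.
Reduce mode bouncing and unstable mode transitions in imitation learning.
Develop a two-stage framework that combines discrete mode selection with continuous action generation.
Show theoretically that the decoupled design is superior to single-stage policies.
Demonstrate empirical performance on diverse benchmarks and real-world tasks.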

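Proposed Method

Compress action chunks into a small set of discrete modes so that a lightweight policy can perform coarse mode selection.
Learn a mode-conditioned MeanFlow policy to generate high-fidelity continuous actions.
Provide a theoretical proof that the two-stage design has a strictly lower MSE bound than single-stage generative policies.
Evaluate PF-DAG on multiple benchmarks to assess robustness and generalization.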
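The decoupling idea can be illustrated with a toy sketch. This is not the authors' implementation: the source does not say how chunks are compressed into modes, so plain k-means is used here as a stand-in, the learned MeanFlow fine stage is replaced by simply returning the mode centroid, and the learned coarse policy is replaced by the oracle cluster label. The names `assign`, `fit_modes`, and the synthetic bimodal data are all hypothetical.

```python
import numpy as np


def assign(flat, centers):
    """Return the index of the nearest center for every flattened chunk."""
    d = ((flat[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return d.argmin(1)


def fit_modes(chunks, num_modes, iters=10):
    """Cluster flattened action chunks with plain k-means.

    ASSUMPTION: k-means is a stand-in for whatever mode compression
    PF-DAG actually uses; the digest does not specify it.
    """
    flat = chunks.reshape(len(chunks), -1)
    # Deterministic farthest-point initialisation of the centers.
    centers = [flat[0]]
    for _ in range(num_modes - 1):
        d = ((flat[:, None, :] - np.array(centers)[None, :, :]) ** 2).sum(-1).min(1)
        centers.append(flat[d.argmax()])
    centers = np.array(centers, dtype=float)
    # Standard Lloyd iterations: assign, then recompute each center.
    for _ in range(iters):
        labels = assign(flat, centers)
        for k in range(num_modes):
            if np.any(labels == k):
                centers[k] = flat[labels == k].mean(0)
    return centers, assign(flat, centers)


# Synthetic bimodal demonstrations: each chunk sits near +1 or -1.
rng = np.random.default_rng(0)
n, horizon, act_dim = 200, 4, 2
signs = rng.choice([-1.0, 1.0], size=n)
chunks = signs[:, None, None] + 0.05 * rng.normal(size=(n, horizon, act_dim))

centers, labels = fit_modes(chunks, num_modes=2)
flat = chunks.reshape(n, -1)

# Baseline: one prediction for all chunks (the global mean averages the modes).
mse_single = ((flat - flat.mean(0)) ** 2).mean()
# Mode-conditioned: predict the centroid of the (oracle) mode of each chunk.
mse_two_stage = ((flat - centers[labels]) ** 2).mean()

print(f"single-mean MSE: {mse_single:.4f}, mode-conditioned MSE: {mse_two_stage:.4f}")
```

In this toy setting the single prediction lands between the two modes, while the mode-conditioned prediction only has to account for the within-mode noise. In PF-DAG as described above, that residual within-mode variation is what the mode-conditioned MeanFlow policy is meant to model.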

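Experimental Results

Research Questions

RQ1: Does decoupling coarse action modes from fine-grained actions reduce the instability of mode transitions?
RQ2: Does the two-stage PF-DAG approach achieve a lower MSE upper bound than single-stage generative policies?
RQ3: How does PF-DAG perform on diverse robotic manipulation benchmarks and real-world tactile tasks?
RQ4: Does discrete mode compression preserve the necessary action variability while enabling robust policy learning?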

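Key Findings

PF-DAG outperforms state-of-the-art baselines across 56 tasks from Adroit, DexArt, and MetaWorld.
The two-stage design achieves a strictly lower MSE bound than single-stage generative policies (a theoretical result).
The method generalizes to real-world tactile dexterous manipulation tasks.
Explicit mode-level decoupling enables both robust multi-modal modeling and reactive closed-loop control.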
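For intuition on why conditioning on a mode can lower squared error, the standard law of total variance is a useful reference point. This is a textbook identity, not the paper's proof, and the notation is assumed here: $a$ is the action chunk, $s$ the observation, and $m$ the discrete mode.

```latex
\begin{aligned}
\mathbb{E}\big[\lVert a - \mathbb{E}[a \mid s] \rVert^2\big]
&= \mathbb{E}\big[\lVert a - \mathbb{E}[a \mid s, m] \rVert^2\big]
 + \mathbb{E}\big[\lVert \mathbb{E}[a \mid s, m] - \mathbb{E}[a \mid s] \rVert^2\big] \\
&\ge \mathbb{E}\big[\lVert a - \mathbb{E}[a \mid s, m] \rVert^2\big].
\end{aligned}
```

The cross term vanishes by the tower property, and the inequality is strict whenever $\mathbb{E}[a \mid s, m]$ actually varies with $m$ with positive probability. The identity assumes the mode is known; in a real policy, errors in the coarse mode selection would also contribute, which this sketch does not capture.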

This digest was generated by AI and reviewed by human editors.