QUICK REVIEW

[Paper Review] Primary-Fine Decoupling for Action Generation in Robotic Imitation

Xiaohan Lei, Min Wang | arXiv (Cornell University) | Feb 25, 2026

Robot Manipulation and Learning | Citations: 0

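One-Sentence Summary

PF-DAG is a two-stage method: it first discretizes action chunks into a small set of modes for coarse control, then generates fine-grained continuous actions with a mode-conditioned MeanFlow policy. The authors prove a strictly lower MSE bound than single-stage generative policies and report better results than state-of-the-art baselines across 56 benchmark tasks.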

ABSTRACT

Multi-modal distribution in robotic manipulation action sequences poses critical challenges for imitation learning. To this end, existing approaches often model the action space as either a discrete set of tokens or a continuous, latent-variable distribution. However, both approaches present trade-offs: some methods discretize actions into tokens and therefore lose fine-grained action variations, while others that generate continuous actions in a single stage tend to produce unstable mode transitions. To address these limitations, we propose Primary-Fine Decoupling for Action Generation (PF-DAG), a two-stage framework that decouples coarse action consistency from fine-grained variations. First, we compress action chunks into a small set of discrete modes, enabling a lightweight policy to select consistent coarse modes and avoid mode bouncing. Second, a mode-conditioned MeanFlow policy is learned to generate high-fidelity continuous actions. Theoretically, we prove PF-DAG's two-stage design achieves a strictly lower MSE bound than single-stage generative policies. Empirically, PF-DAG outperforms state-of-the-art baselines across 56 tasks from Adroit, DexArt, and MetaWorld benchmarks. It further generalizes to real-world tactile dexterous manipulation tasks. Our work demonstrates that explicit mode-level decoupling enables both robust multi-modal modeling and reactive closed-loop control for robotic manipulation.

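Research Motivation and Objectives

Address multi-modal distributions in robotic manipulation action sequences.
Mitigate mode bouncing and unstable mode transitions in imitation learning.
Develop a two-stage framework that combines discrete mode selection with continuous action generation.
Establish the theoretical advantage of the decoupled design over single-stage policies.
Demonstrate empirical performance on diverse benchmarks and real-world tasks.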

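Proposed Method

Compress action chunks into a small set of discrete modes so that a lightweight policy can select coarse modes.
Learn a mode-conditioned MeanFlow policy that generates high-fidelity continuous actions.
Prove that the two-stage design achieves a strictly lower MSE bound than single-stage generative policies.
Evaluate PF-DAG on multiple benchmarks to assess robustness and generalization.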
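
The decoupling idea can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the synthetic bimodal data, the helper names `make_bimodal_chunks` and `kmeans`, and the use of plain k-means as the mode quantizer (the review does not say which compression scheme PF-DAG uses). The fine stage (MeanFlow) is not implemented here; the mode centroid stands in for the coarse prediction only. The sketch compares a single point prediction for all chunks against a prediction that first picks a discrete mode.

```python
import numpy as np

def make_bimodal_chunks(n=400, horizon=8, seed=0):
    """Hypothetical data: 1-D action chunks that ramp either left (-) or right (+)."""
    rng = np.random.default_rng(seed)
    signs = np.where(rng.random(n) < 0.5, -1.0, 1.0)
    base = np.linspace(0.0, 1.0, horizon)
    noise = 0.05 * rng.standard_normal((n, horizon))
    return signs[:, None] * base[None, :] + noise

def assign(chunks, centroids):
    """Index of the nearest centroid (squared Euclidean distance) for each chunk."""
    dists = ((chunks[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

def kmeans(chunks, k, iters=20, seed=0):
    """Plain Lloyd's k-means, used here as a stand-in mode quantizer."""
    rng = np.random.default_rng(seed)
    centroids = chunks[rng.choice(len(chunks), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign(chunks, centroids)
        for j in range(k):
            members = chunks[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids, assign(chunks, centroids)

chunks = make_bimodal_chunks()

# Single-stage stand-in: one MSE-optimal point prediction for every chunk.
mse_single = ((chunks - chunks.mean(axis=0)) ** 2).mean()

# Two-stage stand-in: choose a discrete mode first, then predict its centroid.
centroids, labels = kmeans(chunks, k=2)
mse_two_stage = ((chunks - centroids[labels]) ** 2).mean()

print(f"single-stage MSE: {mse_single:.4f}")
print(f"two-stage MSE:    {mse_two_stage:.4f}")
```

On this toy data the single prediction averages the left and right ramps into a trajectory that matches neither, while the mode-conditioned prediction stays close to each chunk. In the method described above, a learned policy would select the mode from observations and a MeanFlow policy would generate the remaining fine-grained variation.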

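Experimental Results

Research Questions

RQ1: Does decoupling coarse action modes from fine-grained continuous actions reduce unstable mode transitions in imitation learning?
RQ2: Does the two-stage PF-DAG approach achieve a lower MSE bound than single-stage generative policies?
RQ3: How well does PF-DAG perform on diverse robotic manipulation benchmarks and real-world tactile tasks?
RQ4: Does discrete mode compression preserve essential action variability while enabling robust policy learning?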

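Key Findings

PF-DAG outperforms state-of-the-art baselines across 56 tasks from Adroit, DexArt, and MetaWorld.
The two-stage design achieves a strictly lower MSE bound than single-stage generative policies.
The method generalizes to real-world tactile dexterous manipulation tasks.
Explicit mode-level decoupling enables both robust multi-modal modeling and reactive closed-loop control.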
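
For intuition about the MSE finding, a standard identity is useful. It is offered here as background, not as the paper's proof, which concerns generative policies and may rest on different assumptions. The symbols are assumptions for illustration: $a$ is an action chunk, $s$ the observation, and $m$ a discrete mode that is taken to be known.

```latex
\mathbb{E}\,\bigl\| a - \mathbb{E}[a \mid s] \bigr\|^2
  \;=\;
  \mathbb{E}\,\bigl\| a - \mathbb{E}[a \mid s, m] \bigr\|^2
  \;+\;
  \mathbb{E}\,\bigl\| \mathbb{E}[a \mid s, m] - \mathbb{E}[a \mid s] \bigr\|^2
```

The cross term vanishes by the tower property of conditional expectation. Because the last term is non-negative, the best mode-conditioned point predictor never has a larger MSE than the best predictor that sees only $s$, and its MSE is strictly smaller whenever the conditional means differ across modes with positive probability.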

This review was generated by AI and checked by a human editor.