QUICK REVIEW

[Paper Review] Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI

Yang Liu, Weixing Chen|arXiv (Cornell University)|Jul 9, 2024
Robotics and Automated Systems | Cited by 11
One-Sentence Summary

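A comprehensive survey of Embodied AI in the era of MLMs and WMs, covering embodied perception, interaction, agents, simulators, and sim-to-real adaptation, along with benchmarks and future directions.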

ABSTRACT

Embodied Artificial Intelligence (Embodied AI) is crucial for achieving Artificial General Intelligence (AGI) and serves as a foundation for various applications (e.g., intelligent mechatronics systems, smart manufacturing) that bridge cyberspace and the physical world. Recently, the emergence of Multi-modal Large Models (MLMs) and World Models (WMs) has attracted significant attention due to their remarkable perception, interaction, and reasoning capabilities, making them a promising architecture for embodied agents. In this survey, we give a comprehensive exploration of the latest advancements in Embodied AI. Our analysis first navigates through the forefront of representative works of embodied robots and simulators, to fully understand the research focuses and their limitations. Then, we analyze four main research targets: 1) embodied perception, 2) embodied interaction, 3) embodied agent, and 4) sim-to-real adaptation, covering state-of-the-art methods, essential paradigms, and comprehensive datasets. Additionally, we explore the complexities of MLMs in virtual and real embodied agents, highlighting their significance in facilitating interactions in digital and physical environments. Finally, we summarize the challenges and limitations of embodied AI and discuss potential future directions. We hope this survey will serve as a foundational reference for the research community. The associated project can be found at https://github.com/HCPLab-SYSU/Embodied_AI_Paper_List.

Research Motivation and Objectives

  • Survey the landscape of Embodied AI, from cyberspace to the physical world.
  • Analyze representative embodied robots and simulators to identify focus areas and limitations.
  • Synthesize four main research targets: embodied perception, embodied interaction, embodied agents, and sim-to-real adaptation.
  • Discuss MLMs and World Models in enabling embodied agents and highlight datasets and benchmarks.
  • Identify challenges and outline future directions for Embodied AI and AGI implications.

Proposed Approach

  • Systematic review of embodied robots, simulators, and four core tasks: visual active perception, embodied interaction, multi-modal embodied agents, and sim-to-real robotic control.
  • Categorization and benchmarking of state-of-the-art methods, paradigms, and datasets across simulators and real-world benchmarks.
  • Discussion of MLMs (Multi-modal Large Models) and World Models as brain-like components for embodied agents.
  • Comparison of general-purpose simulators and real-scene based simulators to evaluate research progress.
  • Synthesis of challenges, limitations, and potential future directions for AGI-oriented embodied AI.

Experimental Results

Research Questions

  • RQ1: What are the latest advancements and representative works in Embodied AI across robots and simulators?
  • RQ2: How do embodied perception, interaction, agents, and sim-to-real adaptations address the cyber-physical alignment goal?
  • RQ3: What datasets, benchmarks, and simulators best support embodied AI research in the MLM/WM era?
  • RQ4: What are the key challenges and potential future directions toward AGI via embodied AI?

Key Findings

  • Embodied AI integrates perception, language, and world models to enable interactions with virtual and physical environments.
  • MLMs and World Models are shaping the brain-like capabilities of embodied agents for perception, reasoning, and task decomposition.
  • A wide spectrum of simulators (general and real-scene based) supports cost-effective experimentation and benchmarking.
  • Existing surveys lag behind MLM-era developments; this work provides a comprehensive, updated taxonomy and benchmarking discussion.
  • Identified challenges include long-term memory, understanding complex intentions, and effective sim-to-real transfer, with proposed future directions for AGI-oriented embodied AI.


This review was generated by AI and reviewed by human editors.