QUICK REVIEW

[Paper Review] What Breaks Embodied AI Security: LLM Vulnerabilities, CPS Flaws, or Something Else?

Boyang Ma, Hechuan Guo | arXiv (Cornell University) | Feb 19, 2026
Adversarial Robustness in Machine Learning | Cited by 0
One-Sentence Summary

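Key takeaway: The paper argues that security failures in embodied AI cannot be explained by large language model (LLM) vulnerabilities or CPS flaws alone; instead, failures are driven by embodiment-induced system-level mismatches in the perception-decision-action loop, which calls for a holistic approach to security.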

ABSTRACT

Embodied AI systems (e.g., autonomous vehicles, service robots, and LLM-driven interactive agents) are rapidly transitioning from controlled environments to safety critical real-world deployments. Unlike disembodied AI, failures in embodied intelligence lead to irreversible physical consequences, raising fundamental questions about security, safety, and reliability. While existing research predominantly analyzes embodied AI through the lenses of Large Language Model (LLM) vulnerabilities or classical Cyber-Physical System (CPS) failures, this survey argues that these perspectives are individually insufficient to explain many observed breakdowns in modern embodied systems. We posit that a significant class of failures arises from embodiment-induced system-level mismatches, rather than from isolated model flaws or traditional CPS attacks. Specifically, we identify four core insights that explain why embodied AI is fundamentally harder to secure: (i) semantic correctness does not imply physical safety, as language-level reasoning abstracts away geometry, dynamics, and contact constraints; (ii) identical actions can lead to drastically different outcomes across physical states due to nonlinear dynamics and state uncertainty; (iii) small errors propagate and amplify across tightly coupled perception-decision-action loops; and (iv) safety is not compositional across time or system layers, enabling locally safe decisions to accumulate into globally unsafe behavior. These insights suggest that securing embodied AI requires moving beyond component-level defenses toward system-level reasoning about physical risk, uncertainty, and failure propagation.
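
Insight (ii) in the abstract can be made concrete with a small kinematics sketch. The example below is not taken from the paper: the function names, the 5 m/s^2 deceleration, and the 20 m obstacle distance are all assumptions chosen for illustration. It uses the standard constant-deceleration stopping distance v^2 / (2a) to show how one and the same "brake" action is safe in one physical state and unsafe in another.

```python
def stopping_distance(speed_mps: float, decel_mps2: float) -> float:
    """Distance needed to stop from a given speed under constant deceleration.

    Uses the textbook relation d = v^2 / (2a); the quadratic dependence on
    speed is the nonlinearity this sketch is meant to highlight.
    """
    if decel_mps2 <= 0:
        raise ValueError("deceleration must be positive")
    return speed_mps ** 2 / (2.0 * decel_mps2)


def brake_is_safe(speed_mps: float, obstacle_distance_m: float,
                  decel_mps2: float = 5.0) -> bool:
    """Return True if the vehicle stops before reaching the obstacle.

    The default deceleration is an illustrative assumption, not a value
    reported in the paper.
    """
    return stopping_distance(speed_mps, decel_mps2) < obstacle_distance_m


if __name__ == "__main__":
    # Same action (brake at 5 m/s^2), same obstacle distance, two speeds.
    for speed in (10.0, 20.0):
        distance = stopping_distance(speed, 5.0)
        print(f"speed={speed} m/s  stops in {distance} m  "
              f"safe={brake_is_safe(speed, 20.0)}")
```

Doubling the speed from 10 m/s to 20 m/s quadruples the stopping distance from 10 m to 40 m, so a command that a language-level planner would describe identically ("brake now") has opposite safety outcomes depending on the physical state.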

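Research Motivation and Goals

  • Systematically categorize the vulnerabilities of LLM-based embodied agents (semantic integrity, cross-modal alignment, environment-driven manipulation).
  • Revisit classical CPS threat models to understand how sensor, control, actuator, and timing attacks manifest in learning-based embodied systems.
  • Synthesize these insights to identify root causes of embodiment-induced failures that neither the LLM nor the CPS perspective captures.
  • Outline open challenges and research directions toward principled system-level security for embodied AI.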

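Proposed Method

  • Define a three-dimensional security analysis covering LLM vulnerabilities, CPS flaws, and embodiment-specific challenges.
  • Build a taxonomy that maps attacks to system-level trust assumptions and failure modes.
  • Analyze four core insights that explain why embodiment makes security harder than component-level defenses can handle.
  • Synthesize cross-domain findings into a system-level security paradigm focused on physical risk, uncertainty, and failure propagation.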
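
One way to picture a taxonomy that maps attacks to trust assumptions and failure modes is as a small typed table. The sketch below is only an illustration of that data structure: the class names, field names, and the wording of every trust assumption and failure mode are hypothetical and are not the paper's actual taxonomy. Only the three dimensions and the attack category names come from this summary.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Dimension(Enum):
    """The three analysis dimensions named in the summary."""
    LLM = "LLM vulnerability"
    CPS = "CPS flaw"
    EMBODIMENT = "embodiment-specific challenge"


@dataclass(frozen=True)
class AttackEntry:
    name: str
    dimension: Dimension
    trust_assumption: str  # hypothetical wording, for illustration only
    failure_mode: str      # hypothetical wording, for illustration only


# Illustrative entries; the assumption and failure-mode strings are invented.
TAXONOMY = [
    AttackEntry("semantic integrity violation", Dimension.LLM,
                "instructions reflect operator intent", "unsafe plan"),
    AttackEntry("sensor attack", Dimension.CPS,
                "sensor readings reflect the environment",
                "corrupted state estimate"),
    AttackEntry("timing attack", Dimension.CPS,
                "commands arrive within their deadline", "late actuation"),
    AttackEntry("accumulated locally safe steps", Dimension.EMBODIMENT,
                "per-step checks imply trajectory safety",
                "globally unsafe behavior"),
]


def group_by_dimension(entries):
    """Group attack names under the dimension they belong to."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry.dimension].append(entry.name)
    return dict(grouped)


if __name__ == "__main__":
    for dimension, names in group_by_dimension(TAXONOMY).items():
        print(f"{dimension.value}: {', '.join(names)}")
```

Keeping the trust assumption as an explicit field makes it easy to ask which attacks break the same assumption, which is the kind of cross-dimension question a system-level analysis needs to answer.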

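Experimental Results

Research Questions

  • RQ1: What are the main vulnerability classes in LLM-based embodied agents, and how do they differ from traditional LLM risks?
  • RQ2: How do CPS-style attacks manifest in learning-driven embodied systems, and what gaps remain?
  • RQ3: Which embodiment-specific mechanisms cause failures that LLM or CPS faults alone cannot explain?
  • RQ4: How can security defenses shift from the component level to system-level reasoning about physical risk and failure propagation?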

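Key Findings

  • Semantic correctness does not guarantee physical safety, because language-level reasoning abstracts away geometry, dynamics, and contact constraints.
  • Identical actions can lead to drastically different outcomes across different physical states, owing to nonlinear dynamics and state uncertainty.
  • Small errors propagate and amplify across tightly coupled perception-decision-action loops.
  • Safety in embodied AI is not compositional across time or system layers: locally safe decisions can accumulate into globally unsafe behavior.
  • Securing embodied AI requires system-level reasoning about physical risk, uncertainty, and failure propagation, beyond conventional LLM or CPS defenses.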
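
The non-compositionality finding can be shown with a toy one-dimensional robot. This is a minimal sketch under invented assumptions, not a model from the paper: the 0.1 m per-step limit, the 1.0 m safe region, and all function names are hypothetical. Every individual step passes its local check, yet the trajectory as a whole leaves the safe region.

```python
MAX_STEP_M = 0.1    # hypothetical per-step displacement limit
SAFE_LIMIT_M = 1.0  # hypothetical half-width of the safe region


def step_is_locally_safe(delta_m: float) -> bool:
    """Local check: a single step may not exceed the per-step limit."""
    return abs(delta_m) <= MAX_STEP_M


def position_is_globally_safe(position_m: float) -> bool:
    """Global check: the robot must stay inside the safe region."""
    return abs(position_m) <= SAFE_LIMIT_M


def run(deltas, start_m: float = 0.0):
    """Apply the steps in order.

    Returns (all steps locally safe, stayed inside the safe region,
    final position).
    """
    position = start_m
    all_local_ok = True
    stayed_in_region = position_is_globally_safe(position)
    for delta in deltas:
        all_local_ok = all_local_ok and step_is_locally_safe(delta)
        position += delta
        stayed_in_region = stayed_in_region and position_is_globally_safe(position)
    return all_local_ok, stayed_in_region, position


if __name__ == "__main__":
    local_ok, global_ok, final = run([0.1] * 15)
    print(f"all steps locally safe: {local_ok}")
    print(f"stayed in safe region:  {global_ok}")
    print(f"final position (m):     {final:.2f}")
```

Fifteen steps of 0.1 m each pass the local check but carry the robot roughly 1.5 m from its start, outside the 1.0 m region. A defense that validates each decision in isolation would approve every step here, which is why the summary calls for reasoning over whole trajectories and across layers.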

This review was generated by AI and checked by a human editor.