QUICK REVIEW

[Paper Review] What Breaks Embodied AI Security: LLM Vulnerabilities, CPS Flaws, or Something Else?

Boyang Ma, Hechuan Guo | arXiv (Cornell University) | 2026. 02. 19.

Adversarial Robustness in Machine Learning | Citations: 0

One-Line Summary

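The paper argues that security failures in embodied AI cannot be explained by LLM vulnerabilities or CPS flaws alone. Instead, failures arise from embodiment-induced, system-level mismatches across the entire perception-decision-action loop, which calls for a holistic approach to security.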

ABSTRACT

Embodied AI systems (e.g., autonomous vehicles, service robots, and LLM-driven interactive agents) are rapidly transitioning from controlled environments to safety critical real-world deployments. Unlike disembodied AI, failures in embodied intelligence lead to irreversible physical consequences, raising fundamental questions about security, safety, and reliability. While existing research predominantly analyzes embodied AI through the lenses of Large Language Model (LLM) vulnerabilities or classical Cyber-Physical System (CPS) failures, this survey argues that these perspectives are individually insufficient to explain many observed breakdowns in modern embodied systems. We posit that a significant class of failures arises from embodiment-induced system-level mismatches, rather than from isolated model flaws or traditional CPS attacks. Specifically, we identify four core insights that explain why embodied AI is fundamentally harder to secure: (i) semantic correctness does not imply physical safety, as language-level reasoning abstracts away geometry, dynamics, and contact constraints; (ii) identical actions can lead to drastically different outcomes across physical states due to nonlinear dynamics and state uncertainty; (iii) small errors propagate and amplify across tightly coupled perception-decision-action loops; and (iv) safety is not compositional across time or system layers, enabling locally safe decisions to accumulate into globally unsafe behavior. These insights suggest that securing embodied AI requires moving beyond component-level defenses toward system-level reasoning about physical risk, uncertainty, and failure propagation.

Motivation and Objectives

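Systematically categorize the vulnerabilities of LLM-based embodied agents (semantic integrity, cross-modal binding, environment-based manipulation).
Revisit classical CPS threat models to understand how sensor, control, actuator, and timing attacks manifest in learning-enabled embodied systems.
Synthesize these insights to identify root causes of embodiment-induced failures that lie beyond the LLM and CPS perspectives.
Present open challenges and research directions toward principled, system-level security.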

Proposed Method

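Define a three-dimensional security analysis spanning LLM vulnerabilities, CPS flaws, and embodiment-specific challenges.
Develop a taxonomy that maps attacks to system-level trust assumptions and failure modes.
Analyze four core insights that explain why embodiment complicates safety and security beyond what component-level defenses can address.
Synthesize cross-domain findings to propose a system-level security paradigm focused on physical risk, uncertainty, and failure propagation.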
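
A taxonomy that maps attacks to trust assumptions and failure modes could be represented as a small lookup structure. The following is a minimal sketch, not the paper's actual taxonomy: the class names, field names, and the three entries are hypothetical, and only the attack category labels are taken from the objectives listed above.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Dimension(Enum):
    """The three analysis dimensions named in the review."""
    LLM = "LLM vulnerability"
    CPS = "CPS flaw"
    EMBODIMENT = "embodiment-specific challenge"


@dataclass(frozen=True)
class AttackEntry:
    name: str
    dimension: Dimension
    trust_assumption: str  # system-level assumption the attack violates
    failure_mode: str      # resulting system-level failure


# Hypothetical entries for illustration only; the assumptions and failure
# modes below are NOT taken from the paper.
TAXONOMY = [
    AttackEntry(
        "environment-based manipulation", Dimension.LLM,
        "text observed in the scene is not an instruction",
        "agent executes an attacker-chosen task",
    ),
    AttackEntry(
        "sensor spoofing", Dimension.CPS,
        "sensor readings reflect the physical world",
        "planner acts on a false state estimate",
    ),
    AttackEntry(
        "timing attack", Dimension.CPS,
        "control commands arrive within their deadline",
        "stale command is applied to a changed state",
    ),
]


def by_dimension(entries, dimension):
    """Return the names of all attacks filed under one dimension."""
    return [e.name for e in entries if e.dimension is dimension]


def group_by_failure_mode(entries):
    """Map each failure mode to the attacks that can trigger it."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry.failure_mode].append(entry.name)
    return dict(grouped)
```

Keying the structure on the violated assumption and the failure mode, rather than on the component attacked, reflects the system-level framing the review describes.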

Experimental Results

Research Questions

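RQ1: What are the dominant vulnerability classes of LLM-based embodied agents, and how do they differ from traditional LLM risks?
RQ2: How do CPS-style attacks manifest in learning-enabled embodied systems, and what gaps remain?
RQ3: Which embodiment-specific mechanisms make failures irreducible to LLM or CPS flaws alone?
RQ4: How can security defenses shift toward system-level reasoning about physical risk and failure propagation?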

Key Findings

Semantic correctness does not guarantee physical safety, because language-level reasoning abstracts away geometry, dynamics, and contact constraints.
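Identical actions can lead to drastically different outcomes in different physical states, due to nonlinear dynamics and state uncertainty.
Errors propagate and amplify across the tightly coupled perception-decision-action loop, so small mistakes grow over time.
Safety in embodied AI is not compositional across time or system layers: locally safe decisions can accumulate into globally unsafe behavior.
Securing embodied AI requires moving beyond traditional LLM or CPS defenses toward system-level reasoning about physical risk, uncertainty, and failure propagation.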
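
The second, third, and fourth findings above can be made concrete with a toy model. The sketch below is an illustration under stated assumptions, not anything taken from the paper: it assumes an idealized braking distance of v^2 / (2 * mu * g), a loop whose error is multiplied by a constant gain each iteration, and a hypothetical per-step speed limit as the "local" safety check. All function and parameter names are invented for this example.

```python
G = 9.81  # gravitational acceleration in m/s^2


def stopping_distance(speed: float, friction: float) -> float:
    """Idealized full-braking distance v^2 / (2 * mu * g), in meters."""
    return speed ** 2 / (2.0 * friction * G)


def brake_is_safe(speed: float, friction: float, gap: float) -> bool:
    """Same action (full brake); safe only if the stop fits inside the gap."""
    return stopping_distance(speed, friction) < gap


def amplified_error(initial_error: float, loop_gain: float, steps: int) -> float:
    """Error after `steps` loop iterations, each scaling it by `loop_gain`."""
    return initial_error * loop_gain ** steps


def first_unsafe_step(start_speed, step_increase, max_step_increase,
                      friction, gap, max_steps):
    """Apply speed increases that each pass a local per-step check.

    Returns the first step at which the accumulated speed can no longer
    be stopped within `gap`, or None if that never happens.
    """
    if step_increase > max_step_increase:
        raise ValueError("step violates the local safety check")
    speed = start_speed
    for step in range(1, max_steps + 1):
        speed += step_increase  # locally safe: within the per-step limit
        if not brake_is_safe(speed, friction, gap):
            return step         # globally unsafe: cannot stop in time
    return None
```

With a 20 m gap, full braking from 15 m/s needs roughly 14.3 m at a friction coefficient of 0.8 but roughly 38.2 m at 0.3, so the identical action is safe in one state and unsafe in the other. A loop gain above 1 makes a small initial error grow geometrically, and `first_unsafe_step` shows a sequence of individually permitted increments ending in a state from which the system cannot stop.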

This review was generated by AI and checked by a human editor.