QUICK REVIEW

[Paper Review] Deriving Character Logic from Storyline as Codified Decision Trees

Letian Peng, Kun Zhou | arXiv (Cornell University) | 2026-01-15

Artificial Intelligence in Games | Citations: 0

One-Line Summary

This paper introduces Codified Decision Trees (CDT), a data-driven framework that induces executable, context-aware character profiles from stories, and shows that CDT outperforms baselines and human-written profiles on multiple RP benchmarks.

ABSTRACT

Role-playing (RP) agents rely on behavioral profiles to act consistently across diverse narrative contexts, yet existing profiles are largely unstructured, non-executable, and weakly validated, leading to brittle agent behavior. We propose Codified Decision Trees (CDT), a data-driven framework that induces an executable and interpretable decision structure from large-scale narrative data. CDT represents behavioral profiles as a tree of conditional rules, where internal nodes correspond to validated scene conditions and leaves encode grounded behavioral statements, enabling deterministic retrieval of context-appropriate rules at execution time. The tree is learned by iteratively inducing candidate scene-action rules, validating them against data, and refining them through hierarchical specialization, yielding profiles that support transparent inspection and principled updates. Across multiple benchmarks, CDT substantially outperforms human-written profiles and prior profile induction methods on $85$ characters across $16$ artifacts, indicating that codified and validated behavioral representations lead to more reliable agent grounding.
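The tree structure described in the abstract can be illustrated with a minimal sketch. The `Node` type, the example scenes, and the use of plain Python predicates as scene conditions are all assumptions made for illustration; the paper's conditions may well be natural-language checks evaluated by a model rather than code. Statements attached to a node stand in for its leaves.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical scene representation, e.g. {"location": "stage", "others": ["rival"]}.
Scene = dict


@dataclass
class Node:
    """One node of an illustrative decision tree (not the paper's exact format)."""
    condition: Callable[[Scene], bool]                      # scene condition
    statements: List[str] = field(default_factory=list)     # behavioral statements (leaves)
    children: List["Node"] = field(default_factory=list)    # more specific conditions


def retrieve(node: Node, scene: Scene) -> List[str]:
    """Collect statements along every branch whose conditions hold for the scene."""
    if not node.condition(scene):
        return []
    collected = list(node.statements)
    for child in node.children:
        collected.extend(retrieve(child, scene))
    return collected


# A made-up profile: general behavior at the root, specializations below it.
root = Node(
    condition=lambda s: True,
    statements=["Speaks politely to strangers."],
    children=[
        Node(
            condition=lambda s: s.get("location") == "stage",
            statements=["Focuses on the performance and avoids small talk."],
            children=[
                Node(
                    condition=lambda s: "rival" in s.get("others", []),
                    statements=["Plays more aggressively than usual."],
                )
            ],
        ),
        Node(
            condition=lambda s: s.get("location") == "cafe",
            statements=["Orders the same drink every time."],
        ),
    ],
)

stage_rules = retrieve(root, {"location": "stage", "others": ["rival"]})
cafe_rules = retrieve(root, {"location": "cafe"})
```

Because retrieval depends only on which conditions hold, the same scene always yields the same statements, which is the sense in which the lookup is deterministic. The retrieved statements would then be passed to the agent as grounding for its next action.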

Motivation and Objectives

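Motivate role-playing agents that are grounded in interpretable, executable profiles.
Automatically derive codified, rule-based character behavior from large-scale narrative data.
Enable transparent inspection, editing, and principled updates of behavioral rules.
Demonstrate CDT's advantage over human-written profiles and other profiling methods across multiple benchmarks.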

Proposed Method

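Cluster scene-action pairs by semantic embedding to surface behavioral regularities.
Propose if-then triggers within each cluster to form codified rules.
Validate the hypothesized triggers against the full dataset and expand the CDT recursively.
At inference time, traverse the CDT by answering discriminative questions, accumulating grounded statements that condition action generation.
Compare CDT against Vanilla prompting, Fine-tuning, RICL, and ETA on the benchmarks.
Analyze ablations and variants (CDT-Lite, Wikified/Verbalized CDT), as well as scaling with increasing data.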
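The validation step in the method above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the `Rule` type, the thresholds `accept`, `reject`, and `min_support`, and the exact-match comparison of action labels are all hypothetical stand-ins (the paper reports NLI-based scores, so its matching is likely semantic rather than string equality).

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Scene = dict
Pair = Tuple[Scene, str]  # (scene, observed action label); labels are made up


@dataclass
class Rule:
    trigger: Callable[[Scene], bool]  # hypothesized "if" part
    action: str                       # hypothesized "then" part


def validate(rule: Rule, data: List[Pair]) -> Tuple[int, float]:
    """Return (support, precision) of the rule over the whole dataset."""
    fired = [action for scene, action in data if rule.trigger(scene)]
    if not fired:
        return 0, 0.0
    hits = sum(1 for action in fired if action == rule.action)
    return len(fired), hits / len(fired)


def decide(rule: Rule, data: List[Pair],
           accept: float = 0.8, reject: float = 0.4, min_support: int = 2) -> str:
    """Hypothetical three-way decision: keep, refine further, or drop the rule."""
    support, precision = validate(rule, data)
    if support < min_support or precision < reject:
        return "discard"
    if precision >= accept:
        return "accept"
    # Mixed evidence: the trigger is too coarse, so it would be split into
    # more specific child conditions (the recursive expansion step).
    return "specialize"


data = [
    ({"mood": "tense"}, "jokes"),
    ({"mood": "tense"}, "jokes"),
    ({"mood": "tense"}, "stays silent"),
    ({"mood": "calm"}, "stays silent"),
]
tense_rule = Rule(trigger=lambda s: s["mood"] == "tense", action="jokes")
calm_rule = Rule(trigger=lambda s: s["mood"] == "calm", action="stays silent")

tense_decision = decide(tense_rule, data)  # fires 3 times, correct 2 times
calm_decision = decide(calm_rule, data)    # fires only once
```

Here the "tense" rule holds in two of the three scenes where it fires, so it falls between the two thresholds and is marked for specialization, while the "calm" rule lacks enough support to be trusted. Checking each candidate against all of the data, rather than only the cluster that suggested it, is what keeps a locally plausible rule from entering the tree unverified.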

Experimental Results

Research Questions

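RQ1: Can codified, executable rules derived from storylines ground RP more effectively than traditional textual profiles?
RQ2: Does the hierarchical, validated CDT structure improve behavior prediction across diverse characters and artifacts?
RQ3: How does CDT compare with human-written profiles and codified human profiles across benchmarks and data scales?
RQ4: How do CDT components such as clustering, diversification, and depth affect performance and scalability?
RQ5: Can CDT be adapted to goal-oriented or relation-specific profiling to capture targeted behaviors?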

Key Results

Artifact Group	Vanilla	Fine-tuning	RICL	ETA	CDT (Ours)	CDT-Lite (Ours)	Human Profile	Codified Human Profile
Fandom Avg	55.57	45.68	56.01	56.91	60.82	61.01	58.33	59.30
Bandori Avg	65.50	62.86	68.86	72.25	77.71	79.04	71.28	71.87

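CDT and CDT-Lite achieve the best NLI scores on the fine-grained Fandom benchmark (60.82 and 61.01) and the Bandori benchmark (77.71 and 79.04), outperforming Vanilla prompting, Fine-tuning, RICL, and ETA.
CDT and CDT-Lite also outperform human-written profiles and codified human profiles on both benchmarks.
More training data strengthens CDT: it surpasses human profiles on Fandom even with modest data, and continues to improve on Bandori.
Ablations show that explicit validation and tree depth contribute to stronger grounding, while removing clustering or diversification degrades performance.
The Wikified and Verbalized CDT variants also retain strong performance while providing textual representations that require no runtime traversal.
Relation-driven (goal-oriented) CDTs improve performance on specific relation subsets, suggesting that CDT can specialize behavior.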
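To make the size of these gaps concrete, the margins can be computed directly from the two averaged rows of the results table. The snippet below is only arithmetic on the reported numbers; the `margins` helper is a hypothetical name introduced for illustration.

```python
# NLI scores copied from the results table (column names as in the table).
scores = {
    "Fandom Avg": {
        "Vanilla": 55.57, "Fine-tuning": 45.68, "RICL": 56.01, "ETA": 56.91,
        "CDT (Ours)": 60.82, "CDT-Lite (Ours)": 61.01,
        "Human Profile": 58.33, "Codified Human Profile": 59.30,
    },
    "Bandori Avg": {
        "Vanilla": 65.50, "Fine-tuning": 62.86, "RICL": 68.86, "ETA": 72.25,
        "CDT (Ours)": 77.71, "CDT-Lite (Ours)": 79.04,
        "Human Profile": 71.28, "Codified Human Profile": 71.87,
    },
}

OURS = ("CDT (Ours)", "CDT-Lite (Ours)")


def margins(row: dict, ours: str = "CDT (Ours)") -> dict:
    """Point difference between one of the CDT columns and every other method."""
    return {
        name: round(row[ours] - value, 2)
        for name, value in row.items()
        if name not in OURS
    }


fandom = margins(scores["Fandom Avg"])
bandori = margins(scores["Bandori Avg"])

for group, row in (("Fandom Avg", fandom), ("Bandori Avg", bandori)):
    for name, gap in row.items():
        print(f"{group}: CDT vs {name}: {gap:+.2f}")
```

On these numbers CDT leads the human-written profiles by about 2.5 points on Fandom and about 6.4 points on Bandori, and leads ETA, the strongest prior induction method in the table, by about 3.9 and 5.5 points respectively.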


This review was generated by AI and reviewed by a human editor.