QUICK REVIEW

[Paper Review] The PROPER Approach to Proactivity: Benchmarking and Advancing Knowledge Gap Navigation

Kirandeep Kaur, Vinayak Gupta | arXiv (Cornell University) | 2026-01-14

AI in Service Interactions | Citations: 0

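One-Line Summary

ProPer introduces a two-agent architecture (DGA and RGA) for proactively navigating knowledge gaps. It outperforms strong base models on single-turn tasks, particularly in the medical, coding, and shopping domains, and also performs well in multi-turn interactions.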

ABSTRACT

Most language-based assistants follow a reactive ask-and-respond paradigm, requiring users to explicitly state their needs. As a result, relevant but unexpressed needs often go unmet. Existing proactive agents attempt to address this gap either by eliciting further clarification, preserving this burden, or by extrapolating future needs from context, often leading to unnecessary or mistimed interventions. We introduce ProPer, Proactivity-driven Personalized agents, a novel two-agent architecture consisting of a Dimension Generating Agent (DGA) and a Response Generating Agent (RGA). DGA, a fine-tuned LLM agent, leverages explicit user data to generate multiple implicit dimensions (latent aspects relevant to the user's task but not considered by the user) or knowledge gaps. These dimensions are selectively filtered using a reranker based on quality, diversity, and task relevance. RGA then balances explicit and implicit dimensions to tailor personalized responses with timely and proactive interventions. We evaluate ProPer across multiple domains using a structured, gap-aware rubric that measures coverage, initiative appropriateness, and intent alignment. Our results show that ProPer improves quality scores and win rates across all domains, achieving up to 84% gains in single-turn evaluation and consistent dominance in multi-turn interactions.

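Research Motivation and Goals

Formalizes proactivity as a calibration problem that balances explicit user intent against latent knowledge gaps.
Introduces a dimension-based representation of user needs and ProPerBench, a domain-specific benchmark used for supervision.
Proposes ProPer, a modular two-agent architecture that separates knowledge-gap discovery from response generation.
Demonstrates improved task utility and timely proactivity across the medical, coding, and recommendation domains.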

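Proposed Method

Fine-tunes a Dimension Generating Agent (DGA) to infer implicit, task-relevant dimensions from the user state and generate candidate gaps.
Uses a post-hoc calibration reranker that optimizes a utility objective to select a budgeted subset of candidate dimensions, balancing quality, alignment with explicit needs, and diversity.
A Response Generating Agent (RGA) updates the baseline response, conditioned on the explicit dimensions and the activated implicit dimensions.
End-to-end ProPer pipeline: construct the interaction state and generate a baseline response r0; the DGA then proposes dimensions, the reranker selects S_k*, and the RGA produces an updated response that adds targeted proactive information while preserving the user's intent.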
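The budgeted selection step can be illustrated with a small greedy sketch. This is not the paper's reranker: the utility form, the names `Candidate`, `select_dimensions`, `align_weight`, `diversity_weight`, and `min_gain`, the Jaccard redundancy measure, and all scores are assumptions made for illustration, and the weights are not the paper's calibration parameters. It only shows one plausible way to trade off quality, alignment with the explicit need, and diversity under a budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate implicit dimension (all fields are illustrative)."""
    name: str
    quality: float       # hypothetical quality score in [0, 1]
    alignment: float     # hypothetical alignment with the explicit need, in [0, 1]
    keywords: frozenset  # toy feature set, used only to measure redundancy


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity between two keyword sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_dimensions(candidates, budget, align_weight=0.5,
                      diversity_weight=1.0, min_gain=0.0):
    """Greedily pick at most `budget` candidates by marginal utility.

    Assumed utility: quality + align_weight * alignment
                     - diversity_weight * (max similarity to already-picked items).
    """
    selected = []
    remaining = list(candidates)

    def gain(c):
        redundancy = max((jaccard(c.keywords, s.keywords) for s in selected),
                         default=0.0)
        return c.quality + align_weight * c.alignment - diversity_weight * redundancy

    while remaining and len(selected) < budget:
        best = max(remaining, key=gain)
        if gain(best) < min_gain:
            break  # nothing left is worth activating
        selected.append(best)
        remaining.remove(best)
    return selected


# Made-up candidates for a medical query; scores are not from the paper.
candidates = [
    Candidate("drug_interactions", 0.90, 0.8, frozenset({"medication", "interaction"})),
    Candidate("dosage_timing", 0.85, 0.7, frozenset({"medication", "interaction", "dose"})),
    Candidate("follow_up_care", 0.60, 0.6, frozenset({"appointment", "monitoring"})),
]
picked = select_dimensions(candidates, budget=2)
print([c.name for c in picked])  # ['drug_interactions', 'follow_up_care']
```

In this toy run the second-highest-scoring candidate is skipped because it overlaps heavily with the first pick, which is the kind of behavior a diversity term is meant to produce.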

Figure 1: Different agent responses to a user query. The Reactive Agent provides immediate task fulfillment without exploring user context, goals, or learning needs. The Proactive Agent clarifies task-related ambiguities to optimize the immediate solution but remains confined to the user’s explicitl

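Experimental Results

Research Questions

RQ1: Does ProPer improve end-to-end task utility across domains compared with strong base models?
RQ2: How does each component (DGA, reranker, RGA) contribute to performance?
RQ3: Do the observed improvements stem from calibrated proactivity rather than mere verbosity?
RQ4: Does ProPer remain robust in multi-turn conversations?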

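Key Results

ProPer consistently improves task utility over strong base LLMs and chain-of-thought prompting in the medical, coding, and PWAB domains.
End-to-end benefits include gains of up to 84% in single-turn evaluation and an advantage in multi-turn interactions.
Removing the DGA causes a large drop in performance, whereas removing the reranker causes a smaller degradation, underscoring the importance of implicit dimension generation.
Dimensions derived from the DGA outperform dimensions generated directly by the base LLM, showing the value of learned latent gaps.
The calibration parameters (lambda1, lambda2), which control activation and diversity, affect domain sensitivity; higher activation benefits the medical and PWAB domains.
In multi-turn evaluation, ProPer is preferred in 11/12 medical, 9/12 code-contest, and 12/12 PWAB conversations, demonstrating the stability of calibrated proactivity.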
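For readers who prefer percentages, the multi-turn preference counts above convert to win rates as follows. The per-domain counts are taken from the summary; the pooled figure is simple arithmetic over those counts and is an aggregate computed here, not a number reported in the review. The dictionary keys and the `win_rate` helper are illustrative names.

```python
# Multi-turn preference counts as (conversations won, conversations evaluated).
multi_turn = {"medical": (11, 12), "code": (9, 12), "PWAB": (12, 12)}


def win_rate(wins: int, total: int) -> float:
    """Win rate as a percentage, rounded to one decimal place."""
    return round(100.0 * wins / total, 1)


per_domain = {domain: win_rate(w, t) for domain, (w, t) in multi_turn.items()}
pooled = win_rate(sum(w for w, _ in multi_turn.values()),
                  sum(t for _, t in multi_turn.values()))

print(per_domain)  # {'medical': 91.7, 'code': 75.0, 'PWAB': 100.0}
print(pooled)      # 88.9
```

With only 12 conversations per domain, a single conversation shifts a domain's win rate by roughly 8 percentage points, so these figures are best read as coarse indicators.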

Figure 2: Overview of the Proper framework. During training (A), the Dimension Generating Agent (DGA) is fine-tuned on successful interactions annotated with user- and system-explicit dimensions, learning task-specific priors. At inference (B), the DGA identifies explicit and candidate implicit dime


This review was generated by AI and reviewed by a human editor.