QUICK REVIEW

[Paper Review] PersonalAlign: Hierarchical Implicit Intent Alignment for Personalized GUI Agent with Long-Term User-Centric Records

Yibo Lyu, Gongwei Chen | arXiv (Cornell University) | 2026-01-14

Topic: Personal Information Management and User Behavior | Citations: 0

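One-Line Summary

Introduces the PersonalAlign task and uses HIM-Agent and the AndroidIntent benchmark to show that GUI agents can draw on long-term records to align with implicit user intent, improving both execution and proactive performance.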

ABSTRACT

While GUI agents have shown strong performance under explicit and completion instructions, real-world deployment requires aligning with users' more complex implicit intents. In this work, we highlight Hierarchical Implicit Intent Alignment for Personalized GUI Agent (PersonalAlign), a new agent task that requires agents to leverage long-term user records as persistent context to resolve omitted preferences in vague instructions and anticipate latent routines by user state for proactive assistance. To facilitate this study, we introduce AndroidIntent, a benchmark designed to evaluate agents' ability in resolving vague instructions and providing proactive suggestions through reasoning over long-term user records. We annotated 775 user-specific preferences and 215 routines from 20k long-term records across different users for evaluation. Furthermore, we introduce Hierarchical Intent Memory Agent (HIM-Agent), which maintains a continuously updating personal memory and hierarchically organizes user preferences and routines for personalization. Finally, we evaluate a range of GUI agents on AndroidIntent, including GPT-5, Qwen3-VL, and UI-TARS, further results show that HIM-Agent significantly improves both execution and proactive performance by 15.7% and 7.3%.

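Motivation and Objectives

Argues that GUI agents need to infer users' implicit intents, going beyond explicit instructions.
Proposes a hierarchical view of implicit intent that covers both preference alignment and routine alignment.
Creates AndroidIntent, which annotates long-term user records for evaluation.
Develops HIM-Agent to maintain and organize long-term memory for personalization.
Demonstrates HIM-Agent's improved performance on the AndroidIntent benchmark.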

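Proposed Method

Defines the PersonalAlign task through three paradigms: Reactive, Preference, and Routine alignment.
Builds AndroidIntent, a long-term, user-centric GUI benchmark that uses hierarchical filtering for annotation.
Proposes HIM-Agent, which includes a Streaming Aggregation Module for incrementally updating memory.
Develops an Execution-based Preference Filter for preferences and a State-based Routine Filter for routines, forming a hierarchical memory of preferences and routines.
Combines dense embeddings with sparse Jaccard similarity, and uses DTW (dynamic time warping) to measure action-path similarity during memory updates.
Evaluates on multiple GUI agents (GPT-5, Qwen3-VL, UI-TARS, and others) and reports performance gains.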
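
The similarity measures named above can be illustrated with a small sketch. This is not the paper's implementation: the function names, the `alpha` mixing weight, the linear combination of the two scores, and the 0/1 step cost inside DTW are all assumptions made for illustration. It shows one plausible way to blend a dense cosine score with a sparse Jaccard score, and a textbook DTW distance between two action paths.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def jaccard(tokens_a, tokens_b):
    """Jaccard similarity between two token collections, treated as sets."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)

def hybrid_similarity(emb_a, emb_b, tokens_a, tokens_b, alpha=0.5):
    """Weighted blend of dense and sparse similarity.

    ASSUMPTION: a linear mix with weight `alpha`; the paper may combine
    the two signals differently.
    """
    return alpha * cosine(emb_a, emb_b) + (1 - alpha) * jaccard(tokens_a, tokens_b)

def dtw_distance(path_a, path_b):
    """Classic dynamic time warping distance between two action sequences.

    ASSUMPTION: each step costs 0 if the actions match exactly, else 1.
    """
    n, m = len(path_a), len(path_b)
    inf = float("inf")
    table = [[inf] * (m + 1) for _ in range(n + 1)]
    table[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = 0.0 if path_a[i - 1] == path_b[j - 1] else 1.0
            table[i][j] = step + min(
                table[i - 1][j],      # path_a advances alone
                table[i][j - 1],      # path_b advances alone
                table[i - 1][j - 1],  # both advance
            )
    return table[n][m]
```

Because DTW allows one step to align with several, two paths that differ only by a repeated action (for example a second `search` tap) come out at distance zero, which is the usual reason to prefer it over position-by-position comparison for behavior traces of different lengths.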

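Experimental Results

Research Questions

RQ1: How can a GUI agent infer and align with a user's implicit preferences from long-term records when no explicit instruction is given?
RQ2: How can a hierarchical memory structure and streaming updates support a GUI agent's handling of preference and routine intents?
RQ3: To what extent does personalized implicit intent alignment improve reactive execution and proactive assistance in GUI tasks?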

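Key Results

HIM-Agent improves execution and proactive performance by 15.7% and 7.3%, respectively, over the baselines.
AndroidIntent provides annotated ground truth for 775 preference intents and 215 routine intents, drawn from 20k long-term records across 91 users.
The Streaming Aggregation Module and hierarchical memory (preferences vs. routines) enable stable, scalable personalization.
The ablation study shows that every component of the Execution-based Preference Filter contributes to the gains, and the full module yields a notable CER improvement.
Proactive evaluation shows a better balance between intent alignment and false alarms across both open-source and closed-source GUI agents.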


This review was generated by AI and reviewed by a human editor.