QUICK REVIEW

[Paper Review] Seamlessly Unifying Attributes and Items: Conversational Recommendation for Cold-Start Users

Shijun Li, Wenqiang Lei | arXiv (Cornell University) | 2020-05-23

Advanced Bandit Algorithms Research | References: 55 | Citations: 33

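One-Line Summary

This paper presents ConTS, a conversational Thompson Sampling framework that places attribute questions and item recommendations in a single arm space, so that the exploration-exploitation trade-off in cold-start conversational recommendation is handled automatically. The method outperforms state-of-the-art conversational recommender system (CRS) methods on three benchmark datasets.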

ABSTRACT

Static recommendation methods like collaborative filtering suffer from the inherent limitation of performing real-time personalization for cold-start users. Online recommendation, e.g., multi-armed bandit approach, addresses this limitation by interactively exploring user preference online and pursuing the exploration-exploitation (EE) trade-off. However, existing bandit-based methods model recommendation actions homogeneously. Specifically, they only consider the items as the arms, being incapable of handling the item attributes, which naturally provide interpretable information of user's current demands and can effectively filter out undesired items. In this work, we consider the conversational recommendation for cold-start users, where a system can both ask the attributes from and recommend items to a user interactively. This important scenario was studied in a recent work. However, it employs a hand-crafted function to decide when to ask attributes or make recommendations. Such separate modeling of attributes and items makes the effectiveness of the system highly rely on the choice of the hand-crafted function, thus introducing fragility to the system. To address this limitation, we seamlessly unify attributes and items in the same arm space and achieve their EE trade-offs automatically using the framework of Thompson Sampling. Our Conversational Thompson Sampling (ConTS) model holistically solves all questions in conversational recommendation by choosing the arm with the maximal reward to play. Extensive experiments on three benchmark datasets show that ConTS outperforms the state-of-the-art methods Conversational UCB (ConUCB) and Estimation-Action-Reflection model in both metrics of success rate and average number of conversation turns.

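Research Motivation and Goals

Address cold-start conversational recommendation through real-time personalization and interactive attribute questions.
Unify attributes and items in a single arm space to simplify decision-making and improve robustness.
Use contextual Thompson Sampling to balance exploration and exploitation naturally.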

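Proposed Method

Attributes and items are modeled as undifferentiated arms in the same arm space, and an arm is chosen using a unified reward.
The user embedding is initialized from existing users, and its posterior parameters are updated during the interaction.
Contextual Thompson Sampling draws a sample of the user embedding and selects the arm with the highest reward.
An arm's reward is defined as a combination of user-arm preference and attribute affinity, so the same score guides both asking and recommending.
Offline factorization machine (FM) embeddings for attributes and items are trained with Bayesian Personalized Ranking, which places all arms in a shared embedding space.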
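
The selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a Gaussian posterior over the user embedding with mean `mu` and precision matrix `B`, and it assumes the reward is the dot product of the sampled user embedding with the arm embedding plus the arm's dot product with the sum of already-confirmed attribute embeddings. The function name `select_arm`, the `scale` parameter, and the toy embeddings are all hypothetical.

```python
import numpy as np

def select_arm(mu, B, arm_embs, known_attr_embs, scale=1.0, rng=None):
    """Sample a user embedding, score every arm, return the best arm index.

    mu:              (d,) posterior mean of the user embedding (assumed Gaussian)
    B:               (d, d) posterior precision matrix
    arm_embs:        (n, d) embeddings of all arms, attributes and items alike
    known_attr_embs: (k, d) embeddings of attributes the user has confirmed;
                     pass an array of shape (0, d) when none are known
    scale:           hypothetical knob controlling exploration width
    """
    rng = np.random.default_rng() if rng is None else rng
    cov = scale ** 2 * np.linalg.inv(B)
    u = rng.multivariate_normal(mu, cov)          # Thompson sample of the user
    preference = arm_embs @ u                     # user-arm preference term
    affinity = arm_embs @ known_attr_embs.sum(axis=0)  # fit with known attributes
    rewards = preference + affinity
    return int(np.argmax(rewards)), rewards

# Toy usage: three arms in a 2-d space; which of them are attributes or items
# does not matter to the selection rule, which is the point of the unified space.
arms = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
best, scores = select_arm(np.array([1.0, 0.0]), np.eye(2), arms,
                          np.empty((0, 2)), rng=np.random.default_rng(0))
```

Because attributes and items are scored by the same function, no separate rule is needed to decide whether the next turn asks a question or makes a recommendation; the type of the winning arm decides it.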

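Experimental Results

Research Questions

RQ1: Can a unified arm space for attribute questions and item recommendations effectively balance asking and recommending for cold-start users?
RQ2: Does contextual Thompson Sampling provide a natural exploration-exploitation balance in a CRS that unifies attributes and items?
RQ3: How does ConTS compare with existing CRS methods (ConUCB, EAR) in success rate and conversational efficiency?
RQ4: How does attribute-based feedback affect the updates to the user embedding and the arm rewards?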

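Key Results

ConTS outperforms the state-of-the-art CRS methods ConUCB and EAR for cold-start users in both success rate and average number of conversation turns.
Modeling attributes and items in a single arm space simplifies decision-making and removes the need for hand-crafted rules about when to ask and when to recommend.
Contextual Thompson Sampling provides a natural exploration-exploitation balance through posterior sampling and updating.
Experiments on Yelp, LastFM, and a new Kuaishou dataset demonstrate robustness across domains.
The update mechanism refines arm rewards and the user embedding by incorporating user feedback and the attributes the user is known to prefer.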
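
The update mechanism in the last point can be illustrated with a standard Bayesian linear-regression step, as used in linear Thompson Sampling. This is a sketch under assumptions, not the paper's exact update rule: `update_posterior`, the precision matrix `B`, the accumulator `f`, and the `offset` argument (standing in for the part of the reward already explained by known preferred attributes) are hypothetical names introduced here.

```python
import numpy as np

def update_posterior(B, f, arm_emb, feedback, offset=0.0):
    """Fold one accept/reject signal into the user-embedding posterior.

    B:        (d, d) precision matrix before the update
    f:        (d,) running sum of reward-weighted arm embeddings
    arm_emb:  (d,) embedding of the arm that was just played
    feedback: observed reward, e.g. positive for accept, negative for reject
    offset:   assumed share of the reward attributed to known attributes,
              subtracted so that only the residual updates the embedding
    """
    B_new = B + np.outer(arm_emb, arm_emb)
    f_new = f + (feedback - offset) * arm_emb
    mu_new = np.linalg.solve(B_new, f_new)   # posterior mean = B^{-1} f
    return B_new, f_new, mu_new

# Toy usage: the user accepts an arm pointing along the first axis.
B1, f1, mu1 = update_posterior(np.eye(2), np.zeros(2), np.array([1.0, 0.0]), 1.0)
```

After an acceptance the posterior mean moves toward the played arm's embedding, and after a rejection it moves away, which in turn changes the rewards of every remaining arm in the shared space.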

This review was generated by AI and checked by a human editor.