QUICK REVIEW

[Paper Review] Personalized Dialogue Generation with Diversified Traits

Yinhe Zheng, Guanyi Chen|arXiv (Cornell University)|Jan 28, 2019

Topic Modeling|45 references|89 citations

TL;DR

This paper introduces a large-scale dataset with explicit personality traits for speakers and proposes trait-aware Seq2Seq models (PAA and PAB) to generate personalized responses conditioned on diversified traits.

ABSTRACT

Endowing a dialogue system with particular personality traits is essential to deliver more human-like conversations. However, due to the challenge of embodying personality via language expression and the lack of large-scale persona-labeled dialogue data, this research problem is still far from well-studied. In this paper, we investigate the problem of incorporating explicit personality traits in dialogue generation to deliver personalized dialogues. To this end, firstly, we construct PersonalDialog, a large-scale multi-turn dialogue dataset containing various traits from a large number of speakers. The dataset consists of 20.83M sessions and 56.25M utterances from 8.47M speakers. Each utterance is associated with a speaker who is marked with traits like Age, Gender, Location, Interest Tags, etc. Several anonymization schemes are designed to protect the privacy of each speaker. This large-scale dataset will facilitate not only the study of personalized dialogue generation, but also other researches on sociolinguistics or social science. Secondly, to study how personality traits can be captured and addressed in dialogue generation, we propose persona-aware dialogue generation models within the sequence to sequence learning framework. Explicit personality traits (structured by key-value pairs) are embedded using a trait fusion module. During the decoding process, two techniques, namely persona-aware attention and persona-aware bias, are devised to capture and address trait-related information. Experiments demonstrate that our model is able to address proper traits in different contexts. Case studies also show interesting results for this challenging research problem.

Motivation & Objective

Motivate and define the task of incorporating explicit personality traits into dialogue generation.
Provide a large-scale, real social conversation dataset with diversified traits for scalable training.
Develop persona-aware generation models that fuse traits and integrate them into decoding.

Proposed method

Construct PersonalDialog, a large-scale Chinese dialogue corpus with traits like Gender, Age, Location, and Interest Tags for 8.47M speakers across 20.83M sessions.
Encode each trait as an embedding and merge them with a personality trait fusion module to form a persona representation v_p.
Implement two decoding integrations of v_p: (i) Persona-Aware Attention (PAA) that conditions attention weights on v_p, and (ii) Persona-Aware Bias (PAB) that adds a persona bias to the generation distribution with a gating mechanism.
Explore three trait fusion strategies: Traits Attention, Traits Average, and Traits Concatenation.
Use a Seq2Seq framework with a two-layer BiGRU encoder and two-layer GRU decoder, with Bahdanau-style attention, conditioned on v_p.
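
The trait fusion strategies and the persona-aware bias listed above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy two-dimensional embeddings, the dot-product scoring used for Traits Attention, and the exact form of the gate (a sigmoid over the decoder state that mixes two sets of logits) are all assumptions made for illustration. The review only states that v_p is built by attention, averaging, or concatenation, and that PAB adds a gated persona bias to the generation distribution.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(W, x):
    # W is a list of rows; returns W @ x.
    return [dot(row, x) for row in W]

def fuse_average(trait_vecs):
    # Traits Average: element-wise mean of the trait embeddings.
    n = len(trait_vecs)
    return [sum(col) / n for col in zip(*trait_vecs)]

def fuse_concat(trait_vecs):
    # Traits Concatenation: stack all trait embeddings into one long vector.
    return [x for vec in trait_vecs for x in vec]

def fuse_attention(trait_vecs, query):
    # Traits Attention: weighted sum of trait embeddings.
    # ASSUMPTION: weights come from a dot product with a query vector
    # (for example a context or decoder state); the paper's scoring
    # function may differ.
    weights = softmax([dot(v, query) for v in trait_vecs])
    dim = len(trait_vecs[0])
    return [sum(w * v[i] for w, v in zip(weights, trait_vecs)) for i in range(dim)]

def persona_aware_bias(state, v_p, W_s, W_p, gate_vec):
    # ASSUMPTION: a scalar gate in (0, 1), computed from the decoder state,
    # balances state-based logits against persona-based logits.
    gate = 1.0 / (1.0 + math.exp(-dot(gate_vec, state)))
    logits_state = matvec(W_s, state)
    logits_persona = matvec(W_p, v_p)
    logits = [gate * a + (1.0 - gate) * b
              for a, b in zip(logits_state, logits_persona)]
    return softmax(logits), gate

# Hypothetical toy embeddings for three traits (values are made up).
trait_vecs = [
    [1.0, 0.0],  # Gender
    [0.0, 1.0],  # Age
    [1.0, 1.0],  # Location
]

v_avg = fuse_average(trait_vecs)
v_cat = fuse_concat(trait_vecs)
v_att = fuse_attention(trait_vecs, query=[0.5, -0.5])

# Toy decoder state and projection matrices for a 3-word vocabulary.
state = [0.2, -0.1]
W_s = [[0.1, 0.3], [0.0, -0.2], [0.5, 0.5]]
W_p = [[0.4, 0.0], [-0.3, 0.2], [0.1, 0.1]]
probs, gate = persona_aware_bias(state, v_avg, W_s, W_p, gate_vec=[1.0, 1.0])
print(len(v_avg), len(v_cat), len(v_att), round(sum(probs), 6))
```

In this sketch, averaging and attention keep the persona vector at the embedding size, while concatenation grows it with the number of traits, so the persona projection W_p would need a matching width. Only attention lets the weighting of traits change with the context.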

Experimental results

Research questions

RQ1: Can explicit personality traits be effectively learned and expressed in generated dialogues from large-scale social data?
RQ2: How do different trait fusion methods affect the integration of personality information into the decoder?
RQ3: Which decoding strategy (PAA vs PAB) best leverages persona representations to produce trait-consistent responses?
RQ4: How do various trait fusion schemes (attention, average, concatenation) impact the expression of diversified traits across contexts?

Key findings

The model can express the appropriate trait for a given context, drawing on diversified traits across different conversations.
Persona-Aware Bias (PAB) generally outperforms Persona-Aware Attention (PAA) in experiments.
A large-scale dataset (PersonalDialog) with real social conversations and diversified traits supports training for personalized dialogue generation.
Trait fusion enables the model to generate responses that reflect explicit trait information without requiring the generated text to contain exact trait values.

This review was created by AI and reviewed by human editors.