QUICK REVIEW

[Paper Review] Building a Conversational Agent Overnight with Dialogue Self-Play

Pararth Shah, Dilek Hakkani‐Tür | arXiv (Cornell University) | Jan 15, 2018
Topic Modeling | 17 references | 161 citations
TL;DR

The paper introduces M2M, a framework that combines automated self-play and crowdsourcing to rapidly bootstrap end-to-end goal-oriented dialogue agents, producing diverse, high-quality datasets in hours. It builds a task-agnostic pipeline driven by a task schema and API client to generate outlines, which are paraphrased by crowdworkers to create natural language dialogues for training.

ABSTRACT

We propose Machines Talking To Machines (M2M), a framework combining automation and crowdsourcing to rapidly bootstrap end-to-end dialogue agents for goal-oriented dialogues in arbitrary domains. M2M scales to new tasks with just a task schema and an API client from the dialogue system developer, but it is also customizable to cater to task-specific interactions. Compared to the Wizard-of-Oz approach for data collection, M2M achieves greater diversity and coverage of salient dialogue flows while maintaining the naturalness of individual utterances. In the first phase, a simulated user bot and a domain-agnostic system bot converse to exhaustively generate dialogue "outlines", i.e. sequences of template utterances and their semantic parses. In the second phase, crowd workers provide contextual rewrites of the dialogues to make the utterances more natural while preserving their meaning. The entire process can finish within a few hours. We propose a new corpus of 3,000 dialogues spanning 2 domains collected with M2M, and present comparisons with popular dialogue datasets on the quality and diversity of the surface forms and dialogue flows.

Motivation & Objective

  • Motivate the need to rapidly bootstrap goal-oriented dialogue agents for new tasks without extensive human data collection.
  • Propose a data-efficient framework that reduces crowdsourcing effort while increasing dialogue diversity and coverage.
  • Introduce a two-phase process that first generates dialogue outlines via self-play and then converts them into natural language through paraphrase tasks.
  • Offer a dataset and empirical evaluation comparing M2M-generated data with existing dialogue datasets to demonstrate quality and diversity improvements.

Proposed method

  • Define a task specification consisting of a slot-based schema S and an API client C for querying candidate entities.
  • Generate outlines o by self-play between an agenda-based user simulator and a finite-state system bot to explore dialogue flows.
  • Convert each outline into natural language dialogue via a domain-general template utterance generator.
  • Have crowd workers paraphrase each template utterance into a natural-language utterance u in a contextual rewrite task.
  • Validate and annotate paraphrased utterances with slot spans and semantics, with optional active-learning-backed corrections.
  • Optionally expand data by sampling more outlines and reusing the paraphrase map to synthesize additional dialogues.
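
The steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the schema contents, the template strings, the `Turn` class, and the function names (`generate_outline`, `render_template`, `apply_paraphrases`) are all hypothetical, the API client C is omitted, and the crowd-sourced paraphrase map is replaced by a hard-coded stand-in.

```python
import random
from dataclasses import dataclass, field

# Hypothetical task schema S: slot names with example values (not from the paper).
SCHEMA = {
    "cuisine": ["thai", "italian"],
    "time": ["6 pm", "8 pm"],
}

# Hypothetical template utterances keyed by (speaker, dialogue act).
TEMPLATES = {
    ("system", "request"): "REQUEST {slot}",
    ("user", "inform"): "INFORM {slot} = {value}",
    ("system", "offer"): "OFFER {summary}",
    ("user", "affirm"): "AFFIRM",
}


@dataclass
class Turn:
    speaker: str
    act: str
    slots: dict = field(default_factory=dict)


def generate_outline(schema, rng):
    """Toy self-play: a goal-driven user answers a finite-state system bot."""
    # The simulated user samples a goal from the schema.
    goal = {slot: rng.choice(values) for slot, values in schema.items()}
    # The system bot requests slots in a random order to vary the flow.
    order = list(schema)
    rng.shuffle(order)
    known, turns = {}, []
    for slot in order:
        turns.append(Turn("system", "request", {"slot": slot}))
        known[slot] = goal[slot]
        turns.append(Turn("user", "inform", {"slot": slot, "value": goal[slot]}))
    # Once every slot is known, the bot moves to its "offer" state.
    summary = ", ".join(f"{s}={known[s]}" for s in sorted(known))
    turns.append(Turn("system", "offer", {"summary": summary}))
    turns.append(Turn("user", "affirm"))
    return turns


def render_template(turn):
    """Convert one annotated turn into its template utterance."""
    return TEMPLATES[(turn.speaker, turn.act)].format(**turn.slots)


def apply_paraphrases(template_utterances, paraphrase_map):
    """Reuse collected rewrites; fall back to the template when none exists."""
    return [paraphrase_map.get(t, t) for t in template_utterances]


outline = generate_outline(SCHEMA, random.Random(0))
rendered = [render_template(t) for t in outline]
# Stand-in for crowd-worker rewrites (hypothetical example text).
paraphrases = {"AFFIRM": "Yes, that works for me."}
for line in apply_paraphrases(rendered, paraphrases):
    print(line)
```

Because each turn keeps its act and slot values alongside the template text, a rewrite collected for a template can in principle inherit that turn's annotations, which is the property the validation and expansion steps rely on.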

Experimental results

Research questions

  • RQ1: How can automated self-play and crowdsourcing be combined to rapidly generate high-quality, diverse, and task-relevant dialogue datasets?
  • RQ2: Does M2M provide better coverage of dialogue language and flows compared to traditional crowdsourcing approaches like Wizard-of-Oz or skill-level data collection?
  • RQ3: Can the generated datasets support effective training of both modular and end-to-end dialogue models for goal-oriented tasks?
  • RQ4: What are the practical costs and time savings when bootstrapping a new task dataset using M2M?

Key findings

  • M2M yields higher linguistic diversity and richer dialogue flows than a DSTC2-like restaurant dataset (e.g., more unique transitions per turn and subdialogues).
  • The framework generates a corpus of 3,000 dialogues across two domains (restaurant and movie tickets) using self-play plus crowdsourced paraphrasing.
  • Crowd workers provide natural-language paraphrases via contextual rewrites, enabling efficient annotation of utterances with semantics and slot values.
  • The combined approach enables dataset construction and model training within hours, and crowd workers rate the quality of both user and system turns favorably.
  • M2M datasets support training of state tracking, language understanding, policy, and generation components, and can be used to bootstrap end-to-end models with reinforcement learning.
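
The flow-diversity comparison in the first finding can be made concrete with a small sketch. The exact metric definition is not given in this review, so the code below shows one plausible reading, assumed for illustration only: count the distinct consecutive dialogue-act pairs in a corpus and normalize by the total number of turns. The function names and the toy act sequences are hypothetical.

```python
def act_transitions(dialogue):
    """Return consecutive (previous act, next act) pairs for one dialogue."""
    return list(zip(dialogue, dialogue[1:]))


def unique_transitions_per_turn(dialogues):
    """Distinct act transitions in the corpus divided by total turns.

    This normalization is an assumption, not the paper's stated formula.
    """
    seen = set()
    total_turns = 0
    for dialogue in dialogues:
        seen.update(act_transitions(dialogue))
        total_turns += len(dialogue)
    return len(seen) / total_turns if total_turns else 0.0


# Toy corpus: each dialogue is a sequence of dialogue-act labels.
corpus = [
    ["request", "inform", "offer", "affirm"],
    ["request", "inform", "request", "inform"],
]
print(unique_transitions_per_turn(corpus))
```

A higher value under this reading means the corpus exercises more distinct act-to-act moves relative to its size.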


This review was created by AI and reviewed by human editors.