QUICK REVIEW

[Paper Review] A Survey of Natural Language Generation Techniques with a Focus on Dialogue Systems - Past, Present and Future Directions

Sashank Santhanam, Samira Shaikh|arXiv (Cornell University)|Jun 2, 2019

Topic Modeling | 129 references | 34 citations

TL;DR

The paper surveys natural language generation methods, from traditional to deep learning approaches, with a focus on open-domain dialogue systems. It outlines three key future directions (incorporating larger context, adding persona, and avoiding dull responses) and suggests cognitive architectures as a way to pursue them.

ABSTRACT

One of the hardest problems in the area of Natural Language Processing and Artificial Intelligence is automatically generating language that is coherent and understandable to humans. Teaching machines how to converse as humans do falls under the broad umbrella of Natural Language Generation. Recent years have seen unprecedented growth in the number of research articles published on this subject in conferences and journals both by academic and industry researchers. There have also been several workshops organized alongside top-tier NLP conferences dedicated specifically to this problem. All this activity makes it hard to clearly define the state of the field and reason about its future directions. In this work, we provide an overview of this important and thriving area, covering traditional approaches, statistical approaches and also approaches that use deep neural networks. We provide a comprehensive review towards building open domain dialogue systems, an important application of natural language generation. We find that, predominantly, the approaches for building dialogue systems use seq2seq or language models architecture. Notably, we identify three important areas of further research towards building more effective dialogue systems: 1) incorporating larger context, including conversation context and world knowledge; 2) adding personae or personality in the NLG system; and 3) overcoming dull and generic responses that affect the quality of system-produced responses. We provide pointers on how to tackle these open problems through the use of cognitive architectures that mimic human language understanding and generation capabilities.

Motivation & Objective

Provide an overview of natural language generation (NLG) from traditional approaches to deep learning methods.
Examine open-domain dialogue systems as a key application of NLG.
Identify research gaps and propose future directions for more coherent, context-aware and personalized dialogue generation.
Discuss how cognitive architectures can tackle current limitations in NLG for dialogue.

Proposed method

Review traditional NLG architectures and their subcomponents (content determination, document structuring, lexicalization, referring expression generation, sentence aggregation, linguistic realization).
Discuss statistical and rule-based methods used in content selection and realization prior to neural models.
Survey deep learning approaches (language models, encoder–decoder/seq2seq, attention, memory networks, transformer-based methods) and their impact on dialogue systems.
Highlight persistent challenges in dialogue generation such as lack of context encoding, generic responses, and missing persona.
Provide a synthesis of future directions and potential cognitive-architecture-inspired solutions.
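
To make the traditional subcomponents listed above more concrete, the following is a toy sketch of how they might be chained into a pipeline. It is not taken from the survey: the `Fact` record, the `TEMPLATES` table, the importance threshold, and every function name are hypothetical, and the ordering of the middle stages is simply one arrangement chosen for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    subject: str     # entity the fact describes
    attribute: str   # kind of information, e.g. "temperature"
    value: str       # surface value, e.g. "21 C"
    importance: int  # hypothetical relevance score


# Hypothetical phrase templates used for lexicalization.
TEMPLATES = {
    "temperature": "has a temperature of {value}",
    "sky": "has {value} skies",
    "wind": "has {value} wind",
}


def determine_content(facts, min_importance=2):
    """Content determination: drop facts below a relevance threshold."""
    return [f for f in facts if f.importance >= min_importance]


def structure_document(facts):
    """Document structuring: most important facts first (stable sort)."""
    return sorted(facts, key=lambda f: -f.importance)


def lexicalize(fact):
    """Lexicalization: pick the words that express one fact."""
    return fact.subject, TEMPLATES[fact.attribute].format(value=fact.value)


def aggregate(clauses, max_per_sentence=2):
    """Aggregation: merge adjacent clauses that share a subject."""
    sentences = []
    for subject, predicate in clauses:
        last = sentences[-1] if sentences else None
        if last and last[0] == subject and len(last[1]) < max_per_sentence:
            last[1].append(predicate)
        else:
            sentences.append((subject, [predicate]))
    return sentences


def refer(sentences):
    """Referring expression generation: pronoun for a repeated subject."""
    result, previous = [], None
    for subject, predicates in sentences:
        result.append(("it" if subject == previous else subject, predicates))
        previous = subject
    return result


def realize(sentences):
    """Linguistic realization: join clauses, capitalize, punctuate."""
    texts = []
    for subject, predicates in sentences:
        text = f"{subject} {' and '.join(predicates)}."
        texts.append(text[0].upper() + text[1:])
    return " ".join(texts)


def generate(facts):
    """Run all stages in sequence on a list of facts."""
    selected = structure_document(determine_content(facts))
    clauses = [lexicalize(f) for f in selected]
    return realize(refer(aggregate(clauses)))


facts = [
    Fact("Paris", "temperature", "21 C", 3),
    Fact("Paris", "sky", "clear", 3),
    Fact("Paris", "wind", "light", 2),
    Fact("Paris", "sky", "unknown", 0),  # removed by content determination
]
print(generate(facts))
# prints: Paris has a temperature of 21 C and has clear skies. It has light wind.
```

Each stage is a separate, inspectable function with hand-written rules. That modularity is what distinguishes this style of pipeline from the neural encoder-decoder models surveyed next, which learn a single mapping from input to output text.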

Experimental results

Research questions

RQ1: What are the main historical and contemporary approaches to natural language generation for dialogue systems?
RQ2: What are the key research gaps hindering open-domain dialogue systems from achieving coherent, context-rich and personalized interactions?
RQ3: How can future NLG systems incorporate larger context and world knowledge, and how might persona or personality be integrated to improve dialogue quality?
RQ4: What role can cognitive architectures play in addressing current limitations of NLG for dialogue?

Key findings

Seq2seq models and language models dominate current dialogue system approaches.
Three main future directions are identified: incorporating larger context and world knowledge, integrating personae or personality, and overcoming dull, generic responses.
Open problems include encoding broader dialogue context, maintaining coherent persona, and producing more engaging and contextually relevant responses.
The survey suggests exploring cognitive architectures that mimic human language understanding and generation to address foundational challenges.
Traditional NLG components (content determination, structuring, lexicalization, REG, aggregation, realization) remain relevant for understanding and organizing neural approaches.

This review was created by AI and reviewed by human editors.