
[Paper Review] Neural Text Generation: Past, Present and Beyond

Sidi Lu, Yaoming Zhu | arXiv (Cornell University) | Mar 15, 2018
Topic Modeling | References: 31 | Cited by: 54
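One-Sentence Summary

This paper surveys neural text generation models, compares training paradigms (MLE, RL, adversarial learning), and benchmarks several models on image-caption and long-text datasets, highlighting their strengths and limitations.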

ABSTRACT

This paper presents a systematic survey on recent development of neural text generation models. Specifically, we start from recurrent neural network language models with the traditional maximum likelihood estimation training scheme and point out its shortcoming for text generation. We thus introduce the recently proposed methods for text generation based on reinforcement learning, re-parametrization tricks and generative adversarial nets (GAN) techniques. We compare different properties of these models and the corresponding techniques to handle their common problems such as gradient vanishing and generation diversity. Finally, we conduct a benchmarking experiment with different types of neural text generation models on two well-known datasets and discuss the empirical results along with the aforementioned model properties.

Research Motivation and Objectives

  • Review the evolution of neural text generation from RNNLMs to RL and GAN-based methods.
  • Analyze training paradigms (MLE, reinforcement learning, adversarial training) and their trade-offs.
  • Benchmark representative NTG models on standard datasets and discuss empirical findings.
  • Identify common problems (exposure bias, gradient vanishing, mode collapse) and proposed solutions.
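The trade-offs among the training paradigms are easier to see when the objectives are written out. The formulas below are the standard textbook forms, not expressions quoted from this review; the symbols (generator $G_\theta$, discriminator $D_\phi$, sequence $x_{1:T}$, sequence-level reward $R$) are assumed here for illustration.

```latex
% MLE (teacher forcing): maximize the log-likelihood of real sequences
J_{\mathrm{MLE}}(\theta) =
  \mathbb{E}_{x_{1:T} \sim p_{\mathrm{data}}}
  \Big[ \sum_{t=1}^{T} \log G_\theta(x_t \mid x_{1:t-1}) \Big]

% Policy gradient (REINFORCE): sequences are sampled from the model itself
\nabla_\theta J_{\mathrm{RL}}(\theta) =
  \mathbb{E}_{x_{1:T} \sim G_\theta}
  \Big[ R(x_{1:T}) \sum_{t=1}^{T} \nabla_\theta \log G_\theta(x_t \mid x_{1:t-1}) \Big]

% Adversarial training: the reward is supplied by a learned discriminator
R(x_{1:T}) = D_\phi(x_{1:T})
```

Under MLE the prefix $x_{1:t-1}$ always comes from real data, while at generation time it comes from the model, which is the mismatch known as exposure bias. In the adversarial case, when $D_\phi$ confidently rejects generated samples, $R$ is close to zero for almost every sample and the gradient signal shrinks, which is one way to read the gradient vanishing problem.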

Proposed Methods

  • Systematic literature review of NTG methods including MLE, RL, and adversarial training frameworks.
  • Comparison of techniques to address exposure bias and generation diversity.
  • Benchmarking with the Texygen platform on the Image COCO and EMNLP2017 WMT datasets.
  • Discussion of stability issues and proposed improvements (reward rescaling, ranking-based discriminators, hierarchical models).
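As a rough illustration of reward rescaling, the sketch below replaces each raw discriminator reward in a batch with a value that depends only on its rank, so the spread of rewards stays fixed even when the raw values are all close to zero. It is written in the spirit of rank-based rescaling schemes such as the one described for LeakGAN; the function name `rescale_rewards` and the default `delta` are hypothetical choices, not values taken from the paper.

```python
import math


def rescale_rewards(rewards, delta=12.0):
    """Map raw rewards to rank-based values in (0, 1).

    Rank 1 is the highest raw reward in the batch. Each reward becomes
    sigmoid(delta * (0.5 - rank / batch_size)), so the output depends on
    the ordering of the rewards, not on their absolute scale.
    `delta` is an assumed smoothness hyperparameter.
    """
    batch_size = len(rewards)
    # Indices sorted from highest to lowest raw reward.
    order = sorted(range(batch_size), key=lambda i: rewards[i], reverse=True)
    rescaled = [0.0] * batch_size
    for rank, idx in enumerate(order, start=1):
        z = delta * (0.5 - rank / batch_size)
        rescaled[idx] = 1.0 / (1.0 + math.exp(-z))
    return rescaled


# Raw rewards that are all tiny, as when the discriminator is very confident.
raw = [1e-6, 3e-6, 2e-6, 4e-6]
print(rescale_rewards(raw))
```

Because only the ranking matters, multiplying every raw reward by the same positive constant leaves the rescaled values unchanged, which is what keeps the learning signal from collapsing as the discriminator improves.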

Experimental Results

Research Questions

  • RQ1: How do MLE-based NTG models compare with RL- and GAN-based approaches in terms of quality and diversity?
  • RQ2: What techniques mitigate exposure bias, gradient vanishing, and mode collapse in NTG models?
  • RQ3: How do different NTG architectures (SeqGAN, RankGAN, MaliGAN, LeakGAN, MaskGAN, TextGAN) perform on short vs. long text generation tasks?
  • RQ4: What behavioral patterns emerge in benchmarking across datasets like Image COCO and WMT?

Key Findings

  • LeakGAN shows strong BLEU performance on long text generation datasets.
  • SeqGAN excels on short text generation but struggles with diversity and gradient vanishing.
  • MaliGAN improves gradient stability and maintains diversity in several settings.
  • MaskGAN and TextGAN generally underperform on BLEU in the reported experiments, with TextGAN showing severe mode collapse in long-text generation.
  • MLE remains a strong baseline, often close to or surpassing some GAN/RL variants on shorter texts.
  • Self-BLEU analysis (a higher Self-BLEU score means less diverse output) points to mode collapse in several models, most severely in TextGAN on long text.
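The idea behind Self-BLEU can be sketched in a few lines: each generated sentence is scored with BLEU against all the other generated sentences, and the scores are averaged. The code below is a minimal sketch that uses a simplified BLEU (add-one smoothed n-gram precisions up to bigrams) rather than the implementation used in the benchmark; the function names and the `max_n` default are assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(hypothesis, references, max_n=2):
    """Simplified sentence-level BLEU with add-one smoothed precisions."""
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hypothesis, n)
        # Clip each hypothesis n-gram by its maximum count in any reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(c, max_ref[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        log_precisions.append(math.log((clipped + 1) / (total + 1)))
    # Brevity penalty against the reference closest in length.
    hyp_len = len(hypothesis)
    closest = min((len(r) for r in references),
                  key=lambda ref_len: (abs(ref_len - hyp_len), ref_len))
    if hyp_len >= closest:
        penalty = 1.0
    else:
        penalty = math.exp(1 - closest / max(hyp_len, 1))
    return penalty * math.exp(sum(log_precisions) / max_n)


def self_bleu(samples, max_n=2):
    """Average BLEU of each sample against all other samples.

    Expects at least two tokenized samples. Higher values mean the
    samples resemble each other more, i.e. lower diversity.
    """
    scores = []
    for i, hyp in enumerate(samples):
        others = samples[:i] + samples[i + 1:]
        scores.append(bleu(hyp, others, max_n))
    return sum(scores) / len(scores)


collapsed = [["a", "cat", "sat"] for _ in range(4)]
diverse = [["a", "cat", "sat"], ["the", "dog", "ran"],
           ["birds", "fly", "south"], ["we", "like", "tea"]]
print(self_bleu(collapsed), self_bleu(diverse))
```

A generator that keeps emitting the same sentence scores at the top of the range, while a set of sentences with no shared n-grams scores much lower, which is why the metric is read as an indicator of mode collapse.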

This review was generated by AI and reviewed by human editors.