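[Paper Review] DP-GAN: Diversity-Promoting Generative Adversarial Network for Generating Informative and Diversified Text
DP-GAN uses a language-model based discriminator to reward novelty in generated text, producing more diverse and informative output than repetition-prone baselines on review and dialogue generation.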
Existing text generation methods tend to produce repeated and "boring" expressions. To tackle this problem, we propose a new text generation model, called Diversity-Promoting Generative Adversarial Network (DP-GAN). The proposed model assigns low reward for repeatedly generated text and high reward for "novel" and fluent text, encouraging the generator to produce diverse and informative text. Moreover, we propose a novel language-model based discriminator, which can better distinguish novel text from repeated text without the saturation problem compared with existing classifier-based discriminators. The experimental results on review generation and dialogue generation tasks demonstrate that our model can generate substantially more diverse and informative text than existing baselines. The code is available at https://github.com/lancopku/DPGAN
Motivation and Objectives
- Address the issue of repetitive, dull text from standard MLE-trained generators.
- Promote diversity and informativeness in generated text via adversarial reinforcement learning.
- Propose a language-model based discriminator to provide non-saturating rewards.
- Demonstrate improved diversity and relevance on review and dialogue generation tasks.
Proposed Method
- Use a hierarchical sequence-to-sequence generator to produce multi-sentence text.
- Adopt a language-model based discriminator that outputs cross-entropy as rewards instead of a binary classifier.
- Compute sentence-level and word-level rewards from the discriminator output.
- Train the generator with policy gradient using the cross-entropy rewards.
- Pre-train both generator and discriminator before adversarial training.
- Read word-level rewards directly from the discriminator's per-token output, avoiding Monte Carlo search, for efficiency.
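
The reward steps above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the discriminator is a language model that yields a probability for each generated token, takes the word-level reward to be that token's negative log-probability, and takes the sentence-level reward to be the mean of those values. The function names and the example probabilities are hypothetical.

```python
import math
from typing import List


def word_level_rewards(token_probs: List[float]) -> List[float]:
    """Per-token reward: negative log-probability under the discriminator LM.

    Tokens the discriminator finds predictable (high probability) earn a low
    reward; surprising tokens earn a high one. Probabilities must be > 0.
    """
    return [-math.log(p) for p in token_probs]


def sentence_level_reward(token_probs: List[float]) -> float:
    """Sentence reward: mean per-token cross-entropy (assumed averaging)."""
    if not token_probs:
        raise ValueError("sentence must contain at least one token")
    rewards = word_level_rewards(token_probs)
    return sum(rewards) / len(rewards)


def policy_gradient_loss(gen_log_probs: List[float], rewards: List[float]) -> float:
    """REINFORCE-style surrogate loss: -sum(reward * log pi(token)).

    Rewards are treated as constants; minimizing this loss raises the
    generator's probability of highly rewarded tokens.
    """
    return -sum(r * lp for r, lp in zip(rewards, gen_log_probs))


# Hypothetical discriminator probabilities for two three-token sentences.
repeated_probs = [0.9, 0.8, 0.9]  # text the discriminator has seen many times
novel_probs = [0.2, 0.1, 0.3]     # text the discriminator finds surprising

repeated_reward = sentence_level_reward(repeated_probs)
novel_reward = sentence_level_reward(novel_probs)
```

Because the reward is an unbounded cross-entropy value rather than a classifier probability squeezed into [0, 1], it keeps separating novel from repeated text even when the discriminator is very confident, which is the non-saturation property the review refers to.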
Experimental Results
Research Questions
- RQ1: Can a language-model based discriminator provide non-saturating, informative rewards to encourage novelty?
- RQ2: Does combining sentence-level and word-level rewards yield greater diversity and informativeness than either alone?
- RQ3: How does DP-GAN compare to MLE, PG-BLEU, and SeqGAN in diversity and relevance for review and dialogue generation?
- RQ4: Does DP-GAN produce text that more closely matches real-world data distributions, especially for low-frequency words?
Key Findings
| Dataset | Model | Token | Dist-1 | Dist-2 | Dist-3 |
|---|---|---|---|---|---|
| Yelp | DP-GAN(SW) | 438.6K | 3.4K | 22.3K | 49.6K |
| Yelp | DP-GAN(S) | 438.6K | 1.7K | 7.5K | 15.7K |
| Yelp | DP-GAN(W) | 271.9K | 2.8K | 14.8K | 29.0K |
| Amazon | DP-GAN(SW) | 383.6K | 1.9K | 11.7K | 26.3K |
| Amazon | DP-GAN(S) | 467.6K | 0.8K | 3.6K | 7.6K |
| Amazon | DP-GAN(W) | 279.4K | 1.6K | 8.9K | 18.4K |
| Dialogue | DP-GAN(SW) | 97.3K | 2.1K | 10.8K | 19.1K |
| Dialogue | DP-GAN(S) | 112.2K | 1.5K | 5.2K | 8.5K |
| Dialogue | DP-GAN(W) | 79.4K | 1.9K | 7.7K | 11.4K |
- DP-GAN substantially outperforms baselines in automatic diversity metrics (distinct unigrams, bigrams, trigrams, and sentences).
- DP-GAN achieves higher diversity and relevance in human evaluation with only a slight drop in fluency.
- A combined reward (sentence-level plus word-level) yields higher diversity and longer text than using either reward alone.
- The language-model based discriminator avoids reward saturation and better distinguishes novel text from repetitive text.
- Generated data distributions under DP-GAN more closely resemble real-world distributions, including low-frequency words.
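
The automatic diversity metrics mentioned above (distinct unigrams, bigrams, trigrams, and sentences) are raw counts over the generated corpus. The sketch below shows one way to compute them; the function names are hypothetical, and counting n-grams within each sentence rather than across sentence boundaries is an assumption.

```python
from typing import List, Set, Tuple


def distinct_ngrams(sentences: List[List[str]], n: int) -> int:
    """Count distinct n-grams over all sentences (no cross-sentence n-grams)."""
    seen: Set[Tuple[str, ...]] = set()
    for tokens in sentences:
        for i in range(len(tokens) - n + 1):
            seen.add(tuple(tokens[i:i + n]))
    return len(seen)


def distinct_sentences(sentences: List[List[str]]) -> int:
    """Count distinct sentences, comparing token sequences exactly."""
    return len({tuple(tokens) for tokens in sentences})


def total_tokens(sentences: List[List[str]]) -> int:
    """Total number of generated tokens."""
    return sum(len(tokens) for tokens in sentences)


# Toy generated corpus: the first sentence is produced twice.
generated = [
    ["the", "food", "was", "good"],
    ["the", "food", "was", "good"],
    ["great", "service"],
]

metrics = {
    "Token": total_tokens(generated),
    "Dist-1": distinct_ngrams(generated, 1),
    "Dist-2": distinct_ngrams(generated, 2),
    "Dist-3": distinct_ngrams(generated, 3),
    "Dist-S": distinct_sentences(generated),
}
```

In this toy corpus the repeated sentence adds to the token total but contributes nothing to any distinct count, which is why a repetitive generator scores poorly on these metrics however much text it produces.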
This review was generated by AI and checked by a human editor.