QUICK REVIEW

[Paper Review] Delete, Retrieve, Generate: A Simple Approach to Sentiment and Style Transfer

Juncen Li, Robin Jia|arXiv (Cornell University)|Apr 17, 2018

Topic Modeling|References: 20|Citations: 44

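One-Line Summary

This paper proposes a simple unsupervised approach to text attribute transfer that identifies and deletes attribute markers, retrieves examples with the target attribute, and generates fluent output; in human evaluation it outperforms adversarial models.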

ABSTRACT

We consider the task of text attribute transfer: transforming a sentence to alter a specific attribute (e.g., sentiment) while preserving its attribute-independent content (e.g., changing "screen is just the right size" to "screen is too small"). Our training data includes only sentences labeled with their attribute (e.g., positive or negative), but not pairs of sentences that differ only in their attributes, so we must learn to disentangle attributes from attribute-independent content in an unsupervised way. Previous work using adversarial methods has struggled to produce high-quality outputs. In this paper, we propose simpler methods motivated by the observation that text attributes are often marked by distinctive phrases (e.g., "too small"). Our strongest method extracts content words by deleting phrases associated with the sentence's original attribute value, retrieves new phrases associated with the target attribute, and uses a neural model to fluently combine these into a final output. On human evaluation, our best method generates grammatical and appropriate responses on 22% more inputs than the best previous system, averaged over three attribute transfer datasets: altering sentiment of reviews on Yelp, altering sentiment of reviews on Amazon, and altering image captions to be more romantic or humorous.

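Motivation and Objectives

Motivates text attribute transfer from non-parallel data, where sentences carry only attribute labels.
Proposes simpler, easier-to-train methods that separate content from attribute markers.
Shows that deleting attribute markers and reassembling the sentence with target markers yields fluent output.
Shows that retrieval-augmented generation improves grammaticality and attribute match over baselines and earlier adversarial models.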

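Proposed Method

Identifies attribute markers as discriminative n-grams by comparing their relative frequencies across the attribute-labeled corpora.
Delete: removes high-salience attribute markers from the input sentence, leaving the content.
Retrieve: fetches a target-attribute sentence with similar content, using TF-IDF overlap or content-embedding distance.
Generate: either combines the content with target attribute markers (TemplateBased) or generates with a neural model (DeleteOnly, DeleteAndRetrieve), optionally conditioning on the retrieved target markers.
Training (DeleteOnly): trained with an autoencoding objective that reconstructs the sentence from its content and original attribute.
Training (DeleteAndRetrieve): trained with a denoising objective, which prevents trivial splicing of markers and teaches the model to use the retrieved markers for fluent generation.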
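
The delete and retrieve steps can be sketched on a toy corpus. The following is a minimal illustration under stated assumptions, not the authors' implementation: salience is taken to be a smoothed ratio of n-gram counts between the two corpora, the smoothing constant (1.0) and threshold (3.0) are illustrative values, retrieval uses plain Jaccard word overlap as a stand-in for the TF-IDF overlap mentioned above, and the final join is a naive substitute for TemplateBased rather than a neural generator. All function and variable names are hypothetical.

```python
from collections import Counter

def count_ngrams(sentences, max_n=4):
    """Count every n-gram (n = 1..max_n) in a list of whitespace-tokenized sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts

def salience(gram, own_counts, other_counts, smoothing=1.0):
    """Assumed salience: smoothed count in the n-gram's own corpus over the other corpus."""
    return (own_counts[gram] + smoothing) / (other_counts[gram] + smoothing)

def delete_markers(sentence, own_counts, other_counts, threshold=3.0, max_n=4):
    """Delete step: split tokens into content and markers (tokens covered by a salient n-gram)."""
    tokens = sentence.split()
    is_marker = [False] * len(tokens)
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            if salience(gram, own_counts, other_counts) >= threshold:
                for j in range(i, i + n):
                    is_marker[j] = True
    content = [t for t, m in zip(tokens, is_marker) if not m]
    markers = [t for t, m in zip(tokens, is_marker) if m]
    return content, markers

def retrieve(content, target_sentences, target_counts, source_counts, threshold=3.0):
    """Retrieve step: return (sentence, markers) of the target sentence with the closest content.
    Jaccard overlap is used here for brevity; it is a stand-in for TF-IDF overlap."""
    query = set(content)
    best, best_score = None, -1.0
    for candidate in target_sentences:
        cand_content, cand_markers = delete_markers(
            candidate, target_counts, source_counts, threshold)
        union = query | set(cand_content)
        score = len(query & set(cand_content)) / len(union) if union else 0.0
        if score > best_score:
            best, best_score = (candidate, cand_markers), score
    return best

# Toy attribute-labeled corpora (invented for illustration only).
negative = ["the screen is too small", "the keyboard is too small",
            "the food was terrible", "the service was terrible"]
positive = ["the screen is just right", "the keyboard is just right",
            "the food was great", "the service was great"]
neg_counts, pos_counts = count_ngrams(negative), count_ngrams(positive)

content, markers = delete_markers("the screen is too small", neg_counts, pos_counts)
retrieved_sentence, target_markers = retrieve(content, positive, pos_counts, neg_counts)
output = " ".join(content + target_markers)  # naive recombination, not a neural generator

print(content, markers)   # ['the', 'screen'] ['is', 'too', 'small']
print(output)             # the screen is just right
```

Because markers are n-grams, the function word "is" is deleted along with "too small" here: the bigram "is too" occurs only in the negative corpus. Raising the threshold keeps more of the original sentence as content, and lowering it deletes more aggressively.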

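Experimental Results

Research Questions

RQ1: Can text attribute transfer be achieved by deleting attribute-specific phrases and reintroducing the target attribute through retrieval and generation?
RQ2: Do simple, non-adversarial approaches outperform adversarially trained models on sentiment and style transfer tasks under human evaluation?
RQ3: How does conditioning on retrieved target markers affect the grammaticality and content preservation of the transferred output?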

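Key Findings

A simple baseline that deletes attribute markers and retrieves content carrying the target attribute outperforms earlier adversarial systems by a clear margin in human evaluation.
The strongest neural variant (DeleteAndRetrieve) has the best overall performance across the three datasets, outperforming all other automatic methods.
On each of the Yelp, Amazon, and Captions datasets, the best method outperforms earlier methods in grammaticality, content preservation, and target-attribute match.
In human evaluation, the best method (DeleteAndRetrieve) outperforms the other systems in grammaticality, content preservation, and attribute match, and the marker-deletion threshold gives an explicit way to tune the trade-off.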

This review was generated by AI and checked by a human editor.