QUICK REVIEW

[Paper Digest] Tailor: A Prompt-Based Approach to Attribute-Based Controlled Text Generation

Kexin Yang, Dayiheng Liu|arXiv (Cornell University)|Apr 28, 2022

Topic Modeling | Cited by 20

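One-Sentence Summary

Tailor uses continuous, pre-trained attribute prompts to steer a frozen GPT-2 for single-attribute CTG, and achieves multi-attribute generation through prompt concatenation, masking, re-indexed position ids, and a trainable MAP connector, improving fluency and robustness without full-model fine-tuning.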

ABSTRACT

Attribute-based Controlled Text Generation (CTG) refers to generating sentences that satisfy desirable attributes (e.g., emotions and topics). Existing works often utilize fine-tuning or resort to extra attribute classifiers, yet suffer from storage and inference time increases. To address these concerns, we explore attribute-based CTG in a prompt-based manner. In short, the proposed Tailor represents each attribute as a pre-trained continuous vector (i.e., single-attribute prompt) and guides the generation of a fixed PLM switch to a pre-specified attribute. We experimentally find that these prompts can be simply concatenated as a whole to multi-attribute CTG without any re-training, yet raises problems of fluency decrease and position sensitivity. To this end, Tailor provides a multi-attribute prompt mask and a re-indexing position-ids sequence to bridge the gap between the training (one prompt for each task) and testing stage (concatenating more than one prompt). To further enhance such single-attribute prompt combinations, Tailor also introduces a trainable prompt connector, which can be concatenated with any two single-attribute prompts to multi-attribute text generation. Experiments on 11 attribute-specific generation tasks demonstrate strong performances of Tailor on both single-attribute and multi-attribute CTG, with 0.08% training parameters of a GPT-2.

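Motivation and Objectives

Enable efficient attribute-based controlled text generation without storing a fine-tuned model for each attribute.
Propose a prompt-based framework in which each attribute is a pre-trained continuous prompt that steers a frozen language model.
Achieve robust multi-attribute generation by concatenating single-attribute prompts and closing the gap between training and testing.
Introduce training-free mechanisms (MAP mask, RP sequence) to mitigate the fluency drop and position sensitivity.
Provide a trainable MAP connector that strengthens and generalizes multi-attribute combinations, including unseen attribute combinations.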

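Proposed Method

Represent each attribute as a fixed, pre-trained continuous prompt (a single-attribute prompt); only the prompt is trained, on attribute-specific data.
Concatenate the single-attribute prompt with the input prefix and feed the result to a frozen GPT-2 to generate attribute-controlled text.
For multi-attribute generation, concatenate single-attribute prompts and address the fluency and position-sensitivity problems with the MAP mask and the RP sequence.
Introduce a MAP connector, a small trainable module that combines two single-attribute prompts (using a pseudo-attribute prompt during training) to enable multi-attribute generation.
During MAP connector training, use pseudo-prompt construction (argmax-based or weighted) to simulate multi-attribute prompts.
Evaluate single-attribute and multi-attribute CTG tasks on the YELP dataset with GPT-2 Base, using automatic metrics for correctness, text quality, and diversity.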
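
The two training-free mechanisms can be illustrated with a small sketch. This is one plausible reading of the description, not the authors' implementation: it assumes the MAP mask blocks attention between different single-attribute prompts while the text attends to all of them, that attention is causal throughout, and that the RP sequence restarts the position ids for every prompt so each one sees the positions it was trained with. The function names `build_map_mask` and `build_rp_position_ids` are hypothetical.

```python
def build_map_mask(prompt_len, text_len, num_prompts=2):
    """Boolean attention mask, mask[q][k] is True when query q may attend to key k.

    Assumed layout: [prompt_1 | prompt_2 | ... | text tokens].
    Assumption: causal attention, and no attention across different prompts.
    """
    offset = num_prompts * prompt_len          # index of the first text token
    total = offset + text_len
    mask = [[False] * total for _ in range(total)]
    for q in range(total):
        for k in range(q + 1):                 # causal: only keys at or before q
            if q < offset:
                # q is a prompt token: attend only inside its own prompt
                mask[q][k] = (q // prompt_len) == (k // prompt_len)
            else:
                # q is a text token: attend to every prompt and earlier text
                mask[q][k] = True
    return mask


def build_rp_position_ids(prompt_len, text_len, num_prompts=2):
    """Re-indexed position ids: each prompt restarts at 0, text follows one prompt."""
    prompt_ids = list(range(prompt_len)) * num_prompts
    text_ids = list(range(prompt_len, prompt_len + text_len))
    return prompt_ids + text_ids


if __name__ == "__main__":
    for row in build_map_mask(prompt_len=2, text_len=2):
        print("".join("x" if allowed else "." for allowed in row))
    print(build_rp_position_ids(prompt_len=2, text_len=3))
```

With this layout, swapping the order of the two prompts changes neither what each prompt can see nor the position ids it receives, which is the intuition behind reducing position sensitivity.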

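Experimental Results

Research Questions

RQ1: Can attribute-specific prompts steer a frozen language model to generate sentences with a desired single attribute, without fine-tuning the model?
RQ2: Can single-attribute prompts be extended to multi-attribute text generation by concatenation, and how can fluency be preserved?
RQ3: Do mechanisms such as the MAP mask, re-indexed position ids, and the MAP connector improve the quality and robustness of multi-attribute generation, including for unseen attribute combinations?
RQ4: In multi-attribute CTG, what advantages do training-free methods of combining prompts have over trained ones?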

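Key Findings

Single-attribute prompts achieve competitive attribute control with minimal parameter updates (Tailor-S trains 0.08% of GPT-2's parameters).
Concatenating single-attribute prompts enables multi-attribute generation, but can reduce fluency and introduce position sensitivity.
The MAP mask and the RP sequence mitigate attention between the concatenated prompts and position sensitivity, improving the stability of multi-attribute generation without retraining.
The MAP connector, trained with pseudo-prompts, further improves multi-attribute generation and generalizes to unseen attribute combinations.
Tailor variants achieve strong performance on multi-attribute CTG on YELP with far fewer training parameters than fine-tuning baselines.
In few-shot settings, Tailor variants outperform the baselines with almost negligible additional training parameters.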
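
To give a sense of scale for the 0.08% figure, the following back-of-the-envelope sketch counts the parameters of one prompt relative to the backbone. It rests on assumptions the digest does not confirm: that a prompt is a matrix of `prompt_len × hidden_size` input embeddings, and that the prompt length is 128 (`ASSUMED_PROMPT_LEN`). The hidden size of 768 and the roughly 124M parameter count are those of the 12-layer GPT-2 checkpoint.

```python
GPT2_BASE_PARAMS = 124_439_808   # approximate size of the 12-layer GPT-2 checkpoint
HIDDEN_SIZE = 768                # embedding width of that checkpoint
ASSUMED_PROMPT_LEN = 128         # assumption for illustration, not stated in the digest


def prompt_param_share(prompt_len, hidden_size, backbone_params):
    """Fraction of backbone parameters taken up by one continuous prompt."""
    prompt_params = prompt_len * hidden_size
    return prompt_params / backbone_params


share = prompt_param_share(ASSUMED_PROMPT_LEN, HIDDEN_SIZE, GPT2_BASE_PARAMS)

if __name__ == "__main__":
    print(f"prompt parameters: {ASSUMED_PROMPT_LEN * HIDDEN_SIZE}")
    print(f"share of backbone: {share:.4%}")
```

Under these assumptions the share rounds to the reported 0.08%, so a prompt of this size is one configuration consistent with the figure; other prompt designs could produce the same ratio.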

This digest was generated by AI and reviewed by a human editor.