QUICK REVIEW

[Paper Review] Tailor: A Prompt-Based Approach to Attribute-Based Controlled Text Generation

Kexin Yang, Dayiheng Liu | arXiv (Cornell University) | Apr 28, 2022

Topic Modeling | 20 citations

TL;DR

Tailor uses continuous, pre-trained attribute prompts to steer a fixed GPT-2 for single-attribute CTG and enables multi-attribute generation through prompt concatenation, masking, re-indexed positions, and a trainable MAP connector to improve fluency and robustness without full-model fine-tuning.

ABSTRACT

Attribute-based Controlled Text Generation (CTG) refers to generating sentences that satisfy desirable attributes (e.g., emotions and topics). Existing works often utilize fine-tuning or resort to extra attribute classifiers, yet suffer from storage and inference time increases. To address these concerns, we explore attribute-based CTG in a prompt-based manner. In short, the proposed Tailor represents each attribute as a pre-trained continuous vector (i.e., single-attribute prompt) and guides the generation of a fixed PLM switch to a pre-specified attribute. We experimentally find that these prompts can be simply concatenated as a whole to multi-attribute CTG without any re-training, yet raises problems of fluency decrease and position sensitivity. To this end, Tailor provides a multi-attribute prompt mask and a re-indexing position-ids sequence to bridge the gap between the training (one prompt for each task) and testing stage (concatenating more than one prompt). To further enhance such single-attribute prompt combinations, Tailor also introduces a trainable prompt connector, which can be concatenated with any two single-attribute prompts to multi-attribute text generation. Experiments on 11 attribute-specific generation tasks demonstrate strong performances of Tailor on both single-attribute and multi-attribute CTG, with 0.08% training parameters of a GPT-2.

Motivation & Objective

Motivate efficient attribute-based controlled text generation without storing fine-tuned models for every attribute.
Propose a prompt-based framework where each attribute is a pre-trained continuous prompt guiding a fixed language model.
Enable robust multi-attribute generation by concatenating single-attribute prompts and addressing training-testing gaps.
Introduce two training-free mechanisms, the multi-attribute prompt (MAP) mask and the re-indexed position-ids (RP) sequence, to mitigate the fluency loss and position sensitivity caused by concatenation.
Provide a trainable MAP connector to enhance and generalize multi-attribute composition, including unseen attribute combinations.

Proposed method

Represent each attribute as a continuous prompt (single-attribute prompt) and train only that prompt on attribute-specific data while the language model stays frozen.
Feed the single-attribute prompt concatenated with the input prefix into a fixed GPT-2 to generate attribute-controlled text.
For multi-attribute generation, concatenate single-attribute prompts and address fluency/position-sensitivity with MAP mask and RP sequence.
Introduce a MAP connector, a small trainable prompt that is concatenated with any two single-attribute prompts for multi-attribute generation and is trained with the help of a pseudo-attribute prompt.
Use pseudo-prompt construction (argmax-based or weighted) to simulate multi-attribute prompts during MAP connector training.
Evaluate using a GPT-2 base model across single- and multi-attribute CTG tasks on the Yelp dataset with objective metrics for correctness, text quality, and diversity.
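
The two training-free mechanisms can be illustrated with a small sketch. It is one plausible reading of the description above, not the authors' implementation: it assumes the input is laid out as [prompt A | prompt B | text], that the MAP mask blocks attention between the two prompts on top of ordinary causal masking, and that the RP sequence gives both prompts the same position ids so that each one sits where it sat during single-prompt training. The function names `build_map_mask` and `build_rp_position_ids` are hypothetical.

```python
from typing import List


def build_map_mask(prompt_len: int, text_len: int) -> List[List[bool]]:
    """Return mask[i][j] == True when position i may attend to position j.

    Assumed layout: [prompt A | prompt B | text], prompts of equal length.
    Assumed rule: causal attention, except that the two prompts never
    attend to each other. The text can still attend to both prompts.
    """
    a_end = prompt_len
    b_end = 2 * prompt_len
    total = b_end + text_len

    def segment(index: int) -> str:
        if index < a_end:
            return "A"
        if index < b_end:
            return "B"
        return "text"

    mask = []
    for i in range(total):
        row = []
        for j in range(total):
            causal = j <= i
            cross_prompt = {segment(i), segment(j)} == {"A", "B"}
            row.append(causal and not cross_prompt)
        mask.append(row)
    return mask


def build_rp_position_ids(prompt_len: int, text_len: int) -> List[int]:
    """Return re-indexed position ids for [prompt A | prompt B | text].

    Assumed rule: both prompts reuse ids 0..prompt_len-1, and the text
    continues from prompt_len, as it would after a single prompt.
    """
    prompt_ids = list(range(prompt_len))
    text_ids = list(range(prompt_len, prompt_len + text_len))
    return prompt_ids + prompt_ids + text_ids
```

With `prompt_len=2` and `text_len=3`, `build_rp_position_ids` returns `[0, 1, 0, 1, 2, 3, 4]`, and the mask lets every text position see both prompts while keeping the prompts hidden from each other.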

Experimental results

Research questions

RQ1: Can attribute-specific prompts steer a fixed language model to generate sentences with the desired single attributes without fine-tuning the model?
RQ2: Are single-attribute prompts scalable to multi-attribute text generation through concatenation, and how can fluency be preserved?
RQ3: Do mechanisms like MAP mask, re-indexed position IDs, and MAP connector improve multi-attribute generation quality and robustness, including unseen attribute combinations?
RQ4: What is the comparative benefit of non-training versus training approaches for combining prompts in multi-attribute CTG?

Key findings

Single-attribute prompts enable competitive attribute control while training very few parameters (Tailor-S trains about 0.08% of GPT-2's parameter count).
Concatenating single-attribute prompts can yield multi-attribute generation, but may reduce fluency and introduce position sensitivity.
MAP mask and RP sequence mitigate interference between the concatenated prompts and position sensitivity, improving stability in multi-attribute generation without retraining.
MAP connector, trained with pseudo-prompts, further enhances multi-attribute generation and generalizes to unseen attribute combinations.
Tailor variants achieve strong performance on multi-attribute CTG on Yelp with substantially fewer training parameters compared to fine-tuning baselines.
In few-shot settings, Tailor variants outperform baselines with negligible extra training parameters.
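
To put the 0.08% figure in absolute terms, the short calculation below converts it to a parameter count. The GPT-2 size used here (roughly 124 million parameters, the figure commonly cited for the smallest GPT-2 checkpoint) is an assumption and is not stated in the review, so the result is only an order-of-magnitude illustration.

```python
# Assumption: the smallest GPT-2 checkpoint has roughly 124 million parameters.
# This count is not given in the review; it is used here only for scale.
GPT2_PARAMS = 124_000_000

# 0.08% written as an exact integer ratio (8 / 10,000) to avoid float noise.
trainable_params = GPT2_PARAMS * 8 // 10_000

print(f"{trainable_params:,} trainable parameters")  # prints "99,200 trainable parameters"
```

Under that assumption, the trainable prompt amounts to about a hundred thousand parameters, compared with the full model that fine-tuning baselines would need to store per attribute.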

This review was created by AI and reviewed by human editors.