QUICK REVIEW

[Paper Review] Plug and Play Language Models: A Simple Approach to Controlled Text Generation

Sumanth Dathathri, Andrea Madotto | arXiv (Cornell University) | 2019-12-04

Speech and dialogue systems | Citations: 407

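One-line summary

PPLM steers generation without retraining by combining a pretrained language model with lightweight attribute models: gradient updates in the LM's latent space control topic and sentiment while fluency is preserved.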

ABSTRACT

Large transformer-based language models (LMs) trained on huge text corpora have shown unparalleled generation capabilities. However, controlling attributes of the generated language (e.g. switching topic or sentiment) is difficult without modifying the model architecture or fine-tuning on attribute-specific data and entailing the significant cost of retraining. We propose a simple alternative: the Plug and Play Language Model (PPLM) for controllable language generation, which combines a pretrained LM with one or more simple attribute classifiers that guide text generation without any further training of the LM. In the canonical scenario we present, the attribute models are simple classifiers consisting of a user-specified bag of words or a single learned layer with 100,000 times fewer parameters than the LM. Sampling entails a forward and backward pass in which gradients from the attribute model push the LM's hidden activations and thus guide the generation. Model samples demonstrate control over a range of topics and sentiment styles, and extensive automated and human annotated evaluations show attribute alignment and fluency. PPLMs are flexible in that any combination of differentiable attribute models may be used to steer text generation, which will allow for diverse and creative applications beyond the examples given in this paper.

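Motivation and goals

Explains why controllable text generation is needed without retraining large LMs.
Proposes a flexible framework that combines a base LM with simple attribute models at inference time.
Demonstrates topic and sentiment control and evaluates attribute alignment and fluency.
Compares PPLM with existing baselines and shows its potential for detoxification and constrained storytelling.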

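Proposed method

Uses a pretrained transformer-based LM as p(x) without modifying its parameters.
Attaches one or more differentiable attribute models p(a|x) (e.g., a bag of words or a single-layer discriminator).
Performs gradient-based updates in the LM latent space H_t to increase log p(a|x), while a KL-divergence penalty and post-norm fusion keep log p(x) high.
Accumulates the latent perturbation ΔH_t, recomputes the next-token distribution from H_t + ΔH_t, and samples the token from it; the strength of control is tunable.
Combines multiple attribute models and applies post-norm fusion with the original LM distribution to preserve fluency.
Evaluates with both automated metrics (perplexity, Dist-1/2/3) and human evaluation of fluency and attribute relevance.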
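The update-and-fuse loop described above can be sketched on a toy model. This is a minimal illustration under stated assumptions, not the authors' implementation: the "LM" is a single linear map from a hidden vector to vocabulary logits (hypothetical matrix `W`), the attribute model is a bag of words over token ids, and the gradient is written out analytically instead of being obtained by backpropagation through a transformer. The KL penalty is omitted for brevity, and the parameter names (`alpha`, `gamma`, `gamma_gm`) are illustrative choices.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logits_from_hidden(W, h):
    # Toy stand-in for the LM head: logits = W @ h (assumption, not a transformer).
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def bow_grad_hidden(W, p, bag):
    # Bag-of-words attribute model: log p(a|x) = log(sum of p over bag tokens).
    # Its gradient w.r.t. the logits is q - p, where q is p restricted to the
    # bag and renormalized; the chain rule through W gives the gradient w.r.t. h.
    mass = sum(p[i] for i in bag)
    g_logits = [(p[i] / mass if i in bag else 0.0) - p[i] for i in range(len(p))]
    return [sum(W[i][j] * g_logits[i] for i in range(len(W)))
            for j in range(len(W[0]))]

def pplm_step(W, h, bag, alpha=0.5, gamma=1.0, num_iters=5, gamma_gm=0.9):
    # Returns (unmodified, perturbed, fused) next-token distributions.
    p_orig = softmax(logits_from_hidden(W, h))
    delta = [0.0] * len(h)  # latent perturbation, starts at zero
    for _ in range(num_iters):
        p = softmax(logits_from_hidden(W, [a + b for a, b in zip(h, delta)]))
        g = bow_grad_hidden(W, p, bag)
        norm = math.sqrt(sum(x * x for x in g))
        if norm == 0.0:
            break
        # Normalized gradient ascent on log p(a|x).
        delta = [d + alpha * x / norm ** gamma for d, x in zip(delta, g)]
    p_pert = softmax(logits_from_hidden(W, [a + b for a, b in zip(h, delta)]))
    # Fusion: geometric interpolation between perturbed and original
    # distributions, then renormalization.
    fused = [pp ** gamma_gm * po ** (1.0 - gamma_gm)
             for pp, po in zip(p_pert, p_orig)]
    z = sum(fused)
    return p_orig, p_pert, [f / z for f in fused]

# Toy usage: vocabulary of 4 tokens, hidden size 3, bag contains token id 2.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
h = [1.0, 0.5, 0.0]
bag = {2}
p_orig, p_pert, p_fused = pplm_step(W, h, bag)
print("bag mass before:", sum(p_orig[i] for i in bag))
print("bag mass after fusion:", sum(p_fused[i] for i in bag))
```

In this sketch, lowering `gamma_gm` pulls the fused distribution back toward the unmodified LM, which is the fluency-preserving role the review attributes to fusion, while raising `alpha` or `num_iters` strengthens the attribute control.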

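Experimental results

Research questions

RQ1: Can PPLM steer text generation toward a predefined attribute (topic or sentiment) without retraining the LM?
RQ2: How does latent-space manipulation compare with reweighting the output distribution for attribute control?
RQ3: Does combining KL regularization with post-norm fusion preserve fluency while enforcing attribute alignment?
RQ4: Is PPLM effective across different attribute models (BoW, discriminator) and domains?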

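Key results

PPLM achieves attribute control over topic and sentiment while keeping fluency comparable to the base LM.
Latent-space manipulation (BC/BCR) yields much stronger topic control than reranking or reweighting alone.
PPLM-Discrim with latent updates plus ranking (BCR) achieves strong sentiment control with competitive fluency and outperforms several baselines.
Compared with CTRL and a GPT-2 fine-tuned for positivity, PPLM often matches or exceeds these baselines on the attribute relevance and fluency measures the authors use.
PPLM can be used for detoxification by using gradients from a toxicity detector to steer generation, and it supports structured story writing.
The experiment code is publicly available, which supports practical accessibility and reproducibility.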
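The automated evaluation behind these results includes the Dist-1/2/3 diversity scores. As a rough sketch of how such a score can be computed, the hypothetical helper `dist_n` below divides the number of distinct n-grams by the total number of n-grams in a token list. Whitespace tokenization and this particular normalization are assumptions made for illustration; the paper's evaluation scripts may differ.

```python
def dist_n(tokens, n):
    # Ratio of distinct n-grams to total n-grams; higher means less repetition.
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)

# Toy usage with whitespace tokenization (an assumption for this sketch).
sample = "the cat sat on the mat the cat".split()
for n in (1, 2, 3):
    print(f"Dist-{n}:", dist_n(sample, n))
```

A controlled generator that collapses into repeating the attribute keywords would score low on these ratios, which is why they are reported alongside perplexity.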

This review was generated by AI and checked by a human editor.