QUICK REVIEW

[Paper Review] Controlling Output Length in Neural Encoder-Decoders

Yuta Kikuchi, Graham Neubig|arXiv (Cornell University)|Sep 30, 2016

Topic Modeling | 38 references | 43 citations

TL;DR

This paper proposes four methods—two decoding-based and two learning-based—for controlling output sequence length in neural encoder-decoder models, specifically for sentence summarization. The learning-based methods ($\mathit{LenEmb}$ and $\mathit{LenInit}$) effectively constrain output length to desired targets without degrading ROUGE scores, outperforming decoding-based approaches on longer summaries while maintaining competitive performance on standard benchmarks.

ABSTRACT

Neural encoder-decoder models have shown great success in many sequence generation tasks. However, previous work has not investigated situations in which we would like to control the length of encoder-decoder outputs. This capability is crucial for applications such as text summarization, in which we have to generate concise summaries with a desired length. In this paper, we propose methods for controlling the output sequence length for neural encoder-decoder models: two decoding-based methods and two learning-based methods. Results show that our learning-based methods have the capability to control length without degrading summary quality in a summarization task.

Motivation & Objective

To address the lack of explicit length control in neural encoder-decoder models for sequence generation tasks.
To enable summarization systems to generate outputs of desired length, which is critical for applications like document summarization and headline generation.
To evaluate whether length control can be achieved without degrading summary quality, particularly in terms of ROUGE scores.
To compare decoding-based vs. learning-based approaches for length control in terms of effectiveness and robustness.
To demonstrate that the proposed methods maintain competitive performance on standard DUC2004 benchmarks while enabling controllable output length.

Proposed method

Two decoding-based methods ($\mathit{fixLen}$ and $\mathit{fixRng}$) leave the trained model unchanged and constrain beam search at inference time: $\mathit{fixLen}$ forbids the end-of-sequence token and stops decoding once the desired length is reached, while $\mathit{fixRng}$ discards candidate sequences whose length falls outside a specified minimum-to-maximum range.
Two learning-based methods ($\mathit{LenEmb}$ and $\mathit{LenInit}$) modify the model so that the decoder is conditioned on the desired output length.
$\mathit{LenEmb}$ feeds a learned embedding of the remaining length to the LSTM decoder as an additional input at each decoding step, updating the remaining length after every generated word.
$\mathit{LenInit}$ supplies the desired length only once, by initializing the decoder’s memory cell with a learned vector scaled by that length.
All models are trained with the standard sequence-to-sequence cross-entropy objective; for the learning-based models, the length of the reference summary is supplied as the desired length during training.
Length control is evaluated via beam search with length constraints, and performance is measured using ROUGE-1, ROUGE-2, and ROUGE-L metrics.
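
The length-handling logic behind the four methods can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: every function name, the `EOS` token string, and the choice to count only word bytes (ignoring separating spaces) are assumptions made here. The remaining-length update and the scaled memory-cell initialization follow one reading of the paper's formulation.

```python
import math

EOS = "</s>"  # hypothetical end-of-sequence token


def token_bytes(token):
    """Byte cost of one output token (assumption: spaces are not counted)."""
    return 0 if token == EOS else len(token.encode("utf-8"))


def seq_bytes(tokens):
    """Total byte length of a hypothesis under the same assumption."""
    return sum(token_bytes(t) for t in tokens)


def fixlen_scores(log_probs):
    """fixLen-style masking: EOS gets a score of -inf so it is never chosen."""
    masked = dict(log_probs)
    masked[EOS] = -math.inf
    return masked


def fixlen_done(tokens, desired_len):
    """fixLen-style stopping rule: halt once the desired length is reached."""
    return seq_bytes(tokens) >= desired_len


def fixrng_keep(tokens, min_len, max_len):
    """fixRng-style filter: keep a finished hypothesis only if it is in range."""
    return tokens[-1] == EOS and min_len <= seq_bytes(tokens) <= max_len


def remaining_lengths(tokens, desired_len):
    """LenEmb-style bookkeeping: remaining budget seen at each decoding step.

    The budget starts at the desired length, shrinks by the byte cost of each
    generated token, and is clipped at zero. In a real model each value would
    index an embedding table; that lookup is omitted here.
    """
    remaining = [desired_len]
    for tok in tokens:
        remaining.append(max(0, remaining[-1] - token_bytes(tok)))
    return remaining


def leninit_memory_cell(desired_len, b_c):
    """LenInit-style initial memory cell: a trainable vector scaled by length."""
    return [desired_len * w for w in b_c]


hypothesis = ["police", "arrest", "suspect", EOS]
print(seq_bytes(hypothesis))
print(fixrng_keep(hypothesis, min_len=15, max_len=30))
print(remaining_lengths(["police", "arrest"], desired_len=10))
```

The decoding-based helpers only filter or truncate what an unchanged model proposes, whereas the two learning-based helpers produce values that would be fed into the decoder itself, which is why those methods require training with length information.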

Experimental results

Research questions

RQ1: Can neural encoder-decoder models be effectively modified to generate outputs of a specified length?
RQ2: Do learning-based methods for length control outperform decoding-based alternatives in terms of length accuracy and summary quality?
RQ3: Does incorporating length control degrade performance on standard summarization benchmarks like DUC2004?
RQ4: How do the proposed methods compare in controlling long summaries (e.g., 50–75 bytes) versus shorter ones?
RQ5: Can the model maintain high ROUGE scores while achieving precise length control?

Key findings

The learning-based methods $\mathit{LenEmb}$ and $\mathit{LenInit}$ successfully concentrate output lengths around the desired target, as shown in histograms of generated sequences.
$\mathit{LenEmb}$ achieved a ROUGE-L score of 23.88 on the DUC2004 benchmark, outperforming the standard baseline ($\mathit{fixLen}$) and matching state-of-the-art models.
$\mathit{LenInit}$ achieved a ROUGE-L score of 23.25, comparable to the standard model and existing methods, while maintaining strong length control.
For long summaries (e.g., 50–75 bytes), the learning-based methods significantly outperformed the decoding-based methods in length accuracy and consistency.
The beam search results for $\mathit{LenInit}$ showed that all top candidates were close to the desired length (30 bytes), confirming effective length control.
Despite the added complexity of length control, the proposed methods maintained competitive ROUGE scores, indicating no degradation in summary quality.
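
One way to check the "concentrated around the target" finding on a set of generated summaries is to bin their byte lengths and measure the share that lands near the desired length. The sketch below is an illustrative assumption, not the paper's evaluation script: the function names, the bin width, the tolerance, and the sample strings are all hypothetical.

```python
from collections import Counter


def byte_length(summary):
    """Length of a summary in UTF-8 bytes, spaces included."""
    return len(summary.encode("utf-8"))


def length_histogram(summaries, bin_width=5):
    """Count summaries per byte-length bin, keyed by the bin's lower edge."""
    return Counter((byte_length(s) // bin_width) * bin_width for s in summaries)


def within_tolerance(summaries, target, tol):
    """Fraction of summaries whose byte length is within tol of the target."""
    hits = sum(abs(byte_length(s) - target) <= tol for s in summaries)
    return hits / len(summaries)


# Made-up outputs, used only to exercise the helpers.
outputs = [
    "police arrest suspect in raid",
    "markets fall",
    "court rejects appeal by minister",
]
print(sorted(length_histogram(outputs).items()))
print(within_tolerance(outputs, target=30, tol=3))
```

A histogram with most of its mass in the bins next to the target, together with a high within-tolerance fraction, would correspond to the behavior reported for $\mathit{LenEmb}$ and $\mathit{LenInit}$.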

This review was created by AI and reviewed by human editors.