QUICK REVIEW

[Paper Review] Computing and Enumerating Minimal Common Supersequences Between Two Strings

Braeden Sopp, Adiesha Liyanage | arXiv (Cornell University) | Mar 23, 2026

Algorithms and Data Compression | 0 citations

TL;DR

The paper presents a linear-time algorithm that computes a minimal common supersequence of two strings, and a data structure that enumerates all minimal common supersequences using quadratic space and linear-time delay after cubic-time preprocessing.

ABSTRACT

Given $k$ strings each of length at most $n$, computing the shortest common supersequence of them is a well-known NP-hard problem (when $k$ is unbounded). On the other hand, when $k=2$, such a shortest common supersequence can be computed in $O(n^2)$ time using dynamic programming as a textbook example. In this paper, we consider the problem of computing a \emph{minimal} common supersequence and enumerating all minimal common supersequences for $k=2$ input strings. Our results are summarized as follows. A minimal common supersequence of $k=2$ input strings can be computed in $O(n)$ time. (The method also works when $k$ is a constant). All minimal common supersequences between two input strings can be enumerated with a data structure of $O(n^2)$ space and an $O(n)$ time delay, and the data structure can be constructed in $O(n^3)$ time.
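For reference, the textbook dynamic program mentioned in the abstract can be sketched as follows. This is a minimal illustration of the standard $O(n^2)$ shortest common supersequence recurrence, not code from the paper; the function name `scs_length` is chosen here for illustration.

```python
def scs_length(a: str, b: str) -> int:
    """Length of a shortest common supersequence of a and b (textbook DP)."""
    n, m = len(a), len(b)
    # dp[i][j] = SCS length of the prefixes a[:i] and b[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i  # only a[:i] has to be covered
    for j in range(m + 1):
        dp[0][j] = j  # only b[:j] has to be covered
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                # one shared character serves both strings
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                # append the last character of a or of b, whichever is cheaper
                dp[i][j] = 1 + min(dp[i - 1][j], dp[i][j - 1])
    return dp[n][m]


print(scs_length("abc", "bca"))  # 4, for example "abca"
```

A minimal common supersequence need not be shortest: "bcabc" is a common supersequence of "abc" and "bca" from which no single character can be deleted, yet it has length 5 rather than 4.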

Motivation & Objective

Motivate efficient computation of minimal common supersequences (MCS) as a complement to LCS/SCS problems.
Provide a linear-time method to compute an MCS for two strings (k=2) and extend insights to multiple strings.
Develop an enumeration framework to list all MCSs with provable time/space guarantees.

Proposed method

Sweep a common supersequence and delete removable indices using essentiality criteria (Lemma 3.1) to obtain an MCS in O(n) time for two strings.
Construct right embeddings to track how A and B embed into the supersequence (BuildRightEmbedding).
Prove properties of MCS via essential indices (Lem and Rem embeddings) to guide deletions and ensure minimality.
Provide a constructive algorithm (ReduceSupersequence) that outputs an MCS by removing nonessential positions in linear time.
Extend the approach to k strings with O(kn(log k + log n)) time to compute an MCS for k strings.
Develop an enumeration framework that partitions A and B into blocks and models MCSs as paths in a labeled bipartite graph G(A,B), with st-paths corresponding bijectively to MCSs.
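The reduction idea in the list above (start from any common supersequence and delete positions that are not needed) can be illustrated with a deliberately naive sketch. It rechecks the supersequence property after every tentative deletion, so it runs in quadratic rather than linear time, and it does not use the essentiality criteria of Lemma 3.1. The names `reduce_supersequence_naive` and `is_minimal` are hypothetical and are not the paper's ReduceSupersequence.

```python
def is_subsequence(x, s) -> bool:
    """True if x can be obtained from s by deleting characters."""
    it = iter(s)
    return all(ch in it for ch in x)


def is_common_supersequence(s, a, b) -> bool:
    return is_subsequence(a, s) and is_subsequence(b, s)


def reduce_supersequence_naive(s: str, a: str, b: str) -> str:
    """Greedily delete characters of s while it stays a common supersequence.

    A position that cannot be deleted now can never be deleted later,
    because later deletions only shrink the string, so one pass suffices.
    """
    if not is_common_supersequence(s, a, b):
        raise ValueError("s is not a common supersequence of a and b")
    chars = list(s)
    i = 0
    while i < len(chars):
        candidate = chars[:i] + chars[i + 1:]
        if is_common_supersequence(candidate, a, b):
            chars = candidate  # position i was removable
        else:
            i += 1  # position i is needed; keep it
    return "".join(chars)


def is_minimal(s: str, a: str, b: str) -> bool:
    """Check minimality by testing every single-character deletion."""
    return is_common_supersequence(s, a, b) and all(
        not is_common_supersequence(s[:i] + s[i + 1:], a, b)
        for i in range(len(s))
    )


# The concatenation a + b is always a common supersequence to start from.
print(reduce_supersequence_naive("ab" + "ba", "ab", "ba"))  # aba
```

Testing single deletions is enough for minimality: if some proper subsequence of S were still a common supersequence, then S with one of the dropped characters removed would be one as well.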

Figure 1: Depicted on the left is $G(A,B)$ containing all nodes, with the vertices of $G_{st}(A,B)$ colored according to their respective partitions. Red vertices belong to partition $V_{A}$ and blue vertices belong to partition $V_{B}$. On the right we have $G_{st}(A,B)$ with the vertices labeled. $A=b

Research questions

RQ1: Can a minimal common supersequence for two strings be computed in time linear in n?
RQ2: How can all minimal common supersequences between two strings be enumerated efficiently?
RQ3: What data structure and partitioning approach enable a bijective correspondence between MCSs and feasible paths for enumeration?
RQ4: How do extensions to more than two strings affect the time complexity of computing an MCS?
RQ5: What structural properties (embeddings, essential indices) guarantee minimality and correctness of MCS construction and enumeration?

Key findings

An MCS for two input strings can be computed in O(n) time.
All MCSs between two strings can be enumerated with O(n^2) space and O(n) time delay, and the data structure for enumeration can be constructed in O(n^3) time.
For k strings, an MCS can be computed in O(kn(log k + log n)) time.
A linear-time reduction approach from any common supersequence to a minimal one enables efficient MCS computation.
The enumeration framework uses a graph G(A,B) where st-paths bijectively correspond to MCSs.
The paper establishes a precise characterization of MCS via essential indices and interval partitions (Theorems 5.1, 5.2, 5.3).
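To make concrete what the enumeration result lists, the following brute-force sketch enumerates every MCS of two very short strings. It is exponential in the input length and is unrelated to the paper's graph G(A,B); it serves only as a reference oracle for tiny inputs. The function name `all_mcs_bruteforce` is hypothetical. The length bound |A| + |B| holds because every position of an MCS must be used by any fixed pair of embeddings of A and B.

```python
from itertools import product


def is_subsequence(x, s) -> bool:
    it = iter(s)
    return all(ch in it for ch in x)


def is_common_supersequence(s, a, b) -> bool:
    return is_subsequence(a, s) and is_subsequence(b, s)


def all_mcs_bruteforce(a: str, b: str) -> list:
    """Enumerate all minimal common supersequences by exhaustive search."""
    alphabet = sorted(set(a + b))
    found = []
    # An MCS is at least as long as the longer input and at most |a| + |b|.
    for length in range(max(len(a), len(b)), len(a) + len(b) + 1):
        for tup in product(alphabet, repeat=length):
            s = "".join(tup)
            if not is_common_supersequence(s, a, b):
                continue
            # Minimal: deleting any single character breaks the property.
            if all(
                not is_common_supersequence(s[:i] + s[i + 1:], a, b)
                for i in range(len(s))
            ):
                found.append(s)
    return found


print(all_mcs_bruteforce("ab", "ba"))  # ['aba', 'bab']
```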

Figure 2: On the left-hand side, we show the characters used in a right embedding of $A_{1}=abbc$ into $S=abccbacc$ in orange cells. On the right, we depict the output data structure for $S$ and $A_{1}$. On the bottom is the result of $\textsc{MergeRightEmbedding}(S;A_{1},A_{2})$ where $A_{2}=ac$.
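As an illustration of the embedding used in Figure 2, here is a minimal sketch that assumes "right embedding" means the greedy rightmost embedding, obtained by matching the string against $S$ from right to left. That reading is an assumption, the function name `right_embedding` is hypothetical, and the paper's BuildRightEmbedding may construct a richer data structure than a plain list of positions.

```python
def right_embedding(s: str, a: str):
    """Return 0-based positions of the rightmost embedding of a into s,
    or None if a is not a subsequence of s."""
    positions = []
    j = len(s) - 1
    # Match the characters of a from last to first, scanning s right to left.
    for ch in reversed(a):
        while j >= 0 and s[j] != ch:
            j -= 1
        if j < 0:
            return None  # ran out of s before matching all of a
        positions.append(j)
        j -= 1
    positions.reverse()
    return positions


print(right_embedding("abccbacc", "abbc"))  # [0, 1, 4, 7]
print(right_embedding("abccbacc", "ac"))    # [5, 7]
```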

This review was created by AI and reviewed by human editors.