QUICK REVIEW

[Paper Review] Computing and Enumerating Minimal Common Supersequences Between Two Strings

Braeden Sopp, Adiesha Liyanage | arXiv (Cornell University) | Mar 23, 2026
Algorithms and Data Compression | 0 citations
TL;DR

The paper presents a linear-time algorithm to compute a minimal common supersequence for two strings and a structure to enumerate all minimal common supersequences with quadratic-space, linear-time delay, and cubic-time preprocessing.

ABSTRACT

Given $k$ strings each of length at most $n$, computing the shortest common supersequence of them is a well-known NP-hard problem (when $k$ is unbounded). On the other hand, when $k=2$, such a shortest common supersequence can be computed in $O(n^2)$ time using dynamic programming as a textbook example. In this paper, we consider the problem of computing a minimal common supersequence and enumerating all minimal common supersequences for $k=2$ input strings. Our results are summarized as follows. A minimal common supersequence of $k=2$ input strings can be computed in $O(n)$ time. (The method also works when $k$ is a constant.) All minimal common supersequences between two input strings can be enumerated with a data structure of $O(n^2)$ space and an $O(n)$ time delay, and the data structure can be constructed in $O(n^3)$ time.
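
For context, the textbook $O(n^2)$ dynamic program for the shortest common supersequence of two strings, which the abstract mentions as the baseline, can be sketched as follows. This is a generic illustration and not code from the paper; the function names `shortest_common_supersequence` and `is_subsequence` are hypothetical.

```python
def is_subsequence(x, s):
    """Return True if x can be obtained from s by deleting characters."""
    it = iter(s)
    return all(ch in it for ch in x)


def shortest_common_supersequence(a, b):
    """Textbook O(|a| * |b|) dynamic program (illustrative sketch)."""
    n, m = len(a), len(b)
    # dp[i][j] = length of a shortest common supersequence of a[i:] and b[j:]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n:
                dp[i][j] = m - j
            elif j == m:
                dp[i][j] = n - i
            elif a[i] == b[j]:
                dp[i][j] = 1 + dp[i + 1][j + 1]
            else:
                dp[i][j] = 1 + min(dp[i + 1][j], dp[i][j + 1])

    # Walk the table from (0, 0) to rebuild one optimal supersequence.
    out = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] <= dp[i][j + 1]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return "".join(out)


print(shortest_common_supersequence("abbc", "ac"))  # abbc
```

A shortest common supersequence is always minimal, but the converse need not hold: for the strings "ab" and "ca", "cab" is shortest, while "abca" is minimal (no single deletion keeps it a common supersequence) yet longer.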

Motivation & Objective

  • Motivate efficient computation of minimal common supersequences (MCS) as a complement to LCS/SCS problems.
  • Provide a linear-time method to compute an MCS for two strings (k=2) and extend insights to multiple strings.
  • Develop an enumeration framework to list all MCSs with provable time/space guarantees.

Proposed method

  • Sweep a common supersequence and delete removable indices using essentiality criteria (Lemma 3.1) to obtain an MCS in O(n) time for two strings.
  • Construct right embeddings to track how A and B embed into the supersequence (BuildRightEmbedding).
  • Prove properties of MCS via essential indices (Lem and Rem embeddings) to guide deletions and ensure minimality.
  • Provide a constructive algorithm (ReduceSupersequence) that outputs an MCS by removing nonessential positions in linear time.
  • Extend the approach to k strings with O(kn(log k + log n)) time to compute an MCS for k strings.
  • Develop an enumeration framework that partitions A and B into blocks and models MCSs as paths in a labeled bipartite graph G(A,B), with st-paths corresponding bijectively to MCSs.
Figure 1: Depicted on the left is $G(A,B)$ containing all nodes with the vertices of $G_{st}(A,B)$ colored in terms of their respective partitions. Red vertices belong to partition $V_{A}$ and blue vertices belong to partition $V_{B}$. On the right we have $G_{st}(A,B)$ with the vertices labeled. $A=b
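
The reduction idea in the method list (start from any common supersequence and delete positions that are not needed) can be illustrated with a deliberately naive sketch. It is not the paper's ReduceSupersequence: that procedure relies on right embeddings and the essentiality criteria to reach O(n) time, whereas the version below simply re-checks the subsequence property after each trial deletion and so runs in quadratic time. The names `reduce_to_minimal` and `is_subsequence` are hypothetical.

```python
def is_subsequence(x, s):
    """Return True if x can be obtained from s by deleting characters."""
    it = iter(s)
    return all(ch in it for ch in x)


def reduce_to_minimal(s, a, b):
    """Naive reduction of a common supersequence s of a and b to a minimal one.

    Assumption: this quadratic re-checking loop stands in for the paper's
    linear-time essentiality test; only the overall sweep idea is shared.
    """
    if not (is_subsequence(a, s) and is_subsequence(b, s)):
        raise ValueError("s must be a common supersequence of a and b")
    chars = list(s)
    i = 0
    while i < len(chars):
        candidate = chars[:i] + chars[i + 1:]
        if is_subsequence(a, candidate) and is_subsequence(b, candidate):
            chars = candidate  # position i was removable; retry the same index
        else:
            i += 1  # position i is needed; keep it and move on
    return "".join(chars)


# The concatenation of the two inputs is always a common supersequence.
print(reduce_to_minimal("abbc" + "ac", "abbc", "ac"))  # abbc
```

A single left-to-right pass is enough here: if deleting a position fails at the moment it is tested, it would also fail in the final, shorter string, because any supersequence of a common supersequence is itself a common supersequence.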

Experimental results

Research questions

  • RQ1: Can a minimal common supersequence for two strings be computed in time linear in n?
  • RQ2: How can all minimal common supersequences between two strings be enumerated efficiently?
  • RQ3: What data structure and partitioning approach enable a bijective correspondence between MCSs and feasible paths for enumeration?
  • RQ4: How do extensions to more than two strings affect the time complexity of computing an MCS?
  • RQ5: What structural properties (embeddings, essential indices) guarantee minimality and correctness of MCS construction and enumeration?

Key findings

  • An MCS for two input strings can be computed in O(n) time.
  • All MCSs between two strings can be enumerated with O(n^2) space and O(n) time delay, and the data structure for enumeration can be constructed in O(n^3) time.
  • For k strings, an MCS can be computed in O(kn(log k + log n)) time.
  • A linear-time reduction approach from any common supersequence to a minimal one enables efficient MCS computation.
  • The enumeration framework uses a graph G(A,B) where st-paths bijectively correspond to MCSs.
  • The paper establishes a precise characterization of MCS via essential indices and interval partitions (Theorems 5.1, 5.2, 5.3).
Figure 2: On the left-hand side, we show characters used in a right embedding of $A_{1}=abbc$ into $S=abccbacc$ in orange cells. On the right, we depict the output data structure for $S$ and $A_{1}$. On the bottom is the result of $\textsc{MergeRightEmbedding}(S;A_{1},A_{2})$ where $A_{2}=ac$.
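
To make the enumeration result concrete, a brute-force reference for very short strings is sketched below. It does not build the graph G(A,B) and has none of the paper's guarantees (it is exponential in the worst case); it only lists what the st-path structure is meant to enumerate. It relies on the observation that in a minimal common supersequence every position must be used by an embedding of A or of B, so each MCS is an interleaving of A and B in which equal characters may be shared. All function names are hypothetical.

```python
from functools import lru_cache


def is_subsequence(x, s):
    """Return True if x can be obtained from s by deleting characters."""
    it = iter(s)
    return all(ch in it for ch in x)


def is_minimal(s, a, b):
    """s is a common supersequence and no single deletion keeps it one."""
    if not (is_subsequence(a, s) and is_subsequence(b, s)):
        return False
    return not any(
        is_subsequence(a, s[:i] + s[i + 1:]) and is_subsequence(b, s[:i] + s[i + 1:])
        for i in range(len(s))
    )


def all_merges(a, b):
    """All interleavings of a and b, optionally sharing equal characters."""
    @lru_cache(maxsize=None)
    def go(i, j):
        if i == len(a):
            return frozenset({b[j:]})
        if j == len(b):
            return frozenset({a[i:]})
        out = {a[i] + t for t in go(i + 1, j)}
        out |= {b[j] + t for t in go(i, j + 1)}
        if a[i] == b[j]:
            out |= {a[i] + t for t in go(i + 1, j + 1)}
        return frozenset(out)

    return go(0, 0)


def enumerate_mcs_bruteforce(a, b):
    """Exponential-time reference enumeration; only for tiny inputs."""
    return sorted(s for s in all_merges(a, b) if is_minimal(s, a, b))


print(enumerate_mcs_bruteforce("ab", "ca"))  # ['abca', 'acba', 'cab']
```

A reference like this is mainly useful for cross-checking an implementation of the graph-based enumerator on small inputs.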

This review was created by AI and reviewed by human editors.