[Paper Review] Computing and Enumerating Minimal Common Supersequences Between Two Strings
The paper presents a linear-time algorithm that computes a minimal common supersequence of two strings, together with a data structure that enumerates all minimal common supersequences using quadratic space, linear-time delay, and cubic-time preprocessing.
Given \(k\) strings, each of length at most \(n\), computing the shortest common supersequence of them is a well-known NP-hard problem (when \(k\) is unbounded). On the other hand, when \(k=2\), such a shortest common supersequence can be computed in \(O(n^2)\) time using dynamic programming as a textbook example. In this paper, we consider the problem of computing a minimal common supersequence and enumerating all minimal common supersequences for \(k=2\) input strings. Our results are summarized as follows. A minimal common supersequence of \(k=2\) input strings can be computed in \(O(n)\) time. (The method also works when \(k\) is a constant.) All minimal common supersequences between two input strings can be enumerated with a data structure of \(O(n^2)\) space and an \(O(n)\) time delay, and the data structure can be constructed in \(O(n^3)\) time.
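For reference, the textbook dynamic program mentioned in the abstract can be sketched as follows. This is the standard recurrence for the length of a shortest common supersequence, not code from the paper; the function name `scs_length` is chosen here purely for illustration.

```python
def scs_length(a: str, b: str) -> int:
    """Length of a shortest common supersequence of a and b, in O(|a| * |b|) time."""
    m, n = len(a), len(b)
    # dp[i][j] = SCS length of the prefixes a[:i] and b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i  # only a[:i] remains, so every character is needed
    for j in range(n + 1):
        dp[0][j] = j  # only b[:j] remains
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                # one shared character covers both strings
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                # append the last character of one string, take the cheaper option
                dp[i][j] = 1 + min(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]
```

A minimal common supersequence need not be a shortest one. For "ab" and "bc" the shortest common supersequence "abc" has length 3, yet "bcab" (length 4) is also minimal, because deleting any single character of it loses either "ab" or "bc" as a subsequence.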
Motivation & Objective
- Motivate efficient computation of minimal common supersequences (MCS) as a complement to LCS/SCS problems.
- Provide a linear-time method to compute an MCS for two strings (k=2) and extend insights to multiple strings.
- Develop an enumeration framework to list all MCSs with provable time/space guarantees.
Proposed method
- Sweep a common supersequence and delete removable indices using essentiality criteria (Lemma 3.1) to obtain an MCS in O(n) time for two strings.
- Construct right embeddings to track how A and B embed into the supersequence (BuildRightEmbedding).
- Prove properties of MCS via essential indices (Lem and Rem embeddings) to guide deletions and ensure minimality.
- Provide a constructive algorithm (ReduceSupersequence) that outputs an MCS by removing nonessential positions in linear time.
- Extend the approach to k strings with O(kn(log k + log n)) time to compute an MCS for k strings.
- Develop an enumeration framework that partitions A and B into blocks and models MCSs as paths in a labeled bipartite graph G(A,B), whose st-paths correspond bijectively to MCSs.
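The sweep-and-delete idea behind the reduction step can be illustrated with a deliberately naive sketch. It is not the paper's linear-time ReduceSupersequence: the essentiality test of Lemma 3.1 is replaced here by a brute-force subsequence re-check, so the running time is quadratic. The names `is_subsequence` and `reduce_supersequence` are assumptions made for this example.

```python
def is_subsequence(s: str, t: str) -> bool:
    """True if s can be obtained from t by deleting characters."""
    it = iter(t)
    # `ch in it` advances the iterator past the first match, so order is respected
    return all(ch in it for ch in s)


def reduce_supersequence(a: str, b: str, t: str) -> str:
    """Shrink a common supersequence t of a and b to a minimal one.

    Naive stand-in for the paper's linear-time routine: each position is
    tested by deleting it and re-checking both subsequence relations.
    """
    if not (is_subsequence(a, t) and is_subsequence(b, t)):
        raise ValueError("t must be a common supersequence of a and b")
    i = 0
    while i < len(t):
        candidate = t[:i] + t[i + 1:]
        if is_subsequence(a, candidate) and is_subsequence(b, candidate):
            t = candidate  # position i was removable; the next character shifts into i
        else:
            i += 1         # position i is needed, keep it and move on
    return t
```

A single left-to-right pass suffices in this sketch: once a position cannot be deleted, it stays undeletable after further deletions, since any string obtained later is a subsequence of the current one. Starting from the concatenation, `reduce_supersequence("ab", "ba", "abba")` yields "aba".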

Research questions
- RQ1: Can a minimal common supersequence of two strings be computed in time linear in n?
- RQ2: How can all minimal common supersequences between two strings be enumerated efficiently?
- RQ3: What data structure and partitioning approach enable a bijective correspondence between MCSs and st-paths for enumeration?
- RQ4: How do extensions to more than two strings affect the time complexity of computing an MCS?
- RQ5: What structural properties (embeddings, essential indices) guarantee minimality and correctness of MCS construction and enumeration?
Key findings
- An MCS for two input strings can be computed in O(n) time.
- All MCSs between two strings can be enumerated with O(n^2) space and O(n) time delay, and the data structure for enumeration can be constructed in O(n^3) time.
- For k strings, an MCS can be computed in O(kn(log k + log n)) time.
- A linear-time reduction approach from any common supersequence to a minimal one enables efficient MCS computation.
- The enumeration framework uses a graph G(A,B) where st-paths bijectively correspond to MCSs.
- The paper establishes a precise characterization of MCS via essential indices and interval partitions (Theorems 5.1, 5.2, 5.3).
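To make the enumeration target concrete, the set of all MCSs can be listed for tiny inputs by exhaustive search. The sketch below is only a reference oracle under stated assumptions: it does not build the graph G(A,B) or achieve the stated delay, it runs in exponential time, and the function names are hypothetical. It relies on the fact that an MCS has length at most |A| + |B|, since any position used by neither embedding could be deleted.

```python
from itertools import product


def is_subsequence(s: str, t: str) -> bool:
    """True if s can be obtained from t by deleting characters."""
    it = iter(t)
    return all(ch in it for ch in s)


def is_common_supersequence(a: str, b: str, t: str) -> bool:
    return is_subsequence(a, t) and is_subsequence(b, t)


def is_minimal(a: str, b: str, t: str) -> bool:
    """t is a common supersequence and no single deletion keeps it one."""
    if not is_common_supersequence(a, b, t):
        return False
    return not any(
        is_common_supersequence(a, b, t[:i] + t[i + 1:]) for i in range(len(t))
    )


def enumerate_mcs_bruteforce(a: str, b: str) -> list[str]:
    """All minimal common supersequences of a and b (exponential time)."""
    alphabet = sorted(set(a) | set(b))
    found = []
    # candidates range from max(|a|, |b|) up to |a| + |b| characters
    for length in range(max(len(a), len(b)), len(a) + len(b) + 1):
        for chars in product(alphabet, repeat=length):
            t = "".join(chars)
            if is_minimal(a, b, t):
                found.append(t)
    return found
```

Checking single deletions is enough here: if some proper subsequence of a string were still a common supersequence, then deleting any one character outside that subsequence would leave a common supersequence as well. For "ab" and "ba" the search returns "aba" and "bab".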

This review was created by AI and reviewed by human editors.