Abstract
The problem of finding a longest common subsequence of two strings is discussed. This problem arises in data processing applications such as comparing two files and in genetic applications such as studying molecular evolution. The difficulty of computing a longest common subsequence of two strings is examined using the decision tree model of computation, in which vertices represent “equal - unequal” comparisons. It is shown that unless a bound on the total number of distinct symbols is assumed, every solution to the problem can consume an amount of time that is proportional to the product of the lengths of the two strings. A general lower bound as a function of the ratio of alphabet size to string length is derived. The case where comparisons between symbols of the same string are forbidden is also considered and it is shown that this problem is of linear complexity for a two-symbol alphabet and quadratic for an alphabet of three or more symbols.
- 1 AHO, A V, HOPCROF'r, J E, AND ULLMAN, J D The Design and Analysts oJ Computer Algorlthrns. Addison-Wesley, Reading, Mass, 1974 Google ScholarDigital Library
- 2 CARTWRIGHT, D, AND HARARY, F On colorings of signed graphs Elemente der Mathemattk 23 (1968), 85-89Google Scholar
- 3 CHVATAL, V, KLARNER, D A, AND KNUTH, D E Selected combmatonal research problems STAN-CS-72-292, Stanford U, Stanford, Cahf, 1972, p 26 Google ScholarDigital Library
- 4 CHVATAL, V, AND SANKOFF, D Longest common subsequences of two random sequences STAN- CS-75-477, Stanford U, Stanford, Cahf, Jan 1975 Google ScholarDigital Library
- 5 FISCHER, M J , AND PATERSON, M S Strmg matching and other products MAC Technical Memorandum 41, M 1 T , Cambridge, Mass, 1974 Google ScholarDigital Library
- 6 FREDMAN, M L On computing the length of longest mcreasmg subsequences Discrete Mathematics 11, 1 (January 1975) 29-36Google ScholarDigital Library
- 7 HIRSCHBERG, D S A hnear space algorithm for computmg maximal common subsequences Comm 4CM 18, 6 (June 1975) 341-343 Google ScholarDigital Library
- 8 HIRSCHBERG, D S On finding maximal common subsequences TR-156, Computer Sciences Laboratory, Princeton U, Prmceton, N J , 1974Google Scholar
- 9 HUNT. J W, AND SZYMANSKI. T G A fast algorithm for computing longest common subsequences Unpublished memorandum, Princeton University, September 1975Google Scholar
- 10 KERNIGHAN, B W, AND CHERRY, L L A system for typesetting mathematics Comm ACM 18, 3 (March 1975), 151-157 Google ScholarDigital Library
- 11 NEEDLEMAN, S B, AND WUNSCH, C D A general method applicable to the search for similarities m the amino acid sequence of two proteins J Mol Btol 48 (1970), 443-453Google ScholarCross Ref
- 12 PATERSON, M S Unpublished manuscript University of Warwick, England, 1974Google Scholar
- 13 REINGOLD, E M On the optimality of some set algorithms J ACM 19, 4 (October 1972), 649-659 Google ScholarDigital Library
- 14 SANKOFF, D Matching sequences under deletion/insertion constraints Proc. Nat. Acad Sct. USA 69, 1 (Jan, 1972), 4-6Google ScholarCross Ref
- 15 VAN EMDE BOAS, P Preserving order m a forest in less than logarithmic time Sixteenth Annual IEEE Symposium on Foundations o! Computer Science, October 1975Google Scholar
- 16 WAGNER, R A, AND FISCHER, M J The strmg-to-strmg correction problem J ACM 21, 1 (Jan, 1974), 168-173 Google ScholarDigital Library
- 17 WONG, C K, AND CHANDRA, A K Bounds for the string editing problem IBM Research Center, Yorktown Heights, New York, 1974Google Scholar
- 18 YAO, A C, AND YAO, F F On computing the rank function for a set of vectors UIUCDCS-R-75- 699, Dept of Computer Science, U of Illinois, Urbana, I11, Feb, 1975Google Scholar
Index Terms
- Bounds on the Complexity of the Longest Common Subsequence Problem
Recommendations
A hardness result and new algorithm for the longest common palindromic subsequence problem
The 2-LCPS problem, first introduced by Chowdhury et al. (2014) [17], asks one to compute (the length of) a longest common palindromic subsequence between two given strings A and B. We show that the 2-LCPS problem is at least as hard as the well-studied ...
Variants of constrained longest common subsequence
We consider a variant of the classical Longest Common Subsequence problem called Doubly-Constrained Longest Common Subsequence (DC-LCS). Given two strings s"1 and s"2 over an alphabet @S, a set C"s of strings, and a function C"o:@S->N, the DC-LCS ...
The longest common subsequence problem a finite automata approach
CIAA'03: Proceedings of the 8th international conference on Implementation and application of automataA new algorithm that creates a common subsequence automaton for a set of strings is presented. Moreover, it is shown that a longest common subsequence of two strings over a constant alphabet can be found in O (|A| (|S1 + |S2| + Σa∈A|S1|a|S2|a)) time, ...
Comments