2006 | OriginalPaper | Chapter
Genomes Containing Duplicates Are Hard to Compare
Authors : Cedric Chauve, Guillaume Fertin, Romeo Rizzi, Stéphane Vialette
Published in: Computational Science – ICCS 2006
Publisher: Springer Berlin Heidelberg
Activate our intelligent search to find suitable subject content or patents.
Select sections of text to find matching patents with Artificial Intelligence. powered by
Select sections of text to find additional relevant content using AI-assisted search. powered by
In this paper, we are interested in the algorithmic complexity of computing (dis)similarity measures between two genomes when they contain duplicated genes. In that case, there are usually two main ways to compute a given (dis)similarity measure
M
between two genomes
G
1
and
G
2
: the first model, that we will call the
matching model
, consists in computing a one-to-one correspondence between genes of
G
1
and genes of
G
2
, in such a way that
M
is optimized in the resulting permutation. The second model, called the
exemplar model
, consists in keeping in
G
1
(resp.
G
2
) exactly
one
copy of each gene, thus deleting all the other copies, in such a way that
M
is optimized in the resulting permutation. We present here different results concerning the algorithmic complexity of computing three different similarity measures (number of common intervals, MAD number and SAD number) in those two models, basically showing that the problem becomes
NP
-completeness for each of them as soon as genomes contain duplicates. In the case of MAD and SAD, we actually prove that, under both models, both MAD and SAD problems are APX-hard.