2008 | OriginalPaper | Book chapter
Efficient Seeding Techniques for Protein Similarity Search
Authors: Mikhail Roytberg, Anna Gambin, Laurent Noé, Sławomir Lasota, Eugenia Furletova, Ewa Szczurek, Gregory Kucherov
Published in: Bioinformatics Research and Development
Publisher: Springer Berlin Heidelberg
We apply the concept of subset seeds proposed in [?] to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets to construct seeds with optimal sensitivity/selectivity trade-offs. We propose several different design methods and use them to construct several alphabets. We then analyze seeds built over those alphabets and compare them with the standard Blastp seeding method [2,3], as well as with the family of vector seeds proposed in [4]. While the formalism of subset seeds is less expressive (but less costly to implement) than the accumulative principle used in Blastp and vector seeds, our seeds show similar or even better performance than Blastp on Bernoulli models of proteins compatible with the common BLOSUM62 matrix.