2005 | OriginalPaper | Chapter
Protein Sequence Classification Through Relevant Sequence Mining and Bayes Classifiers
Authors: Pedro Gabriel Ferreira, Paulo J. Azevedo
Published in: Progress in Artificial Intelligence
Publisher: Springer Berlin Heidelberg
We tackle the problem of sequence classification using relevant subsequences found in a dataset of labelled protein sequences. A subsequence is relevant if it is frequent and has at least a minimum length. For each query sequence a vector of features is obtained. The features consist of the number and average length of the relevant subsequences shared with each of the protein families. Classification is performed by combining these features in a Bayes classifier. The combination of these characteristics results in a multi-class and multi-domain method that requires neither data transformation nor background knowledge. We illustrate the performance of our method using three collections of protein datasets. The tests showed that the method performs on par with state-of-the-art methods in protein classification.
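The feature extraction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that subsequences are contiguous substrings, that "frequent" means occurring in at least `min_support` sequences of a family, and that pattern length is capped at `max_len` to keep enumeration cheap. The function names, parameter names, thresholds, and toy sequences are all hypothetical. The Bayes classification step is left out because the abstract does not specify the likelihood model.

```python
from collections import Counter


def relevant_subsequences(family_seqs, min_support=2, min_len=3, max_len=6):
    """Return substrings that occur in at least `min_support` sequences
    of one family and are at least `min_len` long (assumed definition)."""
    counts = Counter()
    for seq in family_seqs:
        seen = set()
        for k in range(min_len, max_len + 1):
            for i in range(len(seq) - k + 1):
                seen.add(seq[i:i + k])
        counts.update(seen)  # each substring counted once per sequence
    return {s for s, c in counts.items() if c >= min_support}


def query_features(query, family_patterns):
    """For each family, return (number of relevant subsequences shared
    with the query, average length of those subsequences)."""
    features = {}
    for family, patterns in family_patterns.items():
        shared = [p for p in patterns if p in query]
        avg_len = sum(map(len, shared)) / len(shared) if shared else 0.0
        features[family] = (len(shared), avg_len)
    return features


# Toy data, invented for illustration only.
families = {
    "A": ["MKVLAAG", "KVLAAGT", "MKVLAT"],
    "B": ["GGSPQR", "SPQRGG", "TSPQRA"],
}
patterns = {name: relevant_subsequences(seqs) for name, seqs in families.items()}
features = query_features("AMKVLAAG", patterns)
```

In this toy setting the query shares several relevant subsequences with family A and none with family B, so the per-family feature pairs already separate the two classes. A Bayes classifier would then combine these pairs into a posterior over families.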