2015 | Original Paper | Book Chapter
Domain Adaptation with Logistic Regression for the Task of Splice Site Prediction
Authors: Nic Herndon, Doina Caragea
Published in: Bioinformatics Research and Applications
Supervised classifiers depend heavily on abundant labeled training data. When labeled data is scarce, there are several alternatives: labeling more data, which is costly and time-consuming; training classifiers on abundant data from another domain, although classification accuracy usually decreases as the distance between domains increases; or complementing the limited labeled data with abundant unlabeled data from the same domain and learning semi-supervised classifiers, although the unlabeled data can mislead the classifier. A better alternative is to use both the abundant labeled data from a source domain and the limited labeled data from the target domain to train classifiers in a domain adaptation setting. We propose such a classifier, based on logistic regression, and evaluate it on the task of splice site prediction, a difficult and essential step in gene prediction. Our classifier achieved high accuracy, with its highest areas under the precision-recall curve ranging from 50.83% to 82.61%.
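To make the general idea concrete, the following is a minimal sketch of one simple way to combine abundant source-domain labels with scarce target-domain labels in a logistic regression: both sets are pooled, and each domain receives a fixed share of the total loss. This is an illustrative assumption, not the formulation used in the chapter, which the abstract does not specify. The function names, the `target_weight` parameter, and the toy data are all hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_weighted_logreg(X, y, sample_weight, lr=0.1, n_iter=2000, l2=1e-3):
    """Gradient descent on a weighted, L2-regularised logistic loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    total = sample_weight.sum()
    for _ in range(n_iter):
        p = sigmoid(X @ w + b)
        err = sample_weight * (p - y)          # per-example weighted residual
        w -= lr * (X.T @ err / total + l2 * w)
        b -= lr * err.sum() / total
    return w, b


def fit_domain_adapted(X_src, y_src, X_tgt, y_tgt, target_weight=0.8):
    """Pool source and target data; `target_weight` (hypothetical) is the
    share of the total loss assigned to the small target set."""
    sample_weight = np.concatenate([
        np.full(len(y_src), (1.0 - target_weight) / len(y_src)),
        np.full(len(y_tgt), target_weight / len(y_tgt)),
    ])
    X = np.vstack([X_src, X_tgt])
    y = np.concatenate([y_src, y_tgt])
    return fit_weighted_logreg(X, y, sample_weight)


def predict_proba(X, w, b):
    return sigmoid(X @ w + b)


# Toy one-dimensional data (made up for illustration only).
X_src = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y_src = np.array([0.0, 0.0, 1.0, 1.0])
X_tgt = np.array([[-1.5], [1.5]])
y_tgt = np.array([0.0, 1.0])

w, b = fit_domain_adapted(X_src, y_src, X_tgt, y_tgt)
```

Normalising the weights per domain keeps the few target examples from being drowned out by the much larger source set; in practice `target_weight` would be tuned on held-out target data.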