The era of data mining has renewed research effort in areas of biology whose problems, because of their difficulty and the limited knowledge available, were and still are considered unsolved. One such problem, a fundamental open problem in computational biology, is the prediction of the 3D structure of proteins, or protein structure prediction (PSP). Human experts, with the crucial help of data mining tools, are learning how proteins fold to form their structure, but are still far from providing perfect models for all kinds of proteins. Data mining and knowledge discovery are essential to advancing the understanding of the folding process. In this context, Learning Classifier Systems (LCS) are very competitive tools. They have shown their competence in many different data mining tasks. Moreover, they provide human-readable solutions that can help experts understand the PSP problem. In this chapter we describe our recent efforts in applying LCS to PSP-related domains. Specifically, we focus on a relevant PSP subproblem called Coordination Number (CN) prediction. CN is a kind of simplified profile of the 3D structure of a protein. Two kinds of experiments are described: the first analyzes different ways to represent the basic composition of proteins, their primary sequence, and the second assesses different data sources and problem definition methods for performing competent CN prediction. In all the experiments LCS show their competence in terms of both accurate predictions and explanatory power.
- Data Mining in Proteomics with Learning Classifier Systems
Jonathan D. Hirst
- Springer Berlin Heidelberg