2024 | OriginalPaper | Book chapter

Experiment to Find Out Suitable Machine Learning Algorithm for Enzyme Subclass Classification

Written by: Amitav Saran, Partha Sarathi Ghosh, Umasankar Das, Thiyagarajan Chenga Kalvinathan

Published in: Micro-Electronics and Telecommunication Engineering

Publisher: Springer Nature Singapore

Abstract

Proteins play a major role in determining many characteristics and functions of living beings. Predicting protein classes and subclasses is a prominent research topic in bioinformatics, and machine learning methods, which are widely used for prediction, are also applied to the classification and subclassification of proteins. The problem addressed here is to assign each protein to the subclass it belongs to and to choose a machine learning method suited to subclass classification. The objective is to compare the performance of three existing machine learning methods for protein subclassification: logistic regression, support vector machine (SVM), and random forest. For this study, the methods are implemented and their results are compared while varying both the number of samples per subclass and the number of subclasses. Logistic regression and SVM are binary classifiers, so they are used to predict multiple classes by combining \(\log_2(n)\) classifiers for \(n\) class labels. It is observed that random forest and SVM provide almost the same accuracy for smaller data sizes, but as the data size increases, random forest performs better than SVM.
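The abstract does not say how the \(\log_2(n)\) binary classifiers are combined. One common reading, assumed here, is a binary output code: each class index is written in binary, and one binary classifier is trained per bit. The sketch below shows only the label encoding and decoding for that assumed scheme, not the authors' implementation. The function names and the class labels "A" to "D" are hypothetical.

```python
def num_binary_classifiers(n_classes):
    """Bits needed to give n_classes labels distinct codes.

    (n - 1).bit_length() equals ceil(log2(n)) for n >= 2.
    """
    return max(1, (n_classes - 1).bit_length())


def encode_label(index, n_bits):
    """Binary code of a class index, most significant bit first."""
    return [(index >> shift) & 1 for shift in range(n_bits - 1, -1, -1)]


def decode_bits(bits):
    """Inverse of encode_label: rebuild the class index from its bits."""
    index = 0
    for bit in bits:
        index = (index << 1) | bit
    return index


def per_bit_targets(labels, classes):
    """For each bit position, the 0/1 training target of every sample.

    Row b is the target vector for the b-th binary classifier.
    """
    n_bits = num_binary_classifiers(len(classes))
    index_of = {name: i for i, name in enumerate(classes)}
    codes = [encode_label(index_of[y], n_bits) for y in labels]
    return [[code[b] for code in codes] for b in range(n_bits)]


# Hypothetical subclass labels, for illustration only.
classes = ["A", "B", "C", "D"]
targets = per_bit_targets(["A", "C", "B"], classes)
print(targets)               # one target list per binary classifier
print(decode_bits([1, 0]))   # index of the class whose code is 1, 0
```

At prediction time, each bit classifier would output 0 or 1, and `decode_bits` would map the resulting code back to a class index. When \(n\) is not a power of two, some codes correspond to no class, so a real implementation would need a rule for handling them.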

Metadata
Title
Experiment to Find Out Suitable Machine Learning Algorithm for Enzyme Subclass Classification
Written by
Amitav Saran
Partha Sarathi Ghosh
Umasankar Das
Thiyagarajan Chenga Kalvinathan
Copyright year
2024
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-99-9562-2_21
