2017 | Original Paper | Book Chapter

Prediction of Lysine Pupylation Sites with Machine Learning Methods

Authors: Wenzheng Bao, Zhichao Jiang

Published in: Intelligent Computing Theories and Application

Publisher: Springer International Publishing

Abstract

Lysine pupylation is a crucial type of protein post-translational modification that is involved in many important cellular processes and serious diseases. In practice, identifying pupylated sites in proteins through traditional experimental methods is time-consuming and laborious. Experimental methods alone are not suited to identifying large numbers of pupylated sites quickly. Therefore, machine learning methods are valuable for accelerating the discovery of pupylated lysine sites. Post-translational modification of proteins is one of the most important biological processes studied in proteomics and bioinformatics. In this work, the random forest algorithm is employed as the classification model, with pseudo amino acid composition (PseAAC) as the classification features. Because different PseAAC feature types play different roles in the classification model, a random forest voting method is proposed in this framework. The results demonstrate that this method performs well on this classification problem.
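The abstract does not give the feature definitions or the voting rule, so the following is only a minimal sketch of the two ideas it names: a PseAAC feature vector and a vote across classifiers. It assumes Chou's type 1 PseAAC formulation reduced to a single property (the Kyte-Doolittle hydropathy index), and the function names `pseaac` and `majority_vote`, the parameters `lam` and `w`, and the example peptide are all hypothetical choices for illustration, not the authors' implementation.

```python
from collections import Counter
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index (assumed property; the chapter may use others).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Standardize the property over the 20 amino acids (zero mean, unit variance).
_MU = mean(HYDROPATHY.values())
_SD = pstdev(HYDROPATHY.values())
H = {aa: (v - _MU) / _SD for aa, v in HYDROPATHY.items()}


def pseaac(peptide, lam=3, w=0.05):
    """Return a (20 + lam)-dimensional type 1 PseAAC vector for a peptide.

    lam: number of sequence-order correlation tiers (hypothetical default).
    w:   weight of the correlation terms (hypothetical default).
    """
    seq = [aa for aa in peptide.upper() if aa in H]  # drop non-standard residues
    if len(seq) <= lam:
        raise ValueError("peptide must be longer than lam")

    # First 20 components: plain amino acid composition.
    counts = Counter(seq)
    freqs = [counts[aa] / len(seq) for aa in AMINO_ACIDS]

    # Tier-k correlation factor: mean squared property difference
    # between residues that are k positions apart.
    thetas = []
    for k in range(1, lam + 1):
        diffs = [(H[a] - H[b]) ** 2 for a, b in zip(seq, seq[k:])]
        thetas.append(mean(diffs))

    # Joint normalization, so all components sum to 1.
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


def majority_vote(votes):
    """Combine per-classifier labels; ties go to the label seen first."""
    return Counter(votes).most_common(1)[0][0]


if __name__ == "__main__":
    features = pseaac("MKTAYIAKQRQISFVKSHFSR")  # hypothetical peptide window
    print(len(features))
    print(majority_vote([1, 0, 1]))  # labels from three hypothetical classifiers
```

In a full pipeline, each PseAAC feature type would presumably feed its own random forest (for example scikit-learn's `RandomForestClassifier`), and `majority_vote` would combine their predicted labels for each candidate lysine site. That wiring is an assumption about how the voting scheme is organized.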

Metadata
Title
Prediction of Lysine Pupylation Sites with Machine Learning Methods
Authors
Wenzheng Bao
Zhichao Jiang
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-63312-1_36