Skip to main content
Top

2018 | OriginalPaper | Chapter

Alignment-Free Z-Curve Genomic Cepstral Coefficients and Machine Learning for Classification of Viruses

Authors : Emmanuel Adetiba, Oludayo O. Olugbara, Tunmike B. Taiwo, Marion O. Adebiyi, Joke A. Badejo, Matthew B. Akanle, Victor O. Matthews

Published in: Bioinformatics and Biomedical Engineering

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Accurate detection of pathogenic viruses has become highly imperative. This is because viral diseases constitute a huge threat to human health and wellbeing on a global scale. However, both traditional and recent techniques for viral detection suffer from various setbacks. In codicil, some of the existing alignment-free methods are also limited with respect to viral detection accuracy. In this paper, we present the development of an alignment-free, digital signal processing based method for pathogenic viral detection named Z-Curve Genomic Cesptral Coefficients (ZCGCC). To evaluate the method, ZCGCC were computed from twenty six pathogenic viral strains extracted from the ViPR corpus. Naïve Bayesian classifier, which is a popular machine learning method was experimentally trained and validated using the extracted ZCGCC and other alignment-free methods in the literature. Comparative results show that the proposed ZCGCC gives good accuracy (93.0385%) and improved performance to existing alignment-free methods.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Xie, G., Yu, J., Duan, Z.: New strategy for virus discovery: viruses identified in human feces in the last decade. Sci. China Life Sci. 56(8), 688–696 (2013)CrossRef Xie, G., Yu, J., Duan, Z.: New strategy for virus discovery: viruses identified in human feces in the last decade. Sci. China Life Sci. 56(8), 688–696 (2013)CrossRef
2.
go back to reference Kaushik, A., Tiwari, S., Jayant, R.D., Marty, A., Nair, M.: Towards detection and diagnosis of Ebola virus disease at point-of-care. Biosens. Bioelectron. 75, 254–272 (2016)CrossRef Kaushik, A., Tiwari, S., Jayant, R.D., Marty, A., Nair, M.: Towards detection and diagnosis of Ebola virus disease at point-of-care. Biosens. Bioelectron. 75, 254–272 (2016)CrossRef
3.
go back to reference Mokili, J.L., Rohwer, F., Dutilh, B.E.: Metagenomics and future perspectives in virus discovery. Curr. Opin. Virol. 2(1), 63–77 (2012)CrossRef Mokili, J.L., Rohwer, F., Dutilh, B.E.: Metagenomics and future perspectives in virus discovery. Curr. Opin. Virol. 2(1), 63–77 (2012)CrossRef
4.
go back to reference Mabrouk, M.S.: A study of the potential of EIIP mapping method in exon prediction using the frequency domain techniques. Am. J. Biomed. Eng. 2(2), 17–22 (2012)MathSciNetCrossRef Mabrouk, M.S.: A study of the potential of EIIP mapping method in exon prediction using the frequency domain techniques. Am. J. Biomed. Eng. 2(2), 17–22 (2012)MathSciNetCrossRef
5.
go back to reference Sathish Kumar, S., Duraipandian, N.: An effective identification of species from DNA sequence: a classification technique by integrating DM and ANN. Int. J. Adv. Comput. Sci. Appl. 3(8), 104–114 (2012) Sathish Kumar, S., Duraipandian, N.: An effective identification of species from DNA sequence: a classification technique by integrating DM and ANN. Int. J. Adv. Comput. Sci. Appl. 3(8), 104–114 (2012)
6.
go back to reference Adetiba, E., Olugbara, O.O., Taiwo, T.B.: Identification of pathogenic viruses using genomic cepstral coefficients with radial basis function neural network. In: Pillay, N., Engelbrecht, A.P., Abraham, A., du Plessis, M.C., Snášel, V., Muda, A.K. (eds.) Advances in Nature and Biologically Inspired Computing. AISC, vol. 419, pp. 281–291. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-27400-3_25CrossRef Adetiba, E., Olugbara, O.O., Taiwo, T.B.: Identification of pathogenic viruses using genomic cepstral coefficients with radial basis function neural network. In: Pillay, N., Engelbrecht, A.P., Abraham, A., du Plessis, M.C., Snášel, V., Muda, A.K. (eds.) Advances in Nature and Biologically Inspired Computing. AISC, vol. 419, pp. 281–291. Springer, Cham (2016). https://​doi.​org/​10.​1007/​978-3-319-27400-3_​25CrossRef
7.
go back to reference Pickett, B.E., Greer, D.S., Zhang, Y.: Virus pathogen database and analysis resource (ViPR): a comprehensive bioinformatics database and analysis resource for the coronavirus research community. Viruses 4, 3209–3226 (2012)CrossRef Pickett, B.E., Greer, D.S., Zhang, Y.: Virus pathogen database and analysis resource (ViPR): a comprehensive bioinformatics database and analysis resource for the coronavirus research community. Viruses 4, 3209–3226 (2012)CrossRef
8.
go back to reference Wang, Q., Garrity, G.M., Tiedje, J.M., Cole, J.R.: Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl. Environ. Microbiol. 73(16), 5261–5267 (2007)CrossRef Wang, Q., Garrity, G.M., Tiedje, J.M., Cole, J.R.: Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl. Environ. Microbiol. 73(16), 5261–5267 (2007)CrossRef
9.
go back to reference Li, Y., Tian, K., Yin, C., He, R.L., Yau, S.S.T.: Virus classification in 60-dimensional protein space. Mol. Phylogenet. Evol. 99, 53–62 (2016)CrossRef Li, Y., Tian, K., Yin, C., He, R.L., Yau, S.S.T.: Virus classification in 60-dimensional protein space. Mol. Phylogenet. Evol. 99, 53–62 (2016)CrossRef
11.
go back to reference Kantorovitz, M.R., Robinson, G.E., Sinha, S.: A statistical method for alignment-free comparison of regulatory sequences. Bioinformatics 23(13), i249–i255 (2007)CrossRef Kantorovitz, M.R., Robinson, G.E., Sinha, S.: A statistical method for alignment-free comparison of regulatory sequences. Bioinformatics 23(13), i249–i255 (2007)CrossRef
12.
go back to reference Dai, Q., Yang, Y., Wang, T.: Markov model plus k-word distributions: a synergy that produces novel statistical measures for sequence comparison. Bioinformatics 24(20), 2296–2302 (2008)CrossRef Dai, Q., Yang, Y., Wang, T.: Markov model plus k-word distributions: a synergy that produces novel statistical measures for sequence comparison. Bioinformatics 24(20), 2296–2302 (2008)CrossRef
13.
go back to reference Sims, G.E., Jun, S.R., Wu, G.A., Kim, S.H.: Alignment-free genome comparison with feature frequency profiles (FFP) and optimal resolutions. Proc. Natl. Acad. Sci. 106(8), 2677–2682 (2009)CrossRef Sims, G.E., Jun, S.R., Wu, G.A., Kim, S.H.: Alignment-free genome comparison with feature frequency profiles (FFP) and optimal resolutions. Proc. Natl. Acad. Sci. 106(8), 2677–2682 (2009)CrossRef
14.
go back to reference Deng, M., Yu, C., Liang, Q., He, R.L., Yau, S.S.T.: A novel method of characterizing genetic sequences: genome space with biological distance and applications. PLoS One 6(3), e17293 (2011)CrossRef Deng, M., Yu, C., Liang, Q., He, R.L., Yau, S.S.T.: A novel method of characterizing genetic sequences: genome space with biological distance and applications. PLoS One 6(3), e17293 (2011)CrossRef
15.
go back to reference Yu, C., Liang, Q., Yin, C., He, R.L., Yau, S.S.T.: A novel construction of genome space with biological geometry. DNA Res. 17, 155–168 (2010)CrossRef Yu, C., Liang, Q., Yin, C., He, R.L., Yau, S.S.T.: A novel construction of genome space with biological geometry. DNA Res. 17, 155–168 (2010)CrossRef
16.
go back to reference Yu, C., Hernandez, T., Zheng, H., Yau, S.C., Huang, H.H., He, R.L., Yau, S.S.T.: Real time classification of viruses in 12 dimensions. PLoS One 8(5), e64328 (2013)CrossRef Yu, C., Hernandez, T., Zheng, H., Yau, S.C., Huang, H.H., He, R.L., Yau, S.S.T.: Real time classification of viruses in 12 dimensions. PLoS One 8(5), e64328 (2013)CrossRef
17.
go back to reference Huang, H.H., Yu, C., Zheng, H., Hernandez, T., Yau, S.C., He, R.L., Yau, S.S.T.: Global comparison of multiple-segmented viruses in 12-dimensional genome space. Mol. Phylogenet. Evol. 81, 29–36 (2014)CrossRef Huang, H.H., Yu, C., Zheng, H., Hernandez, T., Yau, S.C., He, R.L., Yau, S.S.T.: Global comparison of multiple-segmented viruses in 12-dimensional genome space. Mol. Phylogenet. Evol. 81, 29–36 (2014)CrossRef
18.
go back to reference Anastassiou, D.: DSP in genomics: processing and frequency-domain analysis of character strings. In: Proceedings of 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing, (ICASSP 2001), vol. 2, pp. 1053–1056. IEEE (2001) Anastassiou, D.: DSP in genomics: processing and frequency-domain analysis of character strings. In: Proceedings of 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing, (ICASSP 2001), vol. 2, pp. 1053–1056. IEEE (2001)
19.
go back to reference Bai Arniker, S., Kwan, H.K.: Advanced numerical representation of DNA sequences. In: International Conference on Bioscience, Biochemistry and Bioinformatics IPCBEE, vol. 3, p. 1 (2012) Bai Arniker, S., Kwan, H.K.: Advanced numerical representation of DNA sequences. In: International Conference on Bioscience, Biochemistry and Bioinformatics IPCBEE, vol. 3, p. 1 (2012)
20.
go back to reference Guo, F.B., Lin, Y., Chen, L.L.: Recognition of protein-coding genes based on Z-curve algorithms. Curr. Genomics 15(2), 95–103 (2014)CrossRef Guo, F.B., Lin, Y., Chen, L.L.: Recognition of protein-coding genes based on Z-curve algorithms. Curr. Genomics 15(2), 95–103 (2014)CrossRef
21.
go back to reference Zhang, R., Zhang, C.T.: A brief review: the z-curve theory and its application in genome analysis. Curr. Genomics 15(2), 78–94 (2014)CrossRef Zhang, R., Zhang, C.T.: A brief review: the z-curve theory and its application in genome analysis. Curr. Genomics 15(2), 78–94 (2014)CrossRef
22.
go back to reference Cornish-Bowden, A.: Nomenclature for incompletely specified bases in nucleic acid sequences: recommendations 1984. Nucleic Acids Res. 13(9), 3021 (1985)CrossRef Cornish-Bowden, A.: Nomenclature for incompletely specified bases in nucleic acid sequences: recommendations 1984. Nucleic Acids Res. 13(9), 3021 (1985)CrossRef
23.
go back to reference Randall, R.B.: A history of cepstrum analysis and its application to mechanical problems. In: International Conference at Institute of Technology of Chartres, France, pp. 11–16 (2013) Randall, R.B.: A history of cepstrum analysis and its application to mechanical problems. In: International Conference at Institute of Technology of Chartres, France, pp. 11–16 (2013)
24.
go back to reference Thakur, S., Adetiba, E., Olugbara, O.O., Millham, R.: Experimentation using short-term spectral features for secure mobile internet voting authentication. Math. Probl. Eng. (2015) Thakur, S., Adetiba, E., Olugbara, O.O., Millham, R.: Experimentation using short-term spectral features for secure mobile internet voting authentication. Math. Probl. Eng. (2015)
26.
go back to reference Adetiba, E., Badejo, J.A., Thakur, S., Matthews, V.O., Adebiyi, M.O., Adebiyi, E.F.: Experimental investigation of frequency chaos game representation for in silico and accurate classification of viral pathogens from genomic sequences. In: Rojas, I., Ortuño, F. (eds.) IWBBIO 2017. LNCS, vol. 10208, pp. 155–164. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-56148-6_13CrossRef Adetiba, E., Badejo, J.A., Thakur, S., Matthews, V.O., Adebiyi, M.O., Adebiyi, E.F.: Experimental investigation of frequency chaos game representation for in silico and accurate classification of viral pathogens from genomic sequences. In: Rojas, I., Ortuño, F. (eds.) IWBBIO 2017. LNCS, vol. 10208, pp. 155–164. Springer, Cham (2017). https://​doi.​org/​10.​1007/​978-3-319-56148-6_​13CrossRef
27.
go back to reference Vijayan, K., Nair, V.V., Gopinath, D.P.: Classification of organisms using frequency-chaos game representation of genomic sequences and ANN. In: 10th National Conference on Technological Trends (NCTT 2009), pp. 6–7 (2009) Vijayan, K., Nair, V.V., Gopinath, D.P.: Classification of organisms using frequency-chaos game representation of genomic sequences and ANN. In: 10th National Conference on Technological Trends (NCTT 2009), pp. 6–7 (2009)
28.
go back to reference Shao, J., Yan, X., Shao, S.: SNR of DNA sequences mapped by general affine transformations of the indicator sequences. J. Math. Biol. 67(2), 433–451 (2013)MathSciNetCrossRefMATH Shao, J., Yan, X., Shao, S.: SNR of DNA sequences mapped by general affine transformations of the indicator sequences. J. Math. Biol. 67(2), 433–451 (2013)MathSciNetCrossRefMATH
29.
go back to reference Adetiba, E., Olugbara, O.O.: Improved classification of lung cancer using radial basis function neural network with affine transforms of Voss representation. PLoS One 10(12), e0143542 (2015)CrossRef Adetiba, E., Olugbara, O.O.: Improved classification of lung cancer using radial basis function neural network with affine transforms of Voss representation. PLoS One 10(12), e0143542 (2015)CrossRef
Metadata
Title
Alignment-Free Z-Curve Genomic Cepstral Coefficients and Machine Learning for Classification of Viruses
Authors
Emmanuel Adetiba
Oludayo O. Olugbara
Tunmike B. Taiwo
Marion O. Adebiyi
Joke A. Badejo
Matthew B. Akanle
Victor O. Matthews
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-78723-7_25

Premium Partner