Published in: Network Modeling Analysis in Health Informatics and Bioinformatics 1/2020

01 December 2020 | Original Article

A tri-nucleotide mapping scheme based on residual volume of amino acids for short length exon prediction using sliding window DFT method

Authors: Amit Kumar Singh, Vinay Kumar Srivastava

Abstract

One of the major challenges in bioinformatics is accurately locating protein-coding regions in a given DNA sequence. Accurate identification of these regions is useful in many applications: for instance, it helps in characterizing new proteins, in drug design, and in revealing the evolutionary background of an organism. Digital signal processing (DSP) based techniques are popular for protein-coding region identification. The first essential step of a DSP-based exon prediction technique is to convert the base sequence into a numerical sequence. The choice of numerical mapping scheme determines how well the characteristic features of the DNA sequence are reflected in the numerical domain, which in turn affects how accurately exons can be located. Over the last two decades, a number of mapping schemes have been used successfully for exon prediction. However, locating short exons remains difficult. In this paper, we propose a tri-nucleotide mapping scheme that exploits the residual volume property of amino acids to encode the given DNA sequence. The results show that the proposed tri-nucleotide mapping scheme performs better than other mapping schemes for short-length exon detection.
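The pipeline outlined in the abstract (map tri-nucleotides to numbers, then look for period-3 energy with a sliding-window DFT) might be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the mapping values, the window length, or whether triplets overlap. The sketch assumes overlapping triplets with a stride of one base, uses commonly tabulated amino acid residue volumes (in cubic angstroms) as stand-in values, and assigns 0 to stop codons. The names `map_trinucleotides` and `sliding_window_period3_power` are hypothetical.

```python
import cmath
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# ASSUMPTION: commonly tabulated residue volumes in cubic angstroms, used here
# as stand-in values. The paper's actual mapping values are not in the abstract.
# Stop codons are assigned 0.0, which is also an assumption.
RESIDUE_VOLUME = {
    "G": 60.1, "A": 88.6, "S": 89.0, "C": 108.5, "D": 111.1,
    "P": 112.7, "N": 114.1, "T": 116.1, "E": 138.4, "V": 140.0,
    "Q": 143.8, "H": 153.2, "M": 162.9, "I": 166.7, "L": 166.7,
    "K": 168.6, "R": 173.4, "F": 189.9, "Y": 193.6, "W": 227.8,
    "*": 0.0,
}


def map_trinucleotides(seq, volume=RESIDUE_VOLUME):
    """Map every overlapping triplet seq[i:i+3] to a numeric value.

    Raises KeyError if the sequence contains symbols other than A, C, G, T.
    """
    seq = seq.upper()
    return [volume[CODON_TO_AA[seq[i:i + 3]]] for i in range(len(seq) - 2)]


def sliding_window_period3_power(x, window=351):
    """Return |X[N/3]|^2 for each rectangular window of length N = window.

    The window length must be a multiple of 3 so that the DFT bin N/3 exists.
    At that bin the DFT kernel exp(-2j*pi*n/3) repeats every three samples.
    """
    if window % 3 != 0:
        raise ValueError("window length must be a multiple of 3")
    twiddle = [cmath.exp(-2j * cmath.pi * n / 3) for n in range(3)]
    powers = []
    for start in range(len(x) - window + 1):
        acc = sum(x[start + n] * twiddle[n % 3] for n in range(window))
        powers.append(abs(acc) ** 2)
    return powers


if __name__ == "__main__":
    # Toy input: a strictly period-3 sequence, mapped and scanned with a short window.
    signal = map_trinucleotides("ATG" * 20)
    spectrum = sliding_window_period3_power(signal, window=9)
    print(len(signal), len(spectrum))
```

Peaks in the returned power sequence would mark candidate coding regions. A shorter window localizes short exons better at the cost of a noisier spectrum, which is the trade-off a mapping scheme aimed at short exons has to work within. The direct per-window sum is kept for readability; a recursive update of the DFT bin would be faster on long sequences.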

Metadata
Title
A tri-nucleotide mapping scheme based on residual volume of amino acids for short length exon prediction using sliding window DFT method
Authors
Amit Kumar Singh
Vinay Kumar Srivastava
Publication date
01 December 2020
Publisher
Springer Vienna
Published in
Network Modeling Analysis in Health Informatics and Bioinformatics / Issue 1/2020
Print ISSN: 2192-6662
Electronic ISSN: 2192-6670
DOI
https://doi.org/10.1007/s13721-020-00230-1