2017 | Original Paper | Book Chapter

Efficient and Effective Multiple Protein Sequence Alignment Model Using Dynamic Progressive Approach with Novel Look Back Ahead Scoring System

Written by: Sanjay Bankapur, Nagamma Patil

Published in: Pattern Recognition and Machine Intelligence

Publisher: Springer International Publishing

Abstract

Multiple protein sequence alignment is a fundamental hurdle towards addressing further challenges such as prediction of protein structure and function, protein sub-cellular localization, and drug discovery. Over the last three decades, numerous models have been proposed to address this challenge; however, these models are either computationally complex or not effective with respect to the aligned results. In this paper, a computationally efficient and effective model is proposed to solve multiple protein sequence alignment. The proposed model follows a dynamic progressive global alignment approach in which a sequence pair is merged dynamically based on a novel scoring system named Look Back Ahead (LBA). The results of the proposed model were validated against aligned reference results on benchmark datasets (PREFAB4refm and SABrem) using four metrics: Sum-of-Pairs (SP), Total Gap Penalty (TGP), Column Score (CS) and Total Mutation Count Pair-wise (TMCP). Experimental results demonstrate that the proposed method outperforms the benchmark reference results in any three evaluation metrics by 77.46% and 68.65% for the PREFAB4refm and SABrem datasets, respectively.
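
To make the Sum-of-Pairs (SP) metric named in the abstract concrete, the following is a minimal sketch of how an SP score is commonly computed for a finished alignment. It is not the authors' implementation: the function name `sum_of_pairs` and the `match`, `mismatch` and `gap` values are illustrative assumptions, and the paper may well use a substitution matrix and a different gap model.

```python
from itertools import combinations


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Return a simple Sum-of-Pairs score for a list of aligned rows.

    Assumptions (not taken from the paper): a flat match/mismatch score
    instead of a substitution matrix, a flat penalty for a residue
    paired with a gap, and no contribution from gap-gap pairs.
    """
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all aligned rows must have the same length")

    total = 0
    # Walk the alignment column by column.
    for column in zip(*alignment):
        # Score every unordered pair of symbols in the column.
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue  # gap aligned with gap: ignored
            if a == "-" or b == "-":
                total += gap
            elif a == b:
                total += match
            else:
                total += mismatch
    return total


if __name__ == "__main__":
    print(sum_of_pairs(["ACGT", "AC-T"]))          # 1
    print(sum_of_pairs(["AC-T", "A-GT", "ACGT"]))  # 0
```

In the three-row example, the first and last columns each contribute three matches (+3), while the two middle columns each contribute one match and two residue-gap pairs (-3), giving a total of 0.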

Metadata
Title
Efficient and Effective Multiple Protein Sequence Alignment Model Using Dynamic Progressive Approach with Novel Look Back Ahead Scoring System
Written by
Sanjay Bankapur
Nagamma Patil
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-69900-4_50