Published in: Cluster Computing 1/2017

28-11-2016

Motif identification method based on Gibbs sampling and genetic algorithm

Authors: Xiaochun Sheng, Kefeng Wang

Abstract

The regulation of gene expression is central to an organism's genetic mechanisms, and motif identification is an important step in constructing gene expression regulatory networks. Building on the Gibbs sampling method, this work constructs a position weight matrix and proposes a motif recognition method based on a genetic algorithm. A scoring function is defined to update the population and obtain a converged position weight matrix, which allows motifs of different lengths to be identified. Simulated and experimental data sets were used to evaluate the accuracy and execution time of the algorithm.
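
To make the central data structure concrete, the following is a minimal Python sketch of a position weight matrix built from a handful of aligned sites and used to score candidate windows in a sequence. It is not the authors' method: the abstract does not give the scoring function, so the log-odds score against a uniform background, the pseudocount of 1, the example sites, and all function names (build_pwm, log_odds_score, best_site) are assumptions chosen for illustration only.

```python
import math

ALPHABET = "ACGT"


def build_pwm(sites, pseudocount=1.0):
    """Estimate per-column base probabilities from equal-length aligned sites.

    The pseudocount (an assumption, not taken from the paper) keeps
    unseen bases from getting probability zero.
    """
    width = len(sites[0])
    pwm = []
    for col in range(width):
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[col]] += 1
        total = sum(counts.values())
        pwm.append({base: counts[base] / total for base in ALPHABET})
    return pwm


def log_odds_score(pwm, kmer, background=0.25):
    """Sum of log2(p_column(base) / background) over the k-mer.

    A uniform background of 0.25 per base is assumed here.
    """
    return sum(
        math.log2(column[base] / background)
        for column, base in zip(pwm, kmer)
    )


def best_site(pwm, sequence):
    """Return the start index of the highest-scoring window in the sequence."""
    width = len(pwm)
    starts = range(len(sequence) - width + 1)
    return max(
        starts,
        key=lambda start: log_odds_score(pwm, sequence[start:start + width]),
    )


# Hypothetical aligned sites; in a Gibbs-style sampler these would be the
# currently selected site from each input sequence.
sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
pwm = build_pwm(sites)
position = best_site(pwm, "GGCCTATAATGCGC")
```

In a sampler or genetic algorithm of the kind the abstract describes, a matrix like this would be re-estimated repeatedly as candidate sites change; the sketch shows only a single estimate-and-scan step.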

Metadata
Title
Motif identification method based on Gibbs sampling and genetic algorithm
Authors
Xiaochun Sheng
Kefeng Wang
Publication date
28-11-2016
Publisher
Springer US
Published in
Cluster Computing / Issue 1/2017
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-016-0699-x
