2013 | OriginalPaper | Book chapter

CMF: A Combinatorial Tool to Find Composite Motifs

Authors: Mauro Leoncini, Manuela Montangero, Marco Pellegrini, Karina Panucia Tillán

Published in: Learning and Intelligent Optimization

Publisher: Springer Berlin Heidelberg

Abstract

Controlling the differential expression of many thousands of genes at any given time is a fundamental task of metazoan organisms, and this complex orchestration is controlled by the so-called regulatory genome, which encodes complex regulatory networks. Cis-Regulatory Modules are fundamental units of such networks. A key step in detecting Cis-Regulatory Modules in silico is the discovery of recurrent clusters of DNA binding sites for sets of cooperating Transcription Factors. Composite motif is the term often adopted to refer to these clusters of sites. In this paper we describe CMF, a new efficient combinatorial method for detecting composite motifs, given as input a description of the binding affinities for a set of transcription factors. In tests on known benchmark data, CMF performs statistically significantly better than nine state-of-the-art competing methods.

Footnotes
1
Assuming the number \(N\) of sequences is clearly understood, we silently equate the fraction \(q\in (0,1]\) and the absolute number of sequences \(\lceil q\cdot N\rceil \).
 
2
Although this feature is not considered in this paper, CMF is also able to run a number of third-party motif discovery tools to “synthesize” PWMs.
 
3
Currently, CMF invokes RSAT’s utility compare-matrices for this purpose [21], which uses pairwise normalized correlation.
 
4
Sometimes referred to as weak signals in the literature.
 
5
In the following, we refer to [8] as the assessment paper.
 
6
Note that in the already cited paper by Tompa et al. [31], true negative predictions at the motif level are not considered.
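
The convention in footnote 1 can be made concrete with a small sketch. It converts a fraction \(q\in (0,1]\) into the absolute number of sequences \(\lceil q\cdot N\rceil\). The function name `sequence_count` and the use of exact rational arithmetic are assumptions made for illustration only; they are not taken from CMF.

```python
import math
from fractions import Fraction


def sequence_count(q: float, n_sequences: int) -> int:
    """Return ceil(q * N) for a fraction q in (0, 1] and N sequences.

    Hypothetical helper, not part of CMF. The fraction is parsed from its
    decimal string so that inputs such as q = 0.07 with N = 100 give 7 and
    are not pushed up to 8 by binary floating-point rounding.
    """
    if n_sequences <= 0:
        raise ValueError("n_sequences must be positive")
    exact_q = Fraction(str(q))
    if not 0 < exact_q <= 1:
        raise ValueError("q must lie in (0, 1]")
    return math.ceil(exact_q * n_sequences)


if __name__ == "__main__":
    # Half of 7 sequences rounds up to 4 sequences.
    print(sequence_count(0.5, 7))
```

Rounding up means that any positive fraction corresponds to at least one sequence, and \(q = 1\) corresponds to all \(N\) sequences.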
 
References
1.
Davidson, E.H.: The Regulatory Genome: Gene Regulatory Networks in Development and Evolution, 1st edn. Academic Press, San Diego (2006)
2.
Pavesi, G., Mauri, G., Pesole, G.: In silico representation and discovery of transcription factor binding sites. Brief. Bioinform. 5, 217–236 (2004)
3.
Sandve, G.K., Drabløs, F.: A survey of motif discovery methods in an integrated framework. Biol. Direct. 1, 11 (2006)
4.
Häußler, M., Nicolas, J.: Motif discovery on promotor sequences. Research report RR-5714, INRIA (2005)
5.
Zambelli, F., Pesole, G., Pavesi, G.: Motif discovery and transcription factor binding sites before and after the next-generation sequencing era. Brief. Bioinform. (2012)
6.
Wingender, E., et al.: Transfac: a database on transcription factors and their DNA binding sites. Nucl. Acids Res. 24, 238–241 (1996)
7.
Sandelin, A., Alkema, W., Engström, P.G., Wasserman, W.W., Lenhard, B.: Jaspar: an open-access database for eukaryotic transcription factor binding profiles. Nucl. Acids Res. 32, 91–94 (2004)
8.
Klepper, K., Sandve, G., Abul, O., Johansen, J., Drabløs, F.: Assessment of composite motif discovery methods. BMC Bioinform. 9, 123 (2008)
9.
Sinha, S.: Finding regulatory elements in genomic sequences. Ph.D. thesis, University of Washington (2002)
10.
Van Loo, P., Marynen, P.: Computational methods for the detection of cis-regulatory modules. Brief. Bioinform. 10, 509–524 (2009)
11.
Ivan, A., Halfon, M., Sinha, S.: Computational discovery of cis-regulatory modules in Drosophila without prior knowledge of motifs. Genome Biol. 9, R22 (2008)
12.
Federico, M., Leoncini, M., Montangero, M., Valente, P.: Direct vs 2-stage approaches to structured motif finding. Algorithms Mol. Biol. 7, 20 (2012)
13.
Sandve, G., Abul, O., Drabløs, F.: Compo: composite motif discovery using discrete models. BMC Bioinform. 9, 527 (2008)
14.
Hu, J., Hu, H., Li, X.: Mopat: a graph-based method to predict recurrent cis-regulatory modules from known motifs. Nucl. Acids Res. 36, 4488–4497 (2008)
15.
Nikulova, A.A., Favorov, A.V., Sutormin, R.A., Makeev, V.J., Mironov, A.A.: Coreclust: identification of the conserved CRM grammar together with prediction of gene regulation. Nucl. Acids Res. 40, e93 (2012). doi:10.1093/nar/gks235
16.
Vavouri, T., Elgar, G.: Prediction of cis-regulatory elements using binding site matrices - the successes, the failures and the reasons for both. Curr. Opin. Genet. Develop. 15, 395–402 (2005)
17.
Kel, A., Gößling, E., Reuter, I., Cheremushkin, E., Kel-Margoulis, O., Wingender, E.: Match™: a tool for searching transcription factor binding sites in DNA sequences. Nucl. Acids Res. 31, 3576–3579 (2003)
18.
Chen, Q.K., Hertz, G.Z., Stormo, G.D.: Matrix search 1.0: a computer program that scans DNA sequences for transcriptional elements using a database of weight matrices. Comp. Appl. Biosci.: CABIOS 11, 563–566 (1995)
19.
Prestridge, D.S.: Signal scan: a computer program that scans DNA sequences for eukaryotic transcriptional elements. Comp. Appl. Biosci.: CABIOS 7, 203–206 (1991)
20.
Matys, V., et al.: TRANSFAC and its module TRANSCompel: transcriptional gene regulation in eukaryotes. Nucl. Acids Res. 34, D108–D110 (2006)
21.
Thomas-Chollier, M., et al.: RSAT: regulatory sequence analysis tools. Nucl. Acids Res. 36, W119–W127 (2008)
22.
Uno, T.: Pce: Pseudo clique enumerator, ver. 1.0 (2006)
23.
Zhou, Q., Wong, W.H.: Cismodule: De novo discovery of cis-regulatory modules by hierarchical mixture modeling. Proc. Natl. Acad. Sci. 101, 12114–12119 (2004)
24.
Frith, M.C., Hansen, U., Weng, Z.: Detection of cis-element clusters in higher eukaryotic DNA. Bioinformatics 17, 878–889 (2001)
25.
Frith, M.C., Li, M.C., Weng, Z.: Cluster-Buster: finding dense clusters of motifs in DNA sequences. Nucl. Acids Res. 31, 3666–3668 (2003)
26.
Kel, A., Konovalova, T., Waleev, T., Cheremushkin, E., Kel-Margoulis, O., Wingender, E.: Composite module analyst: a fitness-based tool for identification of transcription factor binding site combinations. Bioinformatics 22, 1190–1197 (2006)
27.
Bailey, T.L., Noble, W.S.: Searching for statistically significant regulatory modules. Bioinformatics 19, ii16–ii25 (2003)
28.
Aerts, S., Van Loo, P., Thijs, G., Moreau, Y., De Moor, B.: Computational detection of cis-regulatory modules. Bioinformatics 19, ii5–ii14 (2003)
29.
Johansson, Ö., Alkema, W., Wasserman, W.W., Lagergren, J.: Identification of functional clusters of transcription factor binding motifs in genome sequences: the mscan algorithm. Bioinformatics 19, i169–i176 (2003)
30.
Sinha, S., van Nimwegen, E., Siggia, E.D.: A probabilistic method to detect regulatory modules. Bioinformatics 19, i292–i301 (2003)
31.
Tompa, M., et al.: Assessing computational tools for the discovery of transcription factor binding sites. Nat. Biotechnol. 23, 137–144 (2005)
32.
García, S., Fernández, A., Luengo, J., Herrera, F.: Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power. Inf. Sci. 180, 2044–2064 (2010)
Metadata
Title
CMF: A Combinatorial Tool to Find Composite Motifs
Authors
Mauro Leoncini
Manuela Montangero
Marco Pellegrini
Karina Panucia Tillán
Copyright year
2013
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-642-44973-4_21