2019 | OriginalPaper | Book chapter

ModHMM: A Modular Supra-Bayesian Genome Segmentation Method

Authors: Philipp Benner, Martin Vingron

Published in: Research in Computational Molecular Biology

Publisher: Springer International Publishing

Abstract

Genome segmentation methods are powerful tools for obtaining cell-type- or tissue-specific genome-wide annotations and are frequently used to discover regulatory elements. However, traditional segmentation methods show low predictive accuracy, and their data-driven annotations have some undesirable properties. As an alternative, we developed ModHMM, a highly modular genome segmentation method. Inspired by the supra-Bayesian approach, it incorporates predictions from a set of classifiers. This makes it possible to compute genome segmentations using state-of-the-art methodology. We demonstrate the method on ENCODE data and show that it outperforms traditional segmentation methods not only in predictive performance but also in qualitative aspects. ModHMM is therefore a valuable alternative for studying the epigenetic and regulatory landscape across and within cell types or tissues. The software is freely available at https://github.com/pbenner/modhmm.
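The idea of feeding classifier predictions into a hidden Markov model can be illustrated with a small sketch. This is not ModHMM's actual implementation: the function name `viterbi`, the two-state layout (0 = background, 1 = regulatory element), and all probability values are hypothetical and chosen only for illustration. The sketch assumes that each genomic bin already has one classifier probability per state and uses those probabilities in place of emission likelihoods while decoding the most likely state path.

```python
import math


def viterbi(class_probs, trans, init):
    """Return the most likely state path for a sequence of genomic bins.

    class_probs[t][k]: classifier probability for state k at bin t,
                       used here in place of an emission likelihood
                       (an assumption of this sketch).
    trans[j][k]:       probability of moving from state j to state k.
    init[k]:           probability of starting in state k.
    """
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    n_states = len(init)
    # Log-score of the best path ending in each state at the first bin.
    score = [log(init[k]) + log(class_probs[0][k]) for k in range(n_states)]
    back = []  # back[t-1][k]: best predecessor of state k at bin t

    for t in range(1, len(class_probs)):
        new_score, pointers = [], []
        for k in range(n_states):
            best_j = max(range(n_states),
                         key=lambda j: score[j] + log(trans[j][k]))
            new_score.append(score[best_j] + log(trans[best_j][k])
                             + log(class_probs[t][k]))
            pointers.append(best_j)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical classifier outputs for seven bins and two states.
probs = [[0.9, 0.1]] * 2 + [[0.1, 0.9]] * 3 + [[0.9, 0.1]] * 2
sticky = [[0.95, 0.05], [0.05, 0.95]]  # transitions favour staying put
print(viterbi(probs, sticky, [0.5, 0.5]))  # prints [0, 0, 1, 1, 1, 0, 0]
```

The transition matrix is what turns independent per-bin predictions into contiguous segments: with the same sticky transitions, a single weakly supported bin does not open a new segment, while a consistently supported run does.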

Metadata
Title
ModHMM: A Modular Supra-Bayesian Genome Segmentation Method
Authors
Philipp Benner
Martin Vingron
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-17083-7_3