Non-homogeneous dynamic Bayesian networks for continuous data

Grzegorczyk, Marco; Husmeier, Dirk

doi:10.1007/s10994-010-5230-7

Published: 27 February 2011

Machine Learning, Volume 83, pages 355–419 (2011)

Abstract

Classical dynamic Bayesian networks (DBNs) are based on the homogeneous Markov assumption and cannot deal with non-homogeneous temporal processes. Various approaches to relax the homogeneity assumption have recently been proposed. This paper combines a Bayesian network whose conditional probabilities are in the linear Gaussian family with a Bayesian multiple changepoint process, where the number and locations of the changepoints are sampled from the posterior distribution with Markov chain Monte Carlo (MCMC). Our work improves on an earlier conference paper in four respects: it contains a comprehensive and self-contained exposition of the methodology; it discusses the problem of spurious feedback loops in network reconstruction; it contains a comprehensive comparative evaluation of the network reconstruction accuracy on a set of synthetic and real-world benchmark problems, based on a novel discrete changepoint process; and it suggests new and improved MCMC schemes for sampling both the network structures and the changepoint configurations from the posterior distribution. The last of these studies compares reversible jump MCMC (RJMCMC), based on changepoint birth and death moves, with two dynamic programming schemes that were originally devised for Bayesian mixture models. We demonstrate the modifications that have to be made to allow for changing network structures, and the critical impact that the prior distribution on changepoint configurations has on the overall computational complexity.
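
The idea of sampling changepoints for a linear Gaussian node can be illustrated with a minimal sketch. It is not the authors' model or sampler: it assumes a single target node with one fixed parent, a conjugate normal-inverse-gamma prior on the regression parameters of each segment (so each segment has a closed-form marginal likelihood), and an independent Bernoulli prior on every candidate changepoint position. Birth and death moves are implemented by flipping one randomly chosen position, which makes the proposal symmetric, so the Metropolis acceptance ratio reduces to the posterior ratio. All function names and hyperparameter values (`a0`, `b0`, `lam`, `p_cp`) are hypothetical choices for illustration.

```python
import math
import numpy as np


def segment_log_marginal(X, y, a0=1.0, b0=1.0, lam=1.0):
    """Log marginal likelihood of y = X w + noise on one segment.

    Assumed prior: w | s2 ~ N(0, s2 / lam * I), s2 ~ InvGamma(a0, b0).
    """
    n, d = X.shape
    prec_n = X.T @ X + lam * np.eye(d)          # posterior precision (up to s2)
    mu_n = np.linalg.solve(prec_n, X.T @ y)     # posterior mean of w
    a_n = a0 + 0.5 * n
    b_n = b0 + 0.5 * (y @ y - mu_n @ prec_n @ mu_n)
    _, logdet_n = np.linalg.slogdet(prec_n)
    return (-0.5 * n * math.log(2.0 * math.pi)
            + 0.5 * d * math.log(lam) - 0.5 * logdet_n
            + a0 * math.log(b0) - a_n * math.log(b_n)
            + math.lgamma(a_n) - math.lgamma(a0))


def log_posterior(cps, X, y, p_cp=0.05):
    """Unnormalised log posterior of a set of changepoint positions."""
    bounds = [0] + sorted(cps) + [len(y)]
    loglik = sum(segment_log_marginal(X[s:e], y[s:e])
                 for s, e in zip(bounds[:-1], bounds[1:]))
    n_pos, k = len(y) - 1, len(cps)             # candidate positions 1..T-1
    logprior = k * math.log(p_cp) + (n_pos - k) * math.log(1.0 - p_cp)
    return loglik + logprior


def sample_changepoints(X, y, n_iter=4000, burn_in=2000, seed=0):
    """Metropolis sampler; returns the posterior frequency of each position."""
    rng = np.random.default_rng(seed)
    T = len(y)
    cps, cur = set(), log_posterior(set(), X, y)
    counts = np.zeros(T)
    for it in range(n_iter):
        t = int(rng.integers(1, T))             # uniform over 1..T-1
        prop = cps ^ {t}                        # birth if absent, death if present
        new = log_posterior(prop, X, y)
        if rng.random() < math.exp(min(0.0, new - cur)):
            cps, cur = prop, new
        if it >= burn_in:
            for c in cps:
                counts[c] += 1
    return counts / (n_iter - burn_in)


def simulate(T=100, cp=50, seed=1):
    """Toy data: the parent's effect on the target flips sign at time cp."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=T)                      # stands in for the lagged parent value
    slope = np.where(np.arange(T) < cp, 1.5, -1.5)
    y = slope * x + 0.3 * rng.normal(size=T)
    X = np.column_stack([np.ones(T), x])        # intercept plus parent
    return X, y


if __name__ == "__main__":
    X, y = simulate()
    post = sample_changepoints(X, y)
    print("most frequently sampled changepoint:", int(np.argmax(post)))
```

In this toy setting the sampled changepoints should concentrate around the simulated change at time 50. The sketch recomputes every segment score at each iteration for clarity; caching the scores of unchanged segments would be the obvious optimisation.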

Author information

Authors and Affiliations

Department of Statistics, TU Dortmund University, 44221, Dortmund, Germany
Marco Grzegorczyk
Biomathematics and Statistics Scotland (BioSS), JCMB, The King’s Buildings, Edinburgh, EH9 3JZ, UK
Dirk Husmeier

Corresponding author

Correspondence to Marco Grzegorczyk.

Additional information

Editor: Kevin P. Murphy.

About this article

Cite this article

Grzegorczyk, M., Husmeier, D. Non-homogeneous dynamic Bayesian networks for continuous data. Mach Learn 83, 355–419 (2011). https://doi.org/10.1007/s10994-010-5230-7

Received: 15 March 2010
Revised: 23 November 2010
Accepted: 24 November 2010
Published: 27 February 2011
Issue Date: June 2011
DOI: https://doi.org/10.1007/s10994-010-5230-7
