Abstract

A graphical model is a statistical model that is associated with a graph whose nodes correspond to variables of interest. The edges of the graph reflect allowed conditional dependencies among the variables. Graphical models have computationally convenient factorization properties and have long been a valuable tool for tractable modeling of multivariate distributions. More recently, applications such as reconstructing gene regulatory networks from gene expression data have driven major advances in structure learning, that is, estimating the graph underlying a model. We review some of these advances and discuss methods such as the graphical lasso and neighborhood selection for undirected graphical models (or Markov random fields) and the PC algorithm and score-based search methods for directed graphical models (or Bayesian networks). We further review extensions that account for effects of latent variables and heterogeneous data sources.
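To make the two undirected-graph methods named above concrete, here is a minimal sketch, assuming scikit-learn and NumPy are available; the simulated chain graph, the penalty levels, and all variable names are our own illustrative choices, not part of the review. It estimates a sparse Gaussian graphical model by the graphical lasso (ℓ1-penalized precision-matrix estimation) and, for comparison, by neighborhood selection (node-wise lasso regressions):

```python
import numpy as np
from sklearn.covariance import GraphicalLasso
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Simulate data from a Gaussian whose sparse precision matrix
# encodes the chain graph 1 - 2 - 3 - 4.
K = np.array([[2.0, 0.6, 0.0, 0.0],
              [0.6, 2.0, 0.6, 0.0],
              [0.0, 0.6, 2.0, 0.6],
              [0.0, 0.0, 0.6, 2.0]])
X = rng.multivariate_normal(np.zeros(4), np.linalg.inv(K), size=2000)

# Graphical lasso: maximize the Gaussian log-likelihood with an
# l1 penalty on the precision matrix; nonzero off-diagonal entries
# of the estimate correspond to edges of the graph.
gl = GraphicalLasso(alpha=0.05).fit(X)
edges_gl = np.abs(gl.precision_) > 1e-6

# Neighborhood selection: lasso-regress each variable on all the
# others; variables with nonzero coefficients are estimated neighbors.
p = X.shape[1]
edges_ns = np.zeros((p, p), dtype=bool)
for j in range(p):
    others = [k for k in range(p) if k != j]
    coef = Lasso(alpha=0.05).fit(X[:, others], X[:, j]).coef_
    edges_ns[j, others] = np.abs(coef) > 1e-6
edges_ns = edges_ns | edges_ns.T  # symmetrize with the "OR" rule
```

The two estimated edge sets need not coincide: the graphical lasso fits one joint penalized likelihood, while neighborhood selection combines per-node regressions, whose asymmetric results must be symmetrized (here by the "OR" rule).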

