chapter

Causality for Machine Learning

Author: Bernhard Schölkopf, Max Planck Institute for Intelligent Systems

Probabilistic and Causal Inference: The Works of Judea Pearl
February 2022
Pages 765–804
https://doi.org/10.1145/3501714.3501755

Published: 04 March 2022

Published in
Probabilistic and Causal Inference: The Works of Judea Pearl
February 2022
946 pages
ISBN: 9781450395861
DOI: 10.1145/3501714
Editors: Hector Geffner (ICREA and Universitat Pompeu Fabra), Rina Dechter (University of California, Irvine), Joseph Y. Halpern (Cornell University)
Publisher: Association for Computing Machinery, New York, NY, United States