
2021 | OriginalPaper | Book chapter

Detection of Conditional Dependence Between Multiple Variables Using Multiinformation

Authors: Jan Mielniczuk, Paweł Teisseyre

Published in: Computational Science – ICCS 2021

Publisher: Springer International Publishing


Abstract

We consider the problem of detecting conditional dependence among multiple discrete variables. This generalizes the well-known and widely studied problem of testing conditional independence between two variables given a third one. The issue is important in various applications. For example, in supervised learning, such a test can be used to verify the adequacy of the popular Naive Bayes classifier. In epidemiology, there is a need to verify whether the occurrences of multiple diseases are dependent. However, focusing solely on the occurrences of diseases may be misleading: one has to take confounding variables (such as gender or age) into account and preferably consider the conditional dependencies between diseases given those confounders. To address this problem, we propose to use conditional multiinformation (CMI), a measure derived from information theory. We prove some new properties of CMI. To account for the uncertainty associated with a given data sample, we propose a formal statistical test of conditional independence based on the empirical version of CMI. The main contribution of the work is the determination of the asymptotic distribution of empirical CMI, which leads to the construction of an asymptotic test for conditional independence. The asymptotic test is compared with the permutation test and the scaled chi-squared test. Simulation experiments indicate that the asymptotic test achieves greater power than the competing methods, thus detecting conditional dependencies more frequently when they occur. We apply the method to detect dependencies in the MIMIC-III medical data set.
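To make the quantities in the abstract concrete, the following is a minimal Python sketch of a plug-in (empirical) estimate of conditional multiinformation for discrete data, together with a simple permutation test. It is not the authors' implementation. It assumes the standard entropy identity CMI(X_1, ..., X_p | Z) = sum_i H(X_i, Z) - H(X_1, ..., X_p, Z) - (p - 1) H(Z), and it assumes a permutation scheme that shuffles each variable separately within the strata of Z. The function names `entropy`, `empirical_cmi` and `permutation_pvalue` are hypothetical, and the asymptotic test derived in the chapter is not reproduced here.

```python
from collections import Counter
from math import log
import random


def entropy(rows):
    """Plug-in Shannon entropy (in nats) of a list of hashable outcomes."""
    n = len(rows)
    return -sum(c / n * log(c / n) for c in Counter(rows).values())


def empirical_cmi(xs, z):
    """Plug-in conditional multiinformation of the columns in xs given z.

    xs: list of p equal-length sequences of discrete values.
    z:  sequence of discrete values of the conditioning variable.
    Assumed identity: sum_i H(X_i, Z) - H(X_1..X_p, Z) - (p - 1) * H(Z).
    """
    p = len(xs)
    pairwise = sum(entropy(list(zip(x, z))) for x in xs)
    joint = entropy(list(zip(*xs, z)))
    return pairwise - joint - (p - 1) * entropy(list(z))


def permutation_pvalue(xs, z, n_perm=200, seed=0):
    """Permutation p-value for the null of conditional independence given z.

    Assumption of this sketch: each column is shuffled independently
    within every stratum of z, which preserves each (X_i, Z) distribution
    while breaking conditional dependence among the X_i.
    """
    rng = random.Random(seed)
    observed = empirical_cmi(xs, z)

    # Group row indices by the value of the conditioning variable.
    strata = {}
    for i, zi in enumerate(z):
        strata.setdefault(zi, []).append(i)

    exceed = 0
    for _ in range(n_perm):
        permuted = []
        for x in xs:
            col = list(x)
            for idx in strata.values():
                vals = [col[i] for i in idx]
                rng.shuffle(vals)
                for i, v in zip(idx, vals):
                    col[i] = v
            permuted.append(col)
        if empirical_cmi(permuted, z) >= observed:
            exceed += 1

    # Add-one correction keeps the p-value strictly positive.
    return (1 + exceed) / (1 + n_perm)


# Toy data: two "diseases" that both copy a confounder z. They are
# dependent marginally but conditionally independent given z.
z_conf = [0, 1] * 50
x1_conf = list(z_conf)
x2_conf = list(z_conf)
cmi_confounded = empirical_cmi([x1_conf, x2_conf], z_conf)

# Toy data: x2 copies x1 within each stratum of z (conditional dependence).
z_dep = [0, 0, 1, 1] * 40
x1_dep = [0, 1, 0, 1] * 40
x2_dep = list(x1_dep)
cmi_dependent = empirical_cmi([x1_dep, x2_dep], z_dep)
p_dependent = permutation_pvalue([x1_dep, x2_dep], z_dep)
```

In the first toy data set the estimate is zero up to rounding, which mirrors the epidemiological point in the abstract: the apparent dependence between the two variables disappears once the confounder is conditioned on. In the second, the estimate is positive and the permutation p-value is small.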


Metadata
Title
Detection of Conditional Dependence Between Multiple Variables Using Multiinformation
Authors
Jan Mielniczuk
Paweł Teisseyre
Copyright year
2021
DOI
https://doi.org/10.1007/978-3-030-77980-1_51