
Causal Interpretability for Machine Learning - Problems, Methods and Evaluation

Published: 18 May 2020

Abstract

Machine learning models have achieved remarkable results in a myriad of applications. However, most of these models are black boxes, and it is unclear how they arrive at their decisions, which makes them unreliable and untrustworthy. To provide insight into the decision-making processes of these models, a variety of traditional interpretable models have been proposed. Moreover, to generate more human-friendly explanations, recent work on interpretability tries to answer questions related to causality, such as "Why does this model make such decisions?" or "Was it a specific feature that caused the decision made by the model?". In this work, models that aim to answer causal questions are referred to as causal interpretable models. Existing surveys have covered the concepts and methodologies of traditional interpretability. Here, we present a comprehensive survey of causal interpretable models in terms of the problems they address and the methods they use. In addition, this survey provides in-depth insights into existing evaluation metrics for measuring interpretability, which can help practitioners understand which scenarios each evaluation metric is suited for.
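
To make the second causal question above concrete, the short sketch below perturbs one feature of a single instance and checks whether a trained classifier's prediction changes. This is a minimal, hypothetical illustration: the synthetic data, the logistic-regression model, and the sign-flip "intervention" are placeholders chosen for brevity, not a method proposed in this survey.

    # Minimal, hypothetical sketch: does flipping a single feature of one instance
    # change a trained classifier's prediction? The synthetic data, the model, and
    # the sign-flip "intervention" are illustrative placeholders only.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))                  # three synthetic features
    y = (X[:, 0] + 0.1 * X[:, 1] > 0).astype(int)  # label driven mostly by feature 0

    model = LogisticRegression().fit(X, y)

    x = X[0].copy()
    original = model.predict(x.reshape(1, -1))[0]

    for j in range(X.shape[1]):
        x_cf = x.copy()
        x_cf[j] = -x_cf[j]                         # crude intervention on feature j
        flipped = model.predict(x_cf.reshape(1, -1))[0]
        status = "changes the decision" if flipped != original else "no change"
        print(f"flipping feature {j}: {original} -> {flipped} ({status})")

In this toy setting, typically only the intervention on feature 0 flips the decision, which mirrors the intuition behind feature-level causal attribution; principled counterfactual and causal-attribution techniques are discussed in the body of the survey.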



    • Published in

      ACM SIGKDD Explorations Newsletter, Volume 22, Issue 1 (June 2020), 33 pages
      ISSN: 1931-0145, EISSN: 1931-0153
      DOI: 10.1145/3400051

      Copyright © 2020 is held by the owner/author(s).

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery, New York, NY, United States

      Publication History

      • Published: 18 May 2020


      Qualifiers

      • research-article
