Published in: Knowledge and Information Systems 2/2021

21-10-2020 | Regular paper

Learning credible DNNs via incorporating prior knowledge and model local explanation



Abstract

Recent studies have shown that state-of-the-art DNNs are not always credible, despite their impressive performance on the hold-out test sets of a variety of tasks. These models tend to exploit dataset shortcuts to make predictions rather than learn the underlying task. This non-credibility can lead to low generalization, adversarial vulnerability, and algorithmic discrimination in DNN models. In this paper, we propose CREX to develop more credible DNNs. The high-level idea of CREX is to encourage DNN models to focus on the evidence that actually matters for the task at hand and to avoid overfitting to data-dependent shortcuts. Specifically, during DNN training, CREX directly regularizes the local explanation with expert rationales, i.e., a subset of features highlighted by domain experts as justifications for predictions, to enforce alignment between local explanations and rationales. Even when rationales are not available, CREX can still be useful by requiring the generated explanations to be sparse. In addition, CREX is widely applicable to different network architectures, including CNNs, LSTMs and attention models. Experimental results on several text classification datasets demonstrate that CREX increases the credibility of DNNs. Comprehensive analysis further shows three meaningful improvements of CREX: (1) it significantly increases DNN accuracy on new, previously unseen data beyond the test set; (2) it enhances the fairness of DNNs in terms of the equality-of-opportunity metric and reduces models' discrimination toward certain demographic groups; and (3) it promotes the robustness of DNN models against adversarial attacks. These results highlight the advantages of the increased credibility brought by CREX.
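The abstract describes two regularization modes: aligning local explanations with expert rationales when rationales exist, and falling back to a sparsity penalty when they do not. The following is a minimal, hypothetical sketch of that kind of penalty term; the function name, signature, and the `lam` weight are illustrative assumptions, not the paper's actual formulation, and a real implementation would compute `attributions` from the model (e.g., via input gradients) inside the training loop.

```python
import numpy as np

def crex_style_penalty(attributions, rationale_mask=None, lam=1.0):
    """Hypothetical sketch of a CREX-style regularization term.

    attributions: per-feature local explanation scores (1-D array).
    rationale_mask: binary array, 1 where domain experts marked evidence.

    With rationales: penalize explanation mass falling OUTSIDE the
    rationale, pushing the model to rely on expert-marked evidence.
    Without rationales: fall back to an L1 sparsity penalty on the
    whole explanation, as the abstract suggests.
    """
    a = np.abs(attributions)
    if rationale_mask is not None:
        # Sum of attribution magnitude on non-rationale features.
        penalty = a[rationale_mask == 0].sum()
    else:
        # Sparsity fallback: total attribution magnitude.
        penalty = a.sum()
    return lam * penalty
```

In training, this term would simply be added to the task loss (e.g., cross-entropy), so gradient descent jointly fits the labels and discourages reliance on features outside the rationale.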


Metadata
Title
Learning credible DNNs via incorporating prior knowledge and model local explanation
Publication date
21-10-2020
Published in
Knowledge and Information Systems / Issue 2/2021
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-020-01517-5
