Published in: Knowledge and Information Systems 2/2021

21-10-2020 | Regular paper

Learning credible DNNs via incorporating prior knowledge and model local explanation



Abstract

Recent studies have shown that state-of-the-art DNNs are not always credible, despite their impressive performance on the hold-out test sets of a variety of tasks. These models tend to exploit dataset shortcuts to make predictions rather than learn the underlying task. This non-credibility can lead to low generalization, adversarial vulnerability, and algorithmic discrimination in DNN models. In this paper, we propose CREX to develop more credible DNNs. The high-level idea of CREX is to encourage DNN models to focus on the evidence that actually matters for the task at hand and to avoid overfitting to data-dependent shortcuts. Specifically, during DNN training, CREX directly regularizes the local explanation with expert rationales, i.e., a subset of features highlighted by domain experts as justifications for predictions, to enforce alignment between local explanations and rationales. Even when rationales are not available, CREX can still be useful by requiring the generated explanations to be sparse. In addition, CREX is widely applicable to different network architectures, including CNNs, LSTMs and attention models. Experimental results on several text classification datasets demonstrate that CREX increases the credibility of DNNs. Comprehensive analysis further shows three meaningful improvements of CREX: (1) it significantly increases DNN accuracy on new, previously unseen data beyond the test set; (2) it enhances the fairness of DNNs in terms of the equality-of-opportunity metric and reduces models' discrimination toward certain demographic groups; and (3) it promotes the robustness of DNN models against adversarial attacks. These results highlight the advantages of the increased credibility brought by CREX.
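The abstract describes two regularization modes: aligning local explanations with expert rationales when rationales exist, and falling back to a sparsity penalty when they do not. The following is a minimal, hypothetical sketch of that kind of penalty term; the function name, signature, and the `lam` weight are illustrative assumptions, not the paper's actual formulation, and a real implementation would compute `attributions` from the model (e.g., via input gradients) inside the training loop.

```python
import numpy as np

def crex_style_penalty(attributions, rationale_mask=None, lam=1.0):
    """Hypothetical sketch of a CREX-style regularization term.

    attributions: per-feature local explanation scores (1-D array).
    rationale_mask: binary array, 1 where domain experts marked evidence.

    With rationales: penalize explanation mass falling OUTSIDE the
    rationale, pushing the model to rely on expert-marked evidence.
    Without rationales: fall back to an L1 sparsity penalty on the
    whole explanation, as the abstract suggests.
    """
    a = np.abs(attributions)
    if rationale_mask is not None:
        # Sum of attribution magnitude on non-rationale features.
        penalty = a[rationale_mask == 0].sum()
    else:
        # Sparsity fallback: total attribution magnitude.
        penalty = a.sum()
    return lam * penalty
```

In training, this term would simply be added to the task loss (e.g., cross-entropy), so gradient descent jointly fits the labels and discourages reliance on features outside the rationale.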


Metadata
Title
Learning credible DNNs via incorporating prior knowledge and model local explanation
Publication date
21-10-2020
Published in
Knowledge and Information Systems / Issue 2/2021
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-020-01517-5
