Published in: Neural Processing Letters 1/2023

01.06.2022

Knowledge Reverse Distillation Based Confidence Calibration for Deep Neural Networks

Authors: Xianhui Jiang, Xiaogang Deng


Abstract

Deep neural networks, as a key technical breakthrough in the field of machine learning, have been widely used in practical scenarios. However, existing deep neural networks often produce predictions with poorly calibrated, overly high confidence, which can mislead practitioners and limits the deployment of deep neural networks in high-risk decision-making fields. To address this issue, this paper proposes a confidence calibration method for deep neural networks based on a novel knowledge reverse distillation strategy. Traditional knowledge distillation treats accuracy as the knowledge and transfers it from a teacher network (usually a complex deep network) to a student network (usually a simple network). In contrast, the proposed knowledge reverse distillation strategy regards confidence as the knowledge and constructs a reverse knowledge transfer pathway, applying the confidence knowledge of the simple network to calibrate the complex deep network. Experimental results on three benchmark image datasets show that the knowledge reverse distillation strategy effectively improves the calibration capability of complex networks, so that the complex deep neural network achieves well-calibrated confidence along with high prediction accuracy.
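The abstract describes the reverse transfer only at a high level. As a rough sketch of how such a confidence transfer could look in practice, the PyTorch snippet below combines a standard cross-entropy term with a KL-divergence term that pulls the complex network's softened output distribution toward that of the simple network. The function name, the mixing weight `alpha`, and the `temperature` are illustrative assumptions for this sketch, not the paper's actual formulation.

```python
import torch
import torch.nn.functional as F

def reverse_distillation_loss(complex_logits, simple_logits, targets,
                              alpha=0.5, temperature=2.0):
    """Hypothetical calibration loss (not the paper's exact method):
    the complex network is trained on ground-truth labels while being
    pulled toward the softened confidence distribution of the simple
    network, reversing the usual teacher-to-student direction."""
    # Cross-entropy keeps the complex network accurate.
    ce = F.cross_entropy(complex_logits, targets)

    # Softened distributions; the simple network acts as the confidence
    # "teacher", so its logits are detached from the gradient graph.
    p_simple = F.softmax(simple_logits.detach() / temperature, dim=1)
    log_p_complex = F.log_softmax(complex_logits / temperature, dim=1)

    # KL(simple || complex) nudges the complex network's confidence
    # toward the better-calibrated simple network; the T^2 factor is
    # the conventional scaling for soft targets in distillation.
    kl = F.kl_div(log_p_complex, p_simple, reduction="batchmean")
    kl = kl * temperature ** 2

    return (1.0 - alpha) * ce + alpha * kl

# Minimal usage example with random stand-in logits.
complex_logits = torch.randn(8, 10, requires_grad=True)
simple_logits = torch.randn(8, 10)
targets = torch.randint(0, 10, (8,))
loss = reverse_distillation_loss(complex_logits, simple_logits, targets)
loss.backward()
```

Detaching the simple network's logits makes it play the role of a fixed confidence teacher, which mirrors the reversed direction of knowledge flow described in the abstract.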


Metadata
Title
Knowledge Reverse Distillation Based Confidence Calibration for Deep Neural Networks
Authors
Xianhui Jiang
Xiaogang Deng
Publication date
01.06.2022
Publisher
Springer US
Published in
Neural Processing Letters / Issue 1/2023
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-022-10885-8
