Published in: AStA Wirtschafts- und Sozialstatistisches Archiv 3-4/2023

29.11.2023 | Original publication

Exploring quality dimensions in trustworthy Machine Learning in the context of official statistics: model explainability and uncertainty quantification

Authors: Saeid Molladavoudi, Wesley Yung


Abstract

Although National Statistical Offices (NSOs) continue to adopt Machine Learning (ML) methods and tools across their operations, including data collection, integration, and processing, it remains unclear how these complex, prediction-oriented approaches can be incorporated into the quality standards and frameworks of NSOs, or whether the frameworks themselves need to be modified. This article builds on two of the quality dimensions proposed in the Quality Framework for Statistical Algorithms (QF4SA): model explainability and accuracy (including uncertainty). It examines in further detail the implications of current methods for explainable ML and uncertainty quantification, as well as their possible uses in statistical production, such as continuous model monitoring in intermediate ML classification and auto-coding phases. This strategy will ensure that human subject-matter experts, who are an essential component of every statistical program, are effectively integrated into the life cycle of ML projects. It will also help maintain the quality of ML models in production, support adherence to the current quality frameworks within NSOs, and ultimately boost confidence and trust in these emerging technologies.


Metadata
Title
Exploring quality dimensions in trustworthy Machine Learning in the context of official statistics: model explainability and uncertainty quantification
Authors
Saeid Molladavoudi
Wesley Yung
Publication date
29.11.2023
Publisher
Springer Berlin Heidelberg
Published in
AStA Wirtschafts- und Sozialstatistisches Archiv / Issue 3-4/2023
Print ISSN: 1863-8155
Electronic ISSN: 1863-8163
DOI
https://doi.org/10.1007/s11943-023-00331-z
