2020 | Original Paper | Book Chapter

Shift Happens: Adjusting Classifiers

Authors: Theodore James Thibault Heiser, Mari-Liis Allikivi, Meelis Kull

Published in: Machine Learning and Knowledge Discovery in Databases

Publisher: Springer International Publishing

Abstract

Minimizing expected loss measured by a proper scoring rule, such as Brier score or log-loss (cross-entropy), is a common objective while training a probabilistic classifier. If the data have experienced dataset shift where the class distributions change post-training, then often the model’s performance will decrease, over-estimating the probabilities of some classes while under-estimating the others on average. We propose unbounded and bounded general adjustment (UGA and BGA) methods that transform all predictions to (re-)equalize the average prediction and the class distribution. These methods act differently depending on which proper scoring rule is to be minimized, and we have a theoretical guarantee of reducing loss on test data, if the exact class distribution is known. We also demonstrate experimentally that, when in practice the class distribution is known only approximately, there is often still a reduction in loss depending on the amount of shift and the precision to which the class distribution is known.
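The idea of re-equalizing the average prediction with the class distribution can be illustrated with a small sketch. The code below is not taken from the chapter: it assumes the simplest possible transformation, a constant additive shift applied to every prediction, and the names `additive_adjust`, `preds` and `target` are hypothetical. Whether this matches UGA or BGA for a particular scoring rule should be checked against the chapter itself. Note that an unconstrained shift of this kind can push values outside [0, 1], which is presumably what separates an unbounded adjustment from a bounded one.

```python
import numpy as np


def additive_adjust(preds, class_dist):
    """Shift every prediction by the same vector so that the column-wise
    mean of the predictions equals the given class distribution.

    Illustrative assumption only: an unconstrained additive shift,
    so the result may contain values outside [0, 1].
    """
    preds = np.asarray(preds, dtype=float)
    class_dist = np.asarray(class_dist, dtype=float)
    # Gap between the desired class distribution and the current average prediction.
    shift = class_dist - preds.mean(axis=0)
    # Broadcasting adds the same shift vector to every row.
    return preds + shift


# Toy predictions for three instances and two classes (made-up numbers).
preds = np.array([[0.7, 0.3],
                  [0.6, 0.4],
                  [0.2, 0.8]])
# Assumed known class distribution after the shift.
target = np.array([0.3, 0.7])

adjusted = additive_adjust(preds, target)
print(adjusted.mean(axis=0))
```

Because the shift vector sums to zero whenever both the predictions and the target distribution sum to one, each adjusted row still sums to one. Only the individual entries can leave the unit interval.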

Proofs and code: https://github.com/teddyheiser/Shift_Happens_ECML_PKDD_2019

Title: Shift Happens: Adjusting Classifiers
Authors: Theodore James Thibault Heiser, Mari-Liis Allikivi, Meelis Kull
Publisher: Springer International Publishing
Book: Machine Learning and Knowledge Discovery in Databases
Print ISBN: 978-3-030-46146-1

Electronic ISBN: 978-3-030-46147-8

Copyright year: 2020
DOI: https://doi.org/10.1007/978-3-030-46147-8_4
