Skip to main content

2020 | OriginalPaper | Buchkapitel

Shift Happens: Adjusting Classifiers

verfasst von : Theodore James Thibault Heiser, Mari-Liis Allikivi, Meelis Kull

Erschienen in: Machine Learning and Knowledge Discovery in Databases

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Minimizing expected loss measured by a proper scoring rule, such as Brier score or log-loss (cross-entropy), is a common objective while training a probabilistic classifier. If the data have experienced dataset shift where the class distributions change post-training, then often the model’s performance will decrease, over-estimating the probabilities of some classes while under-estimating the others on average. We propose unbounded and bounded general adjustment (UGA and BGA) methods that transform all predictions to (re-)equalize the average prediction and the class distribution. These methods act differently depending on which proper scoring rule is to be minimized, and we have a theoretical guarantee of reducing loss on test data, if the exact class distribution is known. We also demonstrate experimentally that, when in practice the class distribution is known only approximately, there is often still a reduction in loss depending on the amount of shift and the precision to which the class distribution is known.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Banerjee, A., Guo, X., Wang, H.: On the optimality of conditional expectation as a Bregman predictor. IEEE Trans. Inf. Theory 51(7), 2664–2669 (2005)MathSciNetMATHCrossRef Banerjee, A., Guo, X., Wang, H.: On the optimality of conditional expectation as a Bregman predictor. IEEE Trans. Inf. Theory 51(7), 2664–2669 (2005)MathSciNetMATHCrossRef
2.
Zurück zum Zitat Bauschke, H.H., Borwein, J.M.: Joint and separate convexity of the Bregman distance. In: Studies in Computational Mathematics, vol. 8, pp. 23–36. Elsevier (2001) Bauschke, H.H., Borwein, J.M.: Joint and separate convexity of the Bregman distance. In: Studies in Computational Mathematics, vol. 8, pp. 23–36. Elsevier (2001)
3.
Zurück zum Zitat Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge Univ. Press, Cambridge (2004)MATHCrossRef Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge Univ. Press, Cambridge (2004)MATHCrossRef
4.
Zurück zum Zitat Bregman, L.M.: The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Comput. Math. Math. Phys. 7(3), 200–217 (1967)MathSciNetMATHCrossRef Bregman, L.M.: The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Comput. Math. Math. Phys. 7(3), 200–217 (1967)MathSciNetMATHCrossRef
6.
Zurück zum Zitat Diamond, S., Boyd, S.: CVXPY: a Python-embedded modeling language for convex optimization. J. Mach. Learn. Res. 17(83), 1–5 (2016)MathSciNetMATH Diamond, S., Boyd, S.: CVXPY: a Python-embedded modeling language for convex optimization. J. Mach. Learn. Res. 17(83), 1–5 (2016)MathSciNetMATH
7.
Zurück zum Zitat Gretton, A., Smola, A.J., Huang, J., Schmittfull, M., Borgwardt, K.M., Schölkopf, B.: Covariate shift by kernel mean matching. In: Dataset shift in machine learning, pp. 131–160 (2009) Gretton, A., Smola, A.J., Huang, J., Schmittfull, M., Borgwardt, K.M., Schölkopf, B.: Covariate shift by kernel mean matching. In: Dataset shift in machine learning, pp. 131–160 (2009)
8.
Zurück zum Zitat Hein, M.: Binary classification under sample selection bias. In: Candela, J., Sugiyama, M., Schwaighofer, A., Lawrence, N. (eds.) Dataset Shift in Machine Learning. MIT Press, Cambridge, pp. 41–64 (2009) Hein, M.: Binary classification under sample selection bias. In: Candela, J., Sugiyama, M., Schwaighofer, A., Lawrence, N. (eds.) Dataset Shift in Machine Learning. MIT Press, Cambridge, pp. 41–64 (2009)
9.
Zurück zum Zitat Kull, M., Flach, P.: Novel decompositions of proper scoring rules for classification: score adjustment as precursor to calibration. In: Appice, A., Rodrigues, P.P., Santos Costa, V., Soares, C., Gama, J., Jorge, A. (eds.) ECML PKDD 2015. LNCS (LNAI), vol. 9284, pp. 68–85. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-23528-8_5CrossRef Kull, M., Flach, P.: Novel decompositions of proper scoring rules for classification: score adjustment as precursor to calibration. In: Appice, A., Rodrigues, P.P., Santos Costa, V., Soares, C., Gama, J., Jorge, A. (eds.) ECML PKDD 2015. LNCS (LNAI), vol. 9284, pp. 68–85. Springer, Cham (2015). https://​doi.​org/​10.​1007/​978-3-319-23528-8_​5CrossRef
11.
Zurück zum Zitat Moreno-Torres, J.G., Raeder, T., Alaiz-Rodríguez, R., Chawla, N.V., Herrera, F.: A unifying view on dataset shift in classification. Pattern Recogn. 45(1), 521–530 (2012)CrossRef Moreno-Torres, J.G., Raeder, T., Alaiz-Rodríguez, R., Chawla, N.V., Herrera, F.: A unifying view on dataset shift in classification. Pattern Recogn. 45(1), 521–530 (2012)CrossRef
12.
Zurück zum Zitat Saerens, M., Latinne, P., Decaestecker, C.: Adjusting the outputs of a classifier to new a priori probabilities: a simple procedure. Neural Comp. 14(1), 21–41 (2002)MATHCrossRef Saerens, M., Latinne, P., Decaestecker, C.: Adjusting the outputs of a classifier to new a priori probabilities: a simple procedure. Neural Comp. 14(1), 21–41 (2002)MATHCrossRef
13.
Zurück zum Zitat Shimodaira, H.: Improving predictive inference under covariate shift by weighting the log-likelihood function. J. Stat. Plan. Infer. 90(2), 227–244 (2000)MathSciNetMATHCrossRef Shimodaira, H.: Improving predictive inference under covariate shift by weighting the log-likelihood function. J. Stat. Plan. Infer. 90(2), 227–244 (2000)MathSciNetMATHCrossRef
14.
Zurück zum Zitat Storkey, A.: When training and test sets are different: characterizing learning transfer. Dataset shift in machine learning, pp. 3–28 (2009) Storkey, A.: When training and test sets are different: characterizing learning transfer. Dataset shift in machine learning, pp. 3–28 (2009)
15.
Zurück zum Zitat Sugiyama, M., Krauledat, M., Müller, K.R.: Covariate shift adaptation by importance weighted cross validation. J. Mach. Learn. Res. 8(May), 985–1005 (2007)MATH Sugiyama, M., Krauledat, M., Müller, K.R.: Covariate shift adaptation by importance weighted cross validation. J. Mach. Learn. Res. 8(May), 985–1005 (2007)MATH
16.
Zurück zum Zitat Vanschoren, J., van Rijn, J.N., Bischl, B., Torgo, L.: OpenML: networked science in machine learning. SIGKDD Explor. 15(2), 49–60 (2013)CrossRef Vanschoren, J., van Rijn, J.N., Bischl, B., Torgo, L.: OpenML: networked science in machine learning. SIGKDD Explor. 15(2), 49–60 (2013)CrossRef
Metadaten
Titel
Shift Happens: Adjusting Classifiers
verfasst von
Theodore James Thibault Heiser
Mari-Liis Allikivi
Meelis Kull
Copyright-Jahr
2020
DOI
https://doi.org/10.1007/978-3-030-46147-8_4