When predicting a categorical outcome, some measure of classification accuracy is typically used to evaluate the model’s effectiveness. However, there are different ways to measure classification accuracy, depending on the modeler’s primary objectives. Most classification models can produce both a continuous and a categorical prediction. In Section 11.1, we review these outputs, demonstrate how to adjust probabilities based on calibration plots, recommend ways of displaying class predictions, and define equivocal or indeterminate zones of prediction. In Section 11.2, we review common metrics for assessing class predictions, such as accuracy, kappa, sensitivity, specificity, and positive and negative predictive values. This section also addresses model evaluation when costs are attached to false positive or false negative mistakes. Classification models may also produce predicted class probabilities. Evaluating this type of output is addressed in Section 11.3, which includes a discussion of receiver operating characteristic (ROC) curves as well as lift charts. In Section 11.4, we demonstrate how measures of classification performance can be generated in R.
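As a rough illustration of the class-prediction metrics listed for Section 11.2, the sketch below computes them from the four cells of a two-class confusion matrix. It is written in Python rather than the R used in the chapter, and the function name `binary_metrics`, its argument names, and the example counts are all hypothetical choices made for this sketch, not taken from the book. It assumes every row and column total of the table is nonzero, and it uses the Cohen (1960) definition of kappa: observed agreement corrected for the agreement expected by chance.

```python
def binary_metrics(tp, fp, fn, tn):
    """Summarize a 2x2 confusion matrix (hypothetical helper for illustration).

    tp: events predicted as events        fp: nonevents predicted as events
    fn: events predicted as nonevents     tn: nonevents predicted as nonevents
    Assumes every row and column total is nonzero.
    """
    n = tp + fp + fn + tn

    accuracy = (tp + tn) / n          # overall agreement
    sensitivity = tp / (tp + fn)      # true positive rate
    specificity = tn / (tn + fp)      # true negative rate
    ppv = tp / (tp + fp)              # positive predictive value
    npv = tn / (tn + fn)              # negative predictive value

    # Agreement expected by chance, from the marginal totals of the table.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected)

    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
    }


# Made-up counts, used only to exercise the function.
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the accuracy is 0.7 while kappa is only about 0.4, because half of the agreement would be expected by chance given the marginal totals. This gap is one reason a single accuracy figure can overstate a model's effectiveness.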
- Measuring Performance in Classification Models
- Springer New York
- Chapter 11