Elsevier

Pattern Recognition

Volume 30, Issue 7, July 1997, Pages 1145-1159

The use of the area under the ROC curve in the evaluation of machine learning algorithms

https://doi.org/10.1016/S0031-3203(96)00142-2

Abstract

In this paper we investigate the use of the area under the receiver operating characteristic (ROC) curve (AUC) as a performance measure for machine learning algorithms. As a case study we evaluate six machine learning algorithms (C4.5, Multiscale Classifier, Perceptron, Multi-layer Perceptron, k-Nearest Neighbours, and a Quadratic Discriminant Function) on six “real world” medical diagnostics data sets. We compare AUC with the more conventional overall accuracy and find that AUC exhibits a number of desirable properties: increased sensitivity in Analysis of Variance (ANOVA) tests; a standard error that decreases as both AUC and the number of test samples increase; independence of the decision threshold; and invariance to a priori class probabilities. The paper concludes with the recommendation that AUC be used in preference to overall accuracy for “single number” evaluation of machine learning algorithms.
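The properties listed in the abstract can be illustrated with a minimal Python sketch. It is not the paper's own code. It assumes AUC is estimated as the fraction of positive/negative score pairs that are ranked correctly (ties counted as one half), and it uses one widely used standard-error approximation, due to Hanley and McNeil (1982). The function names `auc_rank` and `hanley_mcneil_se` and the toy scores are hypothetical, chosen only for illustration.

```python
import math


def auc_rank(scores_pos, scores_neg):
    """Estimate AUC as the fraction of (positive, negative) pairs in which
    the positive example scores higher; ties count as one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC estimate (Hanley and McNeil, 1982)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


# Toy classifier scores (made up for illustration only).
pos = [0.9, 0.8, 0.7, 0.4]
neg = [0.6, 0.5, 0.3, 0.2]

auc = auc_rank(pos, neg)

# No decision threshold is chosen anywhere: only the ordering of scores matters,
# so a monotone rescaling of the scores leaves the estimate unchanged.
auc_rescaled = auc_rank([math.exp(s) for s in pos], [math.exp(s) for s in neg])

# Changing the class proportions (three times as many negatives with the same
# score distribution) also leaves the estimate unchanged.
auc_more_negatives = auc_rank(pos, neg * 3)

print(auc, auc_rescaled, auc_more_negatives)
print(hanley_mcneil_se(auc, 4, 4), hanley_mcneil_se(auc, 40, 40))
```

Under this approximation the standard error shrinks both when the test set grows and when the AUC itself moves towards 1, which matches the behaviour the abstract describes. Overall accuracy, by contrast, requires fixing a threshold and shifts when the class proportions change.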

* Present address: Department of Computing Science, 615 General Services Building, University of Alberta, Edmonton, Canada T6G 2H1.
