To test how well a data mining model can distinguish between observation points of different types, we propose indicators that measure the difference between the score distributions of positive and negative points. First, we use the overlapping area of the two score distributions, called the overlapping degree, to describe this difference, and we discuss its properties. Second, we put forward graphical and quantitative indicators of a model's discriminating ability: the Lorenz curve, the Gini coefficient, and AR, together with the closely related ROC curve and AUC. We prove that AUC and AR are exactly linearly related. Finally, we construct a nonparametric test statistic for AUC. In contrast to the K-S statistic, however, we cannot conclude that the null hypothesis becomes harder to reject when negative points make up a smaller proportion.
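The linear relation between AUC and AR can be checked numerically. The sketch below is illustrative only and is not taken from the chapter. It assumes that AR is the accuracy ratio read off a cumulative (Lorenz-type) curve, that positive points tend to receive higher scores, and that no two scores are tied. Under these assumptions the standard relation is AR = 2·AUC − 1. The function names and the sample scores are hypothetical.

```python
import numpy as np


def auc_pairwise(pos, neg):
    """AUC as the share of (positive, negative) pairs in which the
    positive point has the higher score; ties count as one half."""
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())


def accuracy_ratio_from_curve(pos, neg):
    """AR from the cumulative curve: area between the model curve and the
    diagonal, divided by the same area for a perfect model.
    Assumes untied scores (an assumption of this sketch)."""
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)
    scores = np.concatenate([pos, neg])
    labels = np.concatenate([np.ones(len(pos)), np.zeros(len(neg))])

    # Rank all points from highest to lowest score.
    order = np.argsort(-scores, kind="stable")
    hits = labels[order]

    # x: share of all points examined, y: share of positives captured.
    n = len(scores)
    x = np.concatenate([[0.0], np.arange(1, n + 1) / n])
    y = np.concatenate([[0.0], np.cumsum(hits) / hits.sum()])

    # Trapezoidal area under the model curve.
    area_model = float(np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2.0))

    # A perfect model captures every positive first.
    p = len(pos) / n
    area_perfect = 1.0 - p / 2.0

    return (area_model - 0.5) / (area_perfect - 0.5)


# Hypothetical scores, chosen so that no two values are tied.
pos_scores = [0.9, 0.8, 0.55, 0.4]
neg_scores = [0.7, 0.5, 0.3, 0.2, 0.1]

auc = auc_pairwise(pos_scores, neg_scores)
ar = accuracy_ratio_from_curve(pos_scores, neg_scores)
print(auc, ar, 2 * auc - 1)
```

For these sample scores, 17 of the 20 positive-negative pairs are ordered correctly, so AUC is 0.85 and both routes give an AR of 0.7. The two quantities are computed independently here, so the agreement is a check on the relation and does not follow from the way the code is written.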
- The Measurement of Distinguishing Ability of Classification in Data Mining Model and Its Statistical Significance
- Springer Berlin Heidelberg