1 Introduction
-
Section 4.1 discusses the applicability of prequential AUC to performance visualization over time, for stationary and drifting streams;
-
Section 4.2 analyzes the consistency and discriminancy of prequential AUC compared to AUC computed traditionally on an entire stationary dataset;
-
Section 5.3 summarizes experiments performed to evaluate the speed of the proposed measure;
-
Section 5.4 analyzes parameter sensitivity of prequential AUC;
-
Section 5.5 compares prequential AUC to other class imbalance evaluation measures on a series of streams involving various difficulty factors.
2 Related work
2.1 Stream classifier evaluation
2.2 Area under the ROC curve
3 Prequential AUC
4 Properties of prequential AUC
4.1 AUC visualizations over time
RBF
generator (the dataset will be discussed in more detail in Sect. 5.2). The first dataset contained no drifts, whereas in the second dataset a sudden drift was added after 10 k examples. Prequential and block AUC were calculated using a window of \(d = 1000\) examples (the impact of the d parameter will be analyzed in Sect. 5.4). It is worth noting that similar comparisons were previously done for incremental and prequential accuracy [21, 33].4.2 Prequential AUC averaged over entire streams
4.2.1 Motivation
\(C_1\)
| − | − | − | − | − | − | − | − |
\(+\)
|
\(+\)
|
\(+\)
|
\(+\)
|
\(+\)
|
\(+\)
|
\(+\)
|
\(+\)
|
t
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
\(C_2\)
| − | − | − | − | − | − | − | − |
\(+\)
|
\(+\)
|
\(+\)
|
\(+\)
|
\(+\)
|
\(+\)
|
\(+\)
|
\(+\)
|
t
| 1 | 3 | 5 | 7 | 9 | 11 | 13 | 15 | 2 | 4 | 6 | 8 | 10 | 12 | 14 | 16 |
\(C_3\)
|
\(+\)
| − |
\(+\)
| − |
\(+\)
| − |
\(+\)
| − |
\(+\)
| − |
\(+\)
|
\( -\)
|
\(+\)
| − |
\(+\)
| − |
t
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
\(C_4\)
| − | − | − | − | − | − | − | − |
\(+\)
|
\(+\)
|
\(+\)
|
\(+\)
|
\(+\)
|
\(+\)
|
\( +\)
|
\( +\)
|
t
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
4.2.2 Theoretical background
4.2.3 Simulations
n
|
d
|
\(A(a) < A(b)\) & |
\(A(a) < A(b)\) & |
\(\mathbf {C}\)
|
n
|
d
|
\(A(a) < A(b)\) & |
\(A(a) < A(b)\) & |
\(\mathbf {C}\)
|
---|---|---|---|---|---|---|---|---|---|
\(pA(a) < pA(b)\)
|
\(pA(a) > pA(b)\)
|
\(bA(a) < bA(b)\)
|
\(bA(a) > bA(b)\)
| ||||||
(a) Prequential AUC
|
(b) Block-based AUC
| ||||||||
4 | 2 | 226 | 30 | 0.883 | 4 | 2 | 208 | 56 | 0.788 |
4 | 3 | 276 | 0 | 1.000 | 4 | 3 | 236 | 28 | 0.894 |
6 | 2 | 78,558 | 14,206 | 0.847 | 6 | 2 | 67,230 | 17,428 | 0.794 |
6 | 3 | 95,086 | 11,688 | 0.891 | 6 | 3 | 87,150 | 14,164 | 0.860 |
6 | 4 | 106,566 | 10,098 | 0.913 | 6 | 4 | 92,930 | 19,356 | 0.828 |
6 | 5 | 109,152 | 3840 | 0.966 | 6 | 5 | 101,520 | 7,608 | 0.930 |
8 | 2 | 53,158,628 | 11,185,976 | 0.826 | 8 | 2 | 45,405,586 | 13,358,810 | 0.773 |
8 | 3 | 63,240,034 | 10,425,237 | 0.858 | 8 | 3 | 53,850,202 | 14,391,887 | 0.789 |
8 | 4 | 69,182,304 | 9,972,770 | 0.874 | 8 | 4 | 66,567,040 | 8,624,167 | 0.885 |
8 | 5 | 70,761,192 | 7,693,012 | 0.902 | 8 | 5 | 63,572,560 | 12,086,730 | 0.840 |
8 | 6 | 73,479,168 | 6,411,408 | 0.920 | 8 | 6 | 64,611,792 | 14,232,192 | 0.819 |
8 | 7 | 75,503,520 | 2,403,360 | 0.969 | 8 | 7 | 72,136,080 | 4,448,880 | 0.942 |
10 | 2 | 60,634,866,784 | 14,063,999,524 | 0.812 | 10 | 2 | 51,487,887,104 | 17,000,441,762 | 0.752 |
10 | 3 | 71,130,163,463 | 13,290,485,752 | 0.843 | 10 | 3 | 59,490,757,935 | 16,883,741,696 | 0.779 |
10 | 4 | 74,037,644,404 | 14,163,943,582 | 0.839 | 10 | 4 | 64,512,419,513 | 20,509,708,299 | 0.759 |
10 | 5 | 78,920,700,364 | 11,176,341,290 | 0.876 | 10 | 5 | 76,714,147,735 | 9,078,493,566 | 0.894 |
10 | 6 | 73,168,730,742 | 13,467,350,816 | 0.845 | 10 | 6 | 71,855,703,700 | 13,934,726,872 | 0.838 |
10 | 7 | 79,076,275,296 | 9,293,058,744 | 0.895 | 10 | 7 | 70,312,608,720 | 16,552,837,632 | 0.809 |
10 | 8 | 83,944,244,880 | 6,701,320,800 | 0.926 | 10 | 8 | 74,378,685,600 | 15,590,530,080 | 0.827 |
10 | 9 | 86,539,824,000 | 2,926,627,200 | 0.967 | 10 | 9 | 83,365,107,840 | 4,817,836,800 | 0.945 |
n
|
d
|
\(A(a) < A(b)\) & |
\(A(a) < A(b)\) & |
\(\mathbf {C}\)
|
n
|
d
|
\(A(a) < A(b)\) & |
\(A(a) < A(b)\) & |
\(\mathbf {C}\)
|
---|---|---|---|---|---|---|---|---|---|
\(pA(a) < pA(b)\)
|
\(pA(a) > pA(b)\)
|
\(bA(a) < bA(b)\)
|
\(bA(a) > bA(b)\)
| ||||||
(a) Prequential AUC
|
(b) Block-based AUC
| ||||||||
4 | 2 | 96 | 8 | 0.923 | 4 | 2 | 92 | 8 | 0.920 |
4 | 3 | 112 | 8 | 0.933 | 4 | 3 | 102 | 18 | 0.850 |
6 | 2 | 44,328 | 7430 | 0.856 | 6 | 2 | 39,222 | 9582 | 0.804 |
6 | 3 | 53,318 | 7312 | 0.879 | 6 | 3 | 49,482 | 7590 | 0.867 |
6 | 4 | 58,774 | 6904 | 0.895 | 6 | 4 | 52,302 | 11,468 | 0.820 |
6 | 5 | 62,064 | 3312 | 0.949 | 6 | 5 | 59,496 | 5352 | 0.917 |
8 | 2 | 34,250,730 | 6,980,006 | 0.831 | 8 | 2 | 28,945,260 | 8,463,258 | 0.774 |
8 | 3 | 40,699,497 | 6,793,142 | 0.857 | 8 | 3 | 34,682,961 | 8,761,157 | 0.798 |
8 | 4 | 44,193,331 | 6,982,899 | 0.864 | 8 | 4 | 43,361,987 | 6,740,645 | 0.865 |
8 | 5 | 45,463,956 | 5,426,998 | 0.893 | 8 | 5 | 41,705,650 | 7,623,636 | 0.845 |
8 | 6 | 47,516,640 | 4,326,864 | 0.917 | 8 | 6 | 42,382,608 | 8,985,336 | 0.825 |
8 | 7 | 49,281,120 | 2,127,600 | 0.959 | 8 | 7 | 47,743,920 | 3,418,560 | 0.933 |
10 | 2 | 14,056,095,136 | 2,991,723,204 | 0.825 | 10 | 2 | 12,195,033,276 | 3,413,021,306 | 0.781 |
10 | 3 | 16,506,930,025 | 3,058,609,668 | 0.844 | 10 | 3 | 13,792,398,281 | 3,675,831,232 | 0.790 |
10 | 4 | 17,397,851,565 | 3,458,536,944 | 0.834 | 10 | 4 | 15,903,298,910 | 4,113,647,610 | 0.794 |
10 | 5 | 17,857,902,489 | 3,077,583,820 | 0.853 | 10 | 5 | 17,306,025,784 | 3,102,200,565 | 0.848 |
10 | 6 | 17,368,267,282 | 3,208,774,634 | 0.844 | 10 | 6 | 17,469,134,532 | 2,947,901,726 | 0.856 |
10 | 7 | 18,541,658,208 | 2,198,965,392 | 0.894 | 10 | 7 | 16,864,120,896 | 3,466,868,304 | 0.829 |
10 | 8 | 19,469,178,720 | 1,653,883,920 | 0.922 | 10 | 8 | 17,293,451,760 | 3,656,527,920 | 0.825 |
10 | 9 | 20,338,174,080 | 834,825,600 | 0.961 | 10 | 9 | 19,822,360,320 | 1,216,010,880 | 0.942 |
n
|
d
|
\(A(a) < A(b)\) & |
\(A(a) < A(b)\) & |
\(\mathbf {C}\)
|
n
|
d
|
\(A(a) < A(b)\) & |
\(A(a) < A(b)\) & |
\(\mathbf {C}\)
|
---|---|---|---|---|---|---|---|---|---|
\(pA(a) < pA(b)\)
|
\(pA(a) > pA(b)\)
|
\(bA(a) < bA(b)\)
|
\(bA(a) > bA(b)\)
| ||||||
(a) Prequential AUC
|
(b) Block-based AUC
| ||||||||
4 | 2 | 96 | 8 | 0.923 | 4 | 2 | 92 | 8 | 0.920 |
4 | 3 | 112 | 8 | 0.933 | 4 | 3 | 102 | 18 | 0.850 |
6 | 2 | 6888 | 840 | 0.891 | 6 | 2 | 6096 | 784 | 0.886 |
6 | 3 | 8068 | 1240 | 0.867 | 6 | 3 | 8152 | 644 | 0.927 |
6 | 4 | 8568 | 1220 | 0.875 | 6 | 4 | 8276 | 934 | 0.899 |
6 | 5 | 9264 | 816 | 0.919 | 6 | 5 | 8880 | 1200 | 0.881 |
8 | 2 | 701,568 | 96,768 | 0.879 | 8 | 2 | 571,680 | 71,040 | 0.889 |
8 | 3 | 841,168 | 145,632 | 0.852 | 8 | 3 | 796,032 | 90,528 | 0.898 |
8 | 4 | 868,680 | 164,584 | 0.841 | 8 | 4 | 914,944 | 64,888 | 0.93 |
8 | 5 | 891,960 | 164,384 | 0.844 | 8 | 5 | 923,656 | 77,832 | 0.922 |
8 | 6 | 936,144 | 139,056 | 0.871 | 8 | 6 | 940,368 | 99,312 | 0.904 |
8 | 7 | 1,006,560 | 82,080 | 0.925 | 8 | 7 | 982,800 | 105,840 | 0.903 |
10 | 2 | 99,792,000 | 14,636,160 | 0.872 | 10 | 2 | 81,250,560 | 11,632,320 | 0.875 |
10 | 3 | 121,113,792 | 21,683,232 | 0.848 | 10 | 3 | 105,151,848 | 16,303,848 | 0.866 |
10 | 4 | 125,448,672 | 25,481,808 | 0.831 | 10 | 4 | 126,349,656 | 12,907,656 | 0.907 |
10 | 5 | 127,000,512 | 27,198,096 | 0.824 | 10 | 5 | 139,374,336 | 8,251,152 | 0.944 |
10 | 6 | 128,755,248 | 27,134,520 | 0.826 | 10 | 6 | 142,754,112 | 12,808,368 | 0.918 |
10 | 7 | 132,687,744 | 24,559,776 | 0.844 | 10 | 7 | 141,324,576 | 10,824,720 | 0.929 |
10 | 8 | 139,400,640 | 19,555,920 | 0.877 | 10 | 8 | 142,572,240 | 12,827,520 | 0.917 |
10 | 9 | 148,861,440 | 10,805,760 | 0.932 | 10 | 9 | 146,603,520 | 13,063,680 | 0.918 |
n
|
d
|
\(A(a) < A(b)\) & |
\(A(a) = A(b)\) & |
\(\mathbf {D}\)
|
n
|
d
|
\(A(a) < A(b)\) & |
\(A(a) = A(b)\) & |
\(\mathbf {D}\)
|
---|---|---|---|---|---|---|---|---|---|
\(pA(a) = pA(b)\)
|
\(pA(a) < pA(b)\)
|
\(bA(a) = bA(b)\)
|
\(bA(a) < bA(b)\)
| ||||||
(a) Prequential AUC
|
(b) Block-based AUC
| ||||||||
4 | 2 | 80 | 8 | 10.00 | 4 | 2 | 72 | 0 |
\(\infty \)
|
4 | 3 | 60 | 8 | 7.50 | 4 | 3 | 72 | 4 | 18.00 |
6 | 2 | 26,756 | 3500 | 7.64 | 6 | 2 | 34,862 | 2010 | 17.34 |
6 | 3 | 12,746 | 4044 | 3.15 | 6 | 3 | 18,206 | 2802 | 6.50 |
6 | 4 | 2856 | 4724 | 0.60 | 6 | 4 | 7234 | 4122 | 1.75 |
6 | 5 | 6528 | 4032 | 1.62 | 6 | 5 | 10,392 | 2712 | 3.83 |
8 | 2 | 16,214,756 | 2,353,182 | 6.89 | 8 | 2 | 21,794,964 | 1,827,868 | 11.92 |
8 | 3 | 6,894,089 | 2,738,339 | 2.52 | 8 | 3 | 12,317,271 | 2,152,120 | 5.72 |
8 | 4 | 1,304,298 | 2,996,764 | 0.44 | 8 | 4 | 5,268,165 | 2,258,632 | 2.33 |
8 | 5 | 2,104,792 | 2,870,944 | 0.73 | 8 | 5 | 4,899,706 | 2,262,634 | 2.17 |
8 | 6 | 356,712 | 2,891,136 | 0.12 | 8 | 6 | 1,403,304 | 2,475,192 | 0.57 |
8 | 7 | 2,652,480 | 2,349,360 | 1.13 | 8 | 7 | 3,974,400 | 1,751,040 | 2.27 |
10 | 2 | 16,474,070,244 | 2,325,714,636 | 7.08 | 10 | 2 | 22,684,607,686 | 1,764,461,488 | 12.86 |
10 | 3 | 6,751,652,794 | 2,692,735,085 | 2.51 | 10 | 3 | 14,797,802,378 | 2,108,350,183 | 7.02 |
10 | 4 | 1,135,333,025 | 2,782,776,394 | 0.41 | 10 | 4 | 4,314,793,199 | 2,648,700,613 | 1.63 |
10 | 5 | 1,523,949,099 | 2,693,165,272 | 0.57 | 10 | 5 | 5,828,349,452 | 2,133,672,486 | 2.73 |
10 | 6 | 205,901,522 | 2,464,416,666 | 0.08 | 10 | 6 | 1,051,552,508 | 2,577,780,700 | 0.41 |
10 | 7 | 837,730,392 | 2,688,388,944 | 0.31 | 10 | 7 | 2,341,618,080 | 2,381,208,000 | 0.98 |
10 | 8 | 163,838,880 | 2,513,982,960 | 0.07 | 10 | 8 | 840,188,880 | 2,230,413,840 | 0.38 |
10 | 9 | 1,707,148,800 | 2,113,695,360 | 0.81 | 10 | 9 | 2,990,655,360 | 1,607,114,880 | 1.86 |
n
|
d
|
\(A(a) < A(b)\) & |
\(A(a) = A(b)\) & |
\(\mathbf {D}\)
|
n
|
d
|
\(A(a) < A(b)\) & |
\(A(a) = A(b)\) & |
\(\mathbf {D}\)
|
---|---|---|---|---|---|---|---|---|---|
\(pA(a) = pA(b)\)
|
\(pA(a) < pA(b)\)
|
\(bA(a) = bA(b)\)
|
\(bA(a) < bA(b)\)
| ||||||
(a) Prequential AUC
|
(b) Block-based AUC
| ||||||||
4 | 2 | 40 | 0 |
\(\infty \)
| 4 | 2 | 44 | 0 |
\(\infty \)
|
4 | 3 | 24 | 0 |
\(\infty \)
| 4 | 3 | 24 | 0 |
\(\infty \)
|
6 | 2 | 15,922 | 1676 | 9.50 | 6 | 2 | 18,876 | 1050 | 17.98 |
6 | 3 | 7050 | 2080 | 3.39 | 6 | 3 | 10,608 | 1042 | 10.18 |
6 | 4 | 1988 | 2408 | 0.83 | 6 | 4 | 3896 | 2156 | 1.81 |
6 | 5 | 2304 | 2040 | 1.13 | 6 | 5 | 2832 | 1416 | 2.00 |
8 | 2 | 10,822,384 | 1,445,690 | 7.49 | 8 | 2 | 14,644,602 | 1,061,404 | 13.80 |
8 | 3 | 4,560,481 | 1,698,162 | 2.69 | 8 | 3 | 8,609,002 | 1,206,625 | 7.13 |
8 | 4 | 904,810 | 1,861,822 | 0.49 | 8 | 4 | 1,978,408 | 1,835,340 | 1.08 |
8 | 5 | 1,156,828 | 1,864,496 | 0.62 | 8 | 5 | 2,718,496 | 1,620,150 | 1.68 |
8 | 6 | 191,496 | 1,831,680 | 0.10 | 8 | 6 | 667,056 | 1,522,224 | 0.44 |
8 | 7 | 646,560 | 1,517,760 | 0.43 | 8 | 7 | 892,800 | 1,100,880 | 0.81 |
10 | 2 | 4,231,464,860 | 512,002,316 | 8.26 | 10 | 2 | 5,671,228,618 | 384,816,444 | 14.74 |
10 | 3 | 1,713,743,507 | 605,486,578 | 2.83 | 10 | 3 | 3,811,053,687 | 451,977,687 | 8.43 |
10 | 4 | 295,397,178 | 680,067,479 | 0.43 | 10 | 4 | 1,134,839,167 | 602,144,260 | 1.88 |
10 | 5 | 297,697,500 | 660,113,562 | 0.45 | 10 | 5 | 824,957,460 | 625,114,249 | 1.32 |
10 | 6 | 43,162,624 | 665,026,736 | 0.06 | 10 | 6 | 203,168,282 | 669,160,838 | 0.30 |
10 | 7 | 104,005,512 | 765,190,632 | 0.14 | 10 | 7 | 513,639,912 | 557,695,848 | 0.92 |
10 | 8 | 79,602,480 | 614,255,040 | 0.13 | 10 | 8 | 252,685,440 | 474,176,160 | 0.53 |
10 | 9 | 114,589,440 | 486,864,000 | 0.24 | 10 | 9 | 249,217,920 | 341,107,200 | 0.73 |
n
|
d
|
\(A(a) < A(b)\) & |
\(A(a) = A(b)\) & |
\(\mathbf {D}\)
|
n
|
d
|
\(A(a) < A(b)\) & |
\(A(a) = A(b)\)& |
\(\mathbf {D}\)
|
---|---|---|---|---|---|---|---|---|---|
\(pA(a) = pA(b)\)
|
\(pA(a) < pA(b)\)
|
\(bA(a) = bA(b)\)
|
\(bA(a) < bA(b)\)
| ||||||
(a) Prequential AUC
|
(b) Block-based AUC
| ||||||||
4 | 2 | 40 | 0 |
\(\infty \)
| 4 | 2 | 44 | 0 |
\(\infty \)
|
4 | 3 | 24 | 0 |
\(\infty \)
| 4 | 3 | 24 | 0 |
\(\infty \)
|
6 | 2 | 3072 | 0 |
\(\infty \)
| 6 | 2 | 3920 | 0 |
\(\infty \)
|
6 | 3 | 1492 | 0 |
\(\infty \)
| 6 | 3 | 2004 | 0 |
\(\infty \)
|
6 | 4 | 1012 | 0 |
\(\infty \)
| 6 | 4 | 1590 | 0 |
\(\infty \)
|
6 | 5 | 720 | 0 |
\(\infty \)
| 6 | 5 | 720 | 0 |
\(\infty \)
|
8 | 2 | 330,624 | 0 |
\(\infty \)
| 8 | 2 | 486,240 | 0 |
\(\infty \)
|
8 | 3 | 142,160 | 0 |
\(\infty \)
| 8 | 3 | 242,400 | 0 |
\(\infty \)
|
8 | 4 | 95,696 | 0 |
\(\infty \)
| 8 | 4 | 149,128 | 0 |
\(\infty \)
|
8 | 5 | 72,616 | 0 |
\(\infty \)
| 8 | 5 | 127,472 | 0 |
\(\infty \)
|
8 | 6 | 53,760 | 0 |
\(\infty \)
| 8 | 6 | 89,280 | 0 |
\(\infty \)
|
8 | 7 | 40,320 | 0 |
\(\infty \)
| 8 | 7 | 40,320 | 0 |
\(\infty \)
|
10 | 2 | 48,867,840 | 0 |
\(\infty \)
| 10 | 2 | 70,413,120 | 0 |
\(\infty \)
|
10 | 3 | 20,498,976 | 0 |
\(\infty \)
| 10 | 3 | 41,840,304 | 0 |
\(\infty \)
|
10 | 4 | 12,365,520 | 0 |
\(\infty \)
| 10 | 4 | 24,038,688 | 0 |
\(\infty \)
|
10 | 5 | 9,097,392 | 0 |
\(\infty \)
| 10 | 5 | 15,670,512 | 0 |
\(\infty \)
|
10 | 6 | 7,406,232 | 0 |
\(\infty \)
| 10 | 6 | 7,733,520 | 0 |
\(\infty \)
|
10 | 7 | 6,048,480 | 0 |
\(\infty \)
| 10 | 7 | 11,146,704 | 0 |
\(\infty \)
|
10 | 8 | 4,339,440 | 0 |
\(\infty \)
| 10 | 8 | 7,896,240 | 0 |
\(\infty \)
|
10 | 9 | 3,628,800 | 0 |
\(\infty \)
| 10 | 9 | 3,628,800 | 0 |
\(\infty \)
|
4.2.4 Summary of results
5 Processing speed, parameter sensitivity, and classifier evaluation
5.1 Experimental setup
-
Naive Bayes (NB),
-
Hoeffding Tree (HT),
-
Hoeffding Tree with ADWIN (HT\(_\mathrm{ADWIN}\)),
-
Online Bagging (Bag),
-
Accuracy Updated Ensemble (AUE),
-
Selectively Recursive Approach (SERA).
5.2 Datasets
Airlines
(Air
) and Electricity
(Elec
) as examples of fairly balanced datasets, and KDDCup
and PAKDD
as examples of moderately imbalanced datasets. It is worth noting that we used the smaller version of the KDDCup
dataset and transformed it into a binary classification problem, by combining every class other than “NORMAL” into one “ATTACK” class.Ratio
datasets are designed to test classifier performance under different imbalance ratios without drift. Examples from the minority class create a five-dimensional sphere, whereas majority class examples are uniformly distributed outside that sphere. The Dis
datasets are created in a similar manner, but the minority class is fragmented into spherical subclusters (playing the role of small disjuncts [32]). Datasets \(\texttt {Dis}_{2}\), \(\texttt {Dis}_{3}\), \(\texttt {Dis}_{5}\) have 2, 3, and 5 clusters, respectively. In the AppDis
datasets, every stream begins with a single well-defined cluster, and additional clusters are added as the stream progresses. New disjuncts appear suddenly in negative class space after 40 k (\(\texttt {AppDis}_{2,3,5}\)), 50 k (\(\texttt {AppDis}_{3,5}\)), 60 and 70 k (\(\texttt {AppDis}_{5}\)) examples. In static data mining, the problem of small disjuncts is known to be more problematic than class imbalance per se [32], however, to the best of our knowledge this issue has not been analyzed in stream classification. Datasets MinMaj
, \(\texttt {Gradual}_\mathrm{RC}\), \(\texttt {Sudden}_\mathrm{RC}\) contain class ratio changes over time. In MinMaj
the negative class abruptly becomes the minority; such a virtual drift relates to a problem recently discussed in [46]. \(\texttt {Sudden}_\mathrm{RC}\) was created using a modified version of the SEA generator [42] and contained three sudden class ratio changes (1:1/1:100/1:10/1:1) appearing every 25 k examples. Analogously, Gradual
\(_\mathrm{RC}\) uses a modified Hyperplane generator [45] and simulates a continuous ratio change from 1:1 to 1:100 throughout the entire stream. We note that Ratio
, Dis,
AppDis
, and MinMaj
are unevenly imbalanced, i.e., for a class ratio of 1:99 a positive example can occur less (or more) frequently than every 100 examples.Dataset | No. of inst (k) |
\(\#\)Attrs | Class ratio | Noise (%) | No. of drifts | Drift type |
---|---|---|---|---|---|---|
\(\texttt {Ratio}_{5050}\)
| 100 | 5 | 1:1 | 0 | 0 | None |
\(\texttt {Ratio}_{1090}\)
| 100 | 5 | 1:9 | 0 | 0 | None |
\(\texttt {Ratio}_{0595}\)
| 100 | 5 | 1:19 | 0 | 0 | None |
\(\texttt {Ratio}_{0199}\)
| 100 | 5 | 1:99 | 0 | 0 | None |
\(\texttt {Dis}_{2}\)
| 100 | 5 | 1:9 | 0 | 0 | None |
\(\texttt {Dis}_{3}\)
| 100 | 5 | 1:9 | 0 | 0 | None |
\(\texttt {Dis}_{5}\)
| 100 | 5 | 1:9 | 0 | 0 | None |
\(\texttt {AppDis}_{2}\)
| 100 | 5 | 1:9 | 0 | 1 | Sudden |
\(\texttt {AppDis}_{3}\)
| 100 | 5 | 1:9 | 0 | 2 | Sudden |
\(\texttt {AppDis}_{5}\)
| 100 | 5 | 1:9 | 0 | 4 | Sudden |
MinMaj
| 100 | 5 | 1:19/19:1 | 0 | 1 | Sud. virt. |
\(\texttt {Gradual}_\mathrm{RC}\)
| 100 | 3 | 1:1 \(\rightarrow \) 1:100 | 5 | 1 | Grad. virt. |
\(\texttt {Sudden}_\mathrm{RC}\)
| 100 | 3 | 1:1/1:100/1:10/1:1 | 10 | 3 | Rec. virt. |
Air
| 539 | 7 |
\(\sim \)24:30 | – | – | Unknown |
Elec
| 45 | 8 |
\(\sim \)19:26 | – | – | Unknown |
KDDCup
| 494 | 41 |
\(\sim \)1:4 | – | – | Unknown |
PAKDD
| 50 | 30 |
\(\sim \)1:4 | – | – | Unknown |
\(\texttt {RBF}_{20\mathrm{k}}\)
| 20 | 20 | 1:1 | 0 | 0 | None |
\(\texttt {RBF}_{20\mathrm{kSD}}\)
| 20 | 20 | 1:1 | 0 | 1 | Sudden |
5.3 Prequential AUC evaluation time
Prequential AUC | Prequential \(\kappa \)
| |
---|---|---|
\(\texttt {RBF}_{20\mathrm{k}}\)
|
\(7.230 \pm 0.158\)
|
\(7.189 \pm 0.246\)
|
\(\texttt {RBF}_{20\mathrm{kSD}}\)
|
\(7.344 \pm 0.519\)
|
\(7.287 \pm 0.243\)
|
5.4 Parameter sensitivity
100 | 250 | 500 | 750 | 1000 | 2000 | 3000 | 4000 | 5000 | Mean | |
---|---|---|---|---|---|---|---|---|---|---|
\(\texttt {Ratio}_{5050}\)
| 0.982 | 0.983 | 0.983 | 0.984 | 0.984 | 0.984 | 0.984 | 0.984 | 0.984 | 0.984 |
\(\texttt {Ratio}_{1090}\)
| 0.987 | 0.987 | 0.987 | 0.987 | 0.986 | 0.986 | 0.986 | 0.986 | 0.986 | 0.986 |
\(\texttt {Ratio}_{0595}\)
| 0.988 | 0.987 | 0.987 | 0.987 | 0.987 | 0.986 | 0.986 | 0.986 | 0.986 | 0.987 |
\(\texttt {Ratio}_{0199}\)
| 0.949 | 0.929 | 0.928 | 0.931 | 0.932 | 0.934 | 0.934 | 0.933 | 0.932 | 0.934 |
\(\texttt {Dis}_{2}\)
| 0.965 | 0.965 | 0.965 | 0.965 | 0.965 | 0.964 | 0.964 | 0.964 | 0.963 | 0.964 |
\(\texttt {Dis}_{3}\)
| 0.966 | 0.966 | 0.966 | 0.966 | 0.965 | 0.965 | 0.964 | 0.964 | 0.963 | 0.965 |
\(\texttt {Dis}_{5}\)
| 0.965 | 0.964 | 0.964 | 0.964 | 0.964 | 0.964 | 0.964 | 0.963 | 0.963 | 0.964 |
\(\texttt {AppDis}_{2}\)
| 0.985 | 0.985 | 0.984 | 0.984 | 0.984 | 0.984 | 0.984 | 0.984 | 0.984 | 0.984 |
\(\texttt {AppDis}_{3}\)
| 0.982 | 0.982 | 0.982 | 0.982 | 0.982 | 0.981 | 0.981 | 0.981 | 0.981 | 0.982 |
\(\texttt {AppDis}_{5}\)
| 0.980 | 0.981 | 0.981 | 0.981 | 0.981 | 0.981 | 0.981 | 0.981 | 0.981 | 0.981 |
MinMaj
| 0.986 | 0.985 | 0.985 | 0.984 | 0.985 | 0.984 | 0.984 | 0.984 | 0.984 | 0.985 |
\(\texttt {Gradual}_\mathrm{RC}\)
| 0.537 | 0.530 | 0.541 | 0.541 | 0.545 | 0.550 | 0.554 | 0.557 | 0.560 | 0.546 |
\(\texttt {Sudden}_\mathrm{RC}\)
| 0.748 | 0.751 | 0.751 | 0.753 | 0.753 | 0.754 | 0.755 | 0.756 | 0.756 | 0.753 |
Air
| 0.651 | 0.652 | 0.652 | 0.653 | 0.653 | 0.655 | 0.657 | 0.659 | 0.660 | 0.655 |
Elec
| 0.871 | 0.864 | 0.860 | 0.857 | 0.856 | 0.855 | 0.856 | 0.856 | 0.857 | 0.859 |
KDDCup
| 0.997 | 0.996 | 0.994 | 0.994 | 0.994 | 0.994 | 0.993 | 0.992 | 0.992 | 0.994 |
PAKDD
| 0.544 | 0.563 | 0.573 | 0.577 | 0.579 | 0.584 | 0.585 | 0.586 | 0.586 | 0.575 |
KDDCup
dataset the mean value of average prequential AUC for all window sizes \(d\in [100;5000]\) is 0.994. Therefore, the deviation for \(d=100\), where prequential AUC for KDDCup
equals 0.997, is: \(\frac{0.997-0.994}{0.994} \times 100\% \approx 0.3\%\).PAKDD
datasets. PAKDD
is the hardest of the analyzed datasets and requires larger windows for better AUC estimates. \(\texttt {Ratio}_{0199}\), on the other hand, is highly skewed and unevenly imbalanced, therefore, in parts of the stream minority examples appear less often than 100 or 200 examples (the largest positive class “gap” in this dataset is 700 examples). It is worth noting that this observation is in accordance with the analysis performed in Sect. 4, where we noted that the window size should be large enough to incorporate at least one positive example at all times. In terms of other dependencies upon d, as discussed in previous sections the memory required to calculate AUC grows linearly to d, whereas the processing time for single example grows logarithmically with respect to d.5.5 Comparison of imbalanced stream evaluation measures
NB | Bag | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Acc. | AUC |
\(G_\mu \)
|
\(\kappa \)
|
\(\kappa _\mathrm{m}\)
| Rec. | Acc. | AUC |
\(G_\mu \)
|
\(\kappa \)
|
\(\kappa _\mathrm{m}\)
| Rec. | |
\(\texttt {Ratio}_{5050}\)
| 0.95 | 0.98 | 0.95 | 0.91 | 0.91 | 0.99 | 0.98 | 0.99 | 0.98 | 0.96 | 0.96 | 1.00 |
\(\texttt {Ratio}_{1090}\)
| 0.95 | 0.98 | 0.84 | 0.73 | 0.54 | 0.72 | 0.97 | 0.99 | 0.95 | 0.84 | 0.70 | 0.94 |
\(\texttt {Ratio}_{0595}\)
| 0.96 | 0.98 | 0.68 | 0.54 | 0.28 | 0.46 | 0.98 | 0.99 | 0.81 | 0.71 | 0.51 | 0.68 |
\(\texttt {Ratio}_{0199}\)
| 0.99 | 0.98 | 0.12 | 0.07 | 0.01 | 0.04 | 0.99 | 0.98 | 0.00 | 0.00 |
\(-\)0.02 | 0.00 |
\(\texttt {Dis}_{2}\)
| 0.92 | 0.95 | 0.56 | 0.41 | 0.22 | 0.32 | 0.94 | 0.97 | 0.72 | 0.59 | 0.39 | 0.56 |
\(\texttt {Dis}_{3}\)
| 0.92 | 0.96 | 0.53 | 0.38 | 0.21 | 0.28 | 0.94 | 0.98 | 0.75 | 0.63 | 0.45 | 0.59 |
\(\texttt {Dis}_{5}\)
| 0.92 | 0.96 | 0.51 | 0.37 | 0.21 | 0.26 | 0.95 | 0.98 | 0.75 | 0.64 | 0.45 | 0.59 |
\(\texttt {AppDis}_{2}\)
| 0.94 | 0.94 | 0.76 | 0.63 | 0.43 | 0.60 | 0.97 | 0.99 | 0.91 | 0.82 | 0.68 | 0.84 |
\(\texttt {AppDis}_{3}\)
| 0.94 | 0.94 | 0.73 | 0.60 | 0.41 | 0.58 | 0.97 | 0.99 | 0.90 | 0.81 | 0.68 | 0.82 |
\(\texttt {AppDis}_{5}\)
| 0.94 | 0.95 | 0.74 | 0.62 | 0.45 | 0.60 | 0.97 | 0.99 | 0.90 | 0.83 | 0.71 | 0.83 |
MinMaj
| 0.95 | 0.98 | 0.80 | 0.60 | 0.05 | 0.69 | 0.98 | 0.99 | 0.85 | 0.80 | 0.69 | 0.77 |
\(\texttt {Gradual}_\mathrm{RC}\)
| 0.93 | 0.67 | 0.25 | 0.16 | 0.47 | 0.12 | 0.93 | 0.66 | 0.33 | 0.20 | 0.48 | 0.14 |
\(\texttt {Sudden}_\mathrm{RC}\)
| 0.81 | 0.71 | 0.48 | 0.33 | 0.40 | 0.50 | 0.88 | 0.77 | 0.64 | 0.51 | 0.64 | 0.53 |
Air
| 0.65 | 0.66 | 0.54 | 0.20 | 0.02 | 0.37 | 0.65 | 0.66 | 0.56 | 0.22 | 0.03 | 0.39 |
Elec
| 0.73 | 0.78 | 0.61 | 0.40 | 0.37 | 0.91 | 0.82 | 0.91 | 0.80 | 0.63 | 0.58 | 0.86 |
KDDCup
| 0.98 | 0.98 | 0.74 | 0.28 | 0.40 | 0.94 | 1.00 | 1.00 | 0.87 | 0.51 | 0.95 | 0.99 |
PAKDD
| 0.54 | 0.64 | 0.51 | 0.11 |
\(-\)1.32 | 0.64 | 0.80 | 0.62 | 0.00 | 0.00 | 0.00 | 0.00 |
HT | HT\(_\mathrm{ADWIN}\)
| |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Acc. | AUC |
\(G_\mu \)
|
\(\kappa \)
|
\(\kappa _\mathrm{m}\)
| Rec. | Acc. | AUC |
\(G_\mu \)
|
\(\kappa \)
|
\(\kappa _\mathrm{m}\)
| Rec. | |
\(\texttt {Ratio}_{5050}\)
| 0.97 | 0.98 | 0.97 | 0.95 | 0.95 | 1.00 | 0.98 | 0.99 | 0.98 | 0.96 | 0.95 | 1.00 |
\(\texttt {Ratio}_{1090}\)
| 0.97 | 0.99 | 0.94 | 0.83 | 0.68 | 0.92 | 0.97 | 0.99 | 0.95 | 0.83 | 0.68 | 0.93 |
\(\texttt {Ratio}_{0595}\)
| 0.97 | 0.99 | 0.83 | 0.71 | 0.49 | 0.71 | 0.97 | 0.98 | 0.84 | 0.70 | 0.46 | 0.73 |
\(\texttt {Ratio}_{0199}\)
| 0.99 | 0.93 | 0.04 | 0.02 |
\(-\)0.02 | 0.01 | 0.99 | 0.93 | 0.04 | 0.02 |
\(-\)0.01 | 0.01 |
\(\texttt {Dis}_{2}\)
| 0.93 | 0.96 | 0.73 | 0.58 | 0.34 | 0.57 | 0.93 | 0.96 | 0.74 | 0.58 | 0.33 | 0.59 |
\(\texttt {Dis}_{3}\)
| 0.94 | 0.97 | 0.74 | 0.60 | 0.39 | 0.57 | 0.94 | 0.96 | 0.77 | 0.63 | 0.41 | 0.63 |
\(\texttt {Dis}_{5}\)
| 0.94 | 0.96 | 0.74 | 0.61 | 0.39 | 0.57 | 0.94 | 0.97 | 0.77 | 0.63 | 0.40 | 0.63 |
\(\texttt {AppDis}_{2}\)
| 0.97 | 0.98 | 0.91 | 0.81 | 0.65 | 0.84 | 0.97 | 0.98 | 0.91 | 0.81 | 0.65 | 0.86 |
\(\texttt {AppDis}_{3}\)
| 0.96 | 0.98 | 0.89 | 0.80 | 0.65 | 0.82 | 0.97 | 0.98 | 0.92 | 0.82 | 0.67 | 0.86 |
\(\texttt {AppDis}_{5}\)
| 0.97 | 0.98 | 0.90 | 0.82 | 0.69 | 0.83 | 0.96 | 0.98 | 0.88 | 0.79 | 0.65 | 0.79 |
MinMaj
| 0.98 | 0.98 | 0.87 | 0.80 | 0.68 | 0.79 | 0.98 | 0.98 | 0.87 | 0.80 | 0.66 | 0.82 |
\(\texttt {Gradual}_\mathrm{RC}\)
| 0.93 | 0.63 | 0.36 | 0.22 | 0.48 | 0.16 | 0.93 | 0.63 | 0.37 | 0.22 | 0.48 | 0.17 |
\(\texttt {Sudden}_\mathrm{RC}\)
| 0.88 | 0.75 | 0.64 | 0.51 | 0.63 | 0.53 | 0.88 | 0.75 | 0.65 | 0.52 | 0.64 | 0.54 |
Air
| 0.65 | 0.65 | 0.56 | 0.22 | 0.04 | 0.39 | 0.64 | 0.61 | 0.54 | 0.18 | 0.00 | 0.44 |
Elec
| 0.79 | 0.86 | 0.77 | 0.56 | 0.51 | 0.83 | 0.83 | 0.87 | 0.81 | 0.65 | 0.61 | 0.89 |
KDDCup
| 1.00 | 0.99 | 0.91 | 0.50 | 0.92 | 0.99 | 1.00 | 0.99 | 0.80 | 0.46 | 0.89 | 0.99 |
PAKDD
| 0.80 | 0.58 | 0.05 | 0.00 |
\(-\)0.01 | 0.01 | 0.80 | 0.59 | 0.04 | 0.00 |
\(-\)0.01 | 0.01 |
AUE | SERA | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Acc. | AUC |
\(G_\mu \)
|
\(\kappa \)
|
\(\kappa _\mathrm{m}\)
| Rec. | Acc. | AUC |
\(G_\mu \)
|
\(\kappa \)
|
\(\kappa _\mathrm{m}\)
| Rec. | |
\(\texttt {Ratio}_{5050}\)
| 0.97 | 0.99 | 0.97 | 0.95 | 0.94 | 0.98 | 0.98 | 0.99 | 0.97 | 0.95 | 0.95 | 0.98 |
\(\texttt {Ratio}_{1090}\)
| 0.97 | 0.99 | 0.92 | 0.85 | 0.75 | 0.90 | 0.98 | 0.99 | 0.95 | 0.89 | 0.80 | 0.93 |
\(\texttt {Ratio}_{0595}\)
| 0.98 | 0.98 | 0.78 | 0.69 | 0.52 | 0.66 | 0.98 | 0.99 | 0.91 | 0.82 | 0.68 | 0.85 |
\(\texttt {Ratio}_{0199}\)
| 0.99 | 0.97 | 0.00 | 0.00 |
\(-\)0.02 | 0.00 | 0.99 | 0.98 | 0.41 | 0.36 | 0.23 | 0.28 |
\(\texttt {Dis}_{2}\)
| 0.96 | 0.98 | 0.80 | 0.71 | 0.59 | 0.70 | 0.98 | 0.99 | 0.92 | 0.85 | 0.75 | 0.89 |
\(\texttt {Dis}_{3}\)
| 0.96 | 0.98 | 0.80 | 0.72 | 0.59 | 0.70 | 0.97 | 0.98 | 0.91 | 0.84 | 0.75 | 0.87 |
\(\texttt {Dis}_{5}\)
| 0.95 | 0.97 | 0.74 | 0.65 | 0.51 | 0.61 | 0.97 | 0.98 | 0.90 | 0.83 | 0.72 | 0.85 |
\(\texttt {AppDis}_{2}\)
| 0.97 | 0.98 | 0.89 | 0.82 | 0.72 | 0.83 | 0.97 | 0.98 | 0.90 | 0.83 | 0.73 | 0.84 |
\(\texttt {AppDis}_{3}\)
| 0.97 | 0.98 | 0.87 | 0.81 | 0.70 | 0.81 | 0.97 | 0.98 | 0.88 | 0.81 | 0.70 | 0.80 |
\(\texttt {AppDis}_{5}\)
| 0.97 | 0.98 | 0.87 | 0.81 | 0.71 | 0.80 | 0.97 | 0.98 | 0.85 | 0.79 | 0.68 | 0.75 |
MinMaj
| 0.98 | 0.98 | 0.81 | 0.76 | 0.65 | 0.74 | 0.98 | 0.99 | 0.93 | 0.81 | 0.62 | 0.89 |
\(\texttt {Gradual}_\mathrm{RC}\)
| 0.92 | 0.64 | 0.17 | 0.09 | 0.43 | 0.07 | 0.93 | 0.65 | 0.46 | 0.33 | 0.52 | 0.24 |
\(\texttt {Sudden}_\mathrm{RC}\)
| 0.87 | 0.76 | 0.64 | 0.51 | 0.61 | 0.54 | 0.87 | 0.76 | 0.65 | 0.52 | 0.62 | 0.55 |
Air
| 0.67 | 0.67 | 0.55 | 0.23 | 0.08 | 0.44 | 0.61 | 0.61 | 0.55 | 0.16 |
\(-\)0.07 | 0.42 |
Elec
| 0.76 | 0.82 | 0.71 | 0.49 | 0.42 | 0.81 | 0.75 | 0.82 | 0.69 | 0.46 | 0.40 | 0.81 |
KDDCup
| 0.87 | 0.96 | 0.38 | 0.11 |
\(-\)15.83 | 0.43 | 0.99 | 0.99 | 0.81 | 0.57 |
\(-\)0.06 | 0.97 |
PAKDD
| 0.80 | 0.62 | 0.01 | 0.00 | 0.00 | 0.00 | 0.80 | 0.64 | 0.14 | 0.03 | 0.00 | 0.03 |
Ratio
datasets (from 1:1 to 1:99), values of all four aforementioned measures systematically drop. AUC, on the other hand, performs differently. This is the result of AUC being a measure that assesses the ranking capabilities of a classifier. On datasets where the classifiers rank most examples correctly (Ratio
, Dis
, AppDis
, MinMaj
) AUC reports values similar to accuracy. On the other hand, on datasets where the classifiers are unable to correctly rank examples (\(\texttt {Gradual}_\mathrm{RC}\), \(\texttt {Sudden}_\mathrm{RC}\), PAKDD
) AUC values are lower than accuracy. This phenomenon is known from static data mining, where it was reported that AUC and accuracy are consistent measures, with AUC being more discriminating on some datasets [30].NB | Bag | HT | HT\(_\mathrm{ADWIN}\)
| AUE | SERA | |
---|---|---|---|---|---|---|
Accuracy | 5.47 |
2.47
| 3.59 | 3.88 | 3.06 | 2.53 |
AUC | 4.88 |
1.59
| 3.76 | 4.29 | 3.82 | 2.65 |
G-mean | 5.35 | 3.21 | 3.12 | 2.53 | 4.38 |
2.41
|
Kappa | 5.24 | 3.06 | 3.76 | 3.24 | 3.71 |
2.00
|
Kappa M | 5.41 |
2.41
| 3.88 | 4.00 | 2.76 | 2.53 |
Recall | 4.94 | 3.38 | 3.35 |
2.41
| 4.32 | 2.59 |
6 Conclusions and outlook
-
Prequential AUC was compared with earlier, limited attempts to calculate AUC for data streams. As a result, prequential AUC was found to give less pessimistic estimations and, therefore, preferable to batch, incremental, and block AUC estimations for streams with concept drifts.
-
The proposed measure was found to be statistically consistent and comparably discriminant with AUC calculated on stationary data.
-
Prequential AUC was also positively evaluated in terms of processing speed. This stands in contrast to batch-calculated AUC, and is one of the key features making prequential AUC applicable to real-world streams.
-
The window size used for calculating AUC with forgetting was found to have negligible impact on the resulting AUC estimation.
-
We performed a comparative study of prequential accuracy, AUC, Kappa, Kappa M, G-mean, and recall. On a set of imbalanced data streams with various difficulty factors, the analyzed measures reacted differently to drifts and showcased characteristic properties concerning: focus on one or both classes, sensitivity to class ratio changes or minority–majority class swaps, ranking and crisp classification assessment.