1 Introduction
2 Principles and an illustrative example
Test ID | Input1 | Input2 | Input3 | Output |
---|---|---|---|---|
T1 | 0 | 1301 | 1 | INVALID |
T2 | 1108 | 1 | 1 | INVALID |
T3 | 1 | 0 | −665 | INVALID |
T4 | 1 | 1 | 0 | INVALID |
T5 | 582 | 582 | 582 | EQUILATERAL |
T6 | 1 | 1088 | 15 | INVALID |
T7 | 1 | 2 | 450 | INVALID |
T8 | 1663 | 1088 | 823 | SCALENE |
T9 | 1187 | 1146 | 1 | INVALID |
T10 | 1640 | 1640 | 1956 | ISOSCELES |
T11 | 784 | 784 | 1956 | INVALID |
T12 | 1 | 450 | 1 | INVALID |
T13 | 1146 | 1 | 1146 | ISOSCELES |
T14 | 1640 | 1956 | 1956 | ISOSCELES |
T15 | −1 | 1 | 1 | INVALID |
T16 | 1 | −1 | 1 | INVALID |
T17 | 1 | 2 | 3 | SCALENE |
T18 | 2 | 3 | 1 | SCALENE |
T19 | 3 | 1 | 2 | SCALENE |
T20 | 1 | 1 | 2 | INVALID |
T21 | 1 | 2 | 1 | INVALID |
T22 | 2 | 1 | 1 | INVALID |
T23 | 1 | 1 | 1 | EQUILATERAL |
T24 | 0 | 1 | 1 | INVALID |
T25 | 1 | 0 | 1 | INVALID |
T26 | 1 | 2 | −1 | INVALID |
T27 | 1 | 1 | −1 | INVALID |
T28 | 0 | 0 | 0 | INVALID |
T29 | 3 | 2 | 5 | SCALENE |
T30 | 5 | 9 | 2 | INVALID |
T31 | 7 | 4 | 3 | SCALENE |
T32 | 3 | 8 | 3 | INVALID |
T33 | 7 | 3 | 3 | INVALID |
T34 | 1108 | 1 | 1 | ISOSCELES |
T35 | 1108 | 2 | 2 | ISOSCELES |
Each test result can be treated as a single data item combining its inputs and output (<0,1301,1,INVALID>, <1108,1,1,INVALID>, ..., <1108,2,2,ISOSCELES>), to which a clustering algorithm is then applied. This groups the data into 4 clusters, illustrated in Fig. 2: one large cluster containing 29 items and 3 much smaller ones containing 1, 2 and 3 items, respectively. By concentrating first on the small clusters, the tester would find two failing outputs after examining just six results: T34 and T35, which appear in cluster 3 along with the passing case T13 (for information, cluster 1 contains T8 and cluster 2 contains T10 and T14). At this point the programmer may feel that they have enough evidence that the program is not working, stop examining test results, and start debugging. This evidence has been obtained after looking at just six of the 35 test outputs, saving the developer time and making the testing process more efficient.
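The inspection strategy in this example can be sketched in a few lines of Python. The cluster membership below is copied from the text (T8 alone; T10 and T14; T13, T34 and T35; everything else in the large cluster), and the failing tests are the two the text names. The function names `inspection_order` and `results_examined_until` are hypothetical helpers introduced only for illustration; the clustering step itself is not reproduced here.

```python
# Cluster membership as described in the text; the large cluster holds
# the remaining tests from the 35-row table.
all_tests = [f"T{i}" for i in range(1, 36)]
small_clusters = {
    1: ["T8"],
    2: ["T10", "T14"],
    3: ["T13", "T34", "T35"],
}
assigned = {t for members in small_clusters.values() for t in members}
clusters = dict(small_clusters)
clusters[4] = [t for t in all_tests if t not in assigned]

failing = {"T34", "T35"}  # the two failures named in the text


def inspection_order(clusters):
    """Return test IDs so that members of the smallest clusters come first."""
    ordered = sorted(clusters.values(), key=len)
    return [test_id for members in ordered for test_id in members]


def results_examined_until(order, targets):
    """Count how many results are examined before all targets have been seen."""
    remaining = set(targets)
    for count, test_id in enumerate(order, start=1):
        remaining.discard(test_id)
        if not remaining:
            return count
    return None


order = inspection_order(clusters)
examined = results_examined_until(order, failing)
print(examined, len(all_tests))  # prints "6 35"
```

Sorting clusters by size is the only policy applied here; how ties between equally small clusters are broken is left to Python's stable sort and is not something the text specifies.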
3 Background and related work
3.1 Test oracles based on invariant detection
3.2 Test oracles based on anomaly detection techniques
4 Clustering techniques
4.1 Agglomerative hierarchical clustering
4.2 DBSCAN
4.3 Expectation-maximization (EM) clustering algorithm
5 Experimental evaluation
5.1 Subject programs
5.2 Experimental set-up
System | Input | Output |
---|---|---|
Nanoxml | Flower colour=“Red” Smell=“Sweet” Name=“Rose” Season=“Spring” | Xml element name is: Flower |
Encoding | FCRSSNRSS | F |
Siena | Filter senp{x=0}filter{x=20 y=30 z=10} Event senp{x=0}event{x=20} senp{x=0}event{y=30 z=10} | Subscribing for filter{x=20 y=30 z=10} publishing for event{x=20} publishing for event{y=30 z=10} |
Encoding | F111E1E11 | SF111PE1PE11 |
Sed | sed -e 's/dog/cat/' ../inputs/default.in | The modified text file (change and add operations) |
Encoding | sed-es/dog/cat/<1> | 114a36c34c29c26c3\|4c0a |
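As an illustration of the Nanoxml row, the encoded input FCRSSNRSS is what one obtains by concatenating the first letter of the element name and of each attribute name and value. The sketch below implements that reading. It is an assumption inferred from this single example, not a rule stated in the text, and `encode_element` is a hypothetical helper name. It does not cover the Siena or Sed encodings.

```python
def encode_element(name, attributes):
    """Encode an XML element as a string of upper-cased first letters.

    Assumed rule (inferred from one table row): take the first letter of the
    element name, then of every attribute name and attribute value in order.
    """
    tokens = [name]
    for attr_name, attr_value in attributes:
        tokens.extend([attr_name, attr_value])
    return "".join(token[0].upper() for token in tokens)


# The Nanoxml example from the table.
flower_attrs = [
    ("colour", "Red"),
    ("Smell", "Sweet"),
    ("Name", "Rose"),
    ("Season", "Spring"),
]
encoded_input = encode_element("Flower", flower_attrs)
encoded_output = encode_element("Flower", [])  # output names only the element
print(encoded_input, encoded_output)  # prints "FCRSSNRSS F"
```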
Sequence traces | Hash key values |
---|---|
net.n3.nanoxml.XMLElement.getFullName():::EXIT283 | 0L |
net.n3.nanoxml.XMLUtil.skipWhitespace(net.n3.nanoxml.IXMLReader, char, java.lang.StringBuffer, boolean[]):::ENTER | A |
net.n3.nanoxml.StdXMLReader.getEncoding(java.lang.String):::ENTER | 37 |
\(\vdots\) | \(\vdots\) |
5.3 Evaluation of clustering techniques
- Small clusters are those of average size or less, where the average size is (number of data points)/(number of clusters). In the above example, the average cluster size is 21/6 = 3.5, so the small clusters are all those containing \(\le\) 3 data points (i.e. clusters 1, 2 and 3).
- Precision: Five of the outputs in the 3 small clusters are failures (TPs) and one is a pass (FP), so PR = 5/(5 + 1) = 0.83.
- Recall: Five of the outputs in the 3 small clusters are failures (TPs), but 5 failures were allocated to the “large” clusters (FNs), so RE = 5/(5 + 5) = 0.5.
- The F-measure is then \(2 \times (0.83 \times 0.5)/(0.83 + 0.5) = 0.62\).
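The arithmetic in this worked example can be checked with a short Python sketch. The counts (21 data points, 6 clusters, 5 failures and 1 pass in the small clusters, 5 failures left in large clusters) come from the example; the function names are hypothetical. Failures that end up in large clusters are counted as false negatives, since they are failures the small-cluster heuristic misses.

```python
def small_cluster_threshold(n_points, n_clusters):
    """Average cluster size; clusters at or below this size count as small."""
    return n_points / n_clusters


def precision_recall_f(tp, fp, fn):
    """Precision, recall and F-measure from true/false positive and false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


threshold = small_cluster_threshold(n_points=21, n_clusters=6)
pr, rec, f = precision_recall_f(tp=5, fp=1, fn=5)
print(f"{threshold:.1f} {pr:.2f} {rec:.2f} {f:.3f}")  # prints "3.5 0.83 0.50 0.625"
```

Without intermediate rounding the F-measure is 0.625; the value 0.62 in the example results from using the rounded precision 0.83 in the formula.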
6 Experiment 1 (clustering test input/output pairs): results and discussion
6.1 Distribution of failures
6.2 Failures found versus cluster counts and cluster sizes
Cluster details | NanoXML version | ||||||||
---|---|---|---|---|---|---|---|---|---|
(% tests) | V1 | V2 | V3 | V5 | |||||
Count (%) | Size (%) | % | F | % | F | % | F | % | F |
Single linkage | |||||||||
1 | 50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
5 | 10 | 14.81 | 0.2398 | 18.30 | 0.2763 | 7.14 | 0.1187 | 26.15 | 0.1367 |
10 | 3.5 | 53.08 | 0.6322 | 63.38 | 0.7142 | 50.0 | 0.5931 | 40.0 | 0.4521 |
15 | 3 | 56.79 | 0.6215 | 63.38 | 0.6521 | 65.71 | 0.7105 | 61.53 | 0.6011 |
20 | 2.5 | 56.79 | 0.5679 | 63.38 | 0.6080 | 72.85 | 0.7554 | 66.15 | 0.5771 |
25 | 2 | 56.79 | 0.5379 | 63.38 | 0.5555 | 74.28 | 0.6886 | 66.15 | 0.5407 |
Average linkage | |||||||||
1 | 50 | 7.4 | 0.070 | 28.1 | 0.0394 | 100 | 1 | 10.76 | 0.0907 |
5 | 10 | 56.79 | 0.6012 | 63.38 | 0.6337 | 34.28 | 0.4658 | 26.75 | 0.2984 |
10 | 6.25 | 56.79 | 0.5840 | 63.38 | 0.6337 | 45.71 | 0.5764 | 61.53 | 0.5969 |
15 | 3.25 | 56.79 | 0.5643 | 63.38 | 0.6293 | 82.85 | 0.811 | 52.30 | 0.5311 |
20 | 2.5 | 51.85 | 0.5121 | 54.92 | 0.5164 | 75.71 | 0.7065 | 52.30 | 0.4329 |
25 | 2.25 | 65.43 | 0.5578 | 61.97 | 0.5534 | 75.71 | 0.6623 | 61.53 | 0.4847 |
Complete linkage | |||||||||
1 | 50 | 12.30 | 0.1209 | 28.1 | 0.0436 | 100 | 1 | 10.76 | 0.0887 |
5 | 10 | 12.30 | 0.1673 | 29.57 | 0.2673 | 20.0 | 0.3333 | 26.15 | 0.2832 |
10 | 6.25 | 35.80 | 0.3693 | 17.71 | 0.493 | 67.14 | 0.8033 | 44.61 | 0.3999 |
15 | 3.12 | 59.25 | 0.5643 | 46.47 | 0.4176 | 84.28 | 0.8193 | 55.38 | 0.4443 |
20 | 2.5 | 51.85 | 0.5123 | 54.92 | 0.5130 | 75.71 | 0.6541 | 52.30 | 0.4329 |
25 | 2.25 | 54.38 | 0.5145 | 64.78 | 0.5677 | 75.71 | 0.6623 | 53.84 | 0.4373 |
Cluster details | Siena version | Cluster details | Sed version | ||||
---|---|---|---|---|---|---|---|
(% tests) | V2 | (% tests) | V5 | ||||
Count (%) | Size (%) | % | F | Count (%) | Size (%) | % | F |
Single linkage | |||||||
1 | 19.8 | 0 | 0 | 1 | 19.8 | 14.9 | 0.266 |
5 | 4 | 3.57 | 0.0357 | 5 | 6.4 | 23.6 | 0.2954 |
10 | 2 | 40.47 | 0.3415 | 10 | 2.685 | 23.6 | 0.2903 |
15 | 1.21 | 48.80 | 0.4431 | 15 | 1.69 | 23.6 | 0.2427 |
20 | 0.79 | 72.61 | 0.6522 | 20 | 1.22 | 27.7 | 0.2375 |
25 | 0.6 | 60.71 | 0.5397 | 25 | 1 | 36.1 | 0.2723 |
Average linkage | |||||||
1 | 19.8 | 0 | 0 | 1 | 19.8 | 0 | 0 |
5 | 4 | 16.66 | 0.1325 | 5 | 6.46 | 9.7 | 0.1605 |
10 | 2 | 41.66 | 0.3909 | 10 | 2.628 | 12.5 | 0.1680 |
15 | 1.21 | 41.66 | 0.3703 | 15 | 1.563 | 18.0 | 0.2041 |
20 | 0.79 | 67.85 | 0.6194 | 20 | 1.08 | 25.0 | 0.2448 |
25 | 0.6 | 75.0 | 0.6236 | 25 | 0.808 | 29.1 | 0.2542 |
Complete linkage | |||||||
1 | 19.80 | 0 | 0 | 1 | 20 | 9.7 | 0.1662 |
5 | 4 | 33.33 | 0.1811 | 5 | 6.466 | 22.2 | 0.2641 |
10 | 2 | 47.61 | 0.3477 | 10 | 2.71 | 33.3 | 0.3836 |
15 | 1.21 | 66.66 | 0.4932 | 15 | 1.69 | 29.1 | 0.2995 |
20 | 0.79 | 72.61 | 0.6522 | 20 | 1.24 | 36.1 | 0.3074 |
25 | 0.6 | 67.1 | 0.5367 | 25 | 0.968 | 41.6 | 0.3058 |
Systems | Cluster details | EM | ||
---|---|---|---|---|
Count (%) | Size (%) | % | F | |
Nanoxml V1 | 1.94 | 25 | 49.38 | 0.6557 |
Nanoxml V2 | 2.42 | 20.2 | 50.70 | 0.2950 |
Nanoxml V3 | 2.42 | 20 | 62.85 | 0.4398 |
Nanoxml V5 | 1.45 | 33 | 64.28 | 0.7757 |
Siena V2 | 2.02 | 16.66 | 35.71 | 0.2238 |
Sed V5 | 2.71 | 9.9 | 5.55 | 0.0655 |
Systems | Cluster details | DBSCAN | ||
---|---|---|---|---|
Count (%) | Size (%) | % | F | |
Nanoxml V1 | 19.9 | 2.68 | 25.92 | 0.2914 |
Nanoxml V2 | 19.9 | 2.68 | 22.53 | 0.2422 |
Nanoxml V3 | 19.9 | 2.70 | 25.11 | 0.2664 |
Nanoxml V5 | 19.9 | 2.60 | 16.92 | 0.1758 |
Siena V2 | 4.45 | 4.54 | 3.57 | 0.0357 |
Sed V5 | 48.23 | 1 | 83.3 | 0.3091 |
6.3 Failure density of smallest clusters
Cluster size | Failures found | Cumulative |
---|---|---|
Version 1 (25 %) | ||
1, 13, 0.48 % | F1:3, F2:1, F6:1 | 3/7 |
2, 13, 0.97 % | F1:4, F2:2, F6:2 | 3/7 |
3, 4, 1.45 % | F6:3 | 3/7 |
4, 8, 1.94 % | F2:16, F5:8, F7:8 | 5/7 |
Version 2 (15 %) | ||
1, 7, 0.48 % | F1:3, F2:1, F6:1 | 3/7 |
2, 3, 0.97 % | F6:2 | 3/7 |
3, 5, 1.45 % | F6:3 | 3/7 |
4, 6, 1.94 % | F2:8, F5:8, F7:8 | 5/7 |
5, 2, 2.42 % | F2:5 | 5/7 |
6, 1, 2.91 % | F2:6 | 5/7 |
Version 3 (15 %) | ||
1, 10, 0.48 % | F1:4, F2:1, F3:1, F4:2, F6:1 | 5/7 |
2, 4, 0.97 % | F1:1, F4:2, F6:2 | 5/7 |
3, 5, 1.45 % | F4:6, F6:3 | 5/7 |
4, 6, 1.94 % | F2:8, F5:8, F7:8 | 7/7 |
5, 2, 2.42 % | F2:5 | 7/7 |
6, 1, 2.91 % | F2:6 | 7/7 |
Version 5 (25 %) | ||
1, 13, 0.48 % | F1:3, F2:1 | 2/8 |
2, 14, 0.97 % | F1:2, F2:2 | 2/8 |
3, 8, 1.45 % | F2:3 | 2/8 |
4, 7, 1.94 % | F2:28 | 2/8 |
Complete linkage (25 %) | ||
---|---|---|
Cluster size | Failures found | Cumulative |
1, 62, 0.19 % | (F1:7, F2:4, F3:7) | 3/4 |
2, 16, 0.38 % | (F1:2, F3:4) | 3/4 |
3, 5, 0.57 % | (F2:3, F3:3) | 3/4 |
4, 2, 0.76 % | (–) | 3/4 |
5, 1, 0.96 % | (–) | 3/4 |
7 Experiment 2 (clustering test input/output pairs and execution traces): results and discussion
7.1 Distribution of failures over clusters
7.2 Failure composition of small clusters
Cluster details | NanoXML version | ||||
---|---|---|---|---|---|
(% tests) | V1 | V2 | V3 | V5 | |
Count (%) | Size (%) | (%, F) | (%, F) | (%, F) | (%, F) |
Single linkage | |||||
1 | 50 | (64.28, 0.72) | (0, 0) | (2.89, 0.03) | (0, 0) |
5 | 11.95 | (5.71, 0.09) | (5, 0.08) | (34.78, 0.5) | (40, 0.53) |
10 | 6.37 | (47.14, 0.59) | (40, 0.43) | (66.66, 0.79) | (41.53, 0.53) |
15 | 5.03 | (64.28, 0.73) | (78.33, 0.84) | (82.60, 0.84) | (60, 0.63) |
20 | 3.31 | (57.14, 0.65) | (68.33, 0.72) | (73.91, 0.75) | (60, 0.6) |
25 | 2.76 | (58.57, 0.62) | (68.33, 0.70) | (69.56, 0.70) | (44.61, 0.44) |
Average linkage | |||||
1 | 50 | (0, 0) | (100, 0.94) | (84.05, 0.91) | (100, 1) |
5 | 12.04 | (7.14, 0.11) | (6.66, 0.11) | (14.49, 0.24) | (9.23, 0.14) |
10 | 6.47 | (44.28, 0.56) | (48.33, 0.64) | (34.78, 0.47) | (26.15, 0.35) |
15 | 4.30 | (64.28, 0.73) | (78.33, 0.81) | (68.11, 0.74) | (60, 0.61) |
20 | 3.27 | (64.28, 0.58) | (68.33, 0.61) | (75.36, 0.71) | (60, 0.56) |
25 | 2.74 | (58.57, 0.55) | (70, 0.57) | (69.56, 0.64) | (46.15, 0.37) |
Complete linkage | |||||
1 | 50 | (0, 0) | (100, 1) | (100, 0.98) | (100, 1) |
5 | 11.94 | (0, 0) | (3.33, 0.004) | (26.08, 0.40) | (43.07, 0.60) |
10 | 6.39 | (2.85, 0.02) | (3.33, 0.03) | (49.27, 0.65) | (70.76, 0.83) |
15 | 4.27 | (31.42, 0.36) | (10, 0.08) | (85.50, 0.88) | (70.76, 0.74) |
20 | 3.37 | (52.85, 0.55) | (45, 0.45) | (76.81, 0.72) | (61.53, 0.60) |
25 | 2.73 | (58.57, 0.57) | (70, 0.69) | (69.56, 0.67) | (46.15, 0.42) |
Cluster details | Siena version | |||
---|---|---|---|---|
(% tests) | V2 | V4 | V6 | |
Count (%) | Size (%) | (%, F) | (%, F) | (%, F) |
Single linkage | ||||
1 | 19.8 | (0, 0) | (0, 0) | (0, 0) |
5 | 4 | (17.85, 0.21) | (17.85, 0.15) | (17.85, 0.15) |
10 | 1.99 | (34.52, 0.31) | (34.52, 0.31) | (34.52, 0.31) |
15 | 1.21 | (61.90, 0.47) | (61.90, 0.47) | (61.90, 0.47) |
20 | 0.79 | (75, 0.66) | (75, 0.66) | (75, 0.66) |
25 | 0.6 | (60.71, 0.48) | (75, 0.62) | (75, 0.62) |
Average linkage | ||||
1 | 20.13 | (100, 0.96) | (100, 1) | (100, 1) |
5 | 4 | (23.80, 0.19) | (23.80, 0.20) | (23.80, 0.20) |
10 | 1.98 | (75, 0.65) | (75, 0.65) | (75, 0.65) |
15 | 1.20 | (75, 0.57) | (75, 0.57) | (75, 0.57) |
20 | 0.81 | (71.42, 0.60) | (75, 0.62) | (75, 0.62) |
25 | 0.6 | (75, 0.64) | (75, 0.62) | (75, 0.62) |
Complete linkage | ||||
1 | 20 | (100, 1) | (100, 1) | (100, 1) |
5 | 4.04 | (100, 0.89) | (100, 0.89) | (100, 0.89) |
10 | 1.99 | (60.71, 0.56) | (60.71, 0.56) | (60.71, 0.56) |
15 | 1.27 | (71.42, 0.52) | (71.42, 0.53) | (71.42, 0.53) |
20 | 0.79 | (75, 0.66) | (75, 0.66) | (75, 0.66) |
25 | 0.6 | (75, 0.62) | (75, 0.62) | (75, 0.62) |
Cluster details | Sed version | ||
---|---|---|---|
(% tests) | V5 | ||
Count (%) | Size (%) | % | F |
Single linkage | |||
1 | 19.8 | 24.2 | 0.3164 |
5 | 6.4 | 27.2 | 0.3265 |
10 | 2.6 | 16.6 | 0.2128 |
15 | 1.65 | 24.2 | 0.2402 |
20 | 1.2 | 34.8 | 0.2872 |
25 | 1 | 34.8 | 0.1839 |
Average linkage | |||
1 | 19.8 | 24.2 | 0.3164 |
5 | 7.6 | 13.6 | 0.2040 |
10 | 2.68 | 16.66 | 0.2053 |
15 | 1.67 | 25.7 | 0.2513 |
20 | 1.2 | 36.3 | 0.2960 |
25 | 1 | 39.39 | 0.2651 |
Complete linkage | |||
1 | 20 | 12.1 | 0.1948 |
5 | 6.6 | 22.7 | 0.2880 |
10 | 2.71 | 33.3 | 0.3757 |
15 | 1.69 | 28.0 | 0.2775 |
20 | 1.22 | 31.8 | 0.2589 |
25 | 1 | 37.8 | 0.2933 |
Systems | Cluster details | DBSCAN | |
---|---|---|---|
Count (%) | Size (%) | (%, F) | |
Nanoxml V1 | 50 | 1.425 | (32.85, 0.31) |
Nanoxml V2 | 15 | 5.1 | (78.33, 0.81) |
Nanoxml V3 | 15 | 4.08 | (79.71, 0.81) |
Nanoxml V5 | 15 | 3.78 | (69.23, 0.67) |
Siena V2 | 6 | 3.33 | (53.57, 0.48) |
Siena V4 | 6 | 3.33 | (53.57, 0.48) |
Siena V6 | 6 | 3.33 | (53.57, 0.48) |
Sed V5 | 8 | 3.5 | (9.0, 0.106) |
Systems | Cluster details | EM | |
---|---|---|---|
Count (%) | Size (%) | (%, F) | |
Nanoxml V1 | 2.41 | 20 | (40, 0.42) |
Nanoxml V2 | 1.93 | 25.25 | 0 |
Nanoxml V3 | 2.41 | 19.8 | (5.79, 0.09) |
Nanoxml V5 | 1.44 | 33.33 | 0 |
Siena V2 | 1.41 | 14.28 | 0 |
Siena V4 | 1.41 | 14.28 | 0 |
Siena V6 | 1.41 | 14.28 | 0 |
Sed V5 | 1 | 33.33 | (9.00, 0.085) |
7.3 Fault density of smallest clusters
Cluster size | Failures found | Cumulative |
---|---|---|
Version 1 (15 %) | ||
1, 10, 0.67 % | (F1:2, F2:2, F6:1) | 3/7 |
2, 3, 1.34 % | (F1:2, F6:2) | 3/7 |
3, 2, 2.01 % | (F2:3, F6:3) | 3/7 |
4, 5, 2.68 % | (F2:4, F5:8, F7:8) | 5/7 |
5, 1, 3.35 % | (F2:4) | 5/7 |
6, 1, 4.02 % | (F2:6) | 5/7 |
Version 2 (15 %) | ||
1, 8, 0.67 % | (F1:3, F2:2, F6:2) | 3/7 |
2, 4, 1.34 % | (F1:2, F6:4) | 3/7 |
3, 3, 2.01 % | (F2:3) | 3/7 |
4, 5, 2.68 % | (F2:4, F5:8, F7:8) | 5/7 |
5, 1, 3.35 % | (F2:5) | 5/7 |
6, 1, 4.02 % | (F2:6) | 5/7 |
Version 3 (15 %) | ||
1, 8, 0.59 % | (F1:3, F2:1, F4:2, F6:1) | 4/7 |
2, 4, 1.18 % | (F1:2, F4:2, F6:2) | 4/7 |
3, 5, 1.77 % | (F4:3, F6:6) | 4/7 |
4, 6, 2.36 % | (F2:8, F5:8, F7:8) | 6/7 |
5, 1, 2.95 % | (F5:5) | 6/7 |
6, 1, 3.55 % | (F2:6) | 6/7 |
Version 5 (15 %) | ||
1, 7, 0.62 % | (F1:1, F2:1) | 2/8 |
2, 4, 1.25 % | (F1:2) | 2/8 |
3, 3, 1.88 % | (–) | 2/8 |
4, 6, 2.51 % | (F2:24) | 2/8 |
5, 1, 3.14 % | (F1:5) | 2/8 |
6, 1, 3.77 % | (F2:6) | 2/8 |
Version 2 (5 %) | ||
---|---|---|
Cluster size | Failures found | Cumulative |
1, 5, 0.20 % | (–) | −/1 |
2, 1, 0.40 % | (F:2) | 1/1 |
3, 8, 0.60 % | (F:24) | 1/1 |
4, 1, 0.80 % | (F:4) | 1/1 |
6, 3, 1.21 % | (F:12) | 1/1 |
8, 1, 1.61 % | (–) | 1/1 |
9, 1, 1.82 % | (F:9) | 1/1 |
11, 3, 2.22 % | (F:33) | 1/1 |
Complete linkage (25 %) | ||
---|---|---|
Cluster size | Failures found | Cumulative |
1, 59, 0.19 % | (F1:8, F2:1, F3:7) | 3/4 |
2, 17, 0.38 % | (F2:2, F3:4) | 3/4 |
3, 6, 0.57 % | (F3:3) | 3/4 |
4, 2, 0.76 % | (–) | 3/4 |
5, 1, 0.96 % | (–) | 3/4 |
7.4 Impact of failure density
NanoXML | ||||||
---|---|---|---|---|---|---|
Cluster count (%) | 10 % | 5 % | 1 % | |||
Failures found (%) | F-measure | Failures found (%) | F-measure | Failures found (%) | F-measure | |
1 | 100 | 1 | 100 | 1 | 100 | 1 |
5 | 100 | 1 | 100 | 1 | 100 | 0.43 |
10 | 100 | 0.88 | 100 | 0.60 | 100 | 0.11 |
15 | 100 | 0.72 | 100 | 0.43 | 100 | 0.09 |
20 | 100 | 0.59 | 100 | 0.33 | 100 | 0.07 |
25 | 100 | 0.56 | 100 | 0.34 | 100 | 0.09 |
Siena | ||||||
---|---|---|---|---|---|---|
Cluster count (%) | 10 % | 5 % | 1 % | |||
Failures found (%) | F-measure | Failures found (%) | F-measure | Failures found (%) | F-measure | |
1 | 100 | 1 | 100 | 0.94 | 85 | 0.91 |
5 | 100 | 0.78 | 0 | 0 | 100 | 0.14 |
10 | 100 | 0.69 | 100 | 0.43 | 100 | 0.13 |
15 | 52 | 0.43 | 100 | 0.4 | 100 | 0.14 |
20 | 100 | 0.64 | 100 | 0.42 | 100 | 0.11 |
25 | 100 | 0.58 | 100 | 0.31 | 100 | 0.07 |