1 Introduction
2 Software Metrics
2.1 Metric Sets Under Study
Metric name | Internal attribute | Description |
---|---|---|
(a) Metrics for methods and functions | | |
Cyclomatic Number (VG) | Control-flow structuredness | Calculated from the control flow graph G = (V, E) of a method M as VG(M) = |E| − |V| + p, where p is the number of entry and exit points. |
Nested Block Depth (NBD) | Control-flow structuredness | Maximum number of nested blocks in a method. |
Number of Function Calls (NFC) | Coupling | Number of functions called by a method. |
Number of Statements (NST) | Size | Number of statements of a method. |
(b) Metrics for classes | | |
Weighted Methods per Class (WMC) | Method complexity | Complexity of a class as the sum of the complexities of its methods. Here, VG is used as the complexity measure. |
Coupling Between Objects (CBO) | Coupling | Number of classes to which a class is coupled. |
Response For a Class (RFC) | Coupling | Size of the response set of a class, i.e., all methods that can be invoked directly or indirectly by calling a method of the class. |
Number of Overridden Methods (NORM) | Inheritance | Number of methods defined by a parent that are overridden by a class. |
Number of Methods (NOM) | Size | Number of methods of a class. |
Lines of Code (LOC) | Size | Lines of code, excluding empty and comment-only lines. |
Number of Static Methods (NSM) | Staticness | Number of static methods of a class. |
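As an illustration of the VG definition in the table, the following minimal sketch evaluates |E| − |V| + p on a small control flow graph. The graph, the node names, and the helper `cyclomatic_number` are hypothetical and only serve the example; p = 2 assumes a method with one entry and one exit point.

```python
def cyclomatic_number(nodes, edges, p=2):
    """Return VG(M) = |E| - |V| + p for a control flow graph.

    p is the number of entry and exit points; the default of 2
    assumes a single entry and a single exit (an assumption of
    this sketch, not a value taken from the paper).
    """
    return len(edges) - len(nodes) + p


# Hypothetical control flow graph of a method with one if/else branch.
nodes = {"entry", "cond", "then", "else", "exit"}
edges = {
    ("entry", "cond"),
    ("cond", "then"),
    ("cond", "else"),
    ("then", "exit"),
    ("else", "exit"),
}

vg = cyclomatic_number(nodes, edges)
print(vg)  # 5 edges - 5 nodes + 2 = 2
```

A straight-line method (two nodes, one edge) yields 1 under the same assumption, and each additional binary decision adds one.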
2.2 Thresholds for Software Metrics
Metric name | Language | Threshold | Source |
---|---|---|---|
(a) Metrics for methods and functions | | | |
VG | C | 24 | French (1999) |
VG | C++ | 10 | French (1999) |
VG | C# | 10 | French (1999) |
NBD | C | 5 | French (1999) |
NBD | C++ | 5 | French (1999) |
NBD | C# | 5 | French (1999) |
NFC | C | 5 | – |
NFC | C++ | 5 | – |
NFC | C# | 5 | – |
NST | C | 50 | – |
NST | C++ | 50 | – |
NST | C# | 50 | – |
(b) Metrics for classes | | | |
WMC | Java | 100 | Benlarbi et al. (2000) |
CBO | Java | 5 | Benlarbi et al. (2000) |
RFC | Java | 100 | Benlarbi et al. (2000) |
NORM | Java | 3 | Lorenz and Kidd (1994) |
LOC | Java | 500 | Adapted from Copeland (2005) |
NOM | Java | 20 | Adapted from Copeland (2005) |
NSM | Java | 4 | Lorenz and Kidd (1994) |
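To show how such a threshold table can be applied, here is a minimal sketch that checks the metric values of one class against the class-level thresholds listed above. The decision rule (a metric is violated when its value is strictly greater than the threshold), the function names, and the example values are assumptions made for illustration; this section does not state the exact rule.

```python
# Class-level thresholds copied from the table above (Java).
CLASS_THRESHOLDS = {
    "WMC": 100, "CBO": 5, "RFC": 100, "NORM": 3,
    "LOC": 500, "NOM": 20, "NSM": 4,
}


def violated_metrics(values, thresholds):
    """Return the metrics whose value exceeds the threshold.

    ASSUMPTION: 'exceeds' means strictly greater than; metrics
    missing from `values` are treated as not violated.
    """
    return sorted(
        name for name, limit in thresholds.items()
        if name in values and values[name] > limit
    )


def is_problematic(values, thresholds):
    """ASSUMPTION: an entity is flagged if any threshold is violated."""
    return bool(violated_metrics(values, thresholds))


# Hypothetical metric values of a single class.
example = {"WMC": 40, "CBO": 9, "RFC": 60, "NORM": 1,
           "LOC": 310, "NOM": 14, "NSM": 2}

print(violated_metrics(example, CLASS_THRESHOLDS))  # ['CBO']
print(is_problematic(example, CLASS_THRESHOLDS))    # True
```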
3 Foundations of Machine Learning
3.1 Concept Learning in the Presence of Noise
3.2 A Rectangle Learning Algorithm
4 Optimization of Metric Sets and Thresholds
4.1 Calculation of Thresholds Using Rectangle Learning
4.2 Threshold and Metric Set Optimization Algorithm
4.3 Optimization of the Efficiency of Metric Sets with Thresholds
4.4 Reduction of the Classification Complexity
4.5 Learning of Environment Specific Thresholds
5 Case Studies
- R1: Is the method to optimize the efficiency of metric sets effective?
- R2: Is the method to reduce classification complexity effective?
- R3: Are the methods applicable and effective across different levels of abstraction (e.g., methods, classes, packages) and programming languages?
- R4: Is threshold recalculation with the rectangle learning algorithm necessary, or is it sufficient to reuse known thresholds?
- R5: Is the exponential nature of the approach a threat to its scalability?
5.1 Methodology
5.2 Case Study 1: Optimization of Metric Sets for Methods
Project name | Version | Language | Number of methods (total) | Number of methods (problematic) |
---|---|---|---|---|
(a) Projects used for method-level analysis | | | | |
Apache Webserver | 2.2.10 | C | 6718 | 1995 |
kdebase | 12/05/2008 | C++ | 21404 | 4161 |
kdelibs | 12/05/2008 | C++ | 37444 | 4921 |
AspectDNG | 1.0.3 | C# | 2759 | 232 |
NetTopologieSuite | 1.7.1.RC1 | C# | 3059 | 317 |
SharpDevelop | 2.2.1.2648 | C# | 15700 | 1950 |
Project name | Version | Language | Number of classes (total) | Number of classes (problematic) |
---|---|---|---|---|
(b) Projects used for class-level analysis | | | | |
Eclipse Java Development Tools | 3.2 | Java | 4833 | 3349 |
Eclipse Platform Project | 3.2 | Java | 5399 | 3517 |
Metric | Language | Median | Arithmetic mean | Max value | Threshold |
---|---|---|---|---|---|
VG | C | 2 | 5.74 | 734 | 24 |
VG | C++ | 1 | 3.09 | 366 | 10 |
VG | C# | 1 | 2.18 | 134 | 10 |
NBD | C | 2 | 2.15 | 21 | 5 |
NBD | C++ | 2 | 1.76 | 13 | 5 |
NBD | C# | 3 | 2.71 | 11 | 5 |
NFC | C | 2 | 6.1 | 410 | 5 |
NFC | C++ | 2 | 7.81 | 997 | 5 |
NFC | C# | 1 | 2.44 | 230 | 5 |
NST | C | 2 | 15.61 | 1660 | 50 |
NST | C++ | 3 | 8.33 | 1132 | 50 |
NST | C# | 1 | 4.78 | 544 | 50 |
5.3 Case Study 2: Optimization of Metric Sets for Classes
Metric | Median | Arithmetic mean | Max value | Threshold |
---|---|---|---|---|
WMC | 12 | 27.48 | 2138 | 100 |
CBO | 8 | 13.40 | 212 | 5 |
RFC | 20 | 35.21 | 675 | 100 |
NORM | 0 | 0.96 | 166 | 3 |
LOC | 24 | 82.95 | 6619 | 500 |
NOM | 6 | 10.79 | 418 | 20 |
NSM | 0 | 0.81 | 128 | 4 |
 | M* | T* | Error ε | MCC | F-score |
---|---|---|---|---|---|
(a) Case study 1 | | | | | |
Language | | | | | |
C | {NFC} | {5} | 0.78% | 0.9793 | 0.9942 |
C++ | {NFC} | {5} | 0.06% | 0.9956 | 0.9986 |
C# | {NFC} | {5} | 0.59% | 0.9555 | 0.9949 |
(b) Case study 2 | | | | | |
 | {CBO, NORM, NSM} | {5, 3, 4} | 0.27% | 0.9939 | 0.9959 |
(c) Case study 3 | | | | | |
Language | | | | | |
C | {NST} | {50} | 0.84% | 0.9274 | 0.9955 |
C++ | {VG} | {10} | 0.87% | 0.9139 | 0.9954 |
C# | {VG} | {9} | 1.36% | 0.7598 | 0.9930 |
(d) Case study 4 | | | | | |
λ | | | | | |
1 | {RFC, NORM, NOM, NSM} | {98, 3, 20, 4} | 1.71% | 0.9449 | 0.9894 |
2 | {WMC, RFC} | {99, 110} | 2.21% | 0.8494 | 0.9880 |
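The three quality measures reported in the table (error, MCC, F-score) can all be derived from a confusion matrix. The sketch below uses the standard textbook definitions; whether the paper uses exactly these variants (for example, which class counts as positive) is not stated in this section, so treat that as an assumption. The function name and the counts are hypothetical.

```python
import math


def classification_scores(tp, fp, tn, fn):
    """Return (error, mcc, f_score) from confusion-matrix counts.

    Standard definitions are assumed:
      error   = (fp + fn) / total
      MCC     = (tp*tn - fp*fn) / sqrt((tp+fp)(tp+fn)(tn+fp)(tn+fn))
      F-score = harmonic mean of precision and recall
    Degenerate cases (zero denominators) yield 0.0 by convention.
    """
    total = tp + fp + tn + fn
    error = (fp + fn) / total if total else 0.0

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_score = (
        2 * precision * recall / (precision + recall)
        if (precision + recall) else 0.0
    )
    return error, mcc, f_score


# Hypothetical counts, not taken from the case studies.
error, mcc, f_score = classification_scores(tp=90, fp=5, tn=900, fn=5)
print(f"error={error:.2%}  MCC={mcc:.4f}  F-score={f_score:.4f}")
```

Reporting MCC next to the error is useful when the classes are imbalanced, because MCC takes all four cells of the confusion matrix into account.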
5.4 Case Study 3: Reduction of the Classification Complexity for Methods
5.5 Case Study 4: Reduction of the Classification Complexity for Classes
6 Discussion
6.1 Discussion of Research Questions
6.2 Comparison to Other Methods
Max. no. of metrics | No. of threshold combinations | Calc. time assuming 0.1 ms per hypothesis |
---|---|---|
1 | 1,415 | 141.5 ms |
2 | 629,076 | ~ 63 s |
3 | 149,235,857 | ~ 248 min |
4 | 18,565,376,659 | ~ 21.5 days |
5 | 1,201,532,717,441 | ~ 3.8 years |
6 | 37,125,301,717,441 | ~ 117 years |
7 | 438,665,979,997,440 | ~ 1391 years |
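The calculation times in the table follow directly from multiplying the number of threshold combinations by the assumed 0.1 ms per hypothesis. The short sketch below reproduces that arithmetic; the combination counts are copied from the table, while the function name and the 365-day year are assumptions of this example.

```python
# Number of threshold combinations per maximum number of metrics,
# copied from the table above.
COMBINATIONS = {
    1: 1_415,
    2: 629_076,
    3: 149_235_857,
    4: 18_565_376_659,
    5: 1_201_532_717_441,
    6: 37_125_301_717_441,
    7: 438_665_979_997_440,
}

SECONDS_PER_HYPOTHESIS = 0.1e-3  # 0.1 ms, as assumed in the table
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365              # ASSUMPTION: non-leap year


def calc_time_seconds(n_combinations):
    """Total evaluation time in seconds for n_combinations hypotheses."""
    return n_combinations * SECONDS_PER_HYPOTHESIS


for max_metrics, n in COMBINATIONS.items():
    seconds = calc_time_seconds(n)
    days = seconds / SECONDS_PER_DAY
    years = days / DAYS_PER_YEAR
    print(f"{max_metrics}: {seconds:.4g} s = {days:.4g} days = {years:.4g} years")
```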