1 Introduction
2 Related works
3 Artificial immune system
3.1 Artificial immune recognition system (AIRS)
- Divide the training dataset into np partitions.
- Allocate the training partitions to processes.
- Combine the np memory pools.
- Use a merging approach to create the combined memory pool.
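The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the AIRSParallel implementation: `train_memory_pool` is a hypothetical stand-in for AIRS training on one partition, the merge is plain concatenation (only one of several possible merging approaches), and a thread pool stands in for the separate processes.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(dataset, n_parts):
    """Split dataset into n_parts contiguous partitions of nearly equal size."""
    size, extra = divmod(len(dataset), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        parts.append(dataset[start:end])
        start = end
    return parts


def train_memory_pool(part):
    """Hypothetical stand-in for AIRS training on one partition.

    It only keeps the distinct records (in order) as 'memory cells';
    records must be hashable, e.g. tuples.
    """
    return list(dict.fromkeys(part))


def merge_pools(pools):
    """Assumed merging approach: concatenate the np memory pools."""
    merged = []
    for pool in pools:
        merged.extend(pool)
    return merged


def parallel_airs(dataset, n_parts):
    """Partition, train each partition independently, then merge the pools."""
    parts = partition(dataset, n_parts)
    # Threads are used here for simplicity; the text describes processes.
    with ThreadPoolExecutor(max_workers=n_parts) as executor:
        pools = list(executor.map(train_memory_pool, parts))
    return merge_pools(pools)
```

Because each partition is trained independently, the merged pool can contain near-duplicate memory cells from different partitions; a stricter merging approach would prune them.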
3.2 CLONALG
- Plasma cells, which produce antibodies as the effectors of the immune response.
- Long-lived memory cells, kept in case a similar antigen appears.
- A local search, provided via affinity maturation of cloned antibodies. More clones are produced for better-matched antibodies.
- A global search, provided by inserting randomly generated antibodies into the population to further increase diversity and to offer a means of escaping local optima.
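The local and global search components listed above can be illustrated with one generation of a simplified clonal selection loop. This is a sketch under assumptions, not the CLONALG configuration used in the experiments: the affinity function, the clone-count rule, the mutation widths, and all parameter names (`n_select`, `clone_factor`, `n_random`) are illustrative choices.

```python
import random


def affinity(antibody, antigen):
    """Illustrative affinity: negative squared Euclidean distance (higher is better)."""
    return -sum((a - g) ** 2 for a, g in zip(antibody, antigen))


def clonalg_generation(population, antigen, n_select=3, clone_factor=1.0,
                       n_random=2, rng=random):
    """One generation of a simplified clonal selection loop."""
    size, dim = len(population), len(antigen)
    ranked = sorted(population, key=lambda ab: affinity(ab, antigen), reverse=True)
    candidates = list(ranked)  # parents stay in the candidate pool

    # Local search: better-ranked antibodies get more clones and smaller mutations.
    for rank, parent in enumerate(ranked[:n_select], start=1):
        n_clones = max(1, round(clone_factor * size / rank))
        sigma = 0.05 * rank  # assumed mutation width, grows with worse rank
        for _ in range(n_clones):
            candidates.append([x + rng.gauss(0.0, sigma) for x in parent])

    candidates.sort(key=lambda ab: affinity(ab, antigen), reverse=True)
    survivors = candidates[:size - n_random]

    # Global search: insert randomly generated antibodies to keep diversity.
    newcomers = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                 for _ in range(n_random)]
    return survivors + newcomers
```

Keeping the parents among the candidates means the best affinity in the population cannot get worse from one generation to the next, while the random newcomers replace the weakest members.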
3.3 Immunos81
- T-Cell: responsible both for partitioning learned information and for deciding how new information is exposed to the system. Each specific antigen has a T-Cell, and each T-Cell has one or more groups of B-Cells.
- B-Cell: an instance of a group of antigens.
- Antigen: a defect, represented by a data vector of attributes in which the nature of each attribute, such as its name and data type, is known.
- Antigen-Type: domain dependent. Antigens are identified by their names and series of attributes, which is a duty of the T-Cell.
- Antigen group/clone: antigens grouped by type or by a specific classification label form a clone of B-Cells which, as mentioned above, is controlled by a T-Cell.
- Paratope: the recognition part of the antibody, which binds to specific parts of the antigen, the epitopes (attributes of the antigen).
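The relationships among these elements can be pictured as a small set of data structures. The sketch below only illustrates the organisation described above, assuming one T-Cell per antigen type and one B-Cell clone per classification label; the class and field names are hypothetical, and the Immunos81 classification rule itself is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class Antigen:
    attributes: list  # data vector; each entry plays the role of an epitope
    label: str        # classification label, e.g. "defective"


@dataclass
class BCellClone:
    label: str                                    # label shared by the group
    antigens: list = field(default_factory=list)  # antigens held by this clone


@dataclass
class TCell:
    antigen_type: str      # name identifying the antigen type
    attribute_names: list  # known attribute names for this type
    clones: dict = field(default_factory=dict)  # label -> BCellClone

    def expose(self, antigen):
        """Route a new antigen to the clone that matches its label."""
        if len(antigen.attributes) != len(self.attribute_names):
            raise ValueError("antigen does not match this antigen type")
        clone = self.clones.setdefault(antigen.label, BCellClone(antigen.label))
        clone.antigens.append(antigen)
```

In this picture the T-Cell owns the partitioning decision: it checks that an antigen fits its type and then places it in the B-Cell clone for the matching label.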
4 Feature selection
4.1 Principal component analysis
4.2 Correlation-based feature selection
5 Experiment description
5.1 Dataset selection
5.2 Variable selection
Attribute name | Information |
---|---|
loc | McCabe’s line count of code |
v(g) | McCabe “cyclomatic complexity” |
ev(g) | McCabe “essential complexity” |
iv(g) | McCabe “design complexity” |
n | Halstead total operators + operands |
v | Halstead “volume” |
l | Halstead “program length” |
d | Halstead “difficulty” |
i | Halstead “intelligence” |
e | Halstead “effort” |
b | Halstead “delivered bugs” |
t | Halstead’s time estimator |
lOCode | Halstead’s line count |
lOComment | Halstead’s count of lines of comments |
lOBlank | Halstead’s count of blank lines |
lOCodeAndlOComment | Lines of code and comments |
uniq_op | Unique operators |
uniq_opnd | Unique operands |
total_op | Total operators |
total_opnd | Total operands |
branchCount | Branch count of the flow graph |
5.3 Simulator selection
5.4 Performance measurements criteria
Actual / predicted | No (predicted) | Yes (predicted) |
---|---|---|
No (actual) | TN | FP |
Yes (actual) | FN | TP |
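The measures reported in the result tables can be derived from the four cells of this confusion matrix. The sketch below uses the standard definitions, with accuracy as a percentage, PD (probability of detection) as TP / (TP + FN), and PF (probability of false alarm) as FP / (FP + TN); the function name and the zero-denominator handling are illustrative assumptions.

```python
def confusion_metrics(tn, fp, fn, tp):
    """Return accuracy (%), PD and PF computed from confusion-matrix counts."""
    total = tn + fp + fn + tp
    actual_yes = tp + fn  # defective modules
    actual_no = tn + fp   # non-defective modules
    return {
        "accuracy": 100.0 * (tp + tn) / total if total else 0.0,
        "pd": tp / actual_yes if actual_yes else 0.0,  # detection rate (recall)
        "pf": fp / actual_no if actual_no else 0.0,    # false alarm rate
    }


# A classifier that always predicts "No" on 90 clean and 10 defective modules:
always_no = confusion_metrics(tn=90, fp=0, fn=10, tp=0)
```

In the always-"No" example the accuracy is 90% while PD and PF are both 0, which is why accuracy alone can be misleading on imbalanced defect data and why PD, PF, and AUC are reported alongside it.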
6 Analysis of the experiment
6.1 Experiment 1
Algorithms | JM1 | KC1 | CM1 | PC3 |
---|---|---|---|---|
Decision tree, J48 | ||||
Accuracy | 79.50 | 84.54 | 87.95 | 88.36 |
AUC | 0.653 | 0.689 | 0.558 | 0.599 |
Random forest | ||||
Accuracy | 81.14 | 85.44 | 87.95 | 89.89 |
AUC | 0.717 | 0.789 | 0.723 | 0.795 |
Naïve Bayes | ||||
Accuracy | 80.42 | 82.36 | 85.34 | 48.69 |
AUC | 0.679 | 0.790 | 0.658 | 0.756 |
NN, Back Propagation | ||||
Accuracy | 80.65 | 84.54 | 89.96 | 89.76 |
AUC | 0.500 | 0.500 | 0.499 | 0.500 |
Decision Table | ||||
Accuracy | 80.91 | 84.87 | 89.16 | 89.51 |
AUC | 0.703 | 0.785 | 0.626 | 0.657 |
AIRS1 | ||||
Accuracy | 71.67 | 74.63 | 80.92 | 85.16 |
AUC | 0.551 | 0.563 | 0.549 | 0.577 |
AIRS2 | ||||
Accuracy | 68.53 | 68.90 | 84.94 | 88.10 |
AUC | 0.542 | 0.529 | 0.516 | 0.549 |
AIRSParallel | ||||
Accuracy | 71.93 | 82.02 | 84.74 | 86.95 |
AUC | 0.558 | 0.605 | 0.543 | 0.540 |
Immunos1 | ||||
Accuracy | 56.37 | 50.55 | 32.93 | 17.98 |
AUC | 0.610 | 0.681 | 0.610 | 0.529 |
Immunos2 | ||||
Accuracy | 80.65 | 75.25 | 89.16 | 89.51 |
AUC | 0.500 | 0.511 | 0.494 | 0.499 |
Immunos99 | ||||
Accuracy | 74.10 | 53.39 | 36.75 | 39.21 |
AUC | 0.515 | 0.691 | 0.613 | 0.584 |
CLONALG | ||||
Accuracy | 73.01 | 82.50 | 87.95 | 87.20 |
AUC | 0.509 | 0.532 | 0.506 | 0.491 |
CSCA | ||||
Accuracy | 80.17 | 83.97 | 88.15 | 89.00 |
AUC | 0.549 | 0.593 | 0.489 | 0.515 |
Algorithms | JM1 | KC1 | CM1 | PC3 |
---|---|---|---|---|
Decision tree, J48 | ||||
PD | 0.232 | 0.331 | 0.061 | 0.206 |
PF | 0.070 | 0.061 | 0.031 | 0.039 |
Random forest | ||||
PD | 0.242 | 0.313 | 0.061 | 0.181 |
PF | 0.052 | 0.047 | 0.031 | 0.019 |
Naïve Bayes | ||||
PD | 0.201 | 0.377 | 0.286 | 0.085 |
PF | 0.051 | 0.095 | 0.089 | 0.555 |
NN, Back Propagation | ||||
PD | 0.000 | 0.000 | 0.000 | 0.000 |
PF | 0.000 | 0.000 | 0.002 | 0.000 |
Decision Table | ||||
PD | 0.129 | 0.166 | 0.000 | 0.000 |
PF | 0.028 | 0.026 | 0.011 | 0.003 |
AIRS1 | ||||
PD | 0.282 | 0.298 | 0.224 | 0.231 |
PF | 0.179 | 0.172 | 0.127 | 0.078 |
AIRS2 | ||||
PD | 0.309 | 0.298 | 0.102 | 0.131 |
PF | 0.225 | 0.239 | 0.069 | 0.033 |
AIRSParallel | ||||
PD | 0.300 | 0.294 | 0.163 | 0.125 |
PF | 0.183 | 0.084 | 0.078 | 0.046 |
Immunos1 | ||||
PD | 0.685 | 0.936 | 0.969 | 0.969 |
PF | 0.465 | 0.573 | 0.732 | 0.910 |
Immunos2 | ||||
PD | 0.000 | 0.163 | 0.000 | 0.000 |
PF | 0.000 | 0.140 | 0.001 | 0.003 |
Immunos99 | ||||
PD | 0.155 | 0.917 | 0.918 | 0.925 |
PF | 0.118 | 0.536 | 0.693 | 0.758 |
CLONALG | ||||
PD | 0.149 | 0.107 | 0.041 | 0.013 |
PF | 0.130 | 0.044 | 0.029 | 0.030 |
CSCA | ||||
PD | 0.138 | 0.236 | 0.000 | 0.044 |
PF | 0.039 | 0.050 | 0.022 | 0.014 |
6.2 Experiment 2
Attribute name |
---|
LOC_BLANK |
BRANCH_COUNT |
CALL_PAIRS |
LOC_CODE_AND_COMMENT |
LOC_COMMENTS |
CONDITION_COUNT |
CYCLOMATIC_COMPLEXITY |
CYCLOMATIC_DENSITY |
DECISION_COUNT |
DECISION_DENSITY |
DESIGN_COMPLEXITY |
DESIGN_DENSITY |
EDGE_COUNT |
ESSENTIAL_COMPLEXITY |
ESSENTIAL_DENSITY |
LOC_EXECUTABLE |
PARAMETER_COUNT |
HALSTEAD_CONTENT |
HALSTEAD_DIFFICULTY |
HALSTEAD_EFFORT |
HALSTEAD_ERROR_EST |
HALSTEAD_LENGTH |
HALSTEAD_LEVEL |
HALSTEAD_PROG_TIME |
HALSTEAD_VOLUME |
MAINTENANCE_SEVERITY |
MODIFIED_CONDITION_COUNT |
MULTIPLE_CONDITION_COUNT |
NODE_COUNT |
NORMALIZED_CYLOMATIC_COMPLEXITY |
NUM_OPERANDS |
NUM_OPERATORS |
NUM_UNIQUE_OPERANDS |
NUM_UNIQUE_OPERATORS |
NUMBER_OF_LINES |
PERCENT_COMMENTS |
LOC_TOTAL |
Algorithms | PC3 (37) | PC3 (21) |
---|---|---|
AIRSParallel | ||
Accuracy | 87.46 | 86.95 |
AUC | 0.554 | 0.540 |
PD | 0.150 | 0.125 |
PF | 0.043 | 0.046 |
6.3 Experiment 3
Algorithms | JM1 | KC1 | CM1 |
---|---|---|---|
Decision tree, J48 | |||
Accuracy | 81.04 | 85.78 | 90.16 |
AUC | 0.661 | 0.744 | 0.616 |
Random forest | |||
Accuracy | 80.85 | 84.93 | 88.15 |
AUC | 0.706 | 0.782 | 0.736 |
Naïve Bayes | |||
Accuracy | 80.02 | 82.98 | 85.94 |
AUC | 0.635 | 0.756 | 0.669 |
NN, Back Propagation | |||
Accuracy | 79.78 | 81.18 | 89.56 |
AUC | 0.583 | 0.665 | 0.515 |
Decision Table | |||
Accuracy | 80.71 | 85.54 | 89.56 |
AUC | 0.701 | 0.765 | 0.532 |
AIRS1 | |||
Accuracy | 67.02 | 73.88 | 84.14 |
AUC | 0.555 | 0.580 | 0.530 |
AIRS2 | |||
Accuracy | 71.66 | 75.77 | 82.93 |
AUC | 0.555 | 0.576 | 0.514 |
AIRSParallel | |||
Accuracy | 71.62 | 80.65 | 85.95 |
AUC | 0.568 | 0.609 | 0.549 |
Immunos1 | |||
Accuracy | 69.85 | 73.49 | 69.88 |
AUC | 0.638 | 0.705 | 0.660 |
Immunos2 | |||
Accuracy | 80.65 | 84.54 | 90.16 |
AUC | 0.500 | 0.500 | 0.500 |
Immunos99 | |||
Accuracy | 70.35 | 75.53 | 71.29 |
AUC | 0.632 | 0.709 | 0.650 |
CLONALG | |||
Accuracy | 73.27 | 80.75 | 87.15 |
AUC | 0.517 | 0.505 | 0.505 |
CSCA | |||
Accuracy | 79.58 | 84.92 | 88.55 |
AUC | 0.573 | 0.601 | 0.509 |
Algorithms | JM1 | KC1 | CM1 |
---|---|---|---|
Decision tree, J48 | |||
PD | 0.096 | 0.239 | 0.041 |
PF | 0.018 | 0.029 | 0.004 |
Random forest | |||
PD | 0.235 | 0.267 | 0.041 |
PF | 0.054 | 0.045 | 0.027 |
Naïve Bayes | |||
PD | 0.195 | 0.337 | 0.224 |
PF | 0.055 | 0.080 | 0.071 |
NN, Back Propagation | |||
PD | 0.224 | 0.451 | 0.000 |
PF | 0.056 | 0.121 | 0.007 |
Decision Table | |||
PD | 0.101 | 0.166 | 0.041 |
PF | 0.024 | 0.019 | 0.011 |
AIRS1 | |||
PD | 0.366 | 0.350 | 0.143 |
PF | 0.257 | 0.190 | 0.082 |
AIRS2 | |||
PD | 0.291 | 0.313 | 0.122 |
PF | 0.181 | 0.161 | 0.094 |
AIRSParallel | |||
PD | 0.326 | 0.319 | 0.163 |
PF | 0.019 | 0.100 | 0.065 |
Immunos1 | |||
PD | 0.540 | 0.663 | 0.612 |
PF | 0.264 | 0.252 | 0.292 |
Immunos2 | |||
PD | 0.000 | 0.000 | 0.000 |
PF | 0.000 | 0.000 | 0.000 |
Immunos99 | |||
PD | 0.516 | 0.641 | 0.571 |
PF | 0.251 | 0.224 | 0.272 |
CLONALG | |||
PD | 0.166 | 0.067 | 0.041 |
PF | 0.133 | 0.057 | 0.038 |
CSCA | |||
PD | 0.211 | 0.242 | 0.041 |
PF | 0.064 | 0.040 | 0.022 |
Algorithms | JM1 | KC1 | CM1 |
---|---|---|---|
Decision tree, J48 | |||
Accuracy | 81.01 | 84.68 | 89.31 |
AUC | 0.664 | 0.705 | 0.542 |
Random forest | |||
Accuracy | 80.28 | 84.83 | 88.15 |
AUC | 0.710 | 0.786 | 0.615 |
Naïve Bayes | |||
Accuracy | 80.41 | 82.41 | 86.55 |
AUC | 0.665 | 0.785 | 0.691 |
NN, Back Propagation | |||
Accuracy | 80.65 | 84.54 | 90.16 |
AUC | 0.500 | 0.500 | 0.500 |
Decision Table | |||
Accuracy | 80.81 | 84.92 | 89.16 |
AUC | 0.701 | 0.781 | 0.626 |
AIRS1 | |||
Accuracy | 66.76 | 76.34 | 84.54 |
AUC | 0.567 | 0.602 | 0.569 |
AIRS2 | |||
Accuracy | 73.36 | 77.34 | 82.53 |
AUC | 0.565 | 0.591 | 0.530 |
AIRSParallel | |||
Accuracy | 70.17 | 79.47 | 86.14 |
AUC | 0.564 | 0.588 | 0.488 |
Immunos1 | |||
Accuracy | 59.99 | 49.98 | 69.88 |
AUC | 0.600 | 0.678 | 0.697 |
Immunos2 | |||
Accuracy | 80.65 | 80.23 | 90.16 |
AUC | 0.500 | 0.491 | 0.500 |
Immunos99 | |||
Accuracy | 65.02 | 62.21 | 76.51 |
AUC | 0.594 | 0.705 | 0.679 |
CLONALG | |||
Accuracy | 72.92 | 79.28 | 87.95 |
AUC | 0.512 | 0.522 | 0.497 |
CSCA | |||
Accuracy | 79.55 | 83.21 | 87.75 |
AUC | 0.575 | 0.590 | 0.505 |
Algorithms | JM1 | KC1 | CM1 |
---|---|---|---|
Decision tree, J48 | |||
PD | 0.148 | 0.175 | 0.000 |
PF | 0.031 | 0.030 | 0.009 |
Random forest | |||
PD | 0.243 | 0.282 | 0.102 |
PF | 0.063 | 0.048 | 0.033 |
Naïve Bayes | |||
PD | 0.223 | 0.365 | 0.306 |
PF | 0.056 | 0.092 | 0.073 |
NN, Back Propagation | |||
PD | 0.000 | 0.000 | 0.000 |
PF | 0.000 | 0.000 | 0.000 |
Decision Table | |||
PD | 0.108 | 0.178 | 0.000 |
PF | 0.024 | 0.028 | 0.011 |
AIRS1 | |||
PD | 0.402 | 0.368 | 0.224 |
PF | 0.269 | 0.184 | 0.087 |
AIRS2 | |||
PD | 0.290 | 0.328 | 0.163 |
PF | 0.160 | 0.145 | 0.102 |
AIRSParallel | |||
PD | 0.301 | 0.301 | 0.041 |
PF | 0.173 | 0.125 | 0.065 |
Immunos1 | |||
PD | 0.600 | 0.936 | 0.694 |
PF | 0.400 | 0.058 | 0.301 |
Immunos2 | |||
PD | 0.000 | 0.040 | 0.000 |
PF | 0.000 | 0.058 | 0.000 |
Immunos99 | |||
PD | 0.502 | 0.825 | 0.571 |
PF | 0.314 | 0.415 | 0.214 |
CLONALG | |||
PD | 0.159 | 0.129 | 0.020 |
PF | 0.134 | 0.086 | 0.027 |
CSCA | |||
PD | 0.217 | 0.239 | 0.041 |
PF | 0.066 | 0.059 | 0.031 |
6.4 Experiment 4
Algorithms | CM1 (Old Values) | CM1 |
---|---|---|
Decision tree, J48 | ||
Accuracy | 87.95 | 75.50 |
AUC | 0.558 | 0.534 |
Random Forest | ||
Accuracy | 87.95 | 76.71 |
AUC | 0.723 | 0.656 |
Naïve Bayes | ||
Accuracy | 85.34 | 77.31 |
AUC | 0.658 | 0.614 |
NN, Back Propagation | ||
Accuracy | 89.96 | 80.32 |
AUC | 0.499 | 0.500 |
Decision Table | ||
Accuracy | 89.16 | 79.72 |
AUC | 0.626 | 0.554 |
AIRS1 | ||
Accuracy | 80.92 | 66.27 |
AUC | 0.549 | 0.497 |
AIRS2 | ||
Accuracy | 84.94 | 68.67 |
AUC | 0.516 | 0.501 |
AIRSParallel | ||
Accuracy | 84.74 | 76.10 |
AUC | 0.543 | 0.555 |
Immunos1 | ||
Accuracy | 32.92 | 38.35 |
AUC | 0.610 | 0.574 |
Immunos2 | ||
Accuracy | 89.16 | 80.32 |
AUC | 0.494 | 0.500 |
Immunos99 | ||
Accuracy | 36.75 | 50.00 |
AUC | 0.613 | 0.600 |
CLONALG | ||
Accuracy | 87.95 | 73.69 |
AUC | 0.506 | 0.520 |
CSCA | ||
Accuracy | 88.15 | 78.31 |
AUC | 0.489 | 0.530 |