Introduction
Background
Statistical measures of association
Case | Predictor | Response |
---|---|---|
Case 1 | Categorical | Categorical |
Case 2 | Categorical | Continuous |
Case 3 | Continuous | Categorical |
Case 4 | Continuous | Continuous |
Applied metaheuristics
Tabu search
- Let f(x) be the cost function and f*(x) be the best-known objective function value.
- Let x be the current solution.
- Let x* be the best-known solution.
- Let S(x) be the set of all neighborhood solutions about x.
- Let T be the set of prohibited moves, known as the Tabu list.
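The definitions above can be assembled into a minimal tabu search loop. The sketch below is a generic illustration, not the paper's implementation: the function names, the fixed tabu tenure, the iteration-count stopping rule, and the aspiration override (a tabu move is admitted if it improves on f*(x)) are all assumptions for the sake of a runnable example.

```python
def tabu_search(f, x0, neighbors, tabu_tenure=5, max_iter=100):
    """Minimize f by tabu search.

    f: cost function; x0: initial solution; neighbors: function returning
    S(x), the set of neighborhood solutions about x.
    """
    x, x_best = x0, x0            # current solution x and best-known x*
    f_best = f(x0)                # best-known objective value f*(x)
    T = []                        # Tabu list of prohibited (recent) solutions
    for _ in range(max_iter):
        # Admissible neighbors: not tabu, or better than the best known
        # (the standard "aspiration" override).
        candidates = [s for s in neighbors(x) if s not in T or f(s) < f_best]
        if not candidates:
            break
        x = min(candidates, key=f)    # move to the best admissible neighbor
        T.append(x)                   # prohibit revisiting it for a while
        if len(T) > tabu_tenure:
            T.pop(0)                  # expire the oldest tabu entry
        if f(x) < f_best:
            x_best, f_best = x, f(x)
    return x_best, f_best
```

Note that the best admissible neighbor is taken even when it worsens the cost; the tabu list is what prevents the search from cycling straight back to the previous solution.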
Simulated annealing
- Let f(x) be the cost function and f*(x) be the best-known objective function value.
- Let x be the current solution.
- Let x* be the best-known solution.
- Let S(x) be the set of all neighborhood solutions about x.
- Let k be the index of the current iteration.
- Let T(k) be a temperature parameter.
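These definitions can likewise be sketched as a short loop. The geometric cooling schedule T(k) = T0·αᵏ and the parameter values below are assumptions chosen for illustration; the acceptance test is the standard Metropolis criterion, which accepts a worsening move with probability exp(−Δ/T(k)).

```python
import math
import random

def simulated_annealing(f, x0, neighbors, T0=10.0, alpha=0.95, max_iter=1000):
    """Minimize f by simulated annealing with geometric cooling."""
    random.seed(0)                    # fixed seed so the sketch is reproducible
    x, x_best = x0, x0                # current solution x and best-known x*
    f_best = f(x0)                    # best-known objective value f*(x)
    for k in range(max_iter):
        T = T0 * alpha ** k           # temperature T(k) at iteration k
        s = random.choice(neighbors(x))   # random candidate drawn from S(x)
        delta = f(s) - f(x)
        # Always accept improvements; accept uphill moves with
        # probability exp(-delta / T), which shrinks as T cools.
        if delta <= 0 or random.random() < math.exp(-delta / T):
            x = s
            if f(x) < f_best:
                x_best, f_best = x, f(x)
    return x_best, f_best
```

Early on, the high temperature lets the search escape local minima; as T(k) decays the procedure behaves increasingly like a greedy descent.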
Genetic algorithms
- Candidate solutions are referred to as chromosomes and are analogous to the feasible solutions discussed in the previous heuristics.
- Elements or sub-components of the chromosomes are referred to as genes, and the values assigned to genes are called alleles.
- Selection refers to the process by which candidate solutions with better fitness values are given preference, imposing a survival-of-the-fittest mechanism on the algorithm.
- Recombination refers to the generation of new candidate solutions by merging selected elements of two or more solutions identified as having traits conducive to fitness. Solutions marked as good by the selection process serve as parent solutions, which are combined to produce offspring, or child solutions.
- Mutation refers to the process by which the algorithm randomly modifies a child solution by slightly altering one or more of its elements.
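The vocabulary above maps directly onto a simple genetic algorithm over binary chromosomes. The specific operators below (tournament selection, one-point crossover, per-gene bit-flip mutation) and all parameter values are illustrative choices, not the paper's configuration.

```python
import random

def genetic_algorithm(fitness, n_genes, pop_size=20, generations=50,
                      mutation_rate=0.05):
    """Maximize fitness over chromosomes: lists of n_genes binary alleles."""
    random.seed(0)  # fixed seed so the sketch is reproducible
    pop = [[random.randint(0, 1) for _ in range(n_genes)]
           for _ in range(pop_size)]
    for _ in range(generations):
        def select():
            # Selection: tournament of two, the fitter chromosome survives.
            a, b = random.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        children = []
        while len(children) < pop_size:
            p1, p2 = select(), select()           # parent solutions
            cut = random.randrange(1, n_genes)    # recombination point
            child = p1[:cut] + p2[cut:]           # one-point crossover
            for i in range(n_genes):              # mutation: flip alleles
                if random.random() < mutation_rate:
                    child[i] = 1 - child[i]
            children.append(child)
        pop = children
    return max(pop, key=fitness)
```

On a toy "one-max" problem (fitness = number of 1-alleles), selection pressure drives the population toward the all-ones chromosome within a few dozen generations.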
Machine learning
Fuzzy inference systems
Methods: proposed framework
Data analysis and preprocessing
Filter #1: individual strength of association
Machine learning problem formulation
Filter #2: membership in solution groups
Feature categorization through fuzzy inference
- Level 1. This label is for features that produce high-quality solutions in single-variable models. Other features receiving this highest level are those that appear consistently in high-quality solutions while exhibiting low or nonexistent membership in low-quality solutions.
- Level 2. This label is for features that exhibit strong membership in high-quality solutions but also exhibit some non-trivial membership in low-quality solutions. This indicates that the value these features contribute lies in their interaction with other features in producing a high-quality result.
- Level 3. This label is for features tending not to appear in the best solutions but also tending not to appear in the worst solutions. Rather, these features tend to appear in the “squishy middle”: the subsets of features producing generally unremarkable results.
- Level 4. This label is for features that exhibit strong membership in low-quality solutions but also exhibit non-trivial membership in high-quality solutions.
- Level 5. This label is for features that perform poorly regardless of circumstance, exhibiting strong membership in low-quality solutions and low or nonexistent membership in high-quality solutions. These features contribute to non-poor solutions only when they happen to be combined with especially high-performing features.
FIS step (1): identification of inputs
Fuzzy input | Description |
---|---|
\(X_{1}\) | Feature’s membership in high precision group |
\(X_{2}\) | Feature’s membership in low precision group |
\(X_{3}\) | Feature’s membership in high recall group |
\(X_{4}\) | Feature’s membership in low recall group |
FIS step (2): definition of input membership functions
FIS step (3): definition of rules
Logical operator | Convention |
---|---|
AND | \(\min\left(\mu_{A}(x), \mu_{B}(x)\right)\) |
OR | \(\max\left(\mu_{A}(x), \mu_{B}(x)\right)\) |
NOT | \(1 - \mu_{A}(x)\) |
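These conventions translate directly into min, max, and complement operations on membership degrees. The sketch below shows them applied to a hypothetical rule (the rule itself is an invented example, not one of the paper's rules).

```python
def fuzzy_and(mu_a, mu_b):
    """AND of two membership degrees: min convention."""
    return min(mu_a, mu_b)

def fuzzy_or(mu_a, mu_b):
    """OR of two membership degrees: max convention."""
    return max(mu_a, mu_b)

def fuzzy_not(mu_a):
    """NOT of a membership degree: complement."""
    return 1 - mu_a

# Hypothetical rule: "IF X1 is high AND X4 is low THEN feature is Level 1".
# Its firing strength is the AND (min) of the antecedent memberships.
firing_strength = fuzzy_and(0.8, 0.6)  # -> 0.6
```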
FIS step (4): fuzzification
FIS step (5): activation of rules
FIS step (6): defuzzification
Results and discussion
Example 1: robot execution failures
Description of data
Data analysis and preprocessing
First filter
Machine learning problem formulation
Second filter
Fuzzy inference system
Results
Label | Number of features |
---|---|
Level 1 (highest value) | 15 |
Level 2 | 2 |
Level 3 | 348 |
Level 4 | 76 |
Level 5 (lowest value) | 0 |
Discussion
Label | Number of features |
---|---|
Level 1 (highest value) | 13 |
Level 2 | 2 |
Level 3 | 312 |
Level 4 | 114 |
Level 5 (lowest value) | 0 |
Label | Number of features |
---|---|
Level 1 (highest value) | 2 |
Level 2 | 13 |
Level 3 | 181 |
Level 4 | 245 |
Level 5 (lowest value) | 0 |
Rank order | Feature (Scenario 1: 1000 subsets) | Crisp output (Scenario 1) | Feature (Scenario 2: 200,000 subsets) | Crisp output (Scenario 2) |
---|---|---|---|---|
1 | 15 | 0.891666667 | 22 | 0.881925144 |
2 | 95 | 0.891666667 | 21 | 0.874986572 |
3 | 182 | 0.891666667 | 4 | 0.869815795 |
4 | 22 | 0.881298879 | 417 | 0.863758421 |
5 | 330 | 0.881298879 | 42 | 0.863278371 |
6 | 108 | 0.813182504 | 156 | 0.858847362 |
7 | 129 | 0.733510773 | 12 | 0.846923559 |
8 | 138 | 0.733510773 | 95 | 0.845564336 |
9 | 370 | 0.733510773 | 6 | 0.829632339 |
10 | 68 | 0.711872842 | 47 | 0.814060738 |
11 | 107 | 0.711872842 | 230 | 0.753711779 |
12 | 159 | 0.711872842 | 125 | 0.73260824 |
13 | 169 | 0.711872842 | 82 | 0.707993795 |
14 | 226 | 0.711872842 | 201 | 0.69445375 |
15 | 386 | 0.711872842 | 28 | 0.684691585 |
Actual (down)\predicted (across) | 0 | 1 |
---|---|---|
0 | 11 | 0 |
1 | 7 | 0 |
Actual (down)\predicted (across) | 0 | 1 |
---|---|---|
0 | 11 | 0 |
1 | 0 | 7 |
Example 2: single photon emission computed tomography (SPECT) images
Description of data
Class | Training set | Test set | Full set |
---|---|---|---|
0 | 42 | 13 | 55 |
1 | 171 | 41 | 212 |
Total | 203 | 54 | 267 |
Problem formulation
Summary of results
Model | Number of features | Classification accuracy | Relative size | Relative performance | PSR |
---|---|---|---|---|---|
Original | 22 | 0.7778 | 1 | 1 | 1 |
Filter #1 | 15 | 0.7222 | 0.6818 | 0.9285 | 1.3618 |
Final | 6 | 0.8148 | 0.2727 | 1.0476 | 3.8416 |
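The tabulated values are consistent with PSR being the relative performance divided by the relative size (the function name below is an assumption; the numbers are taken from the Filter #1 row of the table above).

```python
def relative_metrics(n_original, n_model, acc_original, acc_model):
    """Relative size, relative performance, and PSR = performance / size."""
    rel_size = n_model / n_original      # fraction of original features kept
    rel_perf = acc_model / acc_original  # accuracy relative to the full model
    return rel_size, rel_perf, rel_perf / rel_size

# Filter #1 row of the SPECT summary: 15 of 22 features, accuracy 0.7222 vs 0.7778
rel_size, rel_perf, psr = relative_metrics(22, 15, 0.7778, 0.7222)
# -> roughly (0.6818, 0.9285, 1.3618), matching the table
```

Read this way, PSR rewards models that retain accuracy while shrinking the feature set: the final six-feature model's PSR of 3.8416 reflects a slight accuracy gain on roughly a quarter of the features.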
Example 3: single photon emission computed tomography image features (SPECTF)
Description of data
Problem formulation
Summary of results
Model | Number of features | Classification accuracy | Relative size | Relative performance | PSR |
---|---|---|---|---|---|
Original | 44 | 0.8148 | 1 | 1 | 1 |
Filter #1 | 18 | 0.7407 | 0.4091 | 0.9091 | 2.2222 |
Final (keep best) | 6 | 0.8703 | 0.1364 | 1.0681 | 7.8308 |
Final (drop worst) | 10 | 0.8333 | 0.2273 | 1.0227 | 4.5 |