Knowledge aggregation in decision-making process with C-fuzzy random forest using OWA operators

Soft Computing, Issue 11/2019 (Methodologies and Application, Open Access)
1 Introduction
1.1 C-fuzzy decision trees

The decision tree usually operates on a small set of discrete attributes,

To split the node, the single attribute which brings the most information gain is chosen,

In its traditional form, the decision tree is designed to deal with discrete class problems; continuous problems are handled by regression trees.

To handle continuous values, it is necessary to perform the discretization which can impact the overall performance of the tree.

Information brought by the attributes that were not selected to split the node is essentially lost.

c is the number of clusters,

N is the number of training instances,

\(i=1,2,\ldots ,c\),

\(k=1,2,\ldots ,N\),

\(U=[u_{ik}]\) is the partition matrix,

m is the fuzzification factor (usually \(m=2\)),

\(d_{ik}\) is the distance between the ith prototype and the kth instance,

\(f_i\) is the prototype of the ith cluster,

\(\varvec{Z}=\{\varvec{x}(k),y(k)\}\) is an input–output pair of data instances,

\(\varvec{z}_k=[x_1(k)\; x_2(k)\; \ldots \; x_n(k)\; y(k)]^\mathrm{T}\).

\(\varvec{X}_i=\{\varvec{x}(k)\mid u_i(\varvec{x}(k))>u_j(\varvec{x}(k))\) for all \(j \ne i\}\), where j pertains to the nodes originating from the same parent, denotes all elements of the data set which belong to the given node in virtue of the highest membership grade,

\(\varvec{Y}_i=\{y(k)\mid \varvec{x}(k) \in \varvec{X}_i\}\) collects the output coordinates of the elements that have already been assigned to \(\varvec{X}_i\),

\(\varvec{U}_i=[u_i(\varvec{x}(1))\; u_i(\varvec{x}(2))\cdots u_i(\varvec{x}(\mathrm{card}(\varvec{X}_i)))]\) is the vector of membership grades of the elements in \(\varvec{X}_i\),

\(\varvec{N}_i=\langle \varvec{X}_i, \varvec{Y}_i, \varvec{U}_i\rangle \),

\(m_i\) is the representative of this node positioned in the output space,

\(V_i\) is the variability of the data in the output space existing at the given node.
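The notation above follows fuzzy c-means-style clustering: the memberships \(u_{ik}\) are determined by the distances \(d_{ik}\) between prototypes and instances. A minimal sketch of the standard FCM membership update under these definitions, assuming Euclidean distance (the function name is illustrative, not part of the paper):

```python
import numpy as np

def fcm_memberships(X, prototypes, m=2.0):
    """Compute the fuzzy partition matrix U = [u_ik].

    u_ik is the membership of instance k to cluster i, derived from the
    distances d_ik between the ith prototype and the kth instance:
        u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1))
    """
    # d[i, k]: Euclidean distance between prototype i and instance k
    d = np.linalg.norm(prototypes[:, None, :] - X[None, :, :], axis=2)
    d = np.fmax(d, 1e-12)                 # guard against division by zero
    exponent = 2.0 / (m - 1.0)
    # ratio[i, j, k] = (d_ik / d_jk)^(2/(m-1)); summing over j gives 1/u_ik
    ratio = (d[:, None, :] / d[None, :, :]) ** exponent
    return 1.0 / ratio.sum(axis=1)
```

Each column of the resulting matrix sums to 1, so every instance distributes a unit of membership over the c clusters.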
1.2 Fuzzy random forests

The robustness of ensemble classifiers,

The power of randomness to decrease the correlation between the trees and increase their diversity,

The flexibility of fuzzy logic for dealing with imperfect data.
1.3 OWA operators

n is the size of the collection,

\(W=[W_1, W_2, \ldots , W_n]\) is the vector of weights, where:

\(W_i\in [0,1]\),

\(\sum _{i=1}^nW_i = 1,\)


\(A=[a_1, a_2, \ldots , a_n]\) is the collection of aggregated values, \(a_i\in [0,1]\),

\(B=[b_1, b_2, \ldots , b_n]\) is the vector A sorted in descending order.
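The aggregation defined by W, A and B above can be sketched directly; note that the weight \(W_i\) is attached to the ith largest value, not to \(a_i\) itself:

```python
def owa(a, w):
    """Ordered weighted averaging of the collection a with weight vector w."""
    # Weights must lie in [0, 1] and sum to 1, as required by the definition
    assert all(0.0 <= wi <= 1.0 for wi in w) and abs(sum(w) - 1.0) < 1e-9
    b = sorted(a, reverse=True)          # B: the vector A sorted descending
    return sum(wi * bi for wi, bi in zip(w, b))
```

With \(W=[1,0,\ldots ,0]\) the operator reduces to the maximum, with \(W=[0,\ldots ,0,1]\) to the minimum, and with \(W_i=1/n\) to the arithmetic mean.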
1.4 Other approaches
2 C-fuzzy random forest
2.1 Notation

c is the number of clusters,

T is the number of trees in the CFRF ensemble,

t is the particular tree,

\(N_t\) is the number of nodes in the tree t,

n is a particular leaf reached in a tree,

E is a training set,

e is a data instance,

I is the number of classes,

i is a particular class,

\(C\_{FRF}\) is a matrix of size \((T \times \textit{MAX}_{N_{t}})\) with \(\textit{MAX}_{N_{t}}=\max \left\{ N_1,N_2,\ldots ,N_T\right\} \), where each element of the matrix is a vector of size I containing the support for every class provided by every activated leaf n in each tree t; this matrix represents the C-fuzzy forest or C-fuzzy random forest,

\(\varvec{w}=[w_1, w_2, \ldots , w_c]\) is the vector of weights of the OWA operator,

\(W=[\varvec{w}_1, \varvec{w}_2, \ldots , \varvec{w}_c]\) is the matrix of OWA operators of the forest,

\(W_t=[\varvec{w}_1, \varvec{w}_2, \ldots , \varvec{w}_c]\) is the matrix of OWA operators of tree t,

\(W_n=[\varvec{w}_1, \varvec{w}_2, \ldots , \varvec{w}_c]\) is the matrix of OWA operators of node n,

\(M=\bigcup _{i=1}^cM_i\) is the set of subsets of training objects belonging to the children of the given node (each element corresponds to a single child of the node),

\(U=[U_1, U_2, \ldots , U_{|E|}]\) is the tree’s partition matrix of the training objects,

\(U_i = [u_1, u_2, \ldots , u_c]\) are the memberships of the ith object to the c clusters,

\(B=\left\{ B_1, B_2, \ldots , B_b\right\} \) are the unsplit nodes,

\(V=[V_1, V_2, \ldots , V_b]\) is the variability vector.
2.2 The idea of C-fuzzy random forest classifier

Random selection of the node to split, taken from Random Forests:

The full randomness: selecting a random node to split instead of the most heterogeneous one,

The limited randomness: selecting the set of nodes with the highest diversity, then randomly choosing one of them to perform the division. The size of the set is given and the same for each split,
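The limited-randomness variant can be sketched as follows, using the unsplit nodes B and the variability vector V from the notation above (the function and parameter names are illustrative):

```python
import random

def pick_node_to_split(unsplit_nodes, variability, set_size):
    """Keep the set_size nodes with the highest variability, then pick one
    of them uniformly at random.

    set_size = 1 recovers the deterministic C-fuzzy tree choice;
    set_size = len(unsplit_nodes) gives the fully random variant.
    """
    # Rank the unsplit nodes B by their variability V, most diverse first
    ranked = sorted(zip(unsplit_nodes, variability),
                    key=lambda pair: pair[1], reverse=True)
    node, _ = random.choice(ranked[:set_size])
    return node
```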


Random creation of the partition matrix, taken from the C-fuzzy decision trees:

At first, the centroid (prototype) of each cluster is selected fully at random. The objects belonging to the parent node are divided into clusters grouped around these centroids using the shortest-distance criterion. Then the prototypes and the partition matrix are corrected iteratively until the stop criterion is reached,

Each tree in the forest, created as described above, can be selected from a set of candidate trees. To create a single tree of the forest, a set of trees is built; each tree in this set is evaluated, and the one which achieves the best classification accuracy on the training set is chosen as part of the forest. The size of the set is given and the same for each tree in the forest.

2.3 C-fuzzy random forest learning

The kind of trees used in the forest: C-fuzzy decision trees instead of Janikow’s fuzzy trees,

The way of random selection of the node to split,

The fact of using OWA operators (see Sect. 3).
2.4 C-fuzzy random forest classification
3 Aggregating the knowledge contained in trees in decision making with OWA operators

Local OWA: computed for each node of each tree in the forest,

Global OWA:

Global OWA for each tree in the forest: computed once for each tree,

Global OWA for the whole forest: computed once for the whole forest.

3.1 Obtaining OWA operators weights using genetic algorithm

r is the crossover probability,

s is the mutation probability,

\(p \in [0,1]\) is a randomly generated number which decides about crossover,

\(q \in [0,1]\) is a randomly generated number which decides about mutation,

k is a randomly generated crossover point,

i is the number of epochs.
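A minimal sketch of such a genetic search over OWA weight vectors, using the parameters r, s, k and the number of epochs listed above. The `fitness` callback (e.g. the classification accuracy the forest reaches with a candidate weight vector) and the elitism step are assumptions for the sketch, not part of the listed notation; weights are renormalized after each operation so they keep summing to 1:

```python
import random

def normalize(w):
    """Rescale a nonnegative vector so that its entries sum to 1."""
    s = sum(w)
    return [wi / s for wi in w] if s > 0 else [1.0 / len(w)] * len(w)

def evolve_owa_weights(fitness, c, pop_size=20, r=0.8, s=0.1, epochs=50):
    # Initial population: random normalized weight vectors of length c
    population = [normalize([random.random() for _ in range(c)])
                  for _ in range(pop_size)]
    for _ in range(epochs):
        population.sort(key=fitness, reverse=True)
        next_gen = population[:2]                  # keep the two best (elitism)
        while len(next_gen) < pop_size:
            p1, p2 = random.sample(population[:pop_size // 2], 2)
            child = list(p1)
            if random.random() < r:                # p decides about crossover
                k = random.randrange(1, c)         # crossover point k
                child = p1[:k] + p2[k:]
            if random.random() < s:                # q decides about mutation
                child[random.randrange(c)] = random.random()
            next_gen.append(normalize(child))      # keep sum of weights = 1
        population = next_gen
    return max(population, key=fitness)
```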
3.2 Local OWA
3.2.1 Learning
3.2.2 Classification
3.3 Global OWA
3.3.1 Global OWA for each tree in the forest
3.3.2 Global OWA for the whole forest
3.4 Example of using OWA operators during classification process
Dataset  Number of instances  Number of attributes (without class)  Number of classes  Attribute types 

Hepatitis  155  19  2  Categorical, integer, real 
Dermatology  366  33  2  Categorical, integer 
Semeion Handwritten Digit  1593  256  2  Integer 
Diabetic retinopathy  1151  20  2  Integer, real 
Pima Indians diabetes  768  8  2  Integer, real 
Balance scale  625  4  3  Categorical 
Indian liver patient  583  10  2  Integer, real 
MUSK (Version 1)  476  168  2  Integer 
QSAR biodegradation  1055  41  2  Integer, real 
Climate model simulation crashes  540  18  2  Real 

a node to which the object i belongs (for local OWA),

a tree t (for global OWA for each tree),

a forest (for global OWA for the whole forest).
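Given the per-class supports collected in the \(C\_{FRF}\) matrix, the final decision for an object can be sketched as an OWA aggregation per class over the activated leaves (function names are illustrative):

```python
def owa(a, w):
    """OWA: weighted sum of the values of a, sorted in descending order."""
    b = sorted(a, reverse=True)
    return sum(wi * bi for wi, bi in zip(w, b))

def classify_with_owa(supports, w):
    """Aggregate the supports of the activated leaves and pick the winner.

    supports: one vector of size I per activated leaf (or per tree), each
    containing the support for every class; w: the OWA weight vector used
    for the aggregation (local, global tree, or global forest variant).
    Returns the index of the class with the highest aggregated support.
    """
    n_classes = len(supports[0])
    aggregated = [owa([s[i] for s in supports], w) for i in range(n_classes)]
    return max(range(n_classes), key=lambda i: aggregated[i])
```

The three variants differ only in which weight vector w is passed in: the one learned for the reached node, for the tree, or for the whole forest.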
4 Experiments description

C-fuzzy forest without OWA,

C-fuzzy forest using local OWA,

C-fuzzy forest using global OWA,

C-fuzzy forest using global tree OWA,

C-fuzzy random forest without OWA,

C-fuzzy random forest using local OWA,

C-fuzzy random forest using global OWA,

C-fuzzy random forest using global tree OWA.
5 Results

Classification accuracy achieved using C4.5 rev. 8 decision tree,

Classification accuracy achieved using a single C-fuzzy decision tree,

Classification accuracies achieved using C-fuzzy forest without OWA and with three proposed OWA variants,

Classification accuracies achieved using C-fuzzy random forest without OWA and with three proposed OWA variants.
Dataset  Hepatitis  Dermatology  Semeion Handwritten Digit  Diabetic retinopathy  Pima Indians diabetes  Balance scale  Indian liver patient  MUSK (Version 1)  QSAR biodegradation  Climate model simulation crashes

Number of clusters  13  5  3  2  5  2  2  5  8  3
C4.5 rev. 8  58.06  94.32  94.79  64.00  73.73  78.40  66.84  80.21  82.94  92.41
Single C-fuzzy tree  55.49  94.00  94.35  64.64  73.7  82.24  66.04  77.72  75.55  89.07
CFF, no OWA  65.81  97.82  96.92  65.07  75.13  83.04  68.78  85.93  85.12  95.75
CFF, local OWA  59.36  97.54  96.86  65.07  75.13  83.04  68.12  84.88  83.69  95.75
CFF, global OWA  67.10  97.82  96.92  65.07  75.13  83.04  68.78  85.93  85.12  95.75
CFF, global tree OWA  63.23  97.54  96.92  65.07  75.13  83.04  68.78  85.30  83.79  95.75
CFRF, no OWA  64.52  97.27  97.30  65.16  75.78  83.36  69.81  84.65  82.94  94.27
CFRF, local OWA  64.52  97.54  97.30  65.42  75.78  83.52  69.29  83.61  82.18  94.45
CFRF, global OWA  58.71  97.27  97.30  65.16  75.78  83.36  69.81  84.65  82.94  94.27
CFRF, global tree OWA  63.87  96.72  97.30  65.16  75.78  83.36  69.81  84.44  83.41  94.27
  C-fuzzy forest  C-fuzzy random forest
Number of clusters  2  3  5  8  13  2  3  5  8  13

Average number of nodes  362  341  312  343  457  369  346  308  331  508
Average tree width  68  88  110  150  179  72  91  112  145  197
Average tree height  13  9  7  6  6  13  10  6  6  7
  C-fuzzy forest  C-fuzzy random forest
Number of clusters  2  3  5  8  13  2  3  5  8  13

Average number of nodes  204  113  124  128  162  205  116  122  131  160
Average tree width  37  30  61  69  115  38  31  56  72  118
Average tree height  11  7  5  4  3  11  7  5  3  4
  C-fuzzy forest  C-fuzzy random forest
Number of clusters  2  3  5  8  13  2  3  5  8  13

Learning times  00:58:42  01:20:28  01:21:22  01:31:29  05:19:59  00:59:06  01:10:13  01:17:25  01:28:56  05:16:34

Classification times
Without OWA  00:00:20  00:00:31  00:01:08  00:01:44  00:02:54  00:00:20  00:00:27  00:00:56  00:01:43  00:02:58
With local OWA  00:00:20  00:00:31  00:01:06  00:01:35  00:02:42  00:00:20  00:00:27  00:00:55  00:01:35  00:02:49
With global OWA  00:01:45  00:03:06  00:06:57  00:10:17  00:15:42  00:01:47  00:02:44  00:05:38  00:09:54  00:15:53
With global tree OWA  00:01:48  00:03:13  00:07:03  00:10:35  00:16:19  00:01:51  00:02:49  00:05:48  00:10:13  00:16:32
  C-fuzzy forest  C-fuzzy random forest
Number of clusters  2  3  5  8  13  2  3  5  8  13

Learning times  00:10:31  00:10:08  00:12:59  00:18:35  00:22:44  00:10:36  00:10:09  00:12:59  00:17:49  00:22:24

Classification times
Without OWA  00:00:01  00:00:01  00:00:02  00:00:04  00:00:07  00:00:01  00:00:01  00:00:02  00:00:04  00:00:07
With local OWA  00:00:01  00:00:01  00:00:02  00:00:04  00:00:07  00:00:01  00:00:01  00:00:02  00:00:04  00:00:07
With global OWA  00:00:21  00:00:31  00:00:53  00:01:29  00:02:22  00:00:21  00:00:31  00:00:53  00:01:31  00:02:18
With global tree OWA  00:00:24  00:00:36  00:01:03  00:01:49  00:03:03  00:00:24  00:00:37  00:01:04  00:01:51  00:02:58