1 Introduction and motivation
2 Related work
3 Description logic-based information systems
3.1 Description logics and semantics
-
if \(A \in \Sigma _C\) then \(A\) is a concept,
-
if \(C\) and \(D\) are concepts and \(r \in \Sigma _R\) is a role then \(\top \), \(\bot \), \(\lnot C\), \(C \sqcap D\), \(C \sqcup D\), \(\exists r.C\) and \(\forall r.C\) are also concepts.
-
\(\top \) stands for the top concept,
-
\(\bot \) stands for the bottom concept,
-
\(\lnot C\) stands for the complement of \(C\),
-
\(C \sqcap D\) stands for the intersection of \(C\) and \(D\),
-
\(C \sqcup D\) stands for the union of \(C\) and \(D\),
-
\(\exists r.C\) stands for the existential restriction of \(C\) by \(r\),
-
\(\forall r.C\) stands for the value/universal restriction of \(C\) by \(r\).
-
if \(A \in \Sigma _C\) then \(A\) is a concept,
-
if \(r \in \Sigma _R\) then \(r\) is a role,
-
if \(C\) and \(D\) are concepts, \(R\) and \(S\) are roles then
-
\(\varepsilon \), \(R \circ S\), \(R \sqcup S\), \(R^*\), \(C?\) are roles,
-
\(\top \), \(\bot \), \(\lnot C\), \(C \sqcap D\), \(C \sqcup D\), \(\exists R.C\) and \(\forall R.C\) are concepts.
-
-
\(\varepsilon \) stands for the identity relation,
-
\(R \circ S\) stands for the sequential composition of \(R\) and \(S\),
-
\(R \sqcup S\) stands for the set-theoretical union of \(R\) and \(S\),
-
\(R^*\) stands for the reflexive and transitive closure of \(R\),
-
\(C?\) stands for the test operator.
-
if \(r \in \Sigma _{oR}\) then \(r\) is an object role of \(\mathcal {L}_{\Sigma ,\Phi }\),
-
if \(A \in \Sigma _C\) then \(A\) is concept of \(\mathcal {L}_{\Sigma ,\Phi }\),
-
if \(A \in \Sigma _A\setminus \Sigma _C\) and \(d \in range (A)\) then \(A=d\) and \(A \ne d\) are concepts of \(\mathcal {L}_{\Sigma ,\Phi }\),
-
if \(A \in \Sigma _{nA}\) and \(d \in range (A)\) then \(A \le d\), \(A < d\), \(A \ge d\) and \(A > d\) are concepts of \(\mathcal {L}_{\Sigma ,\Phi }\),
-
if \(R\) and \(S\) are object roles of \(\mathcal {L}_{\Sigma ,\Phi }\), \(C\) and \(D\) are concepts of \(\mathcal {L}_{\Sigma ,\Phi }\), \(r \in \Sigma _{oR}\), \(\sigma \in \Sigma _{dR}\), \(a \in \Sigma _I\), and \(n\) is a natural number then
-
\(\varepsilon \), \(R \circ S\), \(R \sqcup S\), \(R^*\) and \(C?\) are object roles of \(\mathcal {L}_{\Sigma ,\Phi }\),
-
\(\top \), \(\bot \), \(\lnot C\), \(C \sqcap D\), \(C \sqcup D\), \(\forall R.C\) and \(\exists R.C\) are concepts of \(\mathcal {L}_{\Sigma ,\Phi }\),
-
if \(d \in range (\sigma )\) then \(\exists \sigma .\{d\}\) is a concept of \(\mathcal {L}_{\Sigma ,\Phi }\),
-
if \(I \in \Phi \) then \(R^-\) is an object role of \(\mathcal {L}_{\Sigma ,\Phi }\),
-
if \(O \in \Phi \) then \(\{a\}\) is a concept of \(\mathcal {L}_{\Sigma ,\Phi }\),
-
if \(F \in \Phi \) then \(\le \,1\,r\) is a concept of \(\mathcal {L}_{\Sigma ,\Phi }\),
-
if \(\{F,I\} \subseteq \Phi \) then \(\le \,1\,r^-\) is a concept of \(\mathcal {L}_{\Sigma ,\Phi }\),
-
if \(N \in \Phi \) then \(\ge \,n\,r\) and \(\le \,n\,r\) are concepts of \(\mathcal {L}_{\Sigma ,\Phi }\),
-
if \(\{N,I\} \subseteq \Phi \) then \(\ge \,n\,r^-\) and \(\le \,n\,r^-\) are concepts of \(\mathcal {L}_{\Sigma ,\Phi }\),
-
if \(Q \in \Phi \) then \(\ge \,n\,r.C\) and \(\le \,n\,r.C\) are concepts of \(\mathcal {L}_{\Sigma ,\Phi }\),
-
if \(\{Q,I\} \subseteq \Phi \) then \(\ge \,n\,r^-.C\) and \(\le \,n\,r^-.C\) are concepts of \(\mathcal {L}_{\Sigma ,\Phi }\),
-
if \(U \in \Phi \) then \(U\) is an object role of \(\mathcal {L}_{\Sigma ,\Phi }\),
-
if \(\mathsf {Self}\in \Phi \) then \(\exists r.\mathsf {Self}\) is a concept of \(\mathcal {L}_{\Sigma ,\Phi }\).
-
3.2 Description logic-based information systems
-
\(\mathcal {A}\) is a finite set, called the ABox (assertion box) of \(\mathcal {KB}\), consisting of individual assertions.
-
\(\mathcal {T}\) is a finite list \((\varphi _1, \varphi _2, \ldots , \varphi _n)\), called the TBox (terminological box) of \(\mathcal {KB}\), where each \(\varphi _i\) is a terminological axiom of one of the the following forms:
-
\(A \equiv C\), where \(A \in \Sigma _C\) is a concept name not occurring in \(C\), \(\mathcal {A}\) and \(\varphi _1, \varphi _2, \ldots , \varphi _{i-1}\),
-
\(r \equiv R\), where \(R\) is an object role of \(\mathcal {L}_{\Sigma ,\Phi }\) and \(r \in \Sigma _{oR}\) is an object role name not occurring in \(R\), \(\mathcal {A}\) and \(\varphi _1, \varphi _2, \ldots , \varphi _{i-1}\).
-
-
\(\Delta ^\mathcal {I}= \Sigma _I\) (i.e., the domain of \(\mathcal {I}\) consists of all the individual names of \(\Sigma \)),
-
if \(A \in \Sigma _C\) is a primitive concept of \(\mathcal {KB}\) then\(A^\mathcal {I}= \{a \mid A(a) \in \mathcal {A}\}\),
-
if \(B\, \in \, \Sigma _A\! \setminus \! \Sigma _C\) then \(B^\mathcal {I}\, : \Delta ^\mathcal {I}\, \rightarrow \, range (B)\) is the partial function such that \(B^\mathcal {I}(a) = c\) if \((B(a) = c) \in \mathcal {A}\),
-
if \(r \in \Sigma _{oR}\) is a primitive object role of \(\mathcal {KB}\) then \(r^\mathcal {I}= \{\left\langle a,b\right\rangle \, \mid r(a,b) \in \mathcal {A}\}\),
-
if \(\sigma \in \Sigma _{dR}\) then \(\sigma ^\mathcal {I}= \{\left\langle a,d\right\rangle \, \mid \sigma (a,d) \in \mathcal {A}\}\),
-
if \(A \equiv C\) is a definition from \(\mathcal {T}\) then \(A^\mathcal {I}= C^\mathcal {I}\),
-
if \(r \equiv R\) is a definition from \(\mathcal {T}\) then \(r^\mathcal {I}= R^\mathcal {I}\),
-
if \(A \in \Sigma _C\) but \(A\) does not occur in \(\mathcal {KB}\) then \(A^\mathcal {I}= \emptyset \),
-
if \(r \in \Sigma _{oR}\) but \(r\) does not occur in \(\mathcal {KB}\) then \(r^\mathcal {I}= \emptyset \).
4 Bisimulation and indiscernibility
-
the set \(\{y \in \Delta ^\mathcal {I}\mid r^\mathcal {I}(x,y)\}\) is finite,
-
if \(I \in \Phi ^\dag \) then the set \(\{y \in \Delta ^\mathcal {I}\mid r^\mathcal {I}(y,x)\}\) is finite.
5 Concept learning
-
\(\mathcal {I}\models C(a)\) for all \(a \in E^+\), and
-
\(\mathcal {I}\models \lnot C(a)\) for all \(a \in E^-\).
-
by the first assertion of Theorem 4, \(C^\mathcal {I}\) should be the union of a number of equivalence classes of \(\Delta ^\mathcal {I}\) w.r.t. \(\sim _{\Sigma ^\dag ,\Phi ^\dag ,\mathcal {I}}\),
-
we should have that \(a^\mathcal {I}\in C^\mathcal {I}\) for all \(a \in E^+\), and \(a^\mathcal {I}\notin C^\mathcal {I}\) for all \(a \in E^-\).
-
\(Y_i\) is characterized by an appropriate concept \(C_i\) (such that \(Y_i = C_i^\mathcal {I}\)),
-
we keep information about whether \(Y_i\) is split by \(A_d^\mathcal {I}\),
-
if \(Y_i \subseteq A_d^\mathcal {I}\) then \( LargestContainer [i] := j\), where \(1 \le j \le n\) is the subscript of the largest block \(Y_j\) such that \(Y_i \subseteq Y_j \subseteq A_d^\mathcal {I}\).
5.1 Selectors
-
\(A\), where \(A \in \Sigma ^\dag _C\),
-
\(A=d\), where \(A \in \Sigma ^\dag _A\!\setminus \!\Sigma ^\dag _C\) and \(d \in range (A)\),
-
\(\exists \sigma .\{d\}\), where \(\sigma \in \Sigma ^\dag _{dR}\) and \(d \in range (\sigma )\),
-
\(\exists r.C_{i_t}\), where \(r \in \Sigma ^\dag _{oR}\) and \(1 \le t \le k\),
-
\(\exists r^-.C_{i_t}\), if \(I \in \Phi ^\dag \), \(r \in \Sigma ^\dag _{oR}\) and \(1 \le t \le k\),
-
\(\{a\}\), if \(O \in \Phi ^\dag \) and \(a \in \Sigma ^\dag _I\),
-
\(\le \,1\,r\), if \(F \in \Phi ^\dag \) and \(r \in \Sigma ^\dag _{oR}\),
-
\(\le \,1\,r^-\), if \(\{F,I\} \subseteq \Phi ^\dag \) and \(r \in \Sigma ^\dag _{oR}\),
-
\(\ge \,l\,r\) and \(\le \,m\,r\), if \(N \in \Phi ^\dag \), \(r \in \Sigma ^\dag _{oR}\), \(0 < l \le \#\Delta ^\mathcal {I}\) and \(0 \le m < \#\Delta ^\mathcal {I}\),
-
\(\ge \,l\,r^-\,\) and \(\,\le \,m\,r^-\,\), if \(\{N\,,I\} \subseteq \, \Phi ^\dag \), \(r\,\in \,\Sigma ^\dag _{oR}\), \(\,0\, <\,l \,\le \, \#\Delta ^\mathcal {I}\) and \(0 \le m < \#\Delta ^\mathcal {I}\),
-
\(\ge \,l\,r.C_{i_t}\) and \(\le \,m\,r.C_{i_t}\), if \(Q\, \in \, \Phi ^\dag \), \(r\, \in \, \Sigma ^\dag _{oR}\), \(1 \le t \le k\), \(0 < l \le \#C_{i_t}^\mathcal {I}\) and \(0 \le m < \#C_{i_t}^\mathcal {I}\),
-
\(\ge \,l\,r^-.C_{i_t}\) and \(\le \,m\,r^-.C_{i_t}\), if \(\{Q,I\} \subseteq \Phi ^\dag \), \(r \in \Sigma ^\dag _{oR}\), \(1 \le t \le k\), \(0 < l \le \#C_{i_t}^\mathcal {I}\) and \(0 \le m < \#C_{i_t}^\mathcal {I}\),
-
\(\exists r.\mathsf {Self}\), if \(\mathsf {Self}\in \Phi ^\dag \) and \(r \in \Sigma ^\dag _{oR}\).
-
Clearly, \(Z(a^\mathcal {I}, a^\mathcal {I})\) always holds for \(a \in \Sigma ^\dag _I\).
-
Consider the assertion (2) and suppose \(Z(x,x')\) holds. Since the partition \(\mathbb {Y}\) cannot be granulated anymore using the concept \(A\), we have \(Y_{i_j} \subseteq A^\mathcal {I}\) or \(Y_{i_j} \cap A^\mathcal {I}= \emptyset \) for any \(1 \le j \le k\). If \(x, x' \in Y_{i_j}\) and \(Y_{i_j} \subseteq A^\mathcal {I}\) then \(x, x' \in A^\mathcal {I}\). If \(x, x' \in Y_{i_j}\) and \(Y_{i_j} \cap A^\mathcal {I}= \emptyset \) then \(x, x' \notin A^\mathcal {I}\). Therefore \(A^\mathcal {I}(x) \Leftrightarrow A^\mathcal {I}(x')\).
-
Consider the assertion (3) and suppose \(Z(x,x')\) holds. Similarly as above, since the partition cannot be granulated anymore using any selector \((B=d)\) with \(d \in range (B)\), we have \(Y_{i_j} \subseteq (B=d)^\mathcal {I}\) or \(Y_{i_j} \cap (B=d)^\mathcal {I}= \emptyset \) for any \(1 \le j \le k\). If \(x, x' \in Y_{i_j}\) and \(Y_{i_j} \subseteq (B=d)^\mathcal {I}\) for some \(d \in range (B)\) then \(x, x' \in (B=d)^\mathcal {I}\), and hence \(B(x) = d = B(x')\). If \(x, x' \in Y_{i_j}\) and \(Y_{i_j} \cap (B=d)^\mathcal {I}= \emptyset \) for all \(d \in range (B)\) then both \(B(x)\) and \(B(x')\) are undefined.
-
Consider the assertion (4) and suppose that \(Z(x,x')\) and \(r^\mathcal {I}(x,y)\) hold. Let \(Y_{i_t}\) be the block of \(\mathbb {Y}\) that contains \(y\). We have that \(x \in (\exists r.C_{i_t})^\mathcal {I}\). Since \(\mathbb {Y}\) cannot be granulated anymore using the selector \((\exists r.C_{i_t})\), it follows that \(x' \in (\exists r.C_{i_t})^\mathcal {I}\). Hence, there exists \(y' \in \Delta ^\mathcal {I}\) such that \(r^\mathcal {I}(x',y')\) holds and \(y' \in C_{i_t}^\mathcal {I}= Y_{i_t}\), which means \(Z(y,y')\).
-
Similarly, the assertion (8) also holds when \(I \in \Phi ^\dag \).
-
Consider the assertion (10) and the case \(N \in \Phi ^\dag \). Suppose \(Z(x,x')\) holds and let \(l = \#\{y \mid r^\mathcal {I}(x,y)\}\). Since the partition \(\mathbb {Y}\) cannot be granulated anymore using the selectors \(\ge \,l\,r\) (when \(l > 0\)) and \(\le \,l\,r\) (when \(l < \#\Delta ^\mathcal {I}\)), we have that \(x' \in (\ge \,l\,r)^\mathcal {I}\) and \(x' \in (\le \,l\,r)^\mathcal {I}\). Therefore \(\#\{y' \mid r^\mathcal {I}(x',y')\} = l\).
-
Consider the assertion (12) and the case \(F \in \Phi ^\dag \). Suppose \(Z(x,x')\) holds. Using the argumentation given for (2) with \(A\) replaced by \((\le \,1\,r)\), we can derive that \(x \in (\le \,1\,r)^\mathcal {I}\) iff \(x' \in (\le \,1\,r)^\mathcal {I}\). Therefore, \(\#\{y \mid r^\mathcal {I}(x,y)\} \le 1 \Leftrightarrow \#\{y' \mid r^\mathcal {I}(x',y')\} \le 1\).
-
Consider the assertion (14) and the case \(Q \in \Phi ^\dag \). Suppose \(Z(x,x')\) holds. Let \(S = \{y \in \Delta ^\mathcal {I}\mid r^\mathcal {I}(x,y)\}\) and \(S' = \{y' \in \Delta ^\mathcal {I}\mid r^\mathcal {I}(x',y')\}\). Clearly, \(S\) and \(S'\) are finite. For the sake of contradiction, suppose there does not exist any bijection \(h:S \rightarrow S'\) such that \(h \subseteq Z\). Thus, there must exist \(1 \le t \le k\) such that \(\#(S\cap Y_{i_t}) = l \ne m = \#(S' \cap Y_{i_t})\). If \(l > m\) then \(x \in (\ge \,l\,r.C_{i_t})^\mathcal {I}\) but \(x' \notin (\ge \,l\,r.C_{i_t})^\mathcal {I}\). If \(m > l\) then \(x \notin (\ge \,m\,r.C_{i_t})^\mathcal {I}\) but \(x' \in (\ge \,m\,r.C_{i_t})^\mathcal {I}\). Therefore, \(x\) and \(x'\) are distinguishable by a basic selector in \(\mathcal {L}_{\Sigma ^\dag ,\Phi ^\dag }\), which contradicts the fact that the partition \(\mathbb {Y}\) cannot be granulated anymore.
-
\(A \le d\) and \(A < d\), where \(A \in \Sigma ^\dag _{nA}\), \(d \in range (A)\) and \(d\) is not a minimal element of \( range (A)\),
-
\(A \ge d\) and \(A > d\), where \(A \in \Sigma ^\dag _{nA}\), \(d \in range (A)\) and \(d\) is not a maximal element of \( range (A)\),
-
\(\exists r.\top \), \(\exists r.C_i\) and \(\forall r.C_i\), where \(r\,\in \,\Sigma ^\dag _{oR}\) and \(1 \le i \le n\),
-
\(\exists r^-.\top \), \(\exists r^-.C_i\) and \(\forall r^-.C_i\), if \(I \in \Phi ^\dag \), \(r \in \Sigma ^\dag _{oR}\) and \(1 \le i \le n\),
-
\(\ge \,l\,r.C_i\) and \(\le \,m\,r.C_i\), if \(Q \in \Phi ^\dag \), \(r \in \Sigma ^\dag _{oR}\), \(1 \le i \le n\), \(0 < l \le \#C_i^\mathcal {I}\) and \(0 \le m < \#C_i^\mathcal {I}\),
-
\(\ge \,l\,r^-.C_i\) and \(\le \,m\,r^-.C_i\), if \(\{Q,I\} \subseteq \Phi ^\dag \), \(r \in \Sigma ^\dag _{oR}\), \(1 \le i \le n\), \(0 < l \le \#C_i^\mathcal {I}\) and \(0 \le m < \#C_i^\mathcal {I}\).
-
the length of the resulting concept is usually long,
-
the accuracy, precision, recall and F1 measures of the resulting classifier are not high for new objects.
-
\(\exists r.D_u\) and \(\forall r.D_u\), where \(r \in \Sigma ^\dag _{oR}\) and \(D_u \in \mathbb {D}\),
-
\(\exists r^-\,.D_u\) and \(\forall r^-\,.D_u\), if \(I \in \Phi ^\dag \), \(r \in \Sigma ^\dag _{oR}\) and \(D_u \in \mathbb {D}\),
-
\(\ge \,l\,r.D_u\) and \(\le \,m\,r.D_u\), if \(Q \in \Phi ^\dag \), \(r \in \Sigma ^\dag _{oR}\), \(D_u \in \mathbb {D}\), \(0 < l \le \#D_u^\mathcal {I}\) and \(0 \le m < \#D_u^\mathcal {I}\),
-
\(\ge \,l\,r^-.D_u\) and \(\le \,m\,r^-.D_u\), if \(\{Q, I\} \subseteq \Phi ^\dag \), \(r \in \Sigma ^\dag _{oR}\), \(D_u \in \mathbb {D}\), \(0 < l \le \#D_u^\mathcal {I}\) and \(0 \le m < \#D_u^\mathcal {I}\).
5.2 Simplicity of concepts
-
\(0\) if \(C\) is of the form \(\top \), \(\bot \), \(A\), \(A=d\), \(A \ne d\), \(A>d\), \(A \ge d\), \(A<d\) or \(A \le d\),
-
\(\mathsf{mdepth }(D)\) if \(C\) is the normal form of \(\lnot D\),
-
\(1\) if \(C\) is of the form \(\exists \sigma .\{d\}\), \(\exists r.\mathsf {Self}\), \(\ge \,n\,R\) or \(\le \,n\,R\),
-
\(1 + \mathsf{mdepth }(D)\) if \(C\) is of the form \(\exists R.D\), \(\forall R.D\), \(\ge \,n\,R.D\) or \(\le \,n\,R.D\),
-
\(\max \{ {\mathsf {mdepth}}(D_1), {\mathsf {mdepth}}(D_2), \ldots , {\mathsf {mdepth}}(D_n)\}\) if \(C\) is of the form \(\sqcap \{D_1, D_2, \ldots , D_n\}\) or \(\sqcup \{D_1, D_2, \ldots , D_n\}\).
-
\(0\) if \(C\) is \(\top \) or \(\bot \),
-
\(1\) if \(C\) is of the form \(A\), \(A=d\), \(A \ne d\), \(A>d\), \(A \ge d\), \(A<d\) or \(A \le d\),
-
\(\mathsf{length }(D)\) if \(C\) is the normal form of \(\lnot D\),
-
\(3\) if \(C\) is of the form \(\exists \sigma .\{d\}\), \(\exists r.\mathsf {Self}\), \(\ge n\,R\) or \(\le n\,R\),
-
\(2 + \mathsf{length }(D)\) if \(C\) is of the form \(\exists R.D\) or \(\forall R.D\),
-
\(3 + \mathsf{length }(D)\) if \(C\) is of the form \(\ge \,n\,R.D\) or \(\le \,n\,R.D\),
-
\(1+ \mathsf{length }(D_1) + \mathsf{length }(D_2) + \cdots + \mathsf{length }(D_n)\) if \(C\) is of the form \(\sqcap \{D_1, D_2, \ldots , D_n\}\) or \(\sqcup \{D_1, D_2, \ldots , D_n\}\).
-
\(\mathsf{mdepth }(\sqcap \{\lnot A, \ge \,2\,r.(\exists r.\top ), \exists s.B\}) = 2\),
-
\(\mathsf{length }(\sqcap \{\lnot A, \ge \,2\,r.(\exists r.\top ), \exists r.B\}) = 10\).
-
\(\mathsf{length }(C) < \mathsf{length }(D)\), or
-
\(\mathsf{length }(C) \!=\! \mathsf{length }(D)\) and \(\mathsf{mdepth }(C) {\le } \mathsf{mdepth }(D)\).
5.3 Information gain in the context of DLs
5.4 A bisimulation-based concept learning algorithm
chooseBlockSelector
in Step 4 is used to choose the best block and selector at the current moment in the granulation process. This function applies information gain measure while taking into account also simplicity of selectors. Suppose that we have the current partition \(\mathbb {Y}= \{Y_{i_1}, Y_{i_2}, \ldots , Y_{i_k}\}\) and the current set of selectors \(\mathbb {D}=\{D_1, D_2, \ldots , D_h\}\).-
\(\bigsqcup \mathbb {C}= C_{l_1} \sqcup C_{l_2} \sqcup \cdots \sqcup C_{l_p}\), if \(\mathbb {C}= \{C_{l_1}, C_{l_2}, \ldots , C_{l_p}\}\),
-
\(\bigsqcup \mathbb {C}= \bot \), if \(\mathbb {C}= \emptyset \).
Simplify
can use the following techniques:-
The first technique is pruning. Given a decision tree generated by the granulation process, it allows to reduce the size of the tree by removing parts that provide little power in classifying objects. Pruning can be done top-down or bottom-up. For example, using the bottom-up fashion, one can repeatedly choose a node whose successors are leaves and cut off the successors if the average accuracy of the resulting concept is not worse on the training and validating datasets.
-
The second technique is based on replacement. The resulting concept is usually a disjunction \(C_1 \sqcup C_2 \sqcup \cdots \sqcup C_n\). In the case \(C_i\) is a too complex concept, we consider replacing it by a simpler one from the set of selectors if they have the same denotation in the considered information system. The replacement is done only when the accuracy of the resulting concept is not worse on the validating dataset.
-
The third technique is simplification. De Morgan’s laws are used to reduce the resulting concept to an equivalent one.
6 Illustrative examples
-
\(Y_2 := \{ P _1, P _4, P _6\}\), \(C_2 := Awarded \)
-
\(Y_3 := \{ P _2, P _3, P _5\}\), \(C_3 := \lnot Awarded \)
-
\( LargestContainer [3] := 3\) (as \(Y_3\) is not split by \(E\))
-
\(partition := \{Y_2, Y_3\}\)
-
\(Y_4 := \{ P _1\}\), \(C_4 := C_2 \sqcap \lnot \exists cited\,\,\_by .\top \)
-
\( LargestContainer [4] := 4\) (as \(Y_4\) is not split by \(E\))
-
\(Y_5 := \{ P _4, P _6\}\), \(C_5 := C_2 \sqcap \exists cited\,\,\_by .\top \)
-
\( LargestContainer [5] := 5\) (as \(Y_5\) is not split by \(E\))
-
\(partition := \{Y_3, Y_4, Y_5\}\)
-
\(Y_2 := \{ P _1, P _4, P _6\}\), \(C_2 := Awarded \)
-
\(Y_3 := \{ P _2, P _3, P _5\}\), \(C_3 := \lnot Awarded \)
-
\( LargestContainer [3] := 3\) (as \(Y_3\) is not split by \(E\))
-
\(partition := \{Y_2, Y_3\}\).
-
\(Y_4 := \{ P _1\}\), \(C_4 := C_2 \sqcap ( Year \ge 2009)\)
-
\( LargestContainer [4] := 4\) (as \(Y_4\) is not split by \(E\))
-
\(Y_5 := \{ P _4, P _6\}\), \(C_5 := C_2 \sqcap ( Year < 2009)\)
-
\( LargestContainer [5] := 5\) (as \(Y_5\) is not split by \(E\))
-
\(partition := \{Y_3, Y_4, Y_5\}\).
7 Experimental results
-
the average (Avg.) modal depth (Dep.) of the origin concepts (Org.),
-
the average length (Len.) of the origin concepts,
-
the average modal depth of the resulting concepts (Res.),
-
the average length of the resulting concepts,
-
the average accuracy (Acc.), precision (Pre.), recall (Rec.) and F1 measures,
-
the standard variant, minimum (Min) and maximum (Max) values of accuracy, precision, recall and F1 measures.
Avg. Dep. Res./Org. | Avg. Len. Res./Org. | Avg. Acc. [min; max] | Avg. Pre. [min; max] | Avg. Rec. [min; max] | Avg. F1 [min; max] | |
---|---|---|---|---|---|---|
WebKB dataset | ||||||
Simple selectors | 0.82/1.02 | 6.81/4.41 |
\(93.84\pm 13.50\)
|
\(92.09\pm 17.04\)
|
\(92.82\pm 17.32\)
|
\(91.59\pm 16.68\)
|
[33.69; 100.0] | [32.08; 100.0] | [23.08; 100.0] | [27.69; 100.0] | |||
Simple and extended | 0.84/1.02 | 3.40/4.41 |
\(94.60\pm 12.20\)
|
\(92.81\pm 15.93\)
|
\(93.14\pm 17.17\)
|
\(92.33\pm 16.17\)
|
Selectors | [33.69; 100.0] | [32.08; 100.0] | [23.08; 100.0] | [27.69; 100.0] | ||
PokerHand dataset | ||||||
Simple selectors | 1.41/2.60 | 37.02/15.97 |
\(97.17\pm 08.61\)
|
\(95.96\pm 14.99\)
|
\(94.95\pm 14.40\)
|
\(94.66\pm 14.64\)
|
[50.57; 100.0] | [01.67; 100.0] | [01.67; 100.0] | [01.67; 100.0] | |||
Simple and extended | 1.23/2.60 | 3.47/15.97 |
\(99.44\pm 02.15\)
|
\(98.68\pm 09.08\)
|
\(98.06\pm 09.58\)
|
\(98.18\pm 09.14\)
|
Selectors | [83.25; 100.0] | [01.67; 100.0] | [01.67; 100.0] | [01.67; 100.0] | ||
Family dataset | ||||||
Simple selectors | 2.38/3.34 | 78.50/18.59 |
\(88.50\pm 16.65\)
|
\(90.60\pm 18.57\)
|
\(85.66\pm 22.36\)
|
\(86.09\pm 20.10\)
|
[27.91; 100.0] | [04.55; 100.0] | [07.69; 100.0] | [08.70; 100.0] | |||
Simple and extended | 2.29/3.34 | 10.20/18.59 |
\(92.79\pm 14.35\)
|
\(91.99\pm 18.40\)
| 91.75 \(\pm \)19.82 |
\(90.39\pm 19.89\)
|
Selectors | [27.91; 100.0] | [04.55; 100.0] | [07.69; 100.0] | [08.70; 100.0] |
Dep. Res. | Len. Res. | Avg. Acc. [min; max] | Avg. Pre. [min; max] | Avg. Rec. [min; max] | Avg. F1 [min; max] | |
---|---|---|---|---|---|---|
Concept: \(Grandparent = \exists hasChild.(\exists hasChild.\top )\)
| ||||||
Simple selectors | 2.00 | 4.00 |
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
[100.0; 100.0] | [100.0; 100.0] | [100.0; 100.0] | [100.0; 100.0] | |||
Simple and extended | 2.00 | 4.00 |
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
Selectors | [100.0; 100.0] | [100.0; 100.0] | [100.0; 100.0] | [100.0; 100.0] | ||
Concept: \(Grandfather = Male \sqcap \exists hasChild.(\exists hasChild.\top )\)
| ||||||
Simple selectors | 2.00 | 36.00 |
\(95.90\pm 01.39\)
|
\(87.38\pm 06.81\)
|
\(79.15\pm 17.38\)
|
\(81.44\pm 08.35\)
|
[94.28; 97.67] | [80.00; 96.43] | [57.45; 100.0] | [72.00; 92.31] | |||
Simple and extended | 2.00 | 07.00 |
\(99.46\pm 00.77\)
|
\(100.0\pm 00.00\)
|
\(95.74\pm 6.02\)
|
\(97.73\pm 03.21\)
|
Selectors | [98.37; 100.0] | [100.0; 100.0] | [87.23; 100.0] | [93.18; 100.0] | ||
Concept: \(Grandmother = Female \sqcap \exists hasChild.(\exists hasChild.\top )\)
| ||||||
Simple selectors | 2.00 | 18.00 |
\(89.74\pm 01.30\)
|
\(100.0\pm 00.00\)
|
\(15.32\pm 04.47\)
|
\(26.31\pm 06.85\)
|
[88.37; 91.49] | [100.0; 100.0] | [09.30; 20.00] | [17.02; 33.33] | |||
Simple and extended | 2.00 | 07.00 |
\(99.91\pm 00.13\)
|
\(100.0\pm 00.00\)
|
\(99.22\pm 01.10\)
|
\(99.61\pm 00.55\)
|
Selectors | [99.73; 100.0] | [100.0; 100.0] | [97.67; 100.0] | [98.82; 100.0] | ||
Concept: \(Niece = Female \sqcap \exists hasChild^-.(\exists hasBrother.\top \sqcup \exists hasSister.\top )\)
| ||||||
Simple selectors | 3.00 | 151.00 |
\(85.57\pm 09.47\)
|
\(57.92\pm 32.09\)
|
\(64.70\pm 29.35\)
|
\(60.69\pm 31.33\)
|
[72.21; 93.02] | [12.66; 83.33] | [23.26; 87.50] | [16.39; 83.33] | |||
Simple and extended | 2.00 | 11.00 |
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
Selectors | [100.0; 100.0] | [100.0; 100.0] | [100.0; 100.0] | [100.0; 100.0] | ||
Concept: \(Nephew = Male \sqcap \exists hasChild^-.(\exists hasBrother.\top \sqcup \exists hasSister.\top )\)
| ||||||
Simple selectors | 3.00 | 178.00 |
\(91.40\pm 05.74\)
|
\(77.04\pm 26.30\)
|
\(88.40\pm 01.99\)
|
\(79.82\pm 17.72\)
|
[83.38; 95.74] | [40.22; 100.0] | [86.05; 90.91] | [54.81; 93.75] | |||
Simple and extended | 2.00 | 11.00 |
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
Selectors | [100.0; 100.0] | [100.0; 100.0] | [100.0; 100.0] | [100.0; 100.0] |
Dep. Res. | Len. Res. | Avg. Acc. [min; max] | Avg. Pre. [min; max] | Avg. Rec. [min; max] | Avg. F1 [min; max] | |
---|---|---|---|---|---|---|
One pair | ||||||
Simple selectors | 4.0 | 109.00 |
\(42.57\pm 01.48\)
|
\(16.74\pm 00.87\)
|
\(76.00\pm 4.03\)
|
\(27.44\pm 01.42\)
|
[40.71; 45.24] | [15.64; 18.05] | [71.67; 81.67] | [25.67; 29.45] | |||
Simple and extended selectors | 5.00 | 15.00 |
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
[100.0; 100.0] | [100.0; 100.0] | [100.0; 100.0] | [100.0; 100.0] | |||
Two pairs | ||||||
Simple selectors | 4.00 | 25.00 |
\(36.33\pm 00.47\)
|
\(17.16\pm 0.53\)
|
\(90.33\pm 4.14\)
|
\(28.83\pm 00.96\)
|
[35.48; 36.67] | [16.34; 17.70] | [83.33; 95.00] | [27.32; 29.84] | |||
Simple and extended selectors | 5.00 | 15.00 |
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
[100.0; 100.0] | [100.0; 100.0] | [100.0; 100.0] | [100.0; 100.0] | |||
Three of a kind | ||||||
Simple selectors | 4.00 | 48.00 |
\(52.52\pm 02.16\)
|
\(20.92\pm 1.01\)
|
\(83.33\pm 01.83\)
|
\(33.43\pm 01.39\)
|
[50.71; 56.67] | [19.75; 22.77] | [80.00; 85.00] | [31.68; 35.92] | |||
Simple and extended selectors | 3.00 | 11.00 |
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
[100.0; 100.0] | [100.0; 100.0] | [100.0; 100.0] | [100.0; 100.0] | |||
Straight | ||||||
Simple selectors | 5.00 | 97.00 |
\(81.24\pm 02.01\)
|
\(39.65\pm 04.62\)
|
\(58.33\pm 04.94\)
|
\(47.13\pm 4.41\)
|
[80.00; 85.24] | [36.36; 48.72] | [53.33; 65.00] | [43.24; 55.07] | |||
Simple and extended selectors | 5.00 | 32.00 |
\(98.67\pm 00.68\)
|
\(96.35\pm 03.44\)
|
\(94.33\pm 02.00\)
|
\(95.31\pm 02.35\)
|
[97.62; 99.52] | [90.32; 100.0] | [91.67; 96.67] | [91.80; 98.31] | |||
Flush | ||||||
Simple selectors | 2.00 | 10.00 |
\(94.33\pm 00.80\)
|
\(71.71\pm 02.79\)
|
\(100.0\pm 00.00\)
|
\(83.49\pm 01.92\)
|
[92.86; 95.24] | [66.67; 75.00] | [100.0; 100.0] | [80.00; 85.71] | |||
Simple and extended selectors | 3.00 | 7.00 |
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
[100.0; 100.0] | [100.0; 100.0] | [100.0; 100.0] | [100.0; 100.0] | |||
Full house | ||||||
Simple selectors | 4.00 | 68.00 |
\(60.48\pm 03.05\)
|
\(25.95\pm 01.45\)
|
\(94.67\pm 2.45\)
|
\(40.71\pm 01.73\)
|
[57.62; 64.76] | [24.23; 28.00] | [91.67; 98.33] | [38.33; 43.08] | |||
Simple and extended selectors | 2.00 | 6.00 |
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
\(100.0\pm 00.00\)
|
[100.0; 100.0] | [100.0; 100.0] | [100.0; 100.0] | [100.0; 100.0] |