1 Introduction
2 Expressive Explainable Artificial Intelligence
3 Theoretical Background
3.1 GradCAM
3.2 Concept Vector Analysis
3.3 Metric Learning
3.4 Inductive Logic Programming
Positive and negative examples are given as ground facts (e.g. okay(e1), not okay(e2)), where the predicate indicates membership to either the positive or the negative class. The BK literals serve to further describe the examples symbolically (e.g. contains(e1, p42), left_of(p42, p43), stating that example e1 contains part p42, that p42 is left of p43, etc.). ILP methods are generally designed to induce hypotheses that classify as many positive examples as possible as positive while avoiding classifying negative examples as positive. The hypotheses are constructed by using predicates from the BK to form a set of first-order logic clauses, for example:

okay(A) :- contains(A, B), contains(A, C), left_of(B, C).

A clause consists of a conclusion (the head), the implication sign :- (a left-facing arrow) and the conjunctively connected preconditions (the conjunction is expressed by a comma). The rule ends with a period. The above example can be interpreted as follows: instance A is okay if it contains parts B and C and B is left of C in the image instance.
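To make the coverage semantics concrete, the following minimal Python sketch encodes the example facts above and tests whether the induced rule covers an instance. The data and the function name are hypothetical illustrations, not part of the original method:

```python
# Minimal sketch of ILP coverage semantics (hypothetical example data).
# Facts are stored as tuples; the induced rule
#   okay(A) :- contains(A, B), contains(A, C), left_of(B, C).
# is checked by searching for parts B, C satisfying all preconditions.

contains = {("e1", "p42"), ("e1", "p43"), ("e2", "p7")}
left_of = {("p42", "p43")}

def covers_okay(a: str) -> bool:
    """True if the rule body is satisfiable for instance a."""
    parts = [p for (e, p) in contains if e == a]
    return any((b, c) in left_of for b in parts for c in parts)

print(covers_okay("e1"))  # True  -> the positive example is covered
print(covers_okay("e2"))  # False -> the negative example is not covered
```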
4 SymMetric: Finding Verbal Explanations for CNN Classification Results
4.1 Most Important Feature Vectors for Symbolic Explanations
4.2 Low-Time-Budget User Labeling
Metric learning is then applied to the user-labeled feature vectors \(\text{Lab}\) to find a transformation \(\theta\) that brings feature vectors of similar semantics closer together according to the cosine similarity. Further, the mean vector \(\overline{\text{Lab}_c}\) is calculated for every label \(c \in C\) over the labeled vectors. \(\theta\) is then applied to all most important feature vectors in \(V^*\) of all image instances in the neighborhood S. That way, we obtain a space in which the unlabeled feature vectors also lie closer to their semantically similar vectors in terms of the cosine similarity.
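The concrete form of \(\theta\) is not fixed by the description above. As a minimal sketch, assuming a linear \(\theta\) trained in PyTorch with a cosine-embedding objective (the loss, pair sampling, dimensions and hyperparameters are illustrative assumptions):

```python
import torch
import torch.nn as nn

# Sketch: learn a linear transformation theta on the labeled feature
# vectors Lab so that vectors with equal labels become more cosine-similar
# and vectors with different labels less so. Data and training setup
# below are placeholders, not the paper's exact specification.

d = 512                                  # patch feature dimension (assumed)
lab_vectors = torch.randn(40, d)         # labeled feature vectors Lab (placeholder)
lab_labels = torch.randint(0, 3, (40,))  # user-provided concept labels

theta = nn.Linear(d, d, bias=False)      # linear form of theta is an assumption
loss_fn = nn.CosineEmbeddingLoss()
opt = torch.optim.Adam(theta.parameters(), lr=1e-3)

for _ in range(200):
    i = torch.randint(0, len(lab_vectors), (64,))
    j = torch.randint(0, len(lab_vectors), (64,))
    # target +1 for pairs with the same label, -1 otherwise
    target = (lab_labels[i] == lab_labels[j]).float() * 2 - 1
    loss = loss_fn(theta(lab_vectors[i]), theta(lab_vectors[j]), target)
    opt.zero_grad()
    loss.backward()
    opt.step()

# theta can now be applied to all most important feature vectors V*,
# and per-label mean vectors of the transformed Lab can be computed.
mean_vectors = {c.item(): theta(lab_vectors[lab_labels == c]).mean(dim=0)
                for c in lab_labels.unique()}
```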
4.3 Symbolic Explanation Generation with ILP

For each image, we write literals such as contains(s1, p1) and concept(p1, eye) to state that image s1 contains an image patch p1 that is of the concept eye. To enrich the background knowledge for the images, we additionally find spatial relations between the sub-concepts. For now, we only use the four relations left_of, right_of, top_of and bottom_of. Literals such as left_of(p1, p2) or top_of(p3, p2) are then added to the BK. We find these in a straightforward fashion by comparing the coordinates of the found sub-concepts: e.g. for the top_of relation we check whether a sub-concept is present in the 45° sector facing upwards in the image (a sketch of this test follows below). Finally, Aleph is used to generate explanations by inducing first-order logic rules.

4.4 Algorithm
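The overall procedure chains the previous steps: extract the most important feature vectors (Sect. 4.1), obtain user labels and learn \(\theta\) (Sect. 4.2), and assign sub-concepts and spatial relations before running Aleph (Sect. 4.3). As a minimal sketch of the spatial-relation extraction, assuming parts are given as named image coordinates and using a ±22.5° sector test (the helper name and the exact sector handling are assumptions):

```python
import math

# Sketch of the spatial-relation extraction from Sect. 4.3: for each
# ordered pair of located sub-concepts, emit a relation literal if the
# second part lies in the 45-degree sector of the corresponding direction.
# Coordinates and helper names are illustrative assumptions.

def relation(p, q):
    """Return the BK relation literal between parts p and q, if any.

    p, q are (name, x, y) in image coordinates (y grows downwards)."""
    dx, dy = q[1] - p[1], q[2] - p[2]
    angle = math.degrees(math.atan2(-dy, dx)) % 360  # 0 deg = right, CCW
    sectors = [("right_of", 0), ("top_of", 90),
               ("left_of", 180), ("bottom_of", 270)]
    for name, center in sectors:
        if min(abs(angle - center), 360 - abs(angle - center)) <= 22.5:
            return f"{name}({q[0]}, {p[0]})."   # e.g. top_of(q, p).
    return None

parts = [("p1", 40, 30), ("p2", 42, 80)]  # e.g. an eye above a mouth
print(relation(parts[0], parts[1]))       # bottom_of(p2, p1).
print(relation(parts[1], parts[0]))       # top_of(p1, p2).
```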
5 Conducted Experiments and Results
| | Accuracy | F1 score |
| --- | --- | --- |
| Train data | 0.9713 | 0.9706 |
| Test data | 1.0 | 1.0 |
| | Eye | Ear | Whiskers |
| --- | --- | --- | --- |
| Without metric learning | 0.31 (0.10) | 0.33 (0.11) | 0.25 (0.06) |
| With metric learning | 0.35 (0.12) | 0.34 (0.12) | 0.29 (0.07) |
We locate the sub-concepts (eye, nose, mouth) on each image \(s \in S\) by finding the location of the image patch whose feature vector is most similar to the sub-concept vector (see Sect. 4.3). For the eyes, which appear twice in each image, we have to deviate slightly from this approach. In order to obtain two points for the two eyes, we do not take the single closest vector among all vectors of the eye cluster. Instead, we find the two “super patches” that are formed by coherent patches but whose member patches have no neighbors from the respective other super patch. For each image, we can now write literals like face(s1) or not face(s2), depending on whether \(s_i\) represents a positive or a negative image instance. We further write background knowledge for the occurrence of a sub-concept in an image (e.g. contains(s1, p1), concept(p1, eye), etc.) as well as the derived spatial relations between the sub-concepts (e.g. left_of(p1, p2), bottom_of(p2, p3)). The last step consists of inducing general rules from the examples and the BK. See Fig. 4 for the rules as well as intermediate data collected during the experiment.
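The localization step, including the two-eye handling, can be sketched as follows, assuming patch features on a regular grid and a fixed similarity threshold (the function names, threshold and the use of SciPy's connected-component labelling are assumptions, not the paper's exact procedure):

```python
import numpy as np
from scipy import ndimage

# Sketch of sub-concept localization: each image is a grid of patches with
# feature vectors; the best-matching patch is the one whose (transformed)
# feature vector is most cosine-similar to the sub-concept vector. For the
# eyes, the two largest connected components ("super patches") of
# high-similarity patches are taken instead of the single best patch.

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

def locate(features, concept_vec, two_instances=False, thresh=0.7):
    """features: (H, W, d) patch feature grid; returns patch coordinates."""
    H, W, _ = features.shape
    sim = np.array([[cosine(features[i, j], concept_vec) for j in range(W)]
                    for i in range(H)])
    if not two_instances:
        return [np.unravel_index(np.argmax(sim), sim.shape)]
    # connected components of coherent high-similarity patches
    comps, n = ndimage.label(sim >= thresh)
    sizes = ndimage.sum(sim >= thresh, comps, index=range(1, n + 1))
    best = np.argsort(sizes)[-2:] + 1        # the two largest super patches
    return [tuple(np.array(np.nonzero(comps == c)).mean(axis=1).astype(int))
            for c in best]
```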
The induced rule requires B to be an eye and C to be a mouth. This is consistent with the construction of the positive examples, i.e. a normal face.