1 Introduction
2 Glass Identification Database
2.1 Description of the data
Class | 1 | 2 | 3 | 5 | 6 | 7 |
---|---|---|---|---|---|---|
Frequency | 70 | 76 | 17 | 13 | 9 | 29 |
2.2 Learning models on the glass data
mlr3 [18] has been used. The steps are described in the following paragraphs.

Algorithm | Parameter | Lower | Upper | Scale |
---|---|---|---|---|
nn | Decay | 0.00001 | 10 | Log |
nn | Size | 1 | 20 | Linear |
rf | Num.trees | 64 | 2048 | Log |
rf | mtry | 1 | 9 | Linear |
Algorithm | Parameter | Values |
---|---|---|
nn | Decay | 0.10 |
nn | Size | 18 |
rf | Num.trees | 138 |
rf | mtry | 2 |
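The search space in the first table mixes log-scaled and linear-scaled parameters. The original tuning was done in R with mlr3; purely as an illustration of the scale handling — not the authors' code — the sampling step can be sketched in Python (parameter keys and the integer-rounding convention are assumptions):

```python
import math
import random

random.seed(1)

# Search space from the table above: (lower, upper, scale) per parameter.
SPACE = {
    ("nn", "decay"):     (1e-5, 10.0, "log"),
    ("nn", "size"):      (1, 20, "linear"),
    ("rf", "num.trees"): (64, 2048, "log"),
    ("rf", "mtry"):      (1, 9, "linear"),
}

def sample(lower, upper, scale):
    """Draw one value, uniformly on the raw scale or on the log scale."""
    if scale == "log":
        return math.exp(random.uniform(math.log(lower), math.log(upper)))
    return random.uniform(lower, upper)

def sample_config(algorithm):
    """Draw one candidate configuration for the given algorithm."""
    cfg = {}
    for (algo, name), (lo, hi, scale) in SPACE.items():
        if algo != algorithm:
            continue
        val = sample(lo, hi, scale)
        if isinstance(lo, int):   # integer parameters are rounded
            val = int(round(val))
        cfg[name] = val
    return cfg

cfg = sample_config("rf")
assert 64 <= cfg["num.trees"] <= 2048 and 1 <= cfg["mtry"] <= 9
```

Sampling on the log scale spreads candidates evenly across orders of magnitude, which is why decay and the number of trees use it while the small integer ranges stay linear.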
3 Explainability
3.1 Partial dependence plots
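A partial dependence function averages the model's prediction while the features of interest \(X_s\) are clamped to grid values. As a minimal Python sketch (not from the paper; the toy softmax model and the grid stand in for the fitted learner and its feature range):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))   # toy data with 3 features

def predict_proba(X):
    """Toy 3-class softmax model standing in for the fitted learner."""
    logits = X @ np.array([[1.0, -0.5, 0.2],
                           [0.3,  0.8, -1.0],
                           [-0.7, 0.1, 0.6]])
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def partial_dependence(X, feature, grid):
    """Average prediction with `feature` clamped to each grid value."""
    rows = []
    for v in grid:
        Xv = X.copy()
        Xv[:, feature] = v
        rows.append(predict_proba(Xv).mean(axis=0))  # one value per class
    return np.array(rows)   # shape (len(grid), n_classes)

grid = np.linspace(-2, 2, 11)
pdp = partial_dependence(X, feature=0, grid=grid)
# For a probabilistic classifier the class-wise PD curves sum to one
# at every grid point, since each averaged prediction sums to one.
assert np.allclose(pdp.sum(axis=1), 1.0)
```

Each column of `pdp` is one class-wise PD curve; plotting the columns against the grid gives the familiar one-dimensional partial dependence plot.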
3.2 A measure of explainability
3.3 Multiclass extension of explainability
4 Application to the glass data
4.1 Partial dependence plots
The partial dependence plots have been computed using the DALEX framework [3]. Note that all predicted posterior probabilities sum to one, which results in the observed scale of the plots.

4.2 Explainability of the plots
Class | 1 | 2 | 3 | 5 | 6 | 7 |
---|---|---|---|---|---|---|
RI | 0.240 | 0.108 | − 0.171 | 0.037 | 0.034 | 0.018 |
Na | 0.050 | 0.083 | 0.018 | 0.006 | 0.152 | 0.152 |
Mg | 0.265 | 0.026 | 0.104 | 0.246 | 0.187 | 0.152 |
Al | 0.306 | 0.084 | − 0.006 | 0.103 | − 0.010 | 0.186 |
Si | 0.016 | 0.025 | 0.041 | 0.016 | 0.015 | 0.016 |
K | 0.053 | 0.092 | 0.008 | 0.023 | 0.247 | 0.029 |
Ca | 0.070 | 0.205 | 0.058 | 0.178 | 0.025 | 0.016 |
Ba | 0.044 | 0.066 | 0.021 | 0.002 | − 0.024 | 0.466 |
Fe | − 0.001 | 0.022 | 0.000 | 0.002 | 0.007 | 0.007 |
Class | 1 | 2 | 3 | 5 | 6 | 7 |
---|---|---|---|---|---|---|
Step 1 | Al | Ca | Mg | Mg | K | Ba |
Step 2 | Mg | RI | Ca | Na | Ba | Al |
\(\hat{\varUpsilon }\) | 0.523 | 0.350 | 0.226 | 0.470 | 0.324 | 0.643 |
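The table above suggests a stepwise (greedy forward) choice of features per class. A generic sketch of such a selection loop follows; the score function here is a hypothetical additive stand-in (the values are the class-1 column of the preceding table), not the actual measure \(\hat{\varUpsilon }\):

```python
def greedy_selection(features, score, k=2):
    """Greedily add, k times, the feature that most improves the score."""
    chosen = []
    for _ in range(k):
        remaining = [f for f in features if f not in chosen]
        best = max(remaining, key=lambda f: score(chosen + [f]))
        chosen.append(best)
    return chosen

# Class-1 explainability values from the table above, used as a toy
# additive score; the real measure is not additive in general.
toy = {"RI": 0.240, "Na": 0.050, "Mg": 0.265, "Al": 0.306, "Si": 0.016}
print(greedy_selection(list(toy), lambda S: sum(toy[f] for f in S)))
# → ['Al', 'Mg']
```

With this additive toy score the two largest contributions are picked first, matching the Step 1/Step 2 entries for class 1 in the table.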
The two-dimensional PDPs have been computed with the iml framework [20]. The corresponding heat maps are given in Fig. 6. From the color scale it can easily be recognized that the classes have different prior probabilities. Such two-dimensional PDPs allow the detection of interactions, but unfortunately no visualization of PDPs with \(dim(X_s) > 2\) is possible, and thus higher-order interactions, as they might have been identified by the underlying model, stay hidden. Both measures \(\hat{\varUpsilon }\) and \(\hat{\varUpsilon }^{MC}\) help to quantify the degree to which the model is explained by its corresponding partial dependence plots.
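A two-dimensional PDP clamps a pair of features jointly and averages the predictions over the data; the resulting surface is what the heat maps display. A minimal sketch (the toy model with an explicit interaction and the grid are assumptions, not the iml code):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(150, 3))

def predict(X):
    """Toy model with an explicit interaction between features 0 and 1."""
    return X[:, 0] * X[:, 1] + 0.5 * X[:, 2]

def partial_dependence_2d(X, f1, f2, grid1, grid2):
    """PD surface: clamp the feature pair to each grid cell and average."""
    surface = np.empty((len(grid1), len(grid2)))
    for i, v1 in enumerate(grid1):
        for j, v2 in enumerate(grid2):
            Xv = X.copy()
            Xv[:, f1] = v1
            Xv[:, f2] = v2
            surface[i, j] = predict(Xv).mean()
    return surface

g = np.linspace(-2, 2, 9)
surf = partial_dependence_2d(X, 0, 1, g, g)
# A surface that is not a sum of one-dimensional profiles indicates an
# interaction; here the surface is approximately v1 * v2 + const.
```

An additive model would yield a surface decomposable into row and column effects; the saddle shape produced by the product term is exactly the kind of pattern the heat maps make visible.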