Open Access 2023 | Original Paper | Book Chapter

LEMON: Alternative Sampling for More Faithful Explanation Through Local Surrogate Models

Authors: Dennis Collaris, Pratik Gajane, Joost Jorritsma, Jarke J. van Wijk, Mykola Pechenizkiy

Published in: Advances in Intelligent Data Analysis XXI

Publisher: Springer Nature Switzerland


Abstract

Local surrogate learning is a popular and successful method for machine learning explanation. It uses synthetic transfer data to approximate a complex reference model. The sampling technique used for this transfer data has a significant impact on the provided explanation, but remains relatively unexplored in literature. In this work, we explore alternative sampling techniques in pursuit of more faithful and robust explanations, and present LEMON: a sampling technique that samples directly from the desired distribution instead of reweighting samples as done in other explanation techniques (e.g., LIME). Next, we evaluate our technique in a synthetic and UCI dataset-based experiment, and show that our sampling technique yields more faithful explanations compared to current state-of-the-art explainers.

1 Introduction

Explainable artificial intelligence (XAI) is important in high-impact domains such as credit scoring, employment and housing [5, 9, 12]. In these fields, incorrect model behavior may lead to additional direct costs, opportunity costs, as well as unfavorable bias and discrimination. XAI techniques can help identify and alleviate such problems [1]. Let us consider a real-world example: recent work has shown that, for commercial face classification services, the accuracy of gender classification on dark-skinned females is significantly worse than on any other group [8]. This discrepancy was conjectured to be largely due to unrepresentative training datasets and imbalanced test benchmarks. However, using explanation techniques, it was shown that the classifiers made use of makeup as a proxy for gender in a way that did not generalize to the rest of the population [20].
A common approach to explain machine learning models is to create an explanatory, or surrogate model that mimics the reference model. As the surrogate is typically simpler, it can be used to understand the complex reference model. This enables us to understand any model (i.e., model-agnostic approach) without having to alter that model (which could hurt performance). The extent to which this surrogate accurately approximates the reference model is called faithfulness (or fidelity).
There are two ways to obtain a surrogate model. The first is to globally mimic the reference model with an inherently simple surrogate model. However, due to this simplicity, the resulting surrogate often cannot faithfully represent the reference model, which leads to inaccurate or incorrect explanations. Another approach is to consider only a small part of the complex reference model, and locally mimic that portion. Such surrogate models remain locally faithful to the reference model, while also being simple enough to understand. The current state-of-the-art techniques to explain individual predictions (e.g., LIME [21]) apply this approach by targeting only the part of the model that is relevant for that particular prediction. This process is illustrated in Fig. 1.
To generate such a surrogate, a simple model is trained on transfer data: a set of data points labeled by the reference model. This technique is well-known, but until recently was only applied to approximate models globally. For local explanations, samples from a constrained region are used to obtain a surrogate that is locally faithful, and simple enough to be considered interpretable [4, 21].
In this paper, we investigate transfer data sampling techniques for local surrogate models, and identify that the faithfulness of existing techniques may be impaired in high dimensionality. We explore alternative sampling techniques and introduce Local Explainable MOdel explanations using N-ball sampling (LEMON): an improved sampling technique that is more faithful and robust than the current state-of-the-art techniques by sampling directly from the desired distribution instead of reweighting samples (see Fig. 1).

2 Related Work

The idea of using transfer data to approximate a model globally was introduced by Craven and Shavlik [10] and Domingos [11], and it has since been used for model compression [3, 7, 18, 19, 22, 24], comprehensibility [4, 6, 21, 23] and generalization [18].
The types of surrogate models used vary widely. While for local explanation, linear regression is sufficient [21], global explanation requires more expressive surrogate models, e.g., shallow neural networks [22, 24], decision trees [6, 10], and rule sets [11, 23]. Furthermore, we identified two main categories of sampling techniques for surrogate learning used in previous work:
Synthetic sampling draws new samples from a distribution (e.g., uniform or normal), independently of the original data. For local techniques, this distribution is restricted to a predefined region of the feature space (i.e., the region of interest). The advantage of this approach is that we can sample as many transfer data points as desired. Most local explanation techniques use this approach.
Observation-based sampling uses the training data of the model as transfer data. When features in a dataset are correlated, certain values in feature space are less likely (or impossible) to occur compared to the correlated (or ‘sensible’) region of feature space. Observation-based sampling yields more samples in that sensible part of the feature space. However, the number of samples for transfer data is limited. Oversampling techniques like Naive Bayes Estimation (NBE) or MUNGE [7] can partially address this problem.
Which of these sampling techniques to use for surrogate learning is generally not considered thoroughly. For example, some authors make empirical claims such as “We have found that using the original training set works well” [18]. However, it is unclear what kind of benefit observation-based sampling yields compared to synthetic sampling, or how the chosen synthetic sample distribution affects the quality of explanations.
The vast majority of the reviewed papers focused on global approximations, in which the faithfulness (i.e., accuracy with respect to the reference model) of the surrogate model is compromised in order to simplify the surrogate and hence the resulting explanation, or reduce its memory footprint for model compression. The focus of this paper is on sampling for local surrogates instead. By only considering a small part of the reference model, and only locally mimicking that portion of the complex model, the surrogate remains faithful and simple. This approach is more recent and gained a lot of popularity with the introduction of the LIME explainability framework [21].

3 Issues with Sampling for Local Surrogates

To understand sampling for local surrogates, we consider LIME [21], as it is a widely popular local explanation technique with a clear and accessible use of surrogate models. The transfer data in LIME are samples drawn from a fixed multivariate Gaussian distribution centered on the global mean of the training data; fixed here means that the distribution does not depend on the data point to be explained. These samples are then weighted by their proximity to the data point to be explained, and it is this weighting that makes the technique local. Finally, a linear regression surrogate model is trained on the weighted samples, and its coefficients are presented as a "feature contribution" explanation of how important each feature is to the prediction: a small change in a feature with a high coefficient leads to a large change in the prediction, and hence that feature can be considered important to the model.
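To make this procedure concrete, the following minimal sketch fits such a weighted linear surrogate in Python using NumPy and scikit-learn. It is our own illustration rather than the LIME library itself, and the sample count and kernel width defaults are arbitrary choices.

import numpy as np
from sklearn.linear_model import LinearRegression

def lime_style_explanation(predict_fn, X_train, x, num_samples=5000, kernel_width=0.75):
    # Transfer data: a fixed Gaussian around the global training mean,
    # independent of the instance x to be explained.
    mean, std = X_train.mean(axis=0), X_train.std(axis=0)
    Z = np.random.normal(mean, std, size=(num_samples, X_train.shape[1]))
    y = predict_fn(Z)                              # label transfer data with the reference model
    d = np.linalg.norm((Z - x) / std, axis=1)      # proximity to the instance
    w = np.exp(-d ** 2 / (2 * kernel_width ** 2))  # Gaussian distance kernel weights
    surrogate = LinearRegression().fit(Z, y, sample_weight=w)
    return surrogate.coef_                         # "feature contribution" explanation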
The quality of a local surrogate is typically measured in faithfulness: the extent to which the local surrogate locally represents the reference model.
Because the transfer data are drawn independently of the point to be explained, a notable drawback of systems such as LIME is that, as the dimensionality of the data increases, the chance of obtaining samples close to the instance to be explained gets ever smaller. Hence, robustness and faithfulness are significantly impaired for high-dimensional data. This is very similar to the well-known "curse-of-dimensionality" limitation of rejection sampling, in which most proposed points are rejected in high dimensions. In addition to faithfulness, Alvarez-Melis and Jaakkola [2] have demonstrated that using only a few relevant samples (100 in their study) degrades the robustness of LIME explanations (i.e., very different explanations for similar inputs).
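This effect is easy to reproduce with a quick Monte Carlo estimate. The sketch below makes simplifying assumptions of our own: standardized data, the instance placed at the data mean, and an arbitrary neighborhood radius of 1.

import numpy as np

rng = np.random.default_rng(0)
for n in (2, 5, 10, 20, 50):
    # LIME-style transfer samples from a fixed standard Gaussian; the instance to be
    # explained is assumed to lie at the mean (points further from the mean fare even worse).
    Z = rng.standard_normal((100_000, n))
    frac = np.mean(np.linalg.norm(Z, axis=1) <= 1.0)
    print(f"n = {n:2d}: fraction of samples within radius 1 = {frac:.5f}")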
To experimentally verify this effect, we set up an experiment in which we can arbitrarily increase the dimensionality of the model without affecting other semantics of the machine learning setup. Consider the n-dimensional feature space \(\mathcal {X} \subset \mathbb {R}^n\), and two classification models representing a hyperbox (\(b(\textbf{x})\)) and hypersphere (\(s(\textbf{x})\)) respectively:
$$\begin{aligned} b(\textbf{x}) = \Vert \textbf{x}\Vert _\infty \le 1,\quad \text {and}\quad s(\textbf{x}) = \Vert \textbf{x}\Vert _2 \le 1 \end{aligned}$$
(1)
classifying \(\textbf{x}\in \mathcal {X}\) as either true or false. These models are simple enough to quickly change the dimensionality of the model, while being complex enough to resemble a realistic complex classification model that cannot perfectly be represented by the local surrogate model.
We chose the input data point \(\textbf{x}= [ 1, 0, 0, ... ]\), a point on the decision boundary of the model. Next, surrogate models are generated using LIME with four different kernel width parameters; we chose values from 0.1 to 0.4 so that the surrogate approximates only the right side of the model (\(x_0 > 0\)). By measuring the faithfulness of the surrogate models generated for models of increasing dimensionality, we assess whether faithfulness is impaired in high-dimensional space.
For data point \(\textbf{x}\) and varying dimensionality, we measure the faithfulness of the linear surrogate model using the cosine similarity between its coefficient vector and that of the best possible linear model in this setup, \(f(\textbf{x}) = \textbf{x}_0\) (coefficients shown in Fig. 2c). In contrast to more traditional faithfulness metrics (e.g., RMSE or \(R^2\)), this approach measures the agreement between the models without the need for additional sampling.
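A sketch of this setup is shown below. It reuses the hypothetical lime_style_explanation helper from above, and the uniform training distribution on \([-2, 2]^n\) is our own assumption for illustration.

import numpy as np

def box_model(X):
    # Hyperbox classifier b(x) from Eq. (1): 1 inside the unit L-infinity ball, 0 outside.
    return (np.linalg.norm(X, ord=np.inf, axis=1) <= 1).astype(float)

def cosine_faithfulness(coef, ideal):
    # Agreement between surrogate coefficients and the best possible linear model.
    return coef @ ideal / (np.linalg.norm(coef) * np.linalg.norm(ideal))

n = 15
x = np.zeros(n); x[0] = 1.0                    # instance on the decision boundary
ideal = np.zeros(n); ideal[0] = 1.0            # coefficients of f(x) = x_0
X_train = np.random.uniform(-2, 2, size=(10_000, n))
coef = lime_style_explanation(box_model, X_train, x, kernel_width=0.3)
print(cosine_faithfulness(coef, ideal))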
Figure 2 shows that for models with only a modest number of dimensions (i.e., 10–20 depending on the kernel width), the faithfulness of LIME is already significantly impaired, which can result in untrustworthy and misleading explanations. In addition, the explanations are not robust as indicated by the heavy fluctuations. This happens because in high-dimensions, very few relevant samples are generated in the neighborhood of the point to be explained, and hence, the linear model is unable to approximate the behavior of the reference model. For a 15-dimensional box model, Fig. 2c shows the expected coefficients (blue) and coefficients of the linear surrogate from LIME (green). Even though only feature 0 has a substantial role for prediction, LIME incorrectly reports that many other features are relevant.
Note that LIME employs LASSO feature selection to reduce dimensionality ahead of explanation. However, this step is subject to the same limitations as outlined in this section. For simplicity, we disregard the feature selection option.

4 LEMON: Robust N-Ball Sampling

We introduce LEMON: Local Explainable MOdel explanations using N-ball sampling, which addresses the issues identified in Sect. 3. This technique samples directly from the desired distribution (defined by a distance-kernel function), instead of reweighting samples. This naturally yields data points where we need them: in the neighborhood (or region of interest) of the instance \(\textbf{x}\) to be explained.

4.1 Sampling from a Hypersphere

We first sample within a unit hypersphere, then scale the samples by radius r (the region of interest) and translate them to be centered at \(\textbf{x}\).
Fishman [14] and Harman and Lacko [17] describe an efficient way to obtain points within an n-dimensional hypersphere (i.e., n-sphere). If \(Y\) is an n-dimensional vector of independent \(N(0,1)\) variables, then \(S_n = \frac{Y}{\Vert Y \Vert }\) is uniformly distributed on the unit n-sphere. Next, when we apply
$$\begin{aligned} S_n \cdot U^{\frac{1}{n}} \end{aligned}$$
(2)
where U has the uniform distribution on (0, 1), we obtain the uniform distribution on the unit n-ball: the region enclosed by an n-sphere. Samples from this distribution are thus points uniformly distributed within the n-sphere.
This method ensures that all samples reside strictly within the region of interest: the ball of radius r around \(\textbf{x}\). With more relevant samples, the surrogate model can represent the reference model faithfully, and output more robust results with less variance between subsequent runs of the algorithm.
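A minimal sketch of this sampling step is given below; the function and parameter names are ours and not necessarily those of the LEMON implementation.

import numpy as np

def sample_uniform_ball(x, r, num_samples, rng=None):
    # Uniform samples from the n-ball of radius r centered at x (Fishman [14];
    # Harman and Lacko [17]): a uniform direction S_n times a radius factor U^(1/n).
    rng = rng or np.random.default_rng()
    n = len(x)
    Y = rng.standard_normal((num_samples, n))
    S = Y / np.linalg.norm(Y, axis=1, keepdims=True)   # uniform on the unit n-sphere
    U = rng.uniform(size=(num_samples, 1))
    return x + r * S * U ** (1.0 / n)                  # Eq. (2), scaled and translated to x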

4.2 Accommodating Arbitrary Distance Kernels

Sampling uniformly from a hypersphere is restrictive, and makes it challenging to compare fairly against LIME, in which the samples are normally distributed. In addition, different domains may require different distance metrics and kernels (e.g., cosine distance for text and L2 distance for images [21]). Hence, we expand our sampling technique to accommodate arbitrary distance kernels.
Let K(r) denote a distance kernel on the domain \([0, r_\text {max}]\), where the maximal distance \(r_\text {max}>0\) may depend on the kernel. To sample points weighted by this kernel, note that the total weight of points at radius r is given by \(c_nK(r)r^{n-1}\) for some dimension-dependent constant \(c_n\). Thus, the cumulative distribution function (cdf) for the radius of a sample is
$$\begin{aligned} F(r) \mathop {\mathrm {{:}{=}}}\limits \mathbb {P}\big (\Vert X\Vert \le r\big ) = \frac{\int _0^{r}K(s)s^{n-1}\textrm{d}s}{\int _0^{r_\text {max}}K(s)s^{n-1}\textrm{d}s}, \ \text {for}\ r \le r_\text {max}. \end{aligned}$$
(3)
To sample using Eq. (3), we use inverse transform sampling [16]. However, an exact analytical integral of this density function may not always exist; e.g., the Gaussian distance kernel used by LIME has no closed-form solution. Hence, we numerically approximate the inverse cdf to sample from arbitrary distance kernels. Next, we show two examples of specific distance kernels that can be used with this technique.
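A sketch of this numerical inverse transform sampling is shown below; the grid-based approximation of the cdf and the interpolation scheme are our own implementation choices.

import numpy as np

def sample_radii(kernel, r_max, n, num_samples, grid_size=10_000, rng=None):
    # Draw radii with density proportional to K(r) * r^(n-1) on [0, r_max], cf. Eq. (3).
    rng = rng or np.random.default_rng()
    r = np.linspace(0.0, r_max, grid_size)
    density = kernel(r) * r ** (n - 1)
    cdf = np.cumsum(density)
    cdf /= cdf[-1]                                           # numerical approximation of F(r)
    return np.interp(rng.uniform(size=num_samples), cdf, r)  # invert F numerically

def sample_with_kernel(x, kernel, r_max, num_samples, rng=None):
    # Combine a uniform direction on the n-sphere with kernel-weighted radii.
    rng = rng or np.random.default_rng()
    n = len(x)
    Y = rng.standard_normal((num_samples, n))
    S = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    radii = sample_radii(kernel, r_max, n, num_samples, rng=rng)
    return x + radii[:, None] * S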
Uniform Distance Kernel. The most basic distance kernel is the uniform kernel \(K_\text {uniform}(r)\mathop {\mathrm {{:}{=}}}\limits 1.\) We first show that substituting this distance kernel function into Eq. (3) yields the same cdf as the uniform sampling approach in Eq. (2). We get for \(r \le r_{\text {max}}\), \( F(r) = \frac{\int _0^{r}s^{n-1}\textrm{d}s}{\int _0^{r_\text {max}}s^{n-1}\textrm{d}s} = \frac{r^n/n}{r_\text {max}^n/n} = \left( r/r_\text {max}\right) ^n. \) Ignoring the factor \(S_n\) that determines the angle from (2), we get for \(r \le r_{\text {max}}\),
$$\begin{aligned} \mathbb {P}\left( U^{1/n}< \tfrac{r}{r_\text {max}}\right) = \mathbb {P}\left( U < \left( \tfrac{r}{r_\text {max}}\right) ^n\right) = \left( \tfrac{r}{r_\text {max}}\right) ^n = F(r). \end{aligned}$$
In the equation above, the second equality follows from the fact that U has the uniform distribution on (0, 1). Ergo, using this uniform distance kernel leads to uniformly distributed samples within a hypersphere of radius \(r_\text {max}\). An example of sampling using this distance kernel is shown in Fig. 3a.
Gaussian Distance Kernel. For a fair comparison with other methods, our sampling technique should also support the Gaussian distance kernel as used in LIME, defined as
$$\begin{aligned} K_\text {gaussian}(r)\mathop {\mathrm {{:}{=}}}\limits \exp \big (-r^2/(2\sigma ^2)\big ). \end{aligned}$$
(4)
However, this distance kernel poses a problem: the Gaussian distribution is unbounded, while for our numeric approximations we require a kernel whose domain is bounded by some radius \(r_\text {max} < \infty \). For comparison with the Gaussian kernel used in LIME, we therefore use a truncated distance kernel: we sample points from a Gaussian distribution with the same kernel standard deviation, conditioned on the distance being at most \(r_\text {max}\). Here, we choose \(r_\text {max}\) such that a fraction p of the sampled points resides within this radius. In Appendix A we show that
$$\begin{aligned} r_\text {max} = \sqrt{2\sigma ^2\varGamma ^{-1}\Big (\frac{n}{2}, (1-p)\varGamma \Big (\frac{n}{2}\Big )\Big )}, \end{aligned}$$
(5)
Alternatively, we can start with a predefined radius \(r_\text {max}\) that defines the region we would like to explain using a Gaussian distance kernel, and derive the \(\sigma ^2\) for which a fraction \(p\in (0,1)\) of the sampled points resides within that radius, i.e., \( \sigma ^2\ =\ \tfrac{r_\text {max}^2}{2\varGamma ^{-1}\big (\frac{n}{2}, (1-p)\varGamma \big (\frac{n}{2}\big )\big )}.\)
Using a truncated Gaussian distance kernel with these parameters enables us to generate samples that are distributed very closely to how samples in LIME are weighted, which enables us to fairly compare both techniques. An example of sampling using this distance kernel is shown in Fig. 3b.
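Eq. (5) and its inverse relation can be evaluated directly with SciPy's inverse of the regularized upper incomplete gamma function; a small sketch with hypothetical function names follows.

import numpy as np
from scipy.special import gammainccinv

def lemon_r_max(sigma, n, p=0.999):
    # Eq. (5): gammainccinv inverts the *regularized* upper incomplete gamma function,
    # so passing (1 - p) corresponds to Gamma^{-1}(n/2, (1 - p) * Gamma(n/2)) in the text.
    return np.sqrt(2 * sigma ** 2 * gammainccinv(n / 2, 1 - p))

def lemon_sigma(r_max, n, p=0.999):
    # Inverse relation: kernel width such that a fraction p of the kernel mass lies within r_max.
    return np.sqrt(r_max ** 2 / (2 * gammainccinv(n / 2, 1 - p)))

print(lemon_r_max(sigma=0.3, n=15))   # e.g., a 15-dimensional model with sigma = 0.3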

5 Evaluation

In this section, we first revisit the synthetic evaluation example introduced in Sect. 3. Next, to use a more realistic scenario, we compare LEMON and LIME on standardized UCI datasets and a variety of models. Source code for our experiments can be found here: https://github.com/iamDecode/lemon-evaluation.

5.1 Synthetic Scenario

In Sect. 3 we showed that the faithfulness of LIME is impaired for models trained on higher dimensional data (Fig. 2). We repeat this experiment with our LEMON sampling technique. We chose a truncated Gaussian kernel with the same \(\sigma \) as LIME, and an \(r_\text {max}\) computed using Eq. (5) with \(p=0.999\). This ensures we generate samples that are distributed very closely to how samples in LIME are weighted, such that we can fairly compare both techniques.
The results are shown in Fig. 4. In contrast to the results for LIME (Fig. 2), LEMON remains faithful to the reference model regardless of the dimensionality of the model. This is because more relevant samples are generated in the neighborhood of the point to be explained even in high dimensions. With more samples, the linear model is able to approximate the behavior of the reference model better than LIME. For a 15-dimensional box model, Fig. 4c shows the expected coefficients (blue) and coefficients of the linear surrogate from LEMON (green) are very close, as opposed to the coefficients of LIME shown in Fig. 2c. In addition, the results show smaller vertical fluctuations compared to LIME, indicating that the robustness of explanations from LEMON is affected less by variation in the transfer data.

5.2 Real-World Datasets

We used the Wine and Breast Cancer Wisconsin datasets from the UCI repository, and the Diabetes dataset [13], which are ubiquitous in machine learning research. The datasets have a dimensionality of 13, 32, and 9 respectively, and contain only continuous features. For the reference models to be explained, we chose a Naive Bayes classifier, a Neural network with three layers of 100 neurons each, and a Random forest with 200 trees. As the kernel width may have a considerable impact on the explanation, we chose a wide range of kernel width values: \(\sigma = 0.1, 0.3, 0.5, 1.0, 4.0\) and \(\frac{3}{4}\sqrt{n}\). The latter is the default kernel width used in LIME, but is so large (\(>1\)) that it can hardly be considered local. Next, we computed \(r_\text {max}\) for LEMON using Eq. (5) with \(p=0.999\).
For evaluation, it is not possible to directly compare the resulting surrogate model against a perfect surrogate model as we did in the synthetic scenario, because a perfect surrogate model for these classifiers is not known. Instead, we compute the Root Mean Square Error (RMSE) on newly sampled evaluation data in the neighborhood of the point to be explained. For each data point, we generated \(m=50,000\) new samples within radius \(r_\text {max}\) using Eq. (3) and a distance kernel equivalent to the ones used in LIME and LEMON. Next, we recorded the RMSE between the predicted scores of the reference model \(\mathbf {\hat{y}}^r\) and the surrogate model \(\mathbf {\hat{y}}^s\) over all m samples:
$$\begin{aligned} RMSE (\mathbf {\hat{y}}^r, \mathbf {\hat{y}}^s) = \sqrt{\frac{1}{m}\sum _{i=1}^{m}{(\mathbf {\hat{y}}^r_i - \mathbf {\hat{y}}^s_i)^2}}. \end{aligned}$$
(6)
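A sketch of how this metric can be computed, reusing the hypothetical sample_with_kernel helper sketched in Sect. 4.2:

import numpy as np

def local_rmse(reference_predict, surrogate_predict, x, kernel, r_max, m=50_000):
    # Eq. (6): RMSE between reference and surrogate predictions on m fresh samples
    # drawn within radius r_max of x with the same distance kernel.
    Z = sample_with_kernel(x, kernel, r_max, m)
    y_ref = reference_predict(Z)
    y_sur = surrogate_predict(Z)
    return np.sqrt(np.mean((y_ref - y_sur) ** 2))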
Note that due to the simple nature of the linear surrogate and complexity of the reference classifier, a perfect \( RMSE = 0\) is implausible [15]. However, the metric does enable us to compare the relative faithfulness between LIME and LEMON. In Table 1 we show the mean RMSE scores over all data points in the dataset.
Table 1.
Average faithfulness scores (RMSE on 50,000 samples, lower is better) of explanations generated for all instances in each of the 3 datasets, classified by 3 different ML models (Naive Bayes, Neural network and Random forest), using 6 different kernel width values. LEMON consistently achieves higher faithfulness compared to LIME.

| Dataset | Kernel width (\(\sigma\)) | Naive Bayes LIME | Naive Bayes LEMON | Neural network LIME | Neural network LEMON | Random forest LIME | Random forest LEMON |
|---|---|---|---|---|---|---|---|
| Wine dataset (\(n=13\)) | 0.1 | 0.009 | 0.003 | 0.036 | 0.007 | 0.041 | 0.018 |
| | 0.3 | 0.044 | 0.026 | 0.147 | 0.079 | 0.118 | 0.051 |
| | 0.5 | 0.103 | 0.071 | 0.283 | 0.143 | 0.186 | 0.082 |
| | 1.0 | 0.258 | 0.224 | 0.273 | 0.247 | 0.156 | 0.120 |
| | \(\frac{3}{4}\sqrt{n}\) | 0.652 | 0.303 | 0.543 | 0.271 | 0.376 | 0.124 |
| | 4.0 | 0.848 | 0.282 | 0.827 | 0.307 | 0.545 | 0.120 |
| Diabetes dataset (\(n=9\)) | 0.1 | 0.018 | 0.016 | 0.017 | 0.015 | 0.072 | 0.036 |
| | 0.3 | 0.057 | 0.031 | 0.051 | 0.026 | 0.141 | 0.053 |
| | 0.5 | 0.079 | 0.045 | 0.073 | 0.032 | 0.112 | 0.064 |
| | 1.0 | 0.120 | 0.110 | 0.068 | 0.063 | 0.104 | 0.088 |
| | \(\frac{3}{4}\sqrt{n}\) | 0.387 | 0.257 | 0.239 | 0.146 | 0.247 | 0.100 |
| | 4.0 | 0.686 | 0.349 | 0.452 | 0.192 | 0.419 | 0.096 |
| Breast cancer dataset (\(n=32\)) | 0.1 | 0.011 | 0.006 | 0.222 | 0.102 | 0.038 | 0.015 |
| | 0.3 | 0.052 | 0.030 | 0.401 | 0.208 | 0.103 | 0.038 |
| | 0.5 | 0.151 | 0.104 | 0.458 | 0.229 | 0.171 | 0.057 |
| | 1.0 | 0.490 | 0.263 | 0.585 | 0.312 | 0.265 | 0.072 |
| | \(\frac{3}{4}\sqrt{n}\) | 0.512 | 0.001 | 0.781 | 0.331 | 0.367 | 0.065 |
| | 4.0 | 0.504 | 0.002 | 1.162 | 0.305 | 0.358 | 0.065 |
These results show that LEMON consistently improves the faithfulness of the local surrogate model compared to LIME. This holds for every dataset, model and kernel width combination we tested. On average, LEMON achieves 50.8% lower RMSE than LIME. Next, we see that explanations generated with smaller kernel widths tend to have a smaller RMSE. This is expected, because smaller regions naturally contain less intricate decision boundaries of the reference model and smaller output gradients (i.e., the further we zoom in on a model, the better a linear model fits its gradient).
There are a few exceptions, most notably the Naive Bayes classifier trained on the Breast cancer dataset. Here, the LEMON explanations obtain surprisingly low RMSE scores for very large kernel width values (\(> 1\)). While a smaller kernel width yields a faithful local surrogate, for larger kernel widths a linear surrogate may not be able to capture the complex behavior of the reference classifier. However, if we increase the kernel width beyond the bounds of the original feature space (approximately \(\sigma > 1\)), the evaluation data points become out-of-distribution. In our example, the mean Euclidean distance of all Breast cancer training data to a point to be explained is 491.76, while the mean Euclidean distance of all evaluation data is 586.91 for \(\sigma = 1\) and 2409.37 for \(\sigma = \frac{3}{4}\sqrt{n}\). The latter is almost five times larger than for the training data. Hence, most predictions for evaluation data points are out-of-distribution model predictions, which yields unexpected results.
These (unrealistically) large kernel widths cause LIME to produce RMSE scores exceeding 1 for certain dataset and model combinations (e.g., the Neural network for the Breast cancer dataset). Smaller kernel width values should be chosen to ensure that LIME explanations remain faithful to the reference model.
The RMSE scores vary per dataset and per model, because both affect how much difference in predicted score (i.e., gradient) can be expected within the sampling region. For instance, in Naive Bayes models the predicted score changes smoothly for changes in the feature value. Hence, this model can be closely approximated with a linear model (especially for small kernel width values). The other two models are more complex, and hence cannot always be accurately approximated with a linear model (especially for larger kernel width values).

6 Discussion and Future Work

The LIME explanation framework includes an optional preceding feature selection step (using LASSO). One could argue that feature selection ahead of the explanation technique decreases the dimensionality, enabling LIME to be more suitable in higher dimensional space than we have shown in Sect. 3. However, the feature selection algorithm still needs to consider the full feature space in order to select features, which it cannot properly do without sufficient neighboring samples: it is subject to the same limitations. Hence, in our study we have disregarded the feature selection step in LIME as it makes evaluation and comparison more difficult. However, we expect similar results when including feature selection as part of both compared algorithms.
Supporting Observation-Based Sampling. Sampling with either a uniform or Gaussian distance kernel remains a synthetic approach: new samples are drawn regardless of the distribution of the original data. Thus, the surrogate model may be fitted on out-of-distribution data. To address this limitation, we cannot simply use a custom distance kernel in Eq. (3): the distance kernel is a function of only the distance r between a sample \(\textbf{x}_i\) and the instance to be explained \(\textbf{x}\), and r does not tell us enough about the location of that sample. Instead, we propose to find all original dataset samples \(\big \{\textbf{s}\in \mathcal {X} \,\mid \, \Vert \textbf{x}-\textbf{s}\Vert _2 < r_\text {max} \big \}\) within radius \(r_\text {max}\). Next, we approximate the density of these local samples with kernel density estimation (KDE) and sample points from the resulting estimated density function. This can be done by choosing a random local point and offsetting it by a value randomly drawn from the KDE kernel function. This yields an alternative to Eq. (3): a different probability distribution on the ball of radius \(r_\text {max}\) around \(\textbf{x}\), which does not change the key idea behind LEMON.
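A possible realization of this idea is sketched below; scikit-learn's KernelDensity is our own choice of estimator and the bandwidth is an arbitrary default.

import numpy as np
from sklearn.neighbors import KernelDensity

def sample_observation_based(X_train, x, r_max, num_samples, bandwidth=0.1):
    # Keep only original samples inside the region of interest around x ...
    local = X_train[np.linalg.norm(X_train - x, axis=1) < r_max]
    # ... approximate their density with a Gaussian KDE, and sample from it
    # (equivalent to picking a local point and offsetting it by a draw from the KDE kernel).
    kde = KernelDensity(kernel='gaussian', bandwidth=bandwidth).fit(local)
    return kde.sample(num_samples)

Note that samples drawn this way may fall slightly outside the ball of radius \(r_\text {max}\), so a rejection step can be added if strict locality is required.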
Kernel Shape. Previous work and this paper assume that a spherical region around an instance is the best representation of a local neighborhood. However, some recent rule based techniques effectively use hyperboxes instead [23]. In addition, sampling towards the closest decision boundary may yield samples with a more salient gradient. It would be interesting to investigate what the relevance and effect is of the shape of the sampling region. Next, it would be interesting to see if we can extend our work to explain multiple instances at once by sampling from multiple distributions efficiently.
Measuring Faithfulness. We evaluated the explanations using faithfulness: the more closely the local surrogate model resembles the reference model, the better. However, there is no consensus on the best way to measure this. LIME itself calculates faithfulness based on the transfer data points the surrogate was trained on. This is problematic because, as we have shown in Sect. 3, LIME produces only few relevant samples in the neighborhood of the point to be explained. Hence, the surrogate model would be evaluated using few relevant samples, leading to misleading faithfulness scores. In our synthetic examples, we could circumvent this because the optimal set of coefficients was known, and hence we used the cosine similarity between the optimal coefficients and those of the local surrogate. In a realistic scenario, however, the optimal coefficients are simply not known. For the evaluation with real datasets (Sect. 5.2), we thus decided to use the RMSE between the reference and surrogate model, computed on many (50,000) newly generated samples instead of the original transfer data.
As an alternative, we considered the coefficient of determination (\(R^2\)) as a metric for faithfulness. This metric is used internally in LIME, and expresses the proportion of the variance in the response variable of a regressor that can be explained by the predictor variables. However, we noted that for some (outlier) data points in our evaluation, almost all sampled data points receive roughly the same predicted outcome from the reference classifier. In such cases, the variance of the predicted outcomes is (very close to) 0, and computing the \(R^2\) score on this data yields values approaching minus infinity, severely skewing the results.
Finally, faithfulness in itself does not guarantee the best possible explanation. There are many (often subjective) desiderata to consider when evaluating explanations, which are almost impossible to formalize due to their subjective nature. Hence, we do not claim to find an optimal explanation, just one closer to the behavior of the original model.

7 Conclusion

In this work, we explore alternative sampling techniques in pursuit of more faithful and robust explanations. To this end, we present LEMON: a sampling technique that outperforms current state-of-the-art techniques by sampling surrogate transfer data directly from the desired distribution instead of reweighting globally sampled transfer data. With both a synthetic evaluation and an evaluation on real-world datasets, we show that our sampling technique outperforms state-of-the-art approaches in terms of faithfulness, measured as the cosine similarity to the optimal surrogate model and the RMSE between reference and surrogate model predictions, respectively.

Acknowledgments

This work is part of the TEPAIV research project with project number 612.001.752, the NWO research project with project number 613.009.122, and the research programme Commit2Data, specifically the RATE Analytics project with project number 628.003.001, which are all financed by the Dutch Research Council (NWO).
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Appendix

A Bounds on Gaussian Distance Kernel

Consider a point x and, for some \(\sigma >0\), equip every point at distance r from x with a weight given by the kernel
$$\begin{aligned} K(r)\mathop {\mathrm {{:}{=}}}\limits \exp \big (-r^2/(2\sigma ^2)\big ). \end{aligned}$$
(7)
We would like to find the radius of interest \(r_p\) such that the total weight of the points within distance \(r_p\) is at least a fraction p of the total weight. Since the surface area of an n-dimensional ball of radius r is given by \(c_nr^{n-1}\) for some dimension-dependent constant \(c_n>0\), we have to find the smallest \(r_p\) that satisfies the inequality
$$\begin{aligned} \frac{\int _0^{r_p} c_n r^{n-1}K(r) \textrm{d}r}{\int _0^\infty c_n r^{n-1}K(r) \textrm{d}r}\ \ge \ p \quad \Leftrightarrow \quad \int _0^{r_p} r^{n-1}K(r) \textrm{d}r \ \ge \ p\int _0^\infty r^{n-1}K(r) \textrm{d}r. \end{aligned}$$
(8)
Rewriting the integrals,
$$\begin{aligned} (1-p)\int _{0}^{\infty } r^{n-1}K(r) \textrm{d}r \ \ge \ \int _{r_p}^{\infty } r^{n-1}K(r) \textrm{d}r . \end{aligned}$$
(9)
Let \(\varGamma (z, s)\mathop {\mathrm {{:}{=}}}\limits \int _{s}^\infty t^{z -1}\exp (-t)\textrm{d} t\) denote the incomplete gamma function and define the gamma function as \(\varGamma (z)\mathop {\mathrm {{:}{=}}}\limits \varGamma (z,0)\). Recalling (7), applying the change of variables \(t=r^2/(2\sigma ^2)\) to both integrals in Eq. (9) yields
$$\begin{aligned} (1-p)\varGamma \Big (\frac{n}{2}\Big ) \ \ge \ \varGamma \Big (\frac{n}{2}, \frac{r_p^2}{2\sigma ^2}\Big ). \end{aligned}$$
(10)
Writing \(\varGamma ^{-1}\big (\frac{n}{2}, \cdot \big )\) for the inverse of \(s\mapsto \varGamma \big (\frac{n}{2}, s\big )\), we have to find the smallest \(r_p\) such that
$$\begin{aligned} \varGamma ^{-1}\Big (\frac{n}{2}, (1-p)\varGamma \Big (\frac{n}{2}\Big )\Big ) \le \frac{r_p^2}{2\sigma ^2}, \end{aligned}$$
(11)
which is given by choosing
$$\begin{aligned} r_p = \sqrt{2\sigma ^2\varGamma ^{-1}\Big (\frac{n}{2}, (1-p)\varGamma \Big (\frac{n}{2}\Big )\Big )} \quad \Leftrightarrow \quad \sigma ^2 = \frac{r_p^2}{2\varGamma ^{-1}\Big (\frac{n}{2}, (1-p)\varGamma \Big (\frac{n}{2}\Big )\Big )}. \end{aligned}$$
References
2. Alvarez-Melis, D., Jaakkola, T.S.: On the robustness of interpretability methods. In: Workshop Human Interp. Mach. Learn., pp. 66–71 (2018)
3. Ba, J., Caruana, R.: Do deep nets really need to be deep? In: Adv. Neural Inf. Proc. Sys., pp. 2654–2662 (2014)
4. Baehrens, D., Schroeter, T., Harmeling, S., Kawanabe, M., Hansen, K., Müller, K.R.: How to explain individual classification decisions. J. Mach. Learn. Res. 11, 1803–1831 (2010)
5. Barocas, S., Selbst, A.D.: Big data's disparate impact. California Law Rev., 671–732 (2016)
6.
7. Buciluă, C., Caruana, R., Niculescu-Mizil, A.: Model compression. In: Int. Conf. Knowl. Discovery Data Mining, pp. 535–541. ACM SIGKDD (2006)
8. Buolamwini, J., Gebru, T.: Gender shades: intersectional accuracy disparities in commercial gender classification. In: Conf. Fairness, Accountability and Transparency, pp. 77–91. PMLR (2018)
9. Citron, D.K., Pasquale, F.D.: The scored society: due process for automated predictions. Wash. L. Rev. 89, 1 (2014)
10. Craven, M., Shavlik, J.W.: Extracting tree-structured representations of trained networks. In: Adv. Neural Inf. Process. Sys., pp. 24–30 (1996)
11. Domingos, P.: Knowledge acquisition from examples via multiple models. In: Int. Conf. Machine Learn., pp. 98–106 (1997)
12. Edelman, B.G., Luca, M.: Digital discrimination: the case of airbnb.com (2014)
13. Efron, B., Hastie, T., Johnstone, I., Tibshirani, R.: Least angle regression (2004)
15. Garreau, D., Luxburg, U.: Explaining the explainer: a first theoretical analysis of LIME. In: Int. Conf. AI Stat., pp. 1287–1296. PMLR (2020)
16. Gass, S.I., Fu, M.C. (eds.): Inverse transform method, p. 815. Springer (2013)
17. Harman, R., Lacko, V.: On decompositional algorithms for uniform sampling from n-spheres and n-balls. J. Multivar. Anal. 101(10), 2297–2304 (2010)
18. Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. NIPS Deep Learning and Representation Learning Workshop. arXiv preprint arXiv:1503.02531 (2015)
19. Lopes, R.G., Fenu, S., Starner, T.: Data-free knowledge distillation for deep neural networks. NIPS Learn. Limited Labeled Data Workshop (LLD). arXiv preprint arXiv:1710.07535 (2017)
20. Muthukumar, V., Pedapati, T., Ratha, N., Sattigeri, P., Wu, C.W., Kingsbury, B.E.A.: Understanding unequal gender classification accuracy from face images. arXiv preprint arXiv:1812.00099 (2018)
21. Ribeiro, M.T., Singh, S., Guestrin, C.: Why should I trust you?: explaining the predictions of any classifier. In: Int. Conf. Knowl. Discovery Data Mining, pp. 1135–1144. ACM SIGKDD (2016)
22. Romero, A., Ballas, N., Kahou, S.E., Chassang, A., Gatta, C., Bengio, Y.: FitNets: hints for thin deep nets. arXiv preprint arXiv:1412.6550 (2014)
23. Sanchez, I., Rocktaschel, T., Riedel, S., Singh, S.: Towards extracting faithful and descriptive representations of latent variable models. AAAI Spring Symposium Knowl. Represent. Reasoning 1, 1–4 (2015)
24. Xu, Z., Hsu, Y.-C., Huang, J.: Training shallow and thin networks for acceleration via knowledge distillation with conditional adversarial networks. arXiv preprint arXiv:1709.00513 (2017)
Metadata
Title: LEMON: Alternative Sampling for More Faithful Explanation Through Local Surrogate Models
Authors: Dennis Collaris, Pratik Gajane, Joost Jorritsma, Jarke J. van Wijk, Mykola Pechenizkiy
Copyright year: 2023
DOI: https://doi.org/10.1007/978-3-031-30047-9_7