Abstract
Machine learning is increasingly used to inform decision making in sensitive situations where decisions have consequential effects on individuals’ lives. In these settings, in addition to requiring models to be accurate and robust, socially relevant values such as fairness, privacy, accountability, and explainability play an important role in the adoption and impact of said technologies. In this work, we focus on algorithmic recourse, which is concerned with providing explanations and recommendations to individuals who are unfavorably treated by automated decision-making systems. We first perform an extensive literature review, and align the efforts of many authors by presenting unified definitions, formulations, and solutions to recourse. Then, we provide an overview of the prospective research directions in which the community may engage, challenging existing assumptions and making explicit connections to other ethical challenges such as security, privacy, and fairness.
1 INTRODUCTION
Consider the following setting: a 28-year-old female professional working as a software engineer seeks a mortgage to purchase a home. Consider further that the loan-granting institution (e.g., bank) uses a binary classifier and denies the individual the loan based on her attributes. Naturally, in this context, answering the following questions becomes relevant to the individual:
Q1: Why was I denied the loan?
Q2: What can I do to get the loan in the future?
In the setting of the preceding example, unless the policy of the bank is relaxed, the individual must expend effort to change their situation to be favorably treated by the decision-making system. Examples such as this are prevalent not only in finance [20, 168] but also in justice (e.g., pretrial bail) [10], healthcare (e.g., prescription of life-altering medication) [24, 28, 91], and other settings (e.g., hiring) [50, 149, 169, 207] broadly classified as consequential decision-making settings [20, 35, 52, 113]. Given the rapid adoption of automated decision-making systems in these settings, designing models that not only have high objective accuracy but also afford the individual explanations and recommendations to favorably change their situation is of paramount importance, and even argued to be a legal necessity (GDPR [231]). This is the concern of algorithmic recourse.
Contributions. Our review brings together the plethora of recent works on algorithmic recourse. In reviewing the literature, we identify opportunities to consolidate definitions, construct technical baselines for future comparison, and situate recourse in the broader ethical machine learning (ML) literature. The primary contribution is thus a unified overview of the definitions (Section 2), formulations (Section 3), and technical solutions (Section 4) for computing (nearest) contrastive explanations and (minimal) consequential recommendations, across a broad range of setups. A high-level summary is curated and presented in Table 1. Clearly distinguishing between these two recourse offerings, and highlighting their often overlooked differences in technical treatment, is a common theme throughout this work. Additionally, we identify a number of prospective research directions which challenge the assumptions of existing setups, and present extensions to better situate recourse in the broader ethical ML literature (Section 5).
Who is this article for? This article aims to engage different audiences: practitioners seeking methods that satisfy certain properties, and researchers looking to identify open problems and further develop existing methods.
Scope. To adequately present the material, we provide brief introductions to various related topics, such as explainable ML (or XAI), causal and counterfactual inference, and optimization, among others. Our work is not meant to be a comprehensive review of these topics and merely aims to situate algorithmic recourse in the context of these related fields. For the interested reader, an in-depth discussion on the social, philosophical, cognitive, and psychological aspects of explanations can be found elsewhere [21, 36, 157, 226]. Other works of interest are available on XAI [1, 2, 32, 68, 79, 86, 94, 139, 150, 163], causal and counterfactual inference [82, 95, 166, 185, 186, 206, 245], and optimization [33, 174, 214].
2 BACKGROUND
2.1 Recourse Definitions
In this relatively young field, algorithmic recourse has been defined as, for example, “an actionable set of changes a person can undertake in order to improve their outcome” [109], “the ability of a person to obtain a desired outcome from a fixed model” [223], and “the systematic process of reversing unfavorable decisions by algorithms and bureaucracies across a range of counterfactual scenarios” [226]. Similarly, legal recourse pertains to actions by individuals or corporations to remedy legal difficulties [235]. Parallel to this, explanations aim to, at least in part, assist data-subjects to “understand what could be changed to receive a desired result in the future” [234]. This plurality of overlapping definitions, and the evolving consensus therein, motivates us to attempt to present a unified definition under the umbrella of recourse.
Intuitively, the commonality shared among these definitions is the desire to assist individuals who are negatively affected by automated decision-making systems to improve their circumstances (captured in the individual’s feature set) to overcome the adverse decision (from the predictive model). Such assistance may take on the form of explanations (where they need to get to) or recommendations (what actions should they perform to get there).
Therefore, we submit that recourse can be achieved by an affected individual if they can understand [69, 109, 234] and accordingly act [114, 115] to alleviate an unfavorable situation, thus exercising temporally extended agency [226]. Concretely, in the example of the previous section, recourse is offered when the loan applicant is given answers to both questions: provided explanation(s) as to why the loan was rejected (Q1) and offered recommendation(s) on how to obtain the loan in the future (Q2). In the following, we describe the similarities and often overlooked differences between these questions and the different set of assumptions and tools needed to sufficiently answer each in general settings. Such a distinction is made possible by looking at the context from a causal perspective, which we summarize next.
2.2 Recourse and Causality
2.2.1 Contrastive Explanations.
In an extensive survey of the social science literature, Miller [157] concluded that when people ask “Why P?” questions (e.g., Q1), they are typically asking “Why P rather than Q?,” where Q is often implicit in the context [101, 199]. A response to such questions is commonly referred to as a contrastive explanation and is appealing for two reasons. First, contrastive questions provide a “window” into the questioner’s mental model, identifying what they had expected (i.e., Q, the contrast case), and thus the explanation can be better tuned toward the individual’s uncertainty and gap in understanding [156]. Second, providing contrastive explanations may be “simpler, more feasible, and cognitively less demanding” [156] than offering recommendations.
2.2.2 Consequential Recommendations.
Providing an affected individual with recommendations (e.g., response to Q2) amounts to suggesting a set of actions (a.k.a. flipsets [223]) that should be performed to achieve a favorable outcome in the future. In this regard, several works have used contrastive explanations to directly infer actionable recommendations [109, 209, 223, 234], where actions are considered as independent shifts to the feature values of the individual (P) that result in the contrast (Q). Recent work, however, has cautioned against this approach, citing the implicit assumption of independently manipulable features as a potential limitation that may lead to suboptimal or infeasible actions [21, 114, 146, 226]. To overcome this, Karimi et al. [114] suggest that actions may instead be interpreted as interventions in a causal model of the world in which actions will take place and not as independent feature manipulations derived from contrastive explanations. Formulated in this manner, for example, using a structural causal model (SCM) [184], the downstream effects of interventions on other variables in the model (e.g., descendants of the intervened-upon variables) can directly be accounted for when recommending actions [21]. Thus, a recommended set of actions for recourse, in a world governed by an SCM, are referred to as consequential recommendations [114].
2.2.3 Clarifying Terminology: Contrastive, Consequential, and Counterfactual.
To summarize, recourse explanations are commonly sought in a contrastive manner, and recourse recommendations can be considered as interventions on variables modeled using an SCM. Thus, we can rewrite the two recourse questions as follows:
Q1: What profile would have led to receiving the loan?
Q2: What actions would have led me to develop this profile?
Viewed in this manner, both contrastive explanations and consequential recommendations can be classified as a counterfactual [151], in that each considers the alteration of an entity in the history of the event P, where P is the undesired model output. Thus, responses to Q1 (respectively, Q2) may also be called counterfactual explanations (respectively, counterfactual recommendations), meaning what could have been (respectively, what could have been done) [36].
To better illustrate the difference between contrastive explanations and consequential recommendations, we refer to Figure 1. According to Lewis [137, p. 217], “to explain an event is to provide some information about its causal history.” Lipton [138, p. 256] argues that to explain why P rather than Q, “we must cite a causal difference between P and not-Q, consisting of a cause of P and the absence of a corresponding event in the history of not-Q.” In the algorithmic recourse setting, because the model outputs are determined by its inputs (which temporally precede the prediction), the input features may be considered as the causes of the prediction. The determining factor in whether one is providing contrastive explanations as opposed to consequential recommendations is thus the level at which the causal history [203] is considered: whereas providing explanations only requires information on the relationship between the model inputs, \( \lbrace \mathrm{X}_i\rbrace _{i=1}^D \), and predictions, \( \hat{\mathrm{Y}} \), recommendations require information as far back as the causal relationships among inputs. The reliance on fewer assumptions [138, 157] thus explains why generating recourse explanations is easier than generating recourse recommendations [156, 234].
Next, we summarize the technical literature in Table 1 (subsequent surveys can be found elsewhere [117, 215, 227]). We find that most of the recourse literature has focused on generating contrastive explanations rather than consequential recommendations (cf. Section 2). Differentiable models are the most widely supported class of models, and many constraints are only sparsely supported (cf. Section 3). All tools generate solutions that to some extent trade off desirable requirements, such as optimality, perfect coverage, efficient runtime, and access (cf. Section 4), resulting in a lack of unifying comparison (cf. Section 5). The table is not meant to rank or qualitatively compare the surveyed methods, and one must exercise caution when comparing different setups. As a systematic organization of knowledge, we believe the table may be useful to practitioners looking for methods that satisfy certain properties, and useful for researchers who want to identify open problems and methods to further develop. Offering recourse in diverse settings with desirable properties remains an open challenge, which we explore in the following sections.
3 FORMULATION
Given a fixed predictive model, commonly assumed to be a binary classifier, \( h: \mathcal {X} \rightarrow \lbrace 0, 1\rbrace \), with \( \mathcal {X} = \mathcal {X}_1 \times \cdots \times \mathcal {X}_D \), we can define the set of contrastive explanations for a (factual) input \( \boldsymbol {x}^\texttt {F}\in \mathcal {X} \) as \( \mathcal {E} := \lbrace \boldsymbol {x}^\texttt {CF}\in \mathcal {P}(\mathcal {X}) ~|~ h(\boldsymbol {x}^\texttt {CF}) \not= h(\boldsymbol {x}^\texttt {F}) \rbrace \). Here, \( \mathcal {P}(\mathcal {X}) \subseteq \mathcal {X} \) is a plausible subspace of \( \mathcal {X} \), according to the distribution of training data (see Section 3.3.1). Descriptively, contrastive explanations identify alternative feature combinations (in nearby worlds [136]) that result in a favorable prediction from the fixed model. Assuming a notion of dissimilarity between instances, represented as \( \texttt {dist}(\cdot , \cdot) : \mathcal {X} \times \mathcal {X} \rightarrow \mathbb {R}_+ \), one can identify nearest contrastive explanations (a.k.a. counterfactual explanations) as follows: (1) \( \begin{equation} \begin{aligned}\boldsymbol {x}^\texttt {*CF}\in \mathop {argmin}_{\boldsymbol {x}^\texttt {CF}\in \mathcal {P}(\mathcal {X})} && \texttt {dist}(\boldsymbol {x}^\texttt {CF}, \boldsymbol {x}^\texttt {F}) \\ \text{s.t.} && h(\boldsymbol {x}^\texttt {CF}) \not= h(\boldsymbol {x}^\texttt {F}) \\ && \boldsymbol {x}^\texttt {CF}= \boldsymbol {x}^\texttt {F}+ \boldsymbol {\delta }, \\ \end{aligned} \end{equation} \) where \( \boldsymbol {\delta } \) is the perturbation applied independently to the feature vector \( \boldsymbol {x}^\texttt {F} \) to obtain the counterfactual instance \( \boldsymbol {x}^\texttt {CF} \) [234].
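For differentiable models, (1) is commonly relaxed into an unconstrained penalized objective and solved by gradient descent. The following minimal sketch illustrates this for a fixed logistic-regression classifier with squared \( \ell _2 \) distance as \( \texttt {dist} \); the model parameters, penalty weight, and learning rate are illustrative choices, not values from any surveyed method.

```python
import numpy as np

def predict_proba(x, w, b):
    """Soft prediction of a fixed logistic-regression classifier h."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

def nearest_counterfactual(x_f, w, b, lam=10.0, lr=0.01, steps=20000):
    """Gradient descent on the relaxed objective lam*(p(x)-1)^2 + ||x - x_f||^2."""
    x = x_f.copy()
    for _ in range(steps):
        p = predict_proba(x, w, b)
        # gradient of the penalized objective with respect to x
        grad = lam * 2.0 * (p - 1.0) * p * (1.0 - p) * w + 2.0 * (x - x_f)
        x -= lr * grad
    return x

w, b = np.array([1.0, 2.0]), -4.0       # fixed (illustrative) model
x_f = np.array([1.0, 0.5])              # factual input, denied: p(x_f) < 0.5
x_cf = nearest_counterfactual(x_f, w, b)
assert predict_proba(x_cf, w, b) > 0.5  # the counterfactual flips the decision
```

Because the proximity term pulls the solution back toward \( \boldsymbol {x}^\texttt {F} \), the recovered counterfactual lies just beyond the decision boundary; plausibility and actionability constraints (Section 3.3) are omitted here for brevity.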
As discussed in Section 2, although contrastive explanations identify the feature vectors that would achieve recourse, in general, the set of actions that would need to be performed to realize these features is not directly implied from the explanation [114]. Thus, a consequential recommendation for (factual) input \( \boldsymbol {x}^\texttt {F}\in \mathcal {X} \) is defined as \( \mathcal {R} := \lbrace \boldsymbol {a}\in \mathcal {A}(\boldsymbol {x}^\texttt {F}) ~|~ h(\boldsymbol {x}^\texttt {SCF}(\boldsymbol {a}; \boldsymbol {x}^\texttt {F})) \not= h(\boldsymbol {x}^\texttt {F}) \rbrace \). Here, \( \mathcal {A}(\boldsymbol {x}^\texttt {F}) \) is the set of feasible actions that can be performed by the individual seeking recourse (see Section 3.3.2). Approaching the recourse problem from a causal perspective within the SCM framework [114], actions are considered as interventions of the form \( \boldsymbol {a}= \mathrm{do}(\lbrace \mathrm{X}_i := x^\texttt {F}_i + \theta _i \rbrace _{i\in \mathcal {I}}) \in \mathcal {A}(\boldsymbol {x}^\texttt {F}) \), and \( \boldsymbol {x}^\texttt {SCF}(\boldsymbol {a}; \boldsymbol {x}^\texttt {F}) \) denotes the structural counterfactual of \( \boldsymbol {x}^\texttt {F} \) had action \( \boldsymbol {a} \) been performed, all else being equal [184]. Finally, given a notion of cost of actions, capturing the effort expended by the individual as \( \texttt {cost}(\cdot ; \cdot) : \mathcal {A} \times \mathcal {X} \rightarrow \mathbb {R}_+ \), one can identify minimal consequential recommendations as follows: (2) \( \begin{equation} \begin{aligned}\boldsymbol {a}^*\in \mathop {argmin}_{\boldsymbol {a}\in \mathcal {A}(\boldsymbol {x}^\texttt {F})} && \texttt {cost}(\boldsymbol {a}; \boldsymbol {x}^\texttt {F}) \\ \text{s.t.} && h(\boldsymbol {x}^\texttt {CF}) \not= h(\boldsymbol {x}^\texttt {F}) \\ && \boldsymbol {x}^\texttt {CF}= \boldsymbol {x}^\texttt {SCF}(\boldsymbol {a}; \boldsymbol {x}^\texttt {F}). \\ \end{aligned} \end{equation} \)
Importantly, the solution of (1) yields a nearest contrastive explanation (i.e., \( \boldsymbol {x}^\texttt {CF}= \boldsymbol {x}^\texttt {F}+ \boldsymbol {\delta }^* \)), with no direct mapping to a set of (minimal) consequential recommendations [114]. Conversely, solving (2) yields both a minimal consequential recommendation (i.e., \( \boldsymbol {a}^* \)) and a contrastive explanation (i.e., by construction \( \boldsymbol {x}^\texttt {CF}= \boldsymbol {x}^\texttt {SCF}(\boldsymbol {a}^*; \boldsymbol {x}^\texttt {F}) \)).
Our position is that, with the aim of providing recourse, the primary goal should be to provide minimal consequential recommendations that result in a (not necessarily nearest) contrastive explanation when acted upon. Offering nearest contrastive explanations that are not necessarily attainable through minimal effort is of secondary importance to the individual. In practice, however, due to the additional assumptions needed to solve (2) (specifically for computing \( \boldsymbol {x}^\texttt {SCF} \)), the literature often resorts to solving (1). Despite the prevalence of methods assuming independent features, we quote the negative result of Karimi et al. [115, prop. 2], whereby the authors show that without full specification of the causal relations among variables, recourse cannot be guaranteed. As pointed out by Chou et al. [47], explanations derived from popular algorithms based on non-causal spurious correlations “may yield sub-optimal, erroneous, or even biased explanations.” Therefore, for the rest of this section, we operate on the assumption of knowledge of an SCM, which allows for a deterministic formulation of recourse yielding both minimal consequential recommendations and the resulting contrastive explanations. Later in Sections 5 and 6, we relax this assumption and discuss settings in which causal knowledge is scarce or absent entirely. This is not to say that counterfactual explanations unaccompanied by recommended actions are not useful. On the contrary, as noted by Karimi et al. [114], counterfactual explanations hold promise for “guided audit of the data” [234] and evaluating various desirable model properties, such as robustness [98, 209] or fairness [96, 113, 209, 223] (see also Section 5).
Besides this, it has been shown that designers of interpretable ML systems use counterfactual explanations for predicting model behavior [125], uncovering inaccuracies in the data profile of individuals [226], and also for medical outcome prediction under hypothesized treatments [153] or realistic medical image generation [238]. Beyond the consequential domains mentioned earlier and in Section 1, counterfactual explanations have been used for changing behavior of source code [48], overcoming class imbalance issues by amending underrepresented classes with plausible counterfactuals [217], and aiding the interpretable design of molecules [239].
In the rest of this section, we provide an overview of the objective function and constraints used in (1) and (2), followed by a description of the datatypes commonly used in recourse settings. Finally, we conclude with related formulations. Then in Section 4, we review the tools used to solve the formulations defined here.
3.1 Optimization Objective
Generally, it is difficult to define the dissimilarity between instances (\( \texttt {dist} \)) or the cost of actions (\( \texttt {cost} \)), as both are individual- and context-dependent. Below, we review the definitions commonly used in the literature.
3.1.1 On dist.
Wachter et al. [234] define \( \texttt {dist} \) as the Manhattan distance between the factual and counterfactual instances, with each feature normalized by its median absolute deviation (MAD) over the training set: (3) \( \begin{equation} \begin{aligned}\texttt {dist}(\boldsymbol {x}; \boldsymbol {x}^\texttt {F}) = \sum _{k=1}^{D} \frac{|x_k - x^\texttt {F}_k|}{\text{MAD}_k}, \end{aligned} \end{equation} \) where \( \text{MAD}_k \) is the median absolute deviation of the k-th feature in the training set.
This distance has several desirable properties, including accounting and correcting for the different ranges across features through the MAD heuristic, robustness to outliers through the use of the median, and, finally, favoring sparse solutions through the use of the \( \ell _1 \) Manhattan distance.
Karimi et al. [113] propose a weighted combination of \( \ell _p \) norms as a flexible measure across a variety of situations. The weights \( \alpha , \beta , \gamma , \zeta \), as shown in the following, allow practitioners to balance between sparsity of changes (i.e., through the \( \ell _0 \) norm), an elastic measure of distance (i.e., through the \( \ell _1, \ell _2 \) norms) [63], and a maximum normalized change across all features (i.e., through the \( \ell _\infty \) norm): (4) \( \begin{equation} \begin{aligned}\texttt {dist}(\boldsymbol {x}; \boldsymbol {x}^\texttt {F}) = \alpha || \boldsymbol {\delta }||_0 + \beta || \boldsymbol {\delta }||_1 + \gamma || \boldsymbol {\delta }||_2 + \zeta || \boldsymbol {\delta }||_\infty , \end{aligned} \end{equation} \) where \( \boldsymbol {\delta }= [\delta _1, \ldots , \delta _D]^T \) and \( \delta _k: \mathcal {X}_k \times \mathcal {X}_k \rightarrow [0,1] ~ \forall ~ k \in [D] \). This measure accounts for the variability across heterogeneous features (see Section 3.5) by independently normalizing the change in each dimension according to its spread. Additional weights may also be used to place relative emphasis on changes in specific variables. Finally, other works aim to minimize dissimilarity on a graph manifold [189], in terms of Euclidean distance in a learned feature space [109, 183], or using a Riemannian metric in a latent space [14, 15].
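As an illustration, both of the above distances can be implemented in a few lines. The weights and data below are placeholders, and the per-feature normalization of \( \delta _k \) to \( [0,1] \) is omitted for brevity.

```python
import numpy as np

def dist_mad(x, x_f, X_train):
    """l1 distance with each feature normalized by its median absolute deviation."""
    mad = np.median(np.abs(X_train - np.median(X_train, axis=0)), axis=0)
    return np.sum(np.abs(x - x_f) / mad)

def dist_weighted_lp(x, x_f, alpha=0.1, beta=1.0, gamma=1.0, zeta=0.1):
    """Weighted combination of l0, l1, l2, and l-infinity norms, as in (4)."""
    delta = x - x_f
    return (alpha * np.count_nonzero(delta)
            + beta * np.linalg.norm(delta, 1)
            + gamma * np.linalg.norm(delta, 2)
            + zeta * np.max(np.abs(delta)))

X_train = np.array([[0.0, 10.0], [1.0, 20.0], [2.0, 30.0], [3.0, 40.0]])
x_f, x = np.array([1.0, 20.0]), np.array([2.0, 20.0])  # one sparse change
assert abs(dist_mad(x, x_f, X_train) - 1.0) < 1e-9     # one MAD unit in feature 0
```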
3.1.2 On cost.
Similar to Karimi et al. [113], various works explore \( \ell _p \) norms to measure the cost of actions. Ramakrishnan et al. [190] explore the \( \ell _1 \) and \( \ell _2 \) norms as well as constant costs if specific actions are undertaken; Karimi et al. [114, 115] minimize the \( \ell _2 \) norm between \( \boldsymbol {x}^\texttt {F} \) and the action \( \boldsymbol {a} \) assignment (i.e., \( ||\boldsymbol {\theta }||_2 \)); and Cui et al. [55] explore combinations of \( \ell _0, \ell _1, \ell _2 \) norms over a user-specified cost matrix. Encoding individual-dependent restrictions is critical: for example, obtaining credit is more difficult for a foreign student than for a local resident.
Beyond \( \ell _p \) norms, Ustun et al. [223] propose the total- and maximum-log percentile shift measures to automatically account for the distribution of points in the dataset. For example, (5) \( \begin{equation} \begin{aligned}\texttt {cost}(\boldsymbol {a}; \boldsymbol {x}^\texttt {F}) &= \max _{k \in [D]} \left|Q_k\left(\boldsymbol {x}^\texttt {F}_k + \theta _k\right) - Q_k\left(\boldsymbol {x}^\texttt {F}_k\right)\right|, \end{aligned} \end{equation} \) where \( Q_k(\cdot) \) is the CDF of \( x_k \) in the target population. This type of metric naturally accounts for the relative difficulty of moving to unlikely (high or low percentile) regions of the data distribution. For instance, moving from the 50th to the 55th percentile in school grades is simpler than moving from the 90th to the 95th percentile.
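The maximum percentile shift in (5) can be sketched by using the empirical CDF of each feature as \( Q_k \); the data below are illustrative.

```python
import numpy as np

def ecdf(v, sample):
    """Empirical CDF Q_k of a feature, evaluated at v."""
    return np.mean(sample <= v)

def cost_max_percentile_shift(theta, x_f, X_train):
    """Maximum absolute percentile shift across features, as in (5)."""
    shifts = [abs(ecdf(x_f[k] + theta[k], X_train[:, k])
                  - ecdf(x_f[k], X_train[:, k]))
              for k in range(len(x_f))]
    return max(shifts)

X_train = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
x_f = np.array([2.0])
theta = np.array([2.0])  # move feature 0 from 2.0 to 4.0
# shift from the 40th to the 80th percentile of the (toy) population
assert abs(cost_max_percentile_shift(theta, x_f, X_train) - 0.4) < 1e-9
```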
3.1.3 On the Relation Between dist and cost.
In a world in which changing one variable does not affect others, one can see a parallel between the counterfactual instance of (1) (i.e., \( \boldsymbol {x}^\texttt {CF}= \boldsymbol {x}^\texttt {F}+ \boldsymbol {\delta } \)) and that of (2) (i.e., \( \boldsymbol {x}^\texttt {CF}= \boldsymbol {x}^\texttt {SCF}(\boldsymbol {a}; \boldsymbol {x}^\texttt {F}) = \boldsymbol {x}^\texttt {F}+ \boldsymbol {\theta } \)). This mirroring form suggests that definitions of dissimilarity between individuals (i.e., \( \texttt {dist} \)) and of the effort expended by individuals (i.e., \( \texttt {cost} \)) may inform one another; in general, however, the two capture distinct notions and need not coincide.
3.2 Model and Counterfactual Constraints
3.2.1 Model.
A variety of fixed models have been explored in the literature for which recourse is to be generated. As summarized in Table 1, we broadly divide them into four categories: tree-based (TB), kernel-based (KB), differentiable (DF), and other (OT) types (e.g., generalized linear models, naive Bayes, k-nearest neighbors). Although the literature on recourse has primarily focused on binary classification settings, most formulations extend readily to multi-class classification or regression settings: the model constraint is replaced with \( h(\boldsymbol {x}^\texttt {CF}) = k \) for a target class or \( h(\boldsymbol {x}^\texttt {CF}) \in [a,b] \) for a desired regression interval, respectively. Alternatively, soft predictions may be used in place of hard predictions, where the goal may be, for example, to increase the prediction gap between the highest-predicted and second-highest-predicted class; that is, \( \text{Pred}(\boldsymbol {x}^\texttt {CF})_i - \text{Pred}(\boldsymbol {x}^\texttt {CF})_j, \) where \( i = \mathop {argmax}_{k \in K} \text{Pred}(\boldsymbol {x}^\texttt {CF})_k \), \( j = \mathop {argmax}_{k \in K \setminus i} \text{Pred}(\boldsymbol {x}^\texttt {CF})_k \). In the end, the model constraint representing the change in prediction may be arbitrarily non-linear, non-differentiable, and non-monotone [21], which may limit the applicability of solutions (cf. Section 4).
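For soft predictions, the prediction-gap objective above can be computed directly from the model's probability vector; the vector below is illustrative.

```python
import numpy as np

def prediction_gap(probs):
    """Gap between the highest- and second-highest-predicted class probabilities."""
    top2 = np.sort(probs)[-2:]
    return top2[1] - top2[0]

probs = np.array([0.1, 0.6, 0.3])  # soft predictions Pred(x) over three classes
assert abs(prediction_gap(probs) - 0.3) < 1e-9  # 0.6 (class 1) vs. 0.3 (class 2)
```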
3.2.2 Counterfactual.
The counterfactual constraint depends on the type of recourse offered. Whereas \( \boldsymbol {x}^\texttt {CF}= \boldsymbol {x}^\texttt {F}+ \boldsymbol {\delta } \) in (1) is a linear constraint, computing \( \boldsymbol {x}^\texttt {CF}= \boldsymbol {x}^\texttt {SCF}(\boldsymbol {a}; \boldsymbol {x}^\texttt {F}) \) in (2) involves performing the three-step abduction-action-prediction process of Pearl et al. [186] and may thus be non-parametric and arbitrarily involved. A closed-form expression for deterministically computing the counterfactual in additive noise models is presented in the work of Karimi et al. [114], and probabilistic derivations for more general SCMs are presented in another work by Karimi et al. [115].
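For additive noise models, the abduction-action-prediction procedure admits a simple closed form. The toy SCM below, with invented structural equations \( X_1 := U_1 \) and \( X_2 := 2X_1 + U_2 \), illustrates all three steps.

```python
def structural_counterfactual(x_f, action):
    """Compute x^SCF(a; x^F) for the toy SCM X1 := U1, X2 := 2*X1 + U2.

    `action` maps a variable index to its intervened value, e.g. {0: 2.0}.
    """
    # 1. Abduction: recover the exogenous noise consistent with x_f.
    u1 = x_f[0]
    u2 = x_f[1] - 2 * x_f[0]
    # 2. Action: replace the structural equation of intervened-upon variables.
    x1 = action[0] if 0 in action else u1
    # 3. Prediction: recompute descendants using the recovered noise.
    x2 = action[1] if 1 in action else 2 * x1 + u2
    return [x1, x2]

# The downstream effect of do(X1 := 2) on X2 is accounted for automatically.
assert structural_counterfactual([1.0, 3.0], {0: 2.0}) == [2.0, 5.0]
```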
3.3 Actionability and Plausibility Constraints
3.3.1 Plausibility.
Existing literature has formalized plausibility constraints as belonging to one of three categories: domain-consistency, density-consistency, and prototypical-consistency. Whereas domain-consistency restricts the counterfactual instance to the range of admissible values for the domain of features [113], density-consistency focuses on likely states in the (empirical) distribution of features [63, 64, 109, 111, 131, 183], identifying instances close to the data manifold. A third class of plausibility constraints selects counterfactual instances that are either directly present in the dataset [189, 241] or close to a prototypical example of the dataset [11, 12, 124, 131, 225].
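A density-consistency check can be sketched with a simple Gaussian kernel density estimate over the training data; the bandwidth and threshold below are ad hoc choices, not values from any surveyed method.

```python
import numpy as np

def kde_density(x, X_train, bandwidth=0.5):
    """Gaussian kernel density estimate of the training distribution at x."""
    d = X_train.shape[1]
    diffs = (X_train - x) / bandwidth
    kernels = np.exp(-0.5 * np.sum(diffs ** 2, axis=1))
    norm = (2 * np.pi) ** (d / 2) * bandwidth ** d
    return np.mean(kernels) / norm

def is_plausible(x, X_train, tau=1e-3):
    """Accept a counterfactual only if its estimated density exceeds tau."""
    return kde_density(np.asarray(x), X_train) > tau

X_train = np.random.default_rng(0).normal(size=(500, 2))
assert is_plausible([0.0, 0.0], X_train)        # near the data manifold
assert not is_plausible([10.0, 10.0], X_train)  # far from any training point
```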
3.3.2 Actionability (Feasibility).
The set of feasible actions, \( \mathcal {A}(\boldsymbol {x}^\texttt {F}) \), is the set of interventions \( \mathrm{do}(\lbrace \mathrm{X}_i := x^\texttt {F}_i + \theta _i\rbrace _{i\in \mathcal {I}}) \) that the individual, \( \boldsymbol {x}^\texttt {F} \), is able to perform. To determine \( \mathcal {A}(\boldsymbol {x}^\texttt {F}) \), we must identify the set of variables upon which interventions are possible, as well as the pre-/post-conditions that the intervention must satisfy. The actionability of each variable falls into three categories [114, 130]:
(1) Actionable (and mutable) (e.g., bank balance);
(2) Mutable but non-actionable (e.g., credit score);
(3) Immutable (and non-actionable) (e.g., birthplace).
Intuitively, mutable but non-actionable variables are not directly actionable by the individual but may change as a consequence of a change to its causal ancestors (e.g., regular debt payment).
Having identified the set of actionable variables, an intervention can change the value of a variable unconditionally (e.g., bank balance can increase or decrease), or conditionally to a specific value [114] or in a specific direction [223]. Moreover, Karimi et al. [114] show that the feasibility of an intervention on \( \mathrm{X}_i \) may be contingent on any number of conditions, such as the pre-intervention value of the variable itself (e.g., age can only increase) or the pre-intervention values of other variables.
All such feasibility conditions can easily be encoded as Boolean/logical constraints into \( \mathcal {A}(\boldsymbol {x}^\texttt {F}) \) and jointly solved for in the constrained optimization formulations (1), (2). An important side note to consider is that \( \mathcal {A}(\boldsymbol {x}^\texttt {F}) \) is not restricted by the SCM assumptions but instead by individual-/context-dependent considerations that determine the form, feasibility, and scope of actions [114].
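Such per-variable conditions can be encoded as Boolean predicates over the factual and proposed post-action values; the variable names and rules below are hypothetical examples in this spirit.

```python
# Each predicate takes (factual value, proposed post-action value) and
# returns whether the change is feasible for the individual.
FEASIBILITY = {
    "age":          lambda old, new: new >= old,  # can only increase
    "bank_balance": lambda old, new: True,        # freely actionable
    "birthplace":   lambda old, new: new == old,  # immutable
}

def is_feasible(action, x_f):
    """Check an action (variable -> new value) against all predicates."""
    return all(FEASIBILITY[k](x_f[k], v) for k, v in action.items())

x_f = {"age": 28, "bank_balance": 1000, "birthplace": "X"}
assert is_feasible({"bank_balance": 2000}, x_f)
assert not is_feasible({"age": 25}, x_f)  # age cannot decrease
```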
3.3.3 On the Relation Between Actionability and Plausibility.
While seemingly overlapping, actionability (i.e., \( \mathcal {A}(\boldsymbol {x}^\texttt {F}) \)) and plausibility (i.e., \( \mathcal {P}(\mathcal {X}) \)) are two distinct concepts: whereas the former restricts actions to those that are possible to do, the latter requires that the resulting counterfactual instance be possibly true, believable, or realistic. Consider a Middle Eastern Ph.D. student who is denied a U.S. visa to attend a conference. Although it is quite likely for there to be favorably treated foreign students from other countries with similar characteristics (plausible gender, field of study, academic record, etc.), it is impossible for our student to act on their birthplace for recourse (i.e., a plausible explanation but an infeasible recommendation). Conversely, an individual may perform a set of feasible actions that would put them in an implausible state (e.g., small \( p(\boldsymbol {x}^\texttt {CF}) \); not in the dataset) where the model fails to classify with high confidence. Thus, actionability and plausibility constraints may be used in conjunction depending on the recourse setting they describe.
3.4 Diversity and Sparsity Constraints
3.4.1 Diversity.
Diverse recourse is often sought in the presence of uncertainty (e.g., unknown user preferences when defining \( \texttt {dist} \) or \( \texttt {cost} \)). Broadly, existing works fall into two camps: those that obtain diverse solutions through stochasticity in the optimization process, and those that explicitly enforce diversity across successively generated solutions.
In the first camp, Wachter et al. [234] show that different runs of their gradient-based optimizer over a non-convex model (e.g., a multilayer perceptron) yield different solutions owing to different random seeds. Sharma et al. [209] show that multiple evolved instances of their genetic-based optimization approach can be used as diverse explanations, hence benefiting from not requiring multiple re-runs of the optimizer. Downs et al. [69], Mahajan et al. [146], and Pawelczyk et al. [183] generate diverse counterfactuals by passing multiple samples from a latent space that is shared between factual and counterfactual instances through a decoder, and filtering those instances that correctly flip the prediction.
In the second camp, Russell [205] pursues a strategy whereby subsequent runs of the optimizer would prevent changing features in the same manner as prior explanations/recommendations. Karimi et al. [113] continue in this direction and suggest that subsequent recourse should not fall within an \( \ell _p \)-ball surrounding any of the earlier explanations/recommendations. Cheng et al. [46] and Mothilal et al. [167] present a differentiable constraint that maximizes diversity among generated explanations by maximizing the determinant of a (kernel) matrix of the generated counterfactuals.
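The determinant-based diversity term can be sketched as follows, using the kernel \( K_{ij} = 1/(1 + \texttt{dist}(\boldsymbol{x}_i, \boldsymbol{x}_j)) \) over a set of counterfactuals; this kernel and the \( \ell _1 \) distance are one common choice among several.

```python
import numpy as np

def diversity(counterfactuals):
    """Score a set of counterfactuals by det(K), K_ij = 1 / (1 + l1-dist)."""
    n = len(counterfactuals)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            d = np.linalg.norm(counterfactuals[i] - counterfactuals[j], 1)
            K[i, j] = 1.0 / (1.0 + d)
    return np.linalg.det(K)

near = np.array([[1.0, 1.0], [1.0, 1.1]])  # nearly identical solutions
far = np.array([[1.0, 1.0], [3.0, 4.0]])   # well-separated solutions
assert diversity(far) > diversity(near)    # the determinant rewards spread
```

Maximizing this determinant during optimization pushes the generated counterfactuals away from one another, in the spirit of determinantal point processes.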
3.4.2 Sparsity.
It is often argued that sparser solutions are desirable as they emphasize fewer changes (in explanations) or fewer variables to act upon (in recommendations) and are thus more interpretable for the individual [154]. Although this view is not universally accepted [182, 225], one can formulate this requirement as an additional constraint, whereby, for example, \( ||\boldsymbol {\delta }||_0 \le s \) or \( ||\boldsymbol {\theta }||_0 \le s \). Formulating sparsity as an additional (hard) constraint rather than optimizing for it in the objective grants the flexibility to optimize for a different objective while ensuring that a solution would be sparse.
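As a hard constraint, sparsity reduces to a simple check on the number of modified features, for example:

```python
import numpy as np

def satisfies_sparsity(delta, s):
    """Hard sparsity constraint ||delta||_0 <= s on the proposed change."""
    return np.count_nonzero(delta) <= s

assert satisfies_sparsity(np.array([0.0, 1.5, 0.0]), s=1)      # one change
assert not satisfies_sparsity(np.array([1.0, 1.5, 0.0]), s=1)  # two changes
```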
3.5 Datatypes and Encoding
A common theme in consequential decision-making settings is the use of datatypes that refer to real-world attributes of individuals. As a result, datasets are often tabular, comprising heterogeneous features, with a mix of numeric (real, integer), binary, categorical, or ordinal variables. Most commonly, Adult [3], Australian Credit [71], German Credit [17], GiveMeCredit [248], COMPAS [127], and HELOC [103], among others, are used, which are highly heterogeneous.
Different feature types obey different statistical properties: for example, the integer-based heart rate, real-valued BMI, categorical blood type, and ordinal age group differ drastically in their range. Thus, heterogeneous data requires special handling to preserve its semantics. A common approach is to encode each variable according to a predetermined strategy, which preprocesses the data before model training and consequently during recourse generation (for either recourse type, i.e., consequential recommendations or contrastive explanations). For instance, categorical and ordinal features may be encoded using one-hot encoding and thermometer encoding, respectively. To preserve the semantics of each variable during recourse generation, we must also ensure that the generated explanations/recommendations result in counterfactual instances that also satisfy the encoding constraints. For instance, Boolean and linear constraints of the form \( \sum _j \boldsymbol {x}_{i,j} = 1 ~ \forall ~ \boldsymbol {x}_{i,j} \in \lbrace 0,1\rbrace \) are used to ensure that multiple categories are not simultaneously active, and thermometer-encoded ordinal variables are required to satisfy \( \boldsymbol {x}_{i,j} \ge \boldsymbol {x}_{i,j+1} ~ \forall ~ \boldsymbol {x}_{i,j} \in \lbrace 0,1\rbrace \). For a detailed overview, refer to the work of Nazabal et al. [171].
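The two encoding constraints above translate directly into validity checks on each encoded block; the example blocks below are illustrative.

```python
def valid_one_hot(block):
    """One-hot categorical block: binary entries, exactly one active."""
    return set(block) <= {0, 1} and sum(block) == 1

def valid_thermometer(block):
    """Thermometer-encoded ordinal block: binary, non-increasing entries."""
    return set(block) <= {0, 1} and all(a >= b for a, b in zip(block, block[1:]))

assert valid_one_hot([0, 1, 0])             # a single active category
assert not valid_one_hot([1, 1, 0])         # two categories active at once
assert valid_thermometer([1, 1, 0, 0])      # ordinal level 2 of 4
assert not valid_thermometer([1, 0, 1, 0])  # violates x_{i,j} >= x_{i,j+1}
```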
In addition to tabular data, one may require contrastive explanations for image-based or text-based datasets, as summarized in Table 1. For image-based datasets, the algorithm may optionally operate on the raw data, or on super-pixel or other forms of extracted features (e.g., a hidden representation in a neural network). Text-based datasets are also commonly encoded as vectors representing GloVe [187] or bag-of-words embeddings [198]. For a recent work that explores the use of counterfactual explanations for functional data, refer to Carrizosa et al. [42].
3.6 Related Formulations
The problem formulation for recourse generation, and specifically that of contrastive explanations, (1), is broadly related to several other problems in data mining and ML. For instance, the cost-minimizing inverse classification problem [4, 128, 129, 130, 132, 147] aims to identify the “minimum required change to a case in order to reclassify it as a member of a different preferred class” [147]. Actionable knowledge extraction is employed in data mining to suggest “behaviors which render a state of an instance into a preferred state” [70] according to a classifier [37, 38, 39, 112, 247]. Finally, adversarial perturbations are small imperceptible changes to the input of a classifier that would flip the output to an incorrect yet highly confident prediction [40, 88, 164, 165, 172, 177, 178, 216]. An additional parallel shared by the preceding methods is in their assumption of a fixed underlying model. Extensions of the preceding, in which model designers anticipate and aim to prevent malicious behavior, exist in the strategic classification [67, 99, 104, 120, 135, 140, 155, 158] and adversarial robustness [41, 49, 76, 246] literature. For recent expositions on the similarities and contrasts between these methods, refer to works elsewhere [75, 78, 180].
Whereas there exist strong parallels in their formulations, the differences arise in their intended use cases and guarantees for the stakeholders involved. For example, as opposed to recourse, which aims to build trust with affected individuals, the primary use case cited in the actionable knowledge extraction literature is to deliver cost-effective actions to maximize profit or other business objectives. Furthermore, whereas a contrastive explanation aims to inform an individual about ways in which their situation would have led to a desirable outcome, an adversarial perturbation aims to fool the human by being imperceptible (e.g., by leaving the data distribution). In a sense, imperceptibility is the antithesis of explainability and trust. Finally, building on the presentation in Section 2, offering consequential recommendations relies on a causal modeling of the world, which is largely ignored by other approaches.
4 SOLUTION
By definition, recourse is offered when an individual is presented with contrastive explanations and consequential recommendations, which can be obtained by solving (1) and (2), respectively. Notably, the objective that is to be minimized (i.e.,
4.1 Properties
We remark that the optimizer and the resulting solutions should ideally satisfy some desirable properties, as detailed next. In practice, methods typically trade off optimal guarantee \( \delta ^* \), perfect coverage \( \Omega ^* \), or efficient runtime \( \tau ^* \), and may otherwise require prohibitive access to the underlying data or predictive model.
4.1.1 Optimality.
Identified counterfactual instances should ideally be proximal to the factual instance, corresponding to a small change to the individual’s situation. When optimizing for minimal
4.1.2 Perfect Coverage.
Coverage is defined as the number of individuals for whom the algorithm can identify a plausible counterfactual instance (through either recourse type), provided at least one solution exists [113]. Communicating the domain of applicability to users is critical for building trust [161, 234], and ideally the algorithm offers recourse to all individuals (i.e., perfect coverage).
4.1.3 Efficient Runtime.
As explanations/recommendations are likely to be offered in conversational settings [101, 105, 157, 212, 240], it is desirable to generate recourse in near-real time. Despite the general underreporting of runtime complexities in the literature, in Table 1 we highlight those algorithms that report efficient wall-clock runtimes that may enable interaction between the algorithm and individual.
4.1.4 Access.
Different optimization approaches may require various levels of access to the underlying dataset or model. Access to the model may involve query access (where only the label is returned), gradient access (where the gradient of the output with respect to the input is requested), class probabilities access (from which one can infer the confidence of the prediction), or complete white-box access (where all model parameters are known).
Naturally, there are practical implications to how much access is permissible in each setting, which further restricts the choice of tools. Consider an organization that seeks to generate recourse for its clients. Unless these algorithms are run in-house by said organization, it is unlikely that the organization would hand over training data, model parameters, or even a non-rate-limited API of its models to a third party to generate recourse.
4.2 Tools
We consider the richly explored field of optimization [33, 174, 214] outside the scope of this work and suffice it to briefly review the tools used specifically for recourse generation, highlighting their domains of applicability and relegating technical details to the appropriate references. Not only is solving (1) and (2) difficult in general settings [130], it has even been shown to be NP-complete or NP-hard in restricted settings, such as solving for integer-based variables [12], solving for additive tree models [16, 55, 219] or neural networks [116], and solving for quadratic objectives and constraints [12, 33, 179]. Thus, except for exhaustive search over a potentially uncountable set of solutions, most works pursue approximate solutions in restricted settings, trading off the preceding desirable properties (see Table 1). Solutions can be broadly categorized as gradient-based optimization, model-based, search-based, verification-based, and heuristics-based.6
Under differentiability of the objective and constraints, gradient-based optimization solutions such as FISTA [26] are employed [63, 64, 225] to find globally optimal solutions under convex Lagrangian, and first-order methods such as (L-)BFGS or projected gradient descent may be used to identify local optima otherwise. Relatedly, rather than solving recourse for each individual independently, some works pursue a model-based approach, whereby a mapping from factual to counterfactual instances is learned through gradient optimization [146, 183]. These methods enjoy efficient runtimes at the cost of coverage loss and poor handling of heterogeneous data.
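As a minimal illustration of the gradient-based approach, the sketch below searches for a contrastive explanation for a linear logistic classifier by descending on a weighted sum of a proximity penalty and a prediction term; the objective weighting, stopping rule, and all parameter names are our own simplifications, not a specific method from the cited works:

```python
import numpy as np

def gradient_counterfactual(x, w, b, lr=0.1, lam=0.1, steps=500):
    """Sketch: for a logistic model f(x) = sigma(w.x + b), minimize
        lam * ||delta||^2 - f(x + delta)
    by gradient descent, stopping once the prediction flips
    (illustrative; assumes a differentiable model)."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        z = w @ (x + delta) + b
        p = 1.0 / (1.0 + np.exp(-z))   # probability of the favorable class
        if p >= 0.5:                    # prediction flipped: stop early
            break
        # d/d(delta) of the objective: 2*lam*delta - p*(1-p)*w
        grad = 2 * lam * delta - p * (1 - p) * w
        delta -= lr * grad
    return x + delta, delta
```

Replacing the closed-form gradient with automatic differentiation extends the same loop to generic differentiable classifiers, at the cost of only local optimality guarantees.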
For non-differentiable settings, branch-and-bound-based [134] approaches split the search domain into smaller regions within which a solution may be easier to find. Under linearity of the objectives and constraints, integer linear programming (ILP) algorithms may be used when datatypes are discrete [55, 223], and mixed-integer linear programming (MILP) extensions are utilized when some variables are not discrete [110, 205]. (M)ILP formulations are solved using powerful off-the-shelf solvers such as CPLEX [54] and Gurobi [176]. One may also use a combination of iterative binary search and verification tools to obtain solutions to (1) and (2). Here, the problem is reformulated as a constrained satisfaction problem, where the constraint corresponding to the objective (
A number of heuristics-based approaches are also explored, such as finding the shortest path (Dijkstra’s algorithm [53]) between \( \boldsymbol {x}^\texttt {F} \) and potential \( \boldsymbol {x}^\texttt {CF} \)s on an empirical graph where edges are placed between similar instances (according to, e.g., Gaussian kernel) [189]. Finally, genetic-based approaches [243, 252] find solutions over different evolutions of candidate solutions according to various heuristics [22, 56, 93, 124, 209] and benefit from being model-/datatype-/norm-agnostic via only requiring query access to the model.
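A minimal sketch of the shortest-path heuristic [189], assuming Euclidean edge weights and only query access to the classifier (the threshold `eps` and the interface are illustrative); it returns `None` when no path to a favorably classified instance exists, reflecting a coverage gap:

```python
import heapq, math

def graph_recourse(x_f, data, predict, eps=1.5):
    """Connect dataset points closer than eps on an empirical graph,
    then run Dijkstra from the factual point x_f to the nearest
    favorably classified instance (illustrative sketch)."""
    nodes = [tuple(x_f)] + [tuple(p) for p in data]
    best = {n: math.inf for n in nodes}
    best[tuple(x_f)] = 0.0
    heap = [(0.0, tuple(x_f))]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best[u]:
            continue                      # stale heap entry
        if predict(u) == 1:               # reached a counterfactual instance
            return list(u), d
        for v in nodes:                   # neighbors within eps
            if v != u and math.dist(u, v) <= eps and d + math.dist(u, v) < best[v]:
                best[v] = d + math.dist(u, v)
                heapq.heappush(heap, (best[v], v))
    return None, math.inf                 # no recourse found (coverage gap)
```

Because the path follows observed data points, the resulting counterfactual tends to stay close to the data manifold, which is the main appeal of this family of heuristics.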
5 PROSPECTS
In the previous sections, we covered the definitions, formulations, and solutions of existing works aiming to offer algorithmic recourse. We showed that generating recourse explanations and recommendations required counterfactual reasoning based on different levels of causal knowledge. Counterfactual reasoning has roots not only in the philosophy of science [101, 102, 136, 137, 138, 244] but also in the psychology of human agents [36, 156, 157] and benefits from strong technical foundations [19, 97]. User studies have demonstrated that causal relationships are assessed by evaluating counterfactuals [151], and counterfactual simulation is used to predict future events [83]. Specifically in the context of XAI, it has been shown that counterfactuals can “make the decisions of inscrutable systems intelligible to developers and users” [36], and that people perform better at predicting model behavior when presented with counterfactual instances [125]. Organizations seek to deploy counterfactual-based explanations citing their easy-to-understand nature [30, 31] and GDPR compliance [234]. Finally, from a practitioner’s standpoint, not only does algorithmic recourse benefit from the widely exercised practice of sharing open source implementations (see Table 1), but various graphical interfaces have also been developed to assist the onboarding of non-technical stakeholders [46, 87, 241].
There are, however, several implicit assumptions made in existing setups—for example, that the world dynamics are known and do not change, that the predictive (supervised) model is fixed, and that changes only arise due to the actions of the individual seeking recourse. Moreover, in the multi-agent settings considered (with, e.g., banks and loan seekers), agents are assumed to act truthfully with no gaming or false reporting of features, and agents are aligned in the aim to minimize an agreed-upon objective function. In the following, we explore settings in which these assumptions do not hold and offer potential solutions for extending to more realistic recourse settings.
5.1 Beyond Deterministic Recourse
In (2), we saw that minimal consequential recommendations are generated subject to the constraint that the counterfactual instance, \( \boldsymbol {x}^\texttt {CF} \), is assigned to be the structural counterfactual of \( \boldsymbol {x}^\texttt {F} \) under hypothetical actions \( \boldsymbol {a} \)—that is, \( \boldsymbol {x}^\texttt {CF}= \boldsymbol {x}^\texttt {SCF}(\boldsymbol {a}; \boldsymbol {x}^\texttt {F}) \) [114]. Computing the structural counterfactual exactly, however, relies on strong assumptions (i.e., the true SCM is an additive noise model and is known). Karimi et al. [115] show that without complete knowledge of the true SCM, counterfactual analysis cannot be done exactly and thus recourse cannot be offered deterministically. Although they then present methods that offer recourse with high probability, they do so under specification of a causally sufficient graph. Future research in this direction may explore less strict settings, perhaps accounting for hidden confounders [232], or partially observed graphs [9, 51, 218], further adding to the uncertainty of recourse recommendations [27, 122]. Alternatively, sources of stochasticity may enter the recourse process via a non-deterministic decision-making system. For example, it has been demonstrated that for models trained on selective labels, fair and optimal decisions should be made stochastically [25, 119, 221].
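To make the role of the SCM explicit, the sketch below carries out the standard abduction-action-prediction steps on a toy two-variable additive-noise SCM that we assume to be fully known; the model and names are purely illustrative:

```python
def structural_counterfactual(x_f, new_x1):
    """Abduction-action-prediction for a toy additive-noise SCM
        x1 := u1,   x2 := 0.5 * x1 + u2,
    assumed fully known (illustrative). The action sets x1 to new_x1;
    downstream variable x2 changes as a consequence."""
    x1, x2 = x_f
    # Abduction: recover the exogenous noise from the factual instance.
    u1 = x1
    u2 = x2 - 0.5 * x1
    # Action: intervene do(x1 := new_x1), severing x1's dependence on u1.
    x1_cf = new_x1
    # Prediction: push the recovered noise through the modified SCM.
    x2_cf = 0.5 * x1_cf + u2
    return [x1_cf, x2_cf]
```

The abduction step is exactly where the strong assumptions bite: without the true structural equations, the noise terms cannot be recovered, and only probabilistic statements about the counterfactual remain possible.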
5.2 Beyond Supervised Recourse
In Section 3.2 we discussed how the standard binary classification setting could be extended to support multi-class classification and regression. Beyond these classic supervised learning settings, an individual may be subject to an automated decision maker that determines a matching of applicants to resources across a population (e.g., kindergarten assignment for children or housing for low-income families). Alternatively, one can expect to generate explanations in more interactive settings, such as for the actions and policies of a reinforcement learning agent [144, 145, 201, 224] or for recommender systems [60, 84]. Finally, explanations may also be generated for time-series data [5, 16, 142], which can be extended to support online data streams and models that change over time [21, 61, 182, 200, 226].
5.3 Beyond Individualized Recourse
So far, the presented formulations aimed to offer recourse explanations pertaining to a single individual and assumed that recourse recommendations would be undertaken by that individual. However, it is natural to extend the notion of recourse beyond the data subject in question, or beyond a single individual in the population.
An example of the former setting is when the family member of a patient decides on a treatment on their behalf when the patient cannot directly exercise their agency due to incapacitation [226]. One may also consider common cases in judicial processes where a legal counsel represents and seeks recourse for their client that may then be exercised by another fiduciary. In such settings, the formulation of cost and feasibility of actions may need to be adjusted to account for restrictions on both the subject and the actor.
Alternatively, recourse may be achieved through the collective action of a group of people rather than that of a single individual [114]. For instance, the efforts of social and political activists may culminate in a law change that offers better conditions for a group of individuals. In such settings, a (background) variable that is non-actionable (or incurs high cost) on an individual level may be rendered actionable on a group level, which may in turn bring down the cost for all members of the group. This example also suggests that background variables may capture contextual information (e.g., the economy) that are not characteristics of, but nonetheless affect, the individual. Furthermore, the individual may not have control over these macro variables, which change over time and violate the stationarity assumption of the world. Additionally, explicit consideration of multiple agents [175], building theories of mind for each agent so as to best persuade [123], and viewing recourse as a sequential game with non-instantaneous effects of the actions taken [170, 228] are underexplored areas for future research. Finally, the need to analyze recourse on a subpopulation level may arise due to uncertainty in assumptions [115] or as an intentional study of other properties of the system, such as fairness [46, 96, 113, 195, 223], which we explore further in the following.
5.4 On the Interplay of Recourse and Ethical ML
The preceding research questions have primarily focused on one stakeholder: the affected individual. However, giving the right of recourse to individuals should not be considered in a vacuum and independently of the effect that providing explanations/recommendations may have on other stakeholders (e.g., model deployer and regulators), or in relation to other desirable properties (e.g., fairness, security, privacy, robustness), broadly referred to as ethical ML. We explore this interplay next.
5.4.1 Recourse and Fairness.
Fairness in ML is a primary area of study for researchers concerned with uncovering and correcting potentially discriminatory behavior of ML models. In this regard, prior work has informally used the concept of fairness of recourse as a means to evaluate the fairness of predictions. For instance, Ustun et al. [223] look at comparable male/female individuals that were denied a loan and show that a disparity can be detected if the suggested recourse actions (namely, flipsets) require relatively more effort for individuals of a particular subgroup. Along these lines, Sharma et al. [209] evaluate group fairness via aggregating and comparing the cost of recourse (namely, burden) over individuals of different subpopulations. Karimi et al. [113] show that the addition of feasibility constraints (e.g., non-decreasing age) that results in an increase in the cost of recourse indicates a reliance of the fixed model on the sensitive attribute age, which is often considered as legally and socially unfair. Here we clarify that these notions are distinct and would benefit from a proper mathematical study of the relation between them.
The preceding examples suggest that evidence of discriminatory recourse (e.g., reliance on race) may be used to uncover unfair classification. We show, however, that the contrapositive statement does not hold: consider, for example, a 2D dataset comprising two subgroups (i.e., \( s \in \lbrace 0,1\rbrace \)), where \( p(x|s) = \mathcal {N}(0, 10^s) \). Consider a binary classifier, \( h: \mathbb {R} \times \mathbb {S} \rightarrow \lbrace 0,1\rbrace \), where \( h(x, s) = \text{sign}(x) \). Although the distribution of predictions satisfies demographic parity, the (average) recourse effort required of negatively predicted individuals in \( s=1 \) is larger than that required of those in \( s=0 \). Thus, we observe unfair recourse even when the predictions are demographically fair.
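This disparity can be verified numerically. The Monte Carlo sketch below (our own illustration of the example above) estimates, for each subgroup, the positive-prediction rate and the average distance to the decision boundary among negatively predicted individuals:

```python
import random, statistics

def recourse_disparity(n=50000, seed=0):
    """Simulate x | s ~ N(0, 10^s) (variance 10^s) under
    h(x, s) = sign(x). Both groups have ~50% positive rate
    (demographic parity), yet the average effort |x| required of
    negatively predicted individuals differs across groups."""
    rng = random.Random(seed)
    stats = {}
    for s in (0, 1):
        xs = [rng.gauss(0.0, (10 ** s) ** 0.5) for _ in range(n)]
        pos_rate = sum(x >= 0 for x in xs) / n
        avg_effort = statistics.mean(-x for x in xs if x < 0)
        stats[s] = (pos_rate, avg_effort)
    return stats
```

Since \( \mathbb{E}[|x| \mid x<0] \) scales with the standard deviation, the expected effort for \( s=1 \) is \( \sqrt{10} \approx 3.16 \) times that for \( s=0 \), despite identical positive rates.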
This contradiction (the conditional and contrapositive not holding simultaneously) can be resolved by considering a new and distinct notion of fairness (i.e., fairness of recourse) that neither implies nor is implied by the fairness of prediction. In this regard, equalizing recourse was recently presented by Gupta et al. [96], who offered the first recourse-based (and prediction-independent) notion of fairness. The authors demonstrate that one can directly calibrate for the average distance to the decision boundary to be equalized across different subgroups during the training of both linear and non-linear classifiers. A natural extension would involve considering the cost of recourse actions, as opposed to the distance to the decision boundary, in flipping the prediction across subgroups. In summary, recourse may trigger new definitions of fairness to ensure non-discriminatory behavior of ML models, as in the work of Von Kügelgen et al. [233].
5.4.2 Recourse and Robustness.
Robustness often refers to our expectation that model outputs should not change as a result of (small) changes to the input. In the context of recourse, we expect that similar individuals should receive similar explanations/recommendations, or that recourse suggestions for an individual should be to some extent invariant to the underlying decision-making system trained on the same dataset [204]. In practice, however, the stability of both gradient-based [8, 65, 85, 152] and counterfactual-based [131, 133] explanation systems has been called into question. Interestingly, it has been argued that it is possible for a model to have robust predictions but non-robust explanations [98], and vice versa [131] (similar to the relation between fair predictions and fair recourse). Parallel studies argue that the sparsity of counterfactuals may contribute to non-robust recourse when evaluating explanations generated under different fixed models [182]. Finally, in the case of consequential recommendations, robustness will be affected by assumptions of the causal generative process (see Figure 1). Carefully reviewing assumptions and exploring such robustness issues in more detail is necessary to build trust in the recourse system and in turn in the algorithmic decision-making system. For recent works in this direction, which study the sensitivity of recourse methods to uncertainty in the features, model, or causal assumptions, refer to works found elsewhere [62, 66, 160, 173, 211, 222, 230]. Another interesting research question pertains to the generation of recourse when actionability of the recommended actions changes over time or due to external sources (e.g., inflation). Studying and accounting for various failure modes of recourse is important to offer robust and ultimately useful recourse.
5.4.3 Recourse, Security, and Privacy.
Model extraction concerns have been raised in various settings for ML APIs [141, 197, 220, 236]. In such settings, an adversary aims to obtain a surrogate model, \( \hat{f} \), that is similar (e.g., in fidelity) to the target model, f: (6) \( \begin{equation} \begin{aligned}f \approx \hat{f} = \mathrm{arg}\min _{\hat{f} \in \mathcal {F}} \mathbb {E}_{\boldsymbol {x}\sim P(\boldsymbol {x})}\Big [\mathcal {L}\big (f(\boldsymbol {x}), \hat{f}(\boldsymbol {x})\big)\Big ]. \end{aligned} \end{equation} \)
Here, an adversary may have access to various model outputs (classification label [141], class probabilities [220], etc.) under different query budgets (unlimited, rate-limited, etc. [44, 108]). Model extraction may be accelerated in the presence of additional information, such as gradients of outputs with respect to inputs7 [159], or contrastive explanations [7]. Related to recourse, and of practical concern [21, 208, 213, 223], is a study of the ability of an adversary with access to a recourse API in extracting a model. Specifically, we consider a setting in which the adversary has access to a prediction API and a recourse API, which given a factual instance, \( \boldsymbol {x}^\texttt {F} \), returns a nearest contrastive explanation, \( \boldsymbol {x}^\texttt {*CF} \), using a known distance function, \( \Delta : \mathcal {X} \times \mathcal {X} \rightarrow \mathbb {R}_+ \).8 How many queries should be made to these APIs to perform a functionally equivalent extraction of \( f(\cdot) \)?
In a first attack strategy, one could learn a surrogate model on a dataset where factual instances and labels (from the training set or randomly sampled) are augmented with counterfactual instances and counterfactual labels. This idea was explored by Aïvodji et al. [7], who demonstrated that a high-fidelity surrogate model can be extracted even under low query budgets. Although easy to understand and implement, this attack strategy implicitly assumes that the constructed dataset is independent and identically distributed, and thus does not make use of the relations between factual and counterfactual pairs.
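A minimal sketch of this first attack strategy, with a 1-nearest-neighbor surrogate standing in for an arbitrary supervised learner; the oracle interfaces and names are illustrative, not from the cited work:

```python
def extract_surrogate(query_pred, query_recourse, seeds):
    """Label seed points with the prediction API; pair each negatively
    predicted point with its counterfactual from the recourse API; fit
    a surrogate on the augmented dataset (illustrative sketch)."""
    dataset = []
    for x in seeds:
        y = query_pred(x)
        dataset.append((x, y))
        if y == 0:                        # augment with the counterfactual
            dataset.append((query_recourse(x), 1))
    def surrogate(x):                     # 1-NN over the harvested points
        return min(dataset,
                   key=lambda d: sum((a - b) ** 2
                                     for a, b in zip(d[0], x)))[1]
    return surrogate
```

The counterfactual points hug the target's decision boundary, which is precisely why so few queries suffice for a high-fidelity surrogate.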
An alternative attack strategy considers that the model f can be fully represented by its decision boundaries, or the complementary decision regions \( \lbrace \mathcal {R}_1, \ldots , \mathcal {R}_l\rbrace \). Every contrastive explanation returned from the recourse API informs us that all instances surrounding the factual instance, up to a distance of \( \Delta (\boldsymbol {x}^\texttt {F}, \boldsymbol {x}^\texttt {*CF}) \), share the same class label as \( \boldsymbol {x}^\texttt {F} \) according to f (otherwise, one of those instances would be the nearest contrastive explanation). More formally, \( f(\boldsymbol {x}^\texttt {F}) = f(\boldsymbol {x}) ~ \forall ~ \boldsymbol {x}\in \mathcal {B}^\Delta _{\boldsymbol {x}^\texttt {F}} \), where \( \mathcal {B}^\Delta _{\boldsymbol {x}^\texttt {F}} \) is referred to as the \( \Delta \)-factual ball, centered at \( \boldsymbol {x}^\texttt {F} \) with radius \( \Delta (\boldsymbol {x}^\texttt {F}, \boldsymbol {x}^\texttt {*CF}) \). The model extraction problem can thus be formulated as determining the number of factual balls, n, needed to maximally pack all decision regions (Figure 2): (7) \( \begin{equation} \begin{aligned}\Pr \Big [ \mathrm{Vol}(\mathcal {R}_l) - \mathrm{Vol}\Big (\bigcup _{i=1,\; \mathcal {B}_i \subseteq \mathcal {R}_l}^{n} \mathcal {B}_i\Big) \le \epsilon \Big ] \ge 1 - \delta ~~ \forall ~~ l, \end{aligned} \end{equation} \) where \( \mathcal {B}_i \) denotes the \( \Delta \)-factual ball of the ith queried factual instance.
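The following sketch illustrates how recorded \( \Delta \)-factual balls certify labels for unqueried points without further API calls; the interfaces and names are ours:

```python
import math

def certified_label(x, queried, predict):
    """Each past recourse query (x_f, x_cf) certifies that every point
    strictly within Delta(x_f, x_cf) of x_f shares x_f's label (the
    Delta-factual ball). Reuse those certificates to label new points
    for free; return None if x lies outside every recorded ball
    (illustrative sketch; Delta is Euclidean here)."""
    for x_f, x_cf in queried:
        radius = math.dist(x_f, x_cf)
        if math.dist(x, x_f) < radius:   # strictly inside the ball
            return predict(x_f)          # label certified by the API
    return None
```

Accumulating enough balls to cover each decision region up to volume \( \epsilon \) is exactly the packing problem formalized in (7).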
As in other extraction settings, \( \hat{f} \) can then be used to infer private information about individuals in the training set, to uncover exploitable system vulnerabilities, or for free internal use. Understanding attack strategies may guide recourse policy and the design of defensive mechanisms to hinder the exploitation of such vulnerabilities.
Surprisingly, a model need not be extracted in the preceding sense to be revealing of sensitive information. Building on the preceding intuition, we note that a single contrastive explanation informs the data subject that there are no instances in a certain vicinity (i.e., within \( \mathcal {B}^\Delta _{\boldsymbol {x}^\texttt {F}} \)) such that their prediction is different. This information informs the data subject about, for example, whether their similar friend was also denied a loan, violating their predictive privacy. Even under partial knowledge of the friend’s attributes, an adversary may use the information about the shared predictions in \( \mathcal {B}^\Delta _{\boldsymbol {x}^\texttt {F}} \) to perform membership inference attacks [210] or infer missing attributes [72]. This issue is exacerbated when multiple diverse explanations are generated and remains an open problem.
5.4.4 Recourse and Manipulation.
Although a central goal of recourse is to foster trust between an individual and an automated system, it would be simplistic to assume that all parties will act truthfully in this process. For instance, having learned something about the decision-making process (perhaps through recommendations given to similar individuals), an individual may exaggerate some of their attributes for a better chance of favorable treatment [226]. Trust can also be violated by the recourse-offering party. As discussed earlier, the multiplicity of recourse explanations/recommendations (see Section 4.1.1) may allow an organization to cherry-pick “the most socially acceptable explanation out of many equally plausible ones” [21, 98, 126] (also see fairwashing [6]). In such cases of misaligned incentives, the oversight of a regulatory body, perhaps with random audits of either party, seems necessary. Another solution may involve mandating a minimum number of diverse recourse offerings, although this would conflict with the security considerations discussed earlier.
5.5 Toward Unifying Benchmarks
Table 1 presents an overview of the diverse settings in which recourse is sought. Despite the abundance of open source implementations built on robust tools and working well in their respective settings, a comparative benchmark for recourse is lacking. This problem is exacerbated for consequential recommendations that further rely on assumptions about the causal generative process. To make objective progress, however, new contributions should be evaluated against existing methods. Thus, a next step for the community is the curation of an online challenge (e.g., using Kaggle) to benchmark the performance of existing and new methods. To broadly cover practical settings, we envision multiple tracks where authors can submit their generated explanations/recommendations given a fixed classifier, test dataset, and pre-determined
6 CONCLUSION
Our work started with a case study of a 28-year-old female professional who was denied a loan by an automated decision-making system. We aimed to assist this individual in overcoming their difficult situation—that is, to achieve algorithmic recourse, which was contingent on offering answers to two questions: why and how? We studied the relation between these questions and arrived at distinct responses, namely contrastive explanations and consequential recommendations. Mindful of the goal of recourse, we emphasized minimal consequential recommendations over nearest contrastive explanations, as the former directly optimizes for the least effort from the individual. Furthermore, we noted that offering recourse recommendations automatically implied recourse explanations (through simulation of the causal effect of undertaken actions), whereas the converse would not. In reviewing the literature, however, we observed an underexploration of consequential recommendations, which we attribute to its reliance on additional assumptions at the level of the causal generative process of the world in which actions take place.
We remark that the primary emphasis of recourse is on providing actionable recommendations for overcoming adverse predictions, unlike counterfactual explanations that highlight similar instances with different predictions. This subtlety is also apparent in parallel literature on adversarial ML (see Section 3.6): counterfactual explanations identify the adversarial instances, whereas recourse informs on how to create them. Therefore, to provide actionable recommendations under real-world constraints, we must consider that actions on one subset of variables may have consequences on other variables. Such constraints can be accounted for using the SCM framework, and therefore we primarily take this approach in our work.
In addition to unifying and precisely defining recourse, we present an overview of the many constraints (e.g., actionability, plausibility, diversity, sparsity) that are needed to model realistic recourse settings. With accompanying illustrative examples, we distinguish between the notions of
Finally, we identify several prospective research directions that challenge the assumptions of existing setups and present extensions to better situate recourse in the broader ethical ML literature. The presented examples and discussion serve to illustrate the diversity of stakeholder needs and a tension between the desirable system properties (fairness, security, privacy, robustness) that we seek to offer alongside recourse. Satisfyingly addressing these needs and navigating the entailed trade-offs may require new definitions and techniques, and relies on the cross-disciplinary expertise of a panel of technical and social scientists. It is our hope that this document may guide further discussion and progress in this direction.
ACKNOWLEDGMENTS
A.-H. Karimi sincerely thanks the senior authors for encouraging him to undertake the daunting task of writing a first draft, which eventually resulted in this article. A.-H. Karimi is also appreciative of Julius von Kügelgen and Umang Bhatt for fruitful discussions on recourse and fairness, of Muhammad Waleed Gondal and the anonymous reviewers for constructive feedback throughout, and of NSERC and CLS for generous funding support.
Footnotes
1 A common assumption when offering recommendations is that the world is stationary; thus, actions that would have led me to develop this profile had they been performed in the past will result in the same were they to be performed now. This assumption is challenged in other works [194, 226] and discussed further in Section 5.3.
Footnote2 Note that “some researchers tend to either collapse or intentionally distinguish contrastive from counterfactual reasoning despite their conceptual similarity” [215], adding to confusion. For cross-disciplinary reviews, please refer to other works [156, 157, 215].
Footnote3 Relatedly, the counterfactual instance that results from performing optimal actions, \( \boldsymbol {a}^* \), need not correspond to the counterfactual instance resulting from optimally and independently shifting features according to \( \boldsymbol {\delta }^* \); see Proposition 4.1 and Figure 1 in the work of Karimi et al. [114]. This discrepancy may arise due to, for example, minimal recommendations suggesting that actions be performed on an ancestor of those variables that are input to the model.
Footnote4 Optimization terminology refers to both of these constraint sets as feasibility sets.
Footnote5 The existence of multiple equally costly recourse actions is commonly referred to as the Rashoman effect [34].
Footnote6 Alternative categorization of recourse generating methods can be found in the work of Redelmeier et al. [196].
Footnote7 A large class of explanation methods rely on the gradients to offer saliency/attribution maps, especially in the image domain.
8 Explanation models such as MACE [113] provide \( \epsilon \)-optimal solutions \( \boldsymbol {x}^{\texttt {CF}}_{\epsilon } \) satisfying \( f(\boldsymbol {x}^\texttt {F}) \not= f(\boldsymbol {x}^{\texttt {CF}}_{\epsilon }) \) and \( \Delta (\boldsymbol {x}^\texttt {F}, \boldsymbol {x}^{\texttt {CF}}_{\epsilon }) \le \Delta (\boldsymbol {x}^\texttt {F}, \boldsymbol {x}^\texttt {*CF}) + \epsilon \), where \( \boldsymbol {x}^\texttt {*CF} \) is the optimal nearest contrastive explanation. In practice, \( \epsilon = 10^{-5} \), which in turn results in \( \boldsymbol {x}^{\texttt {CF}}_{\epsilon }\approx \boldsymbol {x}^\texttt {*CF} \).
9 Since the manuscript of this work was first published online, several works have pushed in this direction [58, 106, 192, 251], and a public repository for running many recourse-generating methods in a unified and comparable fashion has been released [181].
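The \( \epsilon \)-optimality guarantee described in Footnote 8 can be sketched as a simple check. The classifier `f`, the cost function `delta`, and all instances below are hypothetical toy choices for illustration only, not taken from MACE [113] itself:

```python
# Minimal sketch of the epsilon-optimality condition of Footnote 8.
# All names and values here are illustrative assumptions, not MACE internals.

EPS = 1e-5

def f(x):
    # Toy binary classifier: hypothetical decision boundary x[0] + x[1] >= 1.
    return int(x[0] + x[1] >= 1.0)

def delta(a, b):
    # Toy L1 distance standing in for the dissimilarity measure Delta.
    return sum(abs(u - v) for u, v in zip(a, b))

def is_eps_optimal(x_f, x_cf_eps, x_cf_star, eps=EPS):
    """Check that x_cf_eps flips the prediction of the factual x_f and
    that its cost is within eps of the optimal counterfactual x_cf_star."""
    flipped = f(x_f) != f(x_cf_eps)
    near_optimal = delta(x_f, x_cf_eps) <= delta(x_f, x_cf_star) + eps
    return flipped and near_optimal

x_f = (0.2, 0.3)            # factual instance, predicted 0
x_cf_star = (0.2, 0.8)      # optimal nearest counterfactual (cost 0.5)
x_cf_eps = (0.2, 0.800001)  # solver output within eps of the optimum

assert is_eps_optimal(x_f, x_cf_eps, x_cf_star)
```

A solver satisfying this check returns counterfactuals whose cost exceeds the true optimum by at most \( \epsilon \), which is why a tiny \( \epsilon \) renders the two practically indistinguishable.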
References
- [1] 2018. Trends and trajectories for explainable, accountable and intelligible systems: An HCI research agenda. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. 1–18.
- [2] 2018. Peeking inside the black-box: A survey on explainable artificial intelligence (XAI). IEEE Access 6 (2018), 52138–52160.
- [3] 1996. Adult Data Set. Retrieved April 12, 2022 from https://archive.ics.uci.edu/ml/datasets/adult.
- [4] 2010. The inverse classification problem. Journal of Computer Science and Technology 25, 3 (2010), 458–468.
- [5] 2020. Cold-start promotional sales forecasting through gradient boosted-based contrastive explanations. IEEE Access 8 (2020), 137574–137586.
- [6] 2019. Fairwashing: The risk of rationalization. arXiv preprint arXiv:1901.09749 (2019).
- [7] 2020. Model extraction from counterfactual explanations. arXiv preprint arXiv:2009.01884 (2020).
- [8] 2018. On the robustness of interpretability methods. arXiv preprint arXiv:1806.08049 (2018).
- [9] 1996. Identification of causal effects using instrumental variables. Journal of the American Statistical Association 91, 434 (1996), 444–455.
- [10] 2016. Machine bias. ProPublica. Retrieved April 12, 2022 from https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing.
- [11] 2019. Efficient computation of counterfactual explanations of LVQ models. arXiv preprint arXiv:1908.00735 (2019).
- [12] 2019. On the computation of counterfactual explanations—A survey. arXiv preprint arXiv:1911.07749 (2019).
- [13] 2020. Convex density constraints for computing plausible counterfactual explanations. arXiv preprint arXiv:2002.04862 (2020).
- [14] 2017. Latent space oddity: On the curvature of deep generative models. arXiv preprint arXiv:1710.11379 (2017).
- [15] 2020. Geometrically enriched latent spaces. arXiv preprint arXiv:2008.00565 (2020).
- [16] 2020. Counterfactual explanations for machine learning on multivariate time series data. arXiv preprint arXiv:2008.10781 (2020).
- [17] 2013. UCI Machine Learning Repository.
- [18] 2019. Imperceptible adversarial attacks on tabular data. arXiv preprint arXiv:1911.03274 (2019).
- [19] 2020. On Pearl’s hierarchy and the foundations of causal inference. ACM Special Volume in Honor of Judea Pearl (provisional title) (2020).
- [20] 2017. Fairness in machine learning. NIPS Tutorial 1 (2017).
- [21] 2020. The hidden assumptions behind counterfactual explanations and principal reasons. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. 80–89.
- [22] 2020. Plausible counterfactuals: Auditing deep learning classifiers with realistic adversarial examples. arXiv preprint arXiv:2003.11323 (2020).
- [23] 2011. CVC4. In Proceedings of the International Conference on Computer Aided Verification. 171–177.
- [24] 2017. Interpretability via model extraction. arXiv preprint arXiv:1706.09773 (2017).
- [25] 2019. Equal opportunity in online classification with partial feedback. In Advances in Neural Information Processing Systems. 8974–8984.
- [26] 2009. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences 2, 1 (2009), 183–202.
- [27] 2022. Causal explanations and XAI. arXiv preprint arXiv:2201.13169 (2022).
- [28] 2019. The need for uncertainty quantification in machine-assisted medical decision making. Nature Machine Intelligence 1, 1 (2019), 20–23.
- [29] 2020. An ASP-based approach to counterfactual explanations for classification. arXiv preprint arXiv:2004.13237 (2020).
- [30] 2020. Machine learning explainability for external stakeholders. arXiv preprint arXiv:2007.05408 (2020).
- [31] 2020. Explainable machine learning in deployment. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. 648–657.
- [32] n.d. Explanation and justification in machine learning: A survey.
- [33] 2004. Convex Optimization. Cambridge University Press.
- [34] 2001. Statistical modeling: The two cultures (with comments and a rejoinder by the author). Statistical Science 16, 3 (2001), 199–231.
- [35] 2016. How the machine ‘thinks’: Understanding opacity in machine learning algorithms. Big Data & Society 3, 1 (2016), 2053951715622512.
- [36] 2019. Counterfactuals in explainable artificial intelligence (XAI): Evidence from human reasoning. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI’19). 6276–6282.
- [37] 2007. Knowledge actionability: Satisfying technical and business interestingness. International Journal of Business Intelligence and Data Mining 2, 4 (2007), 496–514.
- [38] 2006. Domain-driven actionable knowledge discovery in the real world. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining. 821–830.
- [39] 2009. Flexible frameworks for actionable knowledge discovery. IEEE Transactions on Knowledge and Data Engineering 22, 9 (2009), 1299–1312.
- [40] 2017. Towards evaluating the robustness of neural networks. In Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP’17). IEEE, Los Alamitos, CA, 39–57.
- [41] 2019. Unlabeled data improves adversarial robustness. arXiv preprint arXiv:1905.13736 (2019).
- [42] 2022. Counterfactual explanations for functional data: A mathematical optimization approach.
- [43] 2019. EMAP: Explanation by minimal adversarial perturbation. arXiv preprint arXiv:1912.00872 (2019).
- [44] 2020. Boosting decision-based black-box adversarial attacks with random sign flip. In Proceedings of the European Conference on Computer Vision.
- [45] 2020. Strategic recourse in linear classification. arXiv preprint arXiv:2011.00355 (2020).
- [46] 2020. DECE: Decision explorer with counterfactual explanations for machine learning models. arXiv preprint arXiv:2008.08353 (2020).
- [47] 2022. Counterfactuals and causability in explainable artificial intelligence: Theory, algorithms, and applications. Information Fusion 81 (2022), 59–83.
- [48] 2021. Counterfactual explanations for models of code. arXiv preprint arXiv:2111.05711 (2021).
- [49] 2019. Certified adversarial robustness via randomized smoothing. In Proceedings of the International Conference on Machine Learning. 1310–1320.
- [50] 2019. Efficient candidate screening under multiple tests and implications for fairness. arXiv preprint arXiv:1905.11361 (2019).
- [51] 1999. Causal discovery from a mixture of experimental and observational data. In Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence. 116–125.
- [52] 2018. The measure and mismeasure of fairness: A critical review of fair machine learning. arXiv preprint arXiv:1808.00023 (2018).
- [53] 2009. Introduction to Algorithms. MIT Press, Cambridge, MA.
- [54] 2009. V12.1: User’s manual for CPLEX. International Business Machines Corporation 46, 53 (2009), 157.
- [55] 2015. Optimal action extraction for random forests and boosted trees. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 179–188.
- [56] 2020. Multi-objective counterfactual explanations. arXiv preprint arXiv:2004.11165 (2020).
- [57] 2008. Z3: An efficient SMT solver. In Proceedings of the International Conference on Tools and Algorithms for the Construction and Analysis of Systems. 337–340.
- [58] 2021. A framework and benchmarking study for counterfactual generating methods on tabular data. Applied Sciences 11, 16 (2021), 7274.
- [59] 2022. Synthesizing explainable counterfactual policies for algorithmic recourse with program synthesis. arXiv preprint arXiv:2201.07135 (2022).
- [60] 2020. Recommendations and user agency: The reachability of collaboratively-filtered information. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. 436–445.
- [61] 2021. Instance-based counterfactual explanations for time series classification. In Proceedings of the International Conference on Case-Based Reasoning. 32–47.
- [62] 2021. Uncertainty estimation and out-of-distribution detection for counterfactual explanations: Pitfalls and solutions. arXiv preprint arXiv:2107.09734 (2021).
- [63] 2018. Explanations based on the missing: Towards contrastive explanations with pertinent negatives. In Advances in Neural Information Processing Systems. 592–603.
- [64] 2019. Model agnostic contrastive explanations for structured data. arXiv preprint arXiv:1906.00117 (2019).
- [65] 2019. Explanations can be manipulated and geometry is to blame. arXiv preprint arXiv:1906.07983 (2019).
- [66] 2021. On the adversarial robustness of causal algorithmic recourse. arXiv preprint arXiv:2112.11313 (2021).
- [67] 2018. Strategic classification from revealed preferences. In Proceedings of the 2018 ACM Conference on Economics and Computation. 55–70.
- [68] 2017. Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608 (2017).
- [69] 2020. CRUDS: Counterfactual recourse using disentangled subspaces. In Proceedings of the 2020 ICML Workshop on Human Interpretability in Machine Learning (WHI’20). 1–23.
- [70] 2011. Efficient action extraction with many-to-many relationship between actions and features. In Proceedings of the International Workshop on Logic, Rationality, and Interaction. 384–385.
- [71] 2017. UCI Machine Learning Repository. Retrieved April 12, 2022 from http://archive.ics.uci.edu/ml.
- [72] 2018. Privacy-preserving prediction. arXiv preprint arXiv:1803.10266 (2018).
- [73] 2012. Fairness through awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference. ACM, New York, NY, 214–226.
- [74] 2021. Discriminative attribution from counterfactuals. arXiv preprint arXiv:2109.13412 (2021).
- [75] 2021. Explaining classifiers using adversarial perturbations on the perceptual ball. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 10693–10702.
- [76] 2015. Fundamental limits on adversarial robustness. In Proceedings of the ICML Workshop on Deep Learning.
- [77] 2020. Explaining data-driven decisions made by AI systems: The counterfactual approach. arXiv preprint arXiv:2001.07417 (2020).
- [78] 2021. The intriguing relation between counterfactual explanations and adversarial examples. Minds and Machines 32, 1 (2021), 1–33.
- [79] 2014. Comprehensible classification models: A position paper. ACM SIGKDD Explorations Newsletter 15, 1 (2014), 1–10.
- [80] 2021. Feature attribution and recourse via probabilistic contrastive counterfactuals. In Proceedings of the ICML Workshop on Algorithmic Recourse. 1–6.
- [81] 2015. PySMT: A solver-agnostic library for fast prototyping of SMT-based algorithms. In Proceedings of the SMT Workshop, Vol. 2015.
- [82] 2011. Causality and statistical learning. arXiv:1003.2619 (2011).
- [83] 2017. Eye-tracking causality. Psychological Science 28, 12 (2017), 1731–1744.
- [84] 2020. PRINCE: Provider-side interpretability with counterfactual explanations in recommender systems. In Proceedings of the 13th International Conference on Web Search and Data Mining. 196–204.
- [85] 2019. Interpretation of neural networks is fragile. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33. 3681–3688.
- [86] 2018. Explaining explanations: An overview of interpretability of machine learning. In Proceedings of the 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA’18). IEEE, Los Alamitos, CA, 80–89.
- [87] 2020. ViCE: Visual counterfactual explanations for machine learning models. In Proceedings of the 25th International Conference on Intelligent User Interfaces. 531–535.
- [88] 2014. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014).
- [89] 2019. Counterfactual visual explanations. arXiv preprint arXiv:1904.07451 (2019).
- [90] 2018. Interpretable credit application predictions with counterfactual explanations. arXiv preprint arXiv:1811.05245 (2018).
- [91] 2020. On the ethics of algorithmic decision-making in healthcare. Journal of Medical Ethics 46, 3 (2020), 205–211.
- [92] 2019. Black box explanation by learning image exemplars in the latent feature space. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases. 189–205.
- [93] 2018. Local rule-based explanations of black box decision systems. arXiv preprint arXiv:1805.10820 (2018).
- [94] 2018. A survey of methods for explaining black box models. ACM Computing Surveys 51, 5 (2018), 1–42.
- [95] 2018. A survey of learning causality with data: Problems and methods. arXiv preprint arXiv:1809.09337 (2018).
- [96] 2019. Equalizing recourse across groups. arXiv preprint arXiv:1909.03166 (2019).
- [97] 2005. Causes and explanations: A structural-model approach. Part I: Causes. British Journal for the Philosophy of Science 56, 4 (2005), 843–887.
- [98] 2020. Robustness in machine learning explanations: Does it matter? In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. 640–647.
- [99] 2016. Strategic classification. In Proceedings of the 2016 ACM Conference on Innovations in Theoretical Computer Science. 111–122.
- [100] 2020. PermuteAttack: Counterfactual explanation of machine learning credit scorecards. arXiv preprint arXiv:2008.10138 (2020).
- [101] 1990. Conversational processes and causal explanation. Psychological Bulletin 107, 1 (1990), 65.
- [102] 1986. Knowledge-based causal attribution: The abnormal conditions focus model. Psychological Review 93, 1 (1986), 75.
- [103] n.d. FICO Explainable Machine Learning Challenge. Retrieved April 12, 2022 from https://community.fico.com/s/explainable-machine-learning-challenge.
- [104] 2019. The disparate effects of strategic manipulation. In Proceedings of the Conference on Fairness, Accountability, and Transparency. 259–268.
- [105] 2018. Multimodal explanations: Justifying decisions and pointing to the evidence. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 8779–8788.
- [106] 2021. On quantitative evaluations of counterfactuals. arXiv preprint arXiv:2111.00177 (2021).
- [107] 2019. Metric learning for individual fairness. arXiv preprint arXiv:1906.00250 (2019).
- [108] 2018. Black-box adversarial attacks with limited queries and information. arXiv preprint arXiv:1804.08598 (2018).
- [109] 2019. REVISE: Towards realistic individual recourse and actionable explanations in black-box decision making systems. arXiv preprint arXiv:1907.09615 (2019).
- [110] 2020. DACE: Distribution-aware counterfactual explanation by mixed-integer linear optimization. In Proceedings of the 29th International Joint Conference on Artificial Intelligence: Main Track (IJCAI’20). 2855–2862.
- [111] 2020. Counterfactual explanation based on gradual construction for deep networks. arXiv preprint arXiv:2008.01897 (2020).
- [112] 2013. Decision tree and naive Bayes algorithm for classification and generation of actionable knowledge for direct marketing. Journal of Software Engineering 4 (2013), 196–206.
- [113] 2020. Model-agnostic counterfactual explanations for consequential decisions. In Proceedings of the International Conference on Artificial Intelligence and Statistics. 895–905.
- [114] 2021. Algorithmic recourse: From counterfactual explanations to interventions. In Proceedings of the 4th Conference on Fairness, Accountability, and Transparency (FAccT’21). 353–362.
- [115] 2020. Algorithmic recourse under imperfect causal knowledge: A probabilistic approach. In Proceedings of the 34th Conference on Neural Information Processing Systems (NeurIPS’20). 265–277.
- [116] 2017. Reluplex: An efficient SMT solver for verifying deep neural networks. In Proceedings of the International Conference on Computer Aided Verification. 97–117.
- [117] 2021. If only we had better counterfactual explanations: Five key deficits to rectify in the evaluation of counterfactual XAI techniques. arXiv preprint arXiv:2103.01035 (2021).
- [118] 2020. Good counterfactuals and where to find them: A case-based technique for generating counterfactuals for explainable AI (XAI). arXiv preprint arXiv:2005.13997 (2020).
- [119] 2020. Fair decisions despite imperfect predictions. In Proceedings of the International Conference on Artificial Intelligence and Statistics. 277–287.
- [120] 2020. How do classifiers induce agents to invest effort strategically? ACM Transactions on Economics and Computation 8, 4 (2020), 1–23.
- [121] 2021. Towards unifying feature attribution and counterfactual explanations: Different means to the same end. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society. 652–663.
- [122] 2021. A causal perspective on meaningful and robust algorithmic recourse. arXiv preprint arXiv:2107.07853 (2021).
- [123] 2021. Persuasive contrastive explanations for Bayesian networks. In Proceedings of the European Conference on Symbolic and Quantitative Approaches with Uncertainty. 229–242.
- [124] 2020. Counterfactual explanation of machine learning survival models. arXiv preprint arXiv:2006.16793 (2020).
- [125] 2019. An evaluation of the human-interpretability of explanation. arXiv preprint arXiv:1902.00006 (2019).
- [126] 2020. “How do I fool you?” Manipulating user trust via misleading black box explanations. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society. 79–85.
- [127] 2016. Propublica/compas-analysis. Retrieved April 13, 2022 from https://github.com/propublica/compas-analysis.
- [128] 2017. Generalized inverse classification. In Proceedings of the 2017 SIAM International Conference on Data Mining. 162–170.
- [129] 2018. Prophit: Causal inverse classification for multiple continuously valued treatment policies. arXiv preprint arXiv:1802.04918 (2018).
- [130] 2017. A budget-constrained inverse classification framework for smooth classifiers. In Proceedings of the 2017 IEEE International Conference on Data Mining Workshops (ICDMW’17). IEEE, Los Alamitos, CA, 1184–1193.
- [131] 2019. Issues with post-hoc counterfactual explanations: A discussion. arXiv preprint arXiv:1906.04774 (2019).
- [132] 2017. Inverse classification for comparison-based interpretability in machine learning. arXiv preprint arXiv:1712.08443 (2017).
- [133] 2019. The dangers of post-hoc interpretability: Unjustified counterfactual explanations. arXiv preprint arXiv:1907.09294 (2019).
- [134] 1966. Branch-and-bound methods: A survey. Operations Research 14, 4 (1966), 699–719.
- [135] 2021. Strategic classification made practical. In Proceedings of the International Conference on Machine Learning. 6243–6253.
- [136] 1973. Counterfactuals. Harvard University Press, Cambridge, MA.
- [137] 1986. Causal explanation. In Philosophical Papers Vol. II. Oxford University Press, 214–240.
- [138] 1990. Contrastive explanation. Royal Institute of Philosophy Supplement 27 (1990), 247–266.
- [139] 2018. The mythos of model interpretability. Queue 16, 3 (2018), 31–57.
- [140] 2020. The disparate equilibria of algorithmic decision making when individuals invest rationally. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. 381–391.
- [141] 2005. Adversarial learning. In Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining. ACM, New York, NY, 641–647.
- [142] 2020. Why does my model fail? Contrastive local explanations for retail forecasting. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. 90–98.
- [143] 2019. Actionable interpretability through optimizable counterfactual explanations for tree ensembles. arXiv preprint arXiv:1911.12199 (2019).
- [144] 2019. Explainable reinforcement learning through a causal lens. arXiv preprint arXiv:1905.10958 (2019).
- [145] 2020. Distal explanations for explainable reinforcement learning agents. arXiv preprint arXiv:2001.10284 (2020).
- [146] 2019. Preserving causal constraints in counterfactual explanations for machine learning classifiers. arXiv preprint arXiv:1912.03277 (2019).
- [147] 2000. The cost-minimizing inverse classification problem: A genetic algorithm approach. Decision Support Systems 29, 3 (2000), 283–300.
- [148] 2014. Explaining data-driven document classifications. MIS Quarterly 38, 1 (2014), 73–100.
- [149] 2021. Counterfactual explanations for employment services.
- [150] 2005. A survey of interestingness measures for knowledge discovery. Knowledge Information Review 20, 1 (2005), 39–61.
- [151] 1993. Contrastive and counterfactual reasoning in causal judgment. Journal of Personality and Social Psychology 64, 6 (1993), 897.
- [152] 2018. Towards robust interpretability with self-explaining neural networks. In Advances in Neural Information Processing Systems. 7775–7784.
- [153] 2020. GANterfactual-counterfactual explanations for medical non-experts using generative adversarial learning. arXiv preprint arXiv:2012.11905 (2020).
- [154] 1956. The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review 63, 2 (1956), 81.
- [155] 2020. Strategic classification is causal modeling in disguise. In Proceedings of the International Conference on Machine Learning. 6917–6926.
- [156] 2018. Contrastive explanation: A structural-model approach. arXiv preprint arXiv:1811.03163 (2018).
- [157] 2019. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence 267 (2019), 1–38.
- [158] 2019. The social cost of strategic classification. In Proceedings of the Conference on Fairness, Accountability, and Transparency. 230–239.
- [159] 2019. Model reconstruction from model explanations. In Proceedings of the Conference on Fairness, Accountability, and Transparency. 1–9.
- [160] 2021. A survey on the robustness of feature importance and counterfactual explanations. arXiv preprint arXiv:2111.00358 (2021).
- [161] 2019. Explaining explanations in AI. In Proceedings of the Conference on Fairness, Accountability, and Transparency. 279–288.
- [162] 2021. Scaling guarantees for nearest counterfactual explanations. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society (AIES’21). 177–187.
- [163] 2018. Methods for interpreting and understanding deep neural networks. Digital Signal Processing 73 (2018), 1–15.
- [164] 2017. Universal adversarial perturbations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1765–1773.
- [165] 2016. DeepFool: A simple and accurate method to fool deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2574–2582.
- [166] 2020. Causal interpretability for machine learning: Problems, methods and evaluation. ACM SIGKDD Explorations Newsletter 22, 1 (2020), 18–33.
- [167] 2019. DiCE: Explaining machine learning classifiers through diverse counterfactual explanations. arXiv preprint arXiv:1905.07697 (2019).
- [168] 2002. Multi-objective evolutionary algorithms for the risk–return trade-off in bank loan management. International Transactions in Operational Research 9, 5 (2002), 583–597.
- [169] 2018. Fair inference on outcomes. In Proceedings of the AAAI Conference on Artificial Intelligence. 1931.
- [170] 2021. Consequence-aware sequential counterfactual generation. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases. 682–698.
- [171] 2020. Handling incomplete heterogeneous data using VAEs. Pattern Recognition 107 (2020), 107501.
- [172] 2015. Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 427–436.
- [173] 2021. Distributionally robust recourse action. arXiv:2110.00088v2 [math.OC] (2021).
- [174] 2006. Numerical Optimization. Springer Science & Business Media.
- [175] 2021. Multi-agent algorithmic recourse. arXiv preprint arXiv:2110.00673 (2021).
- [176] 2014. Gurobi Optimizer Reference Manual, 2015. Available at http://www.gurobi.com.
- [177] 2017. Practical black-box attacks against machine learning. In Proceedings of the 2017 ACM Asia Conference on Computer and Communications Security. 506–519.
- [178] 2016. The limitations of deep learning in adversarial settings. In Proceedings of the 2016 IEEE European Symposium on Security and Privacy (EuroS&P’16). IEEE, Los Alamitos, CA, 372–387.
- [179] 2017. General heuristics for nonconvex quadratically constrained quadratic programming. arXiv preprint arXiv:1703.07870 (2017).
- [180] 2021. Exploring counterfactual explanations through the lens of adversarial examples: A theoretical and empirical analysis. arXiv preprint arXiv:2106.09992 (2021).
- [181] 2021. CARLA: A Python library to benchmark algorithmic recourse and counterfactual explanation algorithms. arXiv preprint arXiv:2108.00783 (2021).
- [182] 2020. On counterfactual explanations under predictive multiplicity. arXiv preprint arXiv:2006.13132 (2020).
- [183] 2019. Towards user empowerment. arXiv preprint arXiv:1910.09398 (2019).
- [184] 2000. Causality: Models, Reasoning and Inference, Vol. 29. Springer.
- [185] 2010. The foundations of causal inference. Sociological Methodology 40, 1 (2010), 75–149.
- [186] 2016. Causal Inference in Statistics: A Primer. John Wiley & Sons.
- [187] 2014. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP’14). 1532–1543.
- [188] 2017. Elements of Causal Inference. MIT Press, Cambridge, MA.
- [189] 2019. FACE: Feasible and actionable counterfactual explanations. arXiv preprint arXiv:1909.09369 (2019).
- [190] 2019. Synthesizing action sequences for modifying model decisions. arXiv preprint arXiv:1910.00057 (2019).
- [191] 2019. Counterfactual explanation algorithms for behavioral and textual data. arXiv preprint arXiv:1912.01819 (2019).
- [192] 2021. Understanding consumer preferences for explanations generated by XAI algorithms. arXiv preprint arXiv:2107.02624 (2021).
- [193] 2019. Generating counterfactual and contrastive explanations using SHAP. arXiv preprint arXiv:1906.09293 (2019).
- [194] 2020. Can I still trust you? Understanding the impact of distribution shifts on algorithmic recourses. arXiv preprint arXiv:2012.11788 (2020).
- [195] 2020. Beyond individualized recourse: Interpretable and interactive summaries of actionable recourses. In Proceedings of the 34th Conference on Neural Information Processing Systems (NeurIPS’20). 1–12.
- [196] 2021. MCCE: Monte Carlo sampling of realistic counterfactual explanations. arXiv preprint arXiv:2111.09790 (2021).
- [197] 2019. Efficiently stealing your machine learning models. In Proceedings of the 18th ACM Workshop on Privacy in the Electronic Society. 198–210.
- [198] 2016. “Why should I trust you?” Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 1135–1144.
- [199] 2018. Contrastive Explanation for Machine Learning. Master’s Thesis.
- [200] 2021. Explainable artificial intelligence (XAI) on timeseries data: A survey. arXiv preprint arXiv:2104.00950 (2021).
- [201] 2020. From predictions to decisions: Using lookahead regularization. In Proceedings of the 34th Conference on Neural Information Processing Systems (NeurIPS’20). 1–12.
- [202] 2020. Ensuring actionable recourse via adversarial training. arXiv preprint arXiv:2011.06146 (2020).
- [203] 2015. Explaining Explanation. Routledge.
- [204] 2019. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence 1, 5 (2019), 206–215.
- [205] 2019. Efficient search for diverse coherent explanations. In Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT*’19). ACM, New York, NY, 20–28.
- [206] 2019. Causality for machine learning. arXiv preprint arXiv:1911.10500 (2019).
- [207] 2020. We need fairness and explainability in algorithmic hiring. In Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems. 1716–1720.
- [208] 2018. The intuitive appeal of explainable machines. Fordham Law Review 87 (2018), 1085.
- [209] 2019. CERTIFAI: Counterfactual explanations for robustness, transparency, interpretability, and fairness of artificial intelligence models. arXiv preprint arXiv:1905.07857 (2019).
- [210] 2019. Privacy risks of explaining machine learning models. arXiv preprint arXiv:1907.00164 (2019).
- [211] 2021. Counterfactual explanations can be manipulated. In Proceedings of the 35th Conference on Neural Information Processing Systems (NeurIPS’21). 1–14.
- [212] 2018. Conversational explanations of machine learning predictions through class-contrastive counterfactual statements. In Proceedings of the 27th International Joint Conference on Artificial Intelligence: Doctoral Consortium (IJCAI’18). 5785–5786.
- [213] . 2019. Counterfactual explanations of machine learning predictions: Opportunities and challenges for AI safety. In Proceedings of the AAAI Workshop on Artificial Intelligence Safety (SafeAI@AAAI’19). 1–4.Google Scholar
- [214] . 2012. Optimization for Machine Learning. MIT Press, Cambridge, MA.Google ScholarDigital Library
- [215] . 2021. A survey of contrastive and counterfactual explanation generation methods for explainable artificial intelligence. IEEE Access 9 (2021), 11974–12001.Google ScholarCross Ref
- [216] . 2013. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199 (2013).Google Scholar
- [217] . 2021. Solving the class imbalance problem using a counterfactual method for data augmentation. arXiv preprint arXiv:2111.03516 (2021).Google Scholar
- [218] . 2001. Causal discovery from changes. In Proceedings of the 17th Conference on Uncertainty in Artificial Intelligence. 512–521.Google Scholar
- [219] . 2017. Interpretable predictions of tree-based ensembles via actionable feature tweaking. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 465–474.Google ScholarDigital Library
- [220] . 2016. Stealing machine learning models via prediction APIs. In Proceedings of the 25th USENIX Security Symposium (USENIX Security’16). 601–618.Google Scholar
- [221] . 2020. Decisions, counterfactual explanations and strategic behavior. arXiv preprint arXiv:2002.04333 (2020).Google Scholar
- [222] . 2021. Towards robust and reliable algorithmic recourse. In Proceedings of the 35th Conference on Neural Information Processing Systems (NeurIPS’21). 1–12.Google Scholar
- [223] . 2019. Actionable recourse in linear classification. In Proceedings of the Conference on Fairness, Accountability, and Transparency. ACM, New York, NY, 10–19.Google ScholarDigital Library
- [224] . 2018. Contrastive explanations with local foil trees. arXiv preprint arXiv:1806.07470 (2018).Google Scholar
- [225] . 2019. Interpretable counterfactual explanations guided by prototypes. arXiv preprint arXiv:1907.02584 (2019).Google Scholar
- [226] . 2020. The philosophical basis of algorithmic recourse. In Proceedings of the Conference on Fairness, Accountability, and Transparency. ACM, New York, NY.Google ScholarDigital Library
- [227] . 2020. Counterfactual explanations for machine learning: A review. arXiv preprint arXiv:2010.10596 (2020).Google Scholar
- [228] . 2021. Amortized generation of sequential counterfactual explanations for black-box models. arXiv preprint arXiv:2106.03962 (2021).Google Scholar
- [229] . 2020. Explainable image classification with evidence counterfactual. arXiv preprint arXiv:2004.07511 (2020).Google Scholar
- [230] . 2022. On the robustness of counterfactual explanations to adverse perturbations. arXiv preprint arXiv:2201.09051 (2022).Google Scholar
- [231] . 2017. The EU General Data Protection Regulation (GDPR): A Practical Guide. Springer.Google Scholar
- [232] . 2021. Algorithmic recourse in partially and fully confounded settings through bounding counterfactual effects. arXiv preprint arXiv:2106.11849 (2021).Google Scholar
- [233] . 2020. On the fairness of causal algorithmic recourse. arXiv preprint arXiv:2010.06529 (2020).Google Scholar
- [234] . 2017. Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harvard Journal of Law & Technology 31, 2 (2017), 841–887.Google Scholar
- [235] . 1992. Legal recourse and the demand for auditing. Accounting Review 67, 1 (1992), 121–147.Google Scholar
- [236] . 2018. Stealing hyperparameters in machine learning. In Proceedings of the 2018 IEEE Symposium on Security and Privacy (SP’18). IEEE, Los Alamitos, CA, 36–52.Google ScholarCross Ref
- [237] . 2020. SCOUT: Self-aware discriminant counterfactual explanations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 8981–8990.Google ScholarCross Ref
- [238] . 2021. Counterfactual explanations for survival prediction of cardiovascular ICU patients. In Proceedings of the International Conference on Artificial Intelligence in Medicine. 338–348.Google ScholarDigital Library
- [239] . 2022. Model agnostic generation of counterfactual explanations for molecules. Chemical Science 13 (2022), 3697–3705.Google ScholarCross Ref
- [240] . 2017. Challenges for transparency. arXiv preprint arXiv:1708.01870 (2017).Google Scholar
- [241] . 2019. The what-if tool: Interactive probing of machine learning models. IEEE Transactions on Visualization and Computer Graphics 26, 1 (2019), 56–65.Google Scholar
- [242] . 2019. Measurable counterfactual local explanations for any classifier. arXiv preprint arXiv:1908.03020 (2019).Google Scholar
- [243] . 1994. A genetic algorithm tutorial. Statistics and Computing 4, 2 (1994), 65–85.Google ScholarCross Ref
- [244] . 2005. Making Things Happen: A Theory of Causal Explanation. Oxford University Press.Google Scholar
- [245] . 2016. Causation and manipulability. In The Stanford Encyclopedia of Philosophy (winter 2016 ed.), (Ed.). Metaphysics Research Lab, Stanford University.Google Scholar
- [246] . 2019. Feature denoising for improving adversarial robustness. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 501–509.Google ScholarCross Ref
- [247] . 2006. Extracting actionable knowledge from decision trees. IEEE Transactions on Knowledge and Data Engineering 19, 1 (2006), 43–56.Google ScholarCross Ref
- [248] . 2009. The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients. Expert Systems with Applications 36, 2 (2009), 2473–2480.Google ScholarDigital Library
- [249] . 2018. Interpreting neural network judgments via minimal, stable, and symbolic corrections. In Advances in Neural Information Processing Systems. 4874–4885.Google Scholar
- [250] . 2020. Fast real-time counterfactual explanations. arXiv preprint arXiv:2007.05684 (2020).Google Scholar
- [251] . 2021. Evaluating the quality of machine learning explanations: A survey on methods and metrics. Electronics 10, 5 (2021), 593.Google ScholarCross Ref
- [252] . 1998. An Evolutionary Algorithm for Multiobjective Optimization: The Strength Pareto Approach. TIK-Report No. 43. Computer Engineering and Communication Networks Lab, Swiss Federal Institute of Technology, Zurich, Switzerland.Google Scholar