4.1.1 Overview
The mathematical definitions vary depending on the type of decision-support system (classification, ranking, regression, recommendation, etc.), but also on the underlying fairness notion, such as group fairness, individual fairness, or causal fairness [191].
Recently, new notions of fairness involving more than one type of stakeholder and protected group (e.g. multi-sided fairness [31]) have been proposed for recommender systems: recommendations could be fair not only to the clients but also to the reviewers or providers of a service [102], or to the items presented in the system [84, 90, 170, 210].
New fairness notions could also be identified from the social sciences in order to better align these systems with actual fairness values. Many of the proposed fairness definitions and metrics have multiple limitations [79]. For instance, group fairness does not account for unfairness within a given group; hence, individual fairness was later proposed by Dwork et al. [49]. Most fairness definitions are based on equality notions of fairness, but other notions might be more relevant for certain use cases (e.g. affirmative action [123], equity, need [58]). Besides, the identification of unfair situations through causality is exploited by Madras et al. [115]. Indeed, most definitions rely on correlation rather than causation, whereas the ultimate goal of these systems and metrics is to support decisions that are ideally based on causal arguments.
4.1.2 Fairness metrics
Here, we give examples of the main mathematical definitions and metrics of fairness used for classification tasks.
All definitions and metrics assume the preliminary definition of a protected and a non-protected group of records (usually each record refers to a different individual), defined over the values of one or multiple sensitive attributes (also called protected attributes). For instance, in the aforementioned bank example, each record would represent a client of the bank, with the attributes representing the information about this client. A sensitive attribute could be the gender, nationality, or age of the client. A protected group could be defined as all the clients whose age is between 15 and 25, or as all the female clients whose age is in this interval. In the rest of this section, for the sake of clarity, we will take the male clients as the non-protected group and all other clients as the protected group. Most existing metrics only handle a single protected group, with the rest of the records aggregated into the non-protected group.
The definitions and metrics also require knowing the label the classifier predicted for each record (e.g. a positive prediction when a loan is granted and a negative prediction otherwise).
Most definitions rely on the comparison of statistical measures, and more specifically on checking the equality of multiple probabilities, while unfairness is quantified by computing either the difference or the ratio of these probabilities. The definitions and metrics differ in the underlying values of fairness that they reflect, and in the exact measures and information required to compute them.
Group Fairness. Group fairness based on predicted labels. The first group of metrics only requires knowledge of the predictions of a classifier for each record in a dataset and the membership of each record to the protected or non-protected group at stake. An example of such a metric is statistical parity [49]. Statistical parity is verified if the records in both the protected and non-protected groups have an equal probability of receiving a positive outcome. An extension of this metric is conditional statistical parity [40], which is verified when the above probabilities are equal once conditioned on another attribute.
In our bank example, the model would be considered fair according to this definition if the male applicants and the other applicants had the same probability of being labelled as likely to repay the loan; for the conditional variant, this equality is required once all other attributes are equal.
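To make the computation concrete, the following is a minimal Python sketch of a statistical parity check on hypothetical toy data; the function name, the 0/1 encoding of predictions (1 = loan granted), and the 0/1 group flag are our own illustrative choices, not taken from the cited works.

```python
# A minimal sketch of a statistical parity check (hypothetical toy data).

def statistical_parity_difference(y_pred, group):
    """Difference between the positive-prediction rates of the two groups.

    y_pred: 0/1 predictions (1 = positive outcome, e.g. loan granted).
    group:  0/1 flags (1 = protected group, 0 = non-protected group).
    """
    protected = [p for p, g in zip(y_pred, group) if g == 1]
    non_protected = [p for p, g in zip(y_pred, group) if g == 0]
    # Statistical parity is verified when this difference is (close to) zero.
    return sum(protected) / len(protected) - sum(non_protected) / len(non_protected)

# Toy example: six loan decisions, three clients per group.
y_pred = [1, 0, 0, 1, 1, 1]
group  = [1, 1, 1, 0, 0, 0]
print(statistical_parity_difference(y_pred, group))  # -> -0.666...
```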
Group fairness based on predicted labels and ground-truth labels. The second group of metrics requires knowing both the classifier predictions and the ideal label that a record should be associated with. A classifier is fair according to these metrics when a measure of accuracy or error computed independently for the protected and the non-protected groups is equal across groups. This measure can be the true positive rate, the true negative rate, the false positive rate, the false negative rate, both the true positive and false positive rates simultaneously (named equalized odds [70]), the error rate, the positive predictive value, the negative predictive value, the false discovery rate, the false omission rate, or ratios of errors (e.g. the ratio of false negatives to false positives) [38]. All these metrics have different ethical implications, outlined in Verma et al. [191].
In our example, a model would be fair based on these definitions if the selected measure of accuracy or error is the same for both the male clients and the other clients. For instance, for the true negative rate, the model would be fair when the probability for male clients labelled as likely to default to actually default is equal to this probability for the protected group. For the definition based on recall, the model would be fair if the recall is the same for male and other clients, i.e. if the proportion of male clients wrongly labelled as likely to default, among the male clients that would actually repay the loan, is the same as the corresponding proportion for the clients of the protected group.
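As an illustration, here is a small Python sketch of an equalized-odds check that compares the true positive and false positive rates of the two groups; the data, names, and encodings are hypothetical.

```python
# A minimal sketch of an equalized-odds check (hypothetical toy data).

def rates(y_true, y_pred):
    """True positive rate and false positive rate for one group."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)

def equalized_odds_gaps(y_true, y_pred, group):
    """Absolute TPR and FPR gaps between the protected (1) and
    non-protected (0) groups; equalized odds requires both to be zero."""
    split = lambda xs, g: [x for x, gi in zip(xs, group) if gi == g]
    tpr1, fpr1 = rates(split(y_true, 1), split(y_pred, 1))
    tpr0, fpr0 = rates(split(y_true, 0), split(y_pred, 0))
    return abs(tpr1 - tpr0), abs(fpr1 - fpr0)

# Toy example: four records per group (1 = repaid / loan granted).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
group  = [1, 1, 1, 1, 0, 0, 0, 0]
print(equalized_odds_gaps(y_true, y_pred, group))  # -> (0.5, 0.5)
```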
Group fairness based on prediction probabilities and ground-truth labels. The third group of metrics requires knowing the prediction probabilities of the classifier and the ideal label. For instance, calibration [95] is verified when, for any predicted probability score, the records in both groups have the same probability of truly belonging to the positive class. For our example, this would mean that, for any given probability score between 0 and 1, the clients receiving this score should have the same likelihood of actually repaying the loan whether they belong to the protected or the non-protected group.
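A minimal sketch of such a check, assuming hypothetical probability scores and deliberately coarse score bins (a real evaluation would use finer bins and far more data):

```python
# A minimal sketch of a per-group calibration check (hypothetical toy data).

def calibration_by_group(scores, y_true, group, n_bins=2):
    """Observed positive rate per score bin, separately for each group.

    Calibration across groups holds when, within every bin, the observed
    rates for the protected (1) and non-protected (0) groups are equal.
    """
    tables = {0: {}, 1: {}}
    for s, t, g in zip(scores, y_true, group):
        b = min(int(s * n_bins), n_bins - 1)  # bin index for this score
        tables[g].setdefault(b, []).append(t)
    return {g: {b: sum(v) / len(v) for b, v in bins.items()}
            for g, bins in tables.items()}

scores = [0.9, 0.8, 0.2, 0.1, 0.9, 0.7, 0.3, 0.2]  # predicted P(repay)
y_true = [1, 1, 0, 0, 1, 0, 0, 0]                  # 1 = actually repaid
group  = [1, 1, 1, 1, 0, 0, 0, 0]
# High-score clients repay at rate 1.0 in the protected group but only 0.5
# in the non-protected group: a calibration violation.
print(calibration_by_group(scores, y_true, group))
```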
These conceptions of fairness each take the point of view of a different stakeholder. While the recall-based definition addresses the question bank clients would ask themselves ("what is my probability of being incorrectly rejected?"), the true negative rate-based definition better fits the bank's point of view ("of the clients I decided to reject, how many would have actually repaid the loan?"). The statistical parity metric could be considered to take society's viewpoint, as supported by regulations in some countries ("is the set of people to whom a loan is granted demographically balanced?").
Individual Fairness. Another set of metrics, often named individual fairness metrics as opposed to the above metrics that compare measures computed on groups (referred to as group fairness metrics), relies on the idea that similar individuals should be treated similarly, regardless of their membership to one of the groups. Computing such metrics requires knowledge of every attribute that defines the similarity between records, as well as of the classification outputs.
Fairness through unawareness [100] is associated with the idea that the sensitive attribute should not be used in the prediction process. In our example, this would simply mean that the gender of the clients is not used by the model, either during training or deployment.
Causal discrimination [56] is verified when the outputs of a classifier are the same for individuals who have the same values for all attributes except the sensitive ones. Two bank clients asking for the same loan, having similar financial and employment situations, and differing only in their gender should receive the same predictions from the model.
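This definition suggests a simple test: flip the sensitive attribute of each record while keeping everything else fixed, and compare the predictions. The sketch below assumes a trained classifier exposed as a predict function and records stored as dictionaries with a gender field; these names and the toy predictor are hypothetical.

```python
# A minimal sketch of a causal-discrimination check (hypothetical names/data).

def causal_discrimination_violations(records, predict, sensitive="gender"):
    """Records whose prediction changes when only the sensitive value flips."""
    violations = []
    for record in records:
        flipped = dict(record)
        # Swap the sensitive attribute, keep every other attribute identical.
        flipped[sensitive] = "female" if record[sensitive] == "male" else "male"
        if predict(record) != predict(flipped):
            violations.append(record)
    return violations

# A toy predictor that (unfairly) uses gender and therefore gets flagged.
predict = lambda r: 1 if r["income"] > 50 and r["gender"] == "male" else 0
records = [{"income": 60, "gender": "male"}, {"income": 60, "gender": "female"}]
print(causal_discrimination_violations(records, predict))  # both flagged
```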
Finally, fairness through awareness [49] is verified when the distance between the output distributions of different records is no greater than the distance between these records. Bank clients, all being more or less similar to one another, should receive predictions that follow the same ordering of similarity: two clients who are similar according to the chosen metric should receive highly similar predictions, while two clients who are farther apart may receive predictions that are not necessarily as similar.
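In sketch form, this is a Lipschitz condition: the distance D between two outputs must not exceed the distance d between the corresponding records. The metrics d and D below (absolute differences over a single scalar feature and over prediction scores) are hypothetical stand-ins for the task-specific similarity metric the definition assumes.

```python
# A minimal sketch of a fairness-through-awareness (Lipschitz) check.

def lipschitz_violations(records, scores, d, D):
    """Pairs of records treated less similarly than they are similar."""
    violations = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            # Fairness through awareness requires D(outputs) <= d(inputs).
            if D(scores[i], scores[j]) > d(records[i], records[j]):
                violations.append((i, j))
    return violations

d = lambda x, y: abs(x - y)  # similarity metric between records
D = lambda p, q: abs(p - q)  # distance between prediction scores
records = [0.1, 0.2, 0.9]    # one scalar feature per client
scores  = [0.2, 0.8, 0.9]    # predicted probability of repaying the loan
print(lipschitz_violations(records, scores, d, D))  # -> [(0, 1)]
```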
Generally, the underlying idea behind these notions of individual fairness is that group fairness notions cannot capture unfairness arising within the groups. Essentially, group fairness reflects averages over sets of individuals: if the averages across groups are similar, then the model is considered fair. Individual fairness, instead, is concerned with each individual and how they are treated in comparison with all other individuals: while a group average might look fair, two individuals within the same group might still receive disparate treatment. In our example, an unfairness measure such as disparate impact could be low, meaning that male and female clients are granted loans at similar rates, indicating that the model is fair under this measure. However, two female clients with similar financial status could still be treated differently, one receiving the loan and the other not, while the group-level average remains close to that of the male group. This is the type of issue that individual fairness metrics target.
“Combinations” of Metrics. Kearns et al. [91] showed that both group fairness and individual fairness metrics present important limitations in scenarios where multiple protected groups are defined over the intersection of multiple sensitive attributes, despite these scenarios being the most common ones in practice. Typically, the metrics might not account for unfairness issues in certain intersectional groups. In reaction to these limitations, they introduced a new set of metrics that combine the underlying ideas of both group and individual fairness, and a new set of algorithms to optimize machine learning classifiers for them.
Causal Fairness. A last set of metrics relies on causal relations between records and predictions and requires the establishment of a causal graph [93]. For instance, counterfactual fairness [100] is verified when the predictions do not depend on a descendant of the protected attribute node in the graph. In our example, using such metrics would require providing a causal graph in which the protected attribute is one of the nodes, and would entail verifying that the node representing the loan acceptance/rejection decision does not depend on the protected attribute node.
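As a rough illustration, the toy structural causal model below (all structural equations and thresholds are invented for the example) makes a feature X a descendant of the sensitive attribute A; a classifier built on X then fails the counterfactual check, since intervening on A while holding the background variable U fixed can flip its prediction.

```python
# A minimal sketch of a counterfactual-fairness check on a toy structural
# causal model (all structural equations here are hypothetical).

def feature(a, u):
    # Structural equation: X is a descendant of the sensitive attribute A.
    return u + 0.5 * a

def predict(x):
    # A classifier using X inherits A's influence through the causal graph.
    return 1 if x > 1.0 else 0

def is_counterfactually_fair(u_samples):
    """Check that the prediction is invariant under the interventions
    A <- 0 and A <- 1, with the background variable U held fixed."""
    return all(predict(feature(0, u)) == predict(feature(1, u))
               for u in u_samples)

print(is_counterfactually_fair([0.2, 0.8, 1.5]))  # -> False (u = 0.8 flips)
```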