The COVID-19 pandemic has drastically changed the way we relate to each other (Singh and Singh
2020), since isolation and DSFMs have significantly impacted human social interactions (Meléndez et al. 2020). When people wear DSFMs, the bottom half of their face (i.e., nose tip, mouth, overall face contour) is covered, and the available perceptual information is reduced. Faces are also the major communication channel for emotion expression (Armony and Vuilleumier
2013). As a matter of fact, emotion expression recruits face muscles in unique ways, and some areas of the face—mainly the eyes, nose, and mouth—convey different fundamental cues (Ekman
1993; Shiota et al.
2003). With safety devices covering about 60–70% of the face area relevant for emotional expression, emotion recognition becomes harder (Grundmann et al.
2021; Marini et al.
2021). Given the massive use of DSFMs in daily life during the COVID-19 pandemic, understanding the mechanisms of compromised identity recognition and emotion perception is of considerable importance (Grundmann et al.
2021).
As such, our study had two aims: first, to investigate to what extent DSFMs affect face recognition, and whether learning conditions (presence/absence of DSFMs) could affect recognition performance; second, to assess whether DSFMs interfere with facial emotion recognition, and whether high empathy levels could facilitate emotion recognition in faces wearing DSFMs.
Facial identity recognition and DSFMs
The upper half of the face, especially the eyes, allows people to properly recognize faces (Dal Martello and Maloney
2006; Fisher and Cox
1975). However, the lower part also plays an important role in this process, as pointed out by previous research showing that mouth covering (as in the case of DSFMs) leads to reduced recognition accuracy compared to unobstructed faces, as a result of interference with holistic processes (Tanaka and Farah
1993; Tanaka and Sengco
1997). Indeed, the observer is no longer able to process key information about spatial relationships between facial features (Maurer et al.
2002). On this point, our study shows that in some cases DSFMs have a strong effect on performance, specifically when faces are shown and learned partially covered (learning condition with DSFMs). Indeed, in Block 1, in which the learning phase took place without DSFMs, face discrimination performance declined with DSFMs. By contrast, in Block 2, in which the learning phase included only faces with DSFMs, participants were better at discriminating faces with than without DSFMs. One plausible explanation for the worsening of face recognition in Block 1 is that lower-face coverage could have disrupted holistic face processing, making it difficult for the observer to extract configural information and to build a unified representation of the face when an obstacle was present (Carragher
2020).
In Block 2, participants were asked to memorize faces with DSFMs; therefore, holistic processing might not have taken place. We can speculate that, in this case, feature-based processes were immediately engaged, and the observers’ focus moved to the individual characteristics of the observed face (Tanaka and Farah
1993); thus, DSFMs hindered the processing of the face as a unified configuration. Although face recognition relies on both featural and configural processing, their effects are sometimes dissociable (Cabeza and Kato
2000). Indeed, several studies show the importance of featural aspects. For instance, Kimchi and Amishav (
2010) showed that when faces differ in one component only (e.g., eyes, nose, or mouth), correct discrimination between similar identities is determined by the discriminability of that component itself. Similarly, Cabeza and Kato (
2000) showed that similarity of individual features between learned target faces and new ones tends to impair identity recognition (the so-called “prototype effect”). This suggests that faces’ individual components, if available in memory, can guide face recognition. This is probably what happened in our study when faces were partially occluded by the DSFMs, with holistic processing overtaken by featural recognition. Moreover, previous studies in which participants were asked to memorize face parts (e.g., nose or eyes) showed that it is hard to ignore irrelevant information when the learned parts are embedded in a full face (i.e., holistic interference), whereas performance is good when participants are asked to recognize single parts only (Leder and Carbon
2005). These findings are in line with our results, where better recognition for faces wearing DSFMs emerged only if faces had previously been learnt with DSFMs. In everyday life, many of us experienced this phenomenon during the COVID-19 pandemic when, after meeting new people wearing DSFMs, we were surprised by how their faces actually looked once the DSFMs were taken off for the first time.
The relevance of the learning modality in face memory emerged also in terms of response bias (i.e., participants’ willingness to respond that a target face in the testing phase did appear in the previous learning phase). Indeed, we found that participants’ responses were more conservative (i.e., a higher tendency to reject the “target”) when DSFMs were absent and the study faces were learnt without DSFMs. By contrast, when faces were learnt with DSFMs, responses were more conservative in trials with DSFMs. This result highlights a stronger conservative response bias when the learning modality matched the test modality, with greater caution toward the risk of false alarms. The fact that performance appears to depend upon the learning stage could reflect separate processing tracks for faces learnt with and without DSFMs.
Overall, our results from Experiment 1 suggest that (i) DSFMs have an overall detrimental effect on memory performance but, critically, (ii) this effect is mediated by learning (with or without DSFMs). This also indicates that (iii) masked and unmasked face processing might rely on qualitatively different mechanisms; specifically, holistic processing represents the “default mode” of face processing, whereas under certain conditions (e.g., masked faces) face recognition could be achieved with featural processing.
However, some caution is warranted. Our results are in line with Bruce and Young’s (1986) theory of facial processing, which indicates that any odd element in partially covered faces interferes with proper face structural encoding (which normally takes place with a totally uncovered face), and recent work using the face inversion effect provides evidence that DSFMs disrupt holistic processing (Freud et al. 2020; Stajduhar et al. 2022). Nevertheless, we did not directly test holistic processes, and thus we cannot exclude that the observed effects might be due to a general (i.e., non-face-sensitive) context effect.
Facial expression recognition and DSFMs
The debate on the processes underlying emotional facial expression recognition is still ongoing. Some theories stress the importance of holistic processes (Tanaka et al.
2012; Prazak and Burgund
2014; White
2000), while others emphasize specific facial features’ role (Calvo and Nummenmaa
2008; Ellison and Massaro
1997). Specifically, studies using different experimental manipulations supported the role of holistic and configural mechanisms in the recognition of facial expressions, as evidenced by the face inversion (Derntl et al.
2009a,
b; Prkachin
2003) and composite (Calder and Janesen
2005) effects, two paradigms specifically designed to impair holistic face processing.
On the contrary, other evidence suggests that emotion recognition is based on individual facial features (e.g., pulling the corners of the mouth or lowering the eyebrows) (Calvo and Nummenmaa
2008), as also emerged from eye-tracking studies (Bombari et al.
2013). Since the specific features of each emotion are heterogeneously distributed across the face (Eisenbarth and Alpers
2011), people tend to preferentially look at the features that are peculiar to each emotion (Calvo and Nummenmaa
2008). It has been shown that, under certain conditions, people rely more on feature-based mechanisms of emotion recognition rather than holistic ones, as in the case of prosopagnosia (Palermo et al.
2011). It is therefore possible that, given the different and complex characteristics of each emotion, recognition cannot be reduced to featural processing or holistic processing alone for all emotions (Beaudry et al.
2013).
We hypothesized that DSFMs would determine an overall decrease in emotion recognition, in line with recent findings on face and emotion perception with DSFMs (Grundmann et al.
2021; Marini et al.
2021). However, DSFMs in our study had specific effects on different emotions. This result could stem from the characteristic traits of each emotion, as well as their related recognition processes (Wegrzyn et al.
2017). No differences emerged in our study for
fear recognition with and without DSFMs, in line with previous studies showing that it mainly relies on the upper area of the face (i.e., the eyes) (Beaudry et al.
2013), which typically appear open and tense, fixed on a single point (Ekman and Friesen
2003). In line with our results, fear is conveyed by high, but not low, spatial frequencies, and thus should not be affected by nose and mouth covering (Smith and Schyns
2009). A similar result emerged for
anger recognition, with no significant differences between the two conditions. A recent study (Grenville and Dwyer
2022) paradoxically showed that anger recognition accuracy was higher with DSFMs than without. This result, which was replicated in our accuracy data (see Supplementary material), might stem from biased participants’ responses: as highlighted by the SDT (d’) analysis, anger discrimination was not actually better with DSFMs; anger was simply signaled more frequently as “target present,” thus increasing the number of correct answers collected. The literature on anger and fear recognition reports a bias toward the top half of the face, meaning that recognition of these emotions relies heavily on information from the upper half of the face, most likely the eye region (Calder et al.
2000; Wegrzyn et al.
2015).
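For reference, the SDT indices discussed here (d′ for discriminability and the criterion for response bias) can be derived from hit and false-alarm counts. The following is a minimal illustrative sketch in Python (a hypothetical helper, not the analysis code used in this study), applying a common log-linear correction to avoid extreme rates:

```python
from statistics import NormalDist

def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Compute d' (discriminability) and criterion c (response bias).

    The log-linear correction (+0.5 per cell) prevents infinite
    z-scores when an observed rate equals 0 or 1.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    d_prime = z(hit_rate) - z(fa_rate)             # higher = better discrimination
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # > 0 conservative, < 0 liberal
    return d_prime, criterion
```

Under this convention, a positive criterion indicates a conservative tendency (fewer “target present” responses), which is how the response bias results in this section are read: inflated “target present” responding can raise raw accuracy without any gain in d′.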
With respect to
happiness, discrimination worsened with DSFMs. This emotion is mostly recognized through the mouth; however, since it is the only “pure” positive emotion and occurs more often than sadness, fear, or anger in human relationships (Tomkins
1962), it could be easier to recognize happy faces even when the lower part of the face is obscured. The eyes, therefore, seem sufficient for happiness recognition in most cases, even when holistic processes were disrupted by DSFMs. As for happiness, discriminability for faces expressing
sadness was significantly worse with DSFMs. Sadness’ peculiar features are drawn-down lip corners and lowered, knitted eyebrows; the latter, however, is not distinctive of sadness, since it is shared with anger and fear. It is therefore possible that DSFMs led to worse recognition performance due to disrupted holistic processes (i.e., the matching of eyes with mouths).
Lastly,
disgust recognition was the most affected by the presence of DSFMs, switching from the best discriminated emotion without DSFMs to the least discriminated with DSFMs. In this case, the diagnostic feature is the mouth, which also conveys the intensity of the emotional state experienced (Ekman and Friesen
2003). Given that accurate recognition requires focusing on the lower part of the face, participants’ disgust discrimination performance worsened when this region was obscured by the DSFMs (Beaudry et al.
2013).
Confusion matrices (see Supplementary Figs. 1 and 2) can also help highlight potential patterns of systematic errors between the different emotions. Since disgust was the emotion most affected by DSFMs, it is interesting to look more closely at its misclassification pattern. The confusion matrices show that disgust was occasionally misperceived as anger when faces were completely available (Supplementary Fig. 1), but it was mostly labeled as anger when faces were occluded by DSFMs (Supplementary Fig. 2); the converse did not happen (i.e., anger was rarely misperceived as disgust). This qualitative misinterpretation between anger and disgust has already been documented in other studies (Wegrzyn et al.
2017; Pochedly et al.
2012), and interpreted on the basis of the “nose scrunch” feature shared by both these emotions (Pochedly et al.
2012), as well as the pulled down eyebrows (Dubey and Singh
2016). Thus, when the diagnostic feature of disgust (i.e., the mouth) is hidden, people may tend to base their judgment mainly on the eye region, which is similar to that of anger expressions, and erroneously recognize disgust as anger.
Further mechanisms might account for our results, such as the participants’ emotional state and their psychological and social condition (e.g., isolation during the pandemic), which can impact face recognition abilities (Alharbi et al.
2019). Indeed, the drastic and “negative” changes in daily life habits could have fostered people’s psychological distress, and even caused symptoms of post-traumatic stress disorder (PTSD) or mood alterations, even in the healthy population (Bai et al.
2004; Brooks et al.
2020). Social confinement, as in the time of the COVID-19 pandemic, might induce people to focus on stimuli having a negative component (e.g., sadness). This is explained by the “emotional congruence” phenomenon (Meléndez et al.
2020), which postulates that emotional states tend to ease the encoding of stimuli having the emotional valence the encoder is experiencing (Loeffler et al.
2013). Moreover, when individuals are exposed to stressful situations (as is the case with a global pandemic), they tend to develop anger as a reactive means of establishing safety (Smith et al.
2021). In addition, angry faces broadly represent an important cue of social threat, and several experiments showed that they are detected more accurately, and require shorter processing time and fewer attentional resources as compared to other emotions (Pinkham et al.
2010; Calvo et al.
2006). The potential adaptive value of anger recognition could further explain our findings about participants’ tendencies to misinterpret disgust as anger.
Our results from response bias show that participants’ bias varied based on the specific emotion and the DSFMs vs. no-DSFMs condition. Indeed, while no significant bias emerged for happiness and fear, participants showed more conservative tendencies (i.e., a higher threshold for judging that a certain emotion was present) for anger and sadness stimuli without DSFMs than with DSFMs. By contrast, when the target emotion was disgust, participants’ bias was more conservative with DSFMs than without. However, every erroneous response for each emotion in our task might have differentially impacted the others (e.g., a high rate of false alarms for the target anger impacts hits, misses, and correct rejections differently for each of the other emotions), which in turn could affect response bias scores. When considering the responses for each alternative (in our case, each target emotion with and without DSFMs), we should take into account that responses are not “independent” from each other. Indeed, we could assume participants were more (or less) prone to the risk of erroneous responses with certain emotions than with others, but it would be hard to completely disentangle such cross-emotion effects. To facilitate interpretation of the results, one can consider the confusion matrices (see Supplementary material) revealing the relationship between each target with and without DSFMs; for example, anger was the most frequently chosen response with DSFMs compared to other emotions. The rates of confusion highlight the non-independence of our target’s levels and imply caution when interpreting SDT response bias.
Overall, our results suggest that DSFMs affect the recognition processes of some specific emotions, and this might point to a complementary, flexible, and interactive role of holistic and feature-based processes in emotion recognition, especially in atypical situations, as when diagnostic information is limited to a specific face region and cannot be drawn from the whole face. We can speculate that DSFMs operate as a “misaligning condition” (i.e., as if the top and bottom halves of the face were spatially misaligned), as seen in the “composite-face task” (Calder et al.
2000). As such, they interfere with the holistic processes commonly used to recognize emotions when the entire face is available (Tanaka et al.
2012). When the DSFM is on, local processes need to be engaged, potentially altering facial expression processing. Since emotions have their peculiar characteristics but at the same time share features (e.g., the nose “scrunch” in anger and disgust), we might argue that holistic but not feature-based processing is advantageous for the recognition of some emotions, while feature-based but not holistic processing is a preferential strategy for other expressions.