Not only the frequency of laughter but also its other acoustic features show great variety across experiments. Although the components of laughs have traditionally been conceived of predominantly as vowel-like outbursts (e.g. Darwin 1872/1998; Ruch 1993; Nwokah et al. 1999), there may be considerable variation between the individual sounds that make up laughter (Urbain and Dutoit 2011; Ruch and Ekman 2001). It has now been shown that the structure of laughter is much more complex than earlier hypothesised. Provine (1996) characterised laughter acoustically as short vowel-like notes (about 75 ms long) that recur several times at regular intervals, separated from each other by 210-ms-long unvoiced aspirations. The mean fundamental frequency was about 502 Hz for female laughter and 276 Hz for male laughter, and the intensity of laughter decreased from beginning to end. In another study of laughter realisations (Vettin and Todt 2004), the mean f0 of laughter bouts was 171 Hz (range 106–355 Hz) in the male group and 315 Hz (range 117–735 Hz) in the female group. Bickley and Hunnicutt (1992) analysed the duration of two subjects' laughter. Because the type of laughter examined sounds like a sequence of breathy CV syllables, the average duration of a laugh syllable was measured: 204 ms for one speaker and 224 ms for the other. The mean number of syllables per laugh was 6.7 for one speaker and 1.2 for the other. A later study found a mean laughter duration of 798 ms in the female group and 601 ms in the male group (Rothgänger et al. 1998). Mean fundamental frequency varied between 160 and 502 Hz in the female group and between 126 and 424 Hz in the male group (Bachorowski et al. 2001).
Laughter types
While earlier research treated laughter as a single category, Scott et al. (2014) distinguished between two types of laughter. Different studies have labelled these types in different ways, depending on the framework of the given analysis: voluntary/involuntary (Scott et al. 2014; Chen 2018), voluntary/evoked (Scott et al. 2014), spontaneous, authentic/fake (Lavan et al. 2016), social/spontaneous (Shochi et al. 2017), spontaneous/volitional (Bryant et al. 2018; Kamiloğlu et al. 2021), mirthful/polite (Tanaka and Campbell 2014; Sabonytė 2018) and authentic/acted (Anikin and Lima 2018); some of these terms are used as synonyms. However, the two concepts differ not only in their names but often in their definitions as well.
Some studies have selected the sound samples for a perception test on the basis of a preliminary grouping:
Shochi et al. (2017) investigated types of laughter with regard to their voluntariness. First, 3 Japanese and 4 French subjects listened to 254 laughs collected from 12 spontaneous conversations during online video games and decided whether each was social ('the person is laughing to maintain the communication with the other (e.g. embarrassed laughter, polite laughter, cynical laughter…)') or spontaneous ('the person is laughing in a spontaneous manner to an external event (e.g. a funny clip)'); an 'I don't know' answer was also possible. From these, 27 spontaneous and 27 social laughs made up the dataset. In the perception test, 20 Japanese and 82 French native listeners listened to the stimuli and decided what kind of laughter they heard. Subjects were able to differentiate the two types with about 70% accuracy based on audio information alone, without context. An acoustic analysis of the two types of laughs was also conducted. According to the multiple factor analysis, the judgements of both the French and Japanese groups correlated with f0 features (mean and standard deviation), total duration and voiced segment duration. These acoustic factors were therefore investigated further, and the results showed that the total duration of a laugh is an important cue for differentiating laughs by voluntariness. The voiced duration, the number of voiced segments and the f0 standard deviation also assist in distinguishing spontaneous from social laughs: f0 variation was higher, and total duration and voiced segment duration were longer, for spontaneous laughs than for social ones.
Other examinations have considered the laughs in different (genre) recordings as belonging to one group or the other:
Bryant et al. (2018) also conducted a perception test on two types of laughter. Participants had to decide whether each laugh was real or fake. The test contained 18 spontaneous laughs from natural conversations (real laughter) and 18 fake/volitional laughs. The laughs in the first category were collected from 13 conversations between female friends, while the volitional laughs were recorded from women instructed to 'now laugh' with no other prompting. The task was completed by 884 participants from six regions of the world. The overall rate of correct judgements was 64%, significantly better than chance in differentiating real from fake laughs. The results showed that people are able to distinguish spontaneous from fake laughs regardless of their language or culture. Laughs produced with greater intensity variability, higher pitch and more noisy features were judged to be spontaneous.
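A claim such as '64% correct, significantly better than chance' rests on a binomial comparison against the 50% guessing rate. The sketch below shows that test in miniature with invented numbers (100 judgements); it is not the authors' actual statistical procedure.

```python
from math import comb

def binom_p_one_sided(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): does accuracy exceed chance?"""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical listener: 64 correct out of 100 real-vs-fake judgements
p_value = binom_p_one_sided(64, 100)
print(p_value < 0.05)   # True: 64% beats the 50% chance level
```

With only 36 stimuli per listener, a single participant at 64% would not reach significance; it is the pooling over many listeners that gives such studies their power.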
In another study (Kamiloğlu et al. 2021), spontaneous laughs were elicited by funny videos (participants laughed in response to self-selected humorous recordings), while volitional laughs were collected from the same speakers, who were instructed to laugh politely at unfunny jokes. In total, almost eight hundred laughter samples were collected from Dutch and Japanese speakers. Then, 20 Dutch and 18 Japanese participants were asked two questions about each sample: 'Do you think this was a genuine or a polite laugh?' and 'Did this laugh sound authentic or not?'; for each, they chose one of the two response options. Sixteen clips (eight Dutch, eight Japanese; balanced for laughter type and gender within each group) that were most accurately discriminated as spontaneous versus volitional, and that were judged most authentic, were selected as stimuli for the main experiment. Statistical analysis corroborated previous results: spontaneous laughs had a higher intervoicing-interval rate, longer duration, higher mean f0, F1 and F2, lower amplitude variability, a higher spectral centre of gravity and a reduced harmonics-to-noise ratio. In the main experiment, participants listened to the decontextualised laughs and decided (i) whether each was spontaneous or volitional and (ii) whether the laughing person was from their own or a foreign cultural group. Participants also rated the positivity of each stimulus on a 7-point Likert scale. Both Dutch and Japanese participants rated spontaneous laughter as more positive than volitional laughter. However, no difference was found in the accuracy of group membership identification from spontaneous versus volitional laughter.
Other analyses have selected the two different types of laughter from the same social context:
Lavan et al. (2016) conducted an experimental study on the acoustic features and perceptual judgement of volitional and spontaneous laughter. Female speakers were asked to produce both spontaneous ('genuine amusement') and volitional ('voluntary, controlled') laughter, and 72 stimuli were selected for a perception test. Nineteen participants rated the valence and arousal of the stimuli on 1–7 Likert scales. The acoustic analysis showed significant differences between volitional and spontaneous laughter for most of the measured parameters: spontaneous laughter had a longer total duration, shorter burst duration, higher mean, minimum and maximum f0, larger f0 variability, a higher percentage of unvoiced segments and lower mean intensity than volitional laughter. No differences were found between the two laughter types in f0 range, harmonics-to-noise ratio (HNR) or spectral centre of gravity. The perceptual experiment showed that spontaneous and volitional laughter were perceived as differing in arousal, valence and authenticity; participants were thus able to distinguish between the two types. Combinations of total duration, spectral centre of gravity and mean f0 were the most prevalent predictors of ratings for spontaneous laughs, whereas for volitional laughs HNR was the most frequent predictor of affective ratings: volitional laughs with a lower HNR were perceived as more authentic and more positive.
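The spectral centre of gravity used here and in several of the cited studies is simply the power-weighted mean frequency of the spectrum. A self-contained sketch of that definition, using a slow pure-Python DFT on a synthetic tone rather than any study's actual pipeline:

```python
import cmath
import math

def spectral_cog(signal, sr):
    """Spectral centre of gravity: power-weighted mean frequency in Hz."""
    n = len(signal)
    num = den = 0.0
    for k in range(1, n // 2):            # positive-frequency bins only
        xk = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                 for t in range(n))
        power = abs(xk) ** 2
        num += (k * sr / n) * power       # bin frequency times bin power
        den += power
    return num / den if den else 0.0

sr = 8000
tone = [math.sin(2 * math.pi * 500 * t / sr) for t in range(400)]
print(round(spectral_cog(tone, sr)))      # 500: all energy sits at 500 Hz
```

For a breathy laugh, energy spread into high-frequency noise pulls this value up, which is one reason CoG can separate laughter types.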
In another study, 100 samples of non-overlapping spontaneous mirthful and polite laughter were collected from daily conversations and TV talk shows (Sabonytė 2018). Thirty stimuli were used, and 30 respondents labelled the type of laughter after hearing the recordings with and without context. Acoustic analysis showed a significant difference in duration between mirthful and polite laughs, but none in intensity, f0, F1, F2, shimmer or jitter. Bouts of mirthful laughter were longer than bouts of polite laughter of the same structure: polite laughter contained a single bout, whereas mirthful laughter could consist of one or more bouts. The most common form of polite laughter was a two-syllable bout, while the bouts of mirthful laughter consisted of one to fifteen syllables (three- or four-syllable samples being the most common). In addition, the accuracy of cluster analysis and that of human perception differed in distinguishing these laughter types.
Anikin and Lima (2018) analysed the differentiation between acted and authentic emotional non-verbal vocalisations in an online judgement task. Participants listened to various sounds (from seven different corpora) and decided whether they were real (authentic) or fake (acted). The accuracy of authenticity detection varied by emotional category: authentic non-verbal vocalisations of fear, anger and pleasure were much more likely than posed vocalisations to be judged authentic. Authentic laughs were perceived as authentic in 67% of all cases, and they showed higher pitch, larger pitch variability and lower harmonicity than fake ones.
Other studies have also investigated how the relationship between speakers influences laughter perception (Farley et al. 2022). Laughter stimuli were obtained from telephone calls in which 27 callers talked to their romantic partner and to a close same-sex friend. Fifty-two samples were selected for the study: in the first task, 50 raters judged the pleasantness of these laughs; in the second task, listeners decided whether each laugh was directed at a friend or a romantic partner. Listeners identified the addressee at a rate above chance (57%). Furthermore, laughter directed at romantic partners was judged to sound less pleasant than laughter directed at friends. In the second part of the study, the eight most prototypical laughs were selected, and participants judged them for spontaneity on bipolar scales (e.g. 'loud/soft', 'natural/forced', 'breathy/not breathy') and for vulnerability. Laughter directed at friends was judged to be louder, more masculine, more natural-sounding, more 'changing', more mature-sounding, more dominant and less breathy than laughter directed at romantic partners. The gender of the speaker also affected judgements significantly: laughter samples from male speakers received higher ratings for masculinity and coldness, while female samples were rated higher for loudness, naturalness, 'changing', maturity, relaxedness and dominance.
Besides the research focusing on judgements of laughter types, another line of research has aimed at the automatic classification of different kinds of laughs. The laughter detector developed by Campbell et al. (2005) can automatically recognise four laughter types based on the speaker's affective state and their segmental composition (voiced laugh, chuckle, ingressive breathy laugh, nasal grunt), with an identification rate greater than 75%. In Galvan et al. (2011), automatic recognition based on vocal features also achieved high accuracy (70% correct recognition) when discriminating five types of acted laughter: happiness, giddiness, excitement, embarrassment and hurtful.
By social function, laughter samples have been divided into five groups: mirthful, polite, embarrassed, derisive and other (Tanaka and Campbell 2011). In natural communication, the most frequent types of social laughter appear to be polite and mirthful (Tanaka and Campbell 2011, 2014; Sabonytė 2018). To distinguish between different types of laughter based on phonetic characteristics (voiced, ingressive, chuckle, nasal), Tanaka and Campbell (2011) used HMMs with spectral features (MFCC and RMS power, together with their deltas) and achieved a prediction accuracy of 86.79%.
In another study, Tanaka and Campbell (2014) categorised laughs as either polite or genuinely mirthful (based on the majority vote of 20 observers). They determined the main contributing factors in each case by statistical analysis of the acoustic features, principal component analysis and classification tree analysis. An SVM was then used to predict the most likely category for each laugh in both a speaker-specific and a speaker-independent manner, achieving better than 70% accuracy in automatic classification tests.
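Speaker-independent evaluation means training on some speakers and testing on a held-out one. As a toy stand-in for such a pipeline (not the SVM or features of Tanaka and Campbell 2014), the sketch below runs a leave-one-speaker-out evaluation with a single feature (duration) and a nearest-centroid rule; all numbers are fabricated for illustration.

```python
samples = [  # (speaker, duration in seconds, label) -- invented values
    ("A", 0.6, "polite"), ("A", 1.9, "mirthful"),
    ("B", 0.5, "polite"), ("B", 2.2, "mirthful"),
    ("C", 0.7, "polite"), ("C", 1.7, "mirthful"),
]

def classify(duration, train):
    """Assign the label whose training-set mean duration is closest."""
    centroids = {}
    for label in ("polite", "mirthful"):
        values = [d for _, d, lab in train if lab == label]
        centroids[label] = sum(values) / len(values)
    return min(centroids, key=lambda lab: abs(duration - centroids[lab]))

correct = 0
for speaker, duration, label in samples:        # leave-one-speaker-out
    train = [s for s in samples if s[0] != speaker]
    correct += classify(duration, train) == label
print(correct / len(samples))
```

The real system used many acoustic features and an SVM; the point here is only the evaluation protocol, in which no laugh is ever classified by a model that saw the same speaker in training.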
Through the investigation of laughter-related body movements, five laughter states (hilarious, social, awkward, fake, and non-laughter) were distinguished automatically by Griffin et al. (
2015).
The automatic detection and classification of laughter can be beneficial in a number of ways. It could be used in automatic speech recognition (ASR) systems to reduce the word error rate by identifying non-speech sounds, and it can help in searching for videos with humorous content. Detecting users' emotional state from various modalities (body movements, facial expressions, speech) and producing emotional displays can be exploited in the design of human–computer interaction (HCI). Automatic laughter detection can reveal the user's affective state and conversational signals such as agreement, and may thus facilitate affect-sensitive multimodal human–computer interfaces. Virtual/embodied agents could also be made more natural (human-like) using natural-sounding synthesised laughter.
Previous research has therefore produced several findings on the categorisation of laughter, examining its different realisations from many angles; in this introduction we have focused on results concerning three main aspects: production, perception, and automatic categorisation and classification. Drawing general conclusions, however, is complicated by the fact that the individual studies differed in many respects. Some research contrasted spontaneous laughter with fake (Lavan et al. 2016), social (Shochi et al. 2017) or volitional laughter (Kamiloğlu et al. 2021), while other work distinguished voluntary/involuntary (Scott et al. 2014; Chen 2018), voluntary/evoked (Scott et al. 2014) or mirthful/polite laughter (Tanaka and Campbell 2014; Sabonytė 2018). On the one hand, these studies defined the groups of laughter differently; on the other hand, and closely connected to this, they also used different data collection methodologies. Real laughs were collected from 'natural, spontaneous' recordings (see Shochi et al. 2017; Bryant et al. 2018; Sabonytė 2018) or while watching funny videos (Kamiloğlu et al. 2021), whereas 'fake' laughter was elicited artificially, for example by asking speakers to show how they would laugh politely at an unfunny joke (Kamiloğlu et al. 2021), by asking professional actors to produce different types of laughter (Szameitat et al. 2009), or by simply instructing participants to laugh (Bryant et al. 2018); in other studies, both types of laughter were recorded during video games. The studies also differed in, for example, the acoustic features analysed and the measurement methods used. The results, and the conclusions that can be drawn from them, are therefore difficult to generalise and are highly limited. Nevertheless, the production and perception studies listed above (independently of the concepts, methods and/or definitions used for the different kinds of laughter) found acoustic and perceptual differences between their laughter categories. Most of the research confirmed the effect of timing features: spontaneous laughter was realised with longer duration than the samples in the other category. Although not all studies corroborated an f0 difference between the two groups, where a difference was found, mean f0 was higher for spontaneous or real laughter than for the other category. For other parameters, such as intensity, CoG, F1, F2, pitch, jitter, shimmer and HNR, the results were often contradictory.
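The two convergent findings, longer duration and higher mean f0 for spontaneous laughter, could in principle be combined into a crude decision rule. The thresholds below are invented purely for illustration and are not drawn from any of the cited studies.

```python
def looks_spontaneous(duration_s, mean_f0_hz,
                      dur_thresh=1.0, f0_thresh=300.0):
    """Flag a laugh as 'spontaneous-like' only when both cues point that way."""
    return duration_s > dur_thresh and mean_f0_hz > f0_thresh

print(looks_spontaneous(1.8, 420))  # long, high-pitched laugh -> True
print(looks_spontaneous(0.5, 220))  # short, low-pitched laugh -> False
```

Given the contradictory findings for the remaining parameters, any practical classifier would need to learn such thresholds per dataset rather than fix them in advance.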