Throughout human history, people have generated virtually all of their solidarity face-to-face, by physical co-presence. This has been disrupted by a world-wide natural experiment, a social experience of making people stay home, avoid public gatherings, avoid interacting with strangers except when wearing masks and staying six feet apart.

What happens when the normal conditions of social interaction, formulated by Durkheim (1912) and Goffman (1967) and formally stated as the theory of interaction ritual chains, are sharply disrupted? Does everything in the theory disappear, and does human social life take on an entirely new form, operating by different causal mechanisms? Or do we find which variables and processes are stronger than others, which ones are replaceable and which are not?

Since the publication of Interaction Ritual Chains (Collins 2004a, b), there has been debate over whether mediated forms of interaction, especially electronic communication in real time, substitute effectively for face-to-face (F2F) interaction. On the whole, this literature has found that electronic media do not substitute for it, but instead supplement it. Studying cell-phone use, Ling (2008) found that persons tend to call the same people that they normally interact with, and much of what they communicate is where they are and how they can meet. He also found there is some feeling of social solidarity (personal belonging) in talking over a mobile phone, but that it is a weaker feeling than F2F. This may explain why cell-phone users spend much more time telephoning than traditional land-line users did, in this respect similar to drug addicts who increase their dose as its effects decline.

Without trying to review the entire literature on mobile-phone/smartphone use and social media generally, it can be noted that communicative fashions change. Many people (especially younger) in recent years prefer to communicate by text messages rather than orally. Even before the coronavirus epidemic of 2020, technological promoters and enthusiasts have touted a future where physical co-presence will be replaced by new forms of electronic communication. In this view, the epidemic has only accelerated the use of technologies in visual and auditory modes promising to substitute almost entirely for the bodily dimension of society.

The coronavirus epidemic of 2020 has provided a natural experiment in two respects. It has banned certain types of interaction rituals, notably religious assemblies and their secular equivalents such as political gatherings, sports, and entertainment. In a milder form, it has restricted ordinary F2F interaction by mandating masks and social distancing, weakening the cues ordinarily used in interaction rituals. In a second aspect, it has substituted remote interactions by electronic media for many forms of coordinated work, for schooling, and for sociable gatherings. Thus we can test how people have reacted to these changes; when the ingredients of interaction rituals are prohibited or curtailed, what happens to social solidarity and social emotions? When electronic media are substituted, what aspects of remote interaction affect which details of the IR process, and with what effects?

In what follows, I will outline the ingredients and outcomes of IRs and what makes them succeed or fail in varying degrees. The part played by F2F bodily co-presence will be singled out, as a facilitator for the central processes of mutual focus and rhythmic entrainment. Then we consider available evidence of IRs during the epidemic, including behavior on public streets, family life and sexuality, remote work and schooling, and public assemblies and audiences.

The ingredients of interaction ritual (IR)

Interaction Ritual Chains [IRC] is a micro-sociological theory. It analyzes social processes in detail over short periods of time. Longer periods of time are composed of what happens in shorter periods of time. Taking a lesson from ethnomethodology, it treats noun-like entities such as social class, self, personality, culture, as simplified summaries of what happens on the micro level of seconds and minutes. Examined in micro detail (the methodological paradigm here was Conversation Analysis of tape-recorded talk in natural settings (Heritage 1984), later expanded to video), social relationships and their residues are created and re-created; they are not constant, unless the micro-processes always repeat in the same way. Looked at in micro detail, there is a great deal of fluidity in human social life. Thus, important features of society, such as social solidarity, vary according to the dynamics of situations.

The term “ritual” may be misleading. It comes from Durkheim’s pioneering analysis of religion as a form of behavior involving repeated, stereotyped social action that builds emotions and creates feelings of social membership. Goffman showed that this analysis applies also to the polite and not-so-polite ceremonial and gesture of everyday social life, coining the term “interaction ritual” for these activities. I have continued the terminology, with “IR chains” as a reminder that every social interaction has a starting point in participants’ memories of previous experiences in Durkheimian/Goffmanian rituals. Nevertheless, the term “ritual” can be overly narrow if it calls to mind only the formal, stereotyped rituals that are prescribed in particular religions; or the political, military, and judicial rituals of flags, salutes, and oaths; as well as the kind of interpersonal rituals that are written down in etiquette books. As I describe below, there are a number of ingredients or variables that go into a ritual that is successful in creating solidarity, and these can be analyzed in any social encounter, whether it has the formal qualities of explicitly recognized rituals or is an informal, ritual-like activity. A more accurate way of referring to the central processes is the “mutual focus, rhythmic coordination” model of solidarity. “Interaction ritual” is a term of tradition and convenience; what is important is to see what affects whether solidarity will be high, low, or non-existent in any particular situation.

  (1) Co-presence: people are physically near to each other where they can see, hear, and otherwise sense what each other is doing.

  (2) Mutual focus of attention: they focus their attention on the same thing, and become aware that they are doing so. This creates a feeling of intersubjectivity, and the possibility of acting together, such as by making similar gestures or moving in the same direction.

  (3) Shared mood or emotion: they feel the same emotion, whether excitement, joy, fear, sadness, anger, boredom, or any other.

  (4) Rhythmic entrainment: they get into the same rhythm, with voice or body. This has been measured in the rhythms of talk, ranging from the pace of turn-taking to the micro-rhythms of vocal soundwaves. Dancing, chanting, clapping, kneeling, or other physical actions are typical examples.

Feedback processes take place among these ingredients. As people pay more attention to each other, they tend to converge on a shared emotion and intensify it; conversely shared emotion intensifies mutual focus. As these increase, rhythmic entrainment increases. Durkheim referred to the increase in mutual excitement and rhythmic bodily activity as “collective effervescence.” Rhythmic entrainment is not limited to highly excited forms (such as in cheering crowds, or the group behavior that soldiers sometimes make when attacking or retreating in combat); there can also be a spread of shared hush, enforced quietness (or awe and respect); or the mutual stalemates that often occur when people threaten violence F2F, but the closeness of their encounter inhibits it (Collins 2008). Thus, “rhythmic entrainment” is a more accurate and encompassing term than “collective effervescence” for referring to highly coordinated social interaction.

It is important to recognize that these feedback processes do not inevitably lead to high levels of mutual focus, shared mood, and rhythmic entrainment. Interactions also may break off; applauding or booing in an audience (concert, speech) can sputter out (Clayman 1993). There are thresholds that have to be passed, and extraneous factors (besides these 4 ingredients, e.g., a disapproving segment of the crowd; something that distracts attention) can prevent positive feedback loops from building. We should not view interaction rituals as automatically succeeding (as sometimes was done with Durkheim’s theory of religious and political rituals—the functionalist fallacy). Instead, we have a set of ingredients or mechanisms that move forward or not, depending on the strength and sequence of local conditions at that time and place. Traditional rituals can die, if they lose their emotional appeal, or if they are challenged and replaced by different rituals; this is the micro-process by which religious, political, and cultural change takes place, often in revolutionary surges when crowds are mobilized.

Successful rituals (in contrast to failed rituals) have the following outcomes:

  (5) Social solidarity. Individuals feel like members of a group, and recognize others as co-members.

  (6) Emotional energy (EE). Individuals feel pumped up by a successful interaction ritual; Durkheim noted it makes them feel stronger, explaining why people can do heroic things and put out great effort on such occasions. Sports coaching consists to a considerable degree of such social techniques. Emotional energy is specific to the kinds of things the group is focused upon; whatever endeavor it is, persons with high EE are confident, proactive, and enthusiastic. Persons feeling low EE are the opposite: they are depressed, passive, alienated. These are the results of failed interaction rituals.

  (7) Collective symbols. Durkheim called these “sacred objects,” referring to the emblems, places, books, etc. that are the focus of religious worship, and he extended this to political symbols like flags. Leaders can become sacred objects if they are at the focus of enthusiastic crowds. More generally, collective symbols include whatever members of a group consider most important, most characteristic of themselves at their best; thus ideas, words, slogans, items of clothing, gestures can be “sacred objects,” social markers of belonging. The specific jargon of a profession and the slang of an informal group are instantly recognized markers of who is a member and who is not. Collective symbols include ideals and beliefs in the strong sense of the term.

  (8) Moralities of right and wrong. For any group held together by successful rituals, its fundamental standard of morality is whether people respect its rituals and sacred objects. The worst offense is disrespect for these emblems. People imbued with such moralities feel moral outrage against violations.

In sum, successful interaction rituals are the micro-process that generates almost everything that we refer to as “social order.” It gives people their identities; makes them enthusiastic or antipathetic to various things in their social environment; creates leaders, heroes and villains, the popular and the unpopular; it fills our minds and discourse with meaningful ideas—i.e., those which are emotionally most marked; and it generates morality, both in directing us to positive goods and against emotionally repugnant evils. Accordingly, if we were to get rid of interaction rituals, or weaken them considerably, what would happen to all these aspects of social cohesion and their internalized effects in steering individuals’ lives?

How important is co-presence, compared to other ingredients?

Co-presence, in the scheme as developed by Durkheim and Goffman, is the point of departure. It is when people come together that the other ritual ingredients can be brought into action. This is obvious in the era of tribal rituals where there are no distance media; the rudiments of communication at a distance, in the form of emblems marked on rocks or other physical objects, can serve as reminders but only after individuals have been introduced to them by F2F rituals. Thus, all distance media have their origins in successful IRs that happen in physical co-presence. Writing, inscriptions, paper, books, postal delivery, newspapers, all became capable of transmitting emblems of social membership and its markers in distinctive group meanings, but even here people had to learn to read them and give importance to them, and this was done in F2F settings. Can we say, though, that as media become more ubiquitous and mimic more aspects of F2F interaction, social connections become increasingly transferred to media connections while the bodily interactional basis fades away?

As we attempt to assess available evidence on this question, let us note: the key to a successful IR is ingredients [2] and [3]. Mutual focus of attention and the buildup of shared emotions are what makes or breaks an interaction; if these can be kept going through the feedback processes long enough to pass the threshold to rhythmic entrainment, the ritual will be a success, and outcomes [5–8] will occur for participants. If the feedback loops between [2] and [3] and on into [4] are interrupted or never get going, the IR will fail and [5–8] will decline.
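The feedback logic described above can be illustrated with a toy simulation. This is only a sketch under invented assumptions: the function name `simulate_ritual`, the 0-to-1 scales, and the `gain`, `distraction`, and `threshold` values are hypothetical, and none of them come from the theory's formal statement or from measured data. The sketch shows mutual focus [2] and shared mood [3] reinforcing each other until they either pass a threshold (standing in for the onset of rhythmic entrainment [4]) or are damped by distraction, in which case the ritual fails.

```python
def simulate_ritual(focus, mood, gain=0.3, distraction=0.0,
                    threshold=0.7, steps=20):
    """Toy feedback loop between mutual focus [2] and shared mood [3].

    All quantities are hypothetical values on a 0-to-1 scale.
    Returns (succeeded, steps_taken).
    """
    for step in range(1, steps + 1):
        # Each ingredient grows in proportion to the other one,
        # and shrinks in proportion to the level of distraction.
        new_focus = focus + gain * mood * (1 - focus) - distraction * focus
        new_mood = mood + gain * focus * (1 - mood) - distraction * mood
        # Keep both values inside the 0-to-1 scale.
        focus = max(0.0, min(1.0, new_focus))
        mood = max(0.0, min(1.0, new_mood))
        # Success: both ingredients pass the (assumed) entrainment threshold.
        if min(focus, mood) >= threshold:
            return True, step
    return False, steps


# Same starting point, with and without a distracting factor.
undisturbed, steps_needed = simulate_ritual(focus=0.3, mood=0.3)
disturbed, _ = simulate_ritual(focus=0.3, mood=0.3, distraction=0.3)
print("undisturbed ritual succeeds:", undisturbed, "after", steps_needed, "steps")
print("distracted ritual succeeds:", disturbed)
```

With these made-up parameters the undisturbed encounter builds up to the threshold, while the same encounter with a strong distraction decays instead. The point is qualitative only: the same ingredients can produce success or failure depending on local conditions.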

Durkheim noted that rituals must be repeated regularly to keep up these effects; early time-counting systems generally referred to the period of time in which a ritual was repeated, typically a week. Conversely, one way in which social solidarity effects [5–8] can decline is if people don’t repeat their rituals for a considerable time. The fervid political beliefs you held as a member of a student movement disappear in a few years if you stop taking part; in the same way, religious faith declines if you don’t attend church for several years. Not taking part can result from moving to some other setting, or from extraneous factors, such as the actions of political authorities, that determine whether the ingredients for an IR can be assembled at all.

Co-presence is important because it facilitates mutual focus, shared emotion, and rhythmic entrainment. By seeing another person’s eyes and face, and the orientation of their body, you know what they are paying attention to. An exchange of glances communicates, I-see-you-seeing-me, and also, I-recognize-what-we-are-both-looking-at. There are also negative and embarrassing failures of this, which Goffman enumerates; in conflictual interactions, these mutual rhythms get interrupted in other ways, such as by one side trying to stare the other down or to control what they should look at or not look at (Collins 2008).

Similarly, looking at the other person’s facial expressions and bodily gestures, as well as hearing their tone of voice and its loudness or softness, communicates what specific emotions are being felt. The James-Lange principle applies here: moving the muscles of one’s face, eyes, and body intensifies the felt emotion, and it is triggered and intensified by closely monitoring others’ emotional expressions. Not only does running away with the rest of a crowd make you feel more afraid, but shouting happily, or angrily, with others makes you more happy or angry. Contagious laughter in audiences is a key to the techniques of comedy performance.

Rhythmic entrainment is felt most strongly when it is in all bodily channels: not only seeing and hearing, but the proprioceptive feelings in muscles, breathing, heart rate, and bodily chemicals that make an emotional mood a felt experience, not merely a detached cognition. The attraction of being in a live audience at a sports event is not that one can see the action on the field better (visibility is usually worse than on TV), but it is being pumped up by the excitement of the crowd when they hold their breath together, rise to their feet together, jump up and down, and hug each other together as they respond to the action. In varying degrees, these kinds of embodied experiences are the glue that creates moments of social solidarity.

Available data

The massive unintentional social experiment of restricting most forms of F2F interaction and substituting electronic media during the coronavirus epidemic is full of opportunities for observing its effects. A micro-sociologist needs to seize the opportunity as quickly as possible, as patterns shift over time. If we wait for systematic surveys, we may miss what there is to discover, especially since we are concerned with micro-behavior in real occasions, rather than generalized opinions as to what people claim to have done or believe. I will summarize three forms of data: my own observations of people interacting, mostly on public streets; personal interviews; and news reports, with their varying degrees of detail and representativeness.

Observational data are the most important for examining what happens to interaction rituals. Interviews have the weakness of prompting answers affected by desirability bias; they also lump together separate incidents into a generalized pattern. Standardized surveys are even weaker in these respects; they are seldom repeated often enough to get a picture of how behavior changes over time; they tell us nothing about emergent patterns that are not in the standard codes. For instance, my observations of the emergence of a new pattern of social distancing etiquette, an upsurge of greetings, and its decline over time would not have been found in the traditional methods of social science surveys.

Masked social distancing in public

Here we have a partial restriction of the ingredients of IR: people are bodily co-present, but the F2F aspect is greatly reduced. Masks cover the mouth and lower face, making it harder to recognize emotions, as well as harder to hear what the other person is saying. Thus, we would expect shared emotion and mutual focus of attention would be harder to attain, IRs would weaken, and solidarity decline.

Nevertheless, what we found in observing people on the streets was the opposite, at least for an initial period of time. Simmel’s theory of solidarity through conflict says that when a group is shocked by an enemy (we can widen this to a natural disaster or other shared emergency), solidarity goes up. I tested this immediately after the 9.11.2001 attacks (Collins 2004a), and found that it has a time-pattern: using the display of American flags as an indicator, the pattern looked like this. After the first few days of hushed uncertainty, people started putting up flags on windows and cars; flag-display reached its maximum within two weeks. It stayed at a plateau for 3 months, a period during which there were also repeated displays of flags and ceremonies honoring police and firefighters killed in the attacks. After 3 months, articles started appearing discussing “can we take our flags down now?” Political controversy, which was almost entirely stifled during this period, started up again. By 6 months, the level of flag-display had declined by more than half, with a long diminishing tail thereafter.

In the US, public alarm over the coronavirus surged about March 16, 2020 when schools and gyms were shut down. By March 20, many states had ordered people to stay indoors. Wearing masks away from home became a requirement in the next two weeks, delayed because of shortage of supplies and controversies over effectiveness. Effective or not, wearing masks now became a social marker of joining the effort against the epidemic, along with keeping 6 feet away from other people. I anticipated that this period of solidarity would last no more than 3 months. Since the period after 9.11.2001 had many public assemblies, often highly emotional, honoring the heroes of the attacks, whereas in 2020 public assemblies were prohibited as dangerous incubators of the epidemic, I expected the period of public solidarity would be shorter, probably 1 or 2 months.

For several years I was in the habit of walking or running for a half hour or more almost daily in my neighborhood or in public parks, and thus have a baseline for normal street behavior. By early April (about 2 weeks after the lockdown began), I noted that the number of people out walking was up by a factor of two or three from the pre-epidemic period; people deprived of exercise had found something they could do. Soon almost all walkers were wearing masks, and when meeting others on the sidewalk, one or the other would step out into the street to maintain distance. When doing so, almost everyone waved or called out a friendly greeting. Deliberately avoiding someone would be a mark of fear or an insult; so we countered that by a friendly wave or greeting. This is also Simmelian solidarity. It is clearly related to the onset of the shared emergency; in my walks in previous months and years, I would estimate the proportion of F2F encounters on the street where there was a greeting was less than 20% (chiefly among older people; noticeably absent among the young).

The time-pattern of decline in Simmelian solidarity was the following: By late April (one month after the lockdown), the number of people out walking had noticeably increased. The proportion of people greeting each other declined; this was particularly true in areas along the harbor or oceanfront (the beaches and parks being closed and patrolled by guards); perhaps there was the beginning of a tone of defiance. Younger adults in particular were ignoring social distancing; and friendly waves or greetings were absent (including towards each other).

I began to make systematic counts of how many people were wearing face masks, distancing, and greeting. My focus was on adults who were walking on sidewalks or streets (children at this point rarely wore masks). I did not count runners or bicyclists, since they almost never wore masks—a constant pattern from this point onwards. This may be due partly to decreased lateral visibility, but especially to difficulty breathing when doing heavy exercise. I did not count gardeners or other outdoor workers or delivery persons: the latter usually wore masks (as they worked for bureaucratic organizations that demanded it); manual workers usually did not, nor did they practice social distancing among themselves. One can see here a social class divide in the observance of social distancing etiquette. For walkers, the height of symbolic solidarity (mask-wearing and greetings) was in April; during May the proportion wearing masks gradually declined, as did greetings when social distancing (very noticeable around May 22–23). For this period, a Gallup poll reported 1/3 each said they always, sometimes, or never wore masks outdoors (New York Times June 3, 2020); given the desirability bias in surveys, the mask-compliant numbers are probably exaggerated.

A sharp break occurred in the first week of June, as Black Lives Matter protests and marches broke out. This was 10 weeks after the lockdown began. During the most militant period (the first 4–5 days), when many protest demonstrations were in a mood of righteous anger accompanied by burning or property destruction, photos indicate that few protestors wore masks, and participants massed close together. This happened despite official warnings that big assemblies, especially when shouting and chanting together, broadcast the virus. A rival source of Simmelian solidarity had been created, and it overrode the already-declining solidarity rituals of the social distancing etiquette. Most of the participants in the protests were young (as one can see in news photos); young people already were largely ignoring social distancing, and signs of solidarity among the young in ordinary public street behavior had been low. They were further IR-starved by the banning of sports and concert participation as audiences, or even as performers. The widespread participation of white youth in the protests (in most photos outnumbering minority participants) was at least in part the response to the sudden opportunity to regain experiences of mass solidarity.

In subsequent weeks, as most protests became smaller and less emotional, photos show participants more often spread out, maintaining social distancing (also no big crowds) and at least half wearing masks. This is probably the effect of being more deliberately organized rather than spontaneous, with organizers and (mostly white middle-class) participants making a conscious effort to present a good appearance by following official coronavirus etiquette.

In California, parks and beaches were opened up again around June 10, along with reiterated regulations on masking and social distancing. My observations for pedestrians June 10–27:

Totals for public parks: 54 of 267 wore masks (20%); 3 greetings (6% of mask-wearers, 0% of unmasked).

For neighborhoods: 23 of 91 wore masks (25%); 15 greetings (43% of mask-wearers, 9% of unmasked).

Those who continued to wear masks showed some solidarity (although declining over time) by greetings; this was more likely in residential neighborhoods (at least middle class) than in public parks, where greetings had largely disappeared.
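Since these counts are small, it can help to see how much uncertainty surrounds the reported mask-wearing percentages. The following is a minimal sketch, not part of the original analysis: it recomputes the two shares from the counts given above and attaches a Wilson score interval (a standard approximation, chosen here as an assumption) to each. The names `observations` and `wilson_interval` are introduced only for illustration.

```python
from math import sqrt

# Pedestrian counts as reported in the text (June 10-27).
observations = {
    "public parks": {"masked": 54, "total": 267},
    "neighborhoods": {"masked": 23, "total": 91},
}


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


for place, counts in observations.items():
    share = counts["masked"] / counts["total"]
    low, high = wilson_interval(counts["masked"], counts["total"])
    print(f"{place}: {share:.0%} masked "
          f"(approx. 95% interval {low:.0%} to {high:.0%})")
```

The intervals for the two settings overlap considerably, which is consistent with treating the contrast in greetings, rather than the contrast in mask-wearing, as the more notable difference between parks and neighborhoods.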

Occasional conflicts were observed, in the following pattern (mid-June): a middle-aged woman says to an unmasked woman approaching her closely outside a medical building: “Could you please stand back? Where is your mask?” Reply: “Don’t be rude!” It appears that both sides felt collective morality was on their side: a formula for intense social conflict. News reports a month earlier noted an upsurge of confrontations in which maskless shoppers grew angry when retail store employees told them to wear masks; violent incidents however were rare (Wall Street Journal, May 18, 2020). News reports on conflicts over masking largely disappeared by June, when they were upstaged by more dramatic conflicts over race and policing. Individual incidents of conflict over masking may have been a transitional phenomenon; by August, mask-wearing in stores and buildings where it was officially enforced appeared to be near-universal (in my region of observation); on the other hand, wearing masks in parks and public streets had largely disappeared. So had greeting rituals associated with social distancing.

When everyone is wearing masks, it becomes more difficult to hear what people are saying; also some of the cues that we use to fill in likely words are missing because we cannot see their mouth and facial gestures, nor can one use facial feedback from the listener to correct one’s articulation. Thus, masked interactions even in ordinary utilitarian situations give rise to misunderstandings, raised voices usually associated with anger, and sometimes gestures of annoyance. I have observed this frequently in grocery stores. Anything that limits multi-modal interaction takes its toll, even in situations where solidarity mainly takes the form of routine civility.

Family solidarity

On the positive side, it appears that at first solidarity increased, at least for some family members. Children of elementary school age and younger seemed happy, as they had more time with parents and attention from them. I observed a large increase in families bicycling together on neighborhood streets (seldom seen before the epidemic); since bicyclists rarely wear masks, and children at this time never did, one could see that their expressions were on the whole happy. It is unlikely that teenagers were similarly affected; I almost never saw them bicycling or walking with adults in neighborhoods or parks. This is not surprising, as teen culture is mostly concerned with being independent of adults, and being seen with parents is a status loss except on formal occasions (Milner 2016). Given that teens were prevented from gathering (I only occasionally saw teens out together, and hardly any male–female young couples other than parents), I would predict that data for this period will show an increase in the level of alienation and anxiety among teenagers. Even though teens are the most media-connected and media-obsessed of all age groups, they are the ones least likely to find it a compensation for a further drop in F2F experience.

On the negative side, doctors report an increase in child-abuse cases, although official statistics show a decline (all attention being focused on COVID-19) (San Diego Union-Tribune June 5, 2020). A national child-abuse hotline reported a 20% increase in calls and 440% increase in text messages over the prior year (Wall Street Journal, May 19, 2020). The stay-at-home situation is favorable to some, perhaps most families with adequate space and resources; where there is family tension, isolation increases abuse, as has long been established (Collins 2008, p. 137). A national survey carried out in May found that reports of clinical symptoms of depression had doubled (compared to a 2014 baseline) to 24% of the US population; depression was especially high among young adults and women, even though they were less vulnerable to COVID-19 (Washington Post, May 27, 2020). As a baseline comparison, embodied social interaction in the smartphone generation was already in decline, especially among teenage girls. By 2018, American teens were spending 6-to-9 hours daily online. Since 2007, time spent on seeing friends or going out in public had fallen sharply, as did dating. In 2019, 36% of girls said they were extremely anxious every day (Wall Street Journal, August 17, 2019). The social causes of anxiety and depression are multiple; deprivation of embodied interaction during the coronavirus epidemic appears to be an intensification of what went before in that demographic.

We have no data on sexual behavior during this period. Likely the birth rate will spike 9 months after the onset of the epidemic. On the other hand, monthly marriage rates must surely drop, as will the frequency of sexual behavior among non-cohabiting individuals; casual hookups as well as commercial sex likely will drop drastically. I have very occasionally seen an unmasked male/female couple necking in a park; formerly active gay pick-up areas look deserted. As a baseline comparison, sexual activity had already declined in the Internet generation; in 2018, 23% of Americans age 18–29 had no sex in the previous year, doubling the percentage of sex-less lives in the pre-social-media 1990s (Wall Street Journal, May 18, 2019). Presumably sexual activity will have declined still further in the coronavirus period. Looking for a bright side in the coronavirus shutdown, The Wall Street Journal (May 30, 2020) touted “Distancing Revives Courtship,” an interview-based story of how dating has gone online, returning to almost Victorian manners, at best watching each other online drinking a glass of wine (definitely no touching). If sex is a form of solidarity, it must surely decline among those who do not already have intimate live-in partners. The same would be true of ordinary fun involving any kind of physical activity together. Research may well find that social distancing makes little difference to upper-middle-class professionals whose social gatherings consist entirely of conversation, but more active persons would likely feel deprived. This is one reason why after bars re-opened in late June 2020, these suddenly crowded venues (photos showed an absence of social distancing and mask-wearing) became hotspots for coronavirus infections. In the tradeoff between lively sociability and risk of sickness, many choose the former.

An earlier version of mediated sex is phone sex, where operators pretend to be sexy women depicted in advertisements. Flowers (1998) found that the majority of callers engaged in little conversation, asking for the operator to describe specific sexual actions she was (pretending to be) performing; apparently most such callers engaged in masturbation. Other callers were lovelorn individuals who called the same operator repeatedly and tried to establish a relationship. Some callers carried out a dating-style bargaining relationship, trying to make themselves attractive and attempting to lure the operator into meeting them personally, which professional operators tried to avoid. Commercial phone sex is probably a good template for online sex. Under conditions of enforced social distancing where it is difficult for individuals to physically meet, the options would appear to be violating those restrictions, or remaining content with masturbation or mere socializing.

Remote schooling

By all accounts, remote schooling was not very successful in the first four months of the lockdown (see Rice 2006; Molnar et al. 2017 for the pre-coronavirus period). Leaving aside issues such as the extent of the school population who lack internet access, and schools adopting a no-grading policy, we find that online schooling has a negative effect on student motivation. Online daily absences of students who don’t log in are 30% or more; surveys find there is little interaction with teachers; 50% of students said they don’t feel motivated to complete online assignments (Wall Street Journal, June 6, 2020). Teachers complain they can’t read the body language of students and can’t pick out cues for whom to engage with at what opportune moment. I have watched my 8-year-old grandson during online classes; these usually last less than half an hour, while the teacher goes over the assignment in a pleasant voice, talking to no one in particular. My grandson spent the time playing with a slinky held beneath the level of the screen. In Reddit posts, college students complained about noise from parents or siblings while they were trying to hear a lecture or take an exam (San Diego Union-Tribune May 23, 2020). Some students said they liked not having to go to campus, since they did not need to find a place to hang around between classes; apparently these were students who did not live near campus, or who had jobs. One student said he liked being able to watch a lecture while doing his homework in bed; online viewing reduced the need to pay attention. But we have no baseline of how much students normally pay attention in class (usually they pretend to, but often their laptops are not being used for taking notes, as any teacher can observe by walking around the classroom). We cannot assume that F2F classrooms are automatically successful Interaction Rituals.

Some college students complained about the anti-cheating protocol during a virtual exam, where they were required to keep their face and hands visible on the webcam at all times. Other Reddit posts said they felt isolated at home, missed their school friends, and were generally apathetic and unmotivated. This suggests a divide between students who are entirely utilitarian in their orientation, and those for whom school is a social experience. Hypothesis: grinds like online learning, party animals don’t; those who value networks, whether intellectual or career, also miss personal contact even though it consists in more than fun.

Besides passive feelings of alienation and deprivation, some students actively took the opportunity to counter-attack. Some coordinated online pranks with fellow-students, such as simultaneously switching off their cameras so that the teacher suddenly finds himself or herself alone, surrounded by blank rectangles. Others organized campaigns to destroy teachers’ ratings on apps such as Google Classroom (Wall Street Journal, June 2, 2020). Others hacked into Zoom conference calls, playing loud pop music, shouting insults and obscenities, or inserting pornographic images on the screen (Washington Post April 5, 2020; Associated Press April 8, 2020). Mass rebellions by students in classrooms against unpopular teachers are not unknown in the past, but they were rare. Online hacking may be a mixture of pranks, fun, alienation, or hostility. The comparison shows that interactions in person result in more conformity, a Goffmanian front-stage show of respect for the situation, and thus at least a mild form of solidarity. This social pressure or entrainment disappears at a distance; violence, too, is difficult to carry out F2F, and much easier at a distance, above all when there is no reciprocal view of each other’s eyes. (See Collins 2008, especially pp. 381–387 on snipers, whose mode of killing hinges on seeing their target through a telescopic lens while remaining unseen by that target.) It is reciprocal eye contact that generates intersubjectivity and its constraints.

Working remotely

There is disagreement about whether working remotely is effective. Some people prefer working from home. What they like about it is the absence of commuting; fewer meetings, which they feel are a waste of time; and fewer of the distractions of the workplace. Others dislike working at home; what they dislike is the greater number of distractions in the household; less team cohesion; and technical and communication difficulties (Wall Street Journal, May 28, 2020: based on a survey of hiring managers). Similar points were made by the head of a state judicial unit, who emphasized that much additional time by management personnel was now spent on meetings, and attempts to keep up morale by remote contact; meetings were often frustrating because considerable time was wasted trying to get the communications technology working for all participants (repeated interviews during March–June 2020). She sometimes went to her office in order to use secure communications, and found it refreshing whenever she encountered a colleague in person. Efforts to re-open court business, with social distancing and masking precautions, were welcomed by part of the staff and opposed by others. In this organization, those most eager to return to their office were largely those in higher positions, strongly committed to their professional identity.

Parents who have young children and have lost child care or its functional equivalent in schools have a strong incentive to try to work from home; this appears to be especially strong among working mothers. The desire to work from home versus in the customary work setting is affected by numerous motivations, both utilitarian and emotional; all that can be concluded here is that for at least some segment of the work force, there is an explicit desire for the social interactions of their work place. But even confining analysis to social emotions, it is possible there may be a competition between the EE that parents get from being with their children, and the EE that they get from being with their co-workers (see Footnote 4). This kind of competition is encompassed in IRC theory; its guiding principle of motivation is that individuals feel the attractiveness of one or another course of action by comparing the EE they get in one or the other, and choosing (if choices are actually available) the interactional path that gives the most EE.

Hollywood film professionals said they liked spending less time on planes flying around the country, and having fewer high-level meetings, which they considered more habitual than necessary (Los Angeles Times May 3, 2020). One producer said: “I don’t think video conferencing is a substitute for being in a room with someone, but it is better than just talking on the phone. There are so many ways you communicate with your expression… when it’s delayed and small, you just lose all that. My feeling is it’s 50% as good as an in-person meeting.” In the actual work of making movies, most emphasized that it is a collective process, and some insisted that spontaneous adjustments on-set were the key site for creativity. They also reiterated the point that live audiences are the only way to reliably tell whether a film is coming across, and larger audiences amplify both comedy and drama (i.e., via emotional contagion).

Some businesses have tried to compensate by having “virtual water-cooler” sessions several times a week, where any employee can log in and chat. It is unclear what proportion took part, how enthusiastically, or with what pattern over time. Some managers reported that company-wide “town-hall meetings” intended to reassure employees drew less interest over time (Wall Street Journal, June 6, 2020). DiMaggio et al. (2019), however, found that online “brainstorming events” for employees in a huge international company were consonant with some patterns of interaction rituals; this research was carried out in 2003–2004, long before the epidemic. The degree of involvement and solidarity in town-hall meetings is affected by scale; the court administrator reported that feedback about morale was positive after online sessions involving groups of around 10, but in larger groups it was hard to get a Q&A discussion going. This is similar to what any speaker can observe in ordinary lecture presentations and panel discussions; even with physical presence, most people are reluctant to “break the ice” after the speakers have been the sole center of attention, but once someone (usually a high-status person in the audience) sizes up the situation and says something, it turns out that many others find they also have comments to make. This is a process of micro-interactional attention, which is especially difficult to handle on remote media.

Many managers said that innovativeness was lost without serendipitous, unscheduled encounters among individuals. In a PricewaterhouseCoopers survey, half of employers reported a dip in productivity with online work (Wall Street Journal, June 6, 2020). Longer trends, going back before the coronavirus epidemic, indicate that online work did not live up to its promise. During 2005–2015, the era of the high-speed Internet, the percentage of persons in the US regularly working from home increased slowly; those working from home at least half-time reached a pre-epidemic peak of only 4% (www.npr.org/sections/money/2020/04/28/846671375/why-remote-work-sucks). During this period, several big corporations, initially enthusiastic, tried to shift to primarily online work but abandoned it after concluding it was less effective. In the market-dominating IT companies, the trend instead was to provide more break rooms, food, play, and gym services to keep their workers happy on site. This was abruptly reversed in the coronavirus period.

Zoom fatigue

Popular video-conferencing tools such as Zoom attempt to reproduce F2F interaction by showing an array of participants’ faces on the screen, along with one’s own face for feedback in positioning the camera. Reports on how well it works in generating IR-type rhythm and solidarity are mixed. CEOs of high-tech companies tend to claim that it works well. Among rank-and-file participants, however, complaints are widespread, and the experience has even acquired a slang name, ‘Zoom fatigue’ (Wall Street Journal, May 28 and June 17, 2020). Achieving synchrony with others is hard to do with a screen full of faces, delayed real-time feedback, and lack of full body language. Since there is a limit to how many individual faces can be shown, in larger meetings some persons are seen only occasionally, and leaders looking for responses often find they get none. Some of the ingredients of IR (not necessarily under that name) are now being recognized by communications specialists; these include fine-grained synchrony and eye movements. In ordinary F2F conversation, persons do not stare continuously at others’ eyes, but look and look away (Tom Scheff made this point to me in a personal communication during the 1990s; for detailed transcripts of multi-modal interaction see Scheff and Retzinger 1991). Thus, seeing a row of faces staring directly at you is artificial or even disconcerting. Some readers responded with advice: cut off the video to reduce Zoom fatigue, go audio-only. Some found hidden benefits in Zoom conferencing: once the round of social greetings is over, turn off the video and your mic and do your own work while the boss goes through their agenda.

Continuously seeing one’s own face on the screen is another source of strain. As Goffman pointed out, everyone is concerned with the presentation of their self, in terms of status as well as appropriateness for the situation. But one does not have one’s image constantly in a mirror, and when interaction starts to flow, one loses self-consciousness and throws oneself into the activity, focusing more on others’ reactions than on oneself. Those who cannot do this find social interaction embarrassing and painful. But enforced viewing of one’s own image feels unnatural.

Prolonged video conferencing as a whole seems to have about the same effects as telephone conference calls. In my experience on the national board of a professional association, our mid-year meeting was canceled by a snowstorm, and a 2-day conference call was substituted. The next time I saw the board in person, I polled everyone as to whether they liked the conference call: 18 of 20 did not. Lack of shared emotion was apparent during the event; for example, when it was announced that we had received a large grant, there was no response. No wonder: applause and cheers are coordinated by looking at others, and it is embarrassing to be the only person applauding (Clayman 1993). Work gets done remotely, after a fashion; it just lacks moments of shared enthusiasm.

The strains of remote interaction come out strongly in psychotherapy. On the practical level, many patients like the convenience of not having to travel to an office. But therapists feel the difficulty of making eye contact; Zoom shows facial expressions, but the two participants never look straight at each other, and the give-and-take of eye contact, its good rhythm in a successful interaction, its out-of-synch quality in an unsuccessful one, is blocked out in this medium. Therapy workers thus find the work exhausting (communication from anonymous reviewer; and Backhaus et al. 2012).

Assemblies and audiences

Participating in large audiences or collective-action groups is intrinsically appealing when it amplifies shared emotions around a mutual focus of attention. This is a main attraction of sports and other spectacles, concerts, and religious congregations, and it is what creates and sustains enthusiasm in political groups and social movements. Thus, the ban on large participatory gatherings should be expected to reduce commitment. Especially vulnerable is the practice of singing together, because it spreads airborne germs more than any other form of social contact. We lack current data on these effects, but the prediction of Durkheimian theory is that religious commitment and belief will fall off as the group is prevented from assembling. How long will this take? Judging from patterns of religious conversion, my hypothesis is that beliefs fall off drastically if there is no participation for 1 to 2 years. When the epidemic finally ends, the level of church attendance will give an answer; during the epidemic, surveys of religious belief on a monthly basis should show a trend, although one must allow for social desirability bias (which makes religious surveys overstate religious practice) (Hardaway et al. 1998).

Can technology substitute for collective practices like singing together in a congregation? Some Christian organizations have created virtual choirs, where individuals sing their parts alone and their recordings are compiled by sound engineers; the resulting performance is presented online, either showing a series of faces of individual singers, or several faces simultaneously on screen (interview with international religious organization staff). Such videos have been widely viewed, and convey the singers’ enthusiasm. It remains to be seen, over a period of time beyond the onset of the world epidemic, whether participation and commitment levels change.

Similar techniques have been attempted for performances of operas and orchestras (Wall Street Journal, April 27, 2020). Achieving good sound quality is difficult, since this depends on minute timing and adjustments of volume. (Sound quality of amateur efforts by church congregations is admittedly poor.) Making music together works best when there is a strong beat and repeated musical motifs—i.e., when there is a pronounced rhythmic coordination, as in successful conversational IRs. More complex music is more difficult to produce by remote coordination. No doubt it will be possible to compare such recordings with conventionally produced ones over the coming year.

When sports events are played without live audiences, can crowd enthusiasm be supplied by canned cheers? There is, in fact, considerable experience over the years with TV broadcasts, including the long-standing practice of laugh tracks in comedy shows. Most listeners find these artificial; research is needed, however, comparing the sounds and laughs audiences make when they are at a live show or when watching it with a sound track. We also know that important games attract enthusiastic fans even when ticket prices are high—and here TV viewers can actually hear the sound of a live crowd reacting to the action.

What is the extra ingredient of group emotional contagion needed? A natural experiment occurred in March 2013 when fans were banned from a Tunisian soccer match because of political tensions (Wall Street Journal, May 27, 2020). Fans were able to download an app that connected to loudspeakers in the stadium, producing recorded cheering that got louder as more people tapped on their smartphones more frequently. Fans could thus hear the effect of their own remote “cheering,” and presumably so could the players on the field (although there are no interviews about the players’ experiences). Audience enthusiasm was high, and much local publicity was given to the experiment. The key ingredient is feedback, from one individual fan to another; they were able to monitor how their own action fit into the dynamics of making collective sounds. This feeling of collective participation should be highest, not when sound is kept at a maximum, but when participants can perceive rising and falling levels in accordance with their own actions. This is what happens in real audiences, who can monitor each other in all perceptual channels (such as recognizing when the wave is going around the stadium and when it is fading out). If remote-communications technology is to generate the solidarity and energy of embodied gatherings, it is such details of the IR mechanism that must be reproduced.

Summing up

We have two kinds of evidence to consider: [A] what happens when ordinary F2F interaction is eliminated or curtailed; [B] what happens when F2F interaction is substituted by electronic media interaction. What modifications or restrictions of IR theory are necessary in each case?

Most strongly disrupted have been group assemblies and audiences. Particularly vulnerable are gatherings where the group shouts, sings, or makes noise together, since the coronavirus is a respiratory disease; on the other hand, it is precisely these noise-making activities that are the central rhythmic activity constituting them as a high-solidarity group. Durkheim took religious assemblies, as well as enthusiastic political assemblies, as the archetype of solidarity through interaction ritual. It is not surprising that the groups which have most strongly resisted social distancing and masking have been religious groups (generally of the more emotionally expressive sects). Similarly, during the first week of highly emotional demonstrations in the Black Lives Matter movement in early June, photos show that a high proportion of protestors ignored social distancing and masking. Groups that want high Durkheimian solidarity reject restraints on the ingredients of IRs; in keeping with IR theory they regard the morality of what they are demonstrating through their ritual as higher than any other claims.

The solidarity that pedestrians ordinarily display when passing each other on the street is a much weaker sort. It is largely the kind of let’s-tacitly-agree-to-pass-each-other-without-causing-problems minimal solidarity that Goffman called “civil inattention”; plus sometimes small polite recognition of friendly relationships with persons of a similar social status. Social distancing and masking strained these processes. Civil inattention was violated by people crossing the street or moving away as someone else approached; in ordinary circumstances this would signal fear or disdain. At such moments, a new ritual was invented: when avoiding each other, persons would wave, briefly make eye contact, and sometimes call out a greeting. This falls into the category of what Goffman called repairs: rituals to make amends for a recognized failure in proper interaction ritual.

This had a paradoxical effect: the total amount of solidarity rituals expressed in walking increased during the early phases of the coronavirus lockdown; more people were out taking exercise (since they were otherwise confined at home, gyms closed, sports prohibited), and they saw each other on the street far more often. This also may be regarded as Simmelian solidarity, the increased we’re-in-this-together feeling at the outset of a public emergency. Having previously measured the length of this Simmelian solidarity period during the 9-11-2001 experience, I expected that solidarity gestures would peak around 2 or 3 months and then decline. This is confirmed in my data; greetings fell off sharply in the third month. If we take mask-wearing as a sign of solidarity and commitment to a public cause, this also declined quite sharply, so that by the fourth and fifth months, mask-wearing in public places had declined to a small minority (see Footnote 5). Here there is an important distinction between formal organizations and informal activity; mask-wearing became enforced very widely in stores, medical offices, and government buildings, at the same time that masking was largely dropped in most other places. Simmelian solidarity is a version of spontaneous Durkheimian solidarity, and has similar time-dynamics. In contrast, officially enforced formal regulations attempt to override spontaneous feelings and disregard any psychological time-limits. Formal regulations may become accepted as a matter of routine but they cease to convey any feelings of ritual solidarity.

Another unexpected finding was that family groups of parents and children were much more often seen on the streets. Parents and children were both home and were spending more time together than normally; from the emotional expressions as they bicycled or walked together, children generally looked happy, and their parents too. This is not a paradox for IR theory; the health emergency confined F2F encounters much more to small family groups; the emergency gave them a shared focus of attention and a concern for a common mood; thus, the number of successful family IRs appears to have gone up. (At least that is what was visible on middle-class streets and in parks.) The above applies mainly to small children.

Teenagers were notably absent from most family groups in public, and they were rarely seen out with each other. Teens are very much a separate, boundary-defending group vis-a-vis their parents and other adults, and their primary social activities (sports and sexual flirtation) were much more seriously disrupted than the activities of smaller children. Mental health surveys indicate that their levels of anxiety and depression, already at a high baseline in the pre-coronavirus period, have risen still higher. We are now spilling over into the second topic, electronic substitutes for F2F interaction, but note that teens’ higher reliance on remote media for social contacts, compared to other age groups, is consistent with the solidarity-draining and EE-draining effects of poor F2F contacts. Online sexual activity is very limiting, compared to bodily contact; a hidden, almost retro-Freudian gulf has been created between already coupled adults, locked down at home with long-term sexual partners; and non-coupled youth, whose opportunities for sexual contact or flirtatious fun are prohibited by restrictions on gatherings at parties, bars, nightclubs, and musical entertainment. It is not surprising that teens and young adults (especially the 20–29 age group) have been among the groups most explicitly opposed to these lockdowns (along with the religious groups mentioned above); they have widely ignored these bans, and sometimes demonstrated against them. In IR theory, the mutual entrainment of sex is one of the strongest rituals, especially important when initiating a relationship; depriving youth of these embodied interactions raises both depression and resistance.

We come now to electronic media as substitutes for F2F interaction. Carrying out work and school by remote electronic access has both utilitarian and social aspects. Many people find practical advantages, saving time and expense of travel, but there are also practical difficulties: technical glitches, breakdowns in coordination; ironically, remote work increases difficulty in getting technical support when helping personnel are also remote. Since conflicts of interest exist between persons in authority and those subject to it, some persons prefer remote work or schooling because it is easier to tune out from meetings and lessons, and do what one is interested in. Authority and social control are harder to exercise without F2F surveillance. There are also more opportunities for rebellion: hacking, pranks, insults to authority disguised as technical breakdowns, anonymous disruptions. Conflict and violence in general are easier to carry out when F2F intersubjectivity is missing; most successful violence occurs when the target is not looking at the attacker, and remote interaction offers just this kind of opportunity. This is the obverse side of F2F as an ingredient of effective interaction rituals; when the ingredient is missing, action at cross-purposes (i.e., conflict) is more frequent.

The kinds of work and teaching/learning involving creativity and innovation particularly suffer when embodied interaction is missing. What remains is talk in its more routine and formal aspects. Persons who already have F2F ties before the shift to remote interaction find it easier to continue the tone of those interactions than persons who have never met; in network terms, it is easier to maintain an existing network than to develop significant new ties remotely. Participants in remote work and schooling divide between those more concerned with utility, and those who value the social experience. Utility is a strong consideration for upper management concerned to reduce costs, as it has been for several decades, hoping to reduce physical plant and save on salaries with the expectation, for example, that fewer remote teachers can do the work of many. In the natural experiment of the coronavirus epidemic, however, many participants have found that teaching is even more labor-intensive. Remote work does carry on with a moderate degree of success; social order continues to exist, but with a lower degree of solidarity and emotional energy. This is consistent with the general point: interaction ritual is weaker when F2F embodied co-presence is lacking, because it is more difficult to achieve high levels of mutual focus of attention and rhythmic coordination.

The recent concept of Zoom fatigue shows there is growing recognition of the micro-interactional difficulties of carrying out satisfactory social relationships remotely, which is to say, successful interaction rituals. The difficulty can be pin-pointed to problems of establishing normal eye contact and rhythm of speaking without the tacit coordination of bodily gestures. Persons trying to generate enthusiastic cooperation in large groups, and those such as psychiatrists concerned with emotional resonance, find remote interactions frustrating. Again we find, not that no interactional solidarity can be generated remotely, but that it is diluted in strength, and frustrating for persons whose expectations are higher.

Conclusion

We can now answer the questions posed at the outset. The theory of interaction rituals is not superseded; we do not need to invent a new sociology and psychology for the IT era. As far as human beings are concerned, political authorities and technological developments may force people to forgo much embodied interaction. People are culturally malleable, but if that means that after a period of acclimation, we can get used to anything, it does not follow that we can do so without paying a price. If people are deprived of embodied interactions, we can expect they will be more depressed, less energetic, feel less solidarity with other people, become more anxious, distrustful, and sometimes hostile.

But what if technology is tweaked so that it better mimics the ingredients and feedback processes that generate successful interaction rituals? That may well be happening, to a degree. Our enforced natural experiments, as well as ongoing studies of interaction in the IT era, show that the ingredient of physical co-presence is chiefly important because it enables the key processes, establishing a mutual focus of attention, and monitoring all the sensory signs of emotion, action, and rhythm, for the degree to which they are shared or at cross-purposes. Adding more sensory modes to electronic media (perhaps eventually by encoding brain signals) may make it possible to technologically mimic quite strong IRs. This would also open the way to a very dangerous form of hacking and manipulation; even a benign form of brain-induced IRs would likely create a drug-like addiction.

Interaction Ritual theory is an attempt to incorporate what we have learned from the sociology of religion, and from the details of human interaction generally, insofar as it involves back-and-forth communication of emotions, cognitions, and bodily rhythms. Future “natural experiments” with electronic media will add refinement to what we know.