Although the human hand has been used in a variety of whiteboard animations presented in experimental research (e.g., Türkay, 2016; van der Meij & Draijer, 2021), there is still no empirical evidence as to whether this is conducive to learning at all. However, there is already a considerable body of theory and frameworks that argue for or against the inclusion of the hand in whiteboard animations.
Human instructor
One way to increase the social affordances of computer-based learning environments such as whiteboard animations is the implementation of human instructors (e.g., Pi et al., 2020; Wang & Antonenko, 2017; Wilson et al., 2018). In recent years, several studies have investigated the effects of a human instructor on multimedia learning (e.g., Lawson et al., 2021; Ramlatchan et al., 2020; Wang et al., 2020). Human instructors must be distinguished from pedagogical agents, which are defined as virtual (nonhuman) on-screen characters (Martha & Santoso, 2019). However, Henderson and Schroeder (2021) argue that the use of human instructors and pedagogical agents follows the same logic, as both serve to enhance social interaction in learning and ultimately improve learning performance. Similarly, studies comparing human and virtual instructors have found no significant differences in terms of their effects on learning performance (e.g., Horovitz & Mayer, 2021). Lawson et al. (2021) were also able to show that learners can similarly recognize emotional tones displayed by human or virtual instructors.
It is generally assumed that such social partners serve educational purposes by guiding the learner through a digital learning environment (Heidig & Clarebout, 2011). The instructor or agent can be seen as a knowledgeable mentor who motivates the learner (Baylor & PALS, 2003). In principle, the human instructor does not have to be completely visible, i.e., from head to toe. For example, it is sufficient if a human hand is visible (e.g., Fiorella & Mayer, 2016; Schroeder & Traxler, 2017) or if a human voice (e.g., Atkinson et al., 2005; Mayer et al., 2003) can be heard reciting the learning content. Studies have shown that an instructor in a multimedia learning environment increases intrinsic motivation (Beege et al., 2022) and mental effort (Lin et al., 2020).
The approach of implementing pedagogical agents or instructors in multimedia learning is closely related to the embodiment principle (e.g., Fiorella, 2021). Based on the assumption that the human motor system is involved in a variety of cognitive tasks (e.g., mathematics; Wakefield et al., 2019), the embodiment principle recommends implementing task-relevant sensorimotor experiences in the learning environment. In this context, studies have shown that a pedagogical agent performing physical movements such as gestures within a learning environment leads to improved learning performance (e.g., Mayer & DaPra, 2012; for a meta-analysis see Davis, 2018). It is not always necessary for the learner to perform such movements; often it is sufficient to observe them (e.g., Cook et al., 2013). This kind of “thinking with the body” extends working memory capacity and cognitively relieves the learner (Sepp et al., 2019).
The justification for including human instructors in multimedia learning environments can be further explained by several other theories. From a human–computer interaction perspective, the beneficial effect of pedagogical agents can be explained by the computers as social actors (CASA) paradigm (Nass et al., 1994). According to this paradigm, people tend to interact with a pedagogical agent presented on a computer in a similar way as they would with a real person. When people attribute human-like characteristics to a digitally presented instructor, the persona effect comes into play (e.g., Craig et al., 2002). For this effect to occur, instructors need to have a persona, that is, to be an authentic agent that facilitates learning and appears engaging, human-like, and credible (Baylor & Ryu, 2003). Similarly, the social agency theory, which is anchored in multimedia learning research (Mayer et al., 2003), posits that the presence of a pedagogical agent or human instructor in a multimedia learning environment causes learners to feel that they are engaged in social interaction. When learners perceive such social cues, they become more engaged in learning, which in turn is associated with better learning outcomes. In the same vein, the cognitive-affective-social theory of learning in digital environments (CASTLE), proposed by Schneider et al. (2022a), argues that social cues resulting from the interaction with pedagogical agents or human instructors activate social schemata that lead to improved learning-relevant, motivational, and metacognitive processes.
Seductive detail
It is often argued that a human hand in an instructional video or whiteboard animation is unnecessary or even detrimental to learning, leading to the classification of the human hand as a seductive detail (for a meta-analysis, see Sundararajan & Adesope, 2020). This position is supported by cognitive load theory (CLT; Sweller, 2020). In line with this cognitively oriented framework, the hand in a whiteboard animation can be defined as interesting but irrelevant information that is not essential for achieving the learning goal (e.g., Harp & Mayer, 1998). According to CLT, the argument against such seductive details is that they increase extraneous cognitive load (ECL). In general, ECL depends on the presentation and design of the learning material (Sweller et al., 2019).
The aim, therefore, is to reduce extraneous processing by providing appropriate learning materials so that cognitive resources are not wasted on processes irrelevant to learning. This frees up enough resources to deal with the complexity of the information to be learned. The complexity of the learning material is referred to as intrinsic cognitive load (ICL). It is assumed that task complexity depends on element interactivity, which describes the amount of information that must be processed at the same time. In addition, the complexity of a task can be reduced if learners can draw on prior knowledge (Sweller et al., 2019). The third type of cognitive load, germane cognitive load (GCL), refers to learning-relevant activities in which learners actively invest cognitive resources (Kalyuga, 2011). In contrast to ICL and ECL, which are perceived passively, GCL plays an active role in learning. Ideally, the investment of cognitive resources results in knowledge being stored in long-term memory in the form of schemata (Kirschner, 2002).
Returning to the human hand in whiteboard animations as a seductive detail: It is suggested that when ECL is increased, fewer cognitive resources are available to devote to intrinsic load. In this context, CLT recommends a “less is more” approach to the design of learning environments (Mayer, 2014). This means that learning materials should be designed in such a way that available working memory resources are used for germane processing, i.e., the construction and automation of schemata (Kirschner, 2002; Paas & van Merriënboer, 2020). Similarly, the coherence principle, derived from CTML, suggests that non-essential visual information, such as a human hand, should be avoided because processing it unnecessarily consumes cognitive resources (Mayer et al., 2008). Consistent with this, Schroeder and Traxler (2017) found that the inclusion of a human hand in an instructional video is associated with lower learning performance compared to a condition in which the human hand is absent. The authors suggest that the hand is an extraneous feature that consumes working memory resources that would be needed for learning. This seems to be especially the case when teaching complex topics (causing high ICL) with an instructional video. However, the study by Schroeder and Traxler (2017) was based purely on a comparison of whether the presence of a human hand is conducive to learning or not.
Dynamic drawing principle
A salient feature of whiteboard animations is the drawing of visual content by a human hand, similar to a teacher writing on a whiteboard. In this context, empirical findings support the idea of implementing human-generated drawings in instructional videos. In multimedia learning research, the dynamic drawing principle (Fiorella & Mayer, 2016; Fiorella et al., 2019, 2020; Mayer et al., 2020) states that people learn better when a video lecture shows the instructor drawing content than when the instructor refers to already drawn content. Accordingly, in a study by Fiorella and Mayer (2016), students watched a video lecture either in an already-drawn format or watched the instructor draw the content by hand. Across four experiments, results revealed that watching an instructor draw content is beneficial for learning. It also appeared that watching the instructor draw content was only beneficial for learners with low prior knowledge. Furthermore, observing a drawing instructor promotes learning when both the instructor’s body and the instructor’s hand are visible. The instructor drawing hypothesis was also confirmed in another study by Fiorella et al. (2019).
As outlined by Fiorella et al. (2020), there are cognitive and motivational benefits to observing dynamic drawings, as is common in whiteboard animations. In this context, basic principles of multimedia learning are considered when content is drawn by an instructor (e.g., Fiorella & Mayer, 2016). Thus, dynamic drawings act as signals that direct learners’ attention to learning-relevant content (signaling principle or cueing principle; Alpizar et al., 2020; Chun, 2000). In addition, the simultaneous presentation of visual drawings and corresponding oral explanations supports temporal contiguity within the learning material (e.g., Ginns, 2006). In this way, ECL can be reduced so that sufficient cognitive resources are available for processing learning-relevant information. Considering the above remarks on social cues, it is assumed that dynamic drawing motivates learners to engage deeply in generative processing (Fiorella et al., 2020). As a result, learners are motivated to invest mental effort in learning so that engaged learning leads to successful learning.