1 Introduction and motivation

In mobile augmented reality (AR), users look at the live image of the video camera on their mobile phone, and the scene that they see (i.e. the reality) is enriched by integrated three-dimensional virtual objects (i.e. an augmented reality; cf. Fig. 1). This technique is assumed to offer tremendous potential for applications in areas such as cultural heritage, entertainment, or tourism. Imagine, for example, a fantasy game with flying dragons, hidden doors, trolls that can change their appearance based on the environment, exploding walls, and objects that catch fire. Huynh et al. present a user study of a collaboratively played augmented reality board game using handhelds that shows that such games are indeed “fun to play” and “enable a kind of social play experience unlike non-AR computer games” [21].

Fig. 1

Mobile augmented reality enhances reality (represented by the live video stream on the mobile) with 3D virtual objects (superimposed at a position in the video that corresponds to a related spot in the real world) as illustrated here by a virtual flower growing in a real plant pot

If virtual objects are just associated with a particular position in the real world but have no direct relation to physical objects therein, the accelerometer and compass integrated in modern phones are sufficient to create an AR environment. Using location information, for example via GPS, users can even move in these AR worlds. If virtual objects have some sort of connection to the real world other than just a global position, such as a virtual plant growing in a real plant pot, further information is needed, for example the concrete location of the pot within each video frame. This is usually achieved either via markers (e.g. [38, 42]) or via natural feature tracking (e.g. [43–45]). For general information about augmented reality and related techniques, we refer to [8].

In order to explore the full potential of AR on mobile phones, users need to be able to create, access, modify, and annotate virtual objects and their relations to the real world by manipulations in 3D. However, current interaction with such AR environments is often limited to pure 2D pointing and clicking via the device’s touch screen. Such interaction poses several problems in mobile AR. First, users are required to hold the device in the direction of a virtual object’s position in the real world, thus forcing them to take a position that might not be optimal for interaction (cf. Fig. 2, left). Second, in general interface design we can, for example, make icons big enough so that they can easily be hit even with larger fingers. In AR, the size of the virtual objects on the screen is dictated by the real world, and thus they might be too small for easy control via touch screen. In addition, moving fingers over the screen covers large parts of the content (cf. Fig. 2, center). Third, interactions in 3D, such as putting a virtual flower into a real plant pot, have to be done by 2D operations on the device’s screen (cf. Fig. 2, right). Finally, in a setting with real and virtual objects, such as an AR board game, users must constantly switch between operations on the board (to manipulate real game pieces) and interaction on the touch screen (to interact with virtual characters and objects), as illustrated in the left and center images of Fig. 3. It seems obvious that a smooth integration of both interaction tasks is more natural and should lead to a better game play experience. Some of these intuitive arguments are confirmed by recent studies such as [35], which identified differences in task characteristics when comparing laboratory studies with building-selection tasks in an urban area.

Fig. 2

Potential problems with mobile augmented reality interaction via touch screen

Fig. 3

Direct interaction with real objects on a table (left), remote interaction via touch screen with a virtual object (center), and finger interaction with a virtual object that is not associated with the table or any other physical object (right)—notice that the virtual object (red balloon) is only visible on the touch screen and was just added for illustration purposes

To deal with these issues, we investigate the potential of gesture interaction based on finger tracking in front of a phone’s camera. While this approach addresses the last problem by offering a smooth integration of interaction with both virtual and real objects, it also raises many potential issues, not just in terms of robust tracking of fingers and hand gestures, but also concerning the actual interaction experience. In this article, we focus on the latter. Using markers attached to the fingers, we evaluate canonical interactions such as translating, scaling, and rotating virtual objects in a mobile AR setting. Our goal is to verify the usability of this concept, identify typical usage scenarios as well as potential limitations, and investigate useful implementations. For work related to marker-less hand tracking in relation to mobile and augmented reality interaction, we refer to [3, 39, 41, 46–48].

In the remainder of this article, we start by describing the context of our research and discussing related work in Section 2. Section 3 presents a first user study evaluating finger-based interaction with virtual objects in midair [22]. We compare this concept to touch screen and another, sensor-based approach in order to investigate the general feasibility of finger-based interaction. Based on a detailed analysis, we describe a second experiment in Section 4, investigating different finger-based interaction concepts for objects related to a physical object (e.g. a virtual object on a real board in an AR board game), as illustrated in Fig. 3, left and center. Final conclusions and future work are presented in Section 5.

2 Context and related work

Most current commercial AR games on mobiles such as smart phones or the recently released Nintendo 3DS feature rather simple interaction with virtual objects, such as selecting them or making them explode by clicking on a touch screen. However, finger gestures on a touch screen impose certain problems in such a setting, as discussed in Section 1. Alternative approaches for navigation in virtual reality environments on mobile phones include tilting [9, 10, 14]. This interaction concept is unsuitable in AR because the orientation of the device is needed to create the augmented world. For the same reason, interactions based on motion gestures, such as the ones studied in [36, 37], are not applicable to this scenario.

One of the most common approaches for interaction in AR is tangible interfaces, where users take real objects that are recognized and tracked by the system to manipulate virtual objects [5, 16, 24, 40]. While this technique has proven to work very well and intuitively in many scenarios, in others it suffers from forcing the user to utilize and control real objects even when interacting with purely virtual ones. Another approach that recently gained a lot of attention due to research projects such as MIT’s Sixth Sense [29, 30] and commercial tools such as Microsoft’s Kinect [25] is interaction via finger or hand tracking. Related work demonstrated the benefits of using hand tracking for the manipulation of virtual reality and 3D data, for example in Computer Aided Design [46].

On mobile phones, both industry [13] and academia [31, 41] have started to explore utilizing the user-facing camera for gesture-based control of traditional interfaces. Examples of tangible interaction in the context of AR on mobile phones are [33, 34]. [1, 18] use a combination of tangible, joypad-, and key-based interaction. [19, 20] evaluate the usage of movement of the phone in combination with a traditional button interface. [23] gives an example of body part tracking (e.g. hands) to interact with an AR environment in a setting using external cameras and large screens. [11] features one of the first approaches to utilize finger tracking in AR, using a camera and projector mounted on the ceiling. A comparable setup is used by [26], where interaction with an augmented desk environment was done by automatic finger tracking. [12] uses markers attached to the index finger to realize interaction in an AR board game. [28] presents a detailed study on the usability and usefulness of finger tracking for AR environments using head-mounted displays.

Particularly the work by [28] demonstrates promising potential for the usage of finger tracking in AR interaction. However, it is unclear if these positive results still apply when mobile phones are used to create the augmented reality instead of head-mounted displays. Examples using mobile handheld devices include [39], which creates AR objects in the palm of one’s hand by automatic hand detection and tracking. [4] presents a system where free-hand drawings are used to create AR objects in a mechanical engineering e-learning application. Hagbi et al. use hand-drawn shapes as markers for AR objects [17]. However, neither enables explicit object manipulation, which is the main focus of our research. Using finger tracking for gesture-based interaction, as proposed in our work, has also been investigated by [3, 6, 48]. [6] describes design options for such a concept and discusses related advantages. The authors claim that a related user test on biomechanics showed no significant differences in physical comfort between their approach and other ways of interaction, but no further information is given. The concept was also identified as being “useful”, “innovative”, and “fun” based on a video illustrating the approach and interviews with potential users. However, no actual implementation or prototype was tested. [3] and [48] both focus on realizing marker-less hand tracking, but provide limited information on actual usability. [3] gives a detailed description of potential use cases, but no testing has been done.

The goal of the research presented in this article is to further investigate and explore the full potential of finger tracking for mobile AR interaction. The usability of this concept could be critical due to several limiting factors. First, there are restrictions imposed by the hardware. The camera’s field of view (FOV) defines the region in which a hand or finger can be tracked. For example, the camera of the phone we used in our first experiment features a horizontal and vertical viewing angle of approximately 55° and 42°, respectively. The camera’s resolution has an influence on how fine-grained the interaction can be. For example, a low resolution does not allow very detailed tracking, especially if the finger is very close to the camera. Second, there are restrictions imposed by human biomechanics. Every arm has a reachable workspace, which can be defined as the volume within which all points can be reached by a chosen reference point on the wrist [27]. It is restricted, among other characteristics, by the human arm length, which, for example, in the Netherlands is on average about 0.5968 m for males and 0.5541 m for females. In addition, the phone has to be held at a certain distance from the eyes. For example, the Least Distance of Distinctive Vision (LDDV), which describes the closest distance at which a person with normal vision can comfortably look at something, is commonly around 25 cm [2]. Considering all these issues, we get a rather limited area that can be used for interaction in a mobile AR scenario, as illustrated in Fig. 4. [48] observed that the size of this area (area (1) in Fig. 4) is around 15 to 25 cm.

Fig. 4

Area that can be used for interaction (1), depending on the reachable workspace (2), the least distance of distinctive vision (3), and the camera’s FOV (4)
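To give a feeling for how these constraints combine, the following back-of-the-envelope sketch estimates the width of the camera’s view at different finger distances using the example values quoted above (the constants are those example values, not additional measurements, and the code is purely illustrative):

```java
// Rough estimate of the usable interaction area sketched in Fig. 4.
// All constants are the example values from the text; they are not
// measurements and merely illustrate the order of magnitude.
public class InteractionAreaEstimate {

    static double frustumWidthAt(double distanceM, double fovDeg) {
        // Width of the camera's field of view at a given distance.
        return 2.0 * distanceM * Math.tan(Math.toRadians(fovDeg / 2.0));
    }

    public static void main(String[] args) {
        double fovH = 55.0;   // horizontal FOV of the phone used in experiment 1
        double lddv = 0.25;   // least distance of distinctive vision (eye to phone)
        double arm  = 0.60;   // approximate arm length (reachable workspace radius)

        // The finger can roughly operate between "just in front of the lens"
        // and the far end of the reachable workspace minus the eye-to-phone gap.
        double maxFingerDistance = arm - lddv;            // ~0.35 m in front of the camera
        double widthNear = frustumWidthAt(0.10, fovH);    // finger close to the lens
        double widthFar  = frustumWidthAt(maxFingerDistance, fovH);

        System.out.printf("View width at 10 cm: %.2f m%n", widthNear);   // ~0.10 m
        System.out.printf("View width at %.2f m: %.2f m%n",
                maxFingerDistance, widthFar);                            // ~0.36 m
    }
}
```

At intermediate finger distances of 15 to 25 cm, this yields a view width of roughly 16 to 26 cm, which is in line with the size of area (1) reported in [48].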

3 Interaction with objects floating in midair

3.1 Interaction concepts and tasks

3.1.1 Goal and motivation

In order to evaluate the general feasibility of finger tracking in relation to mobile AR, we start by comparing it with standard interaction via touch screen and another interaction concept that depends on how the device is moved (utilizing the accelerometer and compass sensors). Our overall aim is to target more complex operations with virtual objects than pure clicking. In the ideal case, a system should support all canonical object manipulations in 3D, i.e. selection, translation, scaling, and rotation. However, for this initial study, and in order to be able to better compare it to touch screen based interaction (which by default only delivers two-dimensional data), we restrict ourselves here to the three tasks of selecting virtual objects, selecting entries in context menus, and translating 3D objects in 2D (i.e. left/right and up/down). Objects are floating in midair, i.e. they have a given position in the real world but no direct connection to objects therein (cf. Fig. 3, right).

3.1.2 Framework and setup

AR environment

We implemented all three interaction concepts described below on a Motorola Droid/Milestone phone with Android OS version 2.1. Since we restricted ourselves to manipulations of virtual objects floating in midair in this study, no natural feature tracking was needed. The AR environment used in this experiment thus relied solely on sensor information from the integrated compass and accelerometer.
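As an illustration of how such a sensor-only AR view can be driven, the sketch below derives the device orientation from the accelerometer and compass using the standard Android sensor API; the class and variable names are illustrative and not taken from our actual prototype:

```java
import android.hardware.Sensor;
import android.hardware.SensorEvent;
import android.hardware.SensorEventListener;
import android.hardware.SensorManager;

// Sketch of a sensor-only orientation source for the midair AR view:
// accelerometer and compass readings yield a device orientation, which is
// then used to decide where in the live video a virtual object is drawn.
public class OrientationTracker implements SensorEventListener {

    private final float[] gravity = new float[3];
    private final float[] geomagnetic = new float[3];
    private final float[] rotation = new float[9];
    private final float[] orientation = new float[3]; // azimuth, pitch, roll (radians)

    @Override
    public void onSensorChanged(SensorEvent event) {
        if (event.sensor.getType() == Sensor.TYPE_ACCELEROMETER) {
            System.arraycopy(event.values, 0, gravity, 0, 3);
        } else if (event.sensor.getType() == Sensor.TYPE_MAGNETIC_FIELD) {
            System.arraycopy(event.values, 0, geomagnetic, 0, 3);
        }
        if (SensorManager.getRotationMatrix(rotation, null, gravity, geomagnetic)) {
            SensorManager.getOrientation(rotation, orientation);
            // orientation[0] = azimuth: where the camera points on the horizon;
            // orientation[1] = pitch: how far it is tilted up or down.
            // A virtual object fixed at a world azimuth/pitch is drawn at a screen
            // offset proportional to the angular difference to these values.
        }
    }

    @Override
    public void onAccuracyChanged(Sensor sensor, int accuracy) { /* not needed here */ }
}
```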

3.1.3 Interaction concepts

Touch screen-based concept

For the standard touch screen based interaction, the three tasks have been implemented in the following way: selecting an object is achieved by simply clicking on it on the touch screen (cf. Fig. 5, left). This selection evokes a context menu implemented in a pie menu style [7], which has proven to be more effective for pen and finger based interaction (compared to list-like menus as commonly used for mouse or touchpad based interaction; cf. Fig. 5, right). One of these menu entries puts the object into “translation mode”, in which a user can move an object around by clicking on it and dragging it across the screen. If the device is moved without dragging the object, the object stays at its position with respect to the real world (represented by the live video stream). Leaving translation mode by clicking on a related icon fixes the object at its new final position in the real world.

Fig. 5

Touch screen interaction: select an object (left) or an entry from a context menu that pops up around the object after selection (right)
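For illustration, the following sketch shows one possible way to map a tap to a pie-menu entry, assuming the entries are laid out as equally sized sectors around the selected object; the layout details and names are assumptions for this example, not the exact code of our implementation:

```java
// Map a tap position to a pie-menu entry laid out around a selected object.
// Assumes n equally sized sectors starting at the top of the circle; the
// actual menu layout in the prototype may differ.
public final class PieMenu {

    private PieMenu() {}

    /**
     * @return index of the hit entry in [0, entries), or -1 if the tap is
     *         outside the ring between innerRadius and outerRadius.
     */
    public static int hitTest(float tapX, float tapY,
                              float centerX, float centerY,
                              float innerRadius, float outerRadius,
                              int entries) {
        float dx = tapX - centerX;
        float dy = tapY - centerY;
        double dist = Math.hypot(dx, dy);
        if (dist < innerRadius || dist > outerRadius) {
            return -1; // tap outside the menu ring
        }
        // Angle measured clockwise from "up" (screen y grows downwards).
        double angle = Math.atan2(dx, -dy);
        if (angle < 0) {
            angle += 2 * Math.PI;
        }
        double sector = 2 * Math.PI / entries;
        return (int) (angle / sector);
    }
}
```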

In terms of usability, we expect this approach to be simple and intuitive because it conforms to standard touch screen based interaction. In case of menu selection, we can also expect it to be reliable and accurate because we are in full control of the interface design (e.g. we can make the menu entries large enough and place them far enough apart that they can easily be hit with a finger). We expect some accuracy problems, though, when selecting virtual objects, especially if they are rather small, very close to each other, or overlapping because they are placed behind each other in the 3D world. In addition, we expect that users might feel uncomfortable interacting with the touch screen while they have to hold the device upright and targeted towards a specific position in the real world (cf. Fig. 2, center). This might be particularly true in the translation task for situations where the object has to be moved to a position that is not shown in the original video image (e.g. if users want to place it somewhere behind them).

Device-based concept

Our second interaction concept uses the position and orientation of the device (defined by the data delivered from the integrated accelerometer and compass) for interaction. In this case, a reticule is visualized in the center of the screen (cf. Fig. 6, left) and used for selection and translation. Holding it over an object for a certain amount of time (1.25 s in our implementation) selects the object and evokes the pie menu. In order to avoid accidental selection of random objects, a progress bar is shown above the object to illustrate the time till it is selected. Menu selection works in the same way by moving the reticule over one of the entries and holding it till the bar is filled (cf. Fig. 6, right). In translation mode, the object sticks to the reticule in the center of the screen while the device is moved around. It can be placed at a target position by clicking anywhere on the touch screen. This action also forces the system to leave translation mode and go back to normal interaction.

Fig. 6

Device based interaction: selection by pointing a reticule (left) to the target till the related bar is filled (right)
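The dwell-time selection used for this and the finger-based concept can be sketched as follows (the 1.25 s threshold is the value from our implementation; the class and method names are illustrative):

```java
// Dwell-time selection: a target counts as selected once the reticule (or
// tracked fingertip) has stayed on it long enough; the elapsed fraction
// drives the progress bar drawn above the target.
public class DwellSelector {

    private static final long DWELL_TIME_MS = 1250; // 1.25 s as in experiment 1

    private Object currentTarget;
    private long dwellStartMs;

    /** Call once per frame with the target currently under the reticule (or null). */
    public float update(Object targetUnderCursor, long nowMs) {
        if (targetUnderCursor == null || targetUnderCursor != currentTarget) {
            // Cursor left the target (or moved to a new one): restart the timer.
            currentTarget = targetUnderCursor;
            dwellStartMs = nowMs;
            return 0f;
        }
        // Progress in [0, 1] for the progress bar; 1 means "selected".
        return Math.min(1f, (nowMs - dwellStartMs) / (float) DWELL_TIME_MS);
    }

    public boolean isSelected(float progress) {
        return progress >= 1f;
    }
}
```

Restarting the timer whenever the cursor leaves the target is what prevents the accidental selection of random objects mentioned above.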

Compared to touch screen interaction, we expect this “device based” interaction to take longer when selecting objects and menu entries because users cannot select them directly but have to wait till the progress bar is filled. In terms of accuracy, it might allow for a more precise selection because the reticule can be pointed more accurately at a rather small target than a finger. However, holding the device at one position over a longer period of time (even if it is just 1.25 s) might prove to be critical, especially when the device is held straight up in the air. Translation with this approach seems intuitive and might be easier to handle because people just have to move the device (in contrast to the touch screen, where they have to move the device and drag the object at the same time). However, placing the object at a target position by clicking on the touch screen might introduce some inaccuracy because we can expect the device to shake a little when being touched while held up straight in the air. This is also true for the touch screen based interaction, but might be more critical here because for touch screen interaction the finger already rests on the screen. Hence, we just have to release it, whereas here we have to explicitly click the screen (i.e. perform a press-and-release action).

Finger-based concept

Touch screen based interaction seems intuitive because it conforms to regular smart phone interaction with almost all common applications (including most current commercial mobile AR programs). However, it only allows remote control of the three-dimensional augmented reality via 2D input on the touch screen. If we track the users’ index finger when their hand is moved in front of the device (i.e. when it appears in the live video on the screen), we can realize a finger based interaction where the tip of the finger is used to directly interact with objects, i.e. select and manipulate them. In the ideal case, we can track the finger in all three dimensions and thus enable full manipulation of objects in 3D. However, since 3D tracking with a single camera is difficult and noisy (especially on a mobile phone with a moving camera and relatively low processing power), we restrict ourselves to 2D interactions in the study presented in this paper. In order to avoid influences of noisy input from the tracking algorithm, we also decided to use a robust marker based tracking approach where users attach a small sticker to the tip of their index finger (cf. Fig. 7). Object selection is done by “touching” an object (i.e. holding the finger at the position in the real world where the virtual object is displayed on the screen) until an associated progress bar is filled. Menu entries can be selected in a similar fashion. In translation mode, objects can be moved by “pushing” them. For example, an object is moved to the right by approaching it with the finger from the left side and pushing it rightwards. Clicking anywhere on the touch screen places the object at its final position and leaves translation mode.

Fig. 7

Interaction by using a green marker for tracking that is attached to the tip of the finger
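As an illustration of the underlying tracking, the following simplified sketch locates the colored marker as the centroid of all sufficiently green pixels in a video frame; the thresholds and the plain RGB test are simplifications for illustration, not the exact code of our tracker:

```java
// Minimal color-blob tracker: finds the centroid of all "green enough"
// pixels in an ARGB frame, which we take as the fingertip position.
// Thresholds are illustrative; a real tracker would typically work in HSV
// space and filter noise, but the principle is the same.
public final class MarkerTracker {

    private MarkerTracker() {}

    /** @return centroid {x, y} of the green marker, or null if not found. */
    public static float[] track(int[] argb, int width, int height) {
        long sumX = 0, sumY = 0, count = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int p = argb[y * width + x];
                int r = (p >> 16) & 0xFF;
                int g = (p >> 8) & 0xFF;
                int b = p & 0xFF;
                // "Green enough": strongly green compared to red and blue.
                if (g > 100 && g > 2 * r && g > 2 * b) {
                    sumX += x;
                    sumY += y;
                    count++;
                }
            }
        }
        if (count < 50) { // too few pixels: marker not visible
            return null;
        }
        return new float[] { sumX / (float) count, sumY / (float) count };
    }
}
```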

3.2 First evaluation: objects in midair

3.2.1 Setup and procedure

Participants

In the previous section we discussed some intuitive assumptions about the usability of, and potential problems with, the respective concepts. In Section 2, we also described potential limitations of finger-based interaction in mobile AR in general. For example, Fig. 8 illustrates the problem that occurs when a finger is too close to the camera. In order to verify these expected advantages and disadvantages, we set up a user study with 18 participants (12 male and 6 female; 5 aged 15–20 years, 8 aged 21–30, one each aged 31–40 and 41–50, and 3 aged 51–52). For the finger tracking, users were free to choose where to place the marker on their fingertip. Only one user placed it on his nail; all others attached it to the inner side of their finger, as shown in Figs. 7 and 8. Eleven participants held the device in their right hand and used the left hand for interaction; for the other seven it was the other way around. No differences related to marker placement or hand usage could be observed in the evaluation.

Fig. 8

Holding the finger too close to the camera makes it impossible to select small objects (left). Moving your hand away from the camera decreases the size of the finger in the image but can result in an uncomfortable position because you have to stretch out your arm (right)

Task

A within-group study was used, i.e. each participant tested each interface (subsequently called touch screen, device, and finger) and task (subsequently called object selection, menu selection, and translation). Interfaces were presented to the participants in different orders to exclude potential learning effects. For each user, tasks were done in the following order: object selection, then menu selection, then translation, because this would also be the natural order in a real usage case. For each task, there was one introduction test in which the interaction method was explained to the subject, one practice test in which the subject could try it out, and finally three “real” tests (four in case of the translation task) that were used in the evaluation. Subjects were aware that the first two tests were not part of the actual experiment and were told to perform the other tests as fast and as accurately as possible. The three tests used in the object selection task can be classified as easy (objects were placed far away from each other), medium (objects were closer together), and hard (objects overlapped; cf. Fig. 9). In the menu selection task, the menu contained three entries, and users had to select the entry on top, at the bottom, and to the right in each of the three tests, respectively (cf. Figs. 5, 6, and 7, right). In the translation task, subjects had to move an object to an indicated target position (cf. Fig. 10). The view in one image covered a range of 72.5°. In two of the four tests, the target position was placed within the same window as the object that had to be moved (at an angle of 35° between target and initial object position, to the left and right, respectively). In the other two tests, the target was outside of the initial view, but users were told in which direction they had to look to find it. It was placed at an angle of 130° between target and object, to the left in one case and to the right in the other. The order of tests was randomized for each participant to avoid any order-related influences on the results.

Fig. 9

Object selection task: test case (single object, left), easy (multiple non-overlapping objects, center), and hard (multiple overlapping objects, right)

Fig. 10

Translation task: move object to the target (white square on the right)

Evaluation

For the evaluation, we logged the time it took to complete the task, success or failure, and all input data (i.e. the sensor data delivered from the accelerometer, compass, and the marker tracker). Since entertainment and leisure applications play an important role in mobile computing, we were not only interested in pure accuracy and performance, but also in issues such as fun, engagement, and individual preference. Hence, users had to fill out a related questionnaire and were interviewed and asked about their feedback and further comments at the end of the evaluation.

3.2.2 Results and discussion

Time

Figure 11 shows the time it took the subjects to complete the tests, averaged over all users, for each task (Fig. 11, top left) and for the individual tests within each task (Fig. 11, top right and bottom). The averages in Fig. 11, top left, show that the touch screen is the fastest approach for both selection tasks. This observation conforms to our expectations mentioned in the previous section. For the device and finger approaches, selection times are longer but still seem reasonable.

Fig. 11

Time it took to solve the tasks (in milliseconds, averaged over all users)

Menu selection task

Looking at the individual tests used in each of these tasks, times in the menu selection task seem to be independent of the position of the selected menu entry (cf. Fig. 11, bottom left). Almost all tests were performed correctly: only three mistakes happened overall with the device approach and one with the finger approach.

Object selection task

In case of the object selection task (cf. Fig. 11, top right), touch screen interaction again performed fastest and there were no differences between the levels of difficulty. However, there was one mistake among all easy tests, and in five of the hard tests the wrong object was selected, thus confirming our assumption that interaction via touch screen is critical in terms of accuracy for small or closely placed objects. Finger interaction worked more accurately, with only two mistakes in the hard tests. However, this came at the price of a large increase in selection time. Looking into the data, we realized that this was mostly due to subjects holding the finger relatively close to the camera, resulting in a large marker that made it difficult to select an individual object that was partly overlapped by others. Once the users moved their hand further away from the camera, selection worked well, as indicated by the low number of errors. For the device approach, there was a relatively large number of errors in the hard test, but looking into the data we realized that this was only due to a mistake in the setup of the experiment: in all six cases, the reticule was already placed over an object when the test started, and the subjects did not move the device away from it fast enough to avoid accidental selection. If we eliminate these users from the test set, the time for the device approach in the hard test shown in Fig. 11, top right, increases from 10,618 ms to 14,741 ms, which is still in about the same range as the times for the easy and medium levels of difficulty. Since all tests in which this initialization problem did not occur were solved correctly, we can conclude that being forced to point the device at a particular position over a longer period of time did not result in the accuracy problems we suspected.

Translation task

In the translation task, we see an expected increase in time for the finger and touch screen approaches if the target position is outside of the original viewing window (cf. Fig. 11, bottom right). In contrast, there are only small increases in solution time when the device approach is used. In order to verify the quality of the results for the translation task, we calculated the difference between the center of the target and the actual placement of the object by the participants. Figure 12 illustrates these differences averaged over all users. It can be seen that the device approach was not only the fastest but also very accurate, especially in the more difficult cases with an angle of 130° between target and initial object position. Finger-based interaction, on the other hand, seems very inaccurate. A closer look into the data reveals that the reason for the large difference between target position and actual object placement is the many situations in which participants accidentally hit the touch screen and thus placed the object at a random position before reaching the actual target. This illustrates a basic problem with finger-based interaction: users have to concentrate on two things at the same time, moving the device and “pushing” the object with their finger. With device-based interaction, on the other hand, users could comfortably hold the device in both hands while moving around, resulting in no erroneous input, high accuracy, and fast solution times. A closer look into the data also revealed that the lower accuracy of the touch-based interaction is mostly due to one outlier who performed extremely badly (most likely also because of accidental input); otherwise, it is at about the same level of accuracy as the device approach (but of course at the price of a longer solution time; cf. Fig. 11).

Fig. 12

Correctness of translation task (squared Manhattan distance in screen units)
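For reference, a direct transcription of this error metric is sketched below; the exact variant of the squared Manhattan distance is an assumption for illustration and does not affect the qualitative comparison between the three interfaces:

```java
// Placement error as used in Fig. 12: squared Manhattan distance between the
// target centre and the final object position, both in screen units.
// (Read here as the square of |dx| + |dy|.)
public final class TranslationError {

    private TranslationError() {}

    public static float squaredManhattan(float targetX, float targetY,
                                         float placedX, float placedY) {
        float d = Math.abs(targetX - placedX) + Math.abs(targetY - placedY);
        return d * d;
    }
}
```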

Fun and engagement

Based on this formal analysis of the data, we can conclude that touch-based interaction seems appropriate for selection tasks. For translation tasks, the device-based approach seems more suitable, whereas finger-based interaction seems less useful in general. However, especially in entertainment and leisure applications, speed and accuracy are not the only relevant issues; fun and engagement can be equally important. In fact, for gaming, mastering an inaccurate interface might actually be the whole purpose of the game (think of balancing a marble through a maze by tilting your phone appropriately—an interaction task that can be fun but is by no means easy to handle). Hence, we asked the participants to rank the interaction concepts based on performance, fun, and both. Results are summarized in Table 1. Rankings for performance clearly reflect the results illustrated in Figs. 11 and 12, with touch-based interaction ranked first by 11 participants and device-based interaction ranked first by six. Only one user listed finger-based interaction as top choice for performance. However, when it comes to fun and engagement, the vast majority of subjects ranked it as their top choice, whereas device- and touch-based interaction were ranked first only four times and once, respectively. Consequently, rankings are more diverse when users were asked to consider both issues. No clear “winner” can be identified here, and in fact, many users commented that it depends on the task: touch- and device-based interaction are more appropriate for applications requiring accurate placement and control, whereas finger-based interaction was considered very suitable for gaming applications. Typical comments characterized the finger-based approach as “fun” and “the coolest” of all three. However, its handling was also criticized, as summarized by one user who described it as “challenging but fun”. When asked about the usefulness of the three approaches for the three tasks of the evaluation, rankings clearly reflect the objective data discussed above. Table 2 shows that all participants ranked touch-based interaction first for both selection tasks, but the vast majority voted for the device approach in case of translation. It is somewhat surprising, though, that despite its low performance, finger-based interaction was ranked second by eight, seven, and nine users in the three tasks, respectively—another indication that people enjoyed the approach although it is much harder to handle than the other two.

Table 1 Number of times each interface was ranked first, second, and third with respect to performance, fun and engagement, and both combined (T touch screen, D device, F finger)
Table 2 Number of times each interface was ranked first, second, and third with respect to the individual tasks (T touch screen, D device, F finger)

3.2.3 Summary of major findings

Touch screen based interaction achieved the best results in the shortest time in both selection tasks and was rated highest by users in terms of performance. Device based interaction achieved the best results in the shortest time in the translation task and was ranked highly by the users for this purpose. Finger tracking based interaction seems to offer great potential in terms of fun and engagement, because users responded very enthusiastically to it in these respects. However, both the qualitative and the objective data suggest a very low performance when performing canonical operations such as selection and translation of virtual objects floating in midair. This makes it rather questionable whether this concept is suitable for interaction, especially for serious applications. Major problems identified include that holding up the device over longer periods of time can be tiresome and that moving objects over long distances, in particular, is difficult to control. However, these characteristics are typical for interaction in midair and might not appear if objects have a clear relation to the environment, such as virtual play figures in an AR board game, where people can rest their arm on the table and the interaction range is naturally restricted (cf. Fig. 3 left/center vs. right). Another major issue was low tracking performance because people held their fingers too close to the camera. Again, we expect a different experience if the objects are not floating in midair but are directly related to physical objects in the real environment. In addition, we identified that pushing the object from different sides—although originally considered an intuitive interaction—turned out to be awkward and hard to handle, for example when pushing an object from the right using the left hand. This might be overcome by investigating alternative gestures that are more natural and easier to control.

4 Interaction with objects related to the real world

4.1 Interaction concepts and tasks

4.1.1 Goal and motivation

We set up this second experiment in order to further investigate the potential of finger-based interaction and to verify whether our above arguments about the manipulation of objects that have a connection to physical ones hold. Because, given the results of the first study, we cannot expect finger interaction to perform better than the touch screen and device-based approaches, our major focus is on game play experience. Consequently, we carried out the evaluation in a board game scenario. Our major goal with this study is to identify whether and what kinds of interactions are possible, natural, and useful for achieving engaging game play. Hence, we set up an experiment comparing different realizations of the canonical interactions translation, scaling, and rotation for two kinds of virtual objects—ones related to the actual game board and ones floating slightly above it (cf. Fig. 3, left and center).

4.1.2 Framework and setup

AR environment

To create the AR environment for this second experiment, we used the Qualcomm Augmented Reality (QCAR) SDK. This SDK provides a robust and fast framework for natural feature tracking. This allowed us to use a game board created in the style of the common board game Ludo (cf. Fig. 13); no artificial markers were needed. Our implementation ran on an HTC Desire HD smartphone with Android 2.2.3, featuring a 4.3-inch screen with a resolution of 480×800 pixels and an 8-megapixel camera.

Fig. 13

Game board used in the evaluations and extracted natural features for tracking and creating the AR environment

Marker-based finger tracking

For the marker-based finger tracking we extended our color tracker from the first experiment to track two markers—a green and red one on the user’s thumb and index finger, respectively (cf. Fig. 14). Interaction was implemented by creating a bounding box around the trackers and the virtual objects on the board. The degree to which those bounding boxes overlapped or touched each other resulted in different interactions (cf. below). Notice that with this approach, fingers can again only be tracked in 2D (left/right and up/down). However, interaction in 3D (far/near) can be done by moving the fingers along with the device (cf. Fig. 15). Using more than two markers significantly reduces tracking performance because of occlusion of fingers. In addition, we did some informal testing that confirmed that using more than two fingers becomes less natural and harder to handle.

Fig. 14

A green marker (thumb) and a red one (index finger) are used to track fingers and manipulate virtual objects on the board game

Fig. 15

By moving the device with the hand, users can manipulate virtual objects in 3D even if the marker tracking of the thumb and index finger is only done in 2D—notice that the virtual object (red pawn) is only visible on the touch screen and just added for illustration purposes

4.1.3 Interaction concepts

Translation

When people move pieces in a real board game, we usually observe two kinds of interactions: the pieces are either pushed over the board or grabbed with two fingers, moved in midair, and placed at a target position. For our comparative study we implemented both versions using our marker tracking, as illustrated in Fig. 16. For pushing, an object is moved whenever a finger touches it from a particular side; hence, only the index finger is used for this interaction. Grabbing, on the other hand, uses both fingers. Once both markers come close enough to the bounding box, the virtual object is “grabbed”. It can then be moved around in midair and is dropped anywhere on the board by moving the fingers away from each other.

Fig. 16

Translation of virtual objects by pushing them with one finger (left) or grabbing, lifting, moving, and placing them with two fingers (right)—notice that the red pawn represents a virtual object that was only added here for illustration purposes. In the actual implementation it is only visible on the phone’s touch screen
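A simplified sketch of the two translation variants, based on the bounding-box logic described in Section 4.1.2, is given below; the tolerance value, step size, and class names are illustrative assumptions rather than the exact parameters of our implementation:

```java
// Sketch of the two translation variants from Fig. 16. Positions are 2D
// board coordinates; TOLERANCE is the extra margin around an object
// (cf. the tolerance distance discussed in Section 4.1.3).
public class TranslationInteraction {

    private static final float TOLERANCE = 20f; // illustrative, in screen units

    private boolean grabbed;

    /** One-finger push: the object is moved away from the touching finger. */
    public void push(VirtualObject obj, float fingerX, float fingerY) {
        if (obj.distanceTo(fingerX, fingerY) > TOLERANCE) {
            return; // finger not touching the object
        }
        // Push direction = from the finger towards the object centre.
        float dx = obj.x - fingerX;
        float dy = obj.y - fingerY;
        float len = (float) Math.hypot(dx, dy);
        if (len > 0) {
            obj.x += dx / len * obj.pushStep;
            obj.y += dy / len * obj.pushStep;
        }
    }

    /** Two-finger grab: the object follows the midpoint of thumb and index finger. */
    public void grab(VirtualObject obj, float thumbX, float thumbY,
                     float indexX, float indexY) {
        float midX = (thumbX + indexX) / 2f;
        float midY = (thumbY + indexY) / 2f;
        float gap = (float) Math.hypot(thumbX - indexX, thumbY - indexY);
        if (!grabbed && obj.distanceTo(midX, midY) <= TOLERANCE
                && gap <= obj.size + TOLERANCE) {
            grabbed = true;                 // both markers close enough: grab
        } else if (grabbed && gap > obj.size + 2 * TOLERANCE) {
            grabbed = false;                // fingers moved apart: drop the object
        }
        if (grabbed) {
            obj.x = midX;                   // object follows the fingers in midair
            obj.y = midY;
        }
    }

    /** Minimal object model used only in this sketch. */
    public static class VirtualObject {
        float x, y, size = 40f, pushStep = 4f;

        float distanceTo(float px, float py) {
            return (float) Math.hypot(x - px, y - py);
        }
    }
}
```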

Scaling

Scaling is an interaction that can only be done with virtual objects. Hence, there is no natural equivalent. However, gesture-based scaling is quite common on touch screens using a pinch-to-zoom gesture: moving two fingers apart or closer together results in zooming in and out, respectively. While this seems to be a natural gesture for scaling virtual objects in AR as well, there is one major problem: in contrast to a touch screen, we are not able to automatically determine when an object is “touched”, i.e. when a scaling gesture is being performed and when not. To deal with this issue, we implemented and evaluated two versions. In the first one, we distinguish between two gestures: touching an object with two fingers close to each other and then increasing the distance between them enlarges the object; grabbing the object from two different sides and decreasing the distance between the two fingers scales it down (cf. Fig. 17). In both cases, the object is released by moving the fingers in the opposite direction. With this, scaling down and enlarging are two separate interactions. Hence, we implemented a second version where increasing and decreasing the distance between the two marked fingers continuously changes the size of the object. To fix it at a particular size, users have to tap anywhere on the touch screen. While this interaction seems much more flexible and natural for the actual scaling, it is unclear if users feel comfortable touching the screen with a finger of the hand that currently holds it (cf. Fig. 18).

Fig. 17

Scaling of virtual objects by using different gestures—the virtual object (red pawn) is only visible on the touch screen and just added for illustration purposes

Fig. 18

Ending scaling mode by clicking anywhere on the touch screen—notice that the light red dot on the screen was only added to illustrate a tap on the screen but does not actually appear in the implementation. The red pawn illustrates a virtual object that is displayed at this position on the screen but does not exist in reality
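The second scaling variant can be sketched as follows: the object’s scale follows the changing distance between the two markers until a tap on the touch screen fixes it (names and the proportional mapping are illustrative assumptions):

```java
// Sketch of the second scaling variant (Fig. 18): the object's scale follows
// the changing distance between the two finger markers; tapping anywhere on
// the touch screen fixes the current size.
public class ContinuousScaling {

    private boolean scaling;
    private float startFingerDistance;
    private float startScale;

    /** Called per frame with the current distance between thumb and index markers. */
    public void update(VirtualObject obj, float fingerDistance, boolean fingersOnObject) {
        if (!scaling && fingersOnObject) {
            scaling = true;                       // fingers reached the object: start scaling
            startFingerDistance = fingerDistance;
            startScale = obj.scale;
        }
        if (scaling && startFingerDistance > 0) {
            // Scale changes proportionally to the change of the finger distance.
            obj.scale = startScale * (fingerDistance / startFingerDistance);
        }
    }

    /** Called when the user taps the touch screen: fix the size, leave scaling mode. */
    public void onScreenTap() {
        scaling = false;
    }

    public static class VirtualObject {
        float scale = 1f;
    }
}
```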

Rotation

Rotating an object is usually done by grabbing it with two fingers and turning it around. This can be implemented by first grabbing the object (similarly to the two-finger-based interaction for translation) and then rotating it according to the change of the angle of the line between the two markers (cf. Fig. 19). However, in our informal pre-testing we got the impression that such a gesture is hard to perform while holding the phone in the other hand. Hence, we implemented a second version where users select an object by touching it. Once it is selected, the index finger can be used to rotate the object via a circular movement. Tapping anywhere on the touch screen stops the rotation and fixes the object at its current orientation (similarly to the approach for scaling illustrated in Fig. 18).

Fig. 19

Rotation of virtual objects by grabbing them with two fingers and turning them (left) or by selecting them and making related circular movements with one finger (right)—notice that the red cube represents a virtual object that was only added here for illustration purposes; in the actual implementation it is only visible on the phone’s touch screen
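A simplified sketch of the two-finger rotation is given below: while the object is grabbed, it is turned by the incremental change of the angle of the line between the two markers (names are illustrative; the one-finger variant works analogously on the angle of the finger’s circular movement around the selected object):

```java
// Sketch of the two-finger rotation (Fig. 19, left): while the object is
// grabbed, it is turned by the change of the angle of the line between the
// thumb and index markers.
public class TwoFingerRotation {

    private boolean rotating;
    private double lastAngle;

    public void update(VirtualObject obj, float thumbX, float thumbY,
                       float indexX, float indexY, boolean grabbed) {
        double angle = Math.atan2(indexY - thumbY, indexX - thumbX);
        if (grabbed && !rotating) {
            rotating = true;          // object just grabbed: remember the reference angle
        } else if (grabbed) {
            double delta = angle - lastAngle;
            // Keep the increment in (-pi, pi] so a wrap-around does not spin the object.
            if (delta > Math.PI) delta -= 2 * Math.PI;
            if (delta < -Math.PI) delta += 2 * Math.PI;
            obj.rotation += delta;    // apply the incremental rotation
        } else {
            rotating = false;         // fingers released: stop rotating
        }
        lastAngle = angle;
    }

    public static class VirtualObject {
        double rotation; // radians
    }
}
```

The incremental formulation also makes the limitation noted in Fig. 23 explicit: once the wrist cannot turn any further, the angle of the marker line stops changing, and the object has to be released and re-grabbed to continue.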

Visualization and lack of haptic feedback

One of the major problems we face in virtual and augmented reality applications where users interact directly with virtual objects by “touching” them with their real hands is the lack of haptic feedback. In order to deal with this issue, we implemented a visualization that clearly indicates the interpretation of a particular interaction and the related status of the object (e.g. selected or not). In addition, we introduced a tolerance distance between the bounding boxes of the markers and the object in all interaction concepts discussed above. Hence, users do not have to “touch” the object directly; it already becomes selected and reacts to gestures within a certain area around it—indicated by transparent feedback on the board, as illustrated in the snapshots in Figs. 20, 21, 22 and 23.

Fig. 20

Translation with one finger (left) and two fingers (right) as used in the evaluation

Fig. 21

Scaling via gestures as used in the evaluation

Fig. 22

Rotation with one finger as used in the evaluation—notice that the blue dice was used as a reference and the task was to rotate the red one into a similar orientation

Fig. 23

Rotation with two fingers as used in the evaluation—notice that rotating at larger angles is not possible without releasing and re-grabbing the dice again

4.2 Second evaluation: objects with real world relation

4.2.1 Setup and procedure

Participants

In the previous section we introduced two different interaction types for each of the three tasks translation, scaling, and rotation. In order to evaluate their usability, we set up an experiment with 24 participants (20 male, 4 female) aged 18 to 25 years (average 21.34 years). We purposely chose this age range because it represents one of the major target groups for entertainment-related applications on mobile phones. Studies with other target groups are part of our future work.

Tasks and levels

For each interaction type, we specified some objectives—subsequently called levels—which users had to solve (cf. Figs. 20, 21, 22 and 23). For translation, participants had to move a pawn figure from a particular place on the board to a labeled position over a small, medium, and large distance. For scaling, they had to shrink or enlarge a smaller and a larger version of the pawn to its original size, which was indicated on the board. Both levels were done twice—once in midair and once on the board. For rotation, they had to perform a long clockwise rotation and a short counterclockwise rotation on the board, as well as a long clockwise rotation and a short clockwise rotation in midair. Translation was only tested on the board because we already evaluated similar midair interactions in the first experiment (cf. Section 3). Each level was preceded by a practice task, resulting in an overall number of 32 tasks per experiment (4 translation, 2*3 scaling, 3*2 rotation, times 2 because each interaction type was tested in two versions).

Evaluation

The experiment was set up as a within-subject design, i.e. every subject evaluated each setting. Hence, the interaction design served as the independent variable. Dependent variables were the time to complete a level and the accuracy at which it was solved. In addition, we logged all sensor and tracking data for further analysis. Because improving the game play experience is one of the major aims of our approach, we also gathered subjective user feedback via a questionnaire and a concluding interview and discussion with each participant. During the tests, a neutral observer made notes about special observations, interaction problems, etc. The overall test thus started with a short introduction, was followed by the levels for each task (including a practice level for each task), and concluded with a questionnaire and subsequent interview. To avoid the influence of learning effects, the order of levels within one interaction type as well as the order of midair and on-board interactions were counterbalanced across all participants.

4.2.2 Results

Time and accuracy

The major aim of our proposed approach is to improve the interaction experience and to make it more natural by smoothly integrating interactions with real and virtual objects. How people experience an interaction depends on various factors, including subjective issues such as ease of use, engagement, and personal enjoyment. These qualitative factors can be influenced by more objective issues, such as performance in achieving a task, which in turn can be measured in different ways. Figures 24 and 25 illustrate the average time it took users to solve a task and the achieved accuracy, respectively. Average times are given in milliseconds. Since issues such as a “perfectly accurate” placement of objects in an actual game play situation are prone to subjective interpretation, accuracy is measured in two categories: a perfect solution (i.e. perfect position, size, and angle for the translation, scaling, and rotation tasks, respectively) and a solution that would be considered good enough in an actual game play situation. The latter was measured using experimentally determined thresholds (i.e. distances, sizes, and angles at which a translation, scaling, or rotation operation would be considered successfully completed by most participants). In the diagrams, blue indicates translation levels. Scaling and rotation levels are shown in green and red, respectively. For scaling and rotation, darker colors indicate interactions done on the board, whereas lighter colors represent interactions in midair. The lower part of each bar illustrates the number of “perfect” solutions.
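The classification into the two accuracy categories can be sketched as follows; the threshold values are placeholders, since the actual values were determined experimentally as described above:

```java
// Classification of a solved level into the two accuracy categories used in
// Fig. 25. The threshold values passed in are placeholders; the values used
// in the evaluation were determined experimentally.
public final class AccuracyRating {

    public enum Result { PERFECT, GOOD_ENOUGH, FAILED }

    private AccuracyRating() {}

    public static Result rate(double error, double perfectThreshold,
                              double goodEnoughThreshold) {
        // "error" is the positional distance, size difference, or angular
        // difference for translation, scaling, and rotation levels, respectively.
        if (error <= perfectThreshold) {
            return Result.PERFECT;
        }
        return error <= goodEnoughThreshold ? Result.GOOD_ENOUGH : Result.FAILED;
    }
}
```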

Fig. 24

Average time to solve the different levels (in milliseconds)

Fig. 25

Accuracy per level (measured in number of correct solutions)

Translation task

For translation (blue bars), the first three entries in both figures represent the interaction with two fingers; the latter three show the results for pushing an object with one finger. Although the bar charts show clear differences in time between the levels, statistical significance could not be proven. The same holds for the accuracy measures. In general, both operations—one- and two-finger-based translation—worked equally well. The comparable performance indicators are also reflected in the subjective feedback from the users. In the final questionnaire, translation using two fingers was rated slightly higher than one-finger-based translation, with an average rating of 7.27 vs. 7.02 on a scale from 0 (worst) to 10 (best). Considering the participants’ comments in the subsequent interview, there also seems to be a slight preference towards using two fingers, as it was often noted as being more natural and intuitive. Although this was not reflected in worse performance values, some users commented that pushing the object with one finger makes exact placement a little harder because you have to push it from different sides if you overshoot the target.

Scaling task

The first four green bars in the figures represent the gesture-based scaling results, the latter ones the interaction via the touch screen. Here, the two interaction concepts show a significant difference in the average time to solve the tasks, for both midair and on-board interaction (t-test, p < 0.05). Within one interaction type, average times for midair and on-board operations are almost the same, and no statistically significant difference could be proven. However, considering accuracy, there is a significant difference between midair and on-board tasks within each of the two interaction concepts (t-test, p < 0.05). Especially for shrinking an object on the board, touch screen interaction seems more useful, achieving a correct result in 87.5% of the cases versus a success rate of only 66.7% for gesture-based interaction. The slight differences in favor of the touch screen approach are also reflected in the users’ ratings, with touch screen interaction scoring an average of 7.15 and 7.17 for on-board and midair operations, respectively, and gesture-based interaction scoring 6.65 and 6.67 (again on a scale from 0 to 10). However, as we anticipated (cf. Fig. 18), several users complained during the interviews that it was awkward to perform the gestures with one hand and click on the touch screen while holding the device with the other hand, and that this put an additional cognitive load on them. The slightly lower rating for gesture-based scaling was mostly due to the impression that it was harder to control and that users had to remember certain interaction patterns.

Rotation task

In case of the rotation task, the first four red bars in the diagrams show the results for interaction with one finger, the latter ones for the two-finger interaction. A statistically significant difference was observed for the average solution time between these two approaches (t-test, p < 0.05), whereas the midair and on-board interactions within each of the two concepts only showed slight differences. Also, no statistical significance could be proven for the accuracy at which the different levels were solved, despite a higher performance for one of the midair levels using two fingers. Considering both the ratings in the questionnaire and the comments by the users, rotation using two fingers was not considered very useful. Whereas rotation with one finger achieved the highest marks of all interaction concepts—7.52 for on-board and 7.40 for midair interaction—using two fingers was rated extremely low—5.58 for on-board and 5.33 for midair interaction. The major reason for this low rating and the related negative comments is that rotation over larger angles forces people to release and re-grab the object due to the limited rotation range of the human wrist. Consequently, the participants preferred the one-finger-based interaction although it is the less natural of the two. Nevertheless, they described it as intuitive and easy to understand and use.

Discussion and interview

In order to start an open discussion and informal interview, users were asked three open questions at the end, related to their overall impression, engagement, and the potential of finger-based gestures for usage in an actual AR board game, respectively. Overall, the feedback was very positive, with only two participants mentioning that they would in general prefer touch screen based interaction. One of them was generally not convinced of the practicability of this kind of interaction, whereas the other still described it as “usable and original”. On the other hand, five participants explicitly noted that they would consider it annoying to hold the device in such a way and operate the touch screen at the same time. Other negative remarks were mainly related to the rotation with two fingers due to the problem already described above. Remarks about the other interaction tasks and concepts were very positive, especially in relation to game play experience, with words such as “cool” and “fun” being frequently used by the participants to describe their overall impression. Three participants also made positive comments about the visual feedback, noting that it helped them to solve the tasks. In general, we were surprised by how quickly all participants were able to handle the interaction, especially considering their lack of experience with this kind of input. Nine users noted that the concepts need some getting used to, but all agreed that only a little practice is needed before they can be used in a beneficial way.

Real and virtual world integration

Many people explicitly noted that the integration of real and virtual world was done very nicely, as reflected by one comment noting that “the mix between AR and the natural operation feels convincing and emotionally good.” In order to further explore how well the interactions really integrate into natural game play on regular board games—one of the major goals of our research—we analyzed the tracking data to gain more insight into how users actually moved their hand and fingers while holding the phone. Figure 26 illustrates the average size of the markers during the scaling and rotation operations (measured in the number of recognized pixels). Green bars indicate the number of recognized pixels for the green marker on the user’s thumb, red bars those for the red marker on the index finger. These values are an indication of how far away the users held their hand from the camera, with lower values corresponding to larger distances. A statistically significant difference (t-test, p < 0.05) can be observed between the operations in midair, illustrated by darker colors, and the ones done on the board. This verifies that for on-board operations people actually moved their hand to the position on the board where the virtual object was located, although this was not necessary because hand gestures were only tracked in two dimensions. Hence, using finger-based gestures in such a setting seems to overcome one of the major problems identified in our first study, i.e. that people tend to hold their fingers too close to the camera, resulting in lower tracking performance and noisier input. It also confirms the subjective comments from the users who explicitly noted the naturalness and seamless integration of virtual and real world interaction.

Fig. 26

Average number of visible pixels per level for each marker

5 Conclusion and future work

In our first experiment, where virtual objects were floating in midair, gesture-based interaction on mobile phones using finger tracking achieved a high level of user engagement and positive feedback in terms of fun and entertainment. However, the concept failed performance-wise, making it unfeasible for concrete applications. Our second experiment showed that in a different setting, i.e. augmented reality board games, we can not only maintain the high level of engagement and entertainment achieved previously, but also reach a performance that makes this kind of interaction useful in actual game play. Most importantly, both quantitative and qualitative data confirmed that our interaction concept integrates virtual and real world interactions in a natural and intuitive way. However, when looking at the concrete implementations of the concept for the different tasks, it becomes clear that simulating natural interactions as closely as possible does not always result in the best interaction experience. Whereas the more natural approach of using two fingers for translation was generally preferred, in the other two cases the less natural variant was rated higher. For scaling, users had a slight preference towards using the touch screen to stop the operation instead of a different set of finger gestures. For rotation, most users disliked the operation with two fingers and preferred the less natural, but more flexible and easier to perform, one-finger-based interaction.

Reviewing these results with respect to the motivating example from the very beginning of this article (a fantasy game with virtual characters such as flying dragons, hidden doors, etc.), our evaluation shows that interaction based on finger tracking can indeed be a valuable alternative for controlling such virtual characters. However, certain restrictions apply. For example, virtual objects should not float freely in the air but should have some relation to the physical environment—such as virtual characters in a board game, which are usually on the board or floating just a bit above it. Observations from our first experiment suggest that other approaches might outperform finger-based interaction with respect to quality measures such as accuracy and time to solve a task. However, the positive user feedback in both tests suggests additional value in areas where fun and an engaging experience are more important than pure performance. Potential areas are not necessarily restricted to gaming, but include, for example, tourist attractions or e-learning applications in museums. Instead of just using mobile AR to passively provide people with additional information—as often done by today’s applications—users would be able to actively explore these virtual enhancements of the real world.

Our work provides a first step in this direction, but further research is needed to achieve this vision. One of the most important issues is how to switch between different types of operations. In order to verify the general feasibility of the concept, we have restricted ourselves to canonical interactions so far. How to best combine them, for example to enable users to seamlessly rotate and scale a virtual object, is still an open question. We did some initial tests where the system automatically distinguishes between different gestures for various tasks. First results are promising but also indicate that users might easily get confused and face a cognitive overload. Alternative approaches, such as back-of-device interaction as proposed by [3], could be a good option for task selection. When integrating several gestures into a more complex application, it might also be advisable not to rely purely on finger tracking, but to complement it with traditional interaction styles, such as touch screen interaction. Hence, further studies are needed to identify which interaction mode is most suitable for a particular task and in which particular context. In relation to this, it is also important to investigate what kinds of gestures are natural and intuitive for certain tasks, similar to [15] and [47], who evaluated this in the context of non-mobile devices. These studies should also consider social aspects. For example, [6] reports that some users mentioned in their interviews that such an interaction might be perceived as “strange” or “weird”. A recent study also showed “that location and audience had a significant impact on a user’s willingness to perform gestures” [32]. Our experiments did not consider these issues because they were done in a lab setting. Nevertheless, we got the impression that some people also felt “socially” more comfortable in the second test—where they could sit at a table—than in the first one—where they were standing and had to make gestures in midair, also indicating the relevance of the user’s global context. Marker-less hand and finger tracking was purposely excluded from this work because of its focus on interaction concepts. However, implementing reliable real-time marker-less tracking on a mobile phone and evaluating the influence of the most likely noisier input on the interaction experience is an important part of our current and future work.