In this experiment, we aim to evaluate our system in a more realistic scenario and to investigate whether users can achieve effective mutual interaction. The study exhibited symmetric collaboration, in which participants worked more equally than in the first experiment. Inspired by our previous research [21], we designed a “go shopping together” task that simulates the situation in which two users walk side by side to a shop to purchase something. This physical task is performed collaboratively and requires the users to exchange gesture cues and share their viewpoints to finish the job while on the move, making it suitable for comprehensive testing of the system’s performance.
Task and procedures
The go shopping together experiment was performed in a stationery store, a larger space than the workspace in the first experiment. The workspace contained various types of products with diverse appearances. Before the start of the experiment, we explained the use of our system and the task to each participant. Each participant was given 10 min to practice. The task was to look for a product that could interest both users as a small gift (such as a ceramic craft or a plastic decoration).
During the experiment, verbal communication was supported via an IP phone call over the Internet, and the participants were free to discuss with each other. The walking user walked around and communicated with the indoor user, and the latter could request the former to move or to perform operations such as picking up and holding objects. The task was open-ended; the only requirement was that the participants reach an agreement on a product to select. After the pilot test, we observed that the completion time was primarily influenced by personal preference. Therefore, in this part of the experiment, we did not enforce any time limit, and the task continued until the participants found a satisfying object.
At the end of the experiment, participants were asked to fill out a questionnaire, followed by post-task interviews. We were primarily interested in user feedback about the operation, the language and gestures used in the task, potential application cases, and possible opportunities for system modification.
Observation and feedback
In this section, based on the experimental results and our observations, we describe how users achieved a “go together” feeling from different aspects, together with general user feedback on our system. The discussion comprises the following five parts:
Aspect 1 Look together: independent viewpoint The results of Q1 indicated that both users could make use of independent viewing. Users enjoyed a certain degree of freedom, which relaxed the viewpoint restrictions of traditional communication systems. When using our system, the indoor user observed the shared remote scenery and looked around independently using the free viewpoint. Our system enhanced the indoor user’s feeling of immersion because of this ability to act independently. As one participant said: “I could look around at will without asking my partner to change the viewpoint, which was convenient”. Another reason for users’ preference and confidence was the ability to view the entire scene from the independent viewpoint. One subject said “I saw the entire environment just like I was really there”. For the walking user, without having to pay additional attention to controlling the indoor user’s viewpoint (which is always required in a traditional video call), he/she could look around independently and enjoy the experience: “…I could spend time looking around independently. It also increased the efficiency.”
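As a concrete illustration of how such a free viewpoint can be realized, the sketch below samples a perspective view from an equirectangular 360° frame according to the indoor user’s head orientation. This is a minimal reconstruction under our own assumptions (NumPy, an equirectangular panorama format, and hypothetical function and parameter names), not necessarily the system’s actual implementation.

```python
import numpy as np

def equirect_to_view(pano, yaw, pitch, fov=90.0, out_w=640, out_h=480):
    """Sample a perspective view from an equirectangular panorama frame.

    pano:       H x W x 3 image covering 360 x 180 degrees.
    yaw, pitch: the viewer's head orientation in degrees.
    """
    h, w = pano.shape[:2]
    f = 0.5 * out_w / np.tan(np.radians(fov) / 2.0)  # focal length, pixels

    # Per-pixel ray directions in camera coordinates (z is forward).
    xs = np.arange(out_w) - out_w / 2.0
    ys = np.arange(out_h) - out_h / 2.0
    xv, yv = np.meshgrid(xs, ys)
    dirs = np.stack([xv, yv, np.full_like(xv, f)], axis=-1)
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)

    # Rotate the rays by head pitch (about x), then yaw (about y).
    p, t = np.radians(pitch), np.radians(yaw)
    rx = np.array([[1, 0, 0],
                   [0, np.cos(p), -np.sin(p)],
                   [0, np.sin(p), np.cos(p)]])
    ry = np.array([[np.cos(t), 0, np.sin(t)],
                   [0, 1, 0],
                   [-np.sin(t), 0, np.cos(t)]])
    dirs = dirs @ (ry @ rx).T

    # Longitude/latitude of each ray, mapped to panorama pixel coordinates.
    lon = np.arctan2(dirs[..., 0], dirs[..., 2])       # [-pi, pi]
    lat = np.arcsin(np.clip(dirs[..., 1], -1.0, 1.0))  # [-pi/2, pi/2]
    u = ((lon / (2 * np.pi) + 0.5) * (w - 1)).astype(int)
    v = ((lat / np.pi + 0.5) * (h - 1)).astype(int)
    return pano[v, u]
```

Calling this per frame with the head tracker’s latest yaw/pitch yields the independent view without any request to the remote partner.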
Aspect 2 Look together: viewpoint sharing From the results of Q2, we confirmed that although the walking user and the indoor user viewed the scene independently, they could still easily share focus awareness during communication. This enhanced the feeling of co-presence and improved communication efficiency. The indoor user obtained the walking user’s viewing direction from the panoramic scene, while his/her own head motion was transmitted to the latter through the head avatar. Sharing mutual attention served as a nonverbal cue that helped users understand the messages being relayed, quickly attend to the same scene, and experience joint attention, as one participant commented: “When my partner found something interesting, I could quickly find the same thing after a quick confirmation of his viewing direction.” Users also felt a stronger sense of companionship by being aware of each other’s actions: “I liked seeing the head avatar even though sometimes we did not talk or discuss. Knowing my partner’s situation made me feel accompanied”.
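Mechanically, this mutual awareness reduces to each side streaming its head orientation to the other, where it drives the head avatar (or the viewing-direction indicator on the panoramic scene). The sender sketch below is illustrative only: the message fields, update rate, transport, peer address, and the read_orientation callback are all assumptions rather than our system’s actual protocol.

```python
import json
import socket
import time
from dataclasses import dataclass, asdict

@dataclass
class HeadPose:
    yaw: float        # horizontal head rotation, degrees
    pitch: float      # vertical head rotation, degrees
    timestamp: float  # sender clock, for ordering and latency handling

def stream_head_pose(read_orientation, peer=("192.0.2.10", 9000), rate_hz=30):
    """Send the local head orientation to the remote peer at a fixed rate.

    read_orientation: callable returning (yaw, pitch), e.g. from an HMD IMU.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    while True:
        yaw, pitch = read_orientation()
        msg = HeadPose(yaw, pitch, time.time())
        sock.sendto(json.dumps(asdict(msg)).encode(), peer)
        time.sleep(1.0 / rate_hz)
```

On the receiving side, the most recent yaw/pitch would simply be applied to the rendered head avatar each frame.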
Aspect 3 Gesture together In the results of Q3 and Q4, both users gave positive scores. This indicates that users could perform gestures to convey their intentions and achieve smooth mutual communication. During the communication, users employed mutual gesture interaction as a nonverbal body cue. One reason for the perceived feeling of being together was the ability to gesture naturally from a side-by-side perspective in the same world, as in a real collocated situation. As one indoor participant said, “…seeing her making a gesture was vivid and made me feel like she (walking user) was next to me”, and the partner said “I felt the appearance of hands and gestures were intuitive and convincing.” Another reason is that, in some cases, gestures provided more accurate instructions and reduced dependence on verbal description, which improved efficiency. Users often used hand gestures to indicate their interests and to communicate with the other user. As one said, “I felt he knew where I was pointing at, so I just said ‘this’ or ‘that’ to identify an object or direction.”, while another commented “…it (pointing assistance) was very fluid and guided me nicely”.
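One plausible way to realize the pointing assistance praised in these quotes is to cast a ray from the tracked hand into the shared scene and render a cursor for both users where it lands. The sketch below illustrates the idea with a simple ray-sphere test against hypothetical tracked objects; it is an assumption-based illustration, not necessarily how our system resolves pointing targets.

```python
import numpy as np

def pick_target(ray_origin, ray_dir, objects):
    """Return the name of the object whose bounding sphere the pointing
    ray hits first, or None if the ray misses everything.

    ray_origin: (3,) position of the tracked hand.
    ray_dir:    (3,) direction from the hand through the index fingertip.
    objects:    list of (name, center (3,), radius) tuples -- a stand-in
                for whatever scene representation is actually tracked.
    """
    ray_dir = ray_dir / np.linalg.norm(ray_dir)
    best, best_t = None, np.inf
    for name, center, radius in objects:
        oc = ray_origin - center
        b = np.dot(oc, ray_dir)
        c = np.dot(oc, oc) - radius ** 2
        disc = b * b - c
        if disc < 0:
            continue  # ray misses this sphere entirely
        t = -b - np.sqrt(disc)  # nearest intersection along the ray
        if 0 < t < best_t:
            best, best_t = name, t
    return best
```

The nearest hit distance also gives the 3D point at which a shared cursor can be drawn, so both users see the same target.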
We also observed that information transmission from the walking user to the indoor user was rated slightly higher than transmission in the opposite direction. After post-task interviews with the participants, we determined that this difference was likely because the walking user could both gesture and actually touch an object, producing more visible feedback, such as depressing an object’s surface with the fingers.
Aspect 4 Gesture and talk together: gesture-speech cooperative interaction From the gesture rate results, we found that mutual gesture interaction accounted for a noticeable proportion of the overall communication time (a sketch of how such a rate can be computed from annotated logs follows the feature list below). During the communication, we noticed that users usually tended to gesture while speaking, especially when the context was related to the environment or its ambiance. We define this as a Gesture-Speech Cooperative Interaction. To characterize the pattern of gestures used with speech, we recorded the different gestures made with certain types of phrases (see Table 1). We found that such interactions exhibit the following features:
Table 1
Types of interactions when gestures are used with speech: Gesture-Speech Cooperative Interaction

User | Interaction type | Gesture | Speech example |
Both | Direction guidance | Showing with arm | “This way” |
Both | Directing the partner’s attention | Pointing with finger | “Look at that” |
Both | Establishing visual annotation | Counting fingers | “One, two, three…” |
Indoor | Instructions related to an object | Pointing with finger (Pointing Cue used) | “Pick up this cup” |
Indoor | Instructions related to adjusting an object | Showing the palm and turning it over | “Turn to the other side” |
Indoor | Instructions related to size | Showing approximate size with two hands | “It’s this wide” |
Indoor | Confirmation to the walking user | Thumbs-up/ring gesture | “Yes, it’s good” |
Walking | Query to the indoor user | Picking up the object | “Is this correct?” |
Walking | Instructions related to an object | Holding an object and turning it to different sides | “Look at this side” |
1. Gesture and speech worked cooperatively to realize the full interaction.
2. Hand gestures were used as a visible cue that carried the main directive information.
3. Speech was used to draw the recipient’s attention and to indicate the start/stop of gestures.
4. Although users did not always keep talking each time a hand gesture was made, the beginning of a gesture was usually accompanied by speech.
5. Speech was used for supplementary explanation of gestures. Consider the interaction type in Table 1, line 4, as an example: users instructed recipients to select an object with a hand gesture (pointing), using speech to explain the following action (pick up). Note that the deictic term (‘this’) in the speech explicitly requires the recipient to find something beyond the conversation itself, namely the direction the speaker indicates; simply listening would be insufficient.
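As noted before the feature list, the gesture rate can be computed from annotated session logs by measuring how much of a session was occupied by gestures and how often a gesture’s onset co-occurred with speech. The helper below is a hypothetical sketch assuming annotations as (start, end) intervals in seconds; it is not the exact analysis script we used.

```python
def gesture_rate(gesture_spans, speech_spans, session_length):
    """Gesture rate and gesture-speech co-occurrence from annotated logs.

    gesture_spans, speech_spans: lists of (start, end) times in seconds.
    session_length: total session duration in seconds.
    Returns (fraction of the session spent gesturing,
             fraction of gestures whose onset overlaps speech).
    """
    if not gesture_spans:
        return 0.0, 0.0
    gesture_time = sum(end - start for start, end in gesture_spans)

    def during_speech(t):
        return any(s <= t <= e for s, e in speech_spans)

    with_speech = sum(1 for start, _ in gesture_spans if during_speech(start))
    return gesture_time / session_length, with_speech / len(gesture_spans)
```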
General feedback In this experiment, all pairs of participants completed the task successfully and enjoyed the remote communication experience. From the results of Q5 and Q6, we confirmed that both users experienced a feeling of co-presence. Users were aware of their partners while the task was being executed and felt that they were not alone or isolated, which kept them in close connection. From the results of Q7, we confirmed that users generally shared common perceptions and experienced a “go together” feeling when using our system. This was also supported by participants’ comments, including: “I really enjoyed this collaboration. I was able to feel going together with my partner” and “We could make decisions together just like we were in the same place.” Most participants experienced a co-presence feeling using our system—“I (walking user) found the presence of the head avatar and the hand gestures of the indoor partner to be quite helpful and intuitive, which gave a feeling that my partner was right here with me”, and “I (indoor user) could look around and discuss with my partner, feeling as though I was there going together with my partner”.