Published in: Journal on Multimodal User Interfaces 1/2021

Open Access 13.07.2020 | Original Paper

Exploring crossmodal perceptual enhancement and integration in a sequence-reproducing task with cognitive priming

Authors: Feng Feng, Puhong Li, Tony Stockman




Abstract

Crossmodal correspondence, a perceptual phenomenon which has been extensively studied in cognitive science, has been shown to play a critical role in people’s information processing performance. However, the evidence has been collected mostly with strictly controlled stimuli displayed in noise-free environments. In real-world interaction scenarios, background noise may blur crossmodal effects that designers intend to leverage. More seriously, it may induce additional crossmodal effects, which can be mutually exclusive to the intended one, leading to unexpected distractions from the task at hand. In this paper, we report two experiments designed to tackle these problems with cognitive priming techniques. The first experiment examined how to enhance the perception of specific crossmodal stimuli, namely pitch–brightness and pitch–elevation stimuli. The second experiment investigated how people perceive and respond to crossmodal stimuli that were mutually exclusive. Results showed that, first, people’s crossmodal perception was affected by cognitive priming, though the effect varied according to the combination of crossmodal stimuli and the type of priming material. Second, when two crossmodal stimuli were mutually exclusive, priming on only the dominant one (pitch–elevation) led to improved performance. These results can help inform the future design of multisensory systems by presenting details of how to enhance crossmodal information with cognitive priming.
Notes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

Understanding how people integrate multimodal information to optimise sensory-motor activity has implications for multisensory system design. The multimodal perceptual phenomenon of crossmodal correspondence (CC) has long been investigated in the cognitive science research field. It refers to the associated perceptual relationship between two or more sensory modalities, from which information perceived through visual, auditory and other sensory channels is combined to form a stable output [41]. For example, an upward visual stimulus perceptually corresponds with a high-pitched sound [27, 39]. Multisensory information presented with this regularity enables better perceptual accuracy than information displayed with reversed polarity, i.e. an incongruent correspondence such as, for instance, a downward visual stimulus paired with a high-pitched sound.
Despite mounting evidence showing that congruent CCs have a positive effect on sensory-motor performance [8, 13, 30], implementation efforts within human–computer interaction (HCI) face certain limitations for the following reasons. First, an environment controlled to maximise the internal validity of an experiment differs from the real-world scenarios where interaction activities take place. In the latter, streams of interactive information unavoidably mingle with physical and attentional background noise, which may act as visual, auditory or multisensory distractors. As a result, the CCs observed in strictly controlled, noise-free experiments may not function well when applied in a real-world scenario. Second, many cases of crossmodal interaction, such as spatial localisation [19], graph sonification [3], and motor skill learning [17], involve consistent sensory-motor responses with continuous, graded crossmodal feedback. We lack empirical knowledge of how such crossmodal stimuli influence interaction that requires consistent sensory-motor engagement.
Third, real-world interaction requires not only bottom-up perception of crossmodal feedback, but also top-down perception influenced by subjective knowledge, experience and the specific interaction situation [2, 22]. For example, in a target-searching task, the crossmodal association between colour and Euclidean distance, i.e. an increase in R value (RGB scheme) associated with a decrease in the distance to the target, was occasionally perceived, on a semantic level, as an association between a colour signifier and motion correctness, i.e. red indicated wrong moves and was therefore associated with moving away from the target [16]. These two crossmodal associations caused a certain level of confusion for some participants, due to their mutually exclusive congruency.
To our knowledge, there has been little research into presenting crossmodal information that can be unequivocally perceived by participants without requiring extra cognitive effort. More specifically, we lack an understanding of whether the perception of specific crossmodal information can be enhanced or inhibited subliminally for an interaction task. Furthermore, when information contains mutually exclusive CCs (i.e. the perception of the congruency of one CC excludes the perception of the congruency of the other), we lack understanding of how people integrate such information.
Interestingly, the role of subliminal information has drawn increased attention in the HCI community, especially in the fields of pervasive technology and mindless computing [1, 4, 37]. Cognitive priming, used for conveying subliminal information, refers to a technique which enhances cognitive performance without overloading cognitive channels [4, 12, 26]. It has shown an enhancement effect on various cognitive functions, including perception [4, 37], judgement [21], and affective state [26], over the course of interaction. Given the effectiveness of this technique, in this paper we aim to investigate two research questions using cognitive priming:
RQ1: Whether people’s crossmodal perception can be enhanced by cognitive priming, and if it can be, to what extent it modulates small and fast sensory-motor responses in an interactive task?
RQ2: How do people integrate crossmodal information in which two CCs are mutually exclusive? Furthermore, how, if at all, does the integration of such information change in the presence or absence of cognitive priming?
We addressed these two questions by using cognitive priming as a manipulation factor [21, 25], with the purpose of engaging people with two CC pairs respectively, and investigating the influence of this upon subsequent interaction behaviour. The detailed research questions and the investigation scope are presented in Sect. 3. Section 4 presents our investigation approach and experimental design. This is followed by the results of the experiments and a discussion in Sects. 5 and 6 respectively.
The contributions of the present study are: first, a systematic investigation of two CCs, pitch–brightness and pitch–elevation, with graded stimuli in interactive tasks rather than two polarised stimuli for binary choices; second, an evaluation of the role of cognitive priming in crossmodal perceptual enhancement; third, an investigation of two types of priming material and their effect on people’s sensory-motor performance; and last, an exploration of how people integrate mutually exclusive CCs under cognitive priming.

2.1 A view from cognition research and behavioural studies

Our perception of certain physical features or attributes is naturally associated across the senses: for example, the auditory perception of an increase or decrease in pitch is associated with the visual or haptic perception of rising or falling in vertical position [14, 35, 42]. These kinds of associations can be found between other cross-sensory features as well, including the sound–brightness association, the sound–quantity association, etc. [14, 18, 39, 46]. As such, the term crossmodal correspondence (CC) in this paper refers to ‘a compatibility effect between attributes or dimensions of a stimulus (i.e., an object or event) in different sensory modalities (be they redundant or not)’ [41]. Accumulating behavioural studies show that when the values of these associated attributes are combined congruently, for example increased pitch with increased vertical elevation, the accuracy and efficiency of people’s sensory-motor responses improve (for a detailed review, see [41]). However, if the attributes are combined incongruently (i.e. the direction of change of one of the involved attributes is reversed, for instance when an increased pitch is associated with a downward visual stimulus), sensory-motor responses are disturbed.
Some studies have suggested that many CCs arise from repeated exposure to the natural environment, through which people learn the perceptual regularities of associated crossmodal physical features [41, 43]. Parise et al. empirically investigated the statistical mapping between sound frequency and the perceived elevation of the sound source [35]. With a wearable two-directional recording setup, a large sample of natural sound recordings was collected by participants who moved freely indoors and outdoors. The analysis of the recordings revealed a consistent mapping between sound frequency and the physical elevation of the sound source. Furthermore, when participants were asked to perform a sound localisation task, the researchers observed the same frequency–elevation correspondence. Even though the correlation between the statistical environmental mapping and perceptual judgement performance cannot yet be fully explained, this investigation provides experimental grounds for an ecological link between CCs and daily experience [2, 43].
Recent research also suggests that the perception of crossmodal information can be influenced by task-irrelevant contextual features [9, 47]. Walker and Walker conducted an experiment to address the relative perception of crossmodal feature values. In their experiment, participants were presented with six circles of different luminance. Three of the circles were brighter than the background colour, and the other three appeared darker. Participants were asked to classify whether each circle was brighter or darker than the background, and to confirm their answer by pressing one of two differently sized keys, though participants were neither visually nor haptically aware of the difference in size. Results showed that participants classified brighter circles more quickly with the smaller key in hand, and darker circles more quickly with the larger key in hand. This finding suggests that CCs are not always absolute, and that sensory-motor responses based on them can be influenced subliminally by contextual stimuli, in this case the haptic perception of the size of the keys [47].
Moreover, the relativity of crossmodal perception points to further investigation, with a focus on intermediate crossmodal values that sit between pairs of polarised stimuli [9, 42]. Without fully understanding the sequential perception of intermediate values, i.e. graded crossmodal stimuli, the implementation of CCs in real-world scenarios remains limited. After all, a variety of crossmodal interaction cases involve consistent perception-action loops, either for spatial localization [19], graph sonification [3], or motor skill learning [17].

2.2 A view from HCI design implementation

From an HCI perspective, we are aware of two studies that have directly investigated the design possibilities with graded crossmodal stimuli. Metatla et al. [29] evaluated interaction performance with different crossmodal congruency levels between shape, size and elevation. They adopted a game mechanism, using a tablet to display a sequence of visual, auditory, or visual–auditory stimuli. Participants were asked to tap in the perceived sequence as quickly and accurately as possible. Results showed that the visual condition produced better performance than both the auditory and visual–auditory conditions. However, in the bimodal condition where two CCs co-existed, participants tended to rely on pitch–elevation rather than visual stimuli as the primary CC. This result implies that different CCs may have a different level of influence on interactive performance.
In another experiment, Feng and Stockman tried to replicate the crossmodal effect through augmented physical features of a tangible object on an interactive tabletop [16]. Participants were required to discover a hidden target by moving the object around the table, observing the concurrent crossmodal feedback to determine the distance to the target. Visual, auditory and haptic modalities were combined into unimodal, bimodal and trimodal feedback. Results revealed that more accurate movement and more efficient corrections were achieved with bimodal and trimodal feedback. However, in the crossmodal condition that implemented a mapping between colour and Euclidean distance, i.e. an increase in R value (RGB scheme) associated with a decrease in the distance to the target, one-sixth of the participants appeared to use the opposite mapping polarity. This was due to their familiarity with a different crossmodal association, between a colour signifier and motion correctness, i.e. red indicated wrong moves and was therefore associated with moving away from the target. These two associations happened to have mutually exclusive mapping polarities. This result suggests that people’s sensory-motor reactions may not be determined solely by crossmodal stimuli in a bottom-up manner, but may also be influenced by previous experience and current interaction goals [2, 22].
From a human-centred design perspective, researchers have shown that CCs can be implicitly activated in a natural interactive scenario. Bakker et al. applied a human-centred iterative design approach during several music education workshops, where groups of children were encouraged to use body movements to express sound features. Several auditory-haptic crossmodal associations were identified from the self-generated movements, such as increased volume associated with increased speed of movement, as well as with rising postures [5]. When these identified CCs were implemented on hand-held musical interfaces and applied in a subsequent music learning session, children showed improved music reproduction performance. However, there is not enough empirical evidence to determine whether the improved performance was due to self-activated crossmodal mappings or to increased familiarity produced by the series of iterative design activities.

2.3 Subliminal cueing and cognitive priming in HCI

Priming, in the fields of cognitive and social psychology, generally refers to an experimental technique whereby exposure to one stimulus influences the response to a subsequent stimulus, without conscious guidance or intention [6]. It covers different priming mechanisms: conceptual priming activates an internal mental representation in one context in such a way that the participant does not realise the relation between that activation and its later influence on an unrelated task [6, 15]; perceptual priming shares the same activation mechanism, but the contextual features of the priming have direct implications for later tasks [6, 37].
In the field of HCI, researchers have applied different priming techniques to facilitate interaction [1, 25] and enhance cognitive performance [12, 21, 26]. Two major categories have been investigated empirically. The technique of using subliminal stimuli to trigger fast, automatic responses is referred to as subliminal cueing; it adopts strict experimental control over cueing time and content. The cueing time is less than 50 ms, within which perception is assumed to be subliminal, and the cue is commonly an external stimulus presented as a visual image or geometric shape that exposes information which will appear during subsequent tasks. This technique was inherited from the masking paradigm in experimental psychology, for the purpose of investigating the effect of priming on selection behaviour [4, 37]; subliminal cueing is therefore also known as subliminal visual masking. For example, Aranyi et al. conducted an experiment to investigate the effect of masked cues on subsequent item selection behaviour in a virtual environment. Results showed that masked cues did indeed influence participants’ selections subliminally, though the effect was short-lived (within one second) [4]. Recently, a similar masking paradigm with three types of visual cues was applied in a mobile application; evidence showed that priming stimuli presented within a 17 ms time window were not fully subliminal [37]. In summary, investigations of cueing effects using masking paradigms with different time windows have led to inconsistent observations, and the reliability of the results needs further testing. This cueing technique is therefore not suitable for tackling the present research question: understanding whether crossmodal perception can be enhanced subliminally for a specific interaction task.
The technique of cognitive priming has been applied with priming material in the form of visual images, videos, or textual stories [21, 25, 26]. The material does not always have a direct relationship with the interaction tasks, and it can be presented either before or during interaction without restrictions on time windows. The purpose of this technique is to enhance cognitive function or induce affective states for specific interaction tasks. Harrison et al. employed text-based stories as priming material to investigate the influence on subsequent visual judgment performance with different types of charts. Results showed that positive priming improved participants’ visual judgment accuracy [21]. In another case, Lewis et al. applied affective computational priming in the form of background pictures in a creativity-support design tool [26]. With the background priming, participants showed an improvement in the quality, but not the quantity, of their designs. The effect of cognitive priming was observed even after the priming had been removed in later design tasks.
In conclusion, in the existing HCI literature, the reliability of subliminal cueing needs further testing. In comparison, cognitive priming has been used to invoke previous experience and mental states for goal-oriented tasks. Following this line of investigation, in the present study we used a cognitive priming technique to invoke crossmodal experience and measured its effect on subsequent sensory-motor responses.
Table 1
Experimental design for experiment 1
https://static-content.springer.com/image/art%3A10.1007%2Fs12193-020-00326-y/MediaObjects/12193_2020_326_Tab1_HTML.png
Colour code applied in this paper: green indicates perceptual priming groups and red indicates conceptual priming groups

3 Research aim and scope

Previous studies have investigated crossmodal effects by manipulating physical values varied either congruently or incongruently. In most cases, however, CCs have been tested with polarised values, such as high versus low pitch or big versus small size [9, 41]. Crossmodal stimuli that switch between two polarised values are rarely encountered in everyday interactive environments. More often, the questions are, first, whether and how people attend to available crossmodal information with graded feature values; and second, how people integrate several crossmodal information streams that are mutually exclusive. Specifically, the present paper seeks to address the following research questions:
RQ1: whether people’s crossmodal perception can be enhanced by cognitive priming, and if it can be, to what extent it modulates small and fast sensory-motor responses in an interactive task? Experiment 1 was designed to tackle this question by using different types of priming material as a manipulation factor. We postulate that if crossmodal perception cannot be enhanced by subliminal priming, we would observe a similar level of performance across conditions. However, if the crossmodal perception indeed can be enhanced, there would be a perceptual reinforcement of a particular CC. Based on previous empirical findings [21, 25], we hypothesise that an improved performance should be observed in conditions that include cognitive priming.
RQ2: How do people integrate crossmodal information in which two CCs are mutually exclusive? Furthermore, how, if at all, does the integration of such information change in the presence or absence of cognitive priming? Experiment 2 was designed to tackle this question by employing cognitive priming. We could deduce the integration process by observing whether the perception of, and corresponding task performance with, primed crossmodal information is enhanced or reduced. If the perception of a primed stimulus is distracted by another, unprimed crossmodal stimulus, it is likely that all the crossmodal cues have been taken into account additively [11]; thus the subsequent performance should be lowered by the distractor cue. If the perception of the primed stimulus is enhanced by the priming material, attention is likely drawn selectively to the correlated crossmodal information [11], i.e. people are less susceptible to the distractor cue, and performance in the subsequent task should be better.
There are two CCs and two types of priming material involved. Based on previous empirical findings concerning CCs, we chose two pairs of crossmodal stimuli that have been frequently referenced in the literature: the pitch–brightness and pitch–elevation crossmodal mappings [14, 35, 41]. We also introduced two types of priming as the manipulation factor. One type used the physical features of brightness, elevation and pitch to imply the CCs (perceptual priming); the other used videos in which brightness, elevation and pitch were implied by meaningful content (conceptual priming). Materials used in the conceptual priming have no direct connection with the crossmodal stimuli employed in subsequent tasks, while the perceptual priming emphasises physical stimulus values similar to those presented later. Specifically, the conceptual priming uses naturalistic sound and corresponding video clips to represent the correspondence between auditory pitch and visual brightness, as well as between auditory pitch and visual elevation [2, 35]; the perceptual priming uses musical notes with a set of congruent visual icons to implement the same CCs. Further details about the priming materials are given in the apparatus section of the paper.

4 Methods

The same design and procedure were used for both experiments. In this section, we report the general experimental protocol, followed by experiment 1. Then, in Sect. 6, we explain the change in the manipulation factor and report experiment 2.

4.1 Experimental design

A between-subjects design was used for both experiments to eliminate learning effects from the priming. There were two independent variables. The first was the type of priming: perceptual priming (P-prime) or conceptual priming (C-prime). The second was the crossmodal mapping: the pitch–brightness correspondence or the pitch–elevation correspondence. Each experiment has six conditions, with four manipulation groups and two control groups. The manipulation groups were: P-prime and C-prime on the pitch–brightness mapping (conditions 1, 2), and P-prime and C-prime on the pitch–elevation mapping (conditions 3, 4). The control groups were the pitch–brightness mapping without priming (condition 5) and the pitch–elevation mapping without priming (condition 6). The manipulation factors and experimental conditions are listed in Table 1.

4.2 Task and general procedure

The task in both experiments was to learn and reproduce crossmodal sequences. Each sample sequence was made up of two parts: an auditory melody with five pitch values corresponding to the musical notes C, D, E, G, A; and a visual counterpart with either five levels of brightness or five vertical positions displayed in five circles on the screen (Fig. 1, Experiment 1a). Detailed information about the crossmodal mappings is presented in the experimental platform section. During each trial, a randomly generated sample sequence was displayed once, with the melody matched to the concurrent visual stimuli. Participants were required to reproduce the melodic sequence by clicking the circles on the screen.
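To make the trial structure concrete, the generation of a sample sequence can be sketched as follows. This is our own minimal illustration, not the authors' software; the names `NOTES`, `LEVEL`, and `make_trial` are assumptions.

```python
import random

# The five musical notes used in the task and their matched visual levels
# (brightness or elevation index, lowest to highest).
NOTES = ["C", "D", "E", "G", "A"]
LEVEL = {note: i for i, note in enumerate(NOTES)}

def make_trial(length=5, seed=None):
    """Randomly generate a sample sequence of (note, visual level) pairs."""
    rng = random.Random(seed)
    melody = [rng.choice(NOTES) for _ in range(length)]
    return [(note, LEVEL[note]) for note in melody]
```

Because each circle jointly encodes a note and its visual level, reproducing the melody by clicking circles also reproduces the visual sequence.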
The general procedure was as follows. First, participants were given a consent form with details of the experiment to read and sign. They were then introduced to the priming session according to the group they were assigned to. Participants in the manipulation groups were asked to watch a short video or animation, while the control groups had no priming session and moved directly to the next session (Fig. 2). After priming, participants moved to a warm-up session to become familiar with the task procedure and the hardware setup. To ensure everyone received practice without overtraining, they were instructed to practise no fewer than two and no more than six times. The crossmodal stimuli used in the warm-up session differed from those in the task session in order to minimise learning effects; specifically, the warm-up session used three-note sequences, while the task session used five-note sequences. After practice, participants moved to the task session, which contained 16 trials (Fig. 2). All sessions were programmed in a single piece of software; participants moved to the following section by pressing a “next” button on the screen.
After participants finished all the trials, they were asked to complete a post-experiment questionnaire. The first section of this questionnaire collected participants’ demographic information, music training history, and whether they had any visual or auditory disorder recently. The second section collected subjective evaluation data, including their interpretation of the priming material, and the interaction strategy they used during the game. The entire experiment lasted for 7 to 13 minutes.

4.3 Apparatus

4.3.1 Conceptual priming material

Early studies claim that our experience of nature can form the basis of a CC. For example, ‘thin, small, light and airy’ things tend to be found at relatively high altitude, compared with ‘dark, heavy, gloomy’ things that tend to be observed near or on the ground [38]. More recently, the results of empirical investigations [35, 40, 41] suggest that some CCs do indeed arise from the experience of correlated physical properties in the natural world. Hence, the conceptual priming in the present study was designed around naturalistic sound and visual representations. Instead of using an image or picture as the priming material, which contains only visual information [4, 36], we used video or animation to display crossmodal information. For the pitch–brightness crossmodal mapping, we composed sound clips of birds singing and land-based carnivores roaring to represent high-pitched and low-pitched sound respectively; in the visual mode, a mute time-lapse video of night turning to day was used to represent the change in brightness (Fig. 3a). This approach builds on the sensory experience that birds are usually active in daylight and land-based carnivores are usually active at night; thus high-pitched sounds tend to be associated with brightness and low-pitched sounds with darkness. The composed videos are available to view from (Hyperlink:​ Pitch-brightness) and from (Hyperlink:​ Pitch-elevation).
For the pitch–elevation crossmodal correspondence, we used the same sound clips to represent the high and low pitch, and composed a mute video of birds flying in the sky and carnivores running on the ground to represent the vertically high or low elevation (Fig. 3b). The sensory experience that we build upon is that high pitch sound corresponds to high in the sky and low pitch sound corresponds to being on the ground [38, 41].

4.3.2 Perceptual priming material

Based on cognitive studies of crossmodal correspondence [35, 41], we used piano sample sounds to represent pitch values. For the pitch–brightness priming, the notes were presented in the order C–D–E–G–A–A–G–E–D–C, paired with a circle whose brightness changed synchronously from dark to bright and back to dark. The same sound samples in the same order were used in the priming for the pitch–elevation mapping, with a synchronized visual display of a circle moving from the bottom to the top of the same background.

4.3.3 Experimental platform

The platform for experiment 1 presented the pitch–brightness and pitch–elevation mappings separately in different conditions. Both stimuli were congruent, without interfering with each other; e.g. increasing pitch corresponded with an upward position or a brighter visual appearance (Fig. 1, Experiment 1a and 1b). The audio was presented at a constant level between 50 and 55 dB (played through a Mac Pro, tested for comfortable hearing) over WH-CH500 headphones.
As a screen-based study [28, 45], the HSL (hue–saturation–lightness) colour scheme was used for the visual representation. We kept H = 250 and S = 0, with the lightness value ranging from 12 to 100 in intervals of 22. The corresponding auditory stimuli were the notes C, D, E, G, A. Each of the five notes lasted 300 ms with 100 ms intervals in between. All elements were displayed on a black background (H = 0, S = 100, L = 0) at a screen resolution of 2560 × 1600 pixels. To avoid the potential confound of a stimulus–response compatibility effect [39] (e.g. high pitch naturally correlates with up/right and low pitch with down/left), the circles were organised in a pentagon formation rather than a linear arrangement (Fig. 1a); during each trial a randomly generated sequence was displayed, and the position of each circle within the pentagon was also randomised.
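The note-to-lightness mapping described above can be sketched as follows; this is a minimal illustration under our own naming (`NOTES`, `LIGHTNESS`, `hsl_for` are assumptions), not the authors' code.

```python
# Five lightness steps from 12 to 100 with an interval of 22: 12, 34, 56, 78, 100.
NOTES = ["C", "D", "E", "G", "A"]
LIGHTNESS = list(range(12, 101, 22))
NOTE_MS, GAP_MS = 300, 100  # each note lasts 300 ms with 100 ms gaps in between

def hsl_for(note):
    """HSL colour for a note: fixed hue and saturation, lightness encodes pitch."""
    return (250, 0, LIGHTNESS[NOTES.index(note)])
```

With this scheme, higher notes map to brighter circles on the black background, giving the congruent pitch–brightness correspondence.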

4.4 Measurements

The two dependent variables were the time intervals between inputs and the task error rate. Time intervals were measured as the length of time between adjacent clicks. The difference in time intervals between the manipulation groups and the baseline groups was used as an indicator of the priming effect on sensory-motor reaction efficiency. The error rate was calculated by dividing the number of wrongly produced notes by the total number of sample notes.
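The two measures can be sketched as follows (a minimal illustration under our own naming, not the authors' analysis code):

```python
def time_intervals(click_times_ms):
    """Lengths of time between adjacent clicks, in milliseconds."""
    return [b - a for a, b in zip(click_times_ms, click_times_ms[1:])]

def error_rate(sample_notes, reproduced_notes):
    """Number of wrongly produced notes divided by the total number of sample notes."""
    wrong = sum(s != r for s, r in zip(sample_notes, reproduced_notes))
    return wrong / len(sample_notes)
```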
In addition, the following qualitative data were included in the analysis: (1) subjective interpretations of the priming material collected from the post-experiment questionnaire, and (2) plots of overall task accuracy, calculated not by summing the number of wrongly produced notes but by counting the number of incorrectly aligned CCs during reproduction. For example, if one step in a sample sequence changed from note C to note E with the brightness going up two levels, both a reproduced move from C to E with brightness going up two levels and a move from D to E with brightness going up one level would be counted as correct, since the increased pitch corresponded with a brightened visual display. However, a reproduced step from A to E, with the brightness level going down, would be counted as incorrect, as the decreased pitch should not be aligned with the sample's increased brightness level.
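This alignment criterion, a reproduced step counts as correct when the direction of its pitch change matches the direction of the visual change in the corresponding sample step, can be sketched as follows (a hypothetical sketch; the names are ours):

```python
NOTE_ORDER = {"C": 0, "D": 1, "E": 2, "G": 3, "A": 4}

def sign(x):
    """-1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)

def step_aligned(reproduced_from, reproduced_to, sample_visual_change):
    """True when the reproduced pitch change direction matches the signed
    visual (brightness/elevation) change of the sample step."""
    pitch_change = NOTE_ORDER[reproduced_to] - NOTE_ORDER[reproduced_from]
    return sign(pitch_change) == sign(sample_visual_change)
```

For the example in the text (sample step C to E, visual change +2): a reproduced D-to-E step is aligned, while a reproduced A-to-E step is not.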

4.5 Hypotheses

Since the pitch–brightness and pitch–elevation correspondences have been repeatedly evaluated in the literature as working for most people [14, 18, 39], we assumed that most participants would be able to do the task without perceptual discrepancy on those crossmodal stimuli. Thus we hypothesized that:
H1: Following previous studies [25, 26, 32], we predicted that people's crossmodal perception can be enhanced by cognitive priming; specifically, that both the P-prime and the C-prime for the pitch–brightness and pitch–elevation mappings would support faster sensory-motor responses and better task accuracy than seen in the two baseline groups.
H2: The two types of priming will have different sensory-motor modulation effects on task performance; we did not predict the direction of the difference.
Table 2
Statistical analysis based on time intervals in experiment 1

| Mapping | F(2, 3837) | p | r | Note |
|---|---|---|---|---|
| Pitch–brightness | 76.649 | < .001 | .17 | P-prime (517.57 ms) < control group (555.04 ms); C-prime (500.70 ms) < control group |
| Pitch–elevation | 22.246 | < .001 | .10 | P-prime (578.43 ms) > control group (558.48 ms); C-prime (545.98 ms) < control group |

5 Experiment 1: study on congruent crossmodal mappings

One hundred and twenty participants (67 male, 53 female, aged 18–55 years, mean = 26.17, SD = 5.56) took part in the first experiment. All participants confirmed before the trials that they had no visual or auditory impairments (after correction). Participants were balanced across groups by age and gender. To avoid introducing cultural background as a potential confound in CC perception or interpretation, volunteers of different nationalities, professions and levels of musical training were recruited through the universities' e-mail lists and social network sites.

5.1 Results

A Kolmogorov–Smirnov test indicated that the time-interval data could be assumed to be normally distributed. We ran one-way ANOVAs to compare the P-prime, C-prime and baseline conditions for the pitch–brightness mapping, and likewise for the pitch–elevation mapping. Fisher's LSD test was used for post hoc comparisons of main effects. We used a significance level of \(\alpha \) = 0.05 for all tests.
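For reference, the one-way ANOVA F statistic used here can be computed as below. This is a plain-Python sketch of the standard formula, not the authors' analysis code (in practice a package such as SciPy's `f_oneway` would also supply the p value).

```python
def f_oneway(*groups):
    """One-way ANOVA F statistic for k independent groups
    (between-group mean square over within-group mean square)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-group sum of squares, k - 1 degrees of freedom.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    # Within-group sum of squares, n - k degrees of freedom.
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g)
                    for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Identical groups yield F = 0; well-separated groups yield a large F.
assert f_oneway([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]) == 0.0
assert f_oneway([1.0, 1.1, 0.9], [5.0, 5.1, 4.9]) > 100
```

Fisher's LSD then amounts to unadjusted pairwise t-tests, performed only once the omnibus F is significant.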

5.1.1 Results on time intervals between inputs of a sequence

During the experimental trials, both priming groups with the pitch–brightness mapping produced significantly smaller time intervals between clicks than the control group, and the C-prime group also produced smaller time intervals than the P-prime group (Fig. 4). For the pitch–elevation mapping, however, only the C-prime group produced significantly faster inputs than the control group, while the P-prime group produced significantly slower inputs (Fig. 4). The detailed statistical results are listed in Table 2.

5.1.2 Results on task error rate

For the pitch–brightness mapping, the C-prime group (11.13%) produced more accurate sequences than the P-prime group (14.25%) and the baseline group (14.44%) (Fig. 5a, pitch–brightness). For the pitch–elevation mapping, the P-prime group (21.44%) produced a higher error rate than the control group (14.50%) and the C-prime group (13.88%), as shown in Fig. 5a (pitch–elevation).

5.2 Discussion

Hypothesis H1, that crossmodal perceptual preference can be induced by cognitive priming, was confirmed for the pitch–brightness correspondence. For the pitch–elevation correspondence, however, only the C-prime group showed a positive effect on task performance, while the P-prime group showed slower motor responses (Fig. 4, upper-right chart) and poorer accuracy (Fig. 5a, pitch–elevation).
These inconsistent outcomes between the two crossmodal correspondences may result from the different interaction strategies afforded by the task paradigm. Previous research mainly used a speeded classification paradigm [41], which required participants to react to stimuli with a one-shot key press, a process requiring little working memory or other cognitive resources. In the current experiment, by contrast, participants had a clear interaction goal and engaged in a sequence of input actions. In this regard the P-prime, which primed a direct association between the auditory and visual stimuli, may have functioned as an explicit instruction, encouraging participants to constantly recall and check during the interaction. This extra cognitive load could explain the increase in response time. The subjective evaluations, discussed later in this section, support this explanation.
Surprisingly, such degraded performance was less evident in the P-prime group with the pitch–brightness correspondence. According to the self-reports, the interaction strategy adopted by participants may have compensated for this detrimental effect. In response to a question about strategies for reproducing the sequences, 67% of participants in the pitch–brightness condition reported using a visual tracking strategy to follow the flash pattern of the sequence. In contrast, none of the participants in the pitch–elevation group recalled using any strategy. Indeed, the visual pattern for the pitch–elevation correspondence had a vertical arrangement, with the moving path of the sequence overlapping itself, whereas the visual pattern for the pitch–brightness correspondence had a planar arrangement, so the moving path was more likely to be perceived as a trackable trajectory [20]. Since visual perception is more sensitive to spatial arrangement, while auditory perception is more sensitive to temporal information, the combination of a 2-D visual stimulus with a 1-D auditory stimulus can produce a multisensory enhancement effect [24]. As a result, the time delay and the errors were less salient in the pitch–brightness conditions.
Hypothesis H2, that the priming types would have different sensory-motor modulation effects, was confirmed. The C-prime condition improved participants' crossmodal perception for both the pitch–brightness and pitch–elevation correspondences, which in turn supported faster motor responses and improved accuracy of crossmodal sequence reproduction. In comparison, the P-prime condition had a positive effect only on the pitch–brightness mapping, an effect that may be due to the compensatory interaction strategy afforded by the task paradigm. Previous studies have shown that priming without conscious awareness can positively affect people's affective reactions and decision making [25, 32]; we therefore deduce that the perceptual enhancement from C-priming in our case could also have functioned subliminally. It enabled participants to align more strongly with a specific CC, which facilitated fast and instantaneous motor responses. The P-prime material, by contrast, may have risen to the level of conscious awareness during the priming process, engaging working memory during the task and lowering overall performance to some extent.
Support for the above assumption can be gleaned from the post-experiment questionnaire. Most participants in the C-priming groups were not aware of the CC presented during priming and could not consciously recall the correlation between the priming and the interaction task. Of the 40 subjective interpretations from the C-prime groups (two groups of 20 participants each), only 2 participants reported that 'it was about how the sound changes with visual information' or that 'they are there to focus your concentration on the screen and the sound'. The other 38 participants' answers focused exclusively on the concrete content of the priming material, such as 'It's showing the sunrise' for the pitch–brightness priming and 'flying birds, running animal' for the pitch–elevation priming. Meanwhile, only 4 of the 40 participants rated the C-prime as helpful for the interaction task. In contrast, 36 of the 40 participants in the P-priming groups were fully aware of the purpose of the priming. Typical answers were 'information that helps you prepare for the game' for the pitch–brightness priming and 'different notes of music vertically spaced' for the pitch–elevation priming, and 36 of the 40 rated the priming as helpful. From a usability perspective, this evaluation supports the extra-cognitive-load account discussed for hypothesis H1.
Combining participants' behavioural data with their subjective ratings, it appears that the C-prime subliminally enhanced crossmodal perception and led to better task performance, which also explains why the C-priming was not considered helpful by most participants. In comparison, the purpose of the P-priming was recognised in most cases and it was therefore acknowledged as helpful for task completion; the actual behavioural data of the P-primed participants, however, pointed in the opposite direction.
To better understand the quantitative effects of the priming techniques on subsequent goal-oriented tasks, participants' accuracy based on crossmodal alignment, as explained in Sect. 4.4, was plotted on two temporal scales: within-trial and between-trial performance (Fig. 6, Experiment 1). Along the horizontal axis, which represents the steps within trials, the P-prime groups produced more accurate moves in the first two steps and were more prone to error in the last two steps; the C-prime and control groups did not show this pattern. These observations suggest that participants in the P-prime group relied more on working memory capacity than on crossmodal congruency, consistent with the objective behavioural data discussed previously. Along the vertical axis, which represents between-trial performance, no improvement was observed in the later trials for any group; in other words, the data did not reflect a learning effect between trials.
Table 3
Experimental design for experiment 2
https://static-content.springer.com/image/art%3A10.1007%2Fs12193-020-00326-y/MediaObjects/12193_2020_326_Tab3_HTML.png

6 Experiment 2: study on mutually exclusive crossmodal mappings

One hundred and twenty participants (62 male, 58 female, aged 18–35 years, mean = 26.48, SD = 4.58) were recruited for the second experiment, with 20 in each condition. All participants were newly recruited for experiment 2 and were randomised in the same way as in experiment 1.
This experiment was designed to investigate the second research question, RQ2: how do people integrate crossmodal information when two CCs are combined such that perceiving the congruency of one CC excludes perceiving the congruency of the other? Furthermore, how, if at all, does the integration of such information change in the presence or absence of cognitive priming?
Participants in all conditions used an interface containing the two crossmodal mappings arranged incongruently (Fig. 1c). Participants in conditions 1 and 2 were primed on the pitch–brightness correspondence, which applied in the subsequent task with the pitch–elevation correspondence acting as a distractor. Participants in conditions 3 and 4 were primed on the pitch–elevation correspondence, with the pitch–brightness correspondence as the distractor (see the experimental design in Table 3). In this way, participants in each manipulation group could be perceptually consistent with only one of the CCs, never with both.
H3: With the two CCs arranged in a contradictory manner, the primed groups will integrate crossmodal information in a selective way, while the control groups will do so in an additive way [11]. Specifically, the primed groups will produce better performance, i.e. faster motor responses and lower error rates, than the control groups.

6.1 Results

A Kolmogorov–Smirnov test showed that the data could be assumed to be normally distributed. One-way ANOVA was used with a significance level of \(\alpha \) = 0.05.

6.1.1 Results on time intervals between inputs of a sequence

When participants were primed on the pitch–brightness correspondence with pitch–elevation as a distractor, there was no significant difference between the primed groups and the control group, nor between the two types of priming (Table 4, Fig. 4 (bottom-left panel)). When primed on the pitch–elevation correspondence with pitch–brightness as a distractor, both primed groups performed significantly better than the control group, with no significant difference between the two types of priming (Table 4, Fig. 4 (bottom-right panel)).

6.1.2 Results on task error rate

With the CCs arranged incongruently, when primed on the pitch–brightness correspondence but distracted by pitch–elevation during the interaction, the C-prime group (17.88%) produced a slightly lower error rate than the P-prime group (22.06%) and the control group (20.13%) (Fig. 5b, pitch–brightness). When primed on the pitch–elevation correspondence with pitch–brightness as the distractor, the P-prime group (13.69%) showed a lower error rate than the C-prime group (17.19%) and the control group (23.94%) (Fig. 5b, pitch–elevation).

6.2 Discussion

Hypothesis H3 was confirmed for priming on the pitch–elevation mapping but rejected for priming on the pitch–brightness mapping. The two priming groups and the control group for the pitch–brightness correspondence showed no significant difference in motor response speed (Fig. 4, lower-left) and no obvious difference in accuracy (Fig. 5b). The two priming groups for the pitch–elevation correspondence, in contrast, performed significantly better than their control group in both motor response speed and sequence-reproduction accuracy.
One explanation may be that the pitch–elevation correspondence carries a stronger perceptual weight [11] than the pitch–brightness correspondence. The perceptual experience of pitch–elevation may be encountered more frequently in daily interactions, so the neural response to pitch–elevation stimuli may be stronger [7, 44]. This crossmodal perceptual weighting may be hard to observe when crossmodal stimuli are isolated from one another, as in experiment 1 and in many classification-based tasks employed in previous studies [10, 14, 41]. In experiment 2, however, where the crossmodal stimuli were overlapping and incongruent, priming on the relatively stronger CC (here, pitch–elevation) enabled participants to selectively weight the dominant stimulus in the subsequent activity, whereas priming on the less dominant CC (here, pitch–brightness) appeared to have little or no modulating effect. As a result, only the groups primed on the pitch–elevation correspondence produced improved performance. We therefore postulate that, in the presence of priming, crossmodal stimuli may be integrated in a selective rather than an additive manner [23].
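The selective-versus-additive distinction drawn here can be illustrated with a toy model. This is our own construction, not the authors' or Bruno and Cutting's formal model, and the weights below are invented for illustration.

```python
def integrate(cues, mode="additive"):
    """cues: {name: (weight, evidence)}, with evidence in [-1, 1].
    Additive integration sums the weighted evidence of all cues;
    selective integration follows only the most strongly weighted cue."""
    if mode == "additive":
        return sum(w * e for w, e in cues.values())
    w, e = max(cues.values(), key=lambda we: we[0])  # dominant cue only
    return w * e

# Mutually exclusive congruencies: the dominant pitch-elevation cue signals
# "congruent" (+1) while pitch-brightness signals "incongruent" (-1).
cues = {"pitch_elevation": (0.7, +1.0), "pitch_brightness": (0.4, -1.0)}
additive = integrate(cues, "additive")    # ~0.3: the distractor partly cancels
selective = integrate(cues, "selective")  # 0.7: the distractor is filtered out
assert selective > additive
```

Under this reading, priming on the dominant CC pushes integration towards the selective regime, which is consistent with the improved performance of the groups primed on pitch–elevation.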
Following the discussion of experiment 1, to observe how the priming technique influenced performance in an incongruent crossmodal situation, participants' crossmodal alignment accuracy is plotted in Fig. 6 (right). The within-trial comparison shows that the P-priming groups produced more accurate moves in the first few steps of the sequences. A similar pattern appeared in the P-priming groups in experiment 1 (Fig. 6, left), regardless of how the crossmodal stimuli were combined. From this we deduce that participants given explicit P-priming material tended to do the task in a way that involved more working memory capacity. Unlike in experiment 1, this extra attention allocation made participants less susceptible to perceptual distraction in experiment 2. The between-trial comparison shows that participants in the C-priming groups performed better in the latter half of the trials than in the earlier half. Since the C-priming groups in experiment 1 showed better accuracy in the earlier trials, it is plausible to attribute the improvement in experiment 2 to a practice effect more than to the priming effect. However, the plot for the control groups in experiment 2 shows no salient difference between earlier and later trials attributable to practice. This observation suggests two things. First, C-priming does play a part in modulating crossmodal perception, but the effect is not immediate. Second, without priming, participants are susceptible to a crossmodal distractor that shares the same crossmodal perceptual channels. Therefore, in the absence of priming, the integration of mutually exclusive crossmodal stimuli more likely happens in an additive than a selective manner [11].
Table 4
Statistical analysis based on time intervals in experiment 2

| Condition | F(2, 3837) | p | r | Note |
|---|---|---|---|---|
| Prime on pitch–brightness, pitch–elevation as distractor | .886 | .412 | .000 | P-prime (537.17 ms) > control group (492.21 ms); C-prime (527.74 ms) > control group (differences not significant) |
| Prime on pitch–elevation, pitch–brightness as distractor | 23.145 | < .001 | .11 | P-prime (494.23 ms) < control group (525.79 ms); C-prime (506.54 ms) < control group |

7 General discussion

The present study examined the enhancement of participants' crossmodal perception, as well as the crossmodal integration process, with and without cognitive priming. The first contribution, to the best of our knowledge, is a systematic investigation of the effect of two CCs, pitch–brightness and pitch–elevation, on an interactive task requiring consistent input behaviour. Second, the study introduced and evaluated cognitive priming as a way to enhance crossmodal perception for interactive tasks. Third, drawing on cognitive and social psychology, it distinguished two types of priming material, i.e. conceptual and perceptual priming, and investigated their effects on sensory-motor performance. Last, through the cognitive priming technique, it explored how people integrate two CCs that are mutually exclusive during interaction.
In general, the results of experiment 1 revealed that when the two visual–auditory CCs were isolated and did not interfere with one another, conceptual priming successfully enhanced crossmodal perception, leading to faster motor responses and improved task accuracy. Perceptual priming, in contrast, operated in a more explicit manner: in the goal-oriented task it appeared to function as an instruction, causing extra cognitive resources to be allocated to recalling and comparing the primed cue and the task stimuli. This process diminished task performance, even though most participants regarded the perceptual priming as helpful. The results of experiment 2 revealed that where two CCs had mutually exclusive congruency, priming had little or no effect on the less dominant CC, i.e. pitch–brightness in this case. For the more dominant CC, i.e. pitch–elevation, explicit priming enhanced perception and led to faster motor responses and improved accuracy.
Extending previous studies of CCs with graded stimuli [29, 47], our study revealed perceptual phenomena not previously observed with polarised crossmodal values. Individuals appear to assign different perceptual weights to different CCs, and these weights can be dynamically enhanced or inhibited by different types of priming and by the way crossmodal information is organised. Moreover, we observed that priming on the dominant CC strengthens the perceptual response to that crossmodal stimulus, while priming on the less dominant CC has little or no facilitatory effect on fast, consecutive motor responses. Future studies are needed, however, to verify and further explore the impact of multiple CCs in interactive environments with either continuous [16] or graded feedback. We envision the outcomes of this line of research being applied where information must be overlaid in augmented or virtual reality, or in situations demanding high cognitive load and sophisticated interactive actions.
One limitation of the present study is the small effect sizes in the statistical analysis, for which the small sample size in each experimental condition may be partly responsible; further verification with larger samples should be conducted. Another limitation is that the study does not indicate whether the cognitive priming effect and the perceptual weighting phenomenon generalise to other CCs: there are many crossmodal feature values, at different levels of intensity, that remain to be tested, although it seems unlikely that the priming contexts used here, i.e. video clips of natural environments, are the only materials that exhibit priming effects. The next step in this line of research would be a systematic investigation of priming approaches in terms of material, priming duration, persistence, and interaction scenario. Last but not least, sensory channels for priming need not be limited to audition and vision; future work could expand the priming modalities and their combinations to other sensory channels, including but not limited to haptics, olfaction and gustation [33].

7.1 Implications for interactive system design

In terms of potential applications, the present findings could contribute to the fields of mindless technology [1, 36] and multisensory interaction [31]. Mindless computing emphasises subconscious mental processes and is characterised by fast, automatic interaction that requires little or no effort. Mindless technology exploits subliminal mental states, aiming to guide people towards intended interaction behaviour without annoying or distracting them. Previous research has shown that both environmental features and the interactive system itself can subtly influence people's choices and judgements in interactive tasks [1, 4, 36]. The present study, built on theoretical accounts of CCs, contributes to the field by introducing cognitive priming as an approach to enhancing crossmodal perception and, consequently, improving interactive sensory-motor performance. The implications for designing multisensory, mindless interaction are, first, that implicit conceptual priming can be used to improve interaction efficiency when information shares only one CC feature and, second, that when two or more streams of crossmodal information have mutually exclusive congruency, explicit perceptual priming can be used to improve interactive performance.
The present study also provides insights for the design of multisensory interactive systems [31]. As is often discussed, one of the advantages of multisensory interaction is that it can expand people's cognitive capacity by presenting several streams of information through different sensory channels [24, 34, 48]. To exploit this advantage, we need to consider situations in which two or more streams of information occupy the same pair of crossmodal channels and, worse, contain mutually exclusive CCs, as demonstrated in experiment 2. Both cases can inhibit information-processing capacity and thus negate the advantages of multisensory interaction. One design implication for handling this situation is to isolate information streams either spatially or temporally to avoid confusion. Another is to shunt information flows through different pairs of crossmodal channels, e.g. visual–auditory and visual–haptic, to reduce cross-sensory distractions.
Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Literatur
1. Adams AT, Costa J, Jung MF, Choudhury T (2015) Mindless computing: designing technologies to subtly influence behavior. In: Proceedings of the 2015 ACM international joint conference on pervasive and ubiquitous computing. ACM, pp 719–730
2. Adams WJ, Graf EW, Ernst MO (2004) Experience can change the 'light-from-above' prior. Nat Neurosci 7(10):1057
3. Agrawal M, Jorgensen J (2019) Sonify: making visual graphs accessible. In: Human interaction and emerging technologies: proceedings of the 1st international conference on human interaction and emerging technologies (IHIET 2019), August 22–24, 2019, Nice, France, volume 1018. Springer, p 454
4. Aranyi G, Kouider S, Lindsay A, Prins H, Ahmed I, Jacucci G, Negri P, Gamberini L, Pizzi D, Cavazza M (2014) Subliminal cueing of selection behavior in a virtual environment. Presence 23(1):33–50
5. Bakker S, Antle AN, Van Den Hoven E (2012) Embodied metaphors in tangible interaction design. Pers Ubiquitous Comput 16(4):433–449
6. Bargh JA, Chartrand TL (2000) The mind in the middle. Handb Res Methods Soc Pers Psychol 2:253–285
7. Barsalou LW (1999) Perceptions of perceptual symbols. Behav Brain Sci 22(4):637–660
8. Brunel L, Carvalho PF, Goldstone RL (2015) It does belong together: cross-modal correspondences influence cross-modal integration during perceptual learning. Front Psychol 6:358
9. Brunetti R, Indraccolo A, Del Gatto C, Spence C, Santangelo V (2018) Are crossmodal correspondences relative or absolute? Sequential effects on speeded classification. Atten Percept Psychophys 80(2):527–534
10. Brunetti R, Indraccolo A, Mastroberardino S, Spence C, Santangelo V (2017) The impact of cross-modal correspondences on working memory performance. J Exp Psychol Hum Percept Perform 43(4):819
11. Bruno N, Cutting JE (1988) Minimodularity and the perception of layout. J Exp Psychol Gen 117(2):161–170
12. Chalfoun P, Frasson C (2012) Cognitive priming: assessing the use of non-conscious perception to enhance learner's reasoning ability. In: International conference on intelligent tutoring systems. Springer, pp 84–89
13. Chen Y-C, Spence C (2010) When hearing the bark helps to identify the dog: semantically-congruent sounds modulate the identification of masked pictures. Cognition 114(3):389–404
14. Evans KK, Treisman A (2009) Natural cross-modal mappings between visual and auditory features. J Vis 10(1):6
15. Eysenck MW, Keane MT (2013) Cognitive psychology: a student's handbook. Psychology Press, East Sussex
16. Feng F, Stockman T (2017) An investigation of dynamic crossmodal instantiation in TUIs. In: Proceedings of the 19th ACM international conference on multimodal interaction. ACM, pp 82–90
17. Frid E, Moll J, Bresin R, Pysander ELS (2019) Haptic feedback combined with movement sonification using a friction sound improves task performance in a virtual throwing task. J Multimodal User Interfaces 13(4):279–290
18. Gallace A, Spence C (2006) Multisensory synesthetic interactions in the speeded classification of visual size. Percept Psychophys 68(7):1191–1203
19. Geronazzo M, Bedin A, Brayda L, Campus C, Avanzini F (2016) Interactive spatial sonification for non-visual exploration of virtual maps. Int J Hum Comput Stud 85:4–15
20. Han S, Humphreys GW, Chen L (1999) Uniform connectedness and classical gestalt principles of perceptual grouping. Percept Psychophys 61(4):661–674
21. Harrison L, Skau D, Franconeri S, Lu A, Chang R (2013) Influencing visual judgment through affective priming. In: Proceedings of the SIGCHI conference on human factors in computing systems. ACM, pp 2949–2958
22. Hurtienne J (2011) Image schemas and design for intuitive use: exploring new guidance for user interface design. Ph.D. thesis
23. Jacobs RA (2002) What determines visual cue reliability? Trends Cogn Sci 6(8):345–350
24. James KH, Vinci-Booher S, Munoz-Rubke F (2017) The impact of multimodal-multisensory learning on human performance and brain activation patterns. In: The handbook of multimodal-multisensor interfaces. Association for Computing Machinery and Morgan & Claypool, pp 51–94
25. Kosmyna N, Tarpin-Bernard F, Rivet B (2015) Conceptual priming for in-game BCI training. ACM Trans Comput–Hum Interact (TOCHI) 22(5):26
26.
Zurück zum Zitat Lewis S, Dontcheva M, Gerber E (2011) Affective computational priming and creativity. In: Proceedings of the SIGCHI conference on human factors in computing systems. ACM, pp 735–744 Lewis S, Dontcheva M, Gerber E (2011) Affective computational priming and creativity. In: Proceedings of the SIGCHI conference on human factors in computing systems. ACM, pp 735–744
27.
Zurück zum Zitat McCormick K, Lacey S, Stilla R, Nygaard LC, Sathian K (2018) Neural basis of the crossmodal correspondence between auditory pitch and visuospatial elevation. Neuropsychologia 112:19–30CrossRef McCormick K, Lacey S, Stilla R, Nygaard LC, Sathian K (2018) Neural basis of the crossmodal correspondence between auditory pitch and visuospatial elevation. Neuropsychologia 112:19–30CrossRef
28.
Zurück zum Zitat Mehta R, Zhu RJ (2009) Blue or red? Exploring the effect of color on cognitive task performances. Science 323(5918):1226–1229CrossRef Mehta R, Zhu RJ (2009) Blue or red? Exploring the effect of color on cognitive task performances. Science 323(5918):1226–1229CrossRef
29.
Zurück zum Zitat Metatla O, Correia NN, Martin F, Bryan-Kinns N, Stockman T (2016) Tap the shapetones: exploring the effects of crossmodal congruence in an audio-visual interface. In: Proceedings of the 2016 CHI conference on human factors in computing systems. ACM, pp 1055–1066 Metatla O, Correia NN, Martin F, Bryan-Kinns N, Stockman T (2016) Tap the shapetones: exploring the effects of crossmodal congruence in an audio-visual interface. In: Proceedings of the 2016 CHI conference on human factors in computing systems. ACM, pp 1055–1066
30.
Zurück zum Zitat Molholm S, Ritter W, Javitt DC, Foxe JJ (2004) Multisensory visual-auditory object recognition in humans: a high-density electrical mapping study. Cereb Cortex 14(4):452–465CrossRef Molholm S, Ritter W, Javitt DC, Foxe JJ (2004) Multisensory visual-auditory object recognition in humans: a high-density electrical mapping study. Cereb Cortex 14(4):452–465CrossRef
31.
Zurück zum Zitat Munteanu C, Irani P, Oviatt S, Aylett M, Penn G, Pan S, Sharma N, Rudzicz F, Gomez R, Cowan B et al (2017) Designing speech, acoustic and multimodal interactions. In: Proceedings of the 2017 CHI conference extended abstracts on human factors in computing systems. ACM, pp 601–608 Munteanu C, Irani P, Oviatt S, Aylett M, Penn G, Pan S, Sharma N, Rudzicz F, Gomez R, Cowan B et al (2017) Designing speech, acoustic and multimodal interactions. In: Proceedings of the 2017 CHI conference extended abstracts on human factors in computing systems. ACM, pp 601–608
32.
Zurück zum Zitat Murphy ST, Zajonc RB (1993) Affect, cognition, and awareness: affective priming with optimal and suboptimal stimulus exposures. J Pers Soc Psychol 64(5):723CrossRef Murphy ST, Zajonc RB (1993) Affect, cognition, and awareness: affective priming with optimal and suboptimal stimulus exposures. J Pers Soc Psychol 64(5):723CrossRef
33.
Zurück zum Zitat Obrist M, Gatti E, Maggioni E, Vi CT, Velasco C (2017) Multisensory experiences in HCI. IEEE Multimed 24(2):9–13CrossRef Obrist M, Gatti E, Maggioni E, Vi CT, Velasco C (2017) Multisensory experiences in HCI. IEEE Multimed 24(2):9–13CrossRef
34.
Zurück zum Zitat Oviatt S (2002) Breaking the robustness barrier: Recent progress on the design of robust multimodal systems. In: Advances in computers, volume 56. Elsevier, pp 305–341 Oviatt S (2002) Breaking the robustness barrier: Recent progress on the design of robust multimodal systems. In: Advances in computers, volume 56. Elsevier, pp 305–341
35.
Zurück zum Zitat Parise CV, Knorre K, Ernst MO (2014) Natural auditory scene statistics shapes human spatial hearing. In: Proceedings of the national academy of sciences, p 201322705 Parise CV, Knorre K, Ernst MO (2014) Natural auditory scene statistics shapes human spatial hearing. In: Proceedings of the national academy of sciences, p 201322705
36.
Zurück zum Zitat Pinder C (2017) Nonconscious behaviour change technology: targeting the automatic. In: Proceedings of the 2017 CHI conference extended abstracts on human factors in computing systems. ACM, pp 160–165 Pinder C (2017) Nonconscious behaviour change technology: targeting the automatic. In: Proceedings of the 2017 CHI conference extended abstracts on human factors in computing systems. ACM, pp 160–165
37.
Zurück zum Zitat Pinder C, Vermeulen J, Cowan BR, Beale R, Hendley RJ (2017) Exploring the feasibility of subliminal priming on smartphones. In: Proceedings of the 19th international conference on human–computer interaction with mobile devices and services. ACM, p 21 Pinder C, Vermeulen J, Cowan BR, Beale R, Hendley RJ (2017) Exploring the feasibility of subliminal priming on smartphones. In: Proceedings of the 19th international conference on human–computer interaction with mobile devices and services. ACM, p 21
38.
Zurück zum Zitat Pratt CC (1930) The spatial character of high and low tones. J Exp Psychol 13(3):278CrossRef Pratt CC (1930) The spatial character of high and low tones. J Exp Psychol 13(3):278CrossRef
39.
Zurück zum Zitat Rusconi E, Kwan B, Giordano BL, Umilta C, Butterworth B (2006) Spatial representation of pitch height: the SMARC effect. Cognition 99(2):113–129CrossRef Rusconi E, Kwan B, Giordano BL, Umilta C, Butterworth B (2006) Spatial representation of pitch height: the SMARC effect. Cognition 99(2):113–129CrossRef
40.
Zurück zum Zitat Slobodenyuk N, Jraissati Y, Kanso A, Ghanem L, Elhajj I (2015) Cross-modal associations between color and haptics. Atten Percept Psychophys 77(4):1379–1395CrossRef Slobodenyuk N, Jraissati Y, Kanso A, Ghanem L, Elhajj I (2015) Cross-modal associations between color and haptics. Atten Percept Psychophys 77(4):1379–1395CrossRef
41.
Zurück zum Zitat Spence C (2011) Crossmodal correspondences: a tutorial review. Atten Percept Psychophys 73(4):971–995CrossRef Spence C (2011) Crossmodal correspondences: a tutorial review. Atten Percept Psychophys 73(4):971–995CrossRef
42.
Zurück zum Zitat Spence C (2019) On the relative nature of (pitch-based) crossmodal correspondences. Multisens Res 32(3):235–265CrossRef Spence C (2019) On the relative nature of (pitch-based) crossmodal correspondences. Multisens Res 32(3):235–265CrossRef
43.
Zurück zum Zitat Spence C, Deroy O (2012) Crossmodal correspondences: Innate or learned? i-Perception 3(5):316–318CrossRef Spence C, Deroy O (2012) Crossmodal correspondences: Innate or learned? i-Perception 3(5):316–318CrossRef
44.
Zurück zum Zitat Stein BE, Stanford TR, Rowland BA (2014) Development of multisensory integration from the perspective of the individual neuron. Nat Rev Neurosci 15(8):520CrossRef Stein BE, Stanford TR, Rowland BA (2014) Development of multisensory integration from the perspective of the individual neuron. Nat Rev Neurosci 15(8):520CrossRef
45.
Zurück zum Zitat Thompson E, Palacios A, Varela FJ (1992) On the ways to color. Behav Brain Sci 15(1):56–74CrossRef Thompson E, Palacios A, Varela FJ (1992) On the ways to color. Behav Brain Sci 15(1):56–74CrossRef
46.
Zurück zum Zitat Walker BN, Kramer G (1996) Mappings and metaphors in auditory displays: an experimental assessment. Georgia Institute of Technology, Atlanta Walker BN, Kramer G (1996) Mappings and metaphors in auditory displays: an experimental assessment. Georgia Institute of Technology, Atlanta
47.
Zurück zum Zitat Walker L, Walker P (2016) Cross-sensory mapping of feature values in the size-brightness correspondence can be more relative than absolute. J Exp Psychol Hum Percept Perform 42(1):138CrossRef Walker L, Walker P (2016) Cross-sensory mapping of feature values in the size-brightness correspondence can be more relative than absolute. J Exp Psychol Hum Percept Perform 42(1):138CrossRef
48.
Zurück zum Zitat Wickens CD (2008) Multiple resources and mental workload. Hum Factors 50(3):449–455CrossRef Wickens CD (2008) Multiple resources and mental workload. Hum Factors 50(3):449–455CrossRef
Metadata
Title
Exploring crossmodal perceptual enhancement and integration in a sequence-reproducing task with cognitive priming
Authors
Feng Feng
Puhong Li
Tony Stockman
Publication date
13.07.2020
Publisher
Springer International Publishing
Published in
Journal on Multimodal User Interfaces / Issue 1/2021
Print ISSN: 1783-7677
Electronic ISSN: 1783-8738
DOI
https://doi.org/10.1007/s12193-020-00326-y
