Eye movements are unique behavioral responses that have not only a reaction time and an accuracy, but also a location (where one moves), an amplitude (how far one moves), and a duration (how long one fixates a position). Researchers looking to examine the effects of stimulus or condition manipulations typically use a combination of these measures (for review, see Henderson, 2003). Eye movements unfold over time; hence, one can also examine the interrelationship between sequences of eye movements. In his seminal work, Yarbus (1967) noticed that observers displayed similar scan patterns in successive viewings of Repin’s painting The Unexpected Visitor and concluded that “observers differ in the way they think and, therefore, differ also to some extent in the way they look at things” (p. 192). A brief inspection of these scan patterns reveals that they are complex and nonrandom, and that they contain sequences of repeated fixations. Noton and Stark (1971) noticed that observers tend to show similar scan patterns during encoding and later recognition of images. According to their “scanpath theory,” the sequence of fixations during the first viewing of a stimulus is stored in memory as a spatial model, and stimulus recognition is facilitated through observers following the same scan path during repeated exposures to the same image. These early observations were made informally by visual inspection, but later research has aimed at quantifying the similarity of scan paths for the same observer at different time points or when solving different tasks, or between different participants.

One successful method for comparing scan paths is based on the string-edit distance (Bunke, 1992; Levenshtein, 1966; Wagner & Fischer, 1974). In this method, fixation positions on a grid are encoded with letters, so that scan paths, as sequences of fixated grid positions, can be encoded as strings. Two strings are compared by finding the smallest number of elementary transformations (e.g., insertions, deletions, and substitutions) needed to turn one string into the other. This minimum number of transformation steps serves as a measure of the distance between the scan paths. Scan path comparisons based on string-edit distance have been used successfully by many researchers, for example, Foulsham and Underwood (2008), Underwood, Foulsham, and Humphrey (2009), Foulsham and Kingstone (in press), and Harding and Bloj (2010). A second common scan path comparison is a linear distance algorithm that compares fixation sequences on the basis of the average distances between fixations (Henderson, Brockmole, Castelhano, & Mack, 2007; Mannan, Ruddock, & Wooding, 1995).
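
To make the string-edit comparison concrete, the following Python sketch (an illustration, not code from any of the cited studies) encodes fixations as grid-cell indices and computes the Levenshtein distance with the standard dynamic-programming recurrence. The function names `grid_string` and `edit_distance`, the 64-pixel cell size, the 16-column grid, and the example coordinates are assumptions chosen for illustration; integer cell indices stand in for the letters described above.

```python
def grid_string(fixations, cell=64, columns=16):
    """Encode (x, y) fixations as a sequence of grid-cell indices (assumed layout)."""
    return [int(y // cell) * columns + int(x // cell) for x, y in fixations]


def edit_distance(a, b):
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


# Two made-up scan paths; the second skips the middle fixation of the first.
path_1 = [(10, 10), (100, 20), (300, 400)]
path_2 = [(20, 30), (300, 410)]
print(edit_distance(grid_string(path_1), grid_string(path_2)))
```

In this example the two encoded paths differ by a single deletion, so the distance is 1. Note that the result depends on the grid: fixations that are close together but fall in different cells count as different symbols.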

While these scan path comparison methods have been successfully used by many researchers, they are necessarily constrained to the comparison of scan paths for a limited number of situations, such as scan path comparisons for the same stimulus (e.g., Brandt & Stark, 1997; Noton & Stark, 1971; Shepherd, Steckenfinger, Hasson, & Ghazanfar, 2010), for closely related stimuli or stimulus sequences (e.g., Foulsham & Kingstone, in press; Underwood et al., 2009), or for stimuli with identical layouts (e.g., Cristino, Mathôt, Theeuwes, & Gilchrist, 2010). In other words, these methods are constrained to comparisons in which stimulus similarity is high (if the stimuli are not identical). In addition, it is often difficult to dissociate whether scan path similarity is due to the properties of the image (Harding & Bloj, 2010; Itti & Koch, 2001), the task or background knowledge (Underwood et al., 2009), or idiosyncratic eye movement behavior (Tatler & Vincent, 2008), because comparisons of this nature usually do not give values that would be expected to be above chance, and the string-edit similarity score typically has very low variability (e.g., Harding & Bloj, 2010; Underwood et al., 2009). To fully explore scan patterns within and across individuals, we need a more general technique for quantifying the temporal structure of eye movements. Recurrence quantification analysis (RQA), introduced here, permits not only the characterization of the scan path of a single observer viewing a single stimulus but also, through its associated measures, generalizations over observers and stimuli. RQA provides a general set of measures for characterizing different temporal structures of fixation sequences.

Recurrence quantification analysis

Recurrence analysis has been used successfully as a tool for describing complex dynamic systems (e.g., climatological data, Marwan & Kurths, 2002; electrocardiograms, Webber & Zbilut, 2005; or postural fluctuations, Pellecchia & Shockley, 2005; Riley & Clark, 2003) that are inadequately characterized by standard methods in time series analysis (e.g., Box, Jenkins, & Reinsel, 2008). It has also been used for describing the interplay between dynamic systems in cross-recurrence analysis (e.g., the postural synchronization of two persons: Shockley, 2005; Shockley, Santana & Fowler, 2003). Recurrence analysis can be generalized to categorical data, and recently, Richardson, Dale, and colleagues have used categorical cross-recurrence analysis for analyzing the coordination of gaze patterns between individuals (e.g., Cherubini, Nüssli, & Dillenbourg, 2010; Dale, Kirkham, & Richardson, 2011a; Dale, Warlaumont, & Richardson, 2011b; Richardson & Dale, 2005; Richardson, Dale, & Tomlinson, 2009; Shockley, Richardson, & Dale, 2009). For example, Richardson and Dale quantified the coordination between a speaker and a listener’s eye movements as they viewed actors on a screen. Those researchers demonstrated that the locations of a listener’s eye movements tend to follow the speaker’s by approximately 2 s. In addition, they found that the more closely a listener’s eye movements matched the speaker’s, the better the listener’s overall comprehension of the speaker’s comments. This form of cross-recurrence analysis can provide an overall measure of similarity across two eye movement sequences (like the string-edit method), as well as the quantification of the time lag that most closely matches the two sequences. While this method is useful for quantifying similarities between two time series, the general recurrence method includes many more temporal measures that may be useful in quantifying the temporal structure of a single time series—namely, the eye movements of a single observer.

In the present article, we introduce a form of categorical recurrence analysis for characterizing the gaze patterns of a single observer. We show that it is a powerful tool for analyzing fixation sequences, for discovering repeated scan paths, and for determining image positions that are fixated repeatedly or are part of recurring scan paths. In the following sections, we first introduce the fundamentals of RQA, with specific consideration of fixation sequences. Second, we describe and interpret the main measures associated with RQA. Third, we present an eyetracking experiment designed to reveal meaningful and interpretable differences in these measures. Finally, we discuss potential applications and extensions of RQA as a general tool for the temporal analysis of eye movement behavior.

Recurrence

Consider a fixation sequence \( f_i \), \( i = 1, \ldots, N \), with \( f_i = \langle x_i, y_i \rangle \). Two fixations are considered to be recurrent if they are close together. “Closeness” can be defined in several ways, as discussed below, but in general, one can define recurrence \( r_{ij} \) as

$$ {r_{ij }}=\left\{ {\begin{array}{*{20}c} {1,} & {d\left( {{f_i},{f_j}} \right)\leqslant \rho } \\ {0,} & {\mathrm{otherwise}} \\ \end{array}} \right., $$
(1)

where d is some distance metric (e.g., Euclidean distance) and ρ is a given radius (i.e., two fixations are considered recurrent if they are within a certain distance of each other).
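
As a minimal sketch of Eq. 1, assuming Euclidean distance in pixels, the recurrence matrix of a fixation sequence can be computed as follows. The function name `recurrence_matrix` and the example coordinates are hypothetical and serve only to illustrate the definition.

```python
import math


def recurrence_matrix(fixations, radius):
    """Return r[i][j] = 1 if fixations i and j lie within `radius` of each other, else 0."""
    n = len(fixations)
    return [[1 if math.hypot(fixations[i][0] - fixations[j][0],
                             fixations[i][1] - fixations[j][1]) <= radius else 0
             for j in range(n)]
            for i in range(n)]


# Made-up sequence: the third fixation returns close to the first one.
fixations = [(100, 100), (400, 300), (120, 110), (700, 500)]
r = recurrence_matrix(fixations, radius=64)
for row in r:
    print(row)
```

Every fixation is recurrent with itself, so the diagonal of `r` is all ones, and `r` is symmetric because the distance is symmetric.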

Recurrence plot

Recurrence can be represented in a recurrence plot, which plots recurrences of a fixation sequence with itself over all possible time lags. If fixations i and j are recurrent (i.e., if \( r_{ij} = 1 \)), then a dot is plotted at position i, j (see Fig. 1b). All fixations are recurrent with themselves [since \( d(f_i, f_i) = 0 \)]; hence, all elements on the major diagonal (the line of incidence) are recurring. Furthermore, since distance metrics are symmetric [i.e., \( d(f_i, f_j) = d(f_j, f_i) \)], recurrence plots are also symmetric. A recurrence plot is generated for each sequence of fixations (e.g., each trial or image viewed). Note that time is not represented directly on the recurrence plot; rather, the fixation sequence is preserved. Later, we will describe an extension of RQA in which fixation duration is taken into account.

Fig. 1

(A) Example fixation map overlaid by a grid with element size 64 pixels. In the fixed-grid method, fixations are considered recurrent if they fall within the same grid element. For example, Fixations 11 and 13 are recurrent, but Fixations 10 and 13 are not (see highlighted region). (B) Recurrence plot corresponding to the fixation sequence in Fig. 1a, generated using the fixed-grid method. Notice that points are drawn at the intersections of Fixations 11 and 13

Distance metrics

One can define several distance metrics for the analysis of fixations, and we will discuss each one in turn. In the fixed-grid method, which is similar to the string-edit method employed for scan path analysis (Cristino et al., 2010; Foulsham & Kingstone, in press; Underwood et al., 2009), a grid of locations is defined over the image, and two fixations \( f_i \) and \( f_j \) are considered recurrent if they land in the same grid element. This is illustrated in Fig. 1a, which shows a fixation sequence plotted on a 1,024 × 768 image with a grid element size of 64 × 64 pixels. For example, Fixations 11 and 13 are recurrent, and consequently points are drawn on the recurrence plot at positions <11, 13> and <13, 11>. One disadvantage of the fixed-grid method is that the grid is defined independently of the image content and may be too coarse in regions of interest, while being too fine in other areas. As a remedy, one can define regions of interest (e.g., eyes, nose, lips, and cheek) and define which regions are considered adjacent to each other. Both methods, however, suffer from the problem that fixations may not be defined as recurring even if they are close to each other, namely, if they happen to land in adjacent grid elements (see, e.g., Fixations 10 and 13 in Fig. 1a).
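
The boundary problem of the fixed-grid method can be illustrated with a short Python sketch. The helper names `grid_cell` and `grid_recurrence` and the example coordinates are assumptions made for illustration; only the 64-pixel grid element size is taken from the text.

```python
def grid_cell(fixation, cell=64):
    """Return the (column, row) index of the grid element containing a fixation."""
    x, y = fixation
    return (int(x // cell), int(y // cell))


def grid_recurrence(fixations, cell=64):
    """Fixed-grid recurrence: two fixations recur if they share a grid element."""
    cells = [grid_cell(f, cell) for f in fixations]
    return [[1 if a == b else 0 for b in cells] for a in cells]


# The first two fixations are only 6 pixels apart but straddle the grid
# boundary at x = 64; the third is farther from the first yet shares its cell.
fixations = [(61, 30), (67, 30), (10, 50)]
for row in grid_recurrence(fixations):
    print(row)
```

Here the nearby pair is not counted as recurrent, whereas the more distant pair is, which is the behavior that motivates a distance-based criterion.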

A solution to this problem is the fixation-distance method (see Fig. 2). Instead of relying on a fixed grid superimposed on the stimuli, this method defines two fixations \( f_i \) and \( f_j \) as recurring if they are close to each other [i.e., if the Euclidean distance \( d(f_i, f_j) \leq \rho \) for a fixed radius ρ]. This is illustrated in Fig. 3, where the distances between Fixations 4 and 34–36 are all less than ρ = 64 pixels. The fixation-distance method will be used in the remainder of the article.

Fig. 2

Example fixation plot. Fixations are numbered sequentially, and a circle with radius 64 pixels is drawn around fixations that recur

Fig. 3

Example recurrence plot (panel B) corresponding to the fixation sequence from Fig. 2, along with the corresponding fixation detail of Fixations 4, 34, 35, and 36 (panel A). Fixations 34, 35, and 36 fall within 64 pixels of Fixation 4. These fixations are said to recur, and points are drawn on the plot at the intersection of Fixations 4, 34, 35, and 36

Recurrence quantification measures

While the recurrence diagram provides a useful visual representation of the recurrence patterns for a fixation sequence, it must be complemented by an RQA for comparison across different fixation sequences (i.e., for different trials, participants, and experimental conditions). Here, we introduce a subset of RQA measures, those that are particularly useful for the analysis of fixations (see Webber & Zbilut, 2005, and Marwan, Wessel, Meyerfeldt, Schirdewan, & Kurths, 2002, for complete lists of RQA measures). With each mathematical description, we provide an interpretation of the measure in terms of eye movement behavior. Given the symmetry of the recurrence diagram, the quantitative measures are usually extracted from the upper triangle of the recurrence diagram, excluding the line of incidence, which does not add any additional information (recall that the line of incidence indicates that each fixation is recurrent with itself). First, we give some useful definitions: Let R be the sum of recurrences in the upper triangle of the recurrence diagram; that is, \( R=\sum_{i=1}^{N-1} \sum_{j=i+1}^{N} r_{ij} \). Let \( D_L \) be the set of diagonal lines, \( H_L \) the set of horizontal lines, and \( V_L \) the set of vertical lines, all in the upper triangle and all with a length of at least L, and let |·| denote cardinality.

The recurrence measure is defined as

$$ \mathrm{REC}=100\frac{2R }{{N\left( {N-1} \right)}}. $$
(2)

It represents, for a sequence of N fixations, the percentage of recurrent fixations (i.e., how often observers refixate previously fixated image positions). As fixations are plotted sequentially, the larger the distance between a recurrent point and the main diagonal, the larger is the time interval (in number of fixations) between the original fixation and the refixation.
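
Equation 2 can be sketched in a few lines of Python. The helper `matrix_from_pairs`, which builds a small recurrence matrix by hand, and the function names `count_recurrences` and `rec` are hypothetical; indices in the code are 0-based, unlike the 1-based fixation numbers used in the text.

```python
def matrix_from_pairs(n, pairs):
    """Hypothetical helper: symmetric recurrence matrix with the given recurrent pairs."""
    r = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for i, j in pairs:
        r[i][j] = r[j][i] = 1
    return r


def count_recurrences(r):
    """R in the text: number of recurrent pairs in the upper triangle (i < j)."""
    n = len(r)
    return sum(r[i][j] for i in range(n - 1) for j in range(i + 1, n))


def rec(r):
    """Percent recurrence, Eq. 2: 100 * 2R / (N(N - 1)); needs at least two fixations."""
    n = len(r)
    return 100.0 * 2 * count_recurrences(r) / (n * (n - 1))


# Five fixations: fixation 2 returns to the location of fixation 0,
# and fixation 4 returns to the location of fixation 1.
r = matrix_from_pairs(5, [(0, 2), (1, 4)])
print(rec(r))  # 2 recurrent pairs out of 10 possible pairs, i.e., 20 %
```

The denominator N(N − 1)/2 is the number of distinct fixation pairs, so the measure is the percentage of pairs that are recurrent.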

The determinism measure is defined as

$$ \mathrm{DET}=100\frac{{\left| {{D_L}} \right|}}{R}. $$
(3)

It measures the proportion of recurrent points forming diagonal lines and represents repeating gaze patterns in the recurrence diagram. In the example in Fig. 4, the scan path of Fixations 19–20 is repeated later in Fixations 43–44, producing a diagonal line in the recurrence diagram. This may represent two areas of the scene where one fixation is more likely to follow another. For example, when a person looks at one eye, he or she may be more likely to look at the other in a repeated pattern between the two eyes. This repeated pattern would create instances of determinism. In the present work, the minimum line length of diagonal line elements was set to L = 2. The length of the diagonal line element reflects the number of fixations making up the repeated scan path, and the distance from the diagonal reflects the time (in numbers of fixations) since the scan path was first followed.
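
A possible way to compute determinism is sketched below in Python. It assumes that the numerator of Eq. 3 is read as the number of recurrent points lying on diagonal lines of length at least L, which matches the verbal description as the proportion of recurrent points forming diagonal lines; this reading, the function names, and the hand-built example matrix are assumptions, not the authors' MATLAB implementation.

```python
def matrix_from_pairs(n, pairs):
    """Hypothetical helper: symmetric recurrence matrix with the given recurrent pairs."""
    r = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for i, j in pairs:
        r[i][j] = r[j][i] = 1
    return r


def diagonal_line_points(r, min_length=2):
    """Count upper-triangle recurrent points on diagonal runs of at least min_length."""
    n = len(r)
    total = 0
    for lag in range(1, n):          # lag = j - i; lag 0 is the line of incidence
        run = 0
        for i in range(n - lag):
            if r[i][i + lag]:
                run += 1
            else:
                if run >= min_length:
                    total += run
                run = 0
        if run >= min_length:        # close a run that reaches the plot border
            total += run
    return total


def det(r, min_length=2):
    """Determinism (assumed reading of Eq. 3); None when there are no recurrences."""
    n = len(r)
    big_r = sum(r[i][j] for i in range(n - 1) for j in range(i + 1, n))
    if big_r == 0:
        return None
    return 100.0 * diagonal_line_points(r, min_length) / big_r


# Fixations 0 and 1 are repeated in the same order as fixations 3 and 4;
# fixation 5 revisits the same spot as fixations 0 and 3 without forming a line.
r = matrix_from_pairs(6, [(0, 3), (1, 4), (0, 5), (3, 5)])
print(det(r))
```

In this example two of the four recurrent points form a diagonal line of length 2, so half of the recurrences are counted as deterministic.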

Fig. 4

Illustration of determinism in a recurrence plot (panel A), along with the detail of a deterministic fixation sequence (panel B). Fixations 19 and 20 are fixated, and later in the trial this sequence is refixated in the same order, as Fixations 43 and 44 (see the blue highlighted region on panel A)

The laminarity measure is defined as

$$ \mathrm{LAM}=100\frac{{\left| {{H_L}} \right|+\left| {{V_L}} \right|}}{2R }. $$
(4)

Referring to the top half of the recurrence plot, vertical lines represent areas that were fixated first in a single fixation and then rescanned in detail over consecutive fixations at a later time (e.g., several fixations later), and horizontal lines represent areas that were first scanned in detail and then refixated briefly later in time (see Fig. 5). (This definition of laminarity differs slightly from that of Webber & Zbilut, 2005.) In the example in Fig. 5, vertical laminarity is shown for Fixations 34, 35, and 36, and horizontal laminarity is shown for Fixations 12, 13, and 14. Again, we set the minimum line lengths of vertical and horizontal lines to L = 2. Finally, we have found that recurrence diagrams sometimes contain recurrence clusters (with horizontal and vertical lines), indicating detailed scanning of an area and of nearby locations. Laminarity in general indicates that specific areas of a scene are repeatedly fixated. For example, an observer may return to an interesting area of the scene to scan it in more detail. This would create a vertical line on the recurrence plot.
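
Laminarity can be sketched in the same way. The Python code below assumes that the numerator of Eq. 4 counts the recurrent points lying on horizontal and vertical lines of length at least L in the upper triangle; this reading and all function names are assumptions. In the matrix `r[i][j]` with i < j, a run over consecutive later fixations j for one earlier fixation i corresponds to a vertical line in the sense used above, and a run over consecutive earlier fixations i for one later fixation j corresponds to a horizontal line.

```python
def matrix_from_pairs(n, pairs):
    """Hypothetical helper: symmetric recurrence matrix with the given recurrent pairs."""
    r = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for i, j in pairs:
        r[i][j] = r[j][i] = 1
    return r


def run_points(values, min_length):
    """Sum the lengths of runs of 1s that are at least min_length long."""
    total = run = 0
    for v in list(values) + [0]:     # the trailing 0 closes the final run
        if v:
            run += 1
        else:
            if run >= min_length:
                total += run
            run = 0
    return total


def lam(r, min_length=2):
    """Laminarity (assumed reading of Eq. 4); None when there are no recurrences."""
    n = len(r)
    big_r = sum(r[i][j] for i in range(n - 1) for j in range(i + 1, n))
    if big_r == 0:
        return None
    # One earlier fixation revisited over consecutive later fixations ("vertical").
    vertical = sum(run_points((r[i][j] for j in range(i + 1, n)), min_length)
                   for i in range(n - 1))
    # Consecutive earlier fixations revisited by one later fixation ("horizontal").
    horizontal = sum(run_points((r[i][j] for i in range(j)), min_length)
                     for j in range(1, n))
    return 100.0 * (horizontal + vertical) / (2 * big_r)


# Fixation 0 is revisited by the consecutive fixations 3, 4, and 5;
# fixations 1 and 2 form an isolated recurrence.
r = matrix_from_pairs(6, [(0, 3), (0, 4), (0, 5), (1, 2)])
print(lam(r))
```

With the normalization by 2R, a value of 100 would require every recurrent point to lie on both a horizontal and a vertical line; in the example, three of four points lie on one vertical line, giving 37.5.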

Fig. 5

Recurrence plot illustrating laminarity and associated details of the fixations involved (panels B and C). The refixation at the location of Fixation 4 in Fixations 34, 35, and 36 (panel B) creates a vertical line on the recurrence plot (panel A, highlighted in purple). This indicates that the general location of Fixation 4 was examined in detail later on in the trial. In contrast, a region is fixated in detail at Fixations 12, 13, and 14. Later in the trial (at Fixation 24), the same location is revisited briefly (panel C). This creates a horizontal line on the RQA plot (panel A, highlighted in blue)

The center of recurrence mass (corm) is defined as the distance of the center of gravity of recurrent points from the line of incidence, normalized such that the maximum possible value is 100:

$$ \mathrm{CORM}=100\frac{{\sum\nolimits_{i=1}^{N-1 } {\sum\nolimits_{j=i+1}^N {\left( {j-i} \right){r_{ij }}} } }}{{\left( {N-1} \right)R}}. $$
(5)

This measure indicates approximately where in time most of the recurrent points are situated. Small corm values indicate that refixations tend to occur close in time, whereas large corm values indicate that refixations tend to occur widely separated in time (i.e., in terms of number of fixations; see Fig. 6). For example, if an observer sequentially scans three particular areas of a scene in detail and never returns to those areas later in the trial, most of the recurrent points would fall close to the line of incidence. This would be represented by a small corm value.
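
Equation 5 translates directly into code. In the following Python sketch, the function name `corm`, the helper `matrix_from_pairs`, and the two hand-built example matrices are hypothetical and are chosen only to contrast refixations that occur close together with refixations that occur far apart in the sequence.

```python
def matrix_from_pairs(n, pairs):
    """Hypothetical helper: symmetric recurrence matrix with the given recurrent pairs."""
    r = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for i, j in pairs:
        r[i][j] = r[j][i] = 1
    return r


def corm(r):
    """Center of recurrence mass, Eq. 5; None when there are no recurrences."""
    n = len(r)
    pairs = [(i, j) for i in range(n - 1) for j in range(i + 1, n) if r[i][j]]
    if not pairs:
        return None
    return 100.0 * sum(j - i for i, j in pairs) / ((n - 1) * len(pairs))


# Refixations immediately after the original fixation (lag 1 each).
near = matrix_from_pairs(6, [(0, 1), (2, 3), (4, 5)])
# Refixations late in the sequence (lags 5, 4, and 4).
far = matrix_from_pairs(6, [(0, 5), (0, 4), (1, 5)])
print(corm(near), corm(far))
```

The first matrix yields a small value because all recurrent points sit next to the line of incidence, and the second yields a large value because the recurrent points lie far from it.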

Fig. 6

Illustration of the center of recurrence mass (corm). Corm is low when recurrence occurs close together in the trial sequence, near to the line of incidence (a), and corm is high when recurrence occurs farther apart in the trial sequence (b)

The corm measure is related to the trend measure introduced by Marwan et al. (2002), which is computed as the slope of the least-squares regression of the number of recurrences on each diagonal as a function of the distance from the central diagonal. We have found that, in particular for small values of N, the sample variance of the trend measure tends to be large, because the addition or removal of a single recurrence away from the main diagonal can strongly affect the value of the trend measure. The corm measure was found to be more resilient to such variations.

In summary, the recurrence and corm measures capture the global temporal structure of fixation sequences. They measure how many times given scene areas are refixated (recurrence) and whether these refixations occur close or far apart in the trial sequence (corm). In contrast, determinism and laminarity are measures of the finer temporal structure. Specifically, they indicate sequences of fixations that are repeated (determinism) and points at which detailed inspections of an image area are occurring (laminarity). These measures can then be compared across different types of images, experimental contexts, and participants.

Radius selection

As indicated earlier, two fixations \( f_i \) and \( f_j \) are considered recurrent if \( d(f_i, f_j) \leq \rho \), with the radius ρ being a free parameter. The number of recurrences is related directly to the radius. As the radius ρ approaches zero, (off-diagonal) recurrences approach zero, and as ρ approaches the image size, recurrences approach 100 % (see Fig. 7). The dependence of recurrence on radius leads to the obvious question of how an appropriate radius for recurrence analysis should be selected.
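
The dependence of recurrence on the radius can be explored with a simple sweep, sketched here in Python. The function name `percent_recurrence`, the made-up fixation coordinates, and the chosen radii are assumptions for illustration; 1,280 pixels is the diagonal of a 1,024 × 768 image, so no two fixations on such an image can be farther apart.

```python
import math


def percent_recurrence(fixations, radius):
    """Percent recurrence (Eq. 2) under the fixation-distance method."""
    n = len(fixations)
    recurrent = sum(
        1
        for i in range(n - 1)
        for j in range(i + 1, n)
        if math.hypot(fixations[i][0] - fixations[j][0],
                      fixations[i][1] - fixations[j][1]) <= radius
    )
    return 100.0 * 2 * recurrent / (n * (n - 1))


# Made-up fixation sequence with two revisited locations.
fixations = [(100, 100), (400, 300), (120, 110), (700, 500), (390, 320)]
radii = (0, 32, 64, 512, 1280)
curve = [percent_recurrence(fixations, radius) for radius in radii]
for radius, value in zip(radii, curve):
    print(radius, value)
```

The resulting values never decrease as the radius grows, starting at 0 % for a radius of zero (no two fixations coincide exactly) and reaching 100 % once the radius covers the whole image.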

Fig. 7

The median and 95 % confidence interval of percent recurrence, as a function of radius

Webber and Zbilut (2005) suggested several guidelines for selecting the proper radius, including the selection of a radius that falls within the linear scaling region of a log–log plot of Fig. 7, or ensuring that the percentage of recurrences remains low (e.g., in the range 0.1 %–2.0 %; Webber & Zbilut, 2005, p. 56). In the experiment reported below, applying this method would require that the radius size change across conditions, eliminating the use of recurrence as a measure for comparing the effects of different experimental conditions. In the case of eye movements, one can apply more content-oriented criteria. For example, fixations can be considered as recurring if their foveal (or parafoveal) areas overlap, using a radius size of 1–2 deg of visual angle. In the study reported below, the radius was selected to match the size of the gaze-contingent window, which subtended approximately 5 × 5 deg of visual angle. Such content-oriented criteria make it easier to interpret the meanings of recurrences. Alternatively, it may be possible to derive an optimal radius from the spatial frequency content of the stimulus images, but we have not done so in the present work.

Significance testing (bootstrapping)

The RQA measures do not necessarily have the same probability distributions (see Fig. 8), and the distributions may even differ between experimental conditions. In addition, one limitation of the RQA measures is that their values are largely dependent on the radius chosen, and for this reason, the measured values are somewhat arbitrary. Thus, it is critical to compare the RQA measures against a random fixation model. For this reason, we relied on bootstrapping methods (Efron & Tibshirani, 1993; Foster & Bischof, 1991) for comparing the RQA measures against chance.

Fig. 8

Histograms of the recurrence (panel A), determinism (panel B), laminarity (panel C), and corm (panel D) measures

In the experiment reported below, we computed spatial fixation distributions separately for each of the experimental conditions. These were smoothed using a Gaussian filter with σ = 20 pixels; that is, we computed standard fixation heat maps. We then created random fixation sequences by taking sequences of random samples from these smoothed fixation distributions. This was repeated 1,000 times for each experimental trial, and the empirical values of recurrence, determinism, laminarity, and corm were compared against the distributions of the same measures obtained for these random fixation sequences.
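
The random fixation model can be approximated with the Python sketch below, which is an illustration under stated assumptions rather than the procedure actually implemented. Instead of building a pixel heat map, it draws each random fixation by picking one pooled fixation at random and adding Gaussian jitter with σ = 20 pixels, which samples from the Gaussian-smoothed pooled distribution up to pixel binning and clipping at the image border. The function names, the made-up pooled and trial data, and the use of percent recurrence as the only measure are all assumptions.

```python
import math
import random


def percent_recurrence(fixations, radius):
    """Percent recurrence (Eq. 2) under the fixation-distance method."""
    n = len(fixations)
    recurrent = sum(
        1
        for i in range(n - 1)
        for j in range(i + 1, n)
        if math.hypot(fixations[i][0] - fixations[j][0],
                      fixations[i][1] - fixations[j][1]) <= radius
    )
    return 100.0 * 2 * recurrent / (n * (n - 1))


def random_sequence(pooled, length, sigma, rng):
    """Draw `length` fixations from the pooled distribution smoothed by a Gaussian."""
    picks = [rng.choice(pooled) for _ in range(length)]
    return [(rng.gauss(x, sigma), rng.gauss(y, sigma)) for x, y in picks]


def bootstrap_rec(pooled, length, radius, replications=1000, sigma=20.0, seed=0):
    """Percent recurrence of `replications` random sequences of the given length."""
    rng = random.Random(seed)
    return [percent_recurrence(random_sequence(pooled, length, sigma, rng), radius)
            for _ in range(replications)]


# Made-up pooled fixations for one condition, spread evenly over the image.
pooled = [(x, y) for x in range(64, 1024, 128) for y in range(64, 768, 128)]
# Made-up trial in which two locations are revisited several times.
trial = [(100, 100), (500, 400), (105, 95), (495, 410), (110, 100),
         (505, 395), (900, 600), (100, 105), (500, 400), (300, 200)]

empirical = percent_recurrence(trial, radius=64)
chance = bootstrap_rec(pooled, length=len(trial), radius=64)
print(empirical, sum(chance) / len(chance))
```

Comparing the empirical value with the distribution in `chance` (for example, with its mean or with the proportion of random values that reach the empirical value) indicates whether the observed recurrence exceeds what the spatial distribution of fixations alone would produce.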

Experiment

For the present study, we utilized a data set, part of which we have published independently (Risko, Anderson, Lanthier, & Kingstone, 2012). In that study, participants performed an unrestricted scene-viewing task. Another group of participants, included in the present experiment only, viewed the same scenes through a gaze-contingent window, a technique whereby the fixated visual field is constrained and updated rapidly in response to the eye movements of participants. The general purpose behind the use of gaze-contingent displays is to limit the availability of the surrounding scene context in order to examine its role in normal eye movement planning (Bertera & Rayner, 2000). The gaze-contingent window has a profound impact on eye movement behavior (Foulsham & Kingstone, in press; Foulsham, Teszka, & Kingstone, 2011; Loschky & McConkie, 2002; van Diepen & d’Ydewalle, 2003): Participants typically make smaller and more systematic eye movements when viewing a scene through a gaze-contingent display (Foulsham & Kingstone, in press; Loschky & McConkie, 2002), and the size and shape of the window has an impact on both the amplitude and direction of saccades (Foulsham et al., 2011). Given the impact of the gaze-contingent display on eye movement behavior, we employed it here to assess the utility of the RQA measures in quantifying differences in the temporal dynamics of normal and gaze-contingent viewing. In addition, we compared eye movements across 18 different scenes of buildings, interiors, and landscapes (six of each type; examples of each scene type are shown in Figs. 1, 2, and 10) in order to assess whether the RQA measures can dissociate differences in eye movement behavior resulting from differences in image content.

Method

Participants

A group of 108 undergraduates from the University of British Columbia were paid $5 each or received course credit to participate.

Stimuli and apparatus

We used 18 different scenes of exteriors, interiors, and landscapes, six of each type. The scenes spanned 38 × 29.5 cm, corresponding to a visual angle of approximately 42 × 33 deg at the viewing distance of 50 cm. The image resolution was 1,024 × 768 pixels. An SR Research EyeLink II head-mounted eyetracking system recording at 500 Hz was used to display the stimuli and to record eye movements. Calibration was performed at the start of the experiment using a nine-point calibration pattern, and drift correction was performed before each trial. The online saccade detector of the eyetracker was set to detect saccades with an amplitude of at least 0.5°, using an acceleration threshold of 9,500°/s² and a velocity threshold of 30°/s. Both the scenes and the eye positions were also presented to the experimenter on a second monitor, so that real-time feedback could be given about system accuracy.
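
The stated visual angles follow from the scene size and the viewing distance through the standard relation angle = 2 · arctan(size / (2 · distance)). The short Python check below uses only the dimensions given above; the function name `visual_angle_deg` is hypothetical.

```python
import math


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (in degrees) subtended by an extent centered on the line of sight."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


width_deg = visual_angle_deg(38, 50)     # scene width of 38 cm at 50 cm
height_deg = visual_angle_deg(29.5, 50)  # scene height of 29.5 cm at 50 cm
print(round(width_deg, 1), round(height_deg, 1))
```

Both values round to the approximate 42 × 33 deg reported for the scenes.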

Procedure

The participants were seated approximately 50 cm away from the computer monitor. The scenes were then presented in random order, and each remained on the screen for 15 s. The participants were assigned to one of two experimental conditions, “natural viewing” or “gaze-contingent viewing.” Those in the “natural viewing” condition were told that they would be presented with a picture and that they were to look at it “naturally.” Those in the “gaze-contingent” condition viewed the scenes through a square gaze-contingent window of size 128 × 128 pixels, corresponding to a visual angle of approximately 5 × 5 deg.

Results

Data handling

Three of the participants (two from natural and one from gaze-contingent viewing) were removed from the analysis on the basis of an outlier removal procedure in which participants with an average recurrence value 2.5 standard deviations above or below their group condition mean were removed. One participant was randomly removed from the gaze-contingent condition in order to preserve a balanced design. Thus, there were 52 participants in the natural viewing and 52 in the gaze-contingent viewing condition. The fixation sequences of two trials contained no recurrences and were excluded from the RQA analysis. This was done for simplicity, as the absence of recurrence renders determinism, laminarity, and corm values undefined. In general, however, trials with no recurrence can indeed be informative in cases in which recurrence is expected, and it may be interesting to compare trials in which fixation sequences contain no recurrences to those that do. In our case, however, there were too few trials without recurrence to allow for such an analysis, and thus we chose to remove those trials.
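
The exclusion rule can be sketched as follows in Python. The text does not say whether the sample or the population standard deviation was used, so the use of the sample standard deviation here is an assumption, as are the function name `exclude_outliers`, the participant labels, and the made-up recurrence values.

```python
import statistics


def exclude_outliers(mean_by_participant, criterion=2.5):
    """Keep participants whose mean lies within `criterion` SDs of the group mean."""
    values = list(mean_by_participant.values())
    group_mean = statistics.mean(values)
    group_sd = statistics.stdev(values)  # sample SD (assumed)
    return {p: v for p, v in mean_by_participant.items()
            if abs(v - group_mean) <= criterion * group_sd}


# Made-up mean recurrence values for the participants of one condition.
group = {"p01": 4.8, "p02": 5.1, "p03": 5.0, "p04": 4.9, "p05": 5.2, "p06": 5.0,
         "p07": 4.7, "p08": 5.3, "p09": 5.1, "p10": 4.9, "p11": 5.0, "p12": 20.0}
kept = exclude_outliers(group)
print(sorted(set(group) - set(kept)))
```

The rule is applied within each viewing condition, so the function would be called once per group. With very small groups, no value can lie 2.5 sample standard deviations from the mean, so the rule only has an effect for reasonably large samples.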

The gaze patterns were analyzed using the RQA measures of recurrence, determinism, laminarity, and corm, separately in four 2 × 3 × 6 mixed analyses of variance, with Viewing Condition (natural or gaze-contingent) as a between-subjects factor and Image Type (exteriors, interiors, or landscapes) and Repetitions (image identity) as within-subjects factors. Huynh–Feldt corrections were applied whenever appropriate, and partial \( \eta^2 \) values (\( \eta_p^2 \)) are reported for effect size. Bootstrap results for comparison against random fixation sequences were based on 1,000 bootstrap replications. (The RQA analysis was implemented in MATLAB, following Webber & Zbilut, 2005, and some of the software packages mentioned therein. The code is available on request from the authors.)

Recurrence

The recurrence results are shown in Fig. 9a. We found a significant effect of viewing condition, F(1, 102) = 95.2, p < .001, \( \eta_p^2 \) = .346, with recurrence for natural viewing (M = 6.84, SD = 0.14) being higher than recurrence for gaze-contingent viewing (M = 3.18, SD = 0.06). A significant effect of image type also emerged, F(2, 204) = 3.98, p = .02, \( \eta_p^2 \) = .008, with recurrence for exteriors (M = 4.70, SD = 0.16) being lower than recurrence for interiors (M = 5.18, SD = 0.13), t(1248) = 2.37, p = .02, and for landscapes (M = 5.15, SD = 0.17), t(1246) = 1.96, p = .05. Finally, there was an interaction between viewing condition and image type, F(2, 204) = 5.74, p = .004, \( \eta_p^2 \) = .011, with recurrence being larger for interiors than for the other image types during gaze-contingent viewing, Scheffé F(2, 933) = 18.95, p < .001, but not during normal viewing, Scheffé F(2, 931) = 2.44, p = .09. Bootstrapping showed that recurrence was significantly higher for both natural and gaze-contingent viewing than for the corresponding random fixation sequences [for the respective random sequences, M = 2.35, paired t(934) = 31.4, p < .001, and M = 2.36, paired t(936) = 13.1, p < .001].

Fig. 9

Averages of the RQA measures across normal and gaze-contingent viewing for each of the image types. L, E, and I represent landscapes, exteriors, and interiors, respectively. Error bars represent standard errors of the means. (A) Percent global recurrence. (B) Determinism, with the y-axis representing the percentage of recurrent points that form diagonal lines. (C) Laminarity, with the y-axis representing the percentage of recurrent points that form horizontal or vertical lines. (D) Corm, with the y-axis representing the percentage of the maximum possible corm value

Determinism

The determinism results are shown in Fig. 9b. We found a significant effect of viewing condition, F(1, 102) = 20.06, p < .001, \( \eta_p^2 \) = .114, with determinism for natural viewing (M = 35.06, SD = 0.59) being lower than determinism for gaze-contingent viewing (M = 44.83, SD = 0.65). We also found a significant effect of image type, F(2, 204) = 50.66, p < .001, \( \eta_p^2 \) = .076, with determinism for interiors (M = 45.45, SD = 0.76) being higher than determinism for exteriors (M = 37.36, SD = 0.76), t(1248) = 7.52, p < .001, and for landscapes (M = 37.03, SD = 0.76), t(1246) = 7.77, p < .001. Bootstrapping showed that determinism was significantly higher for both natural and gaze-contingent viewing than for the corresponding random fixation sequences [respectively, M = 4.25, paired t(934) = 52.56, p < .001, and M = 4.32, paired t(936) = 62.72, p < .001].

Laminarity

The laminarity results are shown in Fig. 9c. A significant effect of viewing condition emerged, F(1, 102) = 34.3, p < .001, \( \eta_p^2 \) = .193, with laminarity for natural viewing (M = 21.69, SD = 0.43) being lower than laminarity for gaze-contingent viewing (M = 32.45, SD = 0.56). A significant effect of image type also appeared, F(2, 204) = 53.45, p < .001, \( \eta_p^2 \) = .076, with laminarity for interiors (M = 31.44, SD = 0.63) being higher than laminarity for exteriors (M = 24.34, SD = 0.62), t(1248) = 8.05, p < .001, and for landscapes (M = 25.42, SD = 0.67), t(1246) = 6.58, p < .001. Bootstrapping showed that laminarity was significantly higher for both natural and gaze-contingent viewing than for the corresponding random fixation sequences [respectively, M = 2.47, paired t(934) = 53.52, p < .001, and M = 2.60, paired t(936) = 44.58, p < .001].

Center of recurrence mass

The corm results are shown in Fig. 9d. We found a significant effect of viewing condition, F(1, 102) = 39.13, p < .001, \( \eta_p^2 \) = .090, with corm for natural viewing (M = 26.54, SD = 0.29) being higher than corm for gaze-contingent viewing (M = 21.64, SD = 0.33). We also found a significant effect of image type, F(2, 204) = 15.24, p < .001, \( \eta_p^2 \) = .024, with corm for interiors (M = 22.38, SD = 0.36) being lower than corm for exteriors (M = 25.00, SD = 0.39), t(1248) = 4.93, p < .001, and for landscapes (M = 24.89, SD = 0.40), t(1246) = 4.62, p < .001. Finally, a significant interaction between viewing condition and image type was apparent, F(2, 204) = 4.66, p = .011, \( \eta_p^2 \) = .007, due to the fact that, as compared to the other conditions, corm was particularly low for gaze-contingent viewing of interior scenes, Scheffé F(1, 1864) = 109.42, p < .001. Bootstrapping showed that corm was significantly higher for both natural and gaze-contingent viewing than for the corresponding random fixation sequences [respectively, M = 14.01, paired t(934) = 39.56, p < .001, and M = 15.79, paired t(936) = 17.34, p < .001].

Discussion

Recurrence quantification analysis revealed significant differences in global temporal fixation patterns, in particular between natural viewing and gaze-contingent viewing. Recurrence was higher in the natural viewing condition than in the gaze-contingent viewing condition: Participants were more likely to refixate previously inspected scene areas when the surrounding scene context was visible. This highlights the importance of extrafoveal information for guiding gaze, in particular for reinspecting previously fixated scene positions. Recurrence in both the natural viewing and gaze-contingent conditions was significantly higher than would be expected on the basis of random fixation sequences. The results also revealed a significant but small (\( \eta_p^2 \) = .008) effect of image type, with recurrence being somewhat smaller for exterior scenes than for the other scene types.

Regarding the global temporal pattern of refixations, corm was higher for natural than for gaze-contingent viewing, indicating that refixations occurred farther apart in the fixation sequence for natural viewing than for gaze-contingent viewing. This could reflect a general tendency in gaze-contingent viewing to saccade within the bounds of the window. Such fixations would constitute a recurrence in the present work, given that the size of the radius chosen was approximately equal to the size of the gaze-contingent window. Many of the refixations in the gaze-contingent condition occurred closer to the line of incidence (or closer together in the fixation sequence), resulting in a smaller corm value. In addition, the corm measures for natural and gaze-contingent viewing were significantly different from those for random fixation sequences, indicating that, in general, there was a modestly large gap between fixations and refixations.

RQA also revealed significant differences in the local temporal gaze patterns. Determinism was significantly higher for gaze-contingent than for natural viewing, and both were significantly higher than for random fixation sequences. In other words, participants were likely to follow certain scan paths repeatedly in viewing scenes, and they were more likely to do so in gaze-contingent viewing than in natural viewing. This suggests that, in the absence of peripheral scene information in the gaze-contingent condition, observers scanned scenes in a more stereotypical pattern. The same pattern of results was found for the laminarity measure, which indexes the co-occurrence of scan paths and single fixations of the same scene area at different points in time. Thus, fixation patterns in gaze-contingent viewing contain more clusters of recurrent fixations (higher laminarity and determinism), indicating that participants were inspecting particular scene regions in greater detail under these conditions. For gaze-contingent viewing, higher determinism and laminarity may indicate that participants were repeatedly fixating within the bounds of the gaze-contingent window and were less likely to make a large-amplitude saccade to a new area of the scene. This finding is consistent with previous accounts of gaze-contingent viewing patterns (Foulsham et al., 2011; Loschky & McConkie, 2002; van Diepen & d’Ydewalle, 2003).

Determinism and laminarity were also higher for interior scenes than for exteriors or landscapes. In addition, corm was lower for interior scenes than for exteriors or landscapes (particularly for gaze-contingent viewing), indicating that observers were fixating repeatedly in particular areas of the scene. These results could reflect properties of the image, such as the layout of interior scenes. For instance, the interior scenes in this work contained more objects, often clustered in particular areas (e.g., a desk). Clustered objects may encourage the clustering of eye movements, both spatially and temporally. This result is in line with work showing that patterns of eye movement behavior can change depending on the image content of a scene (e.g., Foulsham & Kingstone, 2010; Foulsham, Kingstone, & Underwood, 2008). Although Foulsham and colleagues focused mainly on the direction of saccades, it is possible that temporal aspects of eye movement behavior may also change depending on image content. Further work will need to be conducted to explore this possibility. For example, it may be possible to compare RQA measures against measures of the amount of clutter in a scene (e.g., Rosenholtz, Li, & Nakano, 2007). From the results reported here, we predict that the amount of clutter and the amount of determinism and laminarity would be positively correlated.

These observations also open an interesting prospect in assessing models of gaze generation (e.g., Boccignone & Ferraro, in press; Itti & Koch, 2001) and gaze imitation (e.g., Hoffman, Grimes, Shon, & Rao, 2006). These models have typically been tested by comparing predicted with empirical fixation distributions or by comparing the congruence of predicted and empirical gaze sequences. In contrast, RQA provides several new measures that capture general characteristics of gaze sequences, and thus may provide new constraints on these models. For example, a model may be consistent with empirical observations on the recurrence measure, but not the determinism or corm measures.

Taken together, these RQA results reveal consistent temporal gaze patterns that were not only very different for natural and gaze-contingent viewing, but also for different types of scenes. The results also show that RQA is a powerful tool for capturing important temporal characteristics of gaze patterns.

Spatial mapping of RQA

RQA is suitable for the characterization of temporal gaze patterns, independent of the spatial properties of particular scenes. While this is undoubtedly one of the advantages of RQA over other analysis methods, it is nonetheless interesting to map RQA measures back onto spatial coordinates. Given that RQA is performed separately for each participant and stimulus, this can be achieved easily. Figure 10a shows a standard fixation heat map for one of the exterior images summed over all participants, excluding first fixations, which tend to be located near the center of the screen. Figure 10b shows the recurrence heat map for the same image. It was obtained by projecting all recurrences on each recurrence diagram back into the image space. It thus represents a map of all locations that participants fixated more than once (refixated). Notice that the “heat” is much more tightly arranged across the scene and that some areas are highlighted on the recurrence heat map that are not prominent on the normal fixation heat map. The recurrence heat map may be useful for those interested only in the places where participants refixated. Image statistics (or other measures) can then be compared for these particular areas of the scene. It would be interesting to know whether areas that are refixated differ in image or semantic properties from areas that are fixated more in general. Another possibility is that areas that are consistently refixated are in some way more task-relevant than others, as has been found in work examining eye movements during everyday tasks (e.g., Land, Mennie, & Rusted, 1999). Finally, Fig. 10c shows the determinism heat map for the same image. It was obtained by projecting all recurrence points that were part of diagonal lines (see Fig. 4) back into image space. It thus represents a map of all locations that were part of repeated scan paths. These back-projections could also be done for the other RQA measures.

Fig. 10

(A) Traditional fixation heat map. (B) Recurrence heat map, with only those fixations that were recurrent represented in the image. (C) Determinism heat map, with only deterministic fixations represented in the image

Casual examination of Figs. 10a and b shows that refixated areas do not appear to be particularly special in terms of image content or semantic information, relative to the rest of the scene. It is possible that the refixated areas in Fig. 10b are convenient locations from which to examine a broader area of a scene. In addition, inspection of the recurrence heat maps of our interior scenes showed that refixations generally appear to occur in areas containing multiple objects or particularly cluttered areas of the scene. Moreover, systematic differences in heat maps appear across natural and gaze-contingent viewing conditions. In the natural condition, refixations are clustered in a few (two to four) locations across the scene, while in the gaze-contingent maps, refixations appear to be more widely distributed across the scene. Further investigation into the properties of these modified fixation maps will be the subject of upcoming work.
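The back-projection behind the recurrence heat map can be illustrated with a minimal sketch. It assumes that the distance metric is Euclidean, that recurrence is evaluated separately within each fixation sequence, and that the map is a coarse grid of counts rather than a smoothed heat map. The function names, the cell size, and the fixation data are hypothetical and serve only as an illustration; this is not the implementation used for Fig. 10.

```python
import math
from collections import Counter

def recurrent_fixations(fixations, radius):
    """Return the fixations that have at least one other fixation
    of the same sequence within `radius` (assumed Euclidean)."""
    out = []
    for i, fi in enumerate(fixations):
        if any(i != j and math.dist(fi, fj) <= radius
               for j, fj in enumerate(fixations)):
            out.append(fi)
    return out

def recurrence_map(sequences, radius, cell=50):
    """Count recurrent fixations per grid cell, summed over sequences.
    `cell` is a hypothetical bin width in pixels."""
    counts = Counter()
    for seq in sequences:
        for x, y in recurrent_fixations(seq, radius):
            counts[(int(x // cell), int(y // cell))] += 1
    return counts

# Hypothetical fixation sequences (pixels) of two observers on one image.
observer_1 = [(120, 80), (400, 300), (125, 90), (700, 500)]
observer_2 = [(410, 310), (130, 85), (415, 295), (50, 550)]
heat = recurrence_map([observer_1, observer_2], radius=64)
```

Because recurrence is computed within each sequence, a location visited once by each of two observers does not contribute to this map. A determinism map would additionally require restricting the projection to recurrence points that lie on diagonal lines, which is not shown here.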

Fixation duration

Fixation duration can be an important indicator of processing during fixation (Henderson & Pierce, 2008; Holmqvist et al., 2011, pp. 377ff). On these grounds, the comparison of scan paths using string-edit distance has been criticized because it takes fixation locations, but not fixation durations, into account. To overcome this potential deficiency, Cristino et al. (2010) proposed the “ScanMatch” method for comparing fixation sequences, which takes location, order, and time into account. On the same grounds, one might criticize RQA, but this concern would be unwarranted, as RQA can be generalized to take fixation durations into account. Given a fixation sequence \( f_i \), \( i = 1, \ldots, N \), and the associated vector of fixation durations \( t_i \), \( i = 1, \ldots, N \), one can redefine recurrence \( r_{ij}^t \) as

$$ r_{ij}^t=\left\{ {\begin{array}{*{20}c} {{t_i}+{t_j},} & {d\left( {{f_i},{f_j}} \right)\leqslant \rho } \\ {0,} & {\mathrm{otherwise}} \\ \end{array}} \right., $$
(6)

with the distance metric d and the radius ρ. With the modified recurrence \( r_{ij}^t \), the RQA measures have to be renormalized, as is described in the Appendix.
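As an illustration of Eq. 6, the following sketch builds the duration-weighted recurrence matrix for a short fixation sequence. It assumes a Euclidean distance for d; the function name and the example data are hypothetical. The renormalization of the RQA measures described in the Appendix is not part of the sketch.

```python
import math

def duration_weighted_recurrence(fixations, durations, radius):
    """Return the N x N matrix of Eq. 6 as a list of lists:
    t_i + t_j where d(f_i, f_j) <= radius, and 0 otherwise."""
    if len(fixations) != len(durations):
        raise ValueError("fixations and durations must have the same length")
    n = len(fixations)
    r = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Assumption: d is the Euclidean distance between positions.
            if math.dist(fixations[i], fixations[j]) <= radius:
                r[i][j] = durations[i] + durations[j]
    return r

# Hypothetical data: positions in pixels, durations in milliseconds.
fixations = [(100, 100), (400, 300), (110, 95), (405, 310)]
durations = [200, 350, 180, 220]
r_t = duration_weighted_recurrence(fixations, durations, radius=64)
```

In this example the first and third fixations fall within the radius of each other, so the corresponding entry holds the sum of their durations (200 + 180 ms), whereas pairs of fixations that are far apart keep an entry of 0.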

Cross-recurrence analysis

Although RQA can be useful for comparing the temporal structures of eye movements across experimental contexts, of particular interest is the direct comparison of fixation sequences. This comparison is typically done using the string-edit method (e.g., Underwood et al., 2009). RQA can be extended in order to directly compare fixation sequences in a method known as cross-recurrence quantification analysis (CRQA; e.g., Richardson & Dale, 2005). Richardson and Dale used a method very similar to the fixed-grid method described above; however, the adaptive method can be used for CRQA by simply comparing one sequence of fixations to another (instead of a sequence to itself, as in RQA). Given two fixation sequences \( f_i \) and \( g_i \), \( i = 1, \ldots, N \), we define the cross-recurrence

$$ {r_{ij }}=\left\{ {\begin{array}{*{20}c} {1,} & {d\left( {{f_i},{g_j}} \right)\leqslant \rho } \\ {0,} & {\mathrm{otherwise}} \\ \end{array}} \right.. $$
(7)

In this case, recurrence occurs when two fixations from different sequences land within a given radius of each other. These fixation sequences could be from the same participant viewing two different images, or from different participants viewing the same image (or from any other combination of fixation sequences). The CRQA method is discussed in greater detail in Richardson and Dale (2005). The only addition to that method here is to use the radius to indicate when fixations are recurrent, rather than a fixed grid.
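The cross-recurrence of Eq. 7 can be sketched in a few lines. As before, the Euclidean distance is an assumption, and the function name and the two example sequences are hypothetical.

```python
import math

def cross_recurrence(f, g, radius):
    """Return the matrix of Eq. 7: entry (i, j) is 1 if
    d(f_i, g_j) <= radius and 0 otherwise."""
    # Assumption: d is the Euclidean distance between positions.
    return [[1 if math.dist(fi, gj) <= radius else 0 for gj in g]
            for fi in f]

# Hypothetical sequences (pixels), e.g., two observers viewing one image.
f = [(100, 100), (300, 200), (500, 400)]
g = [(310, 205), (95, 110), (800, 600)]
r = cross_recurrence(f, g, radius=64)
n_cross_recurrences = sum(map(sum, r))
```

Unlike the recurrence matrix of a single sequence, this matrix is in general not symmetric and has no trivially filled main diagonal, so both triangles carry information.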

Independence of RQA measures

We have selected only a small number of RQA measures from the large number of measures found in the literature (e.g., Marwan & Kurths, 2002; Webber & Zbilut, 2005), namely, those that have a straightforward and simple interpretation in terms of fixation sequences. Other measures may, however, capture further characteristics of fixation sequences. It should also be pointed out that the selected measures (recurrence, determinism, laminarity, and corm) are not necessarily independent. For example, we found a significant negative correlation between determinism and corm. This is due to the fact that one cannot have a high corm value (recurrences that tend to occur widely separated in time, as illustrated in the upper left corner of the recurrence matrix in Fig. 6) and, at the same time, a large overall number of recurrences. One potentially fruitful line of investigation will be to examine the viewing factors, such as stimulus type, task, and range of view, that do and do not modulate the correlations between measures.

Potential applications and future directions

RQA provides a rich source of information about eye movement behavior. The potential applications of this method are many and varied. RQA can be used as a method for examining the characteristics of inhibition of return (Klein & MacInnes, 1999; Smith & Henderson, 2011). For example, some current theories of inhibition of return in visual search predict that few refixations would occur near the line of incidence, as one proposed mechanism of inhibition of return is to prevent the visual system from repeatedly sampling nearby locations (e.g., Klein & MacInnes, 1999). This has been corroborated by work demonstrating that when given a choice between inspecting a new or old location of a search display, participants prefer to inspect the new location, particularly when the old location was recently visited (McCarley, Wang, Kramer, Irwin, & Peterson, 2003). The center of recurrence mass may be a particularly useful measure for this work. In addition, one can measure the decay of inhibition of return by counting how many fixations, on average, occur before a refixation.
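One possible way of counting how many fixations occur before a refixation is sketched below. It assumes a Euclidean distance and defines the lag of a recurrent pair of fixations i < j as j − i, so that a lag of 1 denotes an immediate refixation. This is only one reading of the proposal; the function name and the data are hypothetical, and the resulting mean lag is a raw count in fixations, not the normalized corm measure.

```python
import math
from collections import Counter

def refixation_lags(fixations, radius):
    """Return the lag j - i for every recurrent pair of fixations i < j."""
    lags = []
    n = len(fixations)
    for i in range(n):
        for j in range(i + 1, n):
            # Assumption: d is the Euclidean distance between positions.
            if math.dist(fixations[i], fixations[j]) <= radius:
                lags.append(j - i)
    return lags

# Hypothetical fixation sequence (pixels).
fixations = [(100, 100), (400, 300), (105, 110),
             (410, 290), (700, 500), (95, 105)]
lags = refixation_lags(fixations, radius=64)
lag_histogram = Counter(lags)  # number of refixations at each lag
mean_lag = sum(lags) / len(lags) if lags else None
```

A scarcity of short lags in such a histogram, relative to random fixation sequences, would be in line with the predictions of inhibition of return outlined above.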

RQA may also be useful for theories of visual attention concerned with the idiosyncratic characteristics of eye movement behavior—for example, those that posit the interplay between periods of local scanning, followed by large-scale relocations to new areas of a scene (Pannasch, Helmert, Roth, Herbold, & Walter, 2008; Tatler & Vincent, 2008). Laminarity and determinism are good candidates for revealing these patterns; however, laminar and deterministic recurrences can occur on their own (a few fixations repeated in isolation) or in large clusters (many fixations in and around the same area of a scene). To dissociate these, RQA plots can be quantified using cluster analysis (e.g., Schaeffer, 2007; von Luxburg, 2007). A cluster in a recurrence plot indicates that fixations are occurring close together in time and space. Clusters away from the line of incidence indicate areas of the scene that have been reinspected in detail. Cluster analysis can be used to quantify the number of clusters in a given sequence, the average size of the clusters (i.e., how many fixations are involved in a given instance), and the distance of the cluster from the line of incidence.
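A very simple stand-in for such a cluster analysis is sketched below. Instead of the graph-clustering methods cited above, it groups the recurrence points of the upper triangle into connected components (points that touch horizontally, vertically, or diagonally) and reports, for each component, its size and its mean distance from the line of incidence. The connected-component criterion, the Euclidean distance, the function names, and the data are all assumptions made for the sake of illustration.

```python
import math

def recurrence_points(fixations, radius):
    """Recurrence points (i, j) with i < j for one fixation sequence."""
    n = len(fixations)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if math.dist(fixations[i], fixations[j]) <= radius}

def find_clusters(points):
    """Group recurrence points into 8-connected components."""
    remaining = set(points)
    found = []
    while remaining:
        stack = [remaining.pop()]
        component = []
        while stack:
            i, j = stack.pop()
            component.append((i, j))
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    nb = (i + di, j + dj)
                    if nb in remaining:
                        remaining.remove(nb)
                        stack.append(nb)
        found.append(sorted(component))
    return sorted(found)

def summarize(component):
    """Return (size, mean distance from the line of incidence)."""
    size = len(component)
    return size, sum(j - i for i, j in component) / size

# Hypothetical sequence: a three-fixation scan path that is repeated,
# followed by a single isolated refixation.
a, b, c, d, e = (100, 100), (300, 100), (500, 100), (300, 500), (700, 500)
fixations = [a, b, c, a, b, c, d, e, d]
clusters = find_clusters(recurrence_points(fixations, radius=64))
summaries = [summarize(comp) for comp in clusters]
```

In this example the repeated scan path forms one cluster of three recurrence points at a distance of three fixations from the line of incidence, and the isolated refixation forms a second cluster of size one.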

The RQA measures allow for generalizations of the temporal structure of eye movements across observers and stimuli. As such, they can be used to index many aspects of the relationship between visual attention, scene understanding, and cognition. For example, multiple deterministic points across people in a social scene may reflect the observers’ understanding of the particular social situation portrayed (cf. Birmingham, Bischof, & Kingstone, 2008a, b). Percentages of recurrence in a scene may reflect the exploratory behavior of a particular observer (cf. Risko et al., 2012) or impoverished memory of particular scene features (Hollingworth & Henderson, 2002). RQA is a new approach to understanding the temporal aspects of eye movement behavior. Used in conjunction with traditional measures, it can provide a more thorough understanding of how people explore their environment.

Conclusions

We have demonstrated that recurrence quantification analysis can be used as a robust and general analytic tool to quantify the temporal dynamics of fixation sequences. We showed that several recurrence quantification measures can readily be applied to fixation behavior to quantify both fine and global aspects of the temporal structure of eye movements. Furthermore, we demonstrated the use of this analysis in the comparison of eye movement behavior between natural and gaze-contingent viewing. We believe that RQA is a promising tool for future research into the temporal characteristics of eye movement behavior and that it can reveal interesting dependencies between temporal and spatial influences on visual attention.