Construction of multivariate surrogate sets from nonlinear data using the wavelet transform
Introduction
The nonlinear properties of biophysical systems such as the brain have motivated the nonlinear characterization of time series data in many studies. However, before characterizing the nonlinear properties of time series data, it is important to first determine that the data do, in fact, contain nonlinear structure. It is otherwise possible that the values of ‘nonlinear invariants’ will merely reflect practical limitations of the data such as a finite sample length [1], [2], experimental filtering [3] or other linear properties such as the power spectra [4]. A data set can be said to be nonlinear if it meets two criteria. First, the data must permit rejection of the null hypothesis that they are purely linear, or have a linear origin but have been distorted by a static nonlinear measurement effect. Second, alternative factors that may allow rejection of this null hypothesis, such as nonstationary stochastic processes, must be excluded [5], [6]. Only if these conditions have been met is it then reasonable to use nonlinear methods to further characterize the properties of the time series.
A simple, yet powerful method of testing such a null hypothesis is to resample the original data in such a way that the linear properties of the data are preserved, but any nonlinear structure is removed. The values of a nonlinear measure calculated from an ensemble of such “surrogate data” represent the expected distribution of that measure under the null hypothesis. A straightforward statistical comparison between this null distribution and the value derived from the experimental data then permits formal testing of the null hypothesis. This “bootstrap” technique of generating a null distribution has a well-established role in statistics. It was introduced into the dynamical systems context by Pijn et al. [7] and Theiler et al. [8], where it has had a significant impact (for an overview see [9]). For example, prior to surrogate data testing, several studies had concluded that scalp EEG data from healthy human subjects were chaotic. Reappraisal of these data using surrogate techniques has typically shown that nonlinearity does occur, but only weakly and/or infrequently [10], [11], [12], [13].
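The comparison between the experimental value and the surrogate ensemble can be sketched as a rank-based test. The following is a minimal illustration under stated assumptions, not the procedure used in this paper: the function names `surrogate_test` and `time_asymmetry`, the choice of a one-sided rank test, the default of 99 surrogates, and the time-reversal asymmetry statistic are all hypothetical choices made for the example. The surrogate generator is passed in as a callable so that any resampling scheme can be substituted.

```python
import numpy as np

def time_asymmetry(x):
    """Illustrative nonlinear measure (an assumption, not the paper's choice):
    magnitude of the mean cubed increment, which tends to zero for
    time-reversible (e.g. linear Gaussian) processes."""
    x = np.asarray(x, dtype=float)
    return abs(np.mean(np.diff(x) ** 3))

def surrogate_test(data, statistic, make_surrogate, n_surrogates=99, seed=0):
    """One-sided rank test of `statistic(data)` against a surrogate ensemble.

    `make_surrogate(data, rng)` must return one resampled realization.
    Returns the observed value, the null sample, and the p-value.
    """
    rng = np.random.default_rng(seed)
    observed = statistic(data)
    null = np.array([statistic(make_surrogate(data, rng))
                     for _ in range(n_surrogates)])
    # Count the observed value as one member of the ensemble so that
    # the smallest attainable p-value is 1 / (n_surrogates + 1).
    p_value = (1 + np.sum(null >= observed)) / (n_surrogates + 1)
    return observed, null, p_value

# Usage with a placeholder generator: a plain shuffle tests only the weaker
# null of temporally uncorrelated data and does NOT preserve linear structure.
x = np.tile(np.arange(10.0), 20)          # sawtooth: slow rise, sudden drop
observed, null, p_value = surrogate_test(
    x, time_asymmetry, lambda d, rng: rng.permutation(d))
```

With 99 surrogates, the smallest attainable p-value is 0.01, so the ensemble size bounds the significance level that can be reported.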
The simplest method of data resampling is to randomly permute the temporal order of the data (without replacement). However, unless strongly constrained [14], such a process typically destroys linear as well as nonlinear correlations. A solution to this problem, and the most widely implemented technique for generating surrogate data, is to resample the data in the Fourier domain [8]. That is, the data are Fourier transformed and the phase of each frequency is rotated by an independent random number p∈(0,2π). The inverse Fourier-transformed data have, on average, the same spectral properties as the original data, but with any nonlinear structure removed. Such a procedure is easy to implement and has minimal computational demands, particularly if the sample length is a power of 2. There are, however, important caveats to this approach, including the effect on the amplitude distribution of the data and the extension to spatially extended data sets, which to our knowledge has not yet been illustrated.
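Phase randomization as described above can be sketched in a few lines. This is a minimal illustration assuming a real-valued, univariate series and NumPy's real FFT routines; the function name `phase_randomized_surrogate` is hypothetical. The zero-frequency term (and, for even lengths, the Nyquist term) is left unrotated so that the inverse transform remains a valid real signal with the same mean.

```python
import numpy as np

def phase_randomized_surrogate(x, rng=None):
    """Return one Fourier-domain surrogate of a real 1-D series.

    Each frequency component is rotated by an independent random phase
    drawn uniformly from [0, 2*pi); the amplitude spectrum of the sample
    is left unchanged.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.size)
    phases[0] = 0.0            # keep the mean (zero-frequency term) unchanged
    if n % 2 == 0:
        phases[-1] = 0.0       # Nyquist term must stay real for a real signal
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)

# Usage: the surrogate shares the amplitude spectrum of the original series.
rng = np.random.default_rng(0)
original = np.cumsum(rng.standard_normal(256))   # illustrative test signal
surrogate = phase_randomized_surrogate(original, rng)
```

Because only the phases are altered, the surrogate's amplitude distribution is not constrained to match that of the original, which is one of the caveats noted above.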
Wavelets are a relatively novel signal analysis tool that has already had important applications in engineering, biology and physics. They are ideally suited to signals with transient temporal properties, multiscale structure and spatial extension. Wavelets hence show great potential in the study of nonlinear and biological systems. For example, Guan et al. [15] found that just three wavelet modes were able to capture the rich dynamics of a complex nonlinear spatio-temporal system, whereas more than 20 Fourier modes were required. Wavelets represent a simple method of capturing the correlations between scales exhibited by coupled chaotic oscillators [16]. Wavelets are able to efficiently capture neural signals isolated in time and space, such as epileptiform discharges in scalp EEG [17], [18] and isolated areas of transient activation in intracranial functional magnetic resonance imaging (fMRI) [19], [20], [21].
The parsimonious representation of complex spatio-temporal and nonlinear data by wavelet decomposition suggests that a wavelet-based surrogate algorithm may have many important applications in the study of complex biophysical systems. In this paper, a wavelet-based method of constructing surrogate data for nonlinear hypothesis testing is presented and applied to both numeric and experimental test data. Different methods of resampling the data in the ‘wavelet domain’ are compared. It is shown how the technique can be easily extended to multivariate data sets. A consideration of the effect of wavelet resampling on the amplitude distribution of the original data is presented. This motivates a practical approach to constructing surrogate data under the constraints imposed by the real experimental setting.
Section snippets
Wavelet resampling
In this section, a brief technical review of a wavelet decomposition of a time series is given. The ‘decorrelating’ properties of the wavelet transform are discussed, and the technique of resampling in the wavelet domain is described.
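The general idea of resampling in the wavelet domain can be sketched as follows. This is an illustration under explicit assumptions rather than the algorithm of this paper: it uses a hand-written orthonormal Haar transform, requires a sample length that is a power of 2, and resamples by randomly permuting the detail coefficients within each scale. The wavelet family, the resampling rule, and the function names `haar_dwt`, `haar_idwt` and `wavelet_surrogate` are all hypothetical choices made for the example.

```python
import numpy as np

def haar_dwt(x):
    """Full orthonormal Haar decomposition of a power-of-2 length series.
    Returns (approx, details) with details ordered finest scale first."""
    approx = np.asarray(x, dtype=float)
    details = []
    while approx.size > 1:
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return approx, details

def haar_idwt(approx, details):
    """Invert haar_dwt, rebuilding from the coarsest scale to the finest."""
    approx = np.asarray(approx, dtype=float)
    for d in reversed(details):
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + d) / np.sqrt(2.0)
        out[1::2] = (approx - d) / np.sqrt(2.0)
        approx = out
    return approx

def wavelet_surrogate(x, rng=None):
    """One surrogate: permute detail coefficients within each scale
    (an assumed resampling rule), then invert the transform."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    if x.size < 2 or x.size & (x.size - 1):
        raise ValueError("length must be a power of 2")
    approx, details = haar_dwt(x)
    shuffled = [rng.permutation(d) for d in details]
    return haar_idwt(approx, shuffled)

# Usage on an illustrative signal of length 256.
rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, 8 * np.pi, 256)) + 0.1 * rng.standard_normal(256)
surrogate = wavelet_surrogate(signal, rng)
```

Because the transform is orthonormal and coefficients are only reordered within a scale, the sketch preserves the mean of the series and the energy at each scale, while the temporal placement of features at each scale is randomized.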
Application to numerical and experimental data
In the following sections the above concepts are applied to time series data obtained from (1) a coupled Rössler dynamical system, (2) a static nonlinear transformation of a colored noise source, and (3) human scalp EEG data. The effects of different types of wavelet resampling are illustrated. The results and computational demands are compared to the phase randomization approach.
Extension to multivariate data
Many complex biophysical systems—such as the nervous system—are characterized by sparsely interconnected local nonlinear subsystems. Since the initial observations of synchronization in chaotic systems [32] there has been much interest in the role of complex, nonlinear interdependence between different subsystems within such a network (e.g. [13], [33]). This can be studied by examining multivariate time series data sets recorded from different regions of the system. Research in this area has
Effect of wavelet resampling on amplitude distribution of time series
A significant problem that is known to be associated with the phase randomization technique is that the amplitude distribution of the surrogate data is, on average, Gaussian. Although stochastic signals, such as produced by Eq. (16), have Gaussian amplitude distributions, it is possible that the process of recording the data introduces a static nonlinear distortion of the signal, as modelled by Eq. (17). A nonlinear measurement effect is thought to be present in the case of fMRI of blood flow
Effect of wavelet resampling on nonlinear structure of time series
Just as the preservation of linear structure and the potential for multiple distinct realizations are essential characteristics of any surrogate technique, so too is the removal of any nonlinear structure that may be present. Whereas the first two properties are essential to minimize the rate of false positive rejections of the null hypothesis, removal of nonlinear structure is essential to limit the rate of false negatives. That is, failure to properly destroy any nonlinear structure in the
Computation demands of wavelet resampling
As discussed above, in the real experimental setting, it is often desirable to test multiple data sets for nonlinearity. To ensure that the rate of false positive rejections of the null hypothesis is minimized, it may be necessary to construct thousands of surrogate data sets, particularly if the data sets are drawn from different subjects. Clearly the computational demands of surrogate data construction are important in this setting even if the computation is performed off-line. The
Discussion
The use of surrogate data is a crucial step in the testing of time series data sets for evidence of nonlinear structure or independence. In this paper, we present a method of generating surrogate data based on the wavelet transform. This method was recently developed for application in other biophysical contexts [26] and is presented here in relation to nonlinear hypothesis testing for the first time. Resampling of the data in the wavelet domain is shown to have all the desired properties of a
Acknowledgements
The authors wish to thank S. Knock and P. Drysdale for helpful discussions and technical assistance. MB acknowledges the support of an NSW Institute of Psychiatry Research Fellowship and a University of Sydney SESQUI post-doctoral fellowship.
References (41)
- et al. Finite correlation dimension for stochastic systems with power-law spectra. Physica D (1989).
- et al. Detection and description of nonlinear interdependence in normal multichannel human EEG. Clin. Neurophysiol. (2002).
- et al. Chaos or noise in EEG signals; dependence on state and brain site. Electroenceph. Clin. Neurophysiol. (1991).
- et al. Testing for nonlinearity: the method of surrogate data. Physica D (1992).
- et al. Surrogate time series. Physica D (2000).
- et al. Investigation of nonlinear structure in multichannel EEG. Phys. Lett. A (1995).
- et al. Dynamics of the alpha rhythm: evidence for non-linearity? Clin. Neurophysiol. (1999).
- et al. A wavelet method for the characterization of spatiotemporal patterns. Physica D (2002).
- et al. Wavelet-crosscorrelation analysis can help predict whether bursts of pulse stimulation will terminate afterdischarges. Clin. Neurophysiol. (2002).
- et al. Topographic organisation of nonlinear interdependence in multichannel human EEG. Neuroimage (2002).