Elsevier

Physics Letters A

Volume 346, Issues 1–3, 10 October 2005, Pages 47-53

Filling gaps in chaotic time series

https://doi.org/10.1016/j.physleta.2005.07.076

Abstract

We propose a method for filling arbitrarily wide gaps in deterministic time series. Crucial to the method is the ability to apply Takens' theorem in order to reconstruct the dynamics underlying the time series. We introduce a functional to evaluate the degree of compatibility of a filling sequence of data with the reconstructed dynamics. An algorithm for finding highly compatible filling sequences with a reasonable computational effort is then discussed.

Introduction

One problem faced by many practitioners in the applied sciences is the presence of gaps (i.e. sequences of missing data) in observed time series, which makes any analysis hard or impossible. The problem is routinely solved by interpolation if the gap is very short, but it becomes formidable if the gap width exceeds some time scale characterizing the predictability of the time series.

If the physical system under study is described by a small set of coupled ordinary differential equations, then a theorem by Takens [1], [2] suggests that from a single time series it is possible to build a mathematical model whose dynamics is diffeomorphic to that of the system under examination. In this Letter we leverage the dynamic reconstruction theorem of Takens for filling an arbitrarily wide gap in a time series.

It is important to stress that the goal of the method is not that of recovering a good approximation to the lost data. Sensitive dependence on initial conditions, and imperfections of the reconstructed dynamics, make this goal a practical impossibility, except for some special cases, such as small gap width, or periodic dynamics. We rather aim at giving one or more surrogate data which can be considered compatible with the observed dynamics, in a sense which will be made rigorous in the following.

We shall assume that an observable quantity $s$ is a function of the state of a continuous-time, low-dimensional dynamical system, whose time evolution is confined to a strange attractor (that is, we explicitly discard transient behavior). Both the explicit form of the equations governing the dynamical system and the function which links its state to the signal $s(t)$ may be unknown. We also assume that an instrument samples $s(t)$ at regular intervals of length $\Delta t$, yielding an ordered set of $\bar{N}$ data $s_i = s((i-1)\Delta t)$, $i = 1, \ldots, \bar{N}$. If, for any reason, the instrument is unable to record the value of $s$ at a number of sampling times, there will be some invalid entries in the time series $\{s_i\}$, for some values of the index $i$.

From the time series $\{s_i\}$ we reconstruct the underlying dynamics with the technique of delay coordinates. That is, we invoke Takens' theorem [1], [2] and claim that the $m$-dimensional vectors $x_i = (s_i, s_{i+\tau}, \ldots, s_{i+(m-1)\tau})$ lie on a curve in $\mathbb{R}^m$ which is diffeomorphic to the curve followed in its (unknown) phase space by the state of the dynamical system which originated the signal $s(t)$. Here $\tau$ is a positive integer, and $i$ now runs only up to $N = \bar{N} - (m-1)\tau$. Several pitfalls have to be taken into account in choosing the most appropriate values for $m$ and $\tau$. Strong constraints also come from the length of the time series, compared to the characteristic time scales of the dynamical system, and from the amount of instrumental noise which affects the data. We shall not review these issues here, but refer the reader to Refs. [3], [4], [5].

We note that gaps (that is, invalid entries) in the time series $\{s_i\}$ do not prevent a successful reconstruction of a set $R = \{x_i\}$ of state vectors, unless the total width of the gaps is comparable with $\bar{N}$. We simply mark as “missing” any reconstructed vector $x_i$ whose components are not all valid entries. If the gap in the signal $s$ spans more than $(m-1)\tau$ data points, then it will be mapped into a contiguous gap in the sequence of reconstructed vectors.
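As an illustration, the reconstruction with gap marking described above can be sketched in Python. This is a minimal sketch and not the authors' code: the function name delay_embed, the use of NaN for invalid samples and of None for missing vectors, the toy sine series, and the values m = 3, tau = 2 are all assumptions made for the example. Indices are 0-based in the code, whereas the text counts from 1.

```python
import math

def delay_embed(s, m, tau):
    """Build delay vectors x_i = (s_i, s_{i+tau}, ..., s_{i+(m-1)tau}).

    Invalid samples are assumed to be stored as NaN. The result has
    N = len(s) - (m-1)*tau entries; an entry is None ("missing") when
    at least one of its components is invalid.
    """
    n_vectors = len(s) - (m - 1) * tau
    vectors = []
    for i in range(n_vectors):
        x = tuple(s[i + k * tau] for k in range(m))
        vectors.append(None if any(math.isnan(c) for c in x) else x)
    return vectors

# Toy series (hypothetical data): 20 samples with a gap of 5 invalid entries.
series = [math.sin(0.3 * i) for i in range(20)]
for i in range(8, 13):
    series[i] = math.nan

vectors = delay_embed(series, m=3, tau=2)
missing = [i for i, x in enumerate(vectors) if x is None]
print(len(vectors), missing)
```

In this toy case the gap is wider than (m-1)*tau = 4 samples, and the missing vectors form a single contiguous block, in line with the remark above.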

If the valid vectors of $R$ sample the underlying strange attractor embedded in $\mathbb{R}^m$ well enough, one may hope to find, by means of a suitable interpolation technique, a vector field $F: U \to \mathbb{R}^m$ such that, within an open set $U$ of $\mathbb{R}^m$ containing all the vectors $x_i$, the observed dynamics can be approximated by $\dot{x} = F(x)$ (Eq. (2)). This very idea underlies several forecasting schemes, where one takes the last observed vector $x_N$ as the initial condition for Eq. (2) and integrates it forward in time (see, e.g., [7], [8]).
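To make the idea of a data-driven vector field concrete, the following Python sketch estimates F at a point as the mean finite-difference velocity of the nearest valid reconstructed vectors, and uses it for one forward Euler forecasting step. The interpolation technique is not specified in this part of the Letter, so the nearest-neighbour average is only a simple stand-in; the names estimate_field and euler_step, the parameter k, and the toy rotation data are assumptions of the example.

```python
import math

def estimate_field(x, vectors, dt, k=4):
    """Approximate F(x) by averaging (x_{i+1} - x_i)/dt over the k valid
    vectors x_i closest to x. Pairs touching a gap (None) are skipped."""
    candidates = []
    for i in range(len(vectors) - 1):
        a, b = vectors[i], vectors[i + 1]
        if a is None or b is None:
            continue
        velocity = tuple((bj - aj) / dt for aj, bj in zip(a, b))
        candidates.append((math.dist(x, a), velocity))
    if not candidates:
        raise ValueError("no valid consecutive pair of vectors")
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    return tuple(sum(v[j] for _, v in nearest) / len(nearest)
                 for j in range(len(x)))

def euler_step(x, vectors, dt, k=4):
    """One forward Euler step of x' = F(x) with the estimated field."""
    f = estimate_field(x, vectors, dt, k)
    return tuple(xj + dt * fj for xj, fj in zip(x, f))

# Toy data (hypothetical): uniform rotation, for which F(x, y) = (-y, x).
dt = 0.01
vectors = [(math.cos(i * dt), math.sin(i * dt)) for i in range(700)]
for i in range(100, 200):
    vectors[i] = None  # an artificial gap in the reconstructed vectors

field_at_start = estimate_field((1.0, 0.0), vectors, dt)
forecast = euler_step(vectors[-1], vectors, dt)
print(field_at_start, forecast)
```

For a chaotic signal such a forecast degrades quickly with the number of steps, which is the predictability limit mentioned in the text.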

The gap-filling problem was framed in terms of forecasts by Serre et al. [9]. Their method, which amounts to a special form of the shooting algorithm for boundary value problems, is limited by the predictability properties of the dynamics, and cannot fill gaps of arbitrary width.

The rest of this Letter is organized as follows: in Section 2 we cast the problem as a variational one, where a functional measures how well a candidate filling trajectory agrees with the vector field defining the observed dynamics. Then an algorithm is proposed for finding a filling trajectory. In Section 3 we give an example of what can be obtained with this method. Finally, we discuss the algorithm and offer some speculations on future work in Section 4.

Section snippets

A variational approach

All the difficulties of gap-filling stem from the following constraint: the interpolating curve, which should be as close as possible to a solution of Eq. (2), must start at the last valid vector before the gap and reach the first valid vector after the gap in a prescribed time $T$.

To properly satisfy this constraint, we propose to frame the problem of filling gaps as a variational one. We are looking for a differentiable vector function $\xi: [0,T] \to U$ which minimizes the functional $J(\xi)$
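The explicit form of the functional is not reproduced in this excerpt. As a hedged illustration only, the Python sketch below assumes a squared mismatch between the tangent of the candidate curve and the vector field, integrated over the gap and discretized with midpoint evaluation. The function name mismatch_functional, the discretization, and the toy rotation field are assumptions of the example, not the Letter's definition.

```python
import math

def mismatch_functional(path, field, dt):
    """Assumed discrete stand-in for J: the sum over segments of
    |(xi_{k+1} - xi_k)/dt - F(midpoint)|^2 * dt."""
    total = 0.0
    for a, b in zip(path, path[1:]):
        tangent = [(bj - aj) / dt for aj, bj in zip(a, b)]
        midpoint = [(aj + bj) / 2 for aj, bj in zip(a, b)]
        f = field(midpoint)
        total += sum((tj - fj) ** 2 for tj, fj in zip(tangent, f)) * dt
    return total

def rotation_field(p):
    """Toy vector field (hypothetical): uniform rotation in the plane."""
    return (-p[1], p[0])

# Two candidate fillings joining (1, 0) to (0, 1) in the same time T.
n = 50
T = math.pi / 2
dt = T / n
arc = [(math.cos(k * dt), math.sin(k * dt)) for k in range(n + 1)]
line = [(1 - k / n, k / n) for k in range(n + 1)]

j_arc = mismatch_functional(arc, rotation_field, dt)
j_line = mismatch_functional(line, rotation_field, dt)
print(j_arc, j_line)
```

Both candidates meet the boundary constraint, but only the arc follows the field, so its value of the assumed functional is smaller by many orders of magnitude than that of the straight line.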

An example

In this section we show how the algorithm described above performs on a time series generated by a chaotic attractor. We numerically integrate the Lorenz equations [10] with the usual parameters ($\sigma = 10$, $r = 28$, $b = 8/3$). We sample the $x$-variable of the equations with an interval $\Delta t = 0.02$, collecting 5000 consecutive data points, which form our time series. One thousand consecutive data points are then marked as “not-valid”, thus inserting in the time series a gap with a width of 1/5th of the series
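A test series of this kind can be generated along the following lines. The parameters, the sampling interval, the series length, and the gap width are those quoted above; the fourth-order Runge-Kutta integrator, the number of substeps, the initial condition, the spin-up used to discard the transient, and the position of the gap are assumptions of this sketch, since the excerpt does not state them.

```python
import math

SIGMA, R, B = 10.0, 28.0, 8.0 / 3.0  # Lorenz parameters quoted in the text

def lorenz(state):
    x, y, z = state
    return (SIGMA * (y - x), x * (R - z) - y, x * y - B * z)

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = lorenz(state)
    k2 = lorenz(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = lorenz(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = lorenz(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def sample_x(n_samples, dt=0.02, substeps=4, spinup=1000,
             start=(1.0, 1.0, 1.0)):
    """Sample the x-variable every dt after discarding a transient.
    substeps, spinup and start are assumed values."""
    state = start
    h = dt / substeps
    for _ in range(spinup * substeps):
        state = rk4_step(state, h)
    series = []
    for _ in range(n_samples):
        series.append(state[0])
        for _step in range(substeps):
            state = rk4_step(state, h)
    return series

series = sample_x(5000)
gap_start, gap_width = 2000, 1000  # gap position is an assumption
gapped = [math.nan if gap_start <= i < gap_start + gap_width else v
          for i, v in enumerate(series)]
print(sum(math.isnan(v) for v in gapped), "invalid samples out of", len(gapped))
```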

Discussion and conclusions

In this Letter we have described an algorithm which fills an arbitrarily wide gap in a time series, provided that the dynamic reconstruction method of Takens is applicable. The goal is to provide a filling signal which is consistent with the observed dynamics, in the sense that, in the reconstructed phase space, the vector tangent to the filling curve should be close to the vector field modeling the observed dynamics. This requirement is cast as a variational problem, defined by the functional (3).

Acknowledgements

This work has been supported by fondo convezione strana of the Department of Mathematics of the University of Lecce. We are grateful to Prof. Carlo Sempi and to Dr. Fabio Paronetto for valuable comments.

References (14)

  • M. Casdagli et al., Physica D (1991)
  • A. Provenzale et al., Physica D (1992)
  • M. Casdagli, Physica D (1989)
  • F. Paparella et al., Phys. Lett. A (1997)
  • F. Takens
  • T. Sauer et al., J. Stat. Phys. (1991)
  • J. Theiler, Phys. Rev. A (1986)
There are more references available in the full text version of this article.

Cited by (9)

  • Statistical properties and time-frequency analysis of temperature, salinity and turbidity measured by the MAREL Carnot station in the coastal waters of Boulogne-sur-Mer (France)

    2016, Journal of Marine Systems
    Citation Excerpt :

    The Blackman–Tukey method, however, requires evenly-spaced data. Therefore, we have interpolated the turbidity time series with 13% of missing data in order to generate the power spectra in Fig. 10, as done in some studies (Ibanez and Conversi, 2002; Paparella, 2005). Nevertheless, interpolation introduces numerous artifacts to the data, both in the time and the frequency domain.

  • Feature-preserving interpolation and filtering of environmental time series

    2015, Environmental Modelling and Software
    Citation Excerpt :

    Recently, copula-based methods have been shown to outperform kriging for gap-filling problems (Bárdossy and Pegram, 2014). In another vein, Paparella (2005) formulates the gap-filling problem as an optimization that starts with a stitching of pieces from the observed signal. In recent years, a family of non-parametric methods has emerged in geostatistics that are based on the recognition that parametric models may be poorly adapted to represent complex phenomena.

  • Estimation of connectivity measures in gappy time series

    2015, Physica A: Statistical Mechanics and its Applications