Forecasting stochastic processes using singular spectrum analysis: Aspects of the theory and application
Introduction
Singular spectrum analysis (SSA) is a nonparametric technique that is designed for use in signal extraction and the prediction of irregular time series that may exhibit non-stationary and nonlinear properties, as well as intermittent or transient behaviour. The development of SSA is often attributed to researchers working in the physical sciences, namely Broomhead and King (1986), Vautard and Ghil (1989) and Vautard, Yiou, and Ghil (1992), although many of the basic building blocks were outlined by Basilevsky and Hum (1979) in a socioeconomic setting, and an early formulation of some of the key ideas can be found in the work of Prony (1795). An introduction to SSA is presented by Elsner and Tsonis (1996), and a more detailed examination of the methodology, with an emphasis on the algebraic structure and algorithms, is provided by Golyandina, Nekrutkin, and Zhigljavsky (2001).
The application of SSA to forecasting has gained popularity over recent years (see for example Hassani, Heravi, & Zhigljavsky, 2009, Hassani, Soofi, & Zhigljavsky, 2010, Hassani & Zhigljavsky, 2009 and Thomakos, Wang, & Wille, 2002, for applications in business and economics), and the general finding appears to be that SSA performs well. These studies have examined SSA forecasts by investigating real world applications and comparing the performance of SSA with that of other benchmarks, such as ARIMA models and Holt–Winters procedures. However, with real world data the true data generating mechanism is not known, and making a comparison with such benchmarks does not convey the full picture: knowing that SSA outperforms a benchmark serves only to show that the benchmark is suboptimal, and therefore does not provide an appropriate baseline.
In this paper, our purpose is to provide what we believe to be the first theoretical analysis of the forecasting performance of SSA under appropriate regularity conditions concerning the true data generating mechanism. We present a formulation of the SSA mean squared forecast error (MSFE) for a general class of processes. The usefulness of such formulae lies not only in the fact that they provide a neat mathematical characterization of the SSA forecast error, but also in the fact that they allow a comparison to be made between SSA and the optimal mean squared error solution for a known random process. The minimum mean squared error (MMSE) predictor obviously provides a (gold) standard against which all other procedures can be measured.
Irrespective of the actual structure of the observed process, SSA forecasts are obtained by calculating a linear recurrence formula (LRF) that is used to construct a prediction of the future value(s) of the realized time series. Given a univariate time series of length N, the coefficients of the LRF are computed from a spectral decomposition of an m × n dimensional Hankel matrix known as the trajectory matrix. The dimension m is called the window length, and n = N − m + 1 is referred to as the window width. The Gramian of the trajectory matrix is constructed for a known window length, and the eigenvalue decomposition of the Gramian is evaluated. This is then used to decompose the observed series into a signal component, constructed from the first k eigentriples of the trajectory matrix (the leading left and right singular vectors and their associated singular values), and a residual. The resulting signal plus noise decomposition is then employed to produce a forecast via the LRF coefficients. Details are presented in the following section, where we outline the basic structure of the calculations underlying the construction of an SSA model and the associated forecasts.
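The embed–decompose–truncate–forecast recipe just described can be sketched in a few lines of Python. The function below is a minimal illustration of recurrent SSA forecasting under the standard formulation (LRF coefficients built from the leading k left singular vectors and the verticality coefficient); the window length m, signal dimension k, and the series used are illustrative choices, not values taken from the paper.

```python
import numpy as np

def ssa_forecast(y, m, k, h):
    """Minimal recurrent SSA forecast: embed the series in a Hankel trajectory
    matrix, truncate to k eigentriples, diagonally average, then iterate the
    linear recurrence formula (LRF) h steps ahead."""
    N = len(y)
    n = N - m + 1                                        # window width
    X = np.column_stack([y[j:j + m] for j in range(n)])  # m x n trajectory matrix
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Uk = U[:, :k]                                        # leading k left singular vectors

    # Rank-k approximation, then diagonal averaging (Hankelisation)
    Xk = Uk @ (Uk.T @ X)
    signal = np.zeros(N)
    counts = np.zeros(N)
    for j in range(n):
        signal[j:j + m] += Xk[:, j]
        counts[j:j + m] += 1.0
    signal /= counts

    # LRF coefficients: pi holds the last component of each retained
    # eigenvector; nu2 is the so-called verticality coefficient.
    pi = Uk[-1, :]
    nu2 = float(pi @ pi)
    a = (Uk[:-1, :] @ pi) / (1.0 - nu2)  # a[j] multiplies y(t - (m - 1) + j)

    # Recurrent forecasting h steps beyond the reconstructed signal
    ext = list(signal)
    for _ in range(h):
        ext.append(float(np.dot(a, ext[-(m - 1):])))
    return np.array(ext[N:])
```

For a noise-free series of finite rank (e.g. a single exponential, which has rank one) the truncated decomposition is exact and the LRF continues the series without error, which provides a simple sanity check on any implementation.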
Section 3 presents the theoretical MSFE of an SSA model under very broad assumptions. The formulae that we derive indicate how the use of different values of the window length m, a tuning parameter, and the signal dimension k, a modeling parameter, will interact to influence the MSFE obtained from a given model. In Section 4, it is shown that, when appropriate regularity conditions are satisfied, the SSA forecasts constructed in practice, and their associated MSFE estimates, will converge to their theoretical population ensemble counterparts.
Section 5 illustrates the theoretical results obtained in Sections 3 and 4. In forecasting applications, it is common practice to assume implicitly that the fitted model is correct, and therefore that the forecasting formulae derived from the model are appropriate; however, such an assumption rarely holds true. In general, the true data generating process (DGP) is unknown, and the fitted model will, at best, provide only a close approximation to the true DGP. Hence, the expectation is that the forecasting performance of a fitted model will be sub-optimal, and it is natural to ask in what ways and to what extent that performance will fall short. In an attempt to address this question, Section 5 examines the MSFE performances of different SSA models and compares them with those of the optimal MSE predictors for known DGPs.
Section 6 demonstrates the application of SSA forecasting to different real world time series. It shows that SSA forecasts can provide considerable improvements in empirical MSFE performances over the conventional benchmark models that have been used previously to characterize these series. Section 7 presents a brief conclusion.
Section snippets
The mechanics of SSA forecasting
Singular spectrum analysis (SSA) is based on the basic idea that there is an isomorphism between an observed time series and the vector space of Hankel matrices, defined by the mapping that sends the series y(1), …, y(N) to the so called trajectory matrix X = [x(i, j)], with x(i, j) = y(i + j − 1) for i = 1, …, m and j = 1, …, n, where m is a preassigned window length and n = N − m + 1. Let λ(1) ≥ λ(2) ≥ ⋯ ≥ λ(m) denote the eigenvalues of XX′ arranged in descending order…
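The isomorphism can be seen directly by constructing the trajectory matrix for a toy series and checking its Hankel structure, then computing the eigenvalues of the Gramian. The series, window length, and symbols (m, n as in the standard SSA literature) in this sketch are illustrative.

```python
import numpy as np

# Illustrative embedding: a series of length N maps to an m x n Hankel
# (trajectory) matrix with n = N - m + 1; entry (i, j) depends only on i + j.
y = np.sin(0.4 * np.arange(12.0))
N, m = len(y), 4
n = N - m + 1
X = np.column_stack([y[j:j + m] for j in range(n)])

# Eigenvalues of the Gramian XX', arranged in descending order
eigvals = np.linalg.eigvalsh(X @ X.T)[::-1]
```

Because XX′ is symmetric positive semi-definite, its eigenvalues are real and non-negative, and their ordered sequence is exactly the quantity the decomposition step of SSA operates on.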
Theoretical properties of SSA forecasts
Let us assume that the data-generating mechanism of the underlying stochastic process is such that there exists a k for which the m-lagged vectors of the trajectory matrix can be modeled as the sum of a signal term, generated by an m × k coefficient matrix applied to k latent components, and a zero mean stochastic noise process with contemporaneous covariance matrix that is orthogonal to the signal, the latent components being normalized so that their covariance matrix equals the kth order identity. The specification…
Consistent parameter estimation
Before proceeding, let us note that the population ensemble forecasting parameters and MSFE values presented in Lemma 3, Theorem 1 and Proposition 1 will not be available to the practitioner. However, they can be estimated from the data using obvious “plug-in” estimates, in which the population quantities are replaced by their sample counterparts: the eigenvalues and eigenvectors of the population Gramian are estimated from those of the sample Gramian, and these estimates are then substituted into the corresponding forecasting formulae…
Forecasting an AR(1) process
Consider a zero mean AR(1) process y(t) = φy(t − 1) + ε(t), where |φ| < 1 and ε(t) is a white noise process with variance σ². The autocovariance generating function of this process is γ(z) = σ²((1 − φz)(1 − φz⁻¹))⁻¹, and Assumption 1 is satisfied. For an AR(1) process, the optimal MSE forecast of y(t + h) given y(s), s ≤ t, is φ^h y(t), h ≥ 1, with a MSFE of σ²(1 − φ^(2h))/(1 − φ²) for the hth forecast horizon.
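Because the AR(1) benchmark is available in closed form, it can be tabulated directly. The sketch below (with illustrative values φ = 0.8 and σ² = 1, not values taken from the paper) evaluates the MSFE σ²(1 − φ^(2h))/(1 − φ²) of the optimal predictor and shows that it rises monotonically from the innovation variance at h = 1 towards the unconditional variance σ²/(1 − φ²) as the horizon lengthens.

```python
import numpy as np

phi, sigma2 = 0.8, 1.0    # illustrative AR(1) parameters

def ar1_msfe(h):
    """MSFE of the optimal h-step predictor phi**h * y(t) for a zero
    mean AR(1) process with coefficient phi and innovation variance sigma2."""
    return sigma2 * (1 - phi**(2 * h)) / (1 - phi**2)

horizons = np.arange(1, 21)
msfe = np.array([ar1_msfe(h) for h in horizons])

# At h = 1 the MSFE equals the innovation variance sigma2; as h grows it
# approaches the unconditional variance sigma2 / (1 - phi**2).
```

This is the benchmark curve against which the theoretical SSA MSFE of Section 3 can be compared horizon by horizon.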
Empirical applications
When examining real world data sets, the predictive performance of a model can be evaluated only by comparing it to other competing models, as the true DGP is unknown in practice, and therefore the optimal MSE predictor used for the theoretical processes examined in the previous section is no longer available for analysis. We have therefore selected three different time series that have been examined elsewhere in the literature, and used models that have previously been fitted to these data sets…
Concluding remarks
The theoretical examination of SSA forecasting presented above indicates that the loss in MSFE performance of SSA relative to the optimal MSE predictor is both process-specific and variable in nature, and need not be severe. However, when applied to different real world time series, SSA can exhibit considerable improvements in empirical MSFE performances over the conventional benchmark models that have been used to characterize the series previously.
The contrast between the inferior performance…
References
- Broomhead, D. S., & King, G. P. (1986). Extracting qualitative dynamics from experimental data. Physica D: Nonlinear Phenomena.
- Hassani, H., Heravi, S., & Zhigljavsky, A. (2009). Forecasting European industrial production with singular spectrum analysis. International Journal of Forecasting.
- Hassani, H., Soofi, A. S., & Zhigljavsky, A. (2010). Predicting daily exchange rate with singular spectrum analysis. Nonlinear Analysis: Real World Applications.
- Thomakos, D. D., Wang, T., & Wille, L. T. (2002). Modeling daily realized futures volatility with singular spectrum analysis. Physica A: Statistical Mechanics and its Applications.
- Vautard, R., & Ghil, M. (1989). Singular spectrum analysis in nonlinear dynamics, with applications to paleoclimatic time series. Physica D: Nonlinear Phenomena.
- Vautard, R., Yiou, P., & Ghil, M. (1992). Singular-spectrum analysis: A toolkit for short, noisy chaotic signals. Physica D: Nonlinear Phenomena.
- Basilevsky, A., & Hum, D. P. J. (1979). Karhunen–Loève analysis of historical time series with an application to plantation births in Jamaica. Journal of the American Statistical Association.
- Box, G. E. P., & Jenkins, G. M. (1976). Time series analysis: Forecasting and control. Holden-Day.
- Brockwell, P. J., & Davis, R. A. (2002). Introduction to time series and forecasting (2nd ed.). Springer.
- Chatfield, C. (2004). The analysis of time series: An introduction (6th ed.). Chapman & Hall/CRC.
- Davidson, J. (1994). Stochastic limit theory: An introduction for econometricians. Oxford University Press.
- Elsner, J. B., & Tsonis, A. A. (1996). Singular spectrum analysis: A new tool in time series analysis. Plenum Press.
- Golyandina, N., Nekrutkin, V., & Zhigljavsky, A. (2001). Analysis of time series structure: SSA and related techniques. Chapman & Hall/CRC.
M. Atikur Rahman Khan is an OCE Postdoctoral Fellow at the Commonwealth Scientific and Industrial Research Organisation (CSIRO). He completed his M.Sc. in Statistics at the National University of Singapore and his Ph.D. in Econometrics at the Department of Econometrics and Business Statistics, Monash University. He is a member of the International Statistical Institute and the Statistical Society of Australia. He was awarded a Postgraduate Publication Award at Monash University for carrying out this work. His research interests include time series analysis and forecasting, predictive modelling, and privacy-preserving data analytics.
D.S. Poskitt holds a chair in the Department of Econometrics and Business Statistics, Monash University, having previously been a member of the Department of Statistics and Econometrics, Australian National University. He is a member of the Econometric Society, the Institute of Mathematical Statistics and the Australian and New Zealand Statistical Society, and a Fellow of the Royal Statistical Society. He has published extensively in the area of statistical time series analysis and is an Associate Editor of the Journal of Time Series Analysis. He is a recipient of an American Statistical Association Award for Outstanding Statistical Application, and an Econometric Theory Multa Scripsit Award.