2018 | Book

Time Series Analysis and Forecasting

Selected Contributions from ITISE 2017

Edited by: Prof. Ignacio Rojas, Prof. Héctor Pomares, Prof. Olga Valenzuela

Publisher: Springer International Publishing

Book series: Contributions to Statistics

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"

About this book

This book presents selected peer-reviewed contributions from the International Work-Conference on Time Series, ITISE 2017, held in Granada, Spain, September 18-20, 2017. It discusses topics in time series analysis and forecasting, including advanced mathematical methodology, computational intelligence methods for time series, dimensionality reduction and similarity measures, econometric models, energy time series forecasting, forecasting in real problems, online learning in time series as well as high-dimensional and complex/big data time series.

The series of ITISE conferences provides a forum for scientists, engineers, educators and students to discuss the latest ideas and implementations in the foundations, theory, models and applications in the field of time series analysis and forecasting. It focuses on interdisciplinary and multidisciplinary research encompassing computer science, mathematics, statistics and econometrics.

Table of contents

Frontmatter

Advanced Mathematical Methodologies in Time Series

Frontmatter

Forecasting via Fokker–Planck Using Conditional Probabilities

Abstract

Using a closed solution to a Fokker–Planck equation model of a time series, a probability distribution for the next observation is developed. This pdf has one free parameter, b. Various approaches to selecting this parameter have been explored: most recent value, weighted moving average, etc. Here, we explore using a conditional probability distribution for this parameter b, based upon the most recent observation. These methods are tested against some real-world product sales for both a one-step ahead and a two-step ahead forecast. Significant reduction in safety stock levels is found versus an ARMA approach, without a significant increase in out-of-stocks.

Chris Montagnon

Cryptanalysis of a Random Number Generator Based on a Chaotic Ring Oscillator

Abstract

This paper introduces cryptanalysis of a random number generator (RNG) based on a chaotic ring oscillator. An attack system is proposed to discover the security weaknesses of the chaos-based RNG. Convergence of the attack system is proved using a master–slave synchronization scheme. Future evaluation of the RNG is obtained from a scalar time series where the only information available is the structure of the RNG and a scalar time series observed from the chaotic ring oscillator. Simulation and numerical results verifying the feasibility of the attack system are given. It is verified that deterministic chaos itself cannot be pointed out as the source of randomness.

Salih Ergün

Further Results on a Robust Multivariate Time Series Analysis in Nonlinear Models with Autoregressive and t-Distributed Errors

Abstract

We investigate a time series model which can generally be explained as the additive combination of a multivariate, nonlinear regression model with multiple univariate, covariance stationary autoregressive (AR) processes whose white noise components obey independent scaled t-distributions. These distributions enable the stochastic modeling of heavy tails or outlier-afflicted observations and present the framework for a partially adaptive, robust maximum likelihood (ML) estimation of the deterministic model parameters, of the AR coefficients, of the scale parameters, and of the degrees of freedom of the underlying t-distributions. To carry out the ML estimation, we derive a generalized expectation maximization (GEM) algorithm, which takes the form of linearized, iteratively reweighted least squares. In order to derive a quality assessment of the resulting estimates, we extend this GEM algorithm by a Monte Carlo based bootstrap algorithm that enables the computation of the covariance matrix with respect to all estimated parameters. We apply the extended GEM algorithm to a multivariate global navigation satellite system (GNSS) time series, which is approximated by a three-dimensional circle while taking into account the colored measurement noise and partially heavy-tailed white noise components. The precision of the circle model fitted by the GEM algorithm is superior to that of the previous standard estimation approach.

Hamza Alkhatib, Boris Kargoll, Jens-André Paffenholz

A New Estimation Technique for AR(1) Model with Long-Tailed Symmetric Innovations

Abstract

In recent years, many time series applications have shown that innovations are non-normal. In this situation, it is known that the least squares (LS) estimators are neither efficient nor robust, and maximum likelihood (ML) estimators can only be obtained numerically, which might be problematic. The estimation problem has recently been revisited for different distributions using the modified maximum likelihood (MML) estimation technique, which assumes the shape parameter to be known. This becomes a drawback in machine data processing, where the underlying distribution cannot be determined but is assumed to be a member of a broad class of distributions. Therefore, in this study, the shape parameter is assumed to be unknown and the MML technique is combined with Huber’s estimation procedure to estimate the model parameters of autoregressive (AR) models of order 1; the resulting method is called adaptive modified maximum likelihood (AMML) estimation. After the derivation of the AMML estimators, their efficiency and robustness properties are discussed through a simulation study and compared with both MML and LS estimators. In addition, two test statistics for significance of the model are suggested. Both criterion and efficiency robustness properties of the test statistics are discussed, and comparisons with the corresponding MML and LS test statistics are given. Finally, the estimation procedure is generalized to AR(q) models.

Ayşen Dener Akkaya, Özlem Türker Bayrak
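
For orientation, the least squares baseline that this chapter compares against can be sketched as follows. This is only a minimal illustration of the LS estimator for a zero-mean AR(1) model x_t = phi * x_{t-1} + e_t, not the AMML procedure itself; the function name `ar1_least_squares` and the toy series are assumptions made for this example.

```python
def ar1_least_squares(series):
    """Least squares estimate of phi in the zero-mean AR(1) model
    x_t = phi * x_{t-1} + e_t (regression of x_t on x_{t-1}, no intercept)."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    # Sum of products of consecutive values.
    numerator = sum(prev * curr for prev, curr in zip(series, series[1:]))
    # Sum of squared lagged values.
    denominator = sum(prev * prev for prev in series[:-1])
    if denominator == 0:
        raise ValueError("lagged values are all zero")
    return numerator / denominator


# Noise-free toy series (hypothetical): each value is half the previous one.
example = [8.0, 4.0, 2.0, 1.0, 0.5]
phi_hat = ar1_least_squares(example)
```

With heavy-tailed innovations this estimator remains easy to compute, but, as the abstract notes, it is neither efficient nor robust, which is the motivation for the adaptive approach.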

Prediction of High-Dimensional Time Series with Exogenous Variables Using Generalized Koopman Operator Framework in Reproducing Kernel Hilbert Space

Abstract

We propose a novel methodology to predict high-dimensional time series with exogenous variables using the Koopman operator framework, by assuming that the time series are generated by some underlying unknown dynamical system with input as exogenous variables. In order to do that, we first generalize the definition of the original Koopman operator to allow for input to the underlying dynamical system. We then obtain a formulation of the generalized Koopman operator in reproducing kernel Hilbert space (RKHS) and a new derivation of its numerical approximation methods, namely, Extended Dynamic Mode Decomposition (EDMD) and its kernel-based version. We also obtain a statistical interpretation of kernel-based EDMD developed for the deterministic Koopman operator by utilizing the connection between RKHS and Gaussian processes regression, and relate it to the stochastic Koopman and Perron–Frobenius operator. In applications, we found that the prediction performance of this methodology is promising in forecasting real-world high-dimensional time series with exogenous variables, including financial markets data. We believe that this methodology will be of interest to the community of scientists and engineers working on quantitative finance, econometrics, system biology, neurosciences, meteorology, oceanography, system identification and control, data mining, machine learning, computational intelligence, and many other fields involving high-dimensional time series and spatiotemporal data.

Jia-Chen Hua, Farzad Noorian, Philip H. W. Leong, Gemunu Gunaratne, Jorge Gonçalves

Eigenvalues Distribution Limit of Covariance Matrices with AR Processes Entries

Abstract

We consider a class of random matrices \(B_{N}=X_{N}T_{N}X_{N}^{t},\) where \( X_{N}\) is a matrix \((N\times n(N))\) whose rows are independent, the entries \(X_{ij}\) in each row satisfy an autoregressive relation AR(1), and \( T_{N}\) is a diagonal matrix independent of \(X_{N}\). Under some conditions, we show that if the empirical distribution function of eigenvalues of \(T_{N}\) converges almost surely to a proper probability distribution as \(N\longrightarrow \infty \) and \(\frac{n(N)}{N}\longrightarrow c>0\), then the empirical distribution function of eigenvalues of \(B_{N}\) converges almost surely to a non-random limit function given by Marcenko and Pastur. Numerical simulations illustrate the behavior of kernel density estimators and density estimators of the Stieltjes transform around the true density, and we give a numerical comparison on the basis of the \(L_{1}\) error while varying different parameters.

Zahira Khettab, Tahar Mourid

Computational Intelligence Methods for Time Series

Frontmatter

Deep Learning for Detection of BGP Anomalies

Abstract

The Internet uses the Border Gateway Protocol (BGP) for exchange of routes and reachability information between Autonomous Systems (AS). Hence, BGP is subject to anomalous traffic that can cause problems with connectivity and traffic loss. Routing Table Leak (RTL), worm, and power outage events are considered anomalous in the sense that they can disrupt Internet routing and cause slowdowns of varying severity, which leads to packet delivery reliability issues. Deep learning, a subfield of machine learning, could be applied in detection of BGP anomalies. Studying RTL, worm, and power outage events is of interest to network operators and researchers alike. In this paper, we consider datasets of several events, all of which caused large-scale Internet outages. We use artificial neural network (ANN) models based on a backpropagation algorithm for anomalous event classification.

Marijana Cosovic, Slobodan Obradovic, Emina Junuz

Using Scaling Methods to Improve Support Vector Regression’s Performance for Travel Time and Traffic Volume Predictions

Abstract

Long queues often happen on toll roads, especially at the tollgates. These create many problems including having an impact on the regular roads nearby. If travel time and traffic volume at the tollgates can be predicted accurately in advance, this would allow traffic authorities to take appropriate measures to improve traffic flow and the safety of road users. This paper describes a novel combination of scaling methods with Support Vector Machines for Regression (SVR) for travel time and tollgate volume prediction tasks, as part of the Knowledge Discovery and Data Mining (KDD) Cup 2017. A new method is introduced to handle missing data by utilising the structure of the road network. Moreover, experiments with reduced data were conducted to evaluate whether the conclusions from combining scaling methods with SVR could be generalised.

Amanda Yan Lin, Mengcheng Zhang, Selpi

Dimensionality Reduction and Similarity Measures in Time Series

Frontmatter

Linear Trend Filtering via Adaptive LASSO

Abstract

Linear trend filtering methods are popular due to their overall simplicity: the model is linear in each segment and there are typically only a few segments considered. These segments are defined by unique points where the trend changes its direction, the so-called changepoints. In this paper, we consider an innovative estimation approach for such models. Our proposal is based on recent developments in the atomic pursuit techniques: we present an estimation algorithm based on the adaptive LASSO penalty and we introduce a fully data-driven method which can be effectively used to fit the continuous linear trend models. Some statistical properties are discussed and the empirical performance is compared with that of other competitive LASSO-based techniques.

Matúš Maciak

An Efficient Anomaly Detection in Quasi-Periodic Time Series Data—A Case Study with ECG

Abstract

Anomaly detection in time series is an important problem, with applications in finding or predicting the development of a fault in a system. Depending on its source, the data can be nonperiodic, quasi-periodic, or periodic. Modeling aperiodic data to detect anomalies is difficult, and purely periodic data seldom occur in nature. Finding anomalies in quasi-periodic time series signals, for example bio-signals such as ECG and heart rate (pulse) data, is important. However, the analysis is computationally complex because of the need for proper window size selection and comparison of every pair of subsequences of window-size duration. In this paper, we propose an efficient algorithm for anomaly detection in quasi-periodic time series data. We introduce a new concept, the “mother signal”, which is the average of normal subsequences. Creation of the mother signal is the first step in the process. Finding deviations of subsequences of varied duration (due to quasi-periodicity) from the mother signal is the second step. When this distance crosses a threshold, the subsequence is declared a discord. The algorithm is light enough to work in real time on computationally weak platforms such as a mobile phone. Experiments were performed with ECG signals to evaluate the performance. The algorithm is shown to be computationally more efficient than existing works, and it could identify discords at a higher rate.

Goutam Chakraborty, Takuya Kamiyama, Hideyuki Takahashi, Tetsuo Kinoshita
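
The two steps described in this abstract (build a mother signal, then flag subsequences that deviate from it) can be sketched roughly as below. This is a simplified illustration, not the authors' algorithm: it assumes subsequences have already been cut to equal length (the chapter handles varied durations), uses plain Euclidean distance, and takes a hand-picked threshold. All function names and the toy data are hypothetical.

```python
import math


def mother_signal(normal_subsequences):
    """Pointwise average of equal-length normal subsequences."""
    length = len(normal_subsequences[0])
    if any(len(s) != length for s in normal_subsequences):
        raise ValueError("all subsequences must have the same length")
    count = len(normal_subsequences)
    return [sum(s[i] for s in normal_subsequences) / count for i in range(length)]


def distance(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_discords(subsequences, mother, threshold):
    """Indices of subsequences whose distance to the mother signal exceeds the threshold."""
    return [i for i, s in enumerate(subsequences) if distance(s, mother) > threshold]


# Toy data (hypothetical): three "normal" beats and one distorted beat.
normal = [[0.0, 1.0, 0.0], [0.0, 1.2, 0.0], [0.0, 0.8, 0.0]]
mother = mother_signal(normal)
candidates = normal + [[0.0, 3.0, 1.0]]
discords = find_discords(candidates, mother, threshold=1.0)
```

Each candidate is compared with a single reference rather than with every other subsequence, which is where the computational saving described in the abstract comes from.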

Similarity Analysis of Time Interval Data Sets—A Graph Theory Approach

Abstract

Comparison of entities, i.e., the measurement of their similarity, is a frequent but challenging task in computer science. It requires a precise and quantifiable definition of similarity itself. Are two texts equal if they overlap in a majority of their composing words? Does a pair of pictures resemble the same content? What defines the sameness of two songs? While certain distance-based approaches, e.g., Minkowski, make for a good starting point in defining similarity, there is no one-size-fits-all approach. In this work, we tackle a particularly interesting problem, namely, the definition of a similarity measure for comparing time interval data sets. Our approach regards the data sets as disjoint parts of a bigraph, thereby allowing for an application of methods from graph theory. We present both a formal definition of the similarity of two time intervals and our methods, as well as a concrete use case from the medical domain, thus demonstrating the applicability to real-world scenarios.

Marc Haßler, Christian Kohlschein, Tobias Meisen

Logical Comparison Measures in Classification of Data—Nonmetric Measures

Abstract

In this chapter, we will create and use generalized combined comparison measures from t-norms (T) and t-conorms (S) for comparison of data. Norms are combined by the use of the generalized mean, where t-norms give minimum and t-conorms give maximum compensation. Intuitively, it follows that when these norms are aggregated, the new comparison measures should be able to find the best possible classification result between the minimum and maximum. We will use classification as our test bench for the suitability of these new comparison measures. In these classification tasks, we have tested five different types of combined comparison measures (CCM) with t-norms and t-conorms: the Dombi, Frank, Schweizer-Sklar, Yager, and Yu families. In classification, we used the following datasets: ionosphere, iris, and wine. We will compare the results achieved with CCM to those achieved with pseudo equivalences and show that these new measures tend to give better results.

Kalle Saastamoinen
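
To make the idea of aggregating a t-norm and a t-conorm concrete, here is a rough sketch using the Yager family, one of the five families named in the abstract. The Yager t-norm and t-conorm formulas are standard; combining the two values with an unweighted generalized mean is an assumption made for illustration and is not necessarily the chapter's exact CCM definition. The parameter names `p` and `m` and all function names are hypothetical.

```python
def yager_t_norm(a, b, p=2.0):
    """Yager t-norm on [0, 1]: max(0, 1 - ((1-a)^p + (1-b)^p)^(1/p)), p > 0."""
    return max(0.0, 1.0 - ((1.0 - a) ** p + (1.0 - b) ** p) ** (1.0 / p))


def yager_t_conorm(a, b, p=2.0):
    """Yager t-conorm on [0, 1]: min(1, (a^p + b^p)^(1/p)), p > 0."""
    return min(1.0, (a ** p + b ** p) ** (1.0 / p))


def generalized_mean(values, m=1.0):
    """Generalized (power) mean with exponent m > 0 of non-negative values."""
    if m <= 0:
        raise ValueError("this sketch assumes m > 0")
    return (sum(v ** m for v in values) / len(values)) ** (1.0 / m)


def combined_measure(a, b, p=2.0, m=1.0):
    """Illustrative aggregation: generalized mean of the t-norm and t-conorm values."""
    return generalized_mean([yager_t_norm(a, b, p), yager_t_conorm(a, b, p)], m)


# Two membership-like values in [0, 1] (hypothetical inputs).
t_value = yager_t_norm(0.6, 0.8)
s_value = yager_t_conorm(0.6, 0.8)
c_value = combined_measure(0.6, 0.8)
```

Because a t-norm never exceeds the minimum of its arguments and a t-conorm never falls below the maximum, the aggregated value always lies between the two, which matches the "between minimum and maximum" intuition in the abstract.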

Econometric Models

Frontmatter

Asymptotic and Bootstrap Tests for a Change in Autoregression Omitting Variability Estimation

Abstract

A sequence of time-ordered observations follows an autoregressive model of order one, and its parameter is possibly subject to change at most once at some unknown time point. The aim is to test whether such an unknown change has occurred or not. The change-point method presented here relies on a ratio type test statistic based on the maxima of cumulative sums. The main advantage of the developed approach is that the variance of the observations neither has to be known nor estimated. The asymptotic distribution of the test statistic under the no-change null hypothesis is derived. Moreover, we prove the consistency of the test under the alternative. A bootstrap procedure is proposed as a completely data-driven technique without any tuning parameters. The results are illustrated through a simulation study, which demonstrates the computational efficiency of the procedure. A practical application to real data is presented as well.

Barbora Peštová, Michal Pešta

Distance Between VARMA Models and Its Application to Spatial Differences Analysis in the Relationship GDP—Unemployment Growth Rate in Europe

Abstract

In this paper, a novel distance measure for evaluating the closeness of two vector autoregressive moving average models is presented and its main properties are discussed. The proposed distance is used to investigate the presence of spatial differences in the dynamic link between unemployment rate variation and GDP growth in some European Union countries.

Francesca Di Iorio, Umberto Triacca

Copulas for Modeling the Relationship Between Inflation and the Exchange Rate

Abstract

Copulas are useful tools for formalizing the dependence structure between variables. They have proven to be very valuable in economics, where dependence plays a key role. In this chapter, we use copulas to analyze the dependence between inflation and US/Euro exchange rates in the Euro area during different periods. We first explore the dependence between the variables using a nonparametric approach. Then, we select an appropriate parametric copula for each period. Results confirm the sensitivity of copulas to macroeconomic fluctuations that occur during the analyzed periods.

Laila Ait Hassou, Fadoua Badaoui, Okou Guei Cyrille, Amine Amar, Abdelhak Zoglat, Elhadj Ezzahid
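
A rank-based measure such as Kendall's tau is one common way to explore dependence nonparametrically before choosing a parametric copula, since for continuous variables it depends only on the copula and not on the marginal distributions. The abstract does not say which nonparametric tools the authors used, so the sketch below is an illustrative stand-in; the function name and the toy numbers are hypothetical and are not data from the chapter.

```python
def kendall_tau(x, y):
    """Kendall's tau-a: (concordant pairs - discordant pairs) / total pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            # Positive product: the pair is ordered the same way in x and y.
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Made-up toy values, for illustration only.
series_a = [1.0, 2.0, 3.0, 4.0]
series_b = [1.0, 3.0, 2.0, 4.0]
tau = kendall_tau(series_a, series_b)
```

Computing such a statistic separately for each period gives a first impression of whether the strength of dependence shifts over time.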

Energy Time Series Forecasting

Frontmatter

Fuel Consumption Estimation for Climbing Phase

Abstract

To address the problem of civil aviation carbon emissions, this chapter presents a simplified method to estimate aircraft fuel consumption using an adaptive Genetic Algorithm-Back Propagation (GA-BP) Strong prediction network. The chapter gives a brief overview of the modeling approach and describes efforts to validate and analyze the initial results of this project. The parameters of fuel consumption are analyzed using QAR flight data, and two kinds of fuel consumption prediction model are proposed: the BP prediction model and the adaptive (abbreviated to A) GA-BP Strong prediction model. The crossover and mutation probabilities of the GA-BP Strong prediction model are adjusted adaptively, and the BP neural network serves as a weak predictor; after a limited number of iterations, the model can optimize the error and solve the complicated nonlinear problem. Simulation results indicate that the two models have obvious advantages in nonlinear prediction, and that the prediction accuracy and the degree of fitting are good. The results of this study illustrate that the two neural networks with nonlinear transfer functions can accurately represent complex aircraft fuel consumption functions for climb phases of flight, so the two models are feasible in the field of fuel consumption prediction. The methodology can be extended to cruise and descent phases of flight.

JingJie Chen, YongPing Zhang

Time Series Optimization for Energy Prediction in Wi-Fi Infrastructures

Abstract

Access points play an important role in Wi-Fi networks and can provide useful information about energy consumption according to the users’ behavior. If we predict the energy consumption at a given access point, we can simplify the maintenance plans for the network infrastructure and make the most adequate decisions about, for example, the placement of new devices or the reinforcement of existing ones. In this work, we propose an energy prediction methodology based on system identification, where the energy measured in the access points is represented as time series. The prediction results were reasonably good for an experimental environment consisting of ten access points in an academic building, modeling the energy patterns over some weeks. Moreover, we found an optimization problem in which the main parameters of the identification model can be adjusted in order to provide more accurate results. Given the computational effort required for an in-depth search for the optimal values, we applied a genetic algorithm, which provided better results in less time than a direct search method.

David Rodriguez-Lozano, Juan A. Gomez-Pulido, Arturo Duran-Dominguez

An Econometric Analysis of the Merit-Order Effect in Electricity Spot Price: The Germany Case

Abstract

In this paper, we carry out an econometric analysis for Germany, as a country with high penetration of renewable energy sources (RES), in order to investigate the impact of wind energy and photovoltaic feed-in on the electricity spot price level, the so-called merit-order effect. We have used ARMA-X-GARCH-X modeling, where wind generation and photovoltaic are considered as exogenous variables included in the mean and the variance equation, in order to assess the joint impact of RES on the electricity spot price level as well as on spot price volatility in Germany. Our main empirical findings suggest that wind power and photovoltaic feed-in decrease the electricity spot price. However, their impact on electricity spot price volatility is quite different. Indeed, solar photovoltaic power has a lowering impact on electricity price volatility, whereas the wind feed-in exacerbates it.

François Benhmad, Jacques Percebois

Forecasting in Real Problems

Frontmatter

The Analysis of Variability of Short Data Sets Based on Mahalanobis Distance Calculation and Surrogate Time Series Testing

Abstract

In this work, we present an approach that is convenient for short time series and is based on the multivariate Mahalanobis distance calculation combined with surrogate time series testing. In order to test the ability of this approach to differentiate changes that could occur in complex processes, we analyzed data sets of different origins. We used seismological, meteorological, physiological, and economic data sets. Specifically, we analyzed data sets of inter earthquake times (IET), inter earthquake distances (IED), and differences in consecutive magnitudes (DM) compiled from the southern Californian earthquake catalogue; data sets of the yearly number of warmer and colder days derived from maximal air temperature databases in Tbilisi, Georgia; arterial systolic and diastolic blood pressure time series of healthy persons; and components of the Index of Economic Freedom (IEF) and exchange rate time series of three southern Caucasian countries. It was shown that the approach, even in the case of relatively short time series, may be used effectively to quantify dynamical changes occurring in different natural complex processes.

Teimuraz Matcharashvili, Natalia Zhukova, Tamaz Chelidze, Evgeni Baratashvili, Tamar Matcharashvili, Manana Janiashvili
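
The Mahalanobis distance at the core of this chapter measures how far a point lies from a reference group, scaled by the group's covariance. The sketch below shows the calculation for the two-dimensional case only, using the sample covariance; it does not include the surrogate testing step, and the function name and toy points are hypothetical.

```python
import math


def mahalanobis_2d(point, reference):
    """Mahalanobis distance of a 2-D point from a reference set of 2-D points,
    using the sample mean and sample covariance (n - 1 denominator)."""
    n = len(reference)
    if n < 3:
        raise ValueError("need at least three reference points")
    mean_x = sum(p[0] for p in reference) / n
    mean_y = sum(p[1] for p in reference) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in reference) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in reference) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in reference) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx = point[0] - mean_x
    dy = point[1] - mean_y
    # d^2 = v^T S^{-1} v, with the 2x2 inverse written out explicitly.
    d_squared = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d_squared)


# Toy reference group (hypothetical): corners of a square centered at (1, 1).
group = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
d = mahalanobis_2d((3.0, 1.0), group)
```

Unlike the Euclidean distance, this value accounts for the spread and correlation of the reference group, so the same raw offset counts for less along a direction in which the group itself varies strongly.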

On Generalized Additive Models with Dependent Time Series Covariates

Abstract

The generalized additive model (GAM) is a standard statistical methodology and is frequently used in various fields of applied data analysis where the response variable is non-normal, e.g., integer-valued, and the explanatory variables are continuous, typically normally distributed. Standard assumptions of this model, among others, are that the explanatory variables are independent and identically distributed vectors which are not multicollinear. To handle multicollinearity and serial dependence together, a new hybrid model, called the GAM-PCA-VAR model, was proposed in [17] (de Souza et al., J Roy Stat Soc C-Appl 2018); it combines GAM with principal component analysis (PCA) and the vector autoregressive (VAR) model. In this paper, some properties of the GAM-PCA-VAR model are discussed theoretically and verified by simulation. A real data set is also analyzed with the aim of describing the association between respiratory disease and air pollution concentrations.

Márton Ispány, Valdério A. Reisen, Glaura C. Franco, Pascal Bondon, Higor H. A. Cotta, Paulo R. P. Filho, Faradiba S. Serpa

A Bayesian Approach to Astronomical Time Delay Estimations

Abstract

Time delay estimation between two time series in astronomy is difficult because the data are sparse. We propose a fully Bayesian method based on a state–space model for this kind of analysis, and present one example of its application to astronomical data. Our estimation can deal with heteroskedastic observational errors of astronomical time series and has much smaller errors than the result obtained with a conventional method. This method may be applicable to many kinds of black hole systems and, after some improvements, has the potential to yield information on the geometrical structure of astronomical objects.

Mariko Kimura, Hyungsuk Tak, Taichi Kato

Further Results on a Modified EM Algorithm for Parameter Estimation in Linear Models with Time-Dependent Autoregressive and t-Distributed Errors

Abstract

In this contribution, we consider an expectation conditional maximization either (ECME) algorithm for the purpose of estimating the parameters of a linear observation model with time-dependent autoregressive (AR) errors. The degree of freedom (d.o.f.) of the underlying family of scaled t-distributions, which is used to account for outliers and heavy-tailedness of the white noise components, is adapted to the data, resulting in a self-tuning robust estimator. The time variability of the AR coefficients is described by a second linear model. We improve the estimation of the d.o.f. in a previous version of the ECME algorithm, which involves a zero search, by using an interval Newton method. We model the transient oscillations of a shaker table measured by a high-accuracy accelerometer, and we analyze various criteria for selecting a simultaneously parsimonious and realistic time-variability model.

Boris Kargoll, Mohammad Omidalizarandi, Hamza Alkhatib, Wolf-Dieter Schuh

Backmatter

Title: Time Series Analysis and Forecasting
Edited by: Prof. Ignacio Rojas
Prof. Héctor Pomares
Prof. Olga Valenzuela
Publisher: Springer International Publishing
Electronic ISBN: 978-3-319-96944-2
Print ISBN: 978-3-319-96943-5
DOI: https://doi.org/10.1007/978-3-319-96944-2