Abstract
We propose a novel approach for building finite memory predictive models similar in spirit to variable memory length Markov models (VLMMs). The models are constructed by first transforming the n-block structure of the training sequence into a geometric structure of points in a unit hypercube, such that the longer the common suffix shared by two n-blocks, the closer their point representations lie. Such a transformation embodies a Markov assumption: n-blocks with long common suffixes are likely to produce similar continuations. Prediction contexts are found by detecting clusters in the geometric n-block representation of the training sequence via vector quantization. We compare our model with both the classical (fixed order) and variable memory length Markov models on five data sets with different memory and stochastic components. Fixed order Markov models (MMs) fail on three large data sets on which the advantage of allowing variable memory length can be exploited. On these data sets, our predictive models perform as well as, or better than, VLMMs, yet their construction is fully automatic, which is shown to be problematic in the case of VLMMs. On one data set, VLMMs are outperformed by the classical MMs; on this set, our models perform significantly better than MMs. On the remaining data set, classical MMs outperform the variable context length strategies.
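The geometric n-block representation described above can be sketched as an iterated function system driven by the sequence: each symbol is assigned a corner of the unit hypercube, and an n-block is mapped to the point reached by repeatedly contracting toward the corners of its symbols, left to right. A minimal sketch follows; the function name, the choice of contraction ratio k = 0.5, and the corner assignment are illustrative assumptions, not the paper's exact construction.

```python
import itertools
import numpy as np

def chaos_game_points(sequence, alphabet, n, k=0.5):
    """Map each n-block of `sequence` to a point in the unit hypercube.

    Symbols are assigned to distinct corners of [0, 1]^d, and each
    n-block is driven through the contraction
        x <- k * x + (1 - k) * corner(symbol),
    applied symbol by symbol.  Because later symbols dominate the final
    position, n-blocks sharing a long suffix end up geometrically close.
    """
    # Smallest hypercube dimension with enough corners for the alphabet.
    d = max(1, int(np.ceil(np.log2(len(alphabet)))))
    corners = {s: np.array(c, dtype=float)
               for s, c in zip(alphabet,
                               itertools.product([0.0, 1.0], repeat=d))}
    points = []
    for i in range(len(sequence) - n + 1):
        x = np.full(d, 0.5)              # start at the hypercube centre
        for s in sequence[i:i + n]:      # apply the maps left to right
            x = k * x + (1 - k) * corners[s]
        points.append(x)
    return np.array(points)
```

Clustering the resulting point cloud (e.g. with k-means) then yields the prediction contexts: each cluster collects n-blocks with similar suffix structure, and next-symbol counts are accumulated per cluster.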
Tino, P., Dorffner, G. Predicting the Future of Discrete Sequences from Fractal Representations of the Past. Machine Learning 45, 187–217 (2001). https://doi.org/10.1023/A:1010972803901