Abstract
Clustering of sequential or temporal data is more challenging than traditional clustering as dynamic observations should be processed rather than static measures. This paper proposes a Hidden Markov Model (HMM)-based technique suitable for clustering of data sequences. The main aspect of the work is the use of a probabilistic model-based approach using HMM to derive new proximity distances, in the likelihood sense, between sequences. Moreover, a novel partitional clustering algorithm is designed which alleviates computational burden characterizing traditional hierarchical agglomerative approaches. Experimental results show that this approach provides an accurate clustering partition and the devised distance measures achieve good performance rates. The method is demonstrated on real world data sequences, i.e. the EEG signals due to their temporal complexity and the growing interest in the emerging field of Brain Computer Interfaces.
Chapter PDF
Similar content being viewed by others
Keywords
- Hide Markov Model
- Natural Cluster
- Hide Markov Model State
- Partitional Cluster Algorithm
- Discrete Hide Markov Model
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
Rabiner, L. R.: A tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proc. of IEEE 77(2) (1989) 257–286.
Hu, J., Brown, M. K., Turin, W.: HMM based on-line handwriting recognition. IEEE Trans. Pattern Analysis and Machine Intelligence, 18(10) (1996) 1039–1045.
Hughey, R., Krogh, A.: Hidden Markov Model for sequence analysis: extension and analysis of the basic method. Comp. Appl. in the Biosciences 12 (1996) 95–107.
Eickeler, S., Kosmala, A., Rigoll, G.: Hidden Markov Model based online gesture recognition. Proc. Int. Conf. on Pattern Recognition (ICPR) (1998) 1755–1757.
Jebara, T., Pentland, A.: Action Reaction Learning: Automatic Visual Analysis and Synthesis of interactive behavior. In 1st Intl. Conf. on Computer Vision Systems (ICVS’99) (1999).
Rabiner, L. R., Lee, C. H., Juang, B. H., Wilpon, J. G.: HMM Clustering for Connected Word Recognition. Proceedings of IEEE ICASSP (1989) 405–408.
Lee, K. F.: Context-Dependent Phonetic Hidden Markov Models for Speaker-Independent Continuous Speech Recognition. IEEE Transactions on Acoustics, Speech and Signal Processing 38(4) (1990) 599–609.
Smyth, P.: Clustering sequences with HMM, in Advances in Neural Information Processing (M. Mozer, M. Jordan, and T. Petsche, eds.) MIT Press 9 (1997).
Li, C., Biswas, G.: Clustering Sequence Data using Hidden Markov Model Representation, SPIE’99 Conference on Data Mining and Knowledge Discovery: Theory, Tools, and Technology, (1999) 14–21.
Li, C., Biswas, G.: A Bayesian Approach to Temporal Data Clustering using Hidden Markov Models. Intl. Conference on Machine Learning (2000) 543–550.
Schwarz, G.: Estimating the dimension of a model. The Annals of Statistics, 6(2) (1978) 461–464.
Stolcke, A., Omohundro, S.: Hidden Markov Model Induction by Bayesian Model Merging. Hanson, S. J., Cowan, J. D., Giles, C. L. eds. Advances in Neural Information Processing Systems 5 (1993) 11–18.
Cheeseman, P., Stutz, J.: Bayesian Classification (autoclass): Theory and Results. Advances in Knowledge discovery and data mining, (1996) 153–180.
Law, M. H., Kwok, J. T.: Rival penalized competitive learning for model-based sequence Proceedings Intl Conf. on Pattern Recognition (ICPR) 2 (2000) 195–198.
Penny, W. D., Roberts, S. J., Curran, E., Stokes, M.: EEG-based communication: a PR approach. IEEE Trans. Rehabilitation Engineering 8(2) (2000) 214–215.
Juang, B. H., Levinson, S. E., Sondhi, M. M.: Maximum likelihood estimation for multivariate mixture observations of Markov Chain. IEEE Trans. Informat. Theory 32(2) (1986) 307–309.
Juang, B. H., Rabiner, L. R.: Mixture autoregressive hidden Markov models for speech signals. IEEE Trans. Acoust. Speech Signal Proc. 33(6) (1985) 1404–1413.
Penny, W. D., Roberts, S. J.: Dynamic models for nonstationary signal segmentation. Computers and Biomedical Research 32(6) (1998) 483–502.
Kalman, R. E.: A New Approach to Linear Filtering and Prediction Problems. Transaction of the ASME-Journal of Basic Engineering (1960) 35–45.
Jazwinski, A.: Adaptive Filtering. Automatica 5 (1969) 475–485.
Theodoridis, S., Koutroumbas, K.: Pattern Recognition. Academic Press (1999).
Anderson, C. W., Stolz, E. A., Shamsunder, S.: Multivariate autoregressive models for classification of spontaneous electroencephalogram during mental tasks. IEEE Transactions on Biomedical Engineering, 45(3) (1998) 277–286.
Nunez, P. L.: Neocortical Dynamics and Human EEG Rhythms. Oxford University Press, (1995).
Kaufman, L., Rousseuw, P.: Findings groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons-New York (1990).
Keirn, Z.: Alternative modes of communication between man and machine. Master’s thesis. Purdue University (1988).
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Panuccio, A., Bicego, M., Murino, V. (2002). A Hidden Markov Model-Based Approach to Sequential Data Clustering. In: Caelli, T., Amin, A., Duin, R.P.W., de Ridder, D., Kamel, M. (eds) Structural, Syntactic, and Statistical Pattern Recognition. SSPR /SPR 2002. Lecture Notes in Computer Science, vol 2396. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-70659-3_77
Download citation
DOI: https://doi.org/10.1007/3-540-70659-3_77
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44011-6
Online ISBN: 978-3-540-70659-5
eBook Packages: Springer Book Archive