Abstract
This study devises mapping and projection techniques that visualize the results of clustering biological sequence data. The Sequence Data Density Display (SDDD) and Sequence Likelihood Projection (SLP) visualizations represent the input symbolic sequences in a lower-dimensional space so that the clusters and the relations among data elements are depicted graphically. Both operate in combination with the Self-Organizing Hidden Markov Model Map (SOHMMM). The resulting unified framework can analyze raw sequence data automatically and directly, with little or even no prior information or domain knowledge.
Acknowledgment
The authors would like to thank Anastasis Tzimas for his insightful remarks and comments.
Copyright information
© 2017 Springer Science+Business Media LLC
About this protocol
Cite this protocol
Ferles, C., Beaufort, WS., Ferle, V. (2017). Self-Organizing Hidden Markov Model Map (SOHMMM): Biological Sequence Clustering and Cluster Visualization. In: Westhead, D., Vijayabaskar, M. (eds) Hidden Markov Models. Methods in Molecular Biology, vol 1552. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-6753-7_6
DOI: https://doi.org/10.1007/978-1-4939-6753-7_6
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-6751-3
Online ISBN: 978-1-4939-6753-7
eBook Packages: Springer Protocols