A General Formulation of Cluster Analysis with Dimension Reduction and Subspace Separation

Yamamoto, Michio; Hwang, Heungsun

doi:10.2333/bhmk.41.115

Published: 01 January 2014

Volume 41, pages 115–129, (2014)
Behaviormetrika

Abstract

We propose a novel approach to finding an optimal subspace of multi-dimensional variables for identifying a cluster structure of objects. When some variables are irrelevant to the cluster structure and are correlated among themselves, they are likely to have an adverse effect on the clustering of objects. In such situations, the proposed method aims to obtain an optimal subspace for partitioning objects by eliminating the effects of these irrelevant variables. The proposed method can be considered an extension of reduced k-means analysis and factorial k-means analysis to settings where irrelevant variables are correlated. The proposed method is applied to artificial and real data to investigate how it performs compared with existing methods.
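
For orientation, the two existing methods named in the abstract are usually written as the least-squares criteria below. This is a sketch in commonly used notation, not the general formulation proposed in the article, which this page does not reproduce. All symbols are assumptions introduced here: X is the n × p column-centered data matrix, U is an n × K binary cluster membership matrix, F is a K × q matrix of cluster centroids in the reduced space, and A is a p × q columnwise orthonormal loading matrix.

```latex
% Reduced k-means (RKM): approximate the data by centroids mapped back to the full space
\min_{U,F,A}\; \lVert X - U F A^{\top} \rVert^{2}
\quad \text{subject to } A^{\top} A = I_q

% Factorial k-means (FKM): approximate the projected data by centroids in the subspace
\min_{U,F,A}\; \lVert X A - U F \rVert^{2}
\quad \text{subject to } A^{\top} A = I_q

% With A^{\top} A = I_q, the cross term vanishes because A^{\top}(I_p - A A^{\top}) = 0, so
\lVert X - U F A^{\top} \rVert^{2}
  = \lVert X A - U F \rVert^{2} + \lVert X (I_p - A A^{\top}) \rVert^{2}
```

The last identity shows that the reduced k-means loss equals the factorial k-means loss plus the part of the data lying outside the subspace spanned by A. The two criteria can therefore favor different subspaces when variables unrelated to the cluster structure carry substantial variance.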

Author information

Authors and Affiliations

Kyoto University, Japan
Michio Yamamoto
McGill University, Canada
Heungsun Hwang

Corresponding author

Correspondence to Michio Yamamoto.

About this article

Cite this article

Yamamoto, M., Hwang, H. A General Formulation of Cluster Analysis with Dimension Reduction and Subspace Separation. Behaviormetrika 41, 115–129 (2014). https://doi.org/10.2333/bhmk.41.115

Received: 22 July 2013
Revised: 25 November 2013
Published: 01 January 2014
Issue Date: January 2014
DOI: https://doi.org/10.2333/bhmk.41.115
