Abstract
The primary method for validating cluster analysis techniques is through Monte Carlo simulations that rely on generating data with known cluster structure (e.g., Milligan 1996). This paper defines two kinds of data generation mechanisms with cluster overlap, marginal and joint; current cluster generation methods are framed within these definitions. An algorithm generating overlapping clusters based on shared densities from several different multivariate distributions is proposed and shown to lead to an easily understandable notion of cluster overlap. Besides outlining the advantages of generating clusters within this framework, the paper discusses how the proposed data generation technique can be used to augment research into current classification techniques such as finite mixture modeling, classification algorithm robustness, and latent profile analysis.
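As a rough illustration of what "known overlap" can mean, the sketch below handles only the simplest case: two univariate normal clusters with equal variance, where overlap is taken to be the area shared by the two densities. This is an assumption made for illustration and not the OCLUS algorithm itself; the function names `separation_for_overlap` and `generate_two_clusters` are hypothetical. Under that assumption the shared area is 2Φ(−d / 2σ), which can be inverted to give the distance d between cluster means for a requested overlap.

```python
import random
from statistics import NormalDist


def separation_for_overlap(overlap, sigma=1.0):
    """Distance between the means of two equal-variance normal clusters
    whose densities share an area equal to `overlap` (illustrative
    definition of overlap, assumed here)."""
    if not 0.0 < overlap < 1.0:
        raise ValueError("overlap must lie strictly between 0 and 1")
    # Shared area = 2 * Phi(-d / (2 * sigma)), so
    # d = -2 * sigma * Phi^{-1}(overlap / 2).
    return -2.0 * sigma * NormalDist().inv_cdf(overlap / 2.0)


def generate_two_clusters(n_per_cluster, overlap, sigma=1.0, seed=0):
    """Draw two univariate clusters whose generating densities have the
    requested shared area. Returns both samples and the mean separation."""
    rng = random.Random(seed)
    d = separation_for_overlap(overlap, sigma)
    first = [rng.gauss(0.0, sigma) for _ in range(n_per_cluster)]
    second = [rng.gauss(d, sigma) for _ in range(n_per_cluster)]
    return first, second, d


if __name__ == "__main__":
    first, second, d = generate_two_clusters(100, overlap=0.2)
    # Cross-check the separation against the standard library's own
    # overlapping-coefficient computation.
    achieved = NormalDist(0.0, 1.0).overlap(NormalDist(d, 1.0))
    print(f"separation = {d:.4f}, achieved overlap = {achieved:.4f}")
```

The overlap here is a property of the generating densities, fixed before any data are drawn, so a simulation study can vary it as a controlled factor. The multivariate, multi-cluster setting described in the abstract requires the paper's own definitions of marginal and joint overlap.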
Cite this article
Steinley, D., Henson, R. OCLUS: An Analytic Method for Generating Clusters with Known Overlap. Journal of Classification 22, 221–250 (2005). https://doi.org/10.1007/s00357-005-0015-6