Journal of Intelligent Information Systems

1 June 2013

Clustering based on a near neighbor graph and a grid cell graph

Author: Xinquan Chen

Published in: Journal of Intelligent Information Systems | Issue 3/2013

Abstract

This paper presents two novel graph-clustering algorithms: Clustering based on a Near Neighbor Graph (CNNG) and Clustering based on a Grid Cell Graph (CGCG). CNNG, inspired by the idea of near neighbors, is an improved graph-clustering method based on the Minimum Spanning Tree (MST). To analyze massive data sets more efficiently, we also present CGCG, a graph-clustering method based on an MST built at the level of grid cells. To describe the two algorithms clearly, we define several key concepts, including the near neighbor point set, the near neighbor undirected graph, and the grid cell. To implement them efficiently, we use partitioning and indexing methods such as multidimensional grid partitioning and a multidimensional index tree. Simulation experiments on several artificial data sets and seven real data sets show that the time cost of CNNG can be reduced by using improvement techniques and approximate methods while retaining acceptable clustering quality, and that CGCG can approximately analyze some dense data sets in linear time. Moreover, compared with some classical clustering algorithms, CNNG often achieves better clustering quality or faster clustering speed.
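As a rough illustration of the general idea behind MST-based clustering on a near neighbor graph, the following minimal Python sketch may help. It is not the CNNG algorithm from the paper, whose details the abstract does not give. It assumes Euclidean distance, a brute-force neighbor search in place of the multidimensional index tree, and two hypothetical parameters: `radius` (how close two points must be to count as near neighbors) and `cut` (the longest spanning-tree edge allowed inside a cluster). All function names are illustrative.

```python
import math
from itertools import combinations


def near_neighbor_edges(points, radius):
    """Return undirected edges (distance, i, j) for point pairs within `radius`.

    Brute force, O(n^2); an index structure would replace this on large data.
    """
    edges = []
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i], points[j])
        if d <= radius:
            edges.append((d, i, j))
    return edges


class DisjointSet:
    """Union-find with path halving, used for Kruskal and for labeling."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        self.parent[ra] = rb
        return True


def minimum_spanning_forest(n, edges):
    """Kruskal's algorithm; yields a forest if the graph is disconnected."""
    ds = DisjointSet(n)
    return [(d, i, j) for d, i, j in sorted(edges) if ds.union(i, j)]


def cluster(points, radius, cut):
    """Cut spanning-forest edges longer than `cut`; return index groups."""
    n = len(points)
    forest = minimum_spanning_forest(n, near_neighbor_edges(points, radius))
    ds = DisjointSet(n)
    for d, i, j in forest:
        if d <= cut:
            ds.union(i, j)
    groups = {}
    for i in range(n):
        groups.setdefault(ds.find(i), []).append(i)
    return sorted(groups.values())


pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (30, 30)]
clusters = cluster(pts, radius=20.0, cut=2.0)  # → [[0, 1, 2], [3, 4], [5]]
```

In this toy input the two dense groups are linked by one long spanning-forest edge, which the `cut` threshold removes, while the last point lies outside every neighbor radius and stays a singleton.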

Title: Clustering based on a near neighbor graph and a grid cell graph
Author: Xinquan Chen
Publication date: 1 June 2013
Publisher: Springer US
Published in: Journal of Intelligent Information Systems / Issue 3/2013
Print ISSN: 0925-9902
Electronic ISSN: 1573-7675
DOI: https://doi.org/10.1007/s10844-013-0236-9