2018 | Original Paper | Book Chapter

A Large-Scale Data Clustering Algorithm Based on BIRCH and Artificial Immune Network

Authors: Yangyang Li, Guangyuan Liu, Peidao Li, Licheng Jiao

Published in: Advances in Swarm Intelligence

Publisher: Springer International Publishing

Abstract

This paper describes a large-scale data clustering algorithm that combines the Balanced Iterative Reducing and Clustering using Hierarchies algorithm (BIRCH) with the Artificial Immune Network clustering algorithm (aiNet). Compared with traditional clustering algorithms, aiNet adapts better to non-convex datasets and does not require the number of clusters to be specified in advance. However, it is poorly suited to large-scale datasets because its evolution takes a long time. In addition, the aiNet model is very sensitive to noise, which greatly restricts its application. BIRCH, by contrast, handles large-scale datasets well but, like traditional clustering algorithms, cannot deal with non-convex datasets and requires the number of clusters as input. Combining the two methods yields a new large-scale data clustering algorithm that inherits the advantages of both BIRCH and aiNet while overcoming their respective disadvantages.
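
To illustrate why BIRCH copes with large datasets, the following is a minimal sketch of its clustering feature (CF) summary, the triple of point count, linear sum, and squared sum that can be merged without revisiting the raw data. It is not the algorithm proposed in the chapter, and the abstract does not say how BIRCH and aiNet are coupled. The flat list of summaries (instead of a CF-tree), the `summarize` helper, and the `threshold` parameter are simplifying assumptions made for illustration only.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ClusteringFeature:
    """BIRCH-style summary of a point set: count, linear sum, squared sum."""
    n: int
    linear_sum: List[float]
    squared_sum: float

    @classmethod
    def from_point(cls, point: Sequence[float]) -> "ClusteringFeature":
        return cls(1, list(point), sum(x * x for x in point))

    def merge(self, other: "ClusteringFeature") -> "ClusteringFeature":
        # CFs are additive, so two summaries merge without the raw points.
        return ClusteringFeature(
            self.n + other.n,
            [a + b for a, b in zip(self.linear_sum, other.linear_sum)],
            self.squared_sum + other.squared_sum,
        )

    def centroid(self) -> List[float]:
        return [s / self.n for s in self.linear_sum]

    def radius(self) -> float:
        # Root-mean-square distance of the summarized points to the centroid.
        c2 = sum(c * c for c in self.centroid())
        return math.sqrt(max(self.squared_sum / self.n - c2, 0.0))


def summarize(points: Sequence[Sequence[float]],
              threshold: float) -> List[ClusteringFeature]:
    """Hypothetical single-pass, flat (no CF-tree) summarization.

    Each point is absorbed into the nearest summary if the merged radius
    stays within `threshold`; otherwise it starts a new summary.
    """
    summaries: List[ClusteringFeature] = []
    for p in points:
        cf = ClusteringFeature.from_point(p)
        best, best_dist = None, math.inf
        for i, s in enumerate(summaries):
            d = math.dist(s.centroid(), p)
            if d < best_dist:
                best, best_dist = i, d
        if best is not None:
            merged = summaries[best].merge(cf)
            if merged.radius() <= threshold:
                summaries[best] = merged
                continue
        summaries.append(cf)
    return summaries
```

Because each point is visited once and only the compact summaries are kept, the number of objects handed to any later, more expensive clustering stage can be far smaller than the original dataset. A real BIRCH implementation organizes the CFs in a height-balanced tree rather than scanning a flat list.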

Metadata
Title
A Large-Scale Data Clustering Algorithm Based on BIRCH and Artificial Immune Network
Authors
Yangyang Li
Guangyuan Liu
Peidao Li
Licheng Jiao
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-93815-8_32
