2017 | Original Paper | Book Chapter

Evaluation of Frequent Pattern Growth Based Fuzzy Particle Swarm Optimization Approach for Web Document Clustering

Authors: Raja Varma Pamba, Elizabeth Sherly, Kiran Mohan

Published in: Computational Science and Its Applications – ICCSA 2017

Publisher: Springer International Publishing

Abstract

This paper studies and analyzes the soft and hard clustering efficiency of a novel frequent pattern growth based fuzzy particle swarm optimization approach for clustering web documents. The conventional approaches, K-Means and Fuzzy c-means (FCM), suffer from random initialization and from getting trapped in local minima. To overcome these drawbacks, bio-inspired mechanisms such as genetic algorithms, ant colony optimization and particle swarm optimization (PSO) are used to optimize K-Means and FCM clustering. The major contributions of the novel method are threefold. First, it automatically finds effective cluster numbers, cluster centroids and swarms for the bio-inspired fuzzy particle swarm optimization. Second, it yields fuzzy overlapping clusters using the FCM objective function, overcoming the drawbacks of the existing methods. Third, the methodology discussed in this paper prunes the irrelevant elements from the search space and thereby retains all relationships with the search query as semantic conditionally relatable sets. The evaluation results show that the proposed approach performs better on the Adjusted Rand Index (ARI), Normalized Mutual Information (NMI) and Adjusted Concordance Index (ACI) than various distance based similarity measures and FCMPSO.
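The abstract refers to the FCM objective function without stating it. As a rough illustration only, the sketch below uses the textbook fuzzy c-means formulation (objective J_m = sum_i sum_j u_ij^m * ||x_i - c_j||^2 with the usual membership update), not the authors' implementation. The toy points, the two centroids, the fuzzifier m = 2 and all function names are assumptions chosen for the example; in a PSO hybrid, a value such as J_m would typically serve as the fitness of a particle that encodes the centroids.

```python
def sq_dist(x, c):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def fcm_memberships(points, centroids, m=2.0, eps=1e-12):
    """Textbook FCM membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))."""
    # The exponent is applied to squared distances, hence 1/(m-1) instead of 2/(m-1).
    power = 1.0 / (m - 1.0)
    memberships = []
    for x in points:
        # Clamp to eps so a point lying exactly on a centroid does not divide by zero.
        d2 = [max(sq_dist(x, c), eps) for c in centroids]
        memberships.append([1.0 / sum((dj / dk) ** power for dk in d2) for dj in d2])
    return memberships


def fcm_objective(points, centroids, memberships, m=2.0):
    """FCM objective J_m = sum_i sum_j u_ij^m * ||x_i - c_j||^2 (lower is better)."""
    return sum(
        (u ** m) * sq_dist(x, c)
        for x, row in zip(points, memberships)
        for u, c in zip(row, centroids)
    )


# Hypothetical toy data: two tight groups plus one point halfway between them.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (2.5, 3.0)]
centroids = [(0.0, 0.5), (5.0, 5.5)]

U = fcm_memberships(points, centroids)
J = fcm_objective(points, centroids, U)

for x, row in zip(points, U):
    print(x, [round(u, 3) for u in row])
print("J_m =", round(J, 3))
```

Each membership row sums to 1, and the point equidistant from both centroids receives equal membership in both clusters, which is the "fuzzy overlapping" behaviour the abstract contrasts with hard K-Means assignments.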

Metadata
Title
Evaluation of Frequent Pattern Growth Based Fuzzy Particle Swarm Optimization Approach for Web Document Clustering
Authors
Raja Varma Pamba
Elizabeth Sherly
Kiran Mohan
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-62392-4_27