2016 | OriginalPaper | Book chapter

A Multi Criteria Document Clustering Approach Using Genetic Algorithm

Authors: D. Mustafi, G. Sahoo, A. Mustafi

Published in: Computational Intelligence in Data Mining—Volume 1

Publisher: Springer India

Abstract

In this work we present a multi-criteria clustering algorithm and demonstrate its usefulness in clustering documents. The algorithm proposes several metrics to judge the quality of the clusters formed and then finds a near-optimal solution that achieves good fitness scores on all of them. Because optimizing multiple clustering goals with classical optimization techniques is complex, the paper proposes an evolutionary strategy, in the form of a genetic algorithm, to quickly find a near-optimal cluster set that satisfies all the cluster goodness criteria. The population-based search of a genetic algorithm also helps the method avoid converging to locally optimal solutions and move toward a global optimum. The results obtained with the proposed algorithm are compared with the outputs and performance of standard classical algorithms.
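
The idea can be illustrated with a minimal sketch, which is not the authors' implementation. The abstract does not name the metrics, so the two criteria used here (cohesion and separation, both based on cosine similarity), their equal weighting, the one-label-per-document encoding, and all function names and parameter values are assumptions made for illustration only.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity of two term vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def centroids(docs, labels, k):
    """Mean vector of each cluster; None for an empty cluster."""
    dim = len(docs[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for doc, c in zip(docs, labels):
        counts[c] += 1
        for j, x in enumerate(doc):
            sums[c][j] += x
    return [[s / counts[c] for s in sums[c]] if counts[c] else None
            for c in range(k)]


def fitness(docs, labels, k):
    """Combine two assumed criteria into one score (higher is better)."""
    cents = centroids(docs, labels, k)
    if any(c is None for c in cents):
        return 0.0  # assumed penalty: reject solutions with empty clusters
    # Criterion 1 (assumed): cohesion, mean similarity to own centroid.
    cohesion = sum(cosine(d, cents[c]) for d, c in zip(docs, labels)) / len(docs)
    # Criterion 2 (assumed): separation, 1 - mean centroid-to-centroid similarity.
    pairs = [(a, b) for a in range(k) for b in range(a + 1, k)]
    if not pairs:
        return cohesion
    overlap = sum(cosine(cents[a], cents[b]) for a, b in pairs) / len(pairs)
    return 0.5 * cohesion + 0.5 * (1.0 - overlap)


def evolve(docs, k, pop_size=30, generations=60, mutation_rate=0.05, seed=0):
    """Genetic search over label vectors: elitism, uniform crossover, mutation."""
    rng = random.Random(seed)
    n = len(docs)
    pop = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(docs, ind, k), reverse=True)
        elite = pop[: pop_size // 2]  # the fitter half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = [rng.choice(genes) for genes in zip(p1, p2)]
            # Mutation: occasionally reassign a document to a random cluster.
            child = [rng.randrange(k) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda ind: fitness(docs, ind, k))


# Toy term-weight vectors: the first three and last three documents
# share vocabulary within their group.
docs = [[1.0, 0.0, 0.1], [0.9, 0.1, 0.0], [1.0, 0.2, 0.0],
        [0.0, 1.0, 0.1], [0.1, 0.9, 0.0], [0.0, 1.0, 0.2]]
best = evolve(docs, k=2)
print(best, round(fitness(docs, best, 2), 3))
```

Folding the criteria into a weighted sum is only one way to handle several objectives; a Pareto-based selection scheme is a common alternative, and the abstract does not say which approach the paper takes.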

Metadata
Title
A Multi Criteria Document Clustering Approach Using Genetic Algorithm
Authors
D. Mustafi
G. Sahoo
A. Mustafi
Copyright year
2016
Publisher
Springer India
DOI
https://doi.org/10.1007/978-81-322-2734-2_25