Published in: Natural Computing 3/2020

25 April 2018

Swarm optimization clustering methods for opinion mining

Authors: Ellen Souza, Diego Santos, Gustavo Oliveira, Alisson Silva, Adriano L. I. Oliveira

Abstract

Supervised machine learning and opinion lexicons are the most frequent approaches to opinion mining, but they require considerable effort to prepare the training data and to build the opinion lexicon, respectively. In this paper, a novel unsupervised clustering approach is proposed for opinion mining. Three swarm algorithms based on Particle Swarm Optimization are evaluated on three corpora that differ in complexity with respect to size, number of opinions, domains, languages, and class balance. The K-means and Agglomerative clustering algorithms, as well as the Artificial Bee Colony and Cuckoo Search swarm-based algorithms, were selected for comparison. The proposed swarm-based algorithms achieved better accuracy when using the word bigram feature model as the pre-processing technique, the Global Silhouette as the optimization function, and datasets with two classes: positive and negative. Although the swarm-based algorithms obtained lower results on datasets with three classes, they remain competitive, considering that the opinion clustering approach requires neither labeled data nor opinion lexicons.
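
To make the role of the optimization function more concrete, the following minimal Python sketch scores candidate cluster assignments of short texts using word bigram features and a silhouette-based fitness. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not define the Global Silhouette, so it is assumed here to be the mean of the per-cluster mean silhouette widths; cosine distance, the function names, and the toy documents are likewise hypothetical choices.

```python
from collections import Counter
from math import sqrt


def bigrams(text):
    """Return a bag of word bigrams for one document."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return 1.0 if norm == 0 else 1.0 - dot / norm


def global_silhouette(vectors, labels):
    """Mean of per-cluster mean silhouette widths (assumed definition)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        return -1.0  # silhouette is undefined for a single cluster; penalize it
    cluster_means = []
    for lab, members in clusters.items():
        widths = []
        for i in members:
            if len(members) == 1:
                widths.append(0.0)  # common convention for singleton clusters
                continue
            # a: mean distance to the other members of the same cluster
            a = sum(cosine_distance(vectors[i], vectors[j])
                    for j in members if j != i) / (len(members) - 1)
            # b: smallest mean distance to the members of any other cluster
            b = min(
                sum(cosine_distance(vectors[i], vectors[j]) for j in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab
            )
            denom = max(a, b)
            widths.append(0.0 if denom == 0 else (b - a) / denom)
        cluster_means.append(sum(widths) / len(widths))
    return sum(cluster_means) / len(cluster_means)


# Hypothetical toy corpus: two positive and two negative opinions.
docs = [
    "i love this phone",
    "i love this camera",
    "i hate this phone",
    "i hate this camera",
]
vectors = [bigrams(d) for d in docs]
good = global_silhouette(vectors, [0, 0, 1, 1])  # grouped by sentiment
bad = global_silhouette(vectors, [0, 1, 0, 1])   # grouped by product
```

In a swarm-based setting, each particle would encode one candidate label vector, and a score such as this one would serve as its fitness, so the search favors assignments like the sentiment grouping over the product grouping without any labeled data.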

Metadata
Title
Swarm optimization clustering methods for opinion mining
Authors
Ellen Souza
Diego Santos
Gustavo Oliveira
Alisson Silva
Adriano L. I. Oliveira
Publication date
25 April 2018
Publisher
Springer Netherlands
Published in
Natural Computing, Issue 3/2020
Print ISSN: 1567-7818
Electronic ISSN: 1572-9796
DOI
https://doi.org/10.1007/s11047-018-9681-2
