Published in: Soft Computing 10/2016

10 March 2016 | Focus

Particle swarm optimization-based feature selection in sentiment classification

Authors: Lin Shang, Zhe Zhou, Xing Liu

Abstract

Sentiment classification, the task of classifying documents according to the opinion or sentiment they express, is an important task in text mining. Documents can be represented as feature vectors, which machine learning algorithms then use to perform classification. These feature vectors require a feature selection step. In this paper, we propose a feature selection method called fitness proportionate selection binary particle swarm optimization (F-BPSO). Binary particle swarm optimization (BPSO) is the binary version of particle swarm optimization and can be applied to feature selection. F-BPSO is a modification of BPSO that overcomes problems of traditional BPSO, including an unreasonable velocity update formula and the lack of evaluation of each individual feature. We then make several detailed changes to the original F-BPSO, including using the fitness sum instead of the average fitness in the fitness proportionate selection step; the modified method is called fitness sum proportionate selection binary particle swarm optimization (FS-BPSO). We further modify FS-BPSO to make it more suitable for feature selection in sentiment classification; this variant is named SCO-FS-BPSO, where SCO stands for “sentiment classification-oriented”. Experimental results show that, on benchmark datasets, the original F-BPSO outperforms traditional BPSO in feature selection performance, and FS-BPSO outperforms the original F-BPSO. Moreover, on subjective consumer review datasets, SCO-FS-BPSO, which is tailored to sentiment classification, outperforms traditional feature selection methods.
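For orientation, the traditional BPSO baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of the standard binary PSO update (velocity update followed by a sigmoid-based bit sampling), not the authors' F-BPSO, FS-BPSO or SCO-FS-BPSO, whose details are not given on this page. The function names, the parameter values (`w`, `c1`, `c2`, `v_max`) and the toy fitness function are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(v):
    """Map a velocity to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def bpso_select(fitness, n_features, n_particles=20, n_iters=50,
                w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Standard binary PSO; each particle is a 0/1 mask over features.

    `fitness` is any callable that scores a mask (higher is better).
    All default parameter values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    X = [[rng.randint(0, 1) for _ in range(n_features)]
         for _ in range(n_particles)]
    V = [[0.0] * n_features for _ in range(n_particles)]

    # Personal bests and the global best found so far.
    pbest = [x[:] for x in X]
    pbest_fit = [fitness(x) for x in X]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * V[i][d]
                     + c1 * r1 * (pbest[i][d] - X[i][d])
                     + c2 * r2 * (gbest[d] - X[i][d]))
                # Clamp the velocity so the sigmoid does not saturate.
                V[i][d] = max(-v_max, min(v_max, v))
                # The bit is resampled with probability sigmoid(velocity).
                X[i][d] = 1 if rng.random() < sigmoid(V[i][d]) else 0
            f = fitness(X[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = X[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = X[i][:], f
    return gbest, gbest_fit


def toy_fitness(mask, relevant=frozenset({0, 2, 5})):
    """Hypothetical stand-in for classifier accuracy: reward selecting
    the 'relevant' features, lightly penalise every extra feature."""
    hits = sum(1 for d in relevant if mask[d])
    extras = sum(mask) - hits
    return hits - 0.1 * extras


if __name__ == "__main__":
    best_mask, best_fit = bpso_select(toy_fitness, n_features=8)
    print(best_mask, best_fit)
```

In a wrapper-style setup, `toy_fitness` would typically be replaced by the accuracy of a classifier trained on the selected features. Note that the sketch scores only whole feature masks; the abstract names the lack of a per-feature evaluation as one of the problems of traditional BPSO that F-BPSO is intended to address.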

Metadata
Title
Particle swarm optimization-based feature selection in sentiment classification
Authors
Lin Shang
Zhe Zhou
Xing Liu
Publication date
10 March 2016
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 10/2016
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-016-2093-2
