Published in: Knowledge and Information Systems 3/2017

10-05-2017 | Regular Paper

Document-level sentiment classification using hybrid machine learning approach

Authors: Abinash Tripathy, Abhishek Anand, Santanu Kumar Rath



Abstract

Users and customers commonly share comments or reviews about products on social networking sites. An analyst must process these reviews properly to obtain meaningful information from them. Classifying the sentiment associated with each review is one such processing step. Reviews are usually written as text. When text reviews are processed, each word of a review is treated as a feature, so feature selection is needed to pick the best features from the full set. In this paper, a machine learning algorithm, the support vector machine, is used to select the best features from the training data. These features are then given as input to an artificial neural network for further processing. The performance of the proposed approach is evaluated on two datasets, the IMDb dataset and the polarity dataset, using precision, recall, F-measure, and accuracy.
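The four evaluation parameters named in the abstract can be illustrated with a minimal sketch. It assumes binary sentiment labels (1 for positive, 0 for negative) and uses the standard textbook definitions. The function name `evaluate`, the `Metrics` container, and the toy labels are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    precision: float
    recall: float
    f_measure: float
    accuracy: float


def evaluate(y_true, y_pred, positive=1):
    """Compute precision, recall, F-measure and accuracy for binary labels.

    `positive` is the label treated as the positive class (an assumption of
    this sketch; the paper's own label encoding is not given in the abstract).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return Metrics(precision, recall, f_measure, accuracy)


# Toy labels for illustration only; these are not results from the paper.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
m = evaluate(y_true, y_pred)
print(m)
```

In this toy case there are three true positives, one false positive, one false negative, and three true negatives, so all four measures come out to 0.75.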


Metadata
Title
Document-level sentiment classification using hybrid machine learning approach
Authors
Abinash Tripathy
Abhishek Anand
Santanu Kumar Rath
Publication date
10-05-2017
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 3/2017
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-017-1055-z
