Applying data mining and machine learning techniques for sentiment shifter identification

  • Original Paper
  • Language Resources and Evaluation

Abstract

Sentiment shifters, words and expressions that can affect text polarity, play a fundamental role in opinion mining. However, current automated opinion mining systems handle shifters poorly, which remains a major challenge. This paper presents three novel and efficient methods for identifying sentiment shifters in reviews in order to improve the overall accuracy of opinion mining systems: two data mining based algorithms and one machine learning based algorithm. The data mining algorithms do not need shifter-tagged datasets. They use weighted association rule mining (WARM) to find frequent patterns representing sentiment shifters in a domain-specific corpus and a general corpus. These patterns cover different kinds of shifter words, such as shifter verbs and quantifiers, and can handle both local and long-distance shifters. The WARM items in the two data mining methods are dependency relations and SRL arguments of sentences, respectively. In addition, we implemented a supervised machine learning system based on semantic features of sentences for shifter identification and polarity classification. Unlike the data mining algorithms, this method needs a shifter-tagged dataset for shifter identification. We tested the proposed algorithms on the polarity classification task in two domains: a specific domain (drug reviews) and a general domain. Experiments demonstrate that (1) the extracted shifters improve the performance of polarity classification, (2) the proposed data mining methods outperform the other implemented methods in shifter identification, and (3) the proposed semantic based machine learning method has the best efficiency among all implemented methods in polarity classification.
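To make the idea of weighted support in WARM more concrete, the following is a minimal Python sketch and not the paper's implementation. It assumes one common formulation in which a transaction's weight is the mean weight of its items and an itemset's weighted support is the share of total transaction weight carried by the transactions that contain it. The function names, the default weight of 1.0, and the brute-force enumeration of candidate itemsets are all assumptions made for illustration; the abstract does not specify the weighting scheme or the mining procedure.

```python
from itertools import combinations


def transaction_weight(transaction, item_weights):
    """Mean weight of a transaction's items (assumed scheme; default weight 1.0)."""
    if not transaction:
        return 0.0
    return sum(item_weights.get(item, 1.0) for item in transaction) / len(transaction)


def weighted_support(itemset, transactions, item_weights):
    """Share of total transaction weight held by transactions containing itemset."""
    total = sum(transaction_weight(t, item_weights) for t in transactions)
    if total == 0:
        return 0.0
    covered = sum(
        transaction_weight(t, item_weights) for t in transactions if itemset <= t
    )
    return covered / total


def frequent_itemsets(transactions, item_weights, min_support, max_size=2):
    """Brute-force search for itemsets whose weighted support meets min_support."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            support = weighted_support(candidate, transactions, item_weights)
            if support >= min_support:
                result[candidate] = support
    return result


# Hypothetical toy transactions: each sentence is reduced to a set of string
# items standing in for dependency relations (or SRL arguments).
TOY_TRANSACTIONS = [
    frozenset({"neg(work, not)", "flip"}),
    frozenset({"neg(help, not)", "flip"}),
    frozenset({"amod(drug, good)"}),
]
# Hypothetical weights: items missing from this dict default to 1.0.
TOY_WEIGHTS = {"flip": 2.0}
TOY_PATTERNS = frequent_itemsets(TOY_TRANSACTIONS, TOY_WEIGHTS, min_support=0.5)
```

With all weights equal, this reduces to ordinary support, so the sketch behaves like a plain frequent-itemset miner unless some items are deliberately weighted up. A real system would replace the exhaustive enumeration with an Apriori-style candidate generation step.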

Notes

  1. http://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html.

  2. Semantic Role labels.

  3. https://nlp.stanford.edu/software/stanford-dependencies.html.

  4. Term frequency-inverse document frequency.

  5. http://ronan.collobert.com/senna/.

  6. https://wordnet.princeton.edu/.

  7. http://verbs.colorado.edu/~mpalmer/projects/verbnet.html.

  8. The Proposed-DM-Dependency method was originally presented in our conference paper (Noferesti and Shamsfard 2016); we use it here as a baseline for comparison with our new methods.

References

  • Agrawal, R., & Srikant, R. (1994). Fast algorithms for mining association rules. In Proceedings of 28th international conference on very large data bases (VLDB) (Vol. 8(14), pp. 92–108).

  • Asmi, A., & Ishaya, T. (2012). Negation identification and calculation in sentiment analysis. In IMMM 2012, the second international conference (pp. 1–7).

  • Boubel, N., François, T., & Naets, H. (2013). Automatic extraction of contextual valence shifters. In Proceedings of the international conference recent advances in natural language processing RANLP 2013 (pp. 98–104).

  • Cambria, E., Poria, S., & Bajpai, R. (2016). SenticNet 4: A semantic resource for sentiment analysis based on conceptual primitives. In Proceedings of COLING 2016, the 26th international conference on computational linguistics: Technical papers (pp. 2666–2677).

  • Chapman, W. W., Bridewell, W., Hanbury, P., Cooper, G. F., & Buchanan, B. G. (2001). A simple algorithm for identifying negated findings and diseases in discharge summaries. Journal of Biomedical Informatics, 34(5), 301–310.

  • De Marneffe, M.-C., MacCartney, B., & Manning, C. D. (2006). Generating typed dependency parses from phrase structure parses. In Proceedings of the 5th international conference on language resources and evaluation (LREC 2006) (pp. 449–454).

  • Esuli, A., & Sebastiani, F. (2006). SENTIWORDNET: A publicly available lexical resource for opinion mining. In Proceedings of the 5th conference on language resources and evaluation (pp. 417–422).

  • Heerschop, B., Van Iterson, P., Hogenboom, A., Frasincar, F., & Kaymak, U. (2011). Analyzing sentiment in a large set of Web data while accounting for negation. Advances in Intelligent and Soft Computing, 86, 195–205.

  • Hu, M., & Liu, B. (2004). Mining and summarizing customer reviews. In Proceedings of the 2004 ACM SIGKDD international conference on knowledge discovery and data mining (KDD 04) (p. 168).

  • Huang, S., Niu, Z., & Shi, C. (2014). Automatic construction of domain-specific sentiment lexicon based on constrained label propagation. Knowledge-Based Systems, 56, 191–200.

  • Ikeda, D., Takamura, H., & Okumura, M. (2010). Learning to shift the polarity of words for sentiment classification. Transactions of the Japanese Society for Artificial Intelligence, 25, 50–57.

  • Jia, L., Yu, C., & Meng, W. (2009, November). The effect of negation on sentiment analysis and retrieval effectiveness. In Proceedings of the 18th ACM conference on Information and knowledge management (pp. 1827–1830).

  • Kennedy, J. (2011). Particle swarm optimization. In Encyclopedia of machine learning (pp. 760–766). Boston, MA: Springer.

  • Kennedy, A., & Inkpen, D. (2006). Sentiment classification of movie reviews using contextual valence shifters. In Computational Intelligence (Vol. 22, pp. 110–125).

  • Kipper, K., Dang, H. T., & Palmer, M. (2000). Class-based construction of a verb lexicon. In AAAI/IAAI (pp. 691–696).

  • Kuo, R. J., Chao, C. M., & Chiu, Y. T. (2011). Application of particle swarm optimization to association rule mining. Applied Soft Computing, 11(1), 326–336.

  • Li, S., Lee, S. Y. M., Chen, Y., Huang, C. R., & Zhou, G. (2010). Sentiment classification and polarity shifting. In Aclweb.Org, (August), (pp. 635–643).

  • Marrese-Taylor, E., Velásquez, J. D., & Bravo-Marquez, F. (2014). A novel deterministic approach for aspect-based opinion mining in tourism products reviews. Expert Systems with Applications, 41(17), 7764–7775.

  • Miller, G. A. (1995). WordNet: A lexical database for English. Communications of the ACM, 38(11), 39–41.

  • Morante, R., & Blanco, E. (2012, June). *SEM 2012 shared task: Resolving the scope and focus of negation. In Proceedings of the first joint conference on lexical and computational semantics-volume 1: proceedings of the main conference and the shared task, and volume 2: proceedings of the sixth international workshop on semantic evaluation (pp. 265–274).

  • Morsy, S. A., & Rafea, A. (2012, June). Improving document-level sentiment classification using contextual valence shifters. In International conference on application of natural language to information systems (pp. 253–258).

  • Noferesti, S., & Shamsfard, M. (2016). Using data mining techniques for sentiment shifter identification. In Tenth International Conference on Language Resources and Evaluation, LREC 2016 (pp. 2716–2720).

  • Pang, B., Lee, L., & Vaithyanathan, S. (2002). Thumbs up: sentiment classification using machine learning techniques. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, (pp. 79–86).

  • Shaikh, M. A. M., Prendinger, H., & Mitsuru, I. (2007, September). Assessing sentiment of text by semantic dependency and contextual valence analysis. In International conference on affective computing and intelligent interaction (pp. 191–202).

  • Simančík, F., & Lee, M. (2009). A CCG-based system for valence shifting for sentiment analysis. Advances in Computational Linguistics, 41, 99–108.

  • Stone, P. J., Dunphy, D. C., Smith, M. S., & Ogilvie, D. M. (1966). The general inquirer: A computer approach to content analysis. Cambridge: The MIT Press.

  • Wiegand, M., Balahur, A., Roth, B., Klakow, D., & Montoyo, A. (2010, July). A survey on the role of negation in sentiment analysis. In Proceedings of the workshop on negation and speculation in natural language processing (pp. 60–68).

  • Wilson, T. A., Wiebe, J., & Hoffmann, P. (2009). Recognizing contextual polarity: An exploration of features for phrase-level sentiment analysis. Computational Linguistics, 35(3), 399–433.

  • Xia, R., Wang, T., Hu, X., Li, S., & Zong, C. (2013, August). Dual training and dual prediction for polarity classification. In ACL2 (pp. 521–525).

  • Xia, R., Xu, F., Yu, J., Qi, Y., & Cambria, E. (2016). Polarity shift detection, elimination and ensemble: A three-stage model for document-level sentiment analysis. Information Processing and Management, 52(1), 36–45.

  • Xia, R., Xu, F., Zong, C., Li, Q., Qi, Y., & Li, T. (2015). Dual sentiment analysis: Considering two sides of one review. IEEE Transactions on Knowledge and Data Engineering, 27(8), 2120–2133.

  • Yu, H., Shang, J., Hsu, M., Castellanos, M., & Han, J. (2016). Data-driven contextual valence shifter quantification for multi-theme sentiment analysis. In Proceedings of the 25th ACM international conference on information and knowledge management (pp. 939–948).

  • Zhang, C., & Zhang, S. (2002). Association rules mining: Models and algorithms. In Lecture notes in computer science (Vol. 2307, p. 243).

Author information

Corresponding author

Correspondence to Mehrnoush Shamsfard.

About this article

Cite this article

Rahimi, Z., Noferesti, S. & Shamsfard, M. Applying data mining and machine learning techniques for sentiment shifter identification. Lang Resources & Evaluation 53, 279–302 (2019). https://doi.org/10.1007/s10579-018-9432-0

  • DOI: https://doi.org/10.1007/s10579-018-9432-0
