Abstract
Sentiment shifters, words and expressions that can affect text polarity, play a fundamental role in opinion mining. However, the limited ability of current automated opinion mining systems to handle shifters is a major challenge. This paper presents three novel and efficient methods for identifying sentiment shifters in reviews in order to improve the overall accuracy of opinion mining systems: two data mining based algorithms and one machine learning based algorithm. The data mining algorithms do not need shifter-tagged datasets. They use weighted association rule mining (WARM) to find frequent patterns representing sentiment shifters in a domain-specific corpus and a general corpus. These patterns cover different kinds of shifter words, such as shifter verbs and quantifiers, and can handle both local and long-distance shifters. The WARM items in the two methods are dependency relations and SRL arguments of sentences, respectively. The third method is a supervised machine learning system that uses semantic features of sentences for shifter identification and polarity classification; unlike the data mining methods, it requires a shifter-tagged dataset for shifter identification. We tested the proposed algorithms on the polarity classification task in two domains: a specific domain (drug reviews) and a general domain. Experiments demonstrate that (1) the extracted shifters improve polarity classification performance, (2) the proposed data mining methods outperform the other implemented methods in shifter identification, and (3) the proposed semantic-based machine learning method has the best efficiency among all implemented methods in polarity classification.
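As a rough illustration of the weighted support measure that underlies weighted association rule mining in general, the sketch below scores itemsets over a toy corpus. It is not the authors' implementation: the item encoding (`dep:neg`, `word:positive`), the sentence weights, the threshold, and the function names are all hypothetical choices made for this example.

```python
from itertools import combinations

# Toy "transactions": each sentence is a set of items plus a weight.
# Both the item encoding and the weights are assumptions for illustration.
TOY_SENTENCES = [
    ({"dep:neg", "word:positive"}, 2.0),
    ({"dep:neg", "word:positive"}, 1.0),
    ({"word:positive"}, 0.5),
    ({"dep:advmod", "word:negative"}, 0.5),
]


def weighted_support(itemset, transactions):
    """Fraction of the total weight carried by transactions containing itemset."""
    total = sum(weight for _, weight in transactions)
    covered = sum(weight for items, weight in transactions if itemset <= items)
    return covered / total if total else 0.0


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force search for itemsets whose weighted support meets the threshold."""
    vocabulary = sorted(set().union(*(items for items, _ in transactions)))
    found = {}
    for size in range(1, max_size + 1):
        for combo in combinations(vocabulary, size):
            candidate = frozenset(combo)
            support = weighted_support(candidate, transactions)
            if support >= min_support:
                found[candidate] = support
    return found


if __name__ == "__main__":
    for itemset, support in frequent_itemsets(TOY_SENTENCES, 0.5).items():
        print(sorted(itemset), support)
```

The exhaustive enumeration is only practical for tiny vocabularies; a real system would prune candidates in the style of Apriori (Agrawal and Srikant 1994) rather than test every combination.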
Notes
Semantic Role labels.
Term frequency-inverse document frequency.
The Proposed-DM-Dependency method was originally presented in our conference paper (Noferesti and Shamsfard 2016); we use it here as a baseline for comparison with our new methods.
References
Agrawal, R., & Srikant, R. (1994). Fast algorithms for mining association rules. In Proceedings of 28th international conference on very large data bases (VLDB) (Vol. 8(14), pp. 92–108).
Asmi, A., & Ishaya, T. (2012). Negation identification and calculation in sentiment analysis. In IMMM 2012, the second international conference (pp. 1–7).
Boubel, N., François, T., & Naets, H. (2013). Automatic extraction of contextual valence shifters. In Proceedings of the international conference recent advances in natural language processing RANLP 2013 (pp. 98–104).
Cambria, E., Poria, S., & Bajpai, R. (2016). SenticNet 4: A semantic resource for sentiment analysis based on conceptual primitives. In Proceedings of COLING 2016, the 26th international conference on computational linguistics: Technical papers (pp. 2666–2677).
Chapman, W. W., Bridewell, W., Hanbury, P., Cooper, G. F., & Buchanan, B. G. (2001). A simple algorithm for identifying negated findings and diseases in discharge summaries. Journal of Biomedical Informatics, 34(5), 301–310.
De Marneffe, M.-C., MacCartney, B., & Manning, C. D. (2006). Generating typed dependency parses from phrase structure parses. In Proceedings of the 5th international conference on language resources and evaluation (LREC 2006) (pp. 449–454).
Esuli, A., & Sebastiani, F. (2006). SENTIWORDNET: A publicly available lexical resource for opinion mining. In Proceedings of the 5th conference on language resources and evaluation (pp. 417–422).
Heerschop, B., Van Iterson, P., Hogenboom, A., Frasincar, F., & Kaymak, U. (2011). Analyzing sentiment in a large set of Web data while accounting for negation. Advances in Intelligent and Soft Computing, 86, 195–205.
Hu, M., & Liu, B. (2004). Mining and summarizing customer reviews. In Proceedings of the 2004 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining KDD 04, 4 (p. 168).
Huang, S., Niu, Z., & Shi, C. (2014). Automatic construction of domain-specific sentiment lexicon based on constrained label propagation. Knowledge-Based Systems, 56, 191–200.
Ikeda, D., Takamura, H., & Okumura, M. (2010). Learning to shift the polarity of words for sentiment classification. Transactions of the Japanese Society for Artificial Intelligence, 25, 50–57.
Jia, L., Yu, C., & Meng, W. (2009, November). The effect of negation on sentiment analysis and retrieval effectiveness. In Proceedings of the 18th ACM conference on Information and knowledge management (pp. 1827–1830).
Kennedy, J. (2011). Particle swarm optimization. In Encyclopedia of machine learning (pp. 760–766). Boston, MA: Springer.
Kennedy, A., & Inkpen, D. (2006). Sentiment classification of movie reviews using contextual valence shifters. In Computational Intelligence (Vol. 22, pp. 110–125).
Kipper, K., Dang, H. T., & Palmer, M. (2000). Class-based construction of a verb lexicon. In AAAI/IAAI, 691, 696.
Kuo, R. J., Chao, C. M., & Chiu, Y. T. (2011). Application of particle swarm optimization to association rule mining. Applied Soft Computing, 11(1), 326–336.
Li, S., Lee, S. Y. M., Chen, Y., Huang, C. R., & Zhou, G. (2010). Sentiment classification and polarity shifting. In Aclweb.Org, (August), (pp. 635–643).
Marrese-Taylor, E., Velásquez, J. D., & Bravo-Marquez, F. (2014). A novel deterministic approach for aspect-based opinion mining in tourism products reviews. Expert Systems with Applications, 41(17), 7764–7775.
Miller, G. A. (1995). WordNet: A lexical database for English. Communications of the ACM, 38(11), 39–41.
Morante, R., & Blanco, E. (2012, June). *SEM 2012 shared task: Resolving the scope and focus of negation. In Proceedings of the first joint conference on lexical and computational semantics-volume 1: proceedings of the main conference and the shared task, and volume 2: proceedings of the sixth international workshop on semantic evaluation (pp. 265–274).
Morsy, S. A., & Rafea, A. (2012, June). Improving document-level sentiment classification using contextual valence shifters. In International conference on application of natural language to information systems (pp. 253–258).
Noferesti, S., & Shamsfard, M. (2016). Using data mining techniques for sentiment shifter identification. In Tenth International Conference on Language Resources and Evaluation, LREC 2016 (pp. 2716–2720).
Pang, B., Lee, L., & Vaithyanathan, S. (2002). Thumbs up: sentiment classification using machine learning techniques. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, (pp. 79–86).
Shaikh, M. A. M., Prendinger, H., & Mitsuru, I. (2007, September). Assessing sentiment of text by semantic dependency and contextual valence analysis. In International conference on affective computing and intelligent interaction (pp. 191–202).
Simančík, F., & Lee, M. (2009). A CCG-based system for valence shifting for sentiment analysis. Advances in Computational Linguistics, 41, 99–108.
Stone, P. J., Dunphy, D. C., Smith, M. S., & Ogilvie, D. M. (1966). The general inquirer: A computer approach to content analysis. Cambridge: The MIT Press.
Wiegand, M., Balahur, A., Roth, B., Klakow, D., & Montoyo, A. (2010). A survey on the role of negation in sentiment analysis. In Proceedings of the workshop on negation and speculation in natural language processing, (July), (pp. 60–68).
Wilson, T. A., Wiebe, J., & Hoffmann, P. (2009). Recognizing contextual polarity: An exploration of features for phrase-level sentiment analysis. Computational Linguistics, 35(3), 399–433.
Xia, R., Wang, T., Hu, X., Li, S., & Zong, C. (2013, August). Dual training and dual prediction for polarity classification. In ACL2 (pp. 521–525).
Xia, R., Xu, F., Yu, J., Qi, Y., & Cambria, E. (2016). Polarity shift detection, elimination and ensemble: A three-stage model for document-level sentiment analysis. Information Processing and Management, 52(1), 36–45.
Xia, R., Xu, F., Zong, C., Li, Q., Qi, Y., & Li, T. (2015). Dual sentiment analysis: Considering two sides of one review. IEEE Transactions on Knowledge and Data Engineering, 27(8), 2120–2133.
Yu, H., Shang, J., Hsu, M., Castellanos, M., & Han, J. (2016). Data-driven contextual valence shifter quantification for multi-theme sentiment analysis. In Proceedings of the 25th ACM international on conference on information and knowledge management (pp. 939–948).
Zhang, C., & Zhang, S. (2002). Association rules mining: Models and algorithms. In Lecture notes in computer science (Vol. 2307, p. 243).
Cite this article
Rahimi, Z., Noferesti, S. & Shamsfard, M. Applying data mining and machine learning techniques for sentiment shifter identification. Lang Resources & Evaluation 53, 279–302 (2019). https://doi.org/10.1007/s10579-018-9432-0
DOI: https://doi.org/10.1007/s10579-018-9432-0