Joint multi-grain topic sentiment: modeling semantic aspects for online reviews
Introduction
With ubiquitous internet access, increasing numbers of people research products online before buying them. Prospective buyers want to know how other consumers feel about a product and what their perspectives on it are; this shared opinion is known as word-of-mouth. Electronic word-of-mouth, in the form of online consumer reviews, is growing rapidly and has a significant influence on the purchasing behavior of consumers. This is because consumer reviews contain user perspectives from different usage scenarios and are frequently considered more credible and trustworthy than vendor product descriptions [23]. Although consumer reviews are helpful for product purchasing and online opinion tracking, manually analyzing reviews to gain insight into user opinion, such as consumer sentiment about important aspects of a product, is tedious. Current user interface (UI) tools (e.g., tagging keywords or numerical ratings) are inadequate for digesting the details of user opinions. Therefore, there has recently been considerable interest in developing automated tools for opinion mining and sentiment analysis.
A major challenge in opinion mining is aspect-based sentiment analysis [6]. Some online reviews provide overall ratings for an object. However, users are typically interested in the detailed aspects in addition to the overall ratings. The detailed aspects along with the sentiments are embedded in textual content, which has a significant economic influence [3]. Individual preference levels differ considerably by aspect and thus, an object can be described and rated differently for different aspects. For example, one reviewer may rate a restaurant highly based on the taste of the food whereas another reviewer may rate the same restaurant poorly because of the service or ambience. Aspect-based sentiment analysis is valuable for making an informed decision.
Domain-independent aspect-based sentiment analysis is challenging because, in many cases, the sentiment polarity of a word is domain-dependent [6]. For instance, "unpredictable plot" expresses a positive sentiment in the movie domain whereas "unpredictable touch screen" expresses a negative sentiment in the electronics domain. Given the extensive variety of products and services across countless diverse domains, it is costly to construct labeled data for each product or service. Therefore, domain-independent models with minimal or no supervision are required for aspect-based sentiment analysis systems.
A typical aspect-based sentiment analysis system functions in two phases. First, it extracts aspects. Then, it determines the sentiment of those aspects. In many systems, one of the two phases relies on some form of supervision. For example, predefined aspects are required in [17] for aspect-based sentiment classification. Conversely, aspects are extracted automatically in [30]; however, aspect-based numerical user ratings are required for aspect-based sentiment summarization.
Recently, domain-independent topic-sentiment models (i.e., ASUM [10], JST [14], [15], and HASM [11]) have been proposed for addressing these two problems simultaneously with joint models of topic and sentiment. These models can be applied to any domain because they do not require predefined aspects or a domain-dependent sentiment lexicon. However, the topic-sentiment models fail to automatically identify ratable aspects from many redundant or uncorrelated topics. Furthermore, the optimal number of topics required to model online reviews is either prohibitively large or small (e.g., 100 or 2). In our analysis, the number of ratable aspects of a product is approximately 10. Consequently, it is difficult to conceptualize or browse the sentiment-oriented ratable aspects. Manual effort is required to identify aspects from topics. We have experimentally observed that many topics do not correspond to ratable aspects and contain redundant or uncorrelated top words, even when these models use approximately 10 topics. Although MG-LDA [31] detects ratable aspects, it cannot identify their sentiment orientation.
These limitations of the previous works motivate our research. Although there have been numerous attempts to model both topics and sentiments, there has been no research that examines the effectiveness of multi-grain topic sentiment for aspect-based sentiment classification. Integrating sentiment with multi-grain topics is not trivial because the topics are derived from regions, defined as windows, of a document. We have experimented with many design choices and developed the Joint Multi-grain Topic Sentiment (JMTS) model. JMTS extends MG-LDA by constructing an additional sentiment layer on the presumption that sentiment-oriented ratable aspects are generated from regional distributions of topics and sentiment. One of our key technical contributions is that JMTS relates sentiment to windows and words whereas ASUM and JST relate sentiment to sentences and words. Modeling the relation between sentiment and window has proven to be effective, as will be verified in the experiments in Section 4.
We extend our preliminary work [2] in three areas. First, we use asymmetric priors while incorporating prior sentiment information into JMTS. Second, we compute the aspect sentiment distribution of the sentences as well as the reviews. Third, we demonstrate the efficacy of JMTS compared to existing models in aspect classification and pointwise mutual information (PMI). The contributions of this paper are as follows:
- We propose a novel JMTS model for online reviews. JMTS effectively extracts quality sentiment-oriented ratable aspects automatically, eliminating the requirement for a manual probe.
- We verify the efficacy of JMTS qualitatively by demonstrating that JMTS generates correlated top words with low contamination for sentiment-oriented ratable aspects.
- We confirm that JMTS outperforms existing models (HASM, ASUM, JST, and MG-LDA) with quantitative comparisons.
The remainder of this paper is organized as follows. Section 2 reviews related work. Sections 3 and 4 describe the novel JMTS model and the experimental results, respectively. Section 5 concludes this paper.
Related work
Sentiment analysis is a well-studied problem [23]. Some of the work includes the economic influence and helpfulness of reviews [38], emotion mining [27], stock movements [12], cross-lingual sentiment analysis [16], and review snippet aggregation [28]. The most common sentiment analysis problem is classifying a text as having either positive or negative polarity [23]. Some work [7], [34] classifies sentiment into multiple classes rather than two. The majority of the work emphasizes sentiment
Generative model
Our goal is to extract sentiment-oriented ratable aspects of online reviews by extending topic models (e.g., LDA [4]).
LDA generates a document in three steps. To begin, it draws a topic distribution for each document. Then, it selects a topic from the topic distribution. Finally, a word is drawn from the topic. Topics are sampled once for the entire document collection. The graphical model of LDA is presented in Fig. 1a, where the shaded node represents the observed variable and non-shaded
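The three-step generative process of LDA described above can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' implementation: the function names (`sample_dirichlet`, `generate_corpus`), the symmetric prior values `alpha` and `beta`, and the toy vocabulary are assumptions introduced for the example, and the sketch covers plain LDA rather than MG-LDA or JMTS.

```python
import random


def sample_dirichlet(rng, alphas):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(num_docs, doc_len, vocab, num_topics,
                    alpha=0.5, beta=0.5, seed=0):
    """Generate a toy corpus following the LDA generative story.

    alpha and beta are hypothetical symmetric Dirichlet priors chosen
    for illustration; they are not values taken from the paper.
    """
    rng = random.Random(seed)
    # Topic-word distributions are sampled once for the whole collection.
    topics = [sample_dirichlet(rng, [beta] * len(vocab))
              for _ in range(num_topics)]
    corpus = []
    for _ in range(num_docs):
        # Step 1: draw a topic distribution for the document.
        theta = sample_dirichlet(rng, [alpha] * num_topics)
        words = []
        for _ in range(doc_len):
            # Step 2: select a topic from the document's topic distribution.
            z = rng.choices(range(num_topics), weights=theta, k=1)[0]
            # Step 3: draw a word from the selected topic.
            words.append(rng.choices(vocab, weights=topics[z], k=1)[0])
        corpus.append(words)
    return topics, corpus


if __name__ == "__main__":
    toy_vocab = ["food", "service", "price", "room"]  # assumed toy vocabulary
    _, docs = generate_corpus(num_docs=3, doc_len=5, vocab=toy_vocab,
                              num_topics=2)
    for doc in docs:
        print(" ".join(doc))
```

Inference runs this story in reverse: given only the observed words, it estimates the hidden topic assignments and distributions.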
Experimental results
We perform both qualitative and quantitative experiments to evaluate JMTS and the quality of extracted aspects. We compare JMTS aspects with the aspects of prior models qualitatively using two criteria: the top words of an aspect should be correlated and minimally contaminated with the top words of other aspects. We also make quantitative comparisons in aspect sentiment classification, aspect classification, and pointwise mutual information (PMI).
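As a rough illustration of the PMI criterion, the sketch below scores a set of top words by the average pointwise mutual information over all word pairs, PMI(w1, w2) = log(p(w1, w2) / (p(w1) p(w2))), with probabilities estimated from document-level co-occurrence. The estimation scheme, the smoothing constant `eps`, and the function name `average_pmi` are assumptions made for this example; the exact protocol used in the experiments is not stated in this excerpt.

```python
import math
from itertools import combinations


def average_pmi(top_words, documents, eps=1e-12):
    """Mean PMI over all pairs of top words.

    Probabilities are document frequencies divided by the number of
    documents; eps (an assumed smoothing constant) avoids log(0).
    """
    doc_sets = [set(doc) for doc in documents]
    n = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain every given word.
        return sum(all(w in d for w in words) for d in doc_sets) / n

    scores = []
    for w1, w2 in combinations(top_words, 2):
        joint = prob(w1, w2)
        independent = prob(w1) * prob(w2)
        scores.append(math.log((joint + eps) / (independent + eps)))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical toy reviews, already tokenized.
    reviews = [["food", "tasty"], ["food", "tasty"],
               ["room", "clean"], ["room", "clean"]]
    print(average_pmi(["food", "tasty"], reviews))  # words that co-occur
    print(average_pmi(["food", "room"], reviews))   # words that never co-occur
```

Under this scoring, top words that tend to appear in the same reviews receive a higher value than words that rarely appear together, which matches the intuition that the top words of a good aspect should be correlated.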
Conclusion
In this paper, we have addressed the problem of extracting sentiment-oriented ratable aspects from online reviews. We have proposed the Joint Multi-grain Topic Sentiment (JMTS) model. We have confirmed that JMTS outperforms the state-of-the-art models qualitatively and quantitatively. We have also demonstrated the quality of extracted aspects compared to human predefined aspects. We are working on unsupervised aspect summarization and aspect rating prediction.
Acknowledgment
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (numbers 2015R1A2A1A10052665 and 2015R1A2A1A15052701).
References (41)
- et al. The effect of news and public mood on stock movements. Inform. Sci. (2014)
- Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim. Biophys. Acta (1975)
- et al. Unsupervised product feature extraction for feature-oriented opinion determination. Inform. Sci. (2014)
- et al. Sentiment topic models for social emotion mining. Inform. Sci. (2014)
- et al. Multi-aspect sentiment analysis for Chinese online social reviews based on topic modeling and HowNet lexicon. Knowl.-Based Syst. (2013)
- et al. Ensemble of feature sets and classification algorithms for sentiment classification. Inform. Sci. (2011)
- Alias-i (version 4.0.1)....
- et al. Semantic aspect discovery for online reviews. Proceedings of the 12th IEEE International Conference on Data Mining (2012)
- et al. Show me the money!: deriving the pricing power of product features by mining consumer reviews. Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2007)
- et al. Latent Dirichlet allocation. J. Mach. Learn. Res. (2003)
- An unsupervised aspect-sentiment model for online reviews. Proceedings of Human Language Technologies: The 11th Annual Conference of the North American Chapter of the Association for Computational Linguistics
- Techniques and applications for sentiment analysis. Commun. ACM
- Beyond the stars: improving rating predictions using review text content. Proceedings of 12th International Workshop on the Web and Databases
- Finding scientific topics. Proc. Natl. Acad. Sci.
- Unsupervised learning by probabilistic latent semantic analysis. Mach. Learn.
- Aspect and sentiment unification model for online review analysis. Proceedings of 4th ACM International Conference on Web Search and Data Mining
- A hierarchical aspect-sentiment model for online reviews. Proceedings of the 27th AAAI Conference on Artificial Intelligence
- A non-negative matrix tri factorization approach to sentiment classification with lexical prior knowledge. Proceedings of Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP
- Weakly supervised joint sentiment-topic detection from text. IEEE Trans. Knowl. Data Eng.
- Joint sentiment/topic model for sentiment analysis. Proceedings of 18th ACM International Conference on Information and Knowledge Management