Elsevier

Information Sciences

Volume 339, 20 April 2016, Pages 206-223
Joint multi-grain topic sentiment: modeling semantic aspects for online reviews

https://doi.org/10.1016/j.ins.2016.01.013

Abstract

The availability of electronic word-of-mouth, online consumer reviews, is increasing rapidly. Users frequently look for important aspects of a product or service in the reviews. They are typically interested in sentiment-oriented ratable aspects (i.e., semantic aspects). However, extracting semantic aspects across domains is challenging. We propose a domain-independent topic sentiment model called Joint Multi-grain Topic Sentiment (JMTS) to extract semantic aspects. JMTS effectively extracts quality semantic aspects automatically, thereby eliminating the requirement for manual probing. We conduct both qualitative and quantitative comparisons to evaluate JMTS. The experimental results confirm that JMTS generates semantic aspects with correlated top words and outperforms state-of-the-art models in several performance metrics.

Introduction

With ubiquitous internet access, increasing numbers of people conduct online research before buying a product. People are eager to know how other consumers feel about a product and what their perspectives are. This is known as word-of-mouth. The availability of electronic word-of-mouth, in the form of online consumer reviews, is growing rapidly and has a significant influence on the purchasing behavior of consumers. This is because consumer reviews contain user perspectives from different usage scenarios and are frequently considered more credible and trustworthy than vendor product descriptions [23]. Although consumer reviews are helpful for product purchasing and online opinion tracking, manually analyzing reviews to gain insight into user opinions, such as consumer sentiment about important aspects of a product, is tedious. Current user interface (UI) tools (e.g., tagging keywords or numerical ratings) are inadequate for digesting the details of user opinions. Therefore, there has recently been considerable interest in developing automated tools for opinion mining and sentiment analysis.

A major challenge in opinion mining is aspect-based sentiment analysis [6]. Some online reviews provide overall ratings for an object. However, users are typically interested in the detailed aspects in addition to the overall ratings. The detailed aspects along with the sentiments are embedded in textual content, which has a significant economic influence [3]. Individual preference levels differ considerably by aspect and thus, an object can be described and rated differently for different aspects. For example, one reviewer may rate a restaurant highly based on the taste of the food whereas another reviewer may rate the same restaurant poorly because of the service or ambience. Aspect-based sentiment analysis is valuable for making an informed decision.

Domain-independent aspect-based sentiment analysis is challenging because, in many cases, the sentiment polarity of a word is domain-dependent [6]. For instance, "unpredictable plot" expresses a positive sentiment in the movie domain whereas "unpredictable touch screen" expresses a negative sentiment in the electronics domain. Given the extensive variety of products and services across countless domains, it is costly to construct labeled data for each product or service. Therefore, domain-independent models with minimal or no supervision are required for aspect-based sentiment analysis systems.

A typical aspect-based sentiment analysis system functions in two phases. First, it extracts aspects. Then, it determines the sentiment of the aspects. In many systems, one of the two phases relies on some form of supervision. For example, predefined aspects are required in [17] for aspect-based sentiment classification. Conversely, aspects are extracted automatically in [30]; however, aspect-based user numerical ratings are required for aspect-based sentiment summarization.

Recently, domain-independent topic-sentiment models (i.e., ASUM [10], JST [14], [15], and HASM [11]) have been proposed for addressing these two problems simultaneously with joint models of topic and sentiment. These models can be applied to any domain because they do not require predefined aspects or a domain-dependent sentiment lexicon. However, the topic-sentiment models fail to automatically identify ratable aspects from many redundant or uncorrelated topics. Furthermore, the optimal number of topics required to model online reviews is either impractically large or too small (e.g., 100 or 2), whereas in our analysis the number of ratable aspects of a product is approximately 10. Consequently, it is difficult to conceptualize or browse the sentiment-oriented ratable aspects, and manual effort is required to identify aspects from topics. We have experimentally observed that many topics do not correspond to ratable aspects and contain redundant or uncorrelated top words, even when these models use approximately 10 topics. Although MG-LDA [31] detects ratable aspects, it cannot identify their sentiment orientation.

These limitations of previous work motivate our research. Although there have been numerous attempts to model both topics and sentiments, no research has examined the effectiveness of multi-grain topic sentiment for aspect-based sentiment classification. Integrating sentiment with multi-grain topics is not trivial because the topics are derived from regions, defined as windows, of a document. We have experimented with many design choices and developed the Joint Multi-grain Topic Sentiment (JMTS) model. JMTS extends MG-LDA by constructing an additional sentiment layer on the presumption that sentiment-oriented ratable aspects are generated from regional distributions of topics and sentiment. One of our key technical contributions is that JMTS relates sentiment to windows and words whereas ASUM and JST relate sentiment to sentences and words. Modeling the relation between sentiment and window proves effective, as the experiments in Section 4 verify.

We extend our preliminary work [2] in three areas. First, we use asymmetric priors while incorporating prior sentiment information into JMTS. Second, we compute the aspect sentiment distribution of the sentences as well as the reviews. Third, we demonstrate the efficacy of JMTS compared to existing models in aspect classification and pointwise mutual information (PMI). The contributions of this paper are as follows:

  • We propose a novel JMTS model for online reviews. JMTS effectively extracts quality sentiment-oriented ratable aspects automatically, eliminating the requirement for manual probing.

  • We verify the efficacy of JMTS qualitatively by demonstrating that JMTS generates correlated top words with low contamination for sentiment-oriented ratable aspects.

  • We confirm that JMTS outperforms existing models (HASM, ASUM, JST, and MG-LDA) with quantitative comparisons.

The remainder of this paper is organized as follows. Section 2 reviews related work. Sections 3 and 4 describe the novel JMTS model and the experimental results, respectively. Section 5 concludes this paper.

Section snippets

Related work

Sentiment analysis is a well-studied problem [23]. Some of the work includes the economic influence and helpfulness of reviews [38], emotion mining [27], stock movements [12], cross-lingual sentiment analysis [16], and review snippet aggregation [28]. The most common sentiment analysis problem is classifying a text into either positive or negative polarity [23]. Some work [7], [34] classifies sentiment into multiple rather than two classes. The majority of the work emphasizes sentiment

Generative model

Our goal is to extract sentiment-oriented ratable aspects of online reviews by extending topic models (e.g., LDA [4]).

LDA generates a document in three steps. To begin, it draws a topic distribution for each document. Then, it selects a topic from the topic distribution. Finally, a word is drawn from the topic. Topics are sampled once for the entire document collection. The graphical model of LDA is presented in Fig. 1a, where the shaded node represents the observed variable and non-shaded
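The three generative steps of LDA described above can be illustrated with a minimal sketch. It uses only the Python standard library; the function names, the symmetric Dirichlet priors, and the default values of `alpha` and `beta` are illustrative assumptions, not settings taken from the paper.

```python
import random


def sample_dirichlet(rng, concentration, size):
    """Draw one symmetric Dirichlet sample by normalizing Gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(n_docs, doc_len, n_topics, vocab_size,
                    alpha=0.5, beta=0.5, seed=0):
    """Generate word-id documents following the LDA generative story.

    alpha and beta are assumed symmetric prior concentrations.
    """
    rng = random.Random(seed)
    # Topic-word distributions are sampled once for the entire collection.
    phi = [sample_dirichlet(rng, beta, vocab_size) for _ in range(n_topics)]
    docs = []
    for _ in range(n_docs):
        # Step 1: draw a topic distribution for this document.
        theta = sample_dirichlet(rng, alpha, n_topics)
        words = []
        for _ in range(doc_len):
            # Step 2: select a topic from the document's topic distribution.
            z = rng.choices(range(n_topics), weights=theta)[0]
            # Step 3: draw a word from the selected topic.
            w = rng.choices(range(vocab_size), weights=phi[z])[0]
            words.append(w)
        docs.append(words)
    return phi, docs


if __name__ == "__main__":
    phi, docs = generate_corpus(n_docs=2, doc_len=8, n_topics=3, vocab_size=10)
    for doc in docs:
        print(doc)
```

Inference runs in the opposite direction: given only the observed words, the latent topic assignments and distributions are estimated. The sketch covers the forward process only.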

Experimental results

We perform both qualitative and quantitative experiments to evaluate JMTS and the quality of extracted aspects. We compare JMTS aspects with the aspects of prior models qualitatively using two criteria: the top words of an aspect should be correlated and minimally contaminated with the top words of other aspects. We also make quantitative comparisons in aspect sentiment classification, aspect classification, and pointwise mutual information (PMI).
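As a rough illustration of the PMI criterion, the standard definition PMI(a, b) = log p(a, b) / (p(a) p(b)) can be estimated from co-occurrence counts. The sketch below assumes document-level co-occurrence and averages over the pairs of an aspect's top words; the function names, the co-occurrence unit, and the handling of pairs that never co-occur are assumptions for illustration, since this excerpt does not specify how the paper estimates PMI.

```python
import math
from itertools import combinations


def pmi(word_a, word_b, doc_sets):
    """PMI of two words, with probabilities estimated as document frequencies."""
    n = len(doc_sets)
    p_a = sum(word_a in d for d in doc_sets) / n
    p_b = sum(word_b in d for d in doc_sets) / n
    p_ab = sum(word_a in d and word_b in d for d in doc_sets) / n
    if p_ab == 0.0:
        # The words never co-occur; PMI is negative infinity.
        return float("-inf")
    return math.log(p_ab / (p_a * p_b))


def mean_pairwise_pmi(top_words, documents):
    """Average PMI over all pairs of top words that co-occur at least once.

    Skipping non-co-occurring pairs is a simplification assumed here.
    """
    doc_sets = [set(doc) for doc in documents]
    scores = [pmi(a, b, doc_sets) for a, b in combinations(top_words, 2)]
    finite = [s for s in scores if math.isfinite(s)]
    return sum(finite) / len(finite) if finite else float("-inf")


if __name__ == "__main__":
    reviews = [
        ["food", "taste", "delicious"],
        ["food", "taste"],
        ["service", "staff", "slow"],
        ["service", "staff"],
    ]
    print(mean_pairwise_pmi(["food", "taste", "delicious"], reviews))
```

Top words that tend to appear in the same reviews yield a higher average, which is the sense in which PMI rewards correlated top words.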

Conclusion

In this paper, we have addressed the problem of extracting sentiment-oriented ratable aspects from online reviews. We have proposed the Joint Multi-grain Topic Sentiment (JMTS) model. We have confirmed that JMTS outperforms the state-of-the-art models qualitatively and quantitatively. We have also demonstrated the quality of extracted aspects compared to human predefined aspects. We are working on unsupervised aspect summarization and aspect rating prediction.

Acknowledgment

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (numbers 2015R1A2A1A10052665 and 2015R1A2A1A15052701).

References (41)

  • S. Brody et al. An unsupervised aspect-sentiment model for online reviews. Proceedings of Human Language Technologies: The 11th Annual Conference of the North American Chapter of the Association for Computational Linguistics (2010)
  • R. Feldman. Techniques and applications for sentiment analysis. Commun. ACM (2013)
  • G. Ganu et al. Beyond the stars: improving rating predictions using review text content. Proceedings of 12th International Workshop on the Web and Databases (2009)
  • T.L. Griffiths et al. Finding scientific topics. Proc. Natl. Acad. Sci. (2004)
  • T. Hofmann. Unsupervised learning by probabilistic latent semantic analysis. Mach. Learn. (2001)
  • Y. Jo et al. Aspect and sentiment unification model for online review analysis. Proceedings of 4th ACM International Conference on Web Search and Data Mining (2011)
  • S. Kim et al. A hierarchical aspect-sentiment model for online reviews. Proceedings of the 27th AAAI Conference on Artificial Intelligence (2013)
  • T. Li et al. A non-negative matrix tri factorization approach to sentiment classification with lexical prior knowledge. Proceedings of Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP (2009)
  • C. Lin et al. Weakly supervised joint sentiment-topic detection from text. IEEE Trans. Knowl. Data Eng. (2012)
  • C. Lin et al. Joint sentiment/topic model for sentiment analysis. Proceedings of 18th ACM International Conference on Information and Knowledge Management (2009)