Published in: Discover Computing 2/2012

1 April 2012

Opinion-based entity ranking

Authors: Kavita Ganesan, ChengXiang Zhai


Abstract

The deployment of Web 2.0 technologies has led to rapid growth of opinions and reviews on the web, such as product reviews and opinions about people. Such content can help people find interesting entities, such as products, businesses and people, based on their individual preferences or tradeoffs. Most existing work on leveraging opinionated content has focused on integrating and summarizing opinions on entities to help users better digest them. In this paper, we propose a different way of leveraging opinionated content: directly ranking entities based on a user’s preferences. Our idea is to represent each entity with the text of all the reviews of that entity. Given a user’s keyword query that expresses the desired features of an entity, we can then rank all the candidate entities based on how well opinions on these entities match the user’s preferences. We study several methods for solving this problem, including both standard text retrieval models and some extensions of these models. Experimental results on ranking entities based on opinions in two different domains (hotels and cars) show that the proposed extensions are effective and improve ranking accuracy over the standard text retrieval models for this task.
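The core idea in the abstract (one pseudo-document per entity built from all of its reviews, ranked against a keyword query) can be illustrated with a minimal sketch. It assumes BM25 as one example of a standard text retrieval model; the abstract does not say which models or extensions the paper evaluates, so this is not the authors' method. The toy reviews, the function names (`tokenize`, `build_entity_docs`, `bm25_rank`) and the parameter values `k1` and `b` are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Crude tokenizer (assumption): lowercase, split on non-alphanumerics.
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def build_entity_docs(reviews_by_entity):
    # Represent each entity by the concatenated text of all its reviews.
    return {
        entity: tokenize(" ".join(reviews))
        for entity, reviews in reviews_by_entity.items()
    }


def bm25_rank(query, entity_docs, k1=1.2, b=0.75):
    # Score every entity pseudo-document against the query with BM25.
    n = len(entity_docs)
    avg_len = sum(len(doc) for doc in entity_docs.values()) / n
    term_freqs = {entity: Counter(doc) for entity, doc in entity_docs.items()}
    query_terms = tokenize(query)
    # Document frequency: number of entities whose reviews mention the term.
    doc_freq = {
        t: sum(1 for tf in term_freqs.values() if t in tf)
        for t in set(query_terms)
    }
    scores = {}
    for entity, tf in term_freqs.items():
        doc_len = len(entity_docs[entity])
        score = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            # Non-negative idf variant (assumption; other variants exist).
            idf = math.log(1 + (n - doc_freq[t] + 0.5) / (doc_freq[t] + 0.5))
            length_norm = 1 - b + b * doc_len / avg_len
            score += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * length_norm)
        scores[entity] = score
    # Highest score first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data, invented for this sketch.
reviews = {
    "hotel_a": ["Very clean rooms and friendly staff.",
                "Clean, quiet, close to the station."],
    "hotel_b": ["Noisy street, but the staff were friendly."],
    "hotel_c": ["Great pool. Breakfast was expensive."],
}
ranking = bm25_rank("clean rooms friendly staff", build_entity_docs(reviews))
for entity, score in ranking:
    print(entity, round(score, 3))
```

In this sketch an entity whose reviews mention none of the query terms receives a score of zero, and entities whose reviews match more of the desired features rank higher. Concatenating reviews is only the simplest way to build the entity representation.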

Metadata
Title
Opinion-based entity ranking
Authors
Kavita Ganesan
ChengXiang Zhai
Publication date
1 April 2012
Publisher
Springer Netherlands
Published in
Discover Computing / Issue 2/2012
Print ISSN: 2948-2984
Electronic ISSN: 2948-2992
DOI
https://doi.org/10.1007/s10791-011-9174-8