research-article

Analyzing, Detecting, and Exploiting Sentiment in Web Queries

Published: 01 December 2013

Abstract

The Web contains an increasing number of biased and opinionated documents on politics, products, and polarizing events. In this article, we present an in-depth analysis of Web search queries for controversial topics, focusing on query sentiment. To this end, we conduct extensive user assessments and discriminative term analyses, as well as a sentiment analysis using the SentiWordNet thesaurus, a lexical resource containing sentiment annotations. Furthermore, in order to detect the sentiment expressed in queries, we build different classifiers based on query texts, query result titles, and snippets. We demonstrate the virtue of query sentiment detection in two different use cases. First, we define a query recommendation scenario that employs sentiment detection of results to recommend additional queries for polarized queries issued by search engine users. The second application scenario is controversial topic discovery, where query sentiment classifiers are employed to discover previously unknown topics that trigger both highly positive and negative opinions among the users of a search engine. For both use cases, the results of our evaluations on real-world data are promising and show the viability and potential of query sentiment analysis in practical scenarios.

    • Published in

      ACM Transactions on the Web, Volume 8, Issue 1
      December 2013
      204 pages
      ISSN: 1559-1131
      EISSN: 1559-114X
      DOI: 10.1145/2560539

      Copyright © 2013 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 1 December 2013
      • Revised: 1 September 2013
      • Accepted: 1 September 2013
      • Received: 1 September 2012

      Qualifiers

      • research-article
      • Research
      • Refereed
