Abstract
The Web contains an increasing number of biased and opinionated documents on politics, products, and polarizing events. In this article, we present an in-depth analysis of Web search queries for controversial topics, focusing on query sentiment. To this end, we conduct extensive user assessments and discriminative term analyses, as well as a sentiment analysis using SentiWordNet, a lexical resource containing sentiment annotations. Furthermore, to detect the sentiment expressed in queries, we build different classifiers based on query texts, query result titles, and snippets. We demonstrate the value of query sentiment detection in two use cases. First, we define a query recommendation scenario that employs sentiment detection of results to recommend additional queries for polarized queries issued by search engine users. The second application scenario is controversial topic discovery, where query sentiment classifiers are employed to discover previously unknown topics that trigger both highly positive and highly negative opinions among the users of a search engine. For both use cases, the results of our evaluations on real-world data are promising and show the viability and potential of query sentiment analysis in practical scenarios.
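As a rough illustration of the two ideas named in the abstract (lexicon-based query sentiment and controversial topic discovery), the following minimal sketch may help. It is not the authors' method: the toy lexicon, its scores, the threshold, and the function names `query_polarity`, `label`, and `controversial_topics` are all assumptions made for this example. The article itself builds trained classifiers over query texts, result titles, and snippets.

```python
# Hypothetical toy lexicon: term -> (positive score, negative score).
# Real SentiWordNet scores are attached to synsets; these values are made up.
TOY_LEXICON = {
    "good": (0.75, 0.0),
    "benefits": (0.5, 0.0),
    "bad": (0.0, 0.625),
    "dangers": (0.0, 0.5),
}


def query_polarity(query, lexicon=TOY_LEXICON):
    """Sum (positive - negative) over the query terms found in the lexicon."""
    score = 0.0
    for term in query.lower().split():
        pos, neg = lexicon.get(term, (0.0, 0.0))
        score += pos - neg
    return score


def label(query, threshold=0.25):
    """Map a polarity score to a coarse label (the threshold is an assumption)."""
    score = query_polarity(query)
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def controversial_topics(queries_by_topic, min_each=1):
    """Return topics with at least min_each positive and min_each negative queries."""
    found = []
    for topic, queries in queries_by_topic.items():
        labels = [label(q) for q in queries]
        if (labels.count("positive") >= min_each
                and labels.count("negative") >= min_each):
            found.append(topic)
    return sorted(found)
```

In this sketch a topic counts as controversial only when its queries fall on both sides of the polarity scale, which mirrors the abstract's description of topics that trigger both highly positive and highly negative opinions.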