research-article

Integration of news content into web results

Author:
Fernando Diaz

Yahoo! Labs Montreal, Montreal, QC


WSDM '09: Proceedings of the Second ACM International Conference on Web Search and Data Mining, February 2009, Pages 182–191
https://doi.org/10.1145/1498759.1498825

Published: 09 February 2009


ABSTRACT

Aggregated search refers to the integration of content from specialized corpora, or verticals, into web search results. Aggregation improves search when the user has vertical intent but may not be aware of or desire vertical search. In this paper, we address the issue of integrating search results from a news vertical into web search results. News is particularly challenging because, given a query, the appropriate decision (to integrate news content or not) changes with time. Our system adapts to news intent in two ways. First, by inspecting the dynamics of the news collection and query volume, we can track the development of, and interest in, topics. Second, by using click feedback, we can quickly recover from system errors. We define several click-based metrics that allow a system to be monitored and tuned without annotator effort.
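As a rough illustration of how click feedback could let a system recover from a wrong display decision, the sketch below keeps a Beta-Bernoulli estimate of the news click probability for a single query and stops showing news once that estimate drops below a threshold. This is an assumed, simplified model and not the method described in the paper; the class `NewsClickEstimate`, the function `should_show_news`, the prior pseudo-counts, and the threshold value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NewsClickEstimate:
    """Beta-Bernoulli estimate of the news click probability for one query.

    The prior pseudo-counts are hypothetical; in practice they might come
    from a classifier's initial guess about news intent.
    """
    prior_clicks: float = 1.0
    prior_skips: float = 1.0
    clicks: int = 0
    skips: int = 0

    def update(self, clicked: bool) -> None:
        # Record one impression of the news display and whether it was clicked.
        if clicked:
            self.clicks += 1
        else:
            self.skips += 1

    def mean(self) -> float:
        # Posterior mean of the click probability under a Beta prior.
        a = self.prior_clicks + self.clicks
        b = self.prior_skips + self.skips
        return a / (a + b)


def should_show_news(est: NewsClickEstimate, threshold: float = 0.2) -> bool:
    # Show the news display only while the estimated click probability
    # stays at or above the (assumed) threshold.
    return est.mean() >= threshold


if __name__ == "__main__":
    # An optimistic prior says "show news"; repeated skips reverse the decision.
    est = NewsClickEstimate(prior_clicks=3.0, prior_skips=1.0)
    print(should_show_news(est))
    for _ in range(12):
        est.update(clicked=False)
    print(est.mean(), should_show_news(est))
```

With an optimistic prior of 3 clicks to 1 skip, twelve consecutive skipped impressions pull the posterior mean down to 3/16, below the 0.2 threshold, so the display is suppressed without any annotator input.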


Index Terms

Integration of news content into web results
1. Information systems
  1. World Wide Web
    1. Web applications
    2. Web services


Published in
WSDM '09: Proceedings of the Second ACM International Conference on Web Search and Data Mining
February 2009
314 pages
ISBN: 9781605583907
DOI: 10.1145/1498759
Editors:
- Ricardo Baeza-Yates, Yahoo! Research, Spain
- Paolo Boldi, Universita degli Studi di Milano, Italy
- Berthier Ribeiro-Neto, Google Engineering, Brazil & CS Dept., Univ. Fed. de Minas Gerais, Brazil
- B. Barla Cambazoglu, Yahoo! Research
Copyright © 2009 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
click prediction
distributed information retrieval
news search
query similarity
Acceptance Rates
Overall Acceptance Rate: 498 of 2,863 submissions, 17%