research-article

Personalizing web search results by reading level

Authors:
Kevyn Collins-Thompson, Microsoft Research, Redmond, WA, USA
Paul N. Bennett, Microsoft Research, Redmond, WA, USA
Ryen W. White, Microsoft Research, Redmond, WA, USA
Sebastian de la Chica, Microsoft, Redmond, WA, USA
David Sontag, Microsoft Research New England, Cambridge, MA, USA

CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management
October 2011
Pages 403–412
https://doi.org/10.1145/2063576.2063639

Published: 24 October 2011

ABSTRACT

Traditionally, search engines have ignored the reading difficulty of documents and the reading proficiency of users in computing a document ranking. This is one reason why Web search engines do a poor job of serving an important segment of the population: children. While there are many important problems in interface design, content filtering, and results presentation related to addressing children's search needs, perhaps the most fundamental challenge is simply that of providing relevant results at the right level of reading difficulty. At the opposite end of the proficiency spectrum, it may also be valuable for technical users to find more advanced material or to filter out material at lower levels of difficulty, such as tutorials and introductory texts. We show how reading level can provide a valuable new relevance signal for both general and personalized Web search. We describe models and algorithms to address the three key problems in improving relevance for search using reading difficulty: estimating user proficiency, estimating result difficulty, and re-ranking based on the difference between user and result reading level profiles. We evaluate our methods on a large volume of Web query traffic and provide a large-scale log analysis that highlights the importance of finding results at an appropriate reading level for the user.
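
The third problem named in the abstract, re-ranking based on the difference between user and result reading level profiles, can be illustrated with a minimal sketch. The abstract does not give the paper's actual models, so everything here is an assumption made for illustration: profiles are represented as probability distributions over grade levels, the gap between them is measured as the absolute difference of their expected levels, and the penalty is a hypothetical linear `weight` applied to a base ranker score. The names `Result`, `expected_level`, and `rerank` are invented for this example.

```python
from dataclasses import dataclass


def expected_level(profile):
    """Mean grade level of a {grade: probability} distribution."""
    total = sum(profile.values())
    return sum(grade * p for grade, p in profile.items()) / total


@dataclass
class Result:
    url: str
    base_score: float  # score from the underlying ranker (higher is better)
    difficulty: dict   # assumed form: {grade level: probability}


def rerank(results, user_profile, weight=0.1):
    """Sort results by base score minus a penalty for reading-level mismatch.

    The linear penalty and the default weight are illustrative assumptions,
    not the method evaluated in the paper.
    """
    user_level = expected_level(user_profile)

    def adjusted(result):
        gap = abs(expected_level(result.difficulty) - user_level)
        return result.base_score - weight * gap

    return sorted(results, key=adjusted, reverse=True)
```

With `weight=0.0` the function returns the original ranking unchanged, which makes it easy to compare against the baseline; larger weights push results whose estimated difficulty is far from the user's estimated proficiency further down the list.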

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking


Published in
CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management
October 2011
2712 pages
ISBN: 9781450307178
DOI: 10.1145/2063576
Editors: Bettina Berendt; Arjen de Vries; Wenfei Fan; Craig Macdonald (University of Glasgow, UK); Iadh Ounis (University of Glasgow, UK); Ian Ruthven (University of Strathclyde, UK)
Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
personalization
re-ranking
reading difficulty
Acceptance Rates
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%