poster

An unsupervised ranking method based on a technical difficulty terrain

Authors:
Shoaib Jameel (The Chinese University of Hong Kong, Hong Kong)
Wai Lam (The Chinese University of Hong Kong, Hong Kong)
Ching-man Au Yeung (ASTRI, Hong Kong)
Sheaujiun Chyan (The Chinese University of Hong Kong, Hong Kong)

CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management, October 2011, pages 1989–1992. https://doi.org/10.1145/2063576.2063872

Published: 24 October 2011

ABSTRACT

Users look for information that suits their level of expertise, but finding it often takes considerable effort: one has to sift through multiple pages to find one that fits the appropriate technical background. In this paper, a query-independent ranking system is proposed for technical web pages. The pages returned by the system are sorted by their relative technical difficulty, in ascending or descending order as specified by the user. The technical difficulty of a document, treated as a sequence of terms, is first computed by combining each individual term's geometry in the low-dimensional latent semantic indexing (LSI) space, which can be visualized as a conceptual terrain. The pages are then ranked by the expected cost of getting over the terrain. Results indicate that the terrain-based method outperforms traditional readability measures.
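The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that term vectors come from a truncated SVD of a raw term-document count matrix (a standard way to build an LSI space), and that the "cost to get over the terrain" can be approximated by the mean Euclidean distance between consecutive terms of a document in that space. The function names `lsi_term_vectors`, `terrain_cost`, and `rank_by_difficulty`, the parameter `k`, and the distance-based cost are all hypothetical choices made for illustration.

```python
import numpy as np


def lsi_term_vectors(docs, k=2):
    """Build term vectors in a k-dimensional LSI space (assumed: raw counts + SVD)."""
    vocab = sorted({term for doc in docs for term in doc})
    index = {term: i for i, term in enumerate(vocab)}
    # Term-document count matrix: rows are terms, columns are documents.
    counts = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for term in doc:
            counts[index[term], j] += 1.0
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    k = min(k, len(s))
    # Scale the left singular vectors by the singular values.
    return index, u[:, :k] * s[:k]


def terrain_cost(doc, index, vectors):
    """Hypothetical cost: mean distance between consecutive terms in LSI space."""
    ids = [index[term] for term in doc if term in index]
    if len(ids) < 2:
        return 0.0
    hops = np.linalg.norm(vectors[ids[1:]] - vectors[ids[:-1]], axis=1)
    return float(hops.mean())


def rank_by_difficulty(docs, k=2, descending=False):
    """Return document indices sorted by the assumed cost, plus the costs."""
    index, vectors = lsi_term_vectors(docs, k)
    costs = [terrain_cost(doc, index, vectors) for doc in docs]
    order = sorted(range(len(docs)), key=lambda i: costs[i], reverse=descending)
    return order, costs


if __name__ == "__main__":
    # Toy, already-tokenized documents (illustrative data only).
    toy_docs = [
        ["web", "web", "web"],
        ["web", "page", "link", "html"],
        ["svd", "matrix", "eigenvalue", "svd", "latent"],
    ]
    order, costs = rank_by_difficulty(toy_docs, k=2)
    for i in order:
        print(i, round(costs[i], 3))
```

The ranking is query-independent in the sense the abstract describes: it depends only on the document collection, and the `descending` flag mirrors the user's choice of ascending or descending difficulty. A real system would need tokenization, term weighting, and the cost definition from the paper itself.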


Index Terms

An unsupervised ranking method based on a technical difficulty terrain
1. Information systems
  1. Information retrieval


Published in
CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management
October 2011
2712 pages
ISBN:9781450307178
DOI:10.1145/2063576
Editors:
Bettina Berendt
Arjen de Vries
Wenfei Fan
Craig Macdonald (University of Glasgow, UK)
Iadh Ounis (University of Glasgow, UK)
Ian Ruthven (University of Strathclyde, UK)
Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
conceptual hop model
lsi
technical expertise
Qualifiers
- poster
Acceptance Rates
Overall acceptance rate: 1,861 of 8,427 submissions (22%)