DOI: 10.1145/2063576.2063872
poster

An unsupervised ranking method based on a technical difficulty terrain

Published: 24 October 2011

ABSTRACT

Users look for information that suits their level of expertise, but finding it often takes considerable effort: one has to sift through multiple pages to find one that fits the appropriate technical background. In this paper, a query-independent ranking system is proposed for technical web pages. The pages returned by the system are sorted by their relative technical difficulty, in ascending or descending order as specified by the user. The technical difficulty of a document, i.e., a sequence of terms, is first computed by combining the geometry of each individual term in the low-dimensional latent semantic indexing (LSI) space, which can be visualized as a conceptual terrain. The pages are then ranked by the expected cost of traversing the terrain. Results indicate that our terrain-based method outperforms traditional readability measures.
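The abstract does not give the cost formula, so the following is only an illustrative sketch of the general idea: terms have positions in a low-dimensional space, a document is a path through those positions, and pages are sorted by a cost over that path in the order the user requests. The term coordinates, the `traversal_cost` definition (mean distance between consecutive terms), and all names here are assumptions made for illustration, not the authors' method.

```python
import math

# Hypothetical 2-D term coordinates standing in for an LSI projection.
# A real system would derive these from a truncated SVD of a term-document matrix.
TERM_VECTORS = {
    "the": (0.1, 0.0),
    "simple": (0.2, 0.1),
    "data": (0.4, 0.2),
    "matrix": (1.0, 0.8),
    "eigenvalue": (1.6, 1.5),
}


def traversal_cost(terms, vectors=TERM_VECTORS):
    """Assumed stand-in cost: mean Euclidean step between consecutive known terms."""
    points = [vectors[t] for t in terms if t in vectors]
    if len(points) < 2:
        return 0.0
    steps = [math.dist(a, b) for a, b in zip(points, points[1:])]
    return sum(steps) / len(steps)


def rank_by_difficulty(docs, descending=False):
    """Return document ids sorted by cost, ascending unless descending=True."""
    return sorted(docs, key=lambda d: traversal_cost(docs[d]), reverse=descending)


# Toy documents, each represented as a sequence of terms.
docs = {
    "intro": ["the", "simple", "data"],
    "advanced": ["the", "eigenvalue", "matrix", "eigenvalue"],
}

print(rank_by_difficulty(docs))                   # easiest first
print(rank_by_difficulty(docs, descending=True))  # hardest first
```

Note that the ranking in this sketch depends only on the documents themselves, not on any query, which mirrors the query-independent design described above.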


Published in

CIKM '11: Proceedings of the 20th ACM International Conference on Information and Knowledge Management
October 2011, 2712 pages
ISBN: 9781450307178
DOI: 10.1145/2063576

          Copyright © 2011 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Overall acceptance rate: 1,861 of 8,427 submissions, 22%
