DOI: 10.1145/1148170.1148212
Article

A framework to predict the quality of answers with non-textual features

Published: 06 August 2006

ABSTRACT

New types of document collections are being developed by various web services. The service providers keep track of non-textual features such as click counts. In this paper, we present a framework to use non-textual features to predict the quality of documents. We also show that our quality measure can be successfully incorporated into the language modeling-based retrieval model. We test our approach on a collection of question and answer pairs gathered from a community-based question answering service where people ask and answer questions. Experimental results using our quality measure show a significant improvement over our baseline.
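The abstract does not say how the quality measure enters the retrieval model. One common way to combine a document-level score with language-model retrieval is to treat it as a document prior, so that an answer is ranked by log P(D) + log P(Q|D). The sketch below illustrates that idea only. It assumes Dirichlet-smoothed query likelihood, and the function name `score_answer`, the toy answers, and the quality values are all hypothetical, not taken from the paper.

```python
import math
from collections import Counter


def score_answer(query_terms, doc_terms, coll_counts, coll_len, quality, mu=10.0):
    """Return log P(D) + log P(Q|D) for one answer.

    Assumptions (not from the paper): `quality` is a probability-like
    score in (0, 1] used as the document prior P(D), and P(q|D) is
    estimated with Dirichlet smoothing with parameter `mu`.
    """
    tf = Counter(doc_terms)
    doc_len = len(doc_terms)
    log_score = math.log(quality)  # prior term; quality must be > 0
    for term in query_terms:
        p_coll = coll_counts.get(term, 0) / coll_len
        p_term = (tf[term] + mu * p_coll) / (doc_len + mu)
        if p_term == 0.0:
            # Term unseen in the whole collection: likelihood is zero.
            return float("-inf")
        log_score += math.log(p_term)
    return log_score


# Hypothetical answers: (tokens, predicted quality).
answers = {
    "a1": ("restart the router and wait one minute".split(), 0.9),
    "a2": ("restart the router and wait one minute".split(), 0.3),
    "a3": ("i do not know".split(), 0.9),
}

# Collection statistics over all answer texts.
coll_counts = Counter(t for tokens, _ in answers.values() for t in tokens)
coll_len = sum(coll_counts.values())

query = ["restart", "router"]
scores = {
    name: score_answer(query, tokens, coll_counts, coll_len, quality)
    for name, (tokens, quality) in answers.items()
}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked[0])
```

Here `a1` and `a2` have identical text, so their scores differ only by the prior term, which is what lets a quality estimate built from non-textual features reorder answers that the text-based model cannot tell apart.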

      • Published in

        SIGIR '06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
        August 2006
        768 pages
        ISBN: 1595933697
        DOI: 10.1145/1148170

        Copyright © 2006 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • Article

        Acceptance Rates

        Overall Acceptance Rate: 792 of 3,983 submissions, 20%
