DOI: 10.1145/511446.511500
Article

Probabilistic question answering on the web

Published: 07 May 2002

ABSTRACT

Web-based search engines such as Google and NorthernLight return documents that are relevant to a user query, not answers to user questions. We have developed an architecture that augments existing search engines so that they support natural language question answering. The process entails five steps: query modulation, document retrieval, passage extraction, phrase extraction, and answer ranking. In this paper we describe some probabilistic approaches to the last three of these stages. We show how our techniques apply to a number of existing search engines, and we present results contrasting three different methods for question answering. Our algorithm, probabilistic phrase reranking (PPR) using proximity and question-type features, achieves a total reciprocal document rank of 0.20 on the TREC 8 corpus. Our techniques have been implemented as a Web-accessible system called NSIR.
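The abstract reports a total reciprocal document rank but does not define it. The sketch below is a minimal illustration, assuming the metric sums 1/rank over every 1-based rank at which a correct item appears in a ranked list, and that a system-level figure is the mean of those per-question scores. The function names `total_reciprocal_rank` and `mean_score` are hypothetical and are not taken from NSIR.

```python
from typing import Iterable, Sequence


def total_reciprocal_rank(correct_ranks: Iterable[int]) -> float:
    """Sum 1/rank over the 1-based ranks at which a correct item appears.

    Assumed definition: the abstract names the metric but does not spell it out.
    """
    total = 0.0
    for rank in correct_ranks:
        if rank < 1:
            raise ValueError("ranks are 1-based and must be >= 1")
        total += 1.0 / rank
    return total


def mean_score(per_question_ranks: Sequence[Iterable[int]]) -> float:
    """Average the per-question scores; an empty question set scores 0.0."""
    scores = [total_reciprocal_rank(ranks) for ranks in per_question_ranks]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Correct items found at ranks 2, 8 and 10 for a single question.
    print(total_reciprocal_rank([2, 8, 10]))
    # Two questions: one answered at rank 1, one with no correct item.
    print(mean_score([[1], []]))
```

Under this assumed definition, a question with no correct item in the ranked list contributes 0, so the mean rewards both finding correct items and placing them near the top.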


Published in: WWW '02: Proceedings of the 11th international conference on World Wide Web, May 2002, 754 pages
ISBN: 1581134495
DOI: 10.1145/511446

Copyright © 2002 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher: Association for Computing Machinery, New York, NY, United States


Overall acceptance rate: 1,899 of 8,196 submissions, 23%
