Abstract
The wealth of information on the web makes it an attractive resource for seeking quick answers to simple, factual questions such as “who was the first American in space?” or “what is the second tallest mountain in the world?” Yet today's most advanced web search services (e.g., Google and AskJeeves) make it surprisingly tedious to locate answers to such questions. In this paper, we extend question-answering techniques, first studied in the information retrieval literature, to the web and experimentally evaluate their performance. First, we introduce Mulder, which we believe to be the first general-purpose, fully automated question-answering system available on the web. Second, we describe Mulder's architecture, which relies on multiple search-engine queries, natural-language parsing, and a novel voting procedure to yield reliable answers coupled with high recall. Finally, we compare Mulder's performance to that of Google and AskJeeves on questions drawn from the TREC-8 question answering track. We find that Mulder's recall is more than a factor of three higher than that of AskJeeves. In addition, we find that Google requires 6.6 times as much user effort to achieve the same level of recall as Mulder.
Index Terms
- Scaling question answering to the web