ABSTRACT
The TREC-8 Question Answering (QA) Track was the first large-scale evaluation of domain-independent question answering systems. In addition to fostering research on the QA task, the track was used to investigate whether the evaluation methodology used for document retrieval is appropriate for a different natural language processing task. As with document relevance judging, assessors had legitimate differences of opinion as to whether a response actually answers a question, but comparative evaluation of QA systems was stable despite these differences. Creating a reusable QA test collection is fundamentally more difficult than creating a document retrieval test collection since the QA task has no equivalent to document identifiers.
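Stability of comparative evaluation of this kind is conventionally quantified by computing Kendall's tau correlation between the system rankings induced by different assessors' judgment sets: a tau near 1 means the rankings, and hence the comparative conclusions, barely change. The sketch below illustrates the computation; the system names and rankings are hypothetical, not the track's actual results.

```python
from itertools import combinations

def kendall_tau(ranking_a, ranking_b):
    """Kendall's tau between two rankings of the same systems.

    Each ranking is a list of system names ordered best-first.
    tau = (concordant - discordant) / total pairs, in [-1, 1].
    """
    pos_a = {s: i for i, s in enumerate(ranking_a)}
    pos_b = {s: i for i, s in enumerate(ranking_b)}
    concordant = discordant = 0
    for x, y in combinations(ranking_a, 2):
        # A pair is concordant when both rankings order it the same way.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    n = len(ranking_a)
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical rankings induced by two assessors' judgment sets:
official = ["sysA", "sysB", "sysC", "sysD"]
alternate = ["sysA", "sysC", "sysB", "sysD"]
print(kendall_tau(official, alternate))  # one swapped pair out of six
```

With four systems there are six pairs; a single swapped pair yields tau = (5 - 1)/6 ≈ 0.67, so values very close to 1 across many assessor pairs indicate a stable evaluation.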