Efficient construction of large test collections

Authors:
Gordon V. Cormack, Department of Computer Science, University of Waterloo, Waterloo, Ontario, Canada
Christopher R. Palmer, Department of Computer Science, University of Waterloo, Waterloo, Ontario, Canada
Charles L. A. Clarke, Department of Electrical and Computer Engineering, University of Toronto, Toronto, Ontario, Canada
SIGIR '98: Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, August 1998, Pages 282–289
https://doi.org/10.1145/290941.291009

Published: 01 August 1998

Index Terms

Efficient construction of large test collections
1. Information systems
  1. Information retrieval
  2. Information storage systems

Published in
SIGIR '98: Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
August 1998
394 pages
ISBN: 1581130155
DOI: 10.1145/290941
Chairmen:
W. Bruce Croft, Univ. of Massachusetts
Alistair Moffat, Univ. of Melbourne, Victoria, Australia
C. J. van Rijsbergen, Univ. of Glasgow, Scotland, UK
Ross Wilkinson, RMIT Univ., Australia and CSIRO
Justin Zobel, RMIT Univ., Australia
Copyright © 1998 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Acceptance Rates
Overall Acceptance Rate: 792 of 3,983 submissions, 20%