ABSTRACT
Community Question Answering (CQA) platforms contain a large number of questions and associated answers. Answerers sometimes include URLs in their answers to provide further information. This paper describes a novel way of building a test collection for web search by exploiting the link information in this type of social media data. We propose to build the test collection by treating CQA questions as queries and the linked web pages as relevant documents. To evaluate this approach, we collect approximately ten thousand CQA queries whose answers, after spam filtering, contain links to ClueWeb09 documents. Experimental results using this collection show that the relative effectiveness of different retrieval models on the ClueWeb-CQA query set is consistent with that on the TREC Web Track query sets, confirming the reliability of our test collection. Further analysis shows that the large number of queries generated through this approach compensates for the sparse relevance judgments when determining significant differences.
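The construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout (`qid`, `question`, `answer_urls`), the function names, the URL-to-document lookup, and the spam threshold of 70 are all assumptions made for illustration. The sketch assumes each document has a spam percentile score where lower means more likely spam, and it emits judgments in the four-column qrels layout commonly used with TREC evaluation tools.

```python
from collections import defaultdict


def build_test_collection(cqa_threads, url_to_docid, spam_percentile,
                          min_percentile=70):
    """Turn CQA threads into (queries, qrels).

    All names and the default threshold are hypothetical.
    cqa_threads:     iterable of dicts with 'qid', 'question', 'answer_urls'
    url_to_docid:    dict mapping a URL to a document id in the collection
    spam_percentile: dict mapping a document id to a spam percentile score
                     (assumed: lower score = more likely spam)
    """
    queries = {}
    qrels = defaultdict(set)
    for thread in cqa_threads:
        qid = thread["qid"]
        for url in thread["answer_urls"]:
            docid = url_to_docid.get(url)
            if docid is None:
                continue  # linked page is not in the document collection
            if spam_percentile.get(docid, 0) < min_percentile:
                continue  # linked page is filtered out as likely spam
            qrels[qid].add(docid)  # linked page counts as relevant
        # Keep the question as a query only if a linked document survived.
        if qid in qrels:
            queries[qid] = thread["question"]
    return queries, dict(qrels)


def format_qrels(qrels):
    """Render judgments as 'qid 0 docid 1' lines (binary relevance)."""
    return [f"{qid} 0 {docid} 1"
            for qid in sorted(qrels)
            for docid in sorted(qrels[qid])]
```

Because only linked pages are judged, each query ends up with very few relevant documents, which is the sparsity the abstract says is offset by the large number of queries.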
Index Terms
- Building a web test collection using social media