Open Access | 2021 | Book

Evaluating Information Retrieval and Access Tasks

NTCIR's Legacy of Research Impact

Edited by: Prof. Tetsuya Sakai, Douglas W. Oard, Prof. Noriko Kando

Publisher: Springer Singapore

Book series: The Information Retrieval Series


About this book

This open access book summarizes the first two decades of the NII Testbeds and Community for Information access Research (NTCIR). NTCIR is a series of evaluation forums run by a global team of researchers and hosted by the National Institute of Informatics (NII), Japan. The book is unique in that it discusses not just what was done at NTCIR, but also how it was done and the impact it has achieved. For example, in some chapters the reader sees the early seeds of what eventually grew to be the search engines that provide access to content on the World Wide Web, today’s smartphones that can tailor what they show to the needs of their owners, and the smart speakers that enrich our lives at home and on the move. We also get glimpses into how new search engines can be built for mathematical formulae, or for the digital record of a lived human life.

Key to the success of the NTCIR endeavor was early recognition that information access research is an empirical discipline and that evaluation therefore lay at the core of the enterprise. Evaluation is thus at the heart of each chapter in this book. They show, for example, how the recognition that some documents are more important than others has shaped thinking about evaluation design. The thirty-three contributors to this volume speak for the many hundreds of researchers from dozens of countries around the world who together shaped NTCIR as organizers and participants.

This book is suitable for researchers, practitioners, and students—anyone who wants to learn about past and present evaluation efforts in information retrieval, information access, and natural language processing, as well as those who want to participate in an evaluation task or even to design and organize one.

Table of Contents

Frontmatter

Chapter 1. Graded Relevance
Abstract
NTCIR was the first large-scale IR evaluation conference series to construct test collections with graded relevance assessments: the NTCIR-1 test collections from 1998 already featured relevant and partially relevant documents. In this chapter, I provide a survey of the use of graded relevance assessments and graded relevance measures in past NTCIR tasks that primarily tackled ranked retrieval. The survey shows that the majority of past tasks fully utilised graded relevance by means of graded evaluation measures, but not all of them; interestingly, even a few relatively recent tasks chose to adhere to binary relevance measures. I conclude the chapter with a summary of the survey in table form.
Tetsuya Sakai
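
As a concrete illustration of the distinction the chapter surveys, the sketch below (an assumption-laden toy example, not code from the chapter) contrasts a binary measure, precision, with a graded measure, nDCG, on two runs that retrieve the same documents in different orders; the relevance levels, gain values, and runs are invented for illustration.

```python
# Minimal sketch: a binary measure cannot distinguish two rankings that a
# graded measure can. Relevance levels: 2 = highly relevant, 1 = partially
# relevant, 0 = not relevant (levels and runs are illustrative assumptions).
import math

def precision(rels, k):
    """Binary precision@k: graded labels collapsed to relevant/non-relevant."""
    return sum(1 for r in rels[:k] if r > 0) / k

def ndcg(rels, k):
    """nDCG@k: graded gains discounted by rank position."""
    dcg = sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = sorted(rels, reverse=True)
    idcg = sum(r / math.log2(i + 2) for i, r in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

run_a = [2, 1, 0, 1]  # ranks the highly relevant document first
run_b = [1, 2, 0, 1]  # ranks a partially relevant document first
print(precision(run_a, 4), precision(run_b, 4))            # 0.75 0.75 (tied)
print(round(ndcg(run_a, 4), 3), round(ndcg(run_b, 4), 3))  # 0.978 0.86 (run A wins)
```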

Chapter 2. Experiments on Cross-Language Information Retrieval Using Comparable Corpora of Chinese, Japanese, and Korean Languages
Abstract
This chapter describes research activities that explored techniques for cross-language information retrieval (CLIR) during the NACSIS Test Collection for Information Retrieval/NII Testbeds and Community for Information access Research (NTCIR)-1 to NTCIR-6 evaluation cycles, which mainly focused on the Chinese, Japanese, and Korean (CJK) languages. First, general procedures and techniques of CLIR are briefly reviewed. Second, the document collections used for the research tasks and the construction of test collections for retrieval experiments are explained. Specifically, the CLIR tasks from NTCIR-3 to NTCIR-6 utilized multilingual corpora consisting of newspaper articles published in Taiwan, Japan, and Korea during the same time periods. Such a set of articles can be considered a “pseudo” comparable corpus because many events or affairs are commonly covered across languages. Comparable corpora of this kind are helpful for comparing CLIR performance across pairs of the CJK languages and English, and this comparison leads to deeper insights into CLIR techniques. The NTCIR CLIR tasks were built on test collections that incorporate such comparable corpora. We summarize the technical advances observed in these CLIR tasks at the end of the chapter.
Kazuaki Kishida, Kuang-hua Chen
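
As a taste of one core technique the chapter reviews, the sketch below shows dictionary-based query translation, in which each source-language query term is expanded into its target-language translation candidates before monolingual retrieval; the toy English–Japanese dictionary and the fallback for unknown terms are assumptions for illustration, not the systems evaluated at NTCIR.

```python
# Dictionary-based query translation, the simplest CLIR pipeline stage:
# translate query terms with a bilingual dictionary, then search a
# monolingual index. The dictionary below is a hypothetical toy.
DICTIONARY = {
    "earthquake": ["地震"],
    "insurance": ["保険", "保障"],  # ambiguous: two candidates
}

def translate_query(terms, dictionary):
    """Expand each query term into all of its translation candidates.
    Translation ambiguity is the central difficulty; real systems
    disambiguate with corpus statistics such as term co-occurrence."""
    translated = []
    for term in terms:
        translated.extend(dictionary.get(term, [term]))  # pass unknown terms through
    return translated

print(translate_query(["earthquake", "insurance"], DICTIONARY))
# ['地震', '保険', '保障']
```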

Chapter 3. Text Summarization Challenge: An Evaluation Program for Text Summarization
Abstract
In Japan, the Text Summarization Challenge (TSC), the first text summarization evaluation of its kind, was conducted in 2000–2001 as part of the NTCIR (NII-NACSIS Test Collection for IR Systems) Workshop. The purpose of the workshop was to facilitate the collection and sharing of text data for summarization by researchers in the field and to clarify the issues surrounding evaluation measures for the summarization of Japanese texts. Thereafter, TSC was held every 18 months as part of the NTCIR project. In this chapter, we describe the TSC series, the data used, the evaluation methods for each task, and the features of TSC evaluation.
Hidetsugu Nanba, Tsutomu Hirao, Takahiro Fukushima, Manabu Okumura

Chapter 4. Challenges in Patent Information Retrieval
Abstract
We organized tasks on patent information retrieval during the decade from NTCIR-3 to NTCIR-8. All of the tasks reflected the real needs of professional patent searchers and used large collections of patent documents. This chapter describes the design of the tasks, the details of the test collections, and the challenges addressed in the research field of patent information retrieval.
Makoto Iwayama, Atsushi Fujii, Hidetsugu Nanba

Chapter 5. Multi-modal Summarization
Abstract
Multi-modal summarization is a technology that provides users with abridgments of topics of interest. Such abridgments consist of organized text and informative graphics. These summarizations play two roles. One is to assist users in reviewing and understanding their topics of interest. The other is to guide users both visually and verbally in their exploratory search. Establishing this technology required integrating several research streams, including information access, information extraction, and information visualization, all of which had been developing rapidly since the beginning of the twenty-first century. MuST was a workshop whose main theme was research on the multi-modal summarization of trend information. It was not an evaluation workshop and did not present participants with a specific task because, at the time it was conducted, multi-modal summarization was merely an agglomeration of yet-to-be-developed technologies that had not been fully synthesized. Rather than sharing a task, the MuST workshop shared a data set. With a shared annotated corpus as its unifying force, the workshop encouraged cooperative and competitive research on trend information. Several innovations emerged from the workshop, covering trend information extraction, visualization as an information access interface and as a data analysis method, linguistic summary generation from charts, and trend mining.
Tsuneaki Kato

Chapter 6. Opinion Analysis Corpora Across Languages
Abstract
At NTCIR-6, 7, and 8, we included a new multilingual opinion analysis task (MOAT) that involved Japanese, English, and Chinese newspapers. This was the first task to compare the performance of sentiment retrieval strategies with common subtasks across languages. In this chapter, we introduce the research question posed by NTCIR MOAT and present what has been achieved to date. We then describe the types of tasks and research, both past and ongoing, that have used our test collection. Finally, we summarize our contributions and discuss future research directions.
Yohei Seki

Chapter 7. Patent Translation
Abstract
The NTCIR patent translation task was the first machine translation task for patents that used large-scale parallel corpora of patent sentence pairs. In this chapter, we first present the history of machine translation, the contribution of evaluation workshops to machine translation research, and the state of patent translation at the time of the first NTCIR patent translation task. We then describe the innovations at NTCIR, including the sharing of research infrastructure, the progress of corpus-based machine translation technologies, and evaluation methods for patent translation. Finally, we outline developments in machine translation technologies, including patent translation, and remark on the future of patent translation.
Isao Goto

Chapter 8. Component-Based Evaluation for Question Answering
Abstract
This chapter describes the component-based evaluation of automatic question answering (QA) systems, which was pioneered in the NTCIR-7 ACLIA challenge and has become a fundamental part of QA system development, especially for difficult real-world datasets that require a multi-strategy, multi-component approach. We summarize the history of component evaluation for QA and describe more recent work at Carnegie Mellon (on the TREC Genomics, BioASQ, and LiveQA datasets) that descends directly from our experiences at NTCIR.
Teruko Mitamura, Eric Nyberg

Chapter 9. Temporal Information Access
Abstract
This chapter introduces the research background and details of the temporal information access tasks at NTCIR. The GeoTime task was the first attempt to evaluate temporal information retrieval, framed as an extension of an information-retrieval-for-question-answering task. Temporalia was a task that investigated the role of temporal factors in search.
Masaharu Yoshioka, Hideo Joho

Chapter 10. SogouQ: The First Large-Scale Test Collection with Click Streams Used in a Shared-Task Evaluation
Abstract
Search logs are very precious for information retrieval studies. In this chapter, we introduce a real Chinese query log dataset, SogouQ, which was released by the Sogou corporation in 2010 for the NTCIR-9 Intent task. SogouQ contains more than 30 million clicks collected in 2008. It was the first large-scale query log used in a shared-task evaluation (i.e., the NTCIR tasks). SogouQ has been adopted in a number of follow-up evaluation tasks (NTCIR-10 Intent-2, NTCIR-11 IMine, and NTCIR-12 IMine-2), as well as in several Chinese domestic tasks. Moreover, SogouQ has had a broader impact on other research areas, such as natural language processing and social science. It has been acquired by more than 200 institutions.
Ruihua Song, Min Zhang, Cheng Luo, Tetsuya Sakai, Yiqun Liu, Zhicheng Dou

Chapter 11. Evaluation of Information Access with Smartphones
Abstract
NTCIR 1CLICK and MobileClick were the earliest attempts at test-collection-based evaluation of information access with smartphones. These campaigns aimed to develop IR systems that output, for a given query, a short text summary that fits a small screen and satisfies the user's information need without requiring much interaction. The textual output was evaluated on the basis of iUnits, pieces of relevant text for a given query, with consideration of users' reading behaviors. This chapter begins with an introduction to NTCIR 1CLICK and MobileClick, explains the evaluation methodology and metrics such as S-measure and M-measure, and finally discusses the potential impacts of those evaluation campaigns.
Makoto P. Kato
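
To give a flavor of the methodology, the sketch below captures the intuition behind S-measure in deliberately simplified form: each iUnit found in the system output contributes its weight, discounted linearly by the character offset at which it appears, so relevant text near the top of a small screen counts more. The weights, offsets, patience parameter, and normalization against an ideal presentation are illustrative assumptions; see the chapter for the official definition.

```python
# Simplified S-measure-style scoring (assumptions, not the official metric):
# an iUnit matched at character offset `off` earns its weight w scaled by
# max(0, (L - off) / L), where L is a "patience" parameter approximating
# how much text a user will read on a small screen.
def s_score(matched, L=500):
    """matched: list of (weight, offset_in_characters) for iUnits found
    in the system output."""
    return sum(w * max(0.0, (L - off) / L) for w, off in matched)

run_a = [(3, 0), (1, 120)]   # heavy iUnit placed first
run_b = [(1, 0), (3, 300)]   # heavy iUnit buried further down
ideal = [(3, 0), (1, 60)]    # hypothetical densest possible presentation
norm = s_score(ideal)        # normalize so an ideal output scores 1.0
print(round(s_score(run_a) / norm, 3), round(s_score(run_b) / norm, 3))
# 0.969 0.567 -- early placement of relevant text is rewarded
```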

Chapter 12. Mathematical Information Retrieval
Abstract
We present an overview of the NTCIR Math Tasks organized during NTCIR-10, 11, and 12. These tasks are primarily dedicated to techniques for searching mathematical content with formula expressions. In this chapter, we first summarize the task design and introduce test collections generated in the tasks. We also describe the features and main challenges of mathematical information retrieval systems and discuss future perspectives in the field.
Akiko Aizawa, Michael Kohlhase
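
For a feel of the problem, the following hypothetical baseline (not one of the task systems) matches formulae by the overlap of their LaTeX symbol tokens; real formula search engines in the Math tasks used far richer representations, such as operator trees and query variables (wildcards) for unification.

```python
# Hypothetical token-overlap baseline for formula search: tokenize LaTeX
# into commands and symbols, then rank indexed formulae by Jaccard
# similarity with the query tokens. Purely illustrative.
import re

def tokenize(latex):
    """Split a LaTeX string into command tokens (\\sqrt, \\frac, ...)
    and single-character symbol tokens."""
    return re.findall(r"\\[A-Za-z]+|[A-Za-z0-9]|[^\sA-Za-z0-9]", latex)

def jaccard(a, b):
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

index = [r"\frac{a}{b}", r"\sqrt{x^2+y^2}", r"a^2+b^2=c^2"]
query = r"x^2+y^2"
ranked = sorted(index, key=lambda f: jaccard(tokenize(f), tokenize(query)),
                reverse=True)
print(ranked[0])  # \sqrt{x^2+y^2} shares the most tokens with the query
```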

Chapter 13. Experiments in Lifelog Organisation and Retrieval at NTCIR
Abstract
Lifelogging can be described as the process by which individuals use various software and hardware devices to gather large archives of multimodal personal data from multiple sources and store them in a personal data archive, called a lifelog. The Lifelog task at NTCIR was a comparative benchmarking exercise that aimed to encourage research into the organisation and retrieval of data from multimodal lifelogs. The task ran for over four years, from NTCIR-12 to NTCIR-14 (2015.02–2019.06), and invited participants to submit to five subtasks, each tackling a different challenge in lifelog retrieval. In this chapter, we motivate the Lifelog task and review the progress made since NTCIR-12. Finally, we present the lessons learned and the open challenges in the domain of lifelog retrieval.
Cathal Gurrin, Hideo Joho, Frank Hopfgartner, Liting Zhou, Rami Albatal, Graham Healy, Duc-Tien Dang Nguyen

Chapter 14. The Future of Information Retrieval Evaluation
Abstract
Looking back over the storied history of NTCIR that is recounted in this volume, we can see many impactful contributions. As we look to the future, we might then ask what points of continuity and change we can reasonably anticipate. Beginning that discussion is the focus of this chapter.
Douglas W. Oard
Backmatter
Metadata
Title
Evaluating Information Retrieval and Access Tasks
Edited by
Prof. Tetsuya Sakai
Douglas W. Oard
Prof. Noriko Kando
Copyright year
2021
Publisher
Springer Singapore
Electronic ISBN
978-981-15-5554-1
Print ISBN
978-981-15-5553-4
DOI
https://doi.org/10.1007/978-981-15-5554-1
