2018 | Open Access | Book

Entity-Oriented Search

About this book

This open access book covers all facets of entity-oriented search from a unified point of view, where “search” can be interpreted in the broadest sense of information access, and provides a coherent and comprehensive overview of the state of the art. It represents the first synthesis of research in this broad and rapidly developing area. Selected topics are discussed in depth, with the goal of establishing fundamental techniques and methods as a basis for future research and development. Additional topics are treated at a survey level only, with numerous pointers to the relevant literature. A roadmap for future research, based on open issues and challenges identified along the way, rounds out the book.

The book is divided into three main parts, sandwiched between introductory and concluding chapters. The first two chapters introduce readers to the basic concepts, provide an overview of entity-oriented search tasks, and present the various types and sources of data that will be used throughout the book. Part I deals with the core task of entity ranking: given a textual query, possibly enriched with additional elements or structural hints, return a ranked list of entities. This core task is examined in a number of different variants, using both structured and unstructured data collections, and numerous query formulations. In turn, Part II is devoted to the role of entities in bridging unstructured and structured data. Part III explores how entities can enable search engines to understand the concepts, meaning, and intent behind the query that the user enters into the search box, and how they can provide rich and focused responses (as opposed to merely a list of documents)—a process known as semantic search. The final chapter concludes the book by discussing the limitations of current approaches, and suggesting directions for future research.
Researchers and graduate students are the primary target audience of this book. A general background in information retrieval is sufficient to follow the material; this includes an understanding of basic probability and statistics concepts as well as a basic knowledge of machine learning concepts and supervised learning algorithms.

Table of Contents

Frontmatter

Open Access

Chapter 1. Introduction
Abstract
Entity-oriented search is the search paradigm of organizing and accessing information centered around entities, and their attributes and relationships. This introductory chapter defines what an entity is, identifies prominent contexts for entity-oriented search, presents a number of specific tasks, puts the subject into a historical perspective, and lays the foundations for the rest of the book.
Krisztian Balog

Open Access

Chapter 2. Meet the Data
Abstract
This chapter introduces the different types of data sources, from unstructured to structured, that will be used in the remainder of the book. Specifically, we discuss the web, Wikipedia, and knowledge bases. We further introduce standard datasets and provide pointers to tools and resources.
Krisztian Balog

Entity Ranking

Frontmatter

Open Access

Chapter 3. Term-Based Models for Entity Ranking
Abstract
Ad hoc entity retrieval is the task of answering a free text query with a ranked list of entities. The main idea behind our approaches in this chapter can be summarized as follows: If textual representations can be constructed for entities, then the ranking of these representations (“entity descriptions”) becomes straightforward by building on traditional document retrieval techniques. Accordingly, the bulk of the work presented in this chapter revolves around assembling term-based entity representations from various sources, ranging from unstructured documents to structured knowledge bases. We also discuss evaluation methodology and standard test collections.
Krisztian Balog
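As a rough illustration of the idea in the Chapter 3 abstract (rank textual entity descriptions with a standard document retrieval model), the following minimal sketch scores entities by query likelihood with Dirichlet smoothing. The toy descriptions, the function names, and the choice of this particular retrieval model are assumptions made for illustration; they are not taken from the book.

```python
import math
from collections import Counter

# Hypothetical toy entity descriptions (illustrative data, not from the book).
ENTITY_DESCRIPTIONS = {
    "Ann_Arbor": "city in michigan home of the university of michigan",
    "Michigan": "state in the great lakes region of the united states",
    "Lake_Michigan": "one of the five great lakes of north america",
}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def rank_entities(query, descriptions, mu=10.0):
    """Rank entities by query log-likelihood with Dirichlet smoothing.

    mu is the smoothing parameter; its value here is an arbitrary assumption.
    """
    # Term frequencies per entity description.
    docs = {entity: Counter(tokenize(text)) for entity, text in descriptions.items()}

    # Collection-wide term frequencies, used as the background model.
    collection = Counter()
    for tf in docs.values():
        collection.update(tf)
    collection_len = sum(collection.values())

    scores = {}
    for entity, tf in docs.items():
        doc_len = sum(tf.values())
        score = 0.0
        for term in tokenize(query):
            p_coll = collection.get(term, 0) / collection_len
            if p_coll == 0:
                continue  # ignore query terms that never occur in the collection
            score += math.log((tf.get(term, 0) + mu * p_coll) / (doc_len + mu))
        scores[entity] = score

    # Highest log-likelihood first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_entities("university michigan", ENTITY_DESCRIPTIONS)
for entity, score in ranking:
    print(f"{entity}\t{score:.3f}")
```

In this toy collection only the description of `Ann_Arbor` contains the query terms, so it is ranked first; the other two entities receive only the smoothed background probability.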

Open Access

Chapter 4. Semantically Enriched Models for Entity Ranking
Abstract
Perhaps the most exciting challenge and opportunity in entity retrieval is how to leverage entity-specific properties (attributes, types, and relationships) to improve retrieval performance. In this chapter, we depart from purely term-based approaches and move toward semantically enriched retrieval models. We look at a number of specific entity retrieval tasks that have been studied at various benchmarking campaigns: ad hoc entity retrieval, list search, related entity finding, and similar entity search. We also consider measures of (static) entity importance.
Krisztian Balog

Bridging Text and Structure

Frontmatter

Open Access

Chapter 5. Entity Linking
Abstract
Being able to identify entities in a document is a key step toward understanding what the document is about. Entity linking refers to the process of annotating an input text with entity identifiers from a reference knowledge repository. We present a canonical pipeline approach to entity linking that consists of mention detection, candidate selection, and disambiguation components. Then, we look at each of these components in detail. We further discuss evaluation methodology, test collections, and publicly available entity linking systems.
Krisztian Balog
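The three-stage pipeline named in the Chapter 5 abstract (mention detection, candidate selection, disambiguation) can be sketched roughly as follows. Everything concrete here is a hypothetical stand-in: the surface form dictionary, the prior values, the context keywords, and the scoring rule (prior plus context overlap) are assumptions for illustration, not the methods presented in the book.

```python
# Hypothetical surface form dictionary: mention -> {entity: prior probability}.
SURFACE_FORMS = {
    "jaguar": {"Jaguar_Cars": 0.6, "Jaguar_(animal)": 0.4},
    "amazon": {"Amazon_(company)": 0.7, "Amazon_River": 0.3},
}

# Hypothetical context keywords associated with each entity.
ENTITY_CONTEXT = {
    "Jaguar_Cars": {"car", "engine", "drive"},
    "Jaguar_(animal)": {"cat", "jungle", "prey"},
    "Amazon_(company)": {"shop", "order", "cloud"},
    "Amazon_River": {"river", "jungle", "rainforest"},
}


def detect_mentions(tokens):
    """Step 1: find tokens that match a known surface form."""
    return [(pos, tok) for pos, tok in enumerate(tokens) if tok in SURFACE_FORMS]


def select_candidates(mention, min_prior=0.1):
    """Step 2: keep candidate entities whose prior is not negligible."""
    return {e: p for e, p in SURFACE_FORMS[mention].items() if p >= min_prior}


def disambiguate(candidates, context_tokens):
    """Step 3: pick the candidate with the best prior + context overlap score."""
    def score(item):
        entity, prior = item
        overlap = len(ENTITY_CONTEXT[entity] & context_tokens)
        return prior + overlap

    return max(candidates.items(), key=score)[0]


def link_entities(text):
    """Run the full pipeline and return (position, mention, entity) triples."""
    tokens = text.lower().split()
    context = set(tokens)
    return [
        (pos, mention, disambiguate(select_candidates(mention), context))
        for pos, mention in detect_mentions(tokens)
    ]


links = link_entities("The jaguar stalks its prey in the jungle near the amazon river")
for pos, mention, entity in links:
    print(pos, mention, entity)
```

Here the context words "prey", "jungle", and "river" outweigh the higher priors of the company senses, so both mentions resolve to the animal and the river.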

Open Access

Chapter 6. Populating Knowledge Bases
Abstract
Knowledge base population refers to the task of discovering new facts about entities from a large text corpus, and augmenting a knowledge base with these facts. We start this chapter by giving a brief overview of the broader problem area of extracting structured information from unstructured data. Then, we present a two-step approach that facilitates knowledge base population. In step one, an incoming document stream is filtered to identify documents that potentially contain new facts about a given entity. In step two, the filtered documents are processed for extracting new facts.
Krisztian Balog

Semantic Search

Frontmatter

Open Access

Chapter 7. Understanding Information Needs
Abstract
Understanding what the user is looking for is at the heart of delivering a quality search experience. The focus of this chapter is on obtaining semantically enriched representations of search queries with the help of knowledge repositories. Specifically, we (1) identify the types or categories of entities that are targeted by the query, (2) recognize specific entity mentions in queries and annotate them with unique identifiers from the underlying knowledge repository, and (3) automatically generate query templates from a search log, which then can provide structured interpretations of queries.
Krisztian Balog

Open Access

Chapter 8. Leveraging Entities in Document Retrieval
Abstract
This chapter focuses on the classic problem of ad hoc document retrieval and discusses how entities may be leveraged to improve retrieval performance. Entities facilitate a semantic understanding of both the user’s information need, as expressed by the keyword query, and of the document’s content. We present three different families of approaches: (1) expansion-based methods, which utilize entities as a source of expansion terms to enrich the representation of the query; (2) projection-based methods, which treat entities as a latent layer, while leaving the original document/query representations intact; and (3) entity-based methods, which consider explicitly the entities that are recognized in documents, and embrace entity-based representations in “duet” with traditional term-based representations.
Krisztian Balog

Open Access

Chapter 9. Utilizing Entities for an Enhanced Search Experience
Abstract
This chapter presents a selection of topics, where entities are utilized with the overall aim of improving the users’ search experiences. First, we discuss techniques for assisting users with articulating their information needs, including query assistance services and specialized query building interfaces. Next, we turn to the question of result presentation and introduce entity cards. Finally, we study entity recommendation methods that present users with contextual suggestions, encourage exploration, and allow for serendipitous discoveries.
Krisztian Balog

Open Access

Chapter 10. Conclusions and Future Directions
Abstract
Today, the importance of entities has been broadly recognized, and entities have become first-class citizens in many information access systems, including web, mobile, and enterprise search; question answering; and personal digital assistants. Entities have also become a meeting point for several research communities, including information retrieval, natural language processing, databases, and the Semantic Web. This final chapter concludes the book by summarizing progress, discussing limitations of current approaches, and pointing out potential future research directions.
Krisztian Balog
Backmatter
Metadata
Title
Entity-Oriented Search
Written by
Krisztian Balog
Copyright Year
2018
Electronic ISBN
978-3-319-93935-3
Print ISBN
978-3-319-93933-9
DOI
https://doi.org/10.1007/978-3-319-93935-3