Open Access 2023 | Open Access | Book

Learning to Quantify

Written by: Andrea Esuli, Alessandro Fabris, Alejandro Moreo, Fabrizio Sebastiani

Publisher: Springer International Publishing

Book series: The Information Retrieval Series

About this book

This open access book provides an introduction and an overview of learning to quantify (a.k.a. “quantification”), i.e. the task of training estimators of class proportions in unlabeled data by means of supervised learning. In data science, learning to quantify is a task of its own related to classification yet different from it, since estimating class proportions by simply classifying all data and counting the labels assigned by the classifier is known to often return inaccurate (“biased”) class proportion estimates.
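The bias of "classify and count" can be illustrated with a few lines of arithmetic. The sketch below is not taken from the book: it assumes a binary classifier whose true positive rate and false positive rate stay fixed, and the function names and the rates 0.80 and 0.10 are illustrative assumptions. Under that assumption, the expected share of items labelled positive differs from the true prevalence, and the relation can be inverted to correct the count.

```python
def expected_cc_estimate(true_prevalence, tpr, fpr):
    """Expected fraction of items a classifier labels positive.

    Positives are detected at rate tpr; negatives are mislabelled
    as positive at rate fpr (both assumed fixed, for illustration).
    """
    return true_prevalence * tpr + (1.0 - true_prevalence) * fpr


def adjusted_estimate(cc_estimate, tpr, fpr):
    """Invert the relation above and clip the result to [0, 1]."""
    if tpr == fpr:
        raise ValueError("tpr and fpr must differ for the correction to exist")
    prevalence = (cc_estimate - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, prevalence))


if __name__ == "__main__":
    tpr, fpr = 0.80, 0.10  # hypothetical classifier rates
    for p in (0.05, 0.20, 0.50, 0.90):
        cc = expected_cc_estimate(p, tpr, fpr)
        adj = adjusted_estimate(cc, tpr, fpr)
        print(f"true={p:.2f}  classify-and-count={cc:.3f}  adjusted={adj:.3f}")
```

With these assumed rates, the raw count overestimates low prevalences and underestimates high ones. In practice the two rates must themselves be estimated from held-out labelled data, so the correction is only as good as those estimates.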

The book introduces learning to quantify by looking at the supervised learning methods that can be used to perform it, at the evaluation measures and evaluation protocols that should be used for evaluating the quality of the returned predictions, at the numerous fields of human activity in which the use of quantification techniques may provide improved results with respect to the naive use of classification techniques, and at advanced topics in quantification research.

The book is suitable for researchers, data scientists, and PhD students who want to come up to speed with the state of the art in learning to quantify, and also for researchers wishing to apply data science technologies to fields of human activity (e.g., the social sciences, political science, epidemiology, market research) that focus on aggregate (“macro”) data rather than on individual (“micro”) data.

Table of Contents

Frontmatter

Open Access

Chapter 1. The Case for Quantification
Abstract
This chapter sets the stage for the rest of the book by introducing notions fundamental to quantification, such as class proportions, class distributions and their estimation, dataset shift, and the various subtypes of dataset shift which are relevant to the quantification endeavour. In this chapter we also argue why using classification techniques for estimating class distributions is suboptimal, and we then discuss why learning to quantify has evolved as a task of its own, rather than remaining a by-product of classification.
Andrea Esuli, Alessandro Fabris, Alejandro Moreo, Fabrizio Sebastiani

Open Access

Chapter 2. Applications of Quantification
Abstract
This chapter provides the motivation for what is to come in the rest of the book by describing the applications that quantification has been put to, ranging from improving classification accuracy in domain adaptation, to measuring and improving the fairness of classification systems with respect to a sensitive attribute, to supporting research and development in fields that are usually more concerned with aggregate data than with individual data, such as the social sciences, political science, epidemiology, market research, ecological modelling, and others.
Andrea Esuli, Alessandro Fabris, Alejandro Moreo, Fabrizio Sebastiani

Open Access

Chapter 3. Evaluation of Quantification Algorithms
Abstract
In this chapter we discuss the experimental evaluation of quantification systems. We look at evaluation measures for the various types of quantification systems (binary, single-label multiclass, multi-label multiclass, ordinal), but also at evaluation protocols for quantification, which essentially consist of ways to extract multiple testing samples for use in quantification evaluation from a single classification test set. The chapter ends with a discussion on how to perform model selection (i.e., hyperparameter optimization) in a quantification-specific way.
Andrea Esuli, Alessandro Fabris, Alejandro Moreo, Fabrizio Sebastiani
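The idea of extracting multiple testing samples from a single labelled test set can be sketched as follows. This is a minimal illustration, not the book's protocol: it assumes binary labels coded as 0 and 1, and the helper names `prevalence_grid`, `draw_sample`, and `absolute_error` are hypothetical. Each sample is drawn so that its positive-class prevalence matches a target value, which allows a quantifier to be tested across the whole range of prevalences.

```python
import random


def prevalence_grid(steps):
    """Evenly spaced target prevalences from 0 to 1 inclusive."""
    return [k / steps for k in range(steps + 1)]


def draw_sample(labels, prevalence, size, rng):
    """Return indices of a sample with the requested positive prevalence.

    Items are drawn without replacement, separately from the
    positive and the negative portion of the test set.
    """
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    n_pos = round(prevalence * size)
    n_neg = size - n_pos
    if n_pos > len(positives) or n_neg > len(negatives):
        raise ValueError("test set too small for this prevalence and size")
    return rng.sample(positives, n_pos) + rng.sample(negatives, n_neg)


def absolute_error(true_prevalence, estimated_prevalence):
    """Simple per-sample error between true and estimated prevalence."""
    return abs(true_prevalence - estimated_prevalence)


if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed for repeatability
    labels = [1] * 50 + [0] * 50    # toy test set, assumed for illustration
    for target in prevalence_grid(4):
        sample = draw_sample(labels, target, 20, rng)
        realised = sum(labels[i] for i in sample) / len(sample)
        print(f"target={target:.2f}  realised={realised:.2f}")
```

A quantifier under evaluation would be applied to each sample, and its estimates compared with the realised prevalences using an error measure such as the absolute error above.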

Open Access

Chapter 4. Methods for Learning to Quantify
Abstract
This chapter is possibly the central chapter of the book, and looks at the various supervised learning methods for learning to quantify that have been proposed over the years. These methods belong to two main categories, depending on whether they have an aggregative nature (i.e., they require the classification of all individual unlabelled items as an intermediate step) or a non-aggregative nature (i.e., they perform no classification of individual items). In turn, the aggregative methods may be seen as belonging to two main sub-categories, depending on whether the classification of individual unlabelled items is performed by classifiers trained via general-purpose learners or via special-purpose, quantification-oriented learners.
Andrea Esuli, Alessandro Fabris, Alejandro Moreo, Fabrizio Sebastiani

Open Access

Chapter 5. Advanced Topics
Abstract
In this chapter we look at a number of “advanced” (or niche) topics in quantification, including quantification for ordinal data, “regression quantification” (the task that stands to regression as “standard” quantification stands to classification), cross-lingual quantification for textual data, quantification for networked data, and quantification for streaming data. The chapter ends with a discussion on how to derive confidence intervals for the class prevalence estimates returned by quantification systems.
Andrea Esuli, Alessandro Fabris, Alejandro Moreo, Fabrizio Sebastiani

Open Access

Chapter 6. The Quantification Landscape
Abstract
This chapter looks at other aspects of the “quantification landscape” that have not been covered in the previous chapters, and discusses the evolution of quantification research, from its beginnings to the most recent quantification-based “shared tasks”; the landscape of quantification-based, publicly available software libraries; visualization tools specifically oriented to displaying the results of quantification-based experiments; and other tasks in data science that present important similarities with quantification. This chapter also presents the results of experiments that we have carried out ourselves, in which we compare many of the methods discussed in Chapter 4 on a common testing infrastructure.
Andrea Esuli, Alessandro Fabris, Alejandro Moreo, Fabrizio Sebastiani

Open Access

Chapter 7. The Road Ahead
Abstract
This chapter concludes the book, discussing possible future developments in the quantification arena.
Andrea Esuli, Alessandro Fabris, Alejandro Moreo, Fabrizio Sebastiani
Backmatter
Metadata
Title
Learning to Quantify
Written by
Andrea Esuli
Alessandro Fabris
Alejandro Moreo
Fabrizio Sebastiani
Copyright year
2023
Electronic ISBN
978-3-031-20467-8
Print ISBN
978-3-031-20466-1
DOI
https://doi.org/10.1007/978-3-031-20467-8
