
About this Book

This comprehensive text/reference presents a broad review of diverse domain adaptation (DA) methods for machine learning, with a focus on solutions for visual applications. The book brings together solutions and perspectives proposed by an international selection of pre-eminent experts in the field, addressing not only classical image categorization, but also other computer vision tasks such as detection, segmentation and visual attributes.

Topics and features: surveys the complete field of visual DA, including shallow methods designed for homogeneous and heterogeneous data as well as deep architectures; positions the dataset bias problem within the CNN-based feature arena; proposes detailed analyses of popular shallow methods that address landmark data selection, kernel embedding, feature alignment, joint feature transformation and classifier adaptation, and the case of limited access to the source data; discusses more recent deep DA methods, including discrepancy-based adaptation networks and adversarial discriminative DA models; addresses domain adaptation problems beyond image categorization, such as Fisher encoding adaptation for vehicle re-identification, semantic segmentation and detection trained on synthetic images, and domain generalization for semantic part detection; describes a multi-source domain generalization technique for visual attributes and a unifying framework for multi-domain and multi-task learning.

This authoritative volume will be of great interest to a broad audience ranging from researchers and practitioners to students involved in computer vision, pattern recognition and machine learning.

Table of Contents

Frontmatter

Chapter 1. A Comprehensive Survey on Domain Adaptation for Visual Applications

The aim of this chapter is to give an overview of domain adaptation and transfer learning with a specific view to visual applications. After a general motivation, we first position domain adaptation within the more general transfer learning problem. Second, we briefly review and analyze state-of-the-art methods for different types of scenarios, describing the historical shallow methods for both homogeneous and heterogeneous domain adaptation. Third, we discuss how the success of deep convolutional architectures led to a new type of domain adaptation method that integrates the adaptation within the deep architecture. Fourth, we review DA methods that go beyond image categorization, such as object detection, image segmentation, video analysis or learning visual attributes. We conclude the chapter with a section relating domain adaptation to other machine learning solutions.
Gabriela Csurka

Chapter 2. A Deeper Look at Dataset Bias

The presence of bias in image data collections has recently attracted a lot of attention in the computer vision community, as it shows the limits in generalization of any learning method trained on a specific dataset. At the same time, with the rapid development of deep learning architectures, the activation values of Convolutional Neural Networks (CNNs) are emerging as reliable and robust image descriptors. In this chapter we propose to verify the potential of CNN features when facing the dataset bias problem. For this purpose we introduce a large testbed for cross-dataset analysis and discuss the challenges faced in creating two comprehensive experimental setups by aligning twelve existing image databases. We conduct a series of analyses looking at how the datasets differ from each other and verifying the performance of existing debiasing methods under different representations. We learn important lessons on which parts of the dataset bias problem can be considered solved and which open questions still need to be tackled.
Tatiana Tommasi, Novi Patricia, Barbara Caputo, Tinne Tuytelaars

Shallow Domain Adaptation Methods

Frontmatter

Chapter 3. Geodesic Flow Kernel and Landmarks: Kernel Methods for Unsupervised Domain Adaptation

Domain Adaptation (DA) aims to correct the mismatch in statistical properties between the source domain on which a classifier is trained and the target domain to which the classifier is to be applied. In this chapter, we address the challenging scenario of unsupervised domain adaptation, where the target domain does not provide any annotated data to assist in adapting the classifier. Our strategy is to learn robust features which are resilient to the mismatch across domains, allowing us to construct classifiers that will perform well on the target domain. To this end, we propose novel kernel learning approaches for inferring such features for adaptation. Concretely, we explore two closely related directions. On one hand, we propose unsupervised learning of a Geodesic Flow Kernel (GFK) which summarizes the inner products in an infinite sequence of feature subspaces that smoothly interpolate between the source and target domains. On the other hand, we propose supervised learning of a kernel that discriminatively combines multiple base GFKs to model the source and the target domains at fine-grained granularities. In particular, each base kernel pivots on a different set of landmarks—the most useful data instances that reveal the similarity between the source and target domains, thus bridging them to achieve adaptation. The proposed approaches are computationally convenient and capable of learning features/kernels and classifiers discriminatively without the need for labeled target data. We show through extensive empirical studies, using standard benchmark object recognition datasets, that our approaches outperform a variety of competing methods.
Boqing Gong, Kristen Grauman, Fei Sha

Chapter 4. Unsupervised Domain Adaptation Based on Subspace Alignment

Subspace-based domain adaptation methods have been very successful in the context of image recognition. In this chapter, we discuss methods using Subspace Alignment (SA). They are based on a mapping function which aligns the source subspace with the target one, so as to obtain a domain-invariant feature space. The solution of the corresponding optimization problem can be obtained in closed form, leading to a fast, simple-to-implement algorithm. The only hyperparameter involved is the dimension of the subspaces. We give two methods, SA and SA-MLE, for setting this variable. SA is a purely linear method. As a nonlinear extension, Landmarks-based Kernelized Subspace Alignment (LSSA) first projects the data nonlinearly based on a set of landmarks, which have been selected so as to reduce the discrepancy between the domains.
Basura Fernando, Rahaf Aljundi, Rémi Emonet, Amaury Habrard, Marc Sebban, Tinne Tuytelaars
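The closed-form alignment step described above can be illustrated with a short NumPy sketch. This is an assumption-laden illustration, not the chapter's code: the PCA-via-SVD helper, centering, and function names are choices made here.

```python
import numpy as np

def pca_basis(X, k):
    # top-k principal directions of centered data, returned as columns (d x k)
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T

def subspace_alignment(Xs, Xt, k):
    # closed-form Subspace Alignment: M = Ps^T Pt maps the source basis
    # onto the target basis, yielding a domain-invariant feature space
    Ps = pca_basis(Xs, k)                   # source subspace
    Pt = pca_basis(Xt, k)                   # target subspace
    M = Ps.T @ Pt                           # alignment matrix (k x k)
    src = (Xs - Xs.mean(axis=0)) @ Ps @ M   # source data in the aligned space
    tgt = (Xt - Xt.mean(axis=0)) @ Pt       # target data in its own subspace
    return src, tgt
```

A classifier trained on `src` can then be evaluated on `tgt`; the subspace dimension `k` is the single hyperparameter the chapter discusses.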

Chapter 5. Learning Domain Invariant Embeddings by Matching Distributions

One of the characteristics of the domain shift problem is that the source and target data have been drawn from different distributions. A natural approach to addressing this problem therefore consists of learning an embedding of the source and target data such that they have similar distributions in the new space. In this chapter, we study several methods that follow this approach. At the core of these methods lies the notion of distance between two distributions. We first discuss domain adaptation (DA) techniques that rely on the Maximum Mean Discrepancy to measure such a distance. We then study the use of alternative distribution distance measures within one specific Domain Adaptation framework. In this context, we focus on f-divergences, and in particular on the KL divergence and the Hellinger distance. Throughout the chapter, we evaluate the different methods and distance measures on the task of visual object recognition and compare them against related baselines on a standard DA benchmark dataset.
Mahsa Baktashmotlagh, Mehrtash Harandi, Mathieu Salzmann
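For concreteness, the Maximum Mean Discrepancy between two samples can be estimated empirically with a kernel. The following sketch computes a biased estimate of the squared MMD; the RBF kernel choice and the bandwidth `gamma` are assumptions for illustration, not the chapter's experimental setup.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # pairwise RBF kernel values exp(-gamma * ||a - b||^2)
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def mmd2(Xs, Xt, gamma=1.0):
    # biased empirical estimate of the squared Maximum Mean Discrepancy
    return (rbf_kernel(Xs, Xs, gamma).mean()
            + rbf_kernel(Xt, Xt, gamma).mean()
            - 2.0 * rbf_kernel(Xs, Xt, gamma).mean())
```

`mmd2` is zero for identical samples and grows as the two distributions drift apart; DA methods of this family minimize such a distance over a learned embedding of source and target data.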

Chapter 6. Adaptive Transductive Transfer Machines: A Pipeline for Unsupervised Domain Adaptation

This chapter addresses the problem of transfer learning by unsupervised domain adaptation. We introduce a pipeline designed for the case where the joint distribution of samples and labels \(P(\mathbf{X}^{src},\mathbf{Y}^{src})\) in the source domain is assumed to be different from, but related to, that of the target domain \(P(\mathbf{X}^{trg},\mathbf{Y}^{trg})\), and where labels \(\mathbf{Y}^{trg}\) are not available for the target set. This is a problem of transductive transfer learning. In contrast to other methodologies in this book, our method combines steps that adapt both the marginal and the conditional distributions of the data.
Nazli Farajidavar, Teofilo de Campos, Josef Kittler

Chapter 7. What to Do When the Access to the Source Data Is Constrained?

A large majority of existing domain adaptation methods make the assumption of freely available labeled source and unlabeled target data. They exploit the discrepancy between their distributions and build representations common to both the target and source domains. In reality, such a simplifying assumption rarely holds, since source data are routinely subject to legal and contractual constraints between data owners and data customers. Despite limited access to source domain data, decision-making procedures might be available in the form of, e.g., classification rules trained on the source and made ready for direct deployment and later reuse. In other cases, the owner of the source data is allowed to share a few representative examples such as class means. The aim of this chapter is therefore to address the domain adaptation problem in such constrained real-world applications, i.e., where the reuse of source domain data is limited to classification rules or a few representative examples. As a solution, we extend recent techniques based on feature corruption and marginalization, considering both supervised and unsupervised domain adaptation settings. The proposed models are tested and compared on private and publicly available source datasets, showing significant performance gains despite the absence of the full source data and a shortage of labeled target data.
Gabriela Csurka, Boris Chidlovskii, Stéphane Clinchant

Deep Domain Adaptation Methods

Frontmatter

Chapter 8. Correlation Alignment for Unsupervised Domain Adaptation

In this chapter, we present CORrelation ALignment (CORAL), a simple yet effective method for unsupervised domain adaptation. CORAL minimizes domain shift by aligning the second-order statistics of source and target distributions, without requiring any target labels. In contrast to subspace manifold methods, it aligns the original feature distributions of the source and target domains, rather than the bases of lower-dimensional subspaces. It is also much simpler than other distribution matching methods. CORAL performs remarkably well in extensive evaluations on standard benchmark datasets. We first describe a solution that applies a linear transformation to source features to align them with target features before classifier training. For linear classifiers, we propose to equivalently apply CORAL to the classifier weights, leading to added efficiency when the number of classifiers is small but the number and dimensionality of target examples are very high. The resulting CORAL Linear Discriminant Analysis (CORAL-LDA) outperforms LDA by a large margin on standard domain adaptation benchmarks. Finally, we extend CORAL to learn a nonlinear transformation that aligns correlations of layer activations in deep neural networks (DNNs). The resulting Deep CORAL approach works seamlessly with DNNs and achieves state-of-the-art performance on standard benchmark datasets. Our code is available at: https://github.com/VisionLearningGroup/CORAL.
Baochen Sun, Jiashi Feng, Kate Saenko
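The linear variant described above amounts to whitening the source features and re-coloring them with the target covariance. A minimal sketch follows; the regularization value `eps` and the eigendecomposition route for the matrix square roots are assumptions made here for illustration.

```python
import numpy as np

def coral(Ds, Dt, eps=1e-3):
    # align second-order statistics: whiten source, re-color with target
    d = Ds.shape[1]
    Cs = np.cov(Ds, rowvar=False) + eps * np.eye(d)   # regularized source covariance
    Ct = np.cov(Dt, rowvar=False) + eps * np.eye(d)   # regularized target covariance

    def mat_pow(C, p):
        # matrix power of a symmetric positive-definite matrix via eigh
        w, V = np.linalg.eigh(C)
        return V @ np.diag(w ** p) @ V.T

    A = mat_pow(Cs, -0.5) @ mat_pow(Ct, 0.5)   # whitening then re-coloring
    return Ds @ A   # source features with target-like covariance
```

After the transform, the covariance of the source features closely matches that of the target, so a classifier trained on the transformed source transfers better.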

Chapter 9. Simultaneous Deep Transfer Across Domains and Tasks

Recent reports suggest that a generic supervised deep CNN model trained on a large-scale dataset reduces, but does not remove, dataset bias. Fine-tuning deep models in a new domain can require a significant amount of labeled data, which for many applications is simply not available. We propose a new CNN architecture to exploit unlabeled and sparsely labeled target domain data. Our approach simultaneously optimizes for domain invariance to facilitate domain transfer and uses a soft label distribution matching loss to transfer information between tasks. Our proposed adaptation method exceeds previously published results on two standard benchmark visual domain adaptation tasks, evaluated across supervised and semi-supervised adaptation settings.
Judy Hoffman, Eric Tzeng, Trevor Darrell, Kate Saenko

Chapter 10. Domain-Adversarial Training of Neural Networks

We introduce a representation learning approach for domain adaptation, in which data at training and test time come from similar but different distributions. Our approach is directly inspired by the theory on domain adaptation suggesting that, for effective domain transfer to be achieved, predictions must be made based on features that cannot discriminate between the training (source) and test (target) domains. The approach implements this idea in the context of neural network architectures that are trained on labeled data from the source domain and unlabeled data from the target domain (no labeled target-domain data is necessary). As training progresses, the approach promotes the emergence of features that are (i) discriminative for the main learning task on the source domain and (ii) indiscriminate with respect to the shift between the domains. We show that this adaptation behavior can be achieved in almost any feed-forward model by augmenting it with a few standard layers and a new Gradient Reversal Layer. The resulting augmented architecture can be trained using standard backpropagation, and can thus be implemented with little effort using any of the deep learning packages. We demonstrate the success of our approach for image classification, where state-of-the-art domain adaptation performance on standard benchmarks is achieved. We also validate the approach for the descriptor learning task in the context of a person re-identification application.
Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, Victor Lempitsky
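The Gradient Reversal Layer at the heart of this approach is the identity in the forward pass and multiplies gradients by a negative factor in the backward pass. A minimal, framework-free sketch of its semantics follows; the coefficient name `lam` and the class interface are assumptions for illustration.

```python
import numpy as np

class GradientReversal:
    """Identity on the forward pass; scales gradients by -lam on the backward
    pass, so the feature extractor is pushed to *confuse* the domain classifier
    while the rest of the network trains with standard backpropagation."""

    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, x):
        return x                        # features pass through unchanged

    def backward(self, grad_output):
        return -self.lam * grad_output  # reversed, scaled gradient for the features
```

Placed between the feature extractor and the domain classifier, this single layer turns ordinary backpropagation into the adversarial training the chapter describes.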

Beyond Image Classification

Frontmatter

Chapter 11. Unsupervised Fisher Vector Adaptation for Re-identification

Matching and recognizing objects in images and videos under varying imaging conditions is a challenging problem. We are particularly interested in the unsupervised setting, i.e., when we do not have labeled data to adapt to the new conditions. Our focus in this work is on the Fisher Vector framework, which has been shown to be a state-of-the-art patch encoding technique. Fisher Vectors primarily encode patch statistics by measuring first- and second-order statistics with respect to an a priori learned generative model. In this work, we show that it is possible to reduce the domain impact on the Fisher Vector representation by adapting the generative model parameters to the new conditions using unsupervised model adaptation techniques borrowed from the speech community. We explain under which conditions the domain influence is canceled out and show experimentally on two in-house license plate matching databases that the proposed approach improves accuracy.
Usman Tariq, Jose A. Rodriguez-Serrano, Florent Perronnin
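The model adaptation techniques borrowed from the speech community typically take the form of MAP (relevance) adaptation of the GMM parameters underlying the Fisher Vector. As a hedged sketch of mean adaptation only, where the relevance factor `r` and the function interface are assumptions rather than the chapter's exact formulation:

```python
import numpy as np

def map_adapt_means(means, resp, X, r=16.0):
    # MAP (relevance) adaptation of GMM component means toward target data
    # resp: (n, K) posterior responsibility of each component for each sample
    n_k = resp.sum(axis=0)                    # soft count per component
    ml_means = (resp.T @ X) / n_k[:, None]    # ML mean estimates from target data
    alpha = (n_k / (n_k + r))[:, None]        # data-driven interpolation weight
    return alpha * ml_means + (1.0 - alpha) * means
```

Components with ample target evidence move toward the target statistics, while rarely observed components stay close to the prior model, which is what allows the domain influence on the encoding to be reduced without target labels.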

Chapter 12. Semantic Segmentation of Urban Scenes via Domain Adaptation of SYNTHIA

Vision-based semantic segmentation in urban scenarios is a key functionality for autonomous driving. Recent revolutionary results of deep convolutional neural networks (CNNs) foreshadow the advent of reliable classifiers to perform such visual tasks. However, CNNs require learning many parameters from raw images; thus, a sufficient amount of diverse images with class annotations is needed. These annotations are obtained via cumbersome human labor, which is particularly challenging for semantic segmentation since pixel-level annotations are required. In this chapter, we propose to use a combination of a virtual world, to automatically generate realistic synthetic images with pixel-level annotations, and domain adaptation, to transfer the learned models so that they operate correctly in real scenarios. We address the question of how useful synthetic data can be for semantic segmentation—in particular, when using a CNN paradigm. In order to answer this question we have generated a synthetic collection of diverse urban images, named SYNTHIA, with automatically generated class annotations and object identifiers. We use SYNTHIA in combination with publicly available real-world urban images with manually provided annotations. Then, we conduct experiments with CNNs that show that combining SYNTHIA with simple domain adaptation techniques in the training stage significantly improves performance on semantic segmentation.
German Ros, Laura Sellart, Gabriel Villalonga, Elias Maidanik, Francisco Molero, Marc Garcia, Adriana Cedeño, Francisco Perez, Didier Ramirez, Eduardo Escobar, Jose Luis Gomez, David Vazquez, Antonio M. Lopez

Chapter 13. From Virtual to Real World Visual Perception Using Domain Adaptation—The DPM as Example

Supervised learning tends to produce more accurate classifiers than unsupervised learning in general. This implies that training data is preferred with annotations. When addressing visual perception challenges, such as localizing certain object classes within an image, the learning of the involved classifiers turns out to be a practical bottleneck. The reason is that, at a minimum, we have to frame object examples with bounding boxes in thousands of images. A priori, the more complex the model is in terms of its number of parameters, the more annotated examples are required. This annotation task is performed by human oracles, which results in inaccuracies and errors in the annotations (aka ground truth), since the task is inherently cumbersome and sometimes ambiguous. As an alternative, we have pioneered the use of virtual worlds for collecting such annotations automatically and with high precision. However, since the models learned with virtual data must operate in the real world, we still need to perform domain adaptation (DA). In this chapter, we revisit the DA of a Deformable Part-Based Model (DPM) as an exemplifying case of virtual-to-real-world DA. As a use case, we address the challenge of vehicle detection for driver assistance, using different publicly available virtual-world data. In doing so, we investigate questions such as how the domain gap between virtual and real data behaves with respect to the dominant object appearance per domain, as well as the role of photo-realism in the virtual world.
Antonio M. López, Jiaolong Xu, José L. Gómez, David Vázquez, Germán Ros

Chapter 14. Generalizing Semantic Part Detectors Across Domains

The recent success of deep learning methods is partially due to large quantities of annotated data for an increasingly wide variety of categories. However, indefinitely acquiring large amounts of annotations is not a sustainable process, and one can wonder if there exists a volume of annotations beyond which a task can be considered solved, or at least saturated. In this work, we study this crucial question for the task of detecting semantic parts, which are often seen as a natural way to share knowledge between categories. To this end, on a large dataset of 15,000 images from 100 different animal classes annotated with semantic parts, we consider the two following research questions: (i) are semantic parts really visually shareable between classes? and (ii) how many annotations are required to learn a model that generalizes well enough to unseen categories? To answer these questions we thoroughly test active learning and DA techniques, and we study their generalization to parts from unseen classes when they are learned from a limited number of domains and example images. One of our conclusions is that, for a majority of the domains, part annotations transfer well, and that performance on the semantic part detection task on this dataset reaches 98% of the accuracy of the fully annotated scenario with only a few thousand examples.
David Novotny, Diane Larlus, Andrea Vedaldi

Beyond Domain Adaptation: Unifying Perspectives

Frontmatter

Chapter 15. A Multisource Domain Generalization Approach to Visual Attribute Detection

Attributes possess appealing properties and benefit many computer vision problems, such as object recognition, learning with humans in the loop, and image retrieval. Whereas existing work mainly pursues utilizing attributes for various computer vision problems, we contend that the most basic problem—how to accurately and robustly detect attributes from images—has been left underexplored. In particular, existing work rarely tackles explicitly the requirement that attribute detectors generalize well across different categories, including those previously unseen. Noting that this is analogous to the objective of multisource domain generalization, if we treat each category as a domain, we provide a novel perspective on attribute detection and propose to gear the techniques of multisource domain generalization toward learning cross-category generalizable attribute detectors. We validate our understanding and approach with extensive experiments on four challenging datasets and two different problems.
Chuang Gan, Tianbao Yang, Boqing Gong

Chapter 16. Unifying Multi-domain Multitask Learning: Tensor and Neural Network Perspectives

Multi-domain learning aims to benefit from simultaneously learning across several different but related domains. In this chapter, we propose a single framework that unifies multi-domain learning (MDL) and the related but better studied area of multitask learning (MTL). By exploiting the concept of a semantic descriptor, we show how our framework encompasses various classic and recent MDL/MTL algorithms as special cases with different semantic descriptor encodings. As a second contribution, we present a higher-order generalization of this framework, capable of simultaneous multitask-multi-domain learning. This generalization has two mathematically equivalent views in multilinear algebra and gated neural networks, respectively. Moreover, by exploiting the semantic descriptor, it provides neural networks with the capability of zero-shot learning (ZSL), where a classifier is generated for an unseen class without any training data, as well as zero-shot domain adaptation (ZSDA), where a model is generated for an unseen domain without any training data. In practice, this framework provides a powerful yet easy-to-implement method that can be flexibly applied to MTL, MDL, ZSL, and ZSDA.
Yongxin Yang, Timothy M. Hospedales

Backmatter
