
International Journal of Computer Vision

Published: 1 August 2014

Domain Adaptation for Structured Regression

Authors: Makoto Yamada, Leonid Sigal, Yi Chang

Published in: International Journal of Computer Vision | Issue 1-2/2014


Abstract

Discriminative regression models have proved effective for many vision applications; here we focus on 3D full-body and head pose estimation from image and depth data. However, dataset bias is common and can significantly degrade the performance of a trained model on target test sets. As we show, covariate shift adaptation, a form of unsupervised domain adaptation (USDA), can address certain biases in this setting but cannot deal with more severe structural biases in the data. We propose an effective and efficient semi-supervised domain adaptation (SSDA) approach for addressing such biases. The proposed SSDA is a generalization of USDA that can leverage labeled data in the target domain when available. Our method amounts to projecting input features into a higher-dimensional space (by construction well suited for domain adaptation) and estimating weights for the training samples based on the ratio of test and train marginals in that space. The resulting augmented, weighted samples can then be used to learn a model of choice, alleviating the problems of bias in the data; as an example, we introduce the SSDA twin Gaussian process regression (SSDA-TGP) model. With this model we also address the issue of data sharing, where we leverage samples from certain activities (e.g., walking, jogging) to improve predictive performance on very different activities (e.g., boxing). In addition, we analyze the relationship between domain similarity and the effectiveness of the proposed USDA versus SSDA methods. Moreover, we propose a computationally efficient alternative to TGP (Bo and Sminchisescu 2010) and its variants, called the direct TGP. We show that our model outperforms a number of baselines on two public datasets: HumanEva and the ETH Face Pose Range Image Dataset. Using the proposed direct formulation, we also achieve an 8–15 times speedup in computation time over the traditional formulation of TGP, with little to no loss in performance.
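
One well-known way to project features into a higher-dimensional space suited for domain adaptation is the feature augmentation of Daumé (2007), which appears in the reference list. The sketch below illustrates that construction only. Whether the proposed SSDA uses exactly this mapping is an assumption, and the function names are hypothetical.

```python
def augment_source(x):
    """Map a source-domain vector x to [x, x, 0]: shared, source-only, target-only blocks."""
    return list(x) + list(x) + [0.0] * len(x)


def augment_target(x):
    """Map a target-domain vector x to [x, 0, x]."""
    return list(x) + [0.0] * len(x) + list(x)


def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


# Two illustrative feature vectors (made-up values).
x_a = [1.0, 2.0]
x_b = [3.0, 4.0]

# Same-domain pairs share two blocks, cross-domain pairs share only one.
same_domain = dot(augment_source(x_a), augment_source(x_b))   # 2 * <x_a, x_b>
cross_domain = dot(augment_source(x_a), augment_target(x_b))  # 1 * <x_a, x_b>
print(same_domain, cross_domain)
```

In this augmented space a linear kernel counts same-domain pairs twice as much as cross-domain pairs, so a model can share structure across domains while still fitting domain-specific effects.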


We use the term alignment loosely here: in practice the spaces are not explicitly aligned; rather, the augmented space is by construction better suited to learning the adapted model. This effect is achieved by implicitly assigning higher importance to labeled test samples than to training samples.

Setting \(\alpha = 1\), i.e., \(w_1(\boldsymbol{x}) = \frac{p_{\mathrm{te}}(\boldsymbol{x})}{p_{\mathrm{tr}}(\boldsymbol{x})}\), gives the full adaptation from \(p_{\mathrm{tr}}(\boldsymbol{x})\) to \(p_{\mathrm{te}}(\boldsymbol{x})\). However, \(w_1(\boldsymbol{x})\) can diverge to infinity even under rather simple settings, so its estimation, and with it the covariate shift adaptation, tends to be unstable (Shimodaira 2000). Setting \(0 < \alpha < 1\) is useful in practice for stabilizing the adaptation, even though the resulting model is no longer unbiased under covariate shift (Yamada et al. 2013).
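
The stabilizing effect of \(0 < \alpha < 1\) can be illustrated numerically. The sketch below assumes the relative form \(w_\alpha = p_{\mathrm{te}} / ((1-\alpha)\,p_{\mathrm{te}} + \alpha\,p_{\mathrm{tr}})\), which reduces to the ratio above at \(\alpha = 1\); the exact definition is in the paper body and is not reproduced here. The Gaussian densities and the query point are made-up values, and the true densities are used in place of an estimator.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1D Gaussian N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def relative_importance_weight(p_te, p_tr, alpha):
    """Assumed form: w_alpha = p_te / ((1 - alpha) * p_te + alpha * p_tr)."""
    return p_te / ((1.0 - alpha) * p_te + alpha * p_tr)


# Illustrative setting: the test marginal has heavier tails than the train marginal.
x = 6.0
p_tr = gaussian_pdf(x, mean=0.0, std=1.0)
p_te = gaussian_pdf(x, mean=0.0, std=2.0)

w_full = relative_importance_weight(p_te, p_tr, alpha=1.0)  # plain ratio p_te / p_tr
w_half = relative_importance_weight(p_te, p_tr, alpha=0.5)  # bounded by 1 / (1 - 0.5)
w_none = relative_importance_weight(p_te, p_tr, alpha=0.0)  # no adaptation

print(f"alpha=1.0: {w_full:.3e}, alpha=0.5: {w_half:.6f}, alpha=0.0: {w_none:.1f}")
```

Under this assumed form the weight never exceeds \(1/(1-\alpha)\) for \(\alpha < 1\), whereas the plain ratio grows without bound in the tails of the training distribution.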

The covariate shift assumption formally amounts to assuming that the conditional distributions on the source and target domains are the same, while the marginal distributions differ.
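
Under this assumption, the expected loss on the target domain equals the expected loss on the source domain reweighted by \(p_{\mathrm{te}}(\boldsymbol{x}) / p_{\mathrm{tr}}(\boldsymbol{x})\). The following minimal sketch checks that identity on a discrete toy domain; all probabilities and loss values are made up for illustration and do not come from the paper.

```python
# Marginals over a toy input space {0, 1, 2}; they differ between domains.
p_tr = {0: 0.6, 1: 0.3, 2: 0.1}
p_te = {0: 0.1, 1: 0.3, 2: 0.6}

# Expected loss of some fixed model at each input. Because p(y | x) is shared
# across domains (covariate shift), this per-input loss is domain-independent.
loss = {0: 0.2, 1: 0.5, 2: 1.5}

# Importance weights: ratio of test to train marginals.
weights = {x: p_te[x] / p_tr[x] for x in p_tr}

risk_te = sum(p_te[x] * loss[x] for x in p_te)                     # target risk
risk_tr = sum(p_tr[x] * loss[x] for x in p_tr)                     # unweighted source risk
risk_weighted = sum(p_tr[x] * weights[x] * loss[x] for x in p_tr)  # reweighted source risk

print(risk_te, risk_tr, risk_weighted)
```

The unweighted source risk underestimates the target risk here because the model is worst exactly where the target domain places most of its mass; the weighted version recovers the target risk.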

While it is possible to set \(\beta > 1\), doing so gives even higher importance to the target domain samples (while largely ignoring contributions from the source domain samples), which leads to overfitting when only a few target samples are available.

Agarwal, A., & Triggs, B. (2006). Recovering 3D human pose from monocular images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(1), 44–58.

Ali, S. M., & Silvey, S. D. (1966). A general class of coefficients of divergence of one distribution from another. Journal of the Royal Statistical Society, Series B, 28, 131–142.

Bo, L., & Sminchisescu, C. (2010). Twin Gaussian processes for structured prediction. International Journal of Computer Vision, 87(1–2), 28–52.

Breitenstein, M., Kuettel, D., Weise, T., Van Gool, L., & Pfister, H. (2008). Real-time face pose estimation from single range images. In Proceedings of CVPR (pp. 1–8).

Chen, Y., Kim, T.-K., & Cipolla, R. (2011). Silhouette-based object phenotype recognition using 3D shape priors. In Proceedings of ICCV (pp. 25–32).

Daumé, H. (2007). Frustratingly easy domain adaptation. In Proceedings of ACL (pp. 256–263).

Evgeniou, T., & Pontil, M. (2004). Regularized multi-task learning. In Proceedings of SIGKDD (pp. 109–117).

Fanelli, G., Gall, J., & Van Gool, L. (2011). Real time head pose estimation with random regression forests. In Proceedings of CVPR (pp. 617–624).

Girshick, R., Shotton, J., Kohli, P., Criminisi, A., & Fitzgibbon, A. (2011). Efficient regression of general-activity human poses from depth images. In Proceedings of ICCV (pp. 415–422).

Gong, B., Shi, Y., Sha, F., & Grauman, K. (2012). Geodesic flow kernel for unsupervised domain adaptation. In Proceedings of CVPR (pp. 2066–2073).

Gopalan, R., Li, R., & Chellappa, R. (2011). Domain adaptation for object recognition: An unsupervised approach. In Proceedings of ICCV (pp. 999–1006).

Jiang, J. (2007). A literature survey on domain adaptation of statistical classifiers.

Kanaujia, A., Sminchisescu, C., & Metaxas, D. (2007). Semi-supervised hierarchical models for 3D human pose reconstruction. In Proceedings of CVPR (pp. 1–8).

Khosla, A., Zhou, T., Malisiewicz, T., Efros, A., & Torralba, A. (2012). Undoing the damage of dataset bias. In Proceedings of ECCV (pp. 158–171).

Kulis, B., Saenko, K., & Darrell, T. (2011). What you saw is not what you get: Domain adaptation using asymmetric kernel transforms. In Proceedings of CVPR (pp. 1785–1792).

Lim, J., Salakhutdinov, R., & Torralba, A. (2011). Transfer learning by borrowing examples for multi class object detection. In Proceedings of NIPS (pp. 118–126).

Miller, E., Matsakis, N., & Viola, P. (2000). Learning from one example through shared densities of transforms. In Proceedings of CVPR (pp. 464–471).

Pan, S. J., & Yang, Q. (2010). A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10), 1354–1359.

Saenko, K., Kulis, B., Fritz, M., & Darrell, T. (2010). Adapting visual category models to new domains. In Proceedings of ECCV (pp. 213–226).

Shakhnarovich, G., Viola, P., & Darrell, T. (2003). Fast pose estimation with parameter-sensitive hashing. In Proceedings of ICCV (pp. 750–757).

Shimodaira, H. (2000). Improving predictive inference under covariate shift by weighting the log-likelihood function. Journal of Statistical Planning and Inference, 90(2), 227–244.

Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio, M., Moore, R., Kipman, A., & Blake, A. (2011). Real-time human pose recognition in parts from a single depth image. In Proceedings of CVPR (pp. 1297–1304).

Sigal, L., & Black, M. J. (2006). HumanEva: Synchronized video and motion capture dataset for evaluation of articulated human motion. Technical Report CS-06-08, Brown University.

Sigal, L., Balan, A., & Black, M. (2007). Combined discriminative and generative articulated pose and non-rigid shape estimation. In Proceedings of NIPS (pp. 1337–1344).

Sminchisescu, C., Kanaujia, A., & Metaxas, D. (2006). Learning joint top-down and bottom-up processes for 3D visual inference. In Proceedings of CVPR (pp. 1743–1752).

Sugiyama, M., Nakajima, S., Kashima, H., Buenau, P. V., & Kawanabe, M. (2008). Direct importance estimation with model selection and its application to covariate shift adaptation. In Proceedings of NIPS (pp. 1433–1440).

Sun, M., Kohli, P., & Shotton, J. (2012). Conditional regression forests for human pose estimation. In Proceedings of CVPR (pp. 3394–3401).

Torralba, A., & Efros, A. (2011). Unbiased look at dataset bias. In Proceedings of CVPR (pp. 1521–1528).

Torralba, A., Murphy, K. P., & Freeman, W. T. (2004). Sharing features: efficient boosting procedures for multi-class object detection. In Proceedings of CVPR (pp. 762–769).

Urtasun, R., & Darrell, T. (2008). Sparse probabilistic regression for activity-independent human pose inference. In Proceedings of CVPR (pp. 1–8).

Weise, T., Leibe, B., & Van Gool, L. (2007). Fast 3D scanning with automatic motion compensation. In Proceedings of CVPR (pp. 1–8).

Yamada, M., Sigal, L., & Raptis, M. (2012). No bias left behind: Covariate shift adaptation for discriminative 3D pose estimation. In Proceedings of ECCV (pp. 674–687).

Yamada, M., Suzuki, T., Kanamori, T., Hachiya, H., & Sugiyama, M. (2013). Relative density-ratio estimation for robust distribution comparison. Neural Computation, 25(5), 1324–1370.

Title: Domain Adaptation for Structured Regression
Authors: Makoto Yamada, Leonid Sigal, Yi Chang
Publication date: 1 August 2014
Publisher: Springer US
Published in: International Journal of Computer Vision / Issue 1-2/2014
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-013-0689-x

