Published in: International Journal of Computer Vision 1-2/2014

1 August 2014

Domain Adaptation for Structured Regression

Authors: Makoto Yamada, Leonid Sigal, Yi Chang


Abstract

Discriminative regression models have proved effective for many vision applications; here we focus on 3D full-body and head pose estimation from image and depth data. However, dataset bias is common and can significantly degrade the performance of a trained model on target test sets. As we show, covariate shift adaptation, a form of unsupervised domain adaptation (USDA), can address certain biases in this setting, but cannot deal with more severe structural biases in the data. We propose an effective and efficient semi-supervised domain adaptation (SSDA) approach for addressing such severe biases. The proposed SSDA is a generalization of USDA that can effectively leverage labeled data in the target domain when available. Our method amounts to projecting input features into a higher-dimensional space (by construction well suited for domain adaptation) and estimating weights for the training samples based on the ratio of test and train marginals in that space. The resulting augmented weighted samples can then be used to learn a model of choice, alleviating the problems of bias in the data; as an example, we introduce the SSDA twin Gaussian process regression (SSDA-TGP) model. With this model we also address the issue of data sharing, where we leverage samples from certain activities (e.g., walking, jogging) to improve predictive performance on very different activities (e.g., boxing). In addition, we analyze the relationship between domain similarity and the effectiveness of the proposed USDA versus SSDA methods. Moreover, we propose a computationally efficient alternative to TGP (Bo and Sminchisescu 2010) and its variants, called the direct TGP. We show that our model outperforms a number of baselines on two public datasets: HumanEva and the ETH Face Pose Range Image Dataset. Using the proposed direct formulation, we also achieve an 8–15 times speedup in computation time over the traditional formulation of TGP, with little to no loss in performance.
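
The abstract does not spell out the higher-dimensional projection. As a rough illustration only, the sketch below uses the feature augmentation of Daumé (2007) from the reference list as a stand-in; the function names `augment` and `dot` and the toy vector are hypothetical, and the mapping used in the article itself may differ.

```python
from typing import List, Sequence


def augment(x: Sequence[float], domain: str) -> List[float]:
    """Daumé-style augmentation (assumed stand-in, not the article's exact map).

    Layout of the result: [shared copy, source-only copy, target-only copy].
    """
    features = list(x)
    zeros = [0.0] * len(features)
    if domain == "source":
        return features + features + zeros
    if domain == "target":
        return features + zeros + features
    raise ValueError(f"unknown domain: {domain!r}")


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain inner product, i.e. a linear kernel."""
    return sum(u * v for u, v in zip(a, b))


# The same toy input, seen once as a source sample and once as a target sample.
x = [1.0, 2.0]
x_source = augment(x, "source")
x_target = augment(x, "target")

same_domain = dot(x_source, x_source)   # shared + source-only blocks both match
cross_domain = dot(x_source, x_target)  # only the shared block matches

print(same_domain, cross_domain)
```

Under a linear kernel, two samples from the same domain look twice as similar as the same two samples taken from different domains, which is one way an augmented space can favor same-domain evidence without discarding the other domain.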

Footnotes
1
We use the term alignment loosely here, as in practice, the spaces are not explicitly aligned, but rather the augmented space is by construction better suited for learning the adapted model; this effect is achieved by implicitly assigning higher importance to labeled test samples over training samples.
 
2
\(\alpha = 1\) (i.e., \(w_1({\varvec{x}}) = \frac{p_{\mathrm {te}}({\varvec{x}})}{p_{\mathrm {tr}}({\varvec{x}})}\)) gives the full adaptation from \(p_{\mathrm {tr}}({\varvec{x}})\) to \(p_{\mathrm {te}}({\varvec{x}})\). However, since the importance weight \(w_1({\varvec{x}})\) can diverge to infinity even under a rather simple setting, its estimation is unstable, and so is the resulting covariate shift adaptation (Shimodaira 2000). To cope with this instability, setting \(0 < \alpha < 1\) is useful in practice for stabilizing the covariate shift adaptation, even though it cannot give an unbiased model under covariate shift (Yamada et al. 2013).
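
The stabilizing effect of \(\alpha < 1\) can be illustrated numerically. The sketch below assumes the relative density-ratio form \(w_\alpha = p_{\mathrm{te}} / ((1-\alpha)\, p_{\mathrm{te}} + \alpha\, p_{\mathrm{tr}})\), which is consistent with \(\alpha = 1\) recovering \(p_{\mathrm{te}}/p_{\mathrm{tr}}\) as stated in this footnote; the exact parameterization is not given here, so treat it as an assumption. The two Gaussian densities and the names `gaussian_pdf` and `relative_weight` are hypothetical.

```python
import math


def gaussian_pdf(x: float, mean: float, std: float) -> float:
    """Density of a 1D Gaussian N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def relative_weight(p_te: float, p_tr: float, alpha: float) -> float:
    """Assumed form: w_alpha = p_te / ((1 - alpha) * p_te + alpha * p_tr).

    alpha = 1 recovers the plain ratio p_te / p_tr; for 0 <= alpha < 1 the
    weight is bounded above by 1 / (1 - alpha).
    """
    return p_te / ((1.0 - alpha) * p_te + alpha * p_tr)


# Hypothetical densities: training inputs ~ N(0, 1), test inputs ~ N(2, 1).
x = 6.0  # a point far in the tail of the training distribution
p_tr = gaussian_pdf(x, mean=0.0, std=1.0)
p_te = gaussian_pdf(x, mean=2.0, std=1.0)

w_full = relative_weight(p_te, p_tr, alpha=1.0)  # plain ratio, equals exp(2x - 2) here
w_half = relative_weight(p_te, p_tr, alpha=0.5)  # bounded by 1 / (1 - 0.5) = 2

print(f"alpha=1.0: w = {w_full:.1f}")
print(f"alpha=0.5: w = {w_half:.4f}")
```

With these toy densities the plain ratio grows exponentially in \(x\), so a single training sample in the tail can dominate a weighted fit, whereas the flattened weight stays below \(1/(1-\alpha)\).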
 
3
The covariate shift assumption formally amounts to assuming that the conditional distributions on the source and target domains are the same, while the marginal distributions differ.
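
A standard consequence of this assumption, sketched in the notation of these footnotes (the loss \(\ell\), the model \(f\), and the output \({\varvec{y}}\) are illustrative symbols that do not appear in the text above): because \(p_{\mathrm{tr}}({\varvec{y}}|{\varvec{x}}) = p_{\mathrm{te}}({\varvec{y}}|{\varvec{x}}) = p({\varvec{y}}|{\varvec{x}})\), the expected test loss can be written as an importance-weighted expectation under the training distribution, provided \(p_{\mathrm{tr}}({\varvec{x}}) > 0\) wherever \(p_{\mathrm{te}}({\varvec{x}}) > 0\):

\[
\begin{aligned}
\mathbb{E}_{p_{\mathrm{te}}({\varvec{x}},{\varvec{y}})}\big[\ell(f({\varvec{x}}),{\varvec{y}})\big]
&= \iint \ell(f({\varvec{x}}),{\varvec{y}})\, p({\varvec{y}}|{\varvec{x}})\, p_{\mathrm{te}}({\varvec{x}})\, \mathrm{d}{\varvec{x}}\, \mathrm{d}{\varvec{y}} \\
&= \iint \ell(f({\varvec{x}}),{\varvec{y}})\, \frac{p_{\mathrm{te}}({\varvec{x}})}{p_{\mathrm{tr}}({\varvec{x}})}\, p({\varvec{y}}|{\varvec{x}})\, p_{\mathrm{tr}}({\varvec{x}})\, \mathrm{d}{\varvec{x}}\, \mathrm{d}{\varvec{y}} \\
&= \mathbb{E}_{p_{\mathrm{tr}}({\varvec{x}},{\varvec{y}})}\big[w_1({\varvec{x}})\, \ell(f({\varvec{x}}),{\varvec{y}})\big].
\end{aligned}
\]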
 
4
While it is possible to set \(\beta > 1\), this gives an even higher importance to the target domain samples (meanwhile largely ignoring contributions from the source domain samples), which with few target samples leads to overfitting.
 
References
Agarwal, A., & Triggs, B. (2006). Recovering 3D human pose from monocular images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(1), 44–58.
Ali, S. M., & Silvey, S. D. (1966). A general class of coefficients of divergence of one distribution from another. Journal of the Royal Statistical Society, Series B, 28, 131–142.
Bo, L., & Sminchisescu, C. (2010). Twin Gaussian processes for structured prediction. International Journal of Computer Vision, 87(1–2), 28–52.
Breitenstein, M., Kuettel, D., Weise, T., Van Gool, L., & Pfister, H. (2008). Real-time face pose estimation from single range images. In Proceedings of CVPR (pp. 1–8).
Chen, Y., Kim, T.-K., & Cipolla, R. (2011). Silhouette-based object phenotype recognition using 3D shape priors. In Proceedings of ICCV (pp. 25–32).
Daumé, H. (2007). Frustratingly easy domain adaptation. In Proceedings of ACL (pp. 256–263).
Evgeniou, T., & Pontil, M. (2004). Regularized multi-task learning. In Proceedings of SIGKDD (pp. 109–117).
Fanelli, G., Gall, J., & Van Gool, L. (2011). Real time head pose estimation with random regression forests. In Proceedings of CVPR (pp. 617–624).
Girshick, R., Shotton, J., Kohli, P., Criminisi, A., & Fitzgibbon, A. (2011). Efficient regression of general-activity human poses from depth images. In Proceedings of ICCV (pp. 415–422).
Gong, B., Shi, Y., Sha, F., & Grauman, K. (2012). Geodesic flow kernel for unsupervised domain adaptation. In Proceedings of CVPR (pp. 2066–2073).
Gopalan, R., Li, R., & Chellappa, R. (2011). Domain adaptation for object recognition: An unsupervised approach. In Proceedings of ICCV (pp. 999–1006).
Jiang, J. (2007). A literature survey on domain adaptation of statistical classifiers.
Kanaujia, A., Sminchisescu, C., & Metaxas, D. (2007). Semi-supervised hierarchical models for 3D human pose reconstruction. In Proceedings of CVPR (pp. 1–8).
Khosla, A., Zhou, T., Malisiewicz, T., Efros, A., & Torralba, A. (2012). Undoing the damage of dataset bias. In Proceedings of ECCV (pp. 158–171).
Kulis, B., Saenko, K., & Darrell, T. (2011). What you saw is not what you get: Domain adaptation using asymmetric kernel transforms. In Proceedings of CVPR (pp. 1785–1792).
Lim, J., Salakhutdinov, R., & Torralba, A. (2011). Transfer learning by borrowing examples for multi class object detection. In Proceedings of NIPS (pp. 118–126).
Miller, E., Matsakis, N., & Viola, P. (2000). Learning from one example through shared densities of transforms. In Proceedings of CVPR (pp. 464–471).
Pan, S. J., & Yang, Q. (2010). A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10), 1354–1359.
Saenko, K., Kulis, B., Fritz, M., & Darrell, T. (2010). Adapting visual category models to new domains. In Proceedings of ECCV (pp. 213–226).
Shakhnarovich, G., Viola, P., & Darrell, T. (2003). Fast pose estimation with parameter-sensitive hashing. In Proceedings of ICCV (pp. 750–757).
Shimodaira, H. (2000). Improving predictive inference under covariate shift by weighting the log-likelihood function. Journal of Statistical Planning and Inference, 90(2), 227–244.
Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio, M., Moore, R., Kipman, A., & Blake, A. (2011). Real-time human pose recognition in parts from a single depth image. In Proceedings of CVPR (pp. 1297–1304).
Sigal, L., & Black, M. J. (2006). Humaneva: Synchronized video and motion capture dataset for evaluation of articulated human motion. In TR CS-06-08, Brown University.
Sigal, L., Balan, A., & Black, M. (2007). Combined discriminative and generative articulated pose and non-rigid shape estimation. In Proceedings of NIPS (pp. 1337–1344).
Sminchisescu, C., Kanaujia, A., & Metaxas, D. (2006). Learning joint top-down and bottom-up processes for 3D visual inference. In Proceedings of CVPR (pp. 1743–1752).
Sugiyama, M., Nakajima, S., Kashima, H., Buenau, P. V., & Kawanabe, M. (2008). Direct importance estimation with model selection and its application to covariate shift adaptation. In Proceedings of NIPS (pp. 1433–1440).
Sun, M., Kohli, P., & Shotton, J. (2012). Conditional regression forests for human pose estimation. In Proceedings of CVPR (pp. 3394–3401).
Torralba, A., & Efros, A. (2011). Unbiased look at dataset bias. In Proceedings of CVPR (pp. 1521–1528).
Torralba, A., Murphy, K. P., & Freeman, W. T. (2004). Sharing features: efficient boosting procedures for multi-class object detection. In Proceedings of CVPR (pp. 762–769).
Urtasun, R., & Darrell, T. (2008). Sparse probabilistic regression for activity-independent human pose inference. In Proceedings of CVPR (pp. 1–8).
Weise, T., Leibe, B., & Van Gool, L. (2007). Fast 3D scanning with automatic motion compensation. In Proceedings of CVPR (pp. 1–8).
Yamada, M., Sigal, L., & Raptis, M. (2012). No bias left behind: Covariate shift adaptation for discriminative 3D pose estimation. In Proceedings of ECCV (pp. 674–687).
Yamada, M., Suzuki, T., Kanamori, T., Hachiya, H., & Sugiyama, M. (2013). Relative density-ratio estimation for robust distribution comparison. Neural Computation, 25(5), 1324–1370.
Metadata
Title
Domain Adaptation for Structured Regression
Authors
Makoto Yamada
Leonid Sigal
Yi Chang
Publication date
1 August 2014
Publisher
Springer US
Published in
International Journal of Computer Vision / Issue 1-2/2014
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-013-0689-x