Published in: International Journal of Computer Vision 1-2/2014

01-08-2014

Domain Adaptation for Structured Regression

Authors: Makoto Yamada, Leonid Sigal, Yi Chang


Abstract

Discriminative regression models have proved effective for many vision applications; here we focus on 3D full-body and head pose estimation from image and depth data. However, dataset bias is common and can significantly degrade the performance of a trained model on target test sets. As we show, covariate shift adaptation, a form of unsupervised domain adaptation (USDA), can address certain biases in this setting but cannot deal with more severe structural biases in the data. We propose an effective and efficient semi-supervised domain adaptation (SSDA) approach for addressing these more severe biases. The proposed SSDA is a generalization of USDA that can effectively leverage labeled data in the target domain when available. Our method amounts to projecting input features into a higher-dimensional space (by construction well suited for domain adaptation) and estimating weights for the training samples based on the ratio of test and train marginals in that space. The resulting augmented weighted samples can then be used to learn a model of choice, alleviating the problems of bias in the data; as an example, we introduce the SSDA twin Gaussian process regression (SSDA-TGP) model. With this model we also address the issue of data sharing, where we leverage samples from certain activities (e.g., walking, jogging) to improve predictive performance on very different activities (e.g., boxing). In addition, we analyze the relationship between domain similarity and the effectiveness of the proposed USDA versus SSDA methods. Moreover, we propose a computationally efficient alternative to TGP (Bo and Sminchisescu 2010) and its variants, called the direct TGP. We show that our model outperforms a number of baselines on two public datasets: HumanEva and the ETH Face Pose Range Image Dataset. We also achieve an 8–15 times speedup in computation time over the traditional formulation of TGP using the proposed direct formulation, with little to no loss in performance.
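The two ingredients named in the abstract, a higher-dimensional projection and per-sample weights, can be illustrated with a minimal sketch. The abstract does not spell out the projection, so the code below assumes the feature augmentation of Daumé (2007), which appears in the reference list, as a stand-in; the function names and the weighted squared-error objective are hypothetical illustrations, not the authors' SSDA-TGP model.

```python
def augment_source(x):
    """ASSUMED Daume-style map for a source (training) sample: x -> [x, x, 0]."""
    return list(x) + list(x) + [0.0] * len(x)


def augment_target(x):
    """ASSUMED Daume-style map for a target (test) sample: x -> [x, 0, x]."""
    return list(x) + [0.0] * len(x) + list(x)


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def weighted_squared_error(weights, predictions, targets):
    """Importance-weighted mean squared error over the training samples.

    Each weight stands in for an estimate of the test/train density ratio
    at that sample; how the weights are estimated is outside this sketch.
    """
    total = sum(w * (p - t) ** 2 for w, p, t in zip(weights, predictions, targets))
    return total / len(weights)


# Same-domain pairs share two blocks, cross-domain pairs share only one,
# so same-domain similarity is doubled relative to cross-domain similarity.
a, b = [1.0, 2.0], [3.0, 4.0]
same_domain = dot(augment_source(a), augment_source(b))
cross_domain = dot(augment_source(a), augment_target(b))
```

Under this assumed augmentation, a linear kernel between two samples from the same domain is twice the kernel between samples from different domains. This is one simple way in which an augmented space can be "by construction" suited to domain adaptation.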

Footnotes
1
We use the term alignment loosely here, as in practice, the spaces are not explicitly aligned, but rather the augmented space is by construction better suited for learning the adapted model; this effect is achieved by implicitly assigning higher importance to labeled test samples over training samples.
 
2
Setting \(\alpha = 1\), i.e., \(w_1({\varvec{x}}) = \frac{p_{\mathrm {te}}({\varvec{x}})}{p_{\mathrm {tr}}({\varvec{x}})}\), gives the full adaptation from \(p_{\mathrm {tr}}({\varvec{x}})\) to \(p_{\mathrm {te}}({\varvec{x}})\). However, since the importance weight \(w_1({\varvec{x}})\) can diverge to infinity even under rather simple settings, its estimation is unstable, and covariate shift adaptation therefore tends to be unstable as well (Shimodaira 2000). To cope with this instability, setting \(0 < \alpha < 1\) is useful in practice for stabilizing covariate shift adaptation, even though it cannot give an unbiased model under covariate shift (Yamada et al. 2013).
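The instability described in this footnote can be seen in a toy setting. The sketch below assumes the relative importance weight has the form w_alpha(x) = p_te(x) / ((1 - alpha) p_te(x) + alpha p_tr(x)), chosen because it reduces to w_1 at alpha = 1; the footnote itself does not state the general form. The two Gaussian densities are hypothetical and taken as known, whereas in practice the ratio has to be estimated from samples.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1D Gaussian N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def relative_weight(x, alpha, p_te, p_tr):
    """ASSUMED form: p_te(x) / ((1 - alpha) * p_te(x) + alpha * p_tr(x)).

    alpha = 1 recovers the plain ratio w_1; alpha = 0 gives the constant 1.
    For alpha < 1 the denominator is at least (1 - alpha) * p_te(x), so the
    weight can never exceed 1 / (1 - alpha).
    """
    te, tr = p_te(x), p_tr(x)
    return te / ((1.0 - alpha) * te + alpha * tr)


def p_train(x):
    """Hypothetical training marginal: N(0, 1)."""
    return gaussian_pdf(x, 0.0, 1.0)


def p_test(x):
    """Hypothetical test marginal: N(2, 1)."""
    return gaussian_pdf(x, 2.0, 1.0)


grid = [i * 0.5 for i in range(-12, 13)]  # points from -6.0 to 6.0
plain = [relative_weight(x, 1.0, p_test, p_train) for x in grid]
relative = [relative_weight(x, 0.5, p_test, p_train) for x in grid]
print("max plain weight:", max(plain))
print("max relative weight (alpha = 0.5):", max(relative))
```

For these two densities the plain ratio equals exp(2x - 2), which grows without bound as x increases, while the alpha = 0.5 weight stays below 2 everywhere.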
 
3
The covariate shift assumption formally states that the conditional distributions of the output given the input are the same on the source and target domains, while the marginal distributions of the input differ.
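Written out, the assumption and its standard consequence for importance weighting look as follows. The loss \(\ell\) and the predictor \(f\) are generic symbols introduced here for illustration only, and the identity additionally assumes \(p_{\mathrm{tr}}(\boldsymbol{x}) > 0\) wherever \(p_{\mathrm{te}}(\boldsymbol{x}) > 0\).

```latex
% Covariate shift: identical conditionals, different input marginals.
\begin{align*}
p_{\mathrm{tr}}(\boldsymbol{y} \mid \boldsymbol{x}) &= p_{\mathrm{te}}(\boldsymbol{y} \mid \boldsymbol{x}),
&
p_{\mathrm{tr}}(\boldsymbol{x}) &\neq p_{\mathrm{te}}(\boldsymbol{x}).
\end{align*}

% Consequence: the test risk equals the training risk reweighted by w_1.
\begin{align*}
\mathbb{E}_{p_{\mathrm{te}}(\boldsymbol{x}, \boldsymbol{y})}\bigl[\ell(f(\boldsymbol{x}), \boldsymbol{y})\bigr]
&= \iint \ell(f(\boldsymbol{x}), \boldsymbol{y})\,
   p_{\mathrm{te}}(\boldsymbol{y} \mid \boldsymbol{x})\, p_{\mathrm{te}}(\boldsymbol{x})
   \,\mathrm{d}\boldsymbol{y}\,\mathrm{d}\boldsymbol{x} \\
&= \iint \ell(f(\boldsymbol{x}), \boldsymbol{y})\,
   p_{\mathrm{tr}}(\boldsymbol{y} \mid \boldsymbol{x})\,
   \frac{p_{\mathrm{te}}(\boldsymbol{x})}{p_{\mathrm{tr}}(\boldsymbol{x})}\,
   p_{\mathrm{tr}}(\boldsymbol{x})
   \,\mathrm{d}\boldsymbol{y}\,\mathrm{d}\boldsymbol{x} \\
&= \mathbb{E}_{p_{\mathrm{tr}}(\boldsymbol{x}, \boldsymbol{y})}\bigl[w_1(\boldsymbol{x})\,
   \ell(f(\boldsymbol{x}), \boldsymbol{y})\bigr].
\end{align*}
```

The second line uses the equality of the conditionals, which is exactly where the covariate shift assumption enters.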
 
4
While it is possible to set \(\beta > 1\), this gives even higher importance to the target domain samples (while largely ignoring contributions from the source domain samples), which, with few target samples, leads to overfitting.
 
Literature
Agarwal, A., & Triggs, B. (2006). Recovering 3D human pose from monocular images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(1), 44–58.
Ali, S. M., & Silvey, S. D. (1966). A general class of coefficients of divergence of one distribution from another. Journal of the Royal Statistical Society, Series B, 28, 131–142.
Bo, L., & Sminchisescu, C. (2010). Twin Gaussian processes for structured prediction. International Journal of Computer Vision, 87(1–2), 28–52.
Breitenstein, M., Kuettel, D., Weise, T., Van Gool, L., & Pfister, H. (2008). Real-time face pose estimation from single range images. In Proceedings of CVPR (pp. 1–8).
Chen, Y., Kim, T.-K., & Cipolla, R. (2011). Silhouette-based object phenotype recognition using 3D shape priors. In Proceedings of ICCV (pp. 25–32).
Daumé, H. (2007). Frustratingly easy domain adaptation. In Proceedings of ACL (pp. 256–263).
Evgeniou, T., & Pontil, M. (2004). Regularized multi-task learning. In Proceedings of SIGKDD (pp. 109–117).
Fanelli, G., Gall, J., & Van Gool, L. (2011). Real time head pose estimation with random regression forests. In Proceedings of CVPR (pp. 617–624).
Girshick, R., Shotton, J., Kohli, P., Criminisi, A., & Fitzgibbon, A. (2011). Efficient regression of general-activity human poses from depth images. In Proceedings of ICCV (pp. 415–422).
Gong, B., Shi, Y., Sha, F., & Grauman, K. (2012). Geodesic flow kernel for unsupervised domain adaptation. In Proceedings of CVPR (pp. 2066–2073).
Gopalan, R., Li, R., & Chellappa, R. (2011). Domain adaptation for object recognition: An unsupervised approach. In Proceedings of ICCV (pp. 999–1006).
Jiang, J. (2007). A literature survey on domain adaptation of statistical classifiers.
Kanaujia, A., Sminchisescu, C., & Metaxas, D. (2007). Semi-supervised hierarchical models for 3D human pose reconstruction. In Proceedings of CVPR (pp. 1–8).
Khosla, A., Zhou, T., Malisiewicz, T., Efros, A., & Torralba, A. (2012). Undoing the damage of dataset bias. In Proceedings of ECCV (pp. 158–171).
Kulis, B., Saenko, K., & Darrell, T. (2011). What you saw is not what you get: Domain adaptation using asymmetric kernel transforms. In Proceedings of CVPR (pp. 1785–1792).
Lim, J., Salakhutdinov, R., & Torralba, A. (2011). Transfer learning by borrowing examples for multi class object detection. In Proceedings of NIPS (pp. 118–126).
Miller, E., Matsakis, N., & Viola, P. (2000). Learning from one example through shared densities of transforms. In Proceedings of CVPR (pp. 464–471).
Pan, S. J., & Yang, Q. (2010). A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10), 1354–1359.
Saenko, K., Kulis, B., Fritz, M., & Darrell, T. (2010). Adapting visual category models to new domains. In Proceedings of ECCV (pp. 213–226).
Shakhnarovich, G., Viola, P., & Darrell, T. (2003). Fast pose estimation with parameter-sensitive hashing. In Proceedings of ICCV (pp. 750–757).
Shimodaira, H. (2000). Improving predictive inference under covariate shift by weighting the log-likelihood function. Journal of Statistical Planning and Inference, 90(2), 227–244.
Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio, M., Moore, R., Kipman, A., & Blake, A. (2011). Real-time human pose recognition in parts from a single depth image. In Proceedings of CVPR (pp. 1297–1304).
Sigal, L., & Black, M. J. (2006). HumanEva: Synchronized video and motion capture dataset for evaluation of articulated human motion. In TR CS-06-08, Brown University.
Sigal, L., Balan, A., & Black, M. (2007). Combined discriminative and generative articulated pose and non-rigid shape estimation. In Proceedings of NIPS (pp. 1337–1344).
Sminchisescu, C., Kanaujia, A., & Metaxas, D. (2006). Learning joint top-down and bottom-up processes for 3D visual inference. In Proceedings of CVPR (pp. 1743–1752).
Sugiyama, M., Nakajima, S., Kashima, H., Buenau, P. V., & Kawanabe, M. (2008). Direct importance estimation with model selection and its application to covariate shift adaptation. In Proceedings of NIPS (pp. 1433–1440).
Sun, M., Kohli, P., & Shotton, J. (2012). Conditional regression forests for human pose estimation. In Proceedings of CVPR (pp. 3394–3401).
Torralba, A., & Efros, A. (2011). Unbiased look at dataset bias. In Proceedings of CVPR (pp. 1521–1528).
Torralba, A., Murphy, K. P., & Freeman, W. T. (2004). Sharing features: efficient boosting procedures for multi-class object detection. In Proceedings of CVPR (pp. 762–769).
Urtasun, R., & Darrell, T. (2008). Sparse probabilistic regression for activity-independent human pose inference. In Proceedings of CVPR (pp. 1–8).
Weise, T., Leibe, B., & Van Gool, L. (2007). Fast 3D scanning with automatic motion compensation. In Proceedings of CVPR (pp. 1–8).
Yamada, M., Sigal, L., & Raptis, M. (2012). No bias left behind: Covariate shift adaptation for discriminative 3D pose estimation. In Proceedings of ECCV (pp. 674–687).
Yamada, M., Suzuki, T., Kanamori, T., Hachiya, H., & Sugiyama, M. (2013). Relative density-ratio estimation for robust distribution comparison. Neural Computation, 25(5), 1324–1370.
Metadata
Title
Domain Adaptation for Structured Regression
Authors
Makoto Yamada
Leonid Sigal
Yi Chang
Publication date
01-08-2014
Publisher
Springer US
Published in
International Journal of Computer Vision / Issue 1-2/2014
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-013-0689-x
