Published in: International Journal of Computer Vision 3/2015

1 July 2015

Metric Regression Forests for Correspondence Estimation

Authors: Gerard Pons-Moll, Jonathan Taylor, Jamie Shotton, Aaron Hertzmann, Andrew Fitzgibbon

Abstract

We present a new method for inferring dense data-to-model correspondences, focusing on the application of human pose estimation from depth images. Recent work proposed the use of regression forests to quickly predict correspondences between depth pixels and points on a 3D human mesh model. That work, however, used a proxy forest training objective based on the classification of depth pixels to body parts. In contrast, we introduce Metric Space Information Gain (MSIG), a new decision forest training objective designed to directly minimize the entropy of distributions in a metric space. When applied to a model surface, viewed as a metric space defined by geodesic distances, MSIG aims to minimize image-to-model correspondence uncertainty. A naïve implementation of MSIG would scale quadratically with the number of training examples. As this is intractable for large datasets, we propose a method to compute MSIG in linear time. Our method is a principled generalization of the proxy classification objective, and does not require an extrinsic isometric embedding of the model surface in Euclidean space. Our experiments demonstrate that this leads to correspondences that are considerably more accurate than the state of the art, using far fewer training images.
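
To make the quadratic cost mentioned in the abstract concrete, the following is a minimal sketch of an entropy-based split score over a metric space. It is not the authors' formulation: the Gaussian kernel on distances, the `bandwidth` parameter, and the function names `metric_entropy` and `information_gain` are all assumptions chosen for illustration. The only input is a precomputed pairwise distance matrix (for a mesh this could hold geodesic distances), so no Euclidean embedding is needed, and every pair of samples is visited, which is where the quadratic scaling comes from.

```python
import math
from typing import Sequence


def metric_entropy(dist: Sequence[Sequence[float]],
                   idx: Sequence[int],
                   bandwidth: float) -> float:
    """Kernel-density entropy estimate for the samples listed in `idx`.

    dist[i][j] is a precomputed metric distance between samples i and j.
    Assumption: a Gaussian kernel on that distance (hypothetical choice).
    The kernel's normalizing constant is omitted because it cancels when
    entropies are subtracted in `information_gain`.
    Cost is O(len(idx)**2).
    """
    n = len(idx)
    if n == 0:
        return 0.0
    total = 0.0
    for i in idx:
        # Density at sample i; the self term keeps it strictly positive.
        density = sum(
            math.exp(-0.5 * (dist[i][j] / bandwidth) ** 2) for j in idx
        ) / n
        total += math.log(density)
    return -total / n


def information_gain(dist: Sequence[Sequence[float]],
                     parent: Sequence[int],
                     left: Sequence[int],
                     right: Sequence[int],
                     bandwidth: float) -> float:
    """Parent entropy minus the size-weighted entropy of the two children."""
    n = len(parent)
    h_children = (
        len(left) / n * metric_entropy(dist, left, bandwidth)
        + len(right) / n * metric_entropy(dist, right, bandwidth)
    )
    return metric_entropy(dist, parent, bandwidth) - h_children


# Toy metric space: four points on a line forming two tight clusters.
points = [0.0, 0.1, 5.0, 5.1]
dist = [[abs(a - b) for b in points] for a in points]

good = information_gain(dist, [0, 1, 2, 3], [0, 1], [2, 3], bandwidth=1.0)
bad = information_gain(dist, [0, 1, 2, 3], [0, 2], [1, 3], bandwidth=1.0)
print(good > bad)  # the split that separates the clusters scores higher
```

In this toy example the split that keeps nearby points together yields a gain close to log 2, while the split that mixes the clusters yields a gain near zero. How the paper reduces this computation to linear time is not described in the abstract, so it is not sketched here.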

Footnotes
1
Note that this is an extended version of Pons-Moll et al. (2013). Some portions of Taylor et al. (2012) have been included for clarity.
 
2
Distinct subscripts indicate whether \(p\) and \(l\) refer to vertices or spheres.
 
References
Baak, A., Müller, M., Bharaj, G., Seidel, H., & Theobalt, C. (2011). A data-driven approach for real-time full body pose reconstruction from a depth camera. In: IEEE international conference on computer vision, pp. 1092–1099.
Balan, A., Sigal, L., Black, M., Davis, J., & Haussecker, H. (2007). Detailed human shape and pose from images. In: IEEE conference on computer vision and pattern recognition.
Bentley, J. (1975). Multidimensional binary search trees used for associative searching. Communications of the ACM, 18(9), 509–517.
Besl, P., & McKay, N. (1992). A method for registration of 3d shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2), 239–256.
Black, M., & Rangarajan, A. (1996). On the unification of line processes, outlier rejection, and robust statistics with applications in early vision. International Journal of Computer Vision, 19(1), 57–91.
Bo, L., & Sminchisescu, C. (2010). Twin Gaussian processes for structured prediction. International Journal of Computer Vision, 87, 28–52.
Bregler, C., Malik, J., & Pullen, K. (2004). Twist based acquisition and tracking of animal and human kinematics. International Journal of Computer Vision, 56(3), 179–194.
Breiman, L. (1999). Random forests. Berkeley: UC. (Technical Report TR567).
Brubaker, M., Fleet, D., & Hertzmann, A. (2010). Physics-based person tracking using the anthropomorphic walker. In: International Journal of Computer Vision.
Buntine, W., & Niblett, T. (1992). A further comparison of splitting rules for decision-tree induction. Machine Learning, 8(1), 75–85.
Criminisi, A., & Shotton, J. (2013). Decision forests for computer vision and medical image analysis. London: Springer.
Deutscher, J., & Reid, I. (2005). Articulated body motion capture by stochastic search. International Journal of Computer Vision, 61(2), 185–205.
Gall, J., Rosenhahn, B., Brox, T., & Seidel, H. P. (2010). Optimization and filtering for human motion capture. International Journal of Computer Vision, 87, 75–92.
Gall, J., Yao, A., Razavi, N., Van Gool, L., & Lempitsky, V. (2011). Hough forests for object detection, tracking, and action recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(11), 2188–2202.
Ganapathi, V., Plagemann, C., Koller, D., & Thrun, S. (2012). Real-time human pose tracking from range data. In: European conference on computer vision.
Ganapathi, V., Plagemann, C., Thrun, S., & Koller, D. (2010). Real time motion capture using a time-of-flight camera. In: Conference in computer vision and pattern recognition.
Girshick, R., Shotton, J., Kohli, P., Criminisi, A., & Fitzgibbon, A. (2011). Efficient regression of general-activity human poses from depth images. In: IEEE international conference on computer vision, pp. 415–422.
Kabsch, W. (1976). A solution for the best rotation to relate two sets of vectors. Acta Crystallographica, 32(5), 922–923.
Lee, C., & Elgammal, A. (2010). Coupled visual and kinematic manifold models for tracking. International Journal of Computer Vision, 87, 118–139.
Liu, W., & White, A. (1994). The importance of attribute selection measures in decision tree induction. Machine Learning, 15(1), 25–41.
Memisevic, R., Sigal, L., & Fleet, D. J. (2012). Shared kernel information embedding for discriminative inference. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(4), 778–790.
Nowozin, S. (2012). Improved information gain estimates for decision tree induction. In: ICML.
Parzen, E. (1962). On estimation of a probability density function and mode. The Annals of Mathematical Statistics, 33(3), 1065–1076.
Pons-Moll, G., Baak, A., Gall, J., Leal-Taixe, L., Mueller, M., Seidel, H., & Rosenhahn, B. (2011). Outdoor human motion capture using inverse kinematics and von Mises-Fisher sampling. In: International conference on computer vision.
Pons-Moll, G., Leal-Taixé, L., Truong, T., & Rosenhahn, B. (2011). Efficient and robust shape matching for model based human motion capture. In: DAGM.
Pons-Moll, G., & Rosenhahn, B. (2011). Model-based pose estimation. In Visual analysis of humans (pp. 139–170). London: Springer.
Pons-Moll, G., Taylor, J., Shotton, J., Hertzmann, A., & Fitzgibbon, A. (2013). Metric regression forests for human pose estimation. In: British machine vision conference (BMVC).
Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio, M., Moore, R., Kipman, A., & Blake, A. (2011). Real-time human pose recognition in parts from single depth images. In: IEEE conference in computer vision and pattern recognition, pp. 1297–1304.
Shotton, J., Glocker, B., Zach, C., Izadi, S., Criminisi, A., & Fitzgibbon, A. (2013). Scene coordinate regression forests for camera relocalization in RGB-D images. In: Conference in computer vision and pattern recognition.
Silverman, B. (1986). Density estimation for statistics and data analysis (Vol. 26). London: CRC press.
Sminchisescu, C., Bo, L., Ionescu, C., & Kanaujia, A. (2011). Feature-based pose estimation. In Visual analysis of humans (pp. 225–251). London: Springer.
Stoll, C., Hasler, N., Gall, J., Seidel, H., & Theobalt, C. (2011). Fast articulated motion tracking using a sums of Gaussians body model. In: IEEE international conference on computer vision, pp. 951–958.
Taylor, J., Shotton, J., Sharp, T., & Fitzgibbon, A. (2012). The Vitruvian manifold: Inferring dense correspondences for one-shot human pose estimation. In: Conference in computer vision and pattern recognition.
Urtasun, R., & Darrell, T. (2008). Sparse probabilistic regression for activity-independent human pose inference. In: IEEE conference in computer vision and pattern recognition, pp. 1–8.
Metadata
Title
Metric Regression Forests for Correspondence Estimation
Authors
Gerard Pons-Moll
Jonathan Taylor
Jamie Shotton
Aaron Hertzmann
Andrew Fitzgibbon
Publication date
1 July 2015
Publisher
Springer US
Published in
International Journal of Computer Vision / Issue 3/2015
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-015-0818-9
