International Journal of Computer Vision, 01-01-2015

An Elastic Deformation Field Model for Object Detection and Tracking

Authors: Marco Pedersoli, Radu Timofte, Tinne Tuytelaars, Luc Van Gool

Published in: International Journal of Computer Vision | Issue 2/2015

Abstract

Deformable Parts Models (DPM) are the current state of the art for object detection. Nevertheless, they seem sub-optimal in their representation of deformations: object deformations are often continuous and not confined to big parts. We therefore propose to replace the DPM star model, which is based on big parts, with a deformation field. This consists of a grid of small parts connected by pairwise constraints, which can better handle continuous deformations. A naive application of this model to object detection would be a bounded sliding-window approach: for each possible location in the image, the best part configuration within a limited bound around that location is found. This is computationally very expensive. Instead, we propose a different inference procedure, in which an iterative image-level search finds the best object hypothesis. We show that this approach is faster than bounded sliding windows yet produces comparable accuracy. Experiments further show that the deformation field can better approximate real object deformations and therefore, for certain classes, yields even better detection accuracy than the state-of-the-art DPM. Finally, the same approach is adapted to model-free tracking, where it also shows improved accuracy.
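To make the idea of a grid of small parts connected by pairwise constraints more concrete, the following is a minimal sketch of how such a configuration could be scored. It is not the authors' implementation: the shift set `SHIFTS`, the quadratic penalty, the weight `lam` and the function names are all assumptions chosen for illustration, and the exhaustive search stands in for the paper's inference procedure only because the toy grid is tiny.

```python
import itertools
import numpy as np

# Hypothetical label set: each small part may shift by -1, 0 or +1 cells in y and x.
SHIFTS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def field_score(unary, labels, lam=1.0):
    """Score of one part configuration on an H x W grid of parts.

    unary[i, j, k] : appearance score of part (i, j) under shift SHIFTS[k]
    labels[i, j]   : index into SHIFTS chosen for part (i, j)
    lam            : weight of the pairwise penalty (an assumed parameter)
    """
    H, W = labels.shape
    rows, cols = np.indices((H, W))
    score = unary[rows, cols, labels].sum()
    d = np.array(SHIFTS)[labels]  # (H, W, 2) displacement field
    # Assumed quadratic penalty on displacement differences of 4-connected neighbours.
    score -= lam * ((d[1:, :] - d[:-1, :]) ** 2).sum()
    score -= lam * ((d[:, 1:] - d[:, :-1]) ** 2).sum()
    return float(score)

def best_configuration(unary, lam=1.0):
    """Exhaustive search over all labelings; only feasible for a tiny grid."""
    H, W, K = unary.shape
    best_score, best_labels = -np.inf, None
    for flat in itertools.product(range(K), repeat=H * W):
        labels = np.array(flat).reshape(H, W)
        s = field_score(unary, labels, lam)
        if s > best_score:
            best_score, best_labels = s, labels
    return best_score, best_labels
```

The number of configurations grows exponentially with the number of parts, so exhaustive search stops being practical beyond a handful of parts, which is why an approximate solver is needed for realistic grids.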

In this work when referring to DPM we consider the specific implementation of Felzenszwalb et al. (2010).

Typically, the solution found by alpha expansion is not exact. However, we found experimentally that the algorithm still works well, which indirectly indicates that the solution found is generally very close to the exact one. See Fig. 4 in the experimental results.

\(S(\mathbf {l},\mathbf {x},\mathbf {w})\) is the maximization defined in Eq. (9), where we make explicit the dependency on the image \(\mathbf {x}\) and \(\mathbf {w}\).

Alahari, K., Kohli, P., & Torr, P. H. S. (2010). Dynamic hybrid algorithms for MAP inference in discrete MRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32, 1846–1857.

Andriluka, M., Roth, S., & Schiele, B. (2009). Pictorial structures revisited: People detection and articulated pose estimation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1014–1021).

Babenko, B., Yang, M. H., & Belongie, S. (2011). Robust object tracking with online multiple instance learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(8), 1619–1632.

Batra, D., Yadollahpour, P., Guzman, A., & Shakhnarovich, G. (2012). Diverse m-best solutions in Markov random fields. In Proceedings of the European conference on computer vision.

Bergtholdt, M., Kappes, J., Schmidt, S., & Schnörr, C. (2010). A study of parts-based object class detection using complete graphs. International Journal of Computer Vision, 87(1–2), 93–117.

Bourdev, L. D., Maji, S., Brox, T., & Malik, J. (2010). Detecting people using mutually consistent poselet activations. In Proceedings of the European conference on computer vision (pp. 168–181).

Boykov, Y., Veksler, O., & Zabih, R. (1998). Markov random fields with efficient approximations. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 648–656).

Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11), 1222–1239.

Crandall, D., Felzenszwalb, P., & Huttenlocher, D. (2005). Spatial priors for part-based recognition using statistical models. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 10–17).

Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 886–893).

Desai, C., Ramanan, D., & Fowlkes, C. (2009). Discriminative models for multi-class layout. In Proceedings of the IEEE international conference on computer vision.

Duchenne, O., Joulin, A., & Ponce, J. (2011). A graph-matching kernel for object categorization. p. 1056. Barcelona, Spain.

Everingham, M., Zisserman, A., Williams, C., & Van Gool, L. (2007). The PASCAL visual object classes challenge 2007 (VOC2007) results.

Felzenszwalb, P. F., Girshick, R., & McAllester, D. (2010). Cascade object detection with deformable part models. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2241–2248).

Felzenszwalb, P. F., Girshick, R. B., McAllester, D. A., & Ramanan, D. (2010). Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(9), 1627–1645.

Felzenszwalb, P. F., & Huttenlocher, D. P. (2004). Distance transforms of sampled functions. Technical report.

Felzenszwalb, P. F., McAllester, D., & Ramanan, D. (2008). A discriminatively trained, multiscale, deformable part model. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1–8).

Fergus, R., Perona, P., & Zisserman, A. (2003). Object class recognition by unsupervised scale-invariant learning. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 264–271).

Glocker, B., Komodakis, N., Tziritas, G., Navab, N., & Paragios, N. (2008). Dense image registration through MRFs and efficient linear programming. Medical Image Analysis, 12(6), 731–741.

Hoiem, D., Rother, C., & Winn, J. M. (2008). 3D layout CRF for multi-view object class recognition and segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition.

Horn, B. K. P., & Schunck, B. G. (1981). Determining optical flow. Artificial Intelligence, 17, 185–203.

Kalal, Z., Matas, J., & Mikolajczyk, K. (2010). P-N learning: Bootstrapping binary classifiers by structural constraints. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 49–56).

Kapoor, A., & Winn, J. (2006). Located hidden random fields: Learning discriminative parts for object detection. In Proceedings of the European conference on computer vision (pp. 302–315).

Kohli, P., & Torr, P. H. S. (2007). Dynamic graph cuts for efficient inference in Markov random fields. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(12), 2079–2088.

Komodakis, N., Tziritas, G., & Paragios, N. (2008). Performance vs computational efficiency for optimizing single and dynamic MRFs: Setting the state of the art with primal-dual strategies. Computer Vision and Image Understanding, 112(1), 14–29.

Lades, M., Vorbruggen, J. C., Buhmann, J., Lange, J., von der Malsburg, C., Wurtz, R. P., et al. (1993). Distortion invariant object recognition in the dynamic link architecture. IEEE Transactions on Computers, 42(3), 300–311.

Ladicky, L., Sturgess, P., Alahari, K., Russell, C., & Torr, P. (2010). Where, what and how many? Combining object detectors and CRFs. In Proceedings of the European conference on computer vision (pp. 424–437).

Ladicky, L., Torr, P. H. S., & Zisserman, A. (2012). Latent SVMs for human detection with a locally affine deformation field. In Proceedings of the British machine vision conference (pp. 10.1–10.11).

Lucas, B. D., & Kanade, T. (1981). An iterative image registration technique with an application to stereo vision. In Proceedings of the international joint conference on artificial intelligence (pp. 674–679).

Pedersoli, M., Vedaldi, A., & Gonzàlez, J. (2011). A coarse-to-fine approach for fast deformable object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1353–1360).

Quattoni, A., Collins, M., & Darrell, T. (2004). Conditional random fields for object recognition. In L. K. Saul, Y. Weiss & L. Bottou (Eds.), Advances in neural information processing systems (pp. 1097–1104). MIT Press.

Shalev-Shwartz, S., Singer, Y., Srebro, N., & Cotter, A. (2011). Pegasos: Primal estimated sub-gradient solver for SVM. Mathematical Programming, 127(1), 3–30.

Vedaldi, A., & Zisserman, A. (2009). Structured output regression for detection with partial occlusion. In Y. Bengio, D. Schuurmans, J. D. Lafferty, C. K. I. Williams & A. Culotta (Eds.), Advances in neural information processing systems (pp. 1928–1936).

Vedaldi, A., & Zisserman, A. (2012). Sparse kernel approximations for efficient classification and detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2320–2327).

Wang, Y., Tran, D., & Liao, Z. (2011). Learning hierarchical poselets for human parsing. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1705–1712).

Yang, Y., & Ramanan, D. (2012). Articulated human detection with flexible mixtures-of-parts. IEEE Transactions on Pattern Analysis and Machine Intelligence 99(PrePrints), 1.

Yuille, A., Rangarajan, A., & Yuille, A. L. (2002). The concave-convex procedure (CCCP). In Advances in neural information processing systems (pp. 1033–1040).

Zhang, L., & van der Maaten, L. (2013). Structure preserving object tracking. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1–8).

Zhu, L., Chen, Y., Yuille, A., & Freeman, W. (2010). Latent hierarchical structural learning for object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1–8).

Title: An Elastic Deformation Field Model for Object Detection and Tracking
Authors: Marco Pedersoli, Radu Timofte, Tinne Tuytelaars, Luc Van Gool
Publication date: 01-01-2015
Publisher: Springer US
Published in: International Journal of Computer Vision / Issue 2/2015
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-014-0736-2
