
26.12.2016

Applying Detection Proposals to Visual Tracking for Scale and Aspect Ratio Adaptability

Authors: Dafei Huang, Lei Luo, Zhaoyun Chen, Mei Wen, Chunyuan Zhang

Published in: International Journal of Computer Vision | Issue 3/2017

Abstract

Recently proposed correlation filter based trackers achieve appealing performance despite their great simplicity and superior speed. However, this kind of tracker does not inherently adapt to scale and aspect ratio changes, which results in suboptimal tracking accuracy. To tackle this problem, this paper integrates a class-agnostic detection proposal method, widely adopted in object detection, into a correlation filter tracker. On the tracker side, optimizations such as feature integration, robust model updating and proposal rejection are applied for efficient integration. On the proposal generation side, by integrating and comparing four detection proposal generators along with two baseline methods, we find that the quality of detection proposals has considerable influence on tracking accuracy. Therefore, EdgeBoxes, the most promising proposal generator, is chosen and further enhanced with background suppression. Evaluations are mainly performed on a challenging 50-sequence dataset (OTB50) and two of its subsets: 28 sequences with significant scale variation and 14 sequences with obvious aspect ratio change. Among trackers equipped with different proposal generators, state-of-the-art trackers and existing correlation filter variants, our proposed tracker reports the highest accuracy while running efficiently at an average speed of 20.4 frames per second. Additionally, a per-sequence numerical performance analysis and experimental results on the VOT2014 dataset are presented to enable deeper insights into our approach.
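The pipeline summarized in the abstract can be illustrated with a short, hedged sketch. This is not the authors' implementation: the correlation filter callbacks (cf_respond, cf_update), the proposal generator gen_proposals (standing in for EdgeBoxes with background suppression), the re-scoring function score_box and all threshold values are hypothetical placeholders, and the rejection and confidence-gated update rules only approximate the "proposal rejection" and "robust model updating" steps mentioned above.

```python
# Minimal sketch (not the authors' code) of combining a correlation filter
# with class-agnostic detection proposals for scale/aspect-ratio adaptation.
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

import numpy as np

Box = Tuple[float, float, float, float]  # (x, y, width, height)


@dataclass
class ProposalTrackerConfig:
    max_center_shift: float = 0.3   # reject proposals whose centre drifts too far
    max_size_change: float = 1.5    # reject proposals that resize implausibly
    learning_rate: float = 0.02     # conservative appearance-model update rate
    update_threshold: float = 0.2   # only update when the filter is confident


def track_frame(
    frame: np.ndarray,
    prev_box: Box,
    cf_respond: Callable[[np.ndarray, Box], Tuple[Box, float]],
    cf_update: Callable[[np.ndarray, Box, float], None],
    gen_proposals: Callable[[np.ndarray, Box], Sequence[Box]],
    score_box: Callable[[np.ndarray, Box], float],
    cfg: Optional[ProposalTrackerConfig] = None,
) -> Box:
    """One tracking step: translation from the correlation filter, then scale
    and aspect-ratio refinement from the best surviving detection proposal."""
    cfg = cfg or ProposalTrackerConfig()

    # 1. The correlation filter estimates the new centre at the old size and
    #    returns a confidence value for its response peak.
    cf_box, cf_conf = cf_respond(frame, prev_box)

    # 2. A class-agnostic generator (e.g. EdgeBoxes with background
    #    suppression) proposes candidate boxes around the filter's estimate.
    candidates: List[Box] = [cf_box]
    for box in gen_proposals(frame, cf_box):
        # 3. Proposal rejection: drop boxes that drift or resize too much.
        if _center_shift(box, cf_box) > cfg.max_center_shift:
            continue
        if _size_ratio(box, prev_box) > cfg.max_size_change:
            continue
        candidates.append(box)

    # 4. Re-score the survivors with the tracker's appearance model; the best
    #    candidate may change both scale and aspect ratio freely.
    best = max(candidates, key=lambda b: score_box(frame, b))

    # 5. Robust model updating: adapt the filter only when it is confident.
    if cf_conf > cfg.update_threshold:
        cf_update(frame, best, cfg.learning_rate)
    return best


def _center_shift(a: Box, b: Box) -> float:
    """Centre distance between two boxes, normalised by the reference diagonal."""
    ax, ay = a[0] + a[2] / 2.0, a[1] + a[3] / 2.0
    bx, by = b[0] + b[2] / 2.0, b[1] + b[3] / 2.0
    return float(np.hypot(ax - bx, ay - by) / np.hypot(b[2], b[3]))


def _size_ratio(a: Box, b: Box) -> float:
    """Largest relative change of width or height between two boxes."""
    rw, rh = a[2] / b[2], a[3] / b[3]
    return max(rw, 1.0 / rw, rh, 1.0 / rh)
```

Passing the filter and the proposal generator in as callables keeps the sketch self-contained while leaving the actual components, which the paper specifies in detail, unspecified here.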

Footnotes
1
We notice that in some sequences the tracking bounding box of STC shrinks to an extremely small size, resulting in even faster speed but unreliable results.
 
2
Here “\(x\sim y\)” means that the variation is examined between each frame’s x-th and y-th previous frame.
 
3
Here “\(x\sim y\)” means that the relative variation exceeds \(1/x\) (or \(x\)) but still remains within \(1/y\) (or \(y\)).
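For concreteness, one possible formalization of this bucketing (an illustration, not taken from the paper) is: with \(r\) denoting the relative variation between the two frames being compared, a frame pair counts toward the bucket labeled \(x\sim y\) when
\[
r \notin \left[\tfrac{1}{x},\, x\right] \quad\text{and}\quad r \in \left[\tfrac{1}{y},\, y\right].
\]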
 
References
Alexe, B., Deselaers, T., & Ferrari, V. (2012). Measuring the objectness of image windows. TPAMI, 34(11), 2189–2202.
Arbelaez, P., Pont-Tuset, J., Barron, J., Marqués, F., & Malik, J. (2014). Multiscale combinatorial grouping. In CVPR (pp. 328–335).
Belagiannis, V., Schubert, F., Navab, N., & Ilic, S. (2012). Segmentation based particle filtering for real-time 2D object tracking. In ECCV (pp. 842–855).
Bolme, D. S., Beveridge, J. R., Draper, B. A., & Lui, Y. M. (2010). Visual object tracking using adaptive correlation filters. In CVPR (pp. 2544–2550).
Cai, Z., Wen, L., Yang, J., Lei, Z., & Li, S. (2012). Structured visual tracking with dynamic graph. In ACCV (pp. 86–97).
Carreira, J., & Sminchisescu, C. (2012). CPMC: Automatic object segmentation using constrained parametric min-cuts. TPAMI, 34(7), 1312–1328.
Cheng, M. M., Zhang, Z., Lin, W. Y., & Torr, P. H. S. (2014). BING: Binarized normed gradients for objectness estimation at 300fps. In CVPR (pp. 3286–3293).
Comaniciu, D., Ramesh, V., & Meer, P. (2003). Kernel-based object tracking. TPAMI, 25(5), 564–577.
Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In CVPR (pp. 886–893).
Danelljan, M., Häger, G., Khan, F. S., & Felsberg, M. (2014a). Accurate scale estimation for robust visual tracking. In BMVC.
Danelljan, M., Shahbaz Khan, F., Felsberg, M., & Van de Weijer, J. (2014b). Adaptive color attributes for real-time visual tracking. In CVPR (pp. 1090–1097).
Dollár, P., & Zitnick, C. L. (2013). Structured forests for fast edge detection. In ICCV (pp. 1841–1848).
Duffner, S., & Garcia, C. (2013). PixelTrack: A fast adaptive algorithm for tracking non-rigid objects. In ICCV (pp. 2480–2487).
Everingham, M., Eslami, S. M. A., Van Gool, L., Williams, C. K. I., Winn, J., & Zisserman, A. (2015). The pascal visual object classes challenge: A retrospective. IJCV, 111(1), 98–136.
Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. In CVPR (pp. 580–587).
Godec, M., Roth, P. M., & Bischof, H. (2011). Hough-based tracking of non-rigid objects. In ICCV (pp. 81–88).
Hare, S., Saffari, A., & Torr, P. H. S. (2011). Struck: Structured output tracking with kernels. In ICCV (pp. 263–270).
He, K., Zhang, X., Ren, S., & Sun, J. (2014). Spatial pyramid pooling in deep convolutional networks for visual recognition. In ECCV (pp. 346–361).
Henriques, J. F., Caseiro, R., Martins, P., & Batista, J. (2012). Exploiting the circulant structure of tracking-by-detection with kernels. In ECCV (pp. 702–715).
Hosang, J., Benenson, R., & Schiele, B. (2014). How good are detection proposals, really? In BMVC.
Hua, Y., Alahari, K., & Schmid, C. (2015). Online object tracking with proposal selection. In ICCV (pp. 3092–3100).
Huang, D., Luo, L., Wen, M., Chen, Z., & Zhang, C. (2015). Enable scale and aspect ratio adaptability in visual tracking with detection proposals. In BMVC.
Jia, X., Lu, H., & Yang, M. H. (2012). Visual tracking via adaptive structural local sparse appearance model. In CVPR (pp. 1822–1829).
Kalal, Z., Matas, J., & Mikolajczyk, K. (2010). P-N learning: Bootstrapping binary classifiers by structural constraints. In CVPR (pp. 49–56).
Krähenbühl, P., & Koltun, V. (2014). Geodesic object proposals. In ECCV (pp. 725–739).
Kristan, M., Pflugfelder, R., Leonardis, A., et al. (2013). The visual object tracking VOT2013 challenge results. In ICCV workshop (pp. 98–111).
Kwon, J., & Lee, K. M. (2010). Visual tracking decomposition. In CVPR (pp. 1269–1276).
Li, Y., & Zhu, J. (2014). A scale adaptive kernel correlation filter tracker with feature integration. In ECCV workshop (pp. 254–265).
Liang, P., Pang, Y., Liao, C., Mei, X., & Ling, H. (2016). Adaptive objectness for object tracking. IEEE Signal Processing Letters, 23(7), 949–953.
Liu, B., Huang, J., Yang, L., & Kulikowsk, C. (2011). Robust tracking using local sparse appearance model and k-selection. In CVPR (pp. 1313–1320).
Liu, T., Wang, G., & Yang, Q. (2015). Real-time part-based visual tracking via adaptive correlation filters. In CVPR (pp. 4902–4912).
Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks. In NIPS (pp. 91–99).
Uijlings, J. R. R., Van de Sande, K. E. A., Gevers, T., & Smeulders, A. W. M. (2013). Selective search for object recognition. IJCV, 104(2), 154–171.
Van de Weijer, J., Schmid, C., Verbeek, J., & Larlus, D. (2009). Learning color names for real-world applications. TIP, 18(7), 1512–1523.
Wang, A., Wan, G., Cheng, Z., & Li, S. (2009). An incremental extremely random forest classifier for online learning and tracking. In ICIP (pp. 1449–1452).
Wang, A., Cheng, Z., Martin, R. R., & Li, S. (2013). Multiple-cue-based visual object contour tracking with incremental learning. LNCS, 7544, 225–243.
Wen, L., Du, D., Lei, Z., Li, S. Z., & Yang, M. H. (2015). JOTS: Joint online tracking and segmentation. In CVPR (pp. 2226–2234).
Wu, Y., Lim, J., & Yang, M. H. (2013). Online object tracking: A benchmark. In CVPR (pp. 2411–2418).
Zhang, K., Zhang, L., Zhang, D., & Yang, M. H. (2014). Fast visual tracking via dense spatio-temporal context learning. In ECCV (pp. 127–141).
Zhong, W., Lu, H., & Yang, M. H. (2012). Robust object tracking via sparsity-based collaborative model. In CVPR (pp. 1838–1845).
Zhu, G., Porikli, F., & Li, H. (2016a). Beyond local search: Tracking objects everywhere with instance-specific proposals. In CVPR (pp. 943–951).
Zhu, G., Porikli, F., & Li, H. (2016b). Robust visual tracking with deep convolutional neural network based object proposals on PETS. In CVPR workshop (pp. 26–33).
Zhu, G., Wang, J., Wu, Y., Zhang, X., & Lu, H. (2016c). MC-HOG correlation tracking with saliency proposal. In AAAI (pp. 3690–3696).
Zitnick, C. L., & Dollár, P. (2014). Edge Boxes: Locating object proposals from edges. In ECCV (pp. 391–405).
Metadata
Title
Applying Detection Proposals to Visual Tracking for Scale and Aspect Ratio Adaptability
Authors
Dafei Huang
Lei Luo
Zhaoyun Chen
Mei Wen
Chunyuan Zhang
Publication date
26.12.2016
Publisher
Springer US
Published in
International Journal of Computer Vision / Issue 3/2017
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-016-0974-6
