Skip to main content
Erschienen in: Machine Vision and Applications 1-2/2017

23.07.2016 | Original Paper

Directional coherence-based spatiotemporal descriptor for object detection in static and dynamic scenes

verfasst von: Wonjun Kim, Jae-Joon Han

Erschienen in: Machine Vision and Applications | Ausgabe 1-2/2017

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

This paper presents a simple, yet powerful local descriptor, so-called the histograms of space–time dominant orientations (HiSTDO). Specifically, our HiSTDO is composed of two main components, i.e., the dominant orientation and its coherence, which represents how intensively gradients in the local region are distributed along the space–time dominant orientation. By incorporating them into the histogram, we define it as our HiSTDO descriptor. In contrast to previous methods vulnerable to the presence of the background clutter and the camera noise, our HiSTDO greatly encodes the space–time shape of underlying structures even under such challenging conditions, and it can thus be efficiently applied to various applications (e.g., object and action detection). Experimental results on diverse datasets demonstrate that the proposed descriptor is effective for human action as well as object detection.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision 60(2), 91–110 (2004)CrossRef Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision 60(2), 91–110 (2004)CrossRef
2.
Zurück zum Zitat Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Proceedings IEEE computer vision and pattern recognition (CVPR), pp. 886–893 (2005) Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Proceedings IEEE computer vision and pattern recognition (CVPR), pp. 886–893 (2005)
3.
Zurück zum Zitat Felzenszwalb, P., Girshick, R., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627–1645 (2010)CrossRef Felzenszwalb, P., Girshick, R., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627–1645 (2010)CrossRef
4.
Zurück zum Zitat Zeng, C., Ma, H., Ming, A.: Fast human detection using mi-SVM and a cascade of HOG-LBP features. In: Proceedings IEEE international conference on image processing (ICIP), pp. 3845–3848 (2010) Zeng, C., Ma, H., Ming, A.: Fast human detection using mi-SVM and a cascade of HOG-LBP features. In: Proceedings IEEE international conference on image processing (ICIP), pp. 3845–3848 (2010)
5.
Zurück zum Zitat Zhang, J., Huang, K., Yu, Y., Tan, T.: Boosted local structured HOG-LBP for object localization. In: Proceedings IEEE computer vision and pattern recognition (CVPR), pp. 1393–1400 (2011) Zhang, J., Huang, K., Yu, Y., Tan, T.: Boosted local structured HOG-LBP for object localization. In: Proceedings IEEE computer vision and pattern recognition (CVPR), pp. 1393–1400 (2011)
6.
Zurück zum Zitat Jia, W., Hu, R.-X., Lei, Y.-K., Zhao, Y., Gui, J.: Histogram of oriented lines for palmprint recognition. IEEE Trans. Syst. Man Cybern.: Syst. 44(3), 385–395 (2014)CrossRef Jia, W., Hu, R.-X., Lei, Y.-K., Zhao, Y., Gui, J.: Histogram of oriented lines for palmprint recognition. IEEE Trans. Syst. Man Cybern.: Syst. 44(3), 385–395 (2014)CrossRef
7.
Zurück zum Zitat Pang, Y., Zhang, K., Yuan, Y., Wang, K.: Distributed object detection with linear SVMs. IEEE Trans. Cybern. 44(11), 2122–2133 (2014)CrossRef Pang, Y., Zhang, K., Yuan, Y., Wang, K.: Distributed object detection with linear SVMs. IEEE Trans. Cybern. 44(11), 2122–2133 (2014)CrossRef
8.
Zurück zum Zitat Scovanner, P., Ali, S., Shah, M.: A 3-dimensional SIFT descriptor and its application to action recognition. In: Proceedings ACM international conference on multimedia, pp. 357–360 (2007) Scovanner, P., Ali, S., Shah, M.: A 3-dimensional SIFT descriptor and its application to action recognition. In: Proceedings ACM international conference on multimedia, pp. 357–360 (2007)
9.
Zurück zum Zitat Klaser, A., Marszalek, M., Schmid, C.: A spatio-temporal descriptor based on 3d-gradients. In: Proceedings British machine vision conference (BMVC), pp. 995–1004 (2008) Klaser, A., Marszalek, M., Schmid, C.: A spatio-temporal descriptor based on 3d-gradients. In: Proceedings British machine vision conference (BMVC), pp. 995–1004 (2008)
10.
Zurück zum Zitat Dalal, N., Triggs, B., Schmid, C.: Human detection using oriented histograms of optical flow and appreance. In: Proceedings European conference on computer vision (ECCV), pp. 428–441 (2006) Dalal, N., Triggs, B., Schmid, C.: Human detection using oriented histograms of optical flow and appreance. In: Proceedings European conference on computer vision (ECCV), pp. 428–441 (2006)
11.
Zurück zum Zitat Baker, S., Roth, S., Scharstein, D., Black, M. J., Lewis, J., Szeliski, R.: A database and evaluation methodology for optical flow. In: Proceedings IEEE international conference on computer vision (ICCV), pp. 1–8 (2007) Baker, S., Roth, S., Scharstein, D., Black, M. J., Lewis, J., Szeliski, R.: A database and evaluation methodology for optical flow. In: Proceedings IEEE international conference on computer vision (ICCV), pp. 1–8 (2007)
12.
Zurück zum Zitat Kim, W., Yoo, B., Han, J-J.: HDO : a novel local image descriptor. In: Proceedings IEEE international conference on image processing (ICIP), pp. 5671–5675 (2014) Kim, W., Yoo, B., Han, J-J.: HDO : a novel local image descriptor. In: Proceedings IEEE international conference on image processing (ICIP), pp. 5671–5675 (2014)
13.
Zurück zum Zitat Kim, W., Kim, C.: Spatiotemporal saliency detection using textural contrast and its applications. IEEE Trans. Circuits Syst. Video Technol. 24(4), 646–659 (2014)CrossRef Kim, W., Kim, C.: Spatiotemporal saliency detection using textural contrast and its applications. IEEE Trans. Circuits Syst. Video Technol. 24(4), 646–659 (2014)CrossRef
14.
Zurück zum Zitat Rowley, H., Baluja, S., Kanade, T.: Neural network-based face detection. IEEE Trans. Pattern Anal. Mach. Intell. 20(1), 22–38 (1998)CrossRef Rowley, H., Baluja, S., Kanade, T.: Neural network-based face detection. IEEE Trans. Pattern Anal. Mach. Intell. 20(1), 22–38 (1998)CrossRef
15.
Zurück zum Zitat Devernay, F.: A non-maxima suppression method for edge detection with sub-pixel accuracy. Technical report, INRIA, no. RR-2724 (1995) Devernay, F.: A non-maxima suppression method for edge detection with sub-pixel accuracy. Technical report, INRIA, no. RR-2724 (1995)
16.
Zurück zum Zitat Agarwal, S., Awan, A., Roth, D.: Learning to detect objects in images via a sparse, part-based representation. IEEE Trans. Pattern Anal. Mach. Intell. 26(11), 1475–1490 (2004)CrossRef Agarwal, S., Awan, A., Roth, D.: Learning to detect objects in images via a sparse, part-based representation. IEEE Trans. Pattern Anal. Mach. Intell. 26(11), 1475–1490 (2004)CrossRef
17.
Zurück zum Zitat Bourdev, L., Brandt, J.: Robust object detection via soft cascade. Proc. IEEE Comput. Vis. Pattern Recognit. (CVPR) 2, 236–243 (2005) Bourdev, L., Brandt, J.: Robust object detection via soft cascade. Proc. IEEE Comput. Vis. Pattern Recognit. (CVPR) 2, 236–243 (2005)
18.
Zurück zum Zitat Li, J., Wang, T., Zhang, Y.: Face detection using SURF cascade. In: Proceedings IEEE computer vision and pattern recognition workshops (CVPRW), pp. 2183–2190 (2011) Li, J., Wang, T., Zhang, Y.: Face detection using SURF cascade. In: Proceedings IEEE computer vision and pattern recognition workshops (CVPRW), pp. 2183–2190 (2011)
19.
Zurück zum Zitat Liao, S., Jain, A.K., Li, S.: A fast and accurate unconstrained face detector. IEEE Trans. Pattern Anal. Mach. Intell. 38(2), 211–223 (2016)CrossRef Liao, S., Jain, A.K., Li, S.: A fast and accurate unconstrained face detector. IEEE Trans. Pattern Anal. Mach. Intell. 38(2), 211–223 (2016)CrossRef
20.
Zurück zum Zitat Wang, H., Klaser, A., Schmid, C., Liu, C-L.: Action recognition by dense trajectories. In: Proceedings IEEE computer vision and pattern recognition (CVPR), pp. 3169–3176 (2011) Wang, H., Klaser, A., Schmid, C., Liu, C-L.: Action recognition by dense trajectories. In: Proceedings IEEE computer vision and pattern recognition (CVPR), pp. 3169–3176 (2011)
21.
Zurück zum Zitat Wang, H., Schmid, C.: Action recognition with improved trajectories. In: Proceedings IEEE international conference on computer vision (ICCV), pp. 3551–3558 (2013) Wang, H., Schmid, C.: Action recognition with improved trajectories. In: Proceedings IEEE international conference on computer vision (ICCV), pp. 3551–3558 (2013)
22.
Zurück zum Zitat Kappor, A., Winn, J.: Located hidden random fields: learning discriminative parts for object Detection. In: Proceedings European conference on computer vision (ECCV), pp. 302–315 (2006) Kappor, A., Winn, J.: Located hidden random fields: learning discriminative parts for object Detection. In: Proceedings European conference on computer vision (ECCV), pp. 302–315 (2006)
23.
Zurück zum Zitat Leibe, B., Leonardis, A., Schiele, B.: Robust object detection with interleaved categorization and segmentation. Int. J. Comput. Vision 77(1), 259–289 (2008)CrossRef Leibe, B., Leonardis, A., Schiele, B.: Robust object detection with interleaved categorization and segmentation. Int. J. Comput. Vision 77(1), 259–289 (2008)CrossRef
24.
Zurück zum Zitat Shechtman, E., Irani, M.: Space-time behavior-based correlation-or-how to tell if two underlying motion fields are similar without computing them? IEEE Trans. Pattern Anal. Mach. Intell. 29(11), 2045–2056 (2007)CrossRef Shechtman, E., Irani, M.: Space-time behavior-based correlation-or-how to tell if two underlying motion fields are similar without computing them? IEEE Trans. Pattern Anal. Mach. Intell. 29(11), 2045–2056 (2007)CrossRef
25.
Zurück zum Zitat Ke, Y., Sukthankar, R., Hebert, M.: Event detection in crowded videos. In: Proceedings IEEE international conference on computer vision (ICCV), pp. 1–8 (2007) Ke, Y., Sukthankar, R., Hebert, M.: Event detection in crowded videos. In: Proceedings IEEE international conference on computer vision (ICCV), pp. 1–8 (2007)
26.
Zurück zum Zitat Tian, Y., Cao, L., Liu, Z., Zhang, Z.: Hierarchical filtered motion for action recognition in crowded videos. IEEE Trans. Syst. Man Cybern.-Part C: Appl. Rev. 42(3), 313–323 (2012)CrossRef Tian, Y., Cao, L., Liu, Z., Zhang, Z.: Hierarchical filtered motion for action recognition in crowded videos. IEEE Trans. Syst. Man Cybern.-Part C: Appl. Rev. 42(3), 313–323 (2012)CrossRef
27.
Zurück zum Zitat Yuan, J., Liu, Z., Wu, Y.: Discriminative subvolume search for efficient action detection. In: Proceedings IEEE computer vision and pattern recognition (CVPR), pp. 2442–2449 (2009) Yuan, J., Liu, Z., Wu, Y.: Discriminative subvolume search for efficient action detection. In: Proceedings IEEE computer vision and pattern recognition (CVPR), pp. 2442–2449 (2009)
28.
Zurück zum Zitat Siva, P., Xiang, T.: Weakly supervised action detection. In: Proceedings British machine vision conference (BMVC), pp. 65.1–65.11 (2011) Siva, P., Xiang, T.: Weakly supervised action detection. In: Proceedings British machine vision conference (BMVC), pp. 65.1–65.11 (2011)
29.
Zurück zum Zitat Roshtkhari, M.J., Levine, M.D.: Human activity recognition in videos using a single example. Image Vis. Comput. 31(11), 864–876 (2013) Roshtkhari, M.J., Levine, M.D.: Human activity recognition in videos using a single example. Image Vis. Comput. 31(11), 864–876 (2013)
30.
Zurück zum Zitat Adeli-Mosabbeb, E., Fathy, M.: Non-negative matrix completion for action detection. Image Vis. Comput. 39(7), 38–51 (2015)CrossRef Adeli-Mosabbeb, E., Fathy, M.: Non-negative matrix completion for action detection. Image Vis. Comput. 39(7), 38–51 (2015)CrossRef
31.
Zurück zum Zitat Schuldt, C., Laptev, I., Caputo, B.: Recognizing human actions: a local SVM approach. In: Proceedings IEEE conference on pattern recognition (ICPR), pp. 32–36 (2004) Schuldt, C., Laptev, I., Caputo, B.: Recognizing human actions: a local SVM approach. In: Proceedings IEEE conference on pattern recognition (ICPR), pp. 32–36 (2004)
32.
Zurück zum Zitat Liu, J., Luo, J., Shah, M.: Recognizing realistic actions from videos “in the wild”. In: Proceedings IEEE computer vision and pattern recognition (CVPR), pp. 1996–2003 (2009) Liu, J., Luo, J., Shah, M.: Recognizing realistic actions from videos “in the wild”. In: Proceedings IEEE computer vision and pattern recognition (CVPR), pp. 1996–2003 (2009)
33.
Zurück zum Zitat Marszalek, M., Laptev, I., Schmid, C.: Actions in context. In: Proceedings IEEE computer vision and pattern recognition (CVPR), pp. 2929–2936 (2009) Marszalek, M., Laptev, I., Schmid, C.: Actions in context. In: Proceedings IEEE computer vision and pattern recognition (CVPR), pp. 2929–2936 (2009)
34.
Zurück zum Zitat Wang, H., Klaser, A., Schmid, C., Liu, L.: Action recognition by dense trajectories. In: Proceedings IEEE computer vision and pattern recognition (CVPR), pp. 3169–3176 (2011) Wang, H., Klaser, A., Schmid, C., Liu, L.: Action recognition by dense trajectories. In: Proceedings IEEE computer vision and pattern recognition (CVPR), pp. 3169–3176 (2011)
35.
Zurück zum Zitat Reddy, K., Shah, M.: Recognizing 50 human action categories of web videos. Mach. Vis. Appl. 24, 971–981 (2013)CrossRef Reddy, K., Shah, M.: Recognizing 50 human action categories of web videos. Mach. Vis. Appl. 24, 971–981 (2013)CrossRef
36.
Zurück zum Zitat Ikizler-Cinbis, N., Sclaroff, S.: Object, scene and actions: combining multiple features for human action recognition. In: Proceedings European conference on computer vision (ECCV), pp. 494–507 (2010) Ikizler-Cinbis, N., Sclaroff, S.: Object, scene and actions: combining multiple features for human action recognition. In: Proceedings European conference on computer vision (ECCV), pp. 494–507 (2010)
37.
Zurück zum Zitat Le, Q. V., Zou, W. Y., Yeung, S. Y., Ng, A. Y.: Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis. In: Proceedings IEEE computer vision and pattern recognition (CVPR), pp. 3361–3368 (2011) Le, Q. V., Zou, W. Y., Yeung, S. Y., Ng, A. Y.: Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis. In: Proceedings IEEE computer vision and pattern recognition (CVPR), pp. 3361–3368 (2011)
38.
Zurück zum Zitat Chetverikov, D., Axt, A.: Approximation-free running SVD and its application to motion detection. Pattern Recogn. Lett. 31(9), 891–897 (2010) Chetverikov, D., Axt, A.: Approximation-free running SVD and its application to motion detection. Pattern Recogn. Lett. 31(9), 891–897 (2010)
39.
Zurück zum Zitat Liu, X., Wen, Z., Zhang, Y.: Limited memory block Krylov subspace optimization for computing dominant singular value decomposition. SIAM J. Sci. Comput. 35(3), 1641–1668 (2013)MathSciNetCrossRef Liu, X., Wen, Z., Zhang, Y.: Limited memory block Krylov subspace optimization for computing dominant singular value decomposition. SIAM J. Sci. Comput. 35(3), 1641–1668 (2013)MathSciNetCrossRef
Metadaten
Titel
Directional coherence-based spatiotemporal descriptor for object detection in static and dynamic scenes
verfasst von
Wonjun Kim
Jae-Joon Han
Publikationsdatum
23.07.2016
Verlag
Springer Berlin Heidelberg
Erschienen in
Machine Vision and Applications / Ausgabe 1-2/2017
Print ISSN: 0932-8092
Elektronische ISSN: 1432-1769
DOI
https://doi.org/10.1007/s00138-016-0801-7

Weitere Artikel der Ausgabe 1-2/2017

Machine Vision and Applications 1-2/2017 Zur Ausgabe