2013 | OriginalPaper | Book chapter

Learning a Family of Detectors via Multiplicative Kernels

Authors: Quan Yuan, Ashwin Thangali, Vitaly Ablavsky, Stan Sclaroff

Published in: Topics in Medical Image Processing and Computational Vision

Publisher: Springer Netherlands

Abstract

Object detection is challenging when the object class exhibits large within-class variations. In this work, we show that foreground-background classification (detection) and within-class classification of the foreground class (pose estimation) can be jointly learned in a multiplicative form of two kernel functions. Model training is accomplished via standard SVM learning. When the foreground object masks are provided in training, the detectors can also produce object segmentations. A tracking-by-detection framework to recover foreground state in video sequences is also proposed with our model. The advantages of our method are demonstrated on tasks of object detection, view angle estimation and tracking. Our approach compares favorably to existing methods on hand and vehicle detection tasks. Quantitative tracking results are given on sequences of moving vehicles and human faces.
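
The multiplicative form mentioned in the abstract can be illustrated with a small sketch. It assumes each training sample is a pair of image features and a within-class parameter (for example, a view angle), and that both kernel factors are Gaussian RBF kernels. The abstract does not state the kernel choices, so the function names, the gamma values, the coefficients, and the toy data below are all hypothetical. Because the product of two positive semidefinite kernels is itself positive semidefinite, such a product can be passed to a standard SVM solver as a single kernel.

```python
import math


def rbf(u, v, gamma):
    """Gaussian RBF kernel between two equal-length numeric sequences."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


def multiplicative_kernel(sample_a, sample_b, gamma_x=0.5, gamma_theta=2.0):
    """Product of a feature kernel and a within-class (pose) kernel.

    Each sample is a (features, theta) pair. The gamma values are
    illustrative assumptions, not values taken from the chapter.
    """
    x_a, theta_a = sample_a
    x_b, theta_b = sample_b
    return rbf(x_a, x_b, gamma_x) * rbf(theta_a, theta_b, gamma_theta)


def gram_matrix(samples):
    """Pairwise kernel matrix, as a precomputed-kernel SVM would consume it."""
    return [[multiplicative_kernel(a, b) for b in samples] for a in samples]


def decision_value(x, theta, support, coeffs, bias=0.0):
    """Kernel-expansion score for features x under a fixed parameter theta.

    coeffs stand in for signed SVM dual weights (hypothetical here).
    Fixing theta selects one member of the detector family.
    """
    return bias + sum(
        c * multiplicative_kernel((x, theta), s)
        for c, s in zip(coeffs, support)
    )


# Toy (features, theta) pairs; purely illustrative.
samples = [
    ([0.0, 1.0], [0.0]),
    ([0.1, 0.9], [0.1]),
    ([1.0, 0.0], [1.5]),
]
K = gram_matrix(samples)
score = decision_value(samples[0][0], samples[0][1], samples, [1.0, 0.0, 0.0])
```

Evaluating `decision_value` for the same features at several values of `theta` and keeping the highest score is one way to read off a parameter estimate alongside the detection score, under the assumptions above.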

Footnotes
1
In this paper, all vector variables are column vectors.
 
Metadata
Title
Learning a Family of Detectors via Multiplicative Kernels
Authors
Quan Yuan
Ashwin Thangali
Vitaly Ablavsky
Stan Sclaroff
Copyright year
2013
Publisher
Springer Netherlands
DOI
https://doi.org/10.1007/978-94-007-0726-9_1
