Published in: Neural Computing and Applications 9/2017

08-04-2016 | IBPRIA 2015

Random clustering ferns for multimodal object recognition

Authors: M. Villamizar, A. Garrell, A. Sanfeliu, F. Moreno-Noguer


Abstract

We propose an efficient and robust method for recognizing objects that exhibit multiple intra-class modes, each associated with a particular object appearance. The proposed method, called random clustering ferns, combines a single real-time classifier, built by boosting extremely randomized trees (ferns), with an unsupervised probabilistic approach. The combination recognizes object instances in images efficiently and simultaneously discovers the most prominent appearance modes of the object through tree-structured visual words. In particular, we use boosted random ferns and probabilistic latent semantic analysis (pLSA) to obtain a discriminative, multimodal classifier that automatically clusters the responses of its randomized trees according to the visual appearance of the object. The method is validated extensively in synthetic and real experiments, which show that it detects objects with diverse and complex appearance distributions in real time.
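
The clustering step named in the abstract can be illustrated with a generic pLSA fit by expectation-maximization. This is a minimal sketch and not the authors' implementation: it assumes each image sample is a "document" described by a histogram of visual-word counts and that each latent topic plays the role of one appearance mode. The function name `plsa` and its parameters `counts`, `n_topics`, `n_iter`, and `seed` are hypothetical.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA by EM on a (n_docs, n_words) matrix of word counts.

    Returns P(z|d) with shape (n_docs, n_topics) and
    P(w|z) with shape (n_topics, n_words).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # Random, row-normalized initialization of both distributions.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]  # (docs, topics, words)
        joint /= joint.sum(axis=1, keepdims=True) + 1e-12
        # M-step: re-estimate both distributions from expected counts.
        expected = counts[:, None, :] * joint
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_z_d, p_w_z

# Toy data: two groups of samples that activate disjoint sets of words.
counts = np.array([[5, 3, 2, 0, 0, 0]] * 3 + [[0, 0, 0, 2, 3, 5]] * 3, dtype=float)
p_z_d, p_w_z = plsa(counts, n_topics=2, n_iter=100)
modes = p_z_d.argmax(axis=1)  # most likely appearance mode per sample
```

Under these assumptions, taking the most probable topic per sample gives a hard mode assignment, while `p_z_d` itself keeps the soft, probabilistic assignment.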


Footnotes
1
We use the terms cluster and mode interchangeably to refer to a dense part of the object appearance distribution.
 
2
The indicator function \({\mathbb {I}}(e)\) equals 1 if \(e\) is true and 0 otherwise.
 
3
The EER is the point in the precision-recall curve where precision = recall.
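
The point where precision equals recall can be computed directly, as in the sketch below. It relies on the fact that when the number of samples accepted as detections equals the number of true positives in the data, precision (TP / accepted) and recall (TP / positives) share the same denominator. The function name `eer_precision_recall` and its arguments are hypothetical, and tied scores are resolved by input order, which is an assumption.

```python
import numpy as np

def eer_precision_recall(scores, labels):
    """Value of precision (= recall) at the point where the two coincide.

    scores: detection confidences, higher means more object-like.
    labels: truthy for positive samples, falsy for negatives.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    n_pos = int(labels.sum())
    if n_pos == 0:
        raise ValueError("at least one positive sample is required")
    # Accept exactly n_pos samples, taken in order of decreasing score.
    order = np.argsort(-scores, kind="stable")
    accepted = labels[order][:n_pos]
    # TP / n_pos is both the precision and the recall at this threshold.
    return float(accepted.mean())

eer = eer_precision_recall([0.9, 0.8, 0.7, 0.6, 0.5, 0.4], [1, 1, 0, 1, 0, 0])
```

In this toy call there are three positives and two of them rank among the three highest scores, so precision and recall are both 2/3.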
 
4
The score distribution (Gaussian function) is calculated using the confidences of the BRFs for all class samples.
 
5
The squared Hellinger distance between two Gaussian distributions P and Q, with means \(\mu _P\), \(\mu _Q\) and standard deviations \(\sigma _P\), \(\sigma _Q\), is defined as \(H^2(P,Q) = 1 -\sqrt{k_1/k_2}\exp (-0.25k_3/k_2)\), with \(k_1 = 2 \sigma _P \sigma _Q\), \(k_2=\sigma _P^2 + \sigma _Q^2\), and \(k_3 =(\mu _P - \mu _Q)^2\).
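
The formula in this footnote translates directly into code. The sketch below assumes both score distributions are univariate Gaussians described by a mean and a positive standard deviation; the function name `squared_hellinger` and its argument names are hypothetical.

```python
import math

def squared_hellinger(mu_p, sigma_p, mu_q, sigma_q):
    """Squared Hellinger distance between two univariate Gaussians."""
    k1 = 2.0 * sigma_p * sigma_q        # k_1 = 2 * sigma_P * sigma_Q
    k2 = sigma_p ** 2 + sigma_q ** 2    # k_2 = sigma_P^2 + sigma_Q^2
    k3 = (mu_p - mu_q) ** 2             # k_3 = (mu_P - mu_Q)^2
    return 1.0 - math.sqrt(k1 / k2) * math.exp(-0.25 * k3 / k2)

same = squared_hellinger(0.0, 1.0, 0.0, 1.0)   # identical distributions
far = squared_hellinger(0.0, 1.0, 50.0, 1.0)   # well-separated distributions
```

The value lies between 0 and 1: it is 0 for identical distributions and approaches 1 as the means move apart relative to the spreads.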
 
6
However, human assistance can be used during learning to improve the visual skills of the classifier [40].
 
8
For this problem, only 300 visual words are activated out of 38,400 words, each one corresponding to a fern output.
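
The sparsity described in this footnote can be reproduced with a small sketch in which every fern activates exactly one word. The numbers are assumptions inferred from the footnote rather than stated in it: 300 ferns with 7 binary features each would give 2^7 = 128 outputs per fern and 300 × 128 = 38,400 words. The pixel-comparison features, the helper names `fern_outputs` and `word_histogram`, and the 24 × 24 patch size are likewise hypothetical.

```python
import numpy as np

N_FERNS = 300     # assumption: one activated word per fern
N_FEATURES = 7    # assumption: 2**7 = 128 outputs per fern

def fern_outputs(patch, pairs):
    """Evaluate every fern on one image patch.

    pairs has shape (n_ferns, n_features, 2) and holds flat pixel indices;
    each binary feature compares the intensities of two pixels.
    """
    flat = patch.ravel()
    bits = flat[pairs[..., 0]] > flat[pairs[..., 1]]   # (n_ferns, n_features)
    weights = 1 << np.arange(pairs.shape[1])           # 1, 2, 4, ...
    return (bits * weights).sum(axis=1)                # ints in [0, 2**n_features)

def word_histogram(outputs, n_features):
    """Binary histogram with one block of 2**n_features words per fern."""
    n_ferns = outputs.shape[0]
    n_out = 1 << n_features
    hist = np.zeros(n_ferns * n_out, dtype=np.int64)
    hist[np.arange(n_ferns) * n_out + outputs] = 1     # one active word per fern
    return hist

rng = np.random.default_rng(0)
patch = rng.random((24, 24))
pairs = rng.integers(0, patch.size, size=(N_FERNS, N_FEATURES, 2))
hist = word_histogram(fern_outputs(patch, pairs), N_FEATURES)
```

Because each fern owns a disjoint block of words, the histogram has 38,400 entries of which exactly 300 are non-zero under these assumptions.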
 
9
Since the pLSA clustering is automatic, the confusion matrix is not necessarily diagonal. However, here the labels provided by pLSA have been sorted for display purposes.
 
Literature
1.
Ali K, Saenko K (2014) Confidence-rated multiple instance boosting for object detection. In: CVPR
2.
Blockeel H, De Raedt L, Ramon J (1998) Top-down induction of clustering trees. In: ICML, pp 55–63
3.
Bosch A, Zisserman A, Muñoz X (2006) Scene classification via pLSA. In: ECCV
4.
Bosch A, Zisserman A, Muñoz X (2007) Image classification using random forests and ferns. In: ICCV, pp 1–8
6.
Criminisi A, Shotton J, Konukoglu E (2012) Decision forests: a unified framework for classification, regression, density estimation, manifold learning and semi-supervised learning. Found Trends Comput Graph Vis 7(2–3):81–227
7.
Csurka G, Dance C, Fan L, Willamowski J, Bray C (2004) Visual categorization with bags of keypoints. In: Proceedings ECCV workshop statistical learning in computer vision, pp 59–74
8.
Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: CVPR, pp 886–893
9.
Erhan D, Szegedy C, Toshev A, Anguelov D (2014) Scalable object detection using deep neural networks. In: CVPR
10.
Felzenszwalb PF, Girshick RB, McAllester D, Ramanan D (2010) Object detection with discriminatively trained part-based models. PAMI 32(9):1627–1645
11.
Fergus R, Perona P, Zisserman A (2003) Object class recognition by unsupervised scale-invariant learning. In: CVPR
12.
Gall J, Yao A, Razavi N, Van Gool L, Lempitsky V (2011) Hough forests for object detection, tracking, and action recognition. PAMI 33(11):2188–2202
13.
Garrell A, Villamizar M, Moreno-Noguer F, Sanfeliu A (2013) Proactive behavior of an autonomous mobile robot for human-assisted learning. In: RO-MAN
14.
Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: CVPR, pp 580–587
15.
Hall D, Perona P (2014) From categories to individuals in real time: a unified boosting approach. In: CVPR
16.
Hofmann T (2001) Unsupervised learning by probabilistic latent semantic analysis. Mach Learn 42(1–2):177–196
17.
Jurie F, Triggs B (2005) Creating efficient codebooks for visual recognition. In: ICCV, pp 604–610
18.
Kalal Z, Mikolajczyk K, Matas J (2012) Tracking–learning–detection. PAMI 34(7):1409–1422
19.
Kim TK, Cipolla R (2009) MCBoost: multiple classifier boosting for perceptual co-clustering of images and visual features. In: NIPS, pp 841–848
20.
Klein DA, Schulz D, Frintrop S, Cremers AB (2010) Adaptive real-time video-tracking for arbitrary objects. In: IROS
21.
Krupka E, Vinnikov A, Klein B, Hillel AB, Freedman D, Stachniak S (2014) Discriminative ferns ensemble for hand pose recognition. In: CVPR, pp 3670–3677
22.
LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324
23.
Liu B, Xia Y, Yu PS (2000) Clustering through decision tree construction. In: Proceedings of the ninth ACM international conference on information and knowledge management, pp 20–29
24.
Lowe DG (2004) Distinctive image features from scale-invariant keypoints. IJCV 60(2):91–110
25.
Malisiewicz T, Gupta A, Efros AA (2011) Ensemble of exemplar-SVMs for object detection and beyond. In: ICCV, pp 89–96
26.
Marée R, Geurts P, Piater J, Wehenkel L (2005) Random subwindows for robust image classification. In: CVPR, pp 34–40
27.
Moosmann F, Nowak E, Jurie F (2008) Randomized clustering forests for image classification. PAMI 30(9):1632–1646
28.
Murphy KP (2012) Machine learning: a probabilistic perspective. MIT Press, Cambridge
29.
Nister D, Stewenius H (2006) Scalable recognition with a vocabulary tree. In: CVPR, pp 2161–2168
30.
Ozuysal M, Calonder M, Lepetit V, Fua P (2010) Fast keypoint recognition using random ferns. PAMI 32(3):448–461
31.
Ozuysal M, Lepetit V, Fua P (2009) Pose estimation for category specific multiview object localization. In: CVPR, pp 778–785
32.
Perbet F, Stenger B, Maki A (2009) Random forest clustering and application to video segmentation. In: BMVC, pp 1–10
33.
Schapire RE, Singer Y (1999) Improved boosting algorithms using confidence-rated predictions. Mach Learn 37(3):297–336
34.
Sharma P, Nevatia R (2014) Multi class boosted random ferns for adapting a generic object detector to a specific video. In: WACV, pp 745–752
35.
Shotton J, Johnson M, Cipolla R (2008) Semantic texton forests for image categorization and segmentation. In: CVPR, pp 1–8
36.
Sivic J, Russell B, Efros A, Zisserman A, Freeman WT (2005) Discovering objects and their location in images. In: ICCV
37.
Torralba A, Murphy KP, Freeman WT (2007) Sharing visual features for multiclass and multiview object detection. PAMI 29(5):854–869
38.
Van der Maaten L, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9(85):2579–2605
39.
Villamizar M, Andrade-Cetto J, Sanfeliu A, Moreno-Noguer F (2012) Bootstrapping boosted random ferns for discriminative and efficient object classification. Pattern Recognit 45(9):3141–3153
40.
Villamizar M, Garrell A, Sanfeliu A, Moreno-Noguer F (2012) Online human-assisted learning using random ferns. In: ICPR, pp 2821–2824
41.
Villamizar M, Garrell A, Sanfeliu A, Moreno-Noguer F (2015) Modeling robot’s world with minimal effort. In: ICRA
42.
Villamizar M, Garrell A, Sanfeliu A, Moreno-Noguer F (2015) Multimodal object recognition using random clustering trees. In: IBPRIA
43.
Villamizar M, Grabner H, Andrade-Cetto J, Sanfeliu A, Van Gool L, Moreno-Noguer F (2011) Efficient 3D object detection using multiple pose-specific classifiers. In: BMVC
44.
Villamizar M, Moreno-Noguer F, Andrade-Cetto J, Sanfeliu A (2010) Efficient rotation invariant object detection using boosted random ferns. In: CVPR, pp 1038–1045
45.
Viola P, Jones M (2001) Rapid object detection using a boosted cascade of simple features. In: CVPR, pp I-511–I-518
46.
Wu B, Nevatia R (2007) Cluster boosted tree classifier for multi-view, multi-pose object detection. In: ICCV, pp 1–8
47.
Yan J, Lei Z, Wen L, Li SZ (2014) The fastest deformable part model for object detection. In: CVPR, pp 2497–2504
Metadata
Title
Random clustering ferns for multimodal object recognition
Authors
M. Villamizar
A. Garrell
A. Sanfeliu
F. Moreno-Noguer
Publication date
08-04-2016
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 9/2017
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-016-2284-x
