2014 | OriginalPaper | Book chapter

Multimodal Mixed Conditional Random Field Model for Category-Independent Object Detection

Authors: Jian-Hua Zhang, Jian-Wei Zhang, Sheng-Yong Chen, Ying Hu

Published in: Foundations and Practical Applications of Cognitive Systems and Information Processing

Publisher: Springer Berlin Heidelberg

Abstract

Category-independent object detection is useful for many robot vision tasks. Most existing methods rank a large number of candidate regions by their object-likeness, but reaching a sufficient object coverage rate requires sampling too many regions. In this paper, we present a method that directly detects and localizes category-independent objects. We develop a "mixed robust higher-order conditional random field" model that combines 2D and 3D data in a unified framework. A set of new features is derived from 2D and 3D saliency and oversegments, and the potentials used in the model are computed from these features. Extensive experiments are carried out on a public RGB-D dataset. Compared with state-of-the-art ranking methods, the proposed method achieves comparable category-independent object detection performance without sampling a large number of extra regions.
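The abstract does not give the form of the model's potentials, so the following is only a minimal sketch of how a robust higher-order term over oversegments can enter a CRF energy. It assumes binary labels (1 = object, 0 = background), a Potts pairwise term, and the truncated-linear form of the robust P^n potential from Kohli et al. (2009). The names `robust_pn_potential`, `total_energy`, `gamma_max`, `q`, and `pairwise_weight` are hypothetical and not taken from the chapter, and the unary costs here are placeholders rather than the saliency-based features the authors describe.

```python
import numpy as np


def robust_pn_potential(labels, gamma_max, q):
    """Truncated-linear label-consistency cost for one segment (assumed form).

    The cost grows linearly with the number of nodes that disagree with the
    segment's dominant label and is capped at gamma_max once more than q
    nodes disagree.
    """
    counts = np.bincount(labels)
    n_inconsistent = labels.size - counts.max()
    return min(n_inconsistent * gamma_max / q, gamma_max)


def total_energy(labels, unary, edges, pairwise_weight, segments, gamma_max, q):
    """Sum of unary, Potts pairwise, and higher-order segment costs.

    labels:   (n,) integer array of node labels
    unary:    (n, n_labels) array of per-node label costs (placeholder values)
    edges:    iterable of (i, j) index pairs between neighbouring nodes
    segments: list of index arrays, one per oversegment
    """
    n = labels.size
    energy = unary[np.arange(n), labels].sum()
    for i, j in edges:
        if labels[i] != labels[j]:
            energy += pairwise_weight  # Potts penalty for a label change
    for seg in segments:
        energy += robust_pn_potential(labels[seg], gamma_max, q)
    return float(energy)
```

In this sketch a segment whose nodes all share one label costs nothing, while a few dissenting nodes are penalised softly instead of forcing the whole segment to a single label. In a multimodal setting, segments from 2D and 3D oversegmentations could each contribute such a term, though how the chapter actually mixes them is not stated in the abstract.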

Metadata
Title
Multimodal Mixed Conditional Random Field Model for Category-Independent Object Detection
Authors
Jian-Hua Zhang
Jian-Wei Zhang
Sheng-Yong Chen
Ying Hu
Copyright year
2014
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-642-37835-5_54