Soft Computing

21-11-2017 | Methodologies and Application

Scalable scene understanding via saliency consensus

Authors: Bharath Ramesh, Nicholas Lim Zhi Jian, Liang Chen, Cheng Xiang, Zhi Gao

Published in: Soft Computing | Issue 7/2019

Abstract

We propose a scene understanding framework that, given a single image, segments and categorizes the objects in the scene and classifies the overall scene. A handful of frameworks already perform these tasks coherently, but training these models is time-consuming, which limits their scalability. This paper presents a scalable framework that adopts an object-based approach and sequentially performs unsupervised object discovery using multiple saliency detection algorithms, object segmentation by graph cut, object classification using the bag-of-features model, and, lastly, scene classification by binary decision trees. A novel region-of-interest (ROI) detector, based on morphological image processing techniques, is proposed to automatically provide object location priors from saliency maps. Additionally, to improve object discovery, multiple saliency detectors are combined using a novel method to produce the ROI map, which is then used to obtain the segmentation. We tested our system on a novel object-based scene dataset and obtained a high classification accuracy using the proposed object discovery step. Unlike existing frameworks, the proposed algorithm remains scalable because the object discovery step is fully unsupervised, so it can easily accommodate more object and scene categories.
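The object discovery step described in the abstract (fusing several saliency maps, then extracting an ROI with morphological operations) can be illustrated with a small sketch. The abstract does not specify the fusion rule or the morphological operators, so everything below is an assumption made for illustration: the maps are min-max normalized and averaged, the fused map is thresholded at a hypothetical value of 0.5, a 3x3 morphological opening removes isolated responses, and the bounding box of the largest connected component is taken as the ROI. It is not the authors' method, and all function names are hypothetical.

```python
from collections import deque

def normalize(m):
    """Min-max rescale a 2-D map (list of rows) to the range [0, 1]."""
    lo = min(min(row) for row in m)
    hi = max(max(row) for row in m)
    span = (hi - lo) or 1.0  # avoid division by zero on constant maps
    return [[(v - lo) / span for v in row] for row in m]

def consensus(maps):
    """Assumed fusion rule: pixel-wise mean of the normalized saliency maps."""
    normed = [normalize(m) for m in maps]
    h, w = len(normed[0]), len(normed[0][0])
    return [[sum(m[y][x] for m in normed) / len(normed) for x in range(w)]
            for y in range(h)]

def _morph(mask, op):
    """3x3 erosion (op=all) or dilation (op=any); pixels outside the image count as 0."""
    h, w = len(mask), len(mask[0])
    def px(y, x):
        return mask[y][x] if 0 <= y < h and 0 <= x < w else 0
    return [[int(op(px(y + dy, x + dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
             for x in range(w)] for y in range(h)]

def opening(mask):
    """Morphological opening: erosion followed by dilation."""
    return _morph(_morph(mask, all), any)

def largest_roi(mask):
    """Inclusive bounding box (top, left, bottom, right) of the largest
    4-connected component, or None if the mask is empty."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                comp, queue = [], deque([(y, x)])
                seen[y][x] = True
                while queue:  # breadth-first flood fill
                    cy, cx = queue.popleft()
                    comp.append((cy, cx))
                    for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                if len(comp) > len(best):
                    best = comp
    if not best:
        return None
    ys = [p[0] for p in best]
    xs = [p[1] for p in best]
    return (min(ys), min(xs), max(ys), max(xs))

def detect_roi(maps, threshold=0.5):
    """Fuse the maps, binarize, clean up with an opening, and return the ROI box."""
    fused = consensus(maps)
    mask = [[int(v >= threshold) for v in row] for row in fused]
    return largest_roi(opening(mask))

# Toy input: two 7x7 "saliency maps" that agree on a 3x3 object;
# the first map also contains one spurious response in the top-right corner.
map_a = [[1.0 if 2 <= y <= 4 and 2 <= x <= 4 else 0.0 for x in range(7)] for y in range(7)]
map_a[0][6] = 1.0
map_b = [[0.8 if 2 <= y <= 4 and 2 <= x <= 4 else 0.0 for x in range(7)] for y in range(7)]
print(detect_roi([map_a, map_b]))
```

In a full pipeline of the kind the abstract outlines, a box like this would serve as the location prior for graph-cut segmentation. The opening is what discards the spurious corner response that only one detector reports.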

https://sites.google.com/site/bharathramesh03/gallery.

A demo can be found at https://sites.google.com/site/bharathramesh03/research.

Title: Scalable scene understanding via saliency consensus
Authors: Bharath Ramesh, Nicholas Lim Zhi Jian, Liang Chen, Cheng Xiang, Zhi Gao
Publication date: 21-11-2017
Publisher: Springer Berlin Heidelberg
Published in: Soft Computing / Issue 7/2019
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI: https://doi.org/10.1007/s00500-017-2939-2
