Cognitive Processing

15 November 2021 | Research Article

A visual object segmentation algorithm with spatial and temporal coherence inspired by the architecture of the visual cortex

Authors: Juan A. Ramirez-Quintana, Raul Rangel-Gonzalez, Mario I. Chacon-Murguia, Graciela Ramirez-Alonso

Published in: Cognitive Processing | Issue 1/2022

Abstract

Scene analysis in video sequences is a complex task for a computer vision system. Several approaches have been applied to this analysis, such as deep learning networks and traditional image processing methods. However, these methods require thorough training or manual parameter tuning to achieve accurate results, so novel methods for analyzing scene information in video sequences are needed. This paper therefore proposes a method for object segmentation in video sequences inspired by the structural layers of the visual cortex, called Neuro-Inspired Object Segmentation (SegNI). SegNI has a hierarchical architecture that analyzes object features such as edges, color, and motion to generate regions that represent the objects in the scene. Results obtained on the Video Segmentation Benchmark (VSB100) dataset show that SegNI adapts automatically to videos whose scenes differ in nature, composition, and types of objects. SegNI also adapts its processing to new scene conditions without training, which is a significant advantage over deep learning networks.
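
To make the idea of combining edge, color, and motion cues into regions more concrete, here is a minimal sketch in Python with NumPy. It is not the SegNI architecture, whose layers are not described in the abstract. The function names `pixel_features` and `coarse_regions`, the frame-difference motion cue, and the quantization-based grouping are all assumptions chosen for illustration only.

```python
import numpy as np


def pixel_features(prev_frame, frame):
    """Stack simple per-pixel cues: color, edge strength, and motion.

    Both frames are float arrays of shape (H, W, 3) with values in [0, 1].
    Illustrative stand-in only, not the feature layers used by SegNI.
    """
    gray = frame.mean(axis=2)
    prev_gray = prev_frame.mean(axis=2)
    gy, gx = np.gradient(gray)            # intensity gradients along rows, columns
    edges = np.hypot(gx, gy)              # edge cue: gradient magnitude
    motion = np.abs(gray - prev_gray)     # motion cue: absolute frame difference
    return np.dstack([frame, edges, motion])   # shape (H, W, 5)


def coarse_regions(features, bins=4):
    """Give pixels with similar quantized features the same integer label.

    Groups by feature similarity only; spatial connectivity is ignored.
    """
    lo = features.min(axis=(0, 1), keepdims=True)
    hi = features.max(axis=(0, 1), keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)     # avoid division by zero
    q = np.minimum(((features - lo) / span * bins).astype(int), bins - 1)
    weights = bins ** np.arange(features.shape[2])
    codes = (q * weights).sum(axis=2)          # one code per feature combination
    _, labels = np.unique(codes.ravel(), return_inverse=True)
    return labels.reshape(codes.shape)
```

Calling `coarse_regions(pixel_features(prev_frame, frame))` on two consecutive frames returns an (H, W) label map. The sketch uses fixed quantization bins, which is exactly the kind of manually chosen parameter the abstract says SegNI aims to avoid.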

Title: A visual object segmentation algorithm with spatial and temporal coherence inspired by the architecture of the visual cortex
Authors: Juan A. Ramirez-Quintana, Raul Rangel-Gonzalez, Mario I. Chacon-Murguia, Graciela Ramirez-Alonso
Publication date: 15 November 2021
Publisher: Springer Berlin Heidelberg
Published in: Cognitive Processing, Issue 1/2022
Print ISSN: 1612-4782
Electronic ISSN: 1612-4790
DOI: https://doi.org/10.1007/s10339-021-01065-y
