Published in: Cognitive Computation 6/2016

01-12-2016

A Biologically Inspired Framework for Visual Information Processing and an Application on Modeling Bottom-Up Visual Attention

Authors: Ala Aboudib, Vincent Gripon, Gilles Coppin



Abstract

Background

An emerging trend in visual information processing is to incorporate properties of the ventral stream in order to address limitations of machine learning algorithms. Selective attention and cortical magnification are two such properties, and both have been the subject of a large body of research in recent years. In this paper, we design a new model for visual acquisition that takes these properties into account.

Methods

We propose a new framework for visual information acquisition and representation that emulates the architecture of the primate visual system. It integrates features such as retinal sampling and cortical magnification while avoiding the spatial deformations and other side effects produced by earlier models that implemented these two features. It also explicitly integrates the notion of visual angle, which vision models rarely take into account. We argue that this framework can provide the infrastructure for implementing vision tasks such as object recognition and computational visual attention.
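As a rough illustration of the visual angle notion mentioned above, the following Python sketch computes the angle an image subtends at the viewer and the eccentricity of a pixel relative to the point of fixation. It is not the authors' implementation: the function names, the flat-image geometry, the square-pixel assumption, and fixation at the image center are all assumptions made for this example.

```python
import math


def visual_angle_deg(size, distance):
    """Visual angle (degrees) subtended by a flat object of extent `size`
    viewed head-on from `distance` (both in the same length unit)."""
    if size < 0 or distance <= 0:
        raise ValueError("size must be >= 0 and distance must be > 0")
    return math.degrees(2.0 * math.atan(size / (2.0 * distance)))


def pixel_eccentricity_deg(x, y, width_px, height_px, image_width, distance):
    """Eccentricity (degrees) of pixel (x, y), assuming the viewer fixates
    the image center, the image plane is perpendicular to the line of
    sight, and pixels are square.

    `image_width` is the physical width of the image, in the same length
    unit as `distance`. All of these are illustrative assumptions.
    """
    if width_px <= 0 or height_px <= 0 or distance <= 0:
        raise ValueError("image dimensions and distance must be positive")
    # Center of the pixel grid (may fall between pixels for even sizes).
    cx = (width_px - 1) / 2.0
    cy = (height_px - 1) / 2.0
    # Offset from the fixation point, converted from pixels to length units.
    offset = math.hypot(x - cx, y - cy) * (image_width / width_px)
    return math.degrees(math.atan(offset / distance))


if __name__ == "__main__":
    # Hypothetical viewing setup: a 0.5 m wide image seen from 1 m away.
    print(visual_angle_deg(0.5, 1.0))
    print(pixel_eccentricity_deg(0, 0, 1024, 768, 0.5, 1.0))
```

Under these assumptions, moving the viewer closer increases both the visual angle of the image and the eccentricity of its border pixels, which is the kind of distance dependence a visual angle parameter makes explicit.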

Results

To demonstrate the utility of the proposed framework, we implement an algorithm for bottom-up saliency prediction on top of it. We evaluate the resulting model on the MIT saliency benchmark and show that it attains state-of-the-art performance while providing some advantages over other models.
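To make the idea of scoring a saliency map against human fixations concrete, the sketch below computes the Normalized Scanpath Saliency (NSS), one of the metrics commonly reported on saliency benchmarks. This abstract does not say which metrics the paper reports, so the choice of NSS, the function name, and the list-of-rows data layout are assumptions for illustration only.

```python
from statistics import mean, pstdev


def nss(saliency, fixations):
    """Normalized Scanpath Saliency.

    `saliency` is a 2-D map given as a list of equal-length rows.
    `fixations` is a list of (row, col) positions where observers looked.
    The map is standardized to zero mean and unit standard deviation, and
    the standardized values at the fixated positions are averaged.
    """
    values = [v for row in saliency for v in row]
    if not values or not fixations:
        raise ValueError("saliency map and fixations must be non-empty")
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("saliency map is constant; NSS is undefined")
    return mean([(saliency[r][c] - mu) / sigma for r, c in fixations])


if __name__ == "__main__":
    # Toy 2x2 map with a single salient location (hypothetical data).
    toy_map = [[0.0, 0.0],
               [0.0, 4.0]]
    print(nss(toy_map, [(1, 1)]))  # fixation on the salient location
    print(nss(toy_map, [(0, 0)]))  # fixation on a non-salient location
```

A positive score means fixations tend to land on locations the model rates above its own average, and a score near zero indicates chance-level agreement.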

Conclusion

The main contributions of this paper are: (1) A new bio-inspired framework for visual information acquisition and representation that (a) takes the distance between an image and the viewer into account through a visual angle parameter, which most visual acquisition models ignore, and (b) reduces the amount of visual information acquired through a new scheme for emulating the retinal sampling and cortical magnification effects observed in the ventral stream. (2) A concrete application of the framework as a substrate for a new saliency-based visual attention model, which is shown to attain state-of-the-art performance on the MIT saliency benchmark. (3) An online Git repository that implements the framework and is meant to be developed as a scalable, collaborative project.


Metadata
Title
A Biologically Inspired Framework for Visual Information Processing and an Application on Modeling Bottom-Up Visual Attention
Authors
Ala Aboudib
Vincent Gripon
Gilles Coppin
Publication date
01-12-2016
Publisher
Springer US
Published in
Cognitive Computation / Issue 6/2016
Print ISSN: 1866-9956
Electronic ISSN: 1866-9964
DOI
https://doi.org/10.1007/s12559-016-9430-8
