Published in: Cognitive Computation 1/2013

01.03.2013

Visual Saliency from Image Features with Application to Compression

Authors: P. Harding, N. M. Robertson


Abstract

Image feature point algorithms and their associated regional descriptors can be viewed as primitive detectors of visually salient information. In this paper, a new method for constructing a visual attention probability map from features is proposed. (Throughout this work we use SURF features, though the algorithm is not limited to SURF alone.) This technique is validated using comprehensive human eye-tracking experiments. We call this algorithm "visual interest" (VI), since the resultant segmentation reveals image regions that are visually salient during the performance of multiple observer search tasks. We demonstrate that it works on generic, eye-level photographs and does not depend on heuristic tuning. We further show that the descriptor-matching property of the SURF feature points can be exploited via object recognition to modulate the context of the attention probability map for a given object search task, refining the salient area. We fully validate the VI algorithm by applying it to saliency-based compression, pre-blurring non-salient regions prior to JPEG encoding, and conducting comprehensive observer performance tests. When using the object contextualisation, we conclude that JPEG files are around 33 % larger than they need to be to fully represent the task-relevant information within them. We finally demonstrate the utility of the segmentation as a region of interest in JPEG2000 compression, achieving superior image quality (measured statistically using PSNR and SSIM) over the automatically selected salient image regions while reducing the image file size to as little as 25 % of the original. Our technique therefore delivers superior compression performance through the detection and selective preservation of visually salient information relevant to multiple observer tasks. In contrast to the state of the art in task-directed visual attention models, the VI algorithm reacts only to the image content and requires no detailed prior knowledge of either the scene or the ultimate observer task.
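To make the pipeline described in the abstract concrete, the following is a minimal, illustrative Python/OpenCV sketch of feature-driven saliency followed by pre-blur compression: detect keypoints, accumulate a Gaussian blob per keypoint into an attention probability map, threshold it, blur the non-salient background, and re-encode as JPEG. This is not the authors' published implementation; SURF (cv2.xfeatures2d.SURF_create) requires an opencv-contrib build, so ORB is used here as a stand-in, and the kernel scale, threshold and blur radius are illustrative placeholders rather than the validated VI parameters.

```python
import cv2
import numpy as np

def feature_saliency_map(gray, keypoints, sigma_scale=1.0):
    """Accumulate one Gaussian blob per keypoint, weighted by its detector
    response, then normalise to a [0, 1] attention probability map."""
    h, w = gray.shape
    sal = np.zeros((h, w), dtype=np.float32)
    for kp in keypoints:
        x, y = int(round(kp.pt[0])), int(round(kp.pt[1]))
        sigma = max(kp.size * sigma_scale, 1.0)
        r = int(3 * sigma)
        x0, x1 = max(0, x - r), min(w, x + r + 1)
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        yy, xx = np.mgrid[y0:y1, x0:x1]
        sal[y0:y1, x0:x1] += kp.response * np.exp(
            -((xx - x) ** 2 + (yy - y) ** 2) / (2 * sigma ** 2))
    sal -= sal.min()
    if sal.max() > 0:
        sal /= sal.max()
    return sal

def salient_compress(path_in, path_out, thresh=0.2, blur_ksize=21, quality=50):
    """Keep salient regions sharp, pre-blur the rest, then JPEG-encode."""
    img = cv2.imread(path_in)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # The paper uses SURF; cv2.xfeatures2d.SURF_create needs opencv-contrib,
    # so ORB is used here purely so the sketch runs everywhere.
    detector = cv2.ORB_create(nfeatures=1000)
    keypoints = detector.detect(gray, None)
    sal = feature_saliency_map(gray, keypoints)
    mask = (sal >= thresh).astype(np.uint8)               # 1 = visually salient
    blurred = cv2.GaussianBlur(img, (blur_ksize, blur_ksize), 0)
    out = np.where(mask[..., None] == 1, img, blurred)    # blur only non-salient pixels
    cv2.imwrite(path_out, out, [cv2.IMWRITE_JPEG_QUALITY, quality])

salient_compress("scene.jpg", "scene_vi.jpg")
```

In the paper, the segmentation threshold and the pre-blur are validated against eye-tracking data and observer performance tests; in this sketch they would need to be tuned for the images at hand.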


Footnotes
1
The segmentation produces an "expectation" value of eye-fixation capture. Our parameters reliably capture 70–75 % of fixations in cluttered indoor scenes, and considerably more, 85–95 %, in outdoor scenes, even as the task is varied (see Fig. 5).
 
2
Such confusions could arise from letters at low resolution, including the following sets of confusions: F⇔R, W⇔M, W⇔N, G⇔6, G⇔C, B⇔8, V⇔Y, X⇔A, H⇔K, 5⇔6, F⇔P, H⇔A, G⇔D, O⇔D, B⇔E, 6⇔8 and G⇔6.
 
3
There is a region-of-interest (ROI) capability in JPEG2000, but the parameters of the JPEG2000 algorithm are not directly related to perceived output quality. In contrast, the JPEG algorithm was designed using data from observer tests: a quality setting of Q = 50 is expected to produce good visual quality for photo-real imagery. The sketch below makes the distinction concrete.
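As an illustration only, the following minimal Pillow sketch (assuming a Pillow build with OpenJPEG support; file names are placeholders) encodes the same image with JPEG, steered by the perceptual quality factor Q, and with JPEG2000, steered by a target compression ratio that constrains the file size but says nothing directly about how the result will look.

```python
from PIL import Image

img = Image.open("scene.png").convert("RGB")

# JPEG: Q is a perceptual quality factor derived from observer data;
# Q = 50 is expected to give good visual quality for photo-real imagery.
img.save("scene_q50.jpg", quality=50)

# JPEG2000: the encoder is steered by a target compression ratio
# (roughly 25:1 here), which fixes the filesize rather than the
# perceived quality of the output.
img.save("scene_25to1.jp2", quality_mode="rates", quality_layers=[25])
```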
 
4
Read as: JPEG with quality level Q = 40.
 
5
Based on the kducompress examples of the Kakadu Software Company.
 
Metadata
Title
Visual Saliency from Image Features with Application to Compression
Authors
P. Harding
N. M. Robertson
Publication date
01.03.2013
Publisher
Springer-Verlag
Published in
Cognitive Computation / Issue 1/2013
Print ISSN: 1866-9956
Electronic ISSN: 1866-9964
DOI
https://doi.org/10.1007/s12559-012-9150-7
