Published in: Cognitive Computation 1/2013

01.03.2013

Visual Saliency from Image Features with Application to Compression

Authors: P. Harding, N. M. Robertson


Abstract

Image feature point algorithms and their associated regional descriptors can be viewed as primitive detectors of visually salient information. In this paper, a new method for constructing a visual attention probability map from features is proposed. (Throughout this work we use SURF features, though the algorithm is not limited to SURF alone.) This technique is validated using comprehensive human eye-tracking experiments. We call this algorithm "visual interest" (VI), since the resultant segmentation reveals image regions that are visually salient during the performance of multiple observer search tasks. We demonstrate that it works on generic, eye-level photographs and does not depend on heuristic tuning. We further show that the descriptor-matching property of the SURF feature points can be exploited via object recognition to modulate the context of the attention probability map for a given object search task, refining the salient area. We fully validate the VI algorithm by applying it to saliency-based compression, pre-blurring non-salient regions prior to JPEG encoding, and conducting comprehensive observer performance tests. When using the object contextualisation, we conclude that JPEG files are around 33 % larger than they need to be to fully represent the task-relevant information within them. We finally demonstrate the utility of the segmentation as a region of interest in JPEG2000 compression, achieving superior image quality (measured statistically using PSNR and SSIM) over the automatically selected salient image regions while reducing the image file size to as little as 25 % of the original. Our technique therefore delivers superior compression performance through the detection and selective preservation of visually salient information relevant to multiple observer tasks. In contrast to the state of the art in task-directed visual attention models, the VI algorithm reacts only to the image content and requires no detailed prior knowledge of either the scene or the ultimate observer task.
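To make the pipeline described in the abstract concrete, the following is a minimal, illustrative Python/OpenCV sketch of feature-driven saliency followed by pre-blur compression: detect keypoints, accumulate a Gaussian blob per keypoint into an attention probability map, threshold it, blur the non-salient background, and re-encode as JPEG. This is not the authors' published implementation; SURF (cv2.xfeatures2d.SURF_create) requires an opencv-contrib build, so ORB is used here as a stand-in, and the kernel scale, threshold and blur radius are illustrative placeholders rather than the validated VI parameters.

```python
import cv2
import numpy as np

def feature_saliency_map(gray, keypoints, sigma_scale=1.0):
    """Accumulate one Gaussian blob per keypoint, weighted by its detector
    response, then normalise to a [0, 1] attention probability map."""
    h, w = gray.shape
    sal = np.zeros((h, w), dtype=np.float32)
    for kp in keypoints:
        x, y = int(round(kp.pt[0])), int(round(kp.pt[1]))
        sigma = max(kp.size * sigma_scale, 1.0)
        r = int(3 * sigma)
        x0, x1 = max(0, x - r), min(w, x + r + 1)
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        yy, xx = np.mgrid[y0:y1, x0:x1]
        sal[y0:y1, x0:x1] += kp.response * np.exp(
            -((xx - x) ** 2 + (yy - y) ** 2) / (2 * sigma ** 2))
    sal -= sal.min()
    if sal.max() > 0:
        sal /= sal.max()
    return sal

def salient_compress(path_in, path_out, thresh=0.2, blur_ksize=21, quality=50):
    """Keep salient regions sharp, pre-blur the rest, then JPEG-encode."""
    img = cv2.imread(path_in)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # The paper uses SURF; cv2.xfeatures2d.SURF_create needs opencv-contrib,
    # so ORB is used here purely so the sketch runs everywhere.
    detector = cv2.ORB_create(nfeatures=1000)
    keypoints = detector.detect(gray, None)
    sal = feature_saliency_map(gray, keypoints)
    mask = (sal >= thresh).astype(np.uint8)               # 1 = visually salient
    blurred = cv2.GaussianBlur(img, (blur_ksize, blur_ksize), 0)
    out = np.where(mask[..., None] == 1, img, blurred)    # blur only non-salient pixels
    cv2.imwrite(path_out, out, [cv2.IMWRITE_JPEG_QUALITY, quality])

salient_compress("scene.jpg", "scene_vi.jpg")
```

In the paper, the segmentation threshold and the pre-blur are validated against eye-tracking data and observer performance tests; in this sketch they would need to be tuned for the images at hand.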


Footnotes
1
The segmentation produces an "expectation" value of eye-fixation capture. Our parameters reliably capture 70–75 % of fixations in cluttered indoor scenes, and considerably more, 85–95 %, in outdoor scenes, even as the task is varied (see Fig. 5).
 
2
Such confusions could arise from letters at low resolution, including the following sets of confusions: F⇔R, W⇔M, W⇔N, G⇔6, G⇔C, B⇔8, V⇔Y, X⇔A, H⇔K, 5⇔6, F⇔P, H⇔A, G⇔D, O⇔D, B⇔E, 6⇔8 and G⇔6.
 
3
There is a region-of-interest (ROI) capability in JPEG2000, but the parameters of the JPEG2000 algorithm are not directly related to perceived output quality. In contrast, the JPEG algorithm was designed using data from observer tests: a quality setting of Q = 50 is expected to produce good visual quality for photo-real imagery. The sketch below makes the distinction concrete.
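As an illustration only, the following minimal Pillow sketch (assuming a Pillow build with OpenJPEG support; file names are placeholders) encodes the same image with JPEG, steered by the perceptual quality factor Q, and with JPEG2000, steered by a target compression ratio that constrains the file size but says nothing directly about how the result will look.

```python
from PIL import Image

img = Image.open("scene.png").convert("RGB")

# JPEG: Q is a perceptual quality factor derived from observer data;
# Q = 50 is expected to give good visual quality for photo-real imagery.
img.save("scene_q50.jpg", quality=50)

# JPEG2000: the encoder is steered by a target compression ratio
# (roughly 25:1 here), which fixes the filesize rather than the
# perceived quality of the output.
img.save("scene_25to1.jp2", quality_mode="rates", quality_layers=[25])
```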
 
4
Read as: JPEG with quality level Q = 40.
 
5
Based on the kducompress examples of the Kakadu Software Company.
 
Metadata
Title
Visual Saliency from Image Features with Application to Compression
Authors
P. Harding
N. M. Robertson
Publication date
01.03.2013
Publisher
Springer-Verlag
Published in
Cognitive Computation / Issue 1/2013
Print ISSN: 1866-9956
Electronic ISSN: 1866-9964
DOI
https://doi.org/10.1007/s12559-012-9150-7
