
2011 | Original Paper | Book Chapter

An Integrated Approach to Visual Attention Modeling for Saliency Detection in Videos

Authors: Sunaad Nataraju, Vineeth Balasubramanian, Sethuraman Panchanathan

Published in: Machine Learning for Vision-Based Motion Analysis

Publisher: Springer London


Abstract

In this chapter, we present a framework to learn and predict regions of interest in videos, based on human eye movements. In our approach, the eye-gaze information of several users is recorded as they watch similar videos belonging to a particular application domain. This information is used to train a classifier that learns low-level video features from regions that attracted the visual attention of users. Such a classifier is combined with vision-based approaches to provide an integrated framework for detecting salient regions in videos. To date, saliency prediction has been viewed from two different perspectives, namely visual attention modeling and spatiotemporal interest point detection. These approaches have largely been vision-based: they detect regions with a predefined set of characteristics, such as complex motion or high contrast, for all kinds of videos. However, what is ‘interesting’ varies from one application to another. By learning the features of regions that capture viewers’ attention while they watch a video, we aim to distinguish regions that are actually salient in the given context from the rest. The integrated approach ensures that the proposed framework predicts both regions with anticipated content (top–down attention) and regions with unanticipated content (bottom–up attention) as salient. In our experiments with news videos of popular channels, the results show a significant improvement in the identification of relevant salient regions, compared with existing approaches.
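The fusion of a learned top–down classifier with a bottom–up saliency cue, as described above, can be sketched as follows. This is a minimal illustrative sketch, not the chapter's actual implementation: the contrast-based bottom–up cue, the sigmoid stand-in classifier, and the convex-combination fusion rule are all assumptions chosen for brevity.

```python
import numpy as np

def bottom_up_saliency(frame):
    # Toy bottom-up cue: per-pixel deviation from the frame mean,
    # standing in for conspicuity maps (motion, contrast, etc.).
    return np.abs(frame - frame.mean())

def normalize(smap, eps=1e-8):
    # Rescale a saliency map to [0, 1] so the two cues are comparable.
    return (smap - smap.min()) / (smap.max() - smap.min() + eps)

def integrated_saliency(frame, features, classifier, alpha=0.5):
    # Bottom-up map from image statistics; top-down map from a
    # classifier trained on gaze-labeled low-level features.
    bu = normalize(bottom_up_saliency(frame))
    td = normalize(classifier(features))
    # A convex combination keeps both anticipated (top-down) and
    # unanticipated (bottom-up) regions in the final map.
    return alpha * td + (1 - alpha) * bu

# Demo on a synthetic frame with a hypothetical stand-in classifier.
rng = np.random.default_rng(0)
frame = rng.random((8, 8))
features = frame  # in practice: low-level spatiotemporal features per location
mock_classifier = lambda f: 1.0 / (1.0 + np.exp(-4.0 * (f - 0.5)))
smap = integrated_saliency(frame, features, mock_classifier)
```

In a real pipeline, `features` would be the low-level descriptors extracted around each location and `classifier` a model trained on regions fixated by viewers; the demo only shows how the two maps are normalized and fused.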


Metadata
Title
An Integrated Approach to Visual Attention Modeling for Saliency Detection in Videos
Authors
Sunaad Nataraju
Vineeth Balasubramanian
Sethuraman Panchanathan
Copyright Year
2011
Publisher
Springer London
DOI
https://doi.org/10.1007/978-0-85729-057-1_8