
Factors underlying inter-observer agreement in gaze patterns: predictive modelling and analysis

Published: 14 March 2016
DOI: 10.1145/2857491.2857495

ABSTRACT

In viewing an image or real-world scene, different observers may exhibit different viewing patterns. This variability evidently arises from a variety of factors, involving both bottom-up and top-down processing. In the literature on visual saliency prediction, agreement in gaze patterns across observers is often quantified by a measure of inter-observer congruency (IOC). Intuitively, common viewing patterns may be expected to diagnose certain image qualities, including an image's capacity to draw attention, or perceptual qualities relevant to applications in human-computer interaction, visual design, and other domains. Moreover, there is value in determining the extent to which different factors contribute to inter-observer variability, and how this depends on the type of content being viewed. In this paper, we assess the extent to which different types of features contribute to variability in viewing patterns across observers. This is accomplished by examining the correlation between image-derived features and IOC values, and by measuring the capacity of more complex feature sets to predict IOC with a regression model. Experimental results demonstrate the value of different feature types for predicting IOC. These results also establish the relative importance of top-down and bottom-up information in driving gaze, and provide new insight into predictive analysis of gaze behavior associated with perceptual characteristics of images.
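For concreteness, IOC is typically computed with a leave-one-out procedure: each observer's fixations are scored against a fixation map built from the remaining observers, and per-image IOC values can then be regressed onto image-derived features. The sketch below is a minimal illustration under assumed conventions (NSS-style scoring, Gaussian smoothing of the fixation map, and a simple ridge regressor as a stand-in for the regression model); it is not the paper's exact formulation.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def inter_observer_congruency(fixations, shape, sigma=25.0):
        """Leave-one-out IOC sketch (NSS-style scoring; an assumed convention).

        fixations: list of (N_i, 2) integer arrays of (row, col) fixation
                   locations, one array per observer.
        shape:     (H, W) of the viewed image.
        """
        scores = []
        for i, held_out in enumerate(fixations):
            # Accumulate fixations from all *other* observers into a map.
            fmap = np.zeros(shape, dtype=float)
            for j, fix in enumerate(fixations):
                if j != i:
                    np.add.at(fmap, (fix[:, 0], fix[:, 1]), 1.0)
            # Smooth, then normalize to zero mean and unit variance.
            fmap = gaussian_filter(fmap, sigma)
            fmap = (fmap - fmap.mean()) / (fmap.std() + 1e-8)
            # Score the held-out observer's fixations on that map.
            scores.append(fmap[held_out[:, 0], held_out[:, 1]].mean())
        return float(np.mean(scores))

    def fit_ioc_regressor(X, y, lam=1.0):
        """Ridge regression of per-image IOC values y on image features X
        (an illustrative stand-in, not the paper's specific regressor)."""
        return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

Under this scoring, a high IOC indicates that each held-out observer consistently fixates regions that the other observers also fixate; the choice of smoothing bandwidth and scoring function (NSS, AUC, or similar) varies across the literature.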

Published in

        ETRA '16: Proceedings of the Ninth Biennial ACM Symposium on Eye Tracking Research & Applications
        March 2016
        378 pages
ISBN: 9781450341257
DOI: 10.1145/2857491

        Copyright © 2016 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 14 March 2016


        Qualifiers

        • research-article

        Acceptance Rates

Overall Acceptance Rate: 69 of 137 submissions, 50%
