20-08-2024 | Research

From Pixels to Prepositions: Linking Visual Perception with Spatial Prepositions Far and Near

Authors: Krishna Raj S R, Srinivasa Chakravarthy V, Anindita Sahoo

Published in: Cognitive Computation

Abstract

Human language is influenced by sensory-motor experiences. Sensory experiences gathered in a spatiotemporal world are used as raw material to create more abstract concepts. In language, one way to encode spatial relationships is through spatial prepositions. Spatial prepositions that specify the proximity of objects in space, like far and near or their variants, are found in most languages. The ability to determine the proximity of another entity to oneself is a useful evolutionary trait. From taxic behavior in unicellular organisms like bacteria to tropism in the plant kingdom, this behavior can be found in almost all organisms. In humans, vision plays a critical role in spatial localization and navigation. This computational study analyzes the relationship between vision and spatial prepositions using an artificial neural network. For this study, a synthetic image dataset was created, with each image featuring a 2D projection of an object placed in 3D space. The objects can be of various shapes, sizes, and colors. A convolutional neural network is trained to classify the object in each image as far or near based on a set threshold. The study mainly explores two visual scenarios: objects confined to a plane (grounded) and objects not confined to a plane (ungrounded), while also analyzing the influence of camera placement. The classification performance is high for the grounded case, demonstrating that the problem of far/near classification is well-defined for grounded objects, provided that the camera is at a sufficient height. The network performance showed that, in grounded cases, depth can be determined with high accuracy from monocular cues alone, again provided the camera is at an adequate height. The difference in the network’s performance between grounded and ungrounded cases can be explained using the physical properties of the retinal imaging system.
The task of determining the distance of an object from individual images in the dataset is challenging as they lack any background cues. Still, the network performance shows the influence of spatial constraints placed on the image generation process in determining depth. The results show that monocular cues significantly contribute to depth perception when all the objects are confined to a single plane. A set of sensory inputs (images) and a specific task (far/near classification) allowed us to obtain the aforementioned results. The visual task, along with reaching and motion, may enable humans to carve the space into various spatial prepositional categories like far and near. The network’s performance and how it learns to classify between far and near provided insights into certain visual illusions that involve size constancy.
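
The geometric intuition behind the grounded/ungrounded difference can be illustrated with a minimal sketch. It assumes an idealized pinhole camera with a horizontal optical axis, which the abstract does not specify; the function names, focal length, camera heights, and the far/near threshold below are hypothetical and are not taken from the paper. Under these assumptions, the image offset of an object's base below the horizon fixes its depth only when the object is known to rest on the ground plane.

```python
import math


def project_base(depth, camera_height, object_elevation=0.0, focal_length=1.0):
    """Image-plane offset (below the horizon) of an object's lowest point.

    Assumed pinhole model: a point lying (camera_height - object_elevation)
    below the optical axis at forward distance `depth` projects to
    focal_length * (camera_height - object_elevation) / depth.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    return focal_length * (camera_height - object_elevation) / depth


def depth_from_base(offset, camera_height, focal_length=1.0):
    """Invert the projection, assuming the object rests on the ground plane."""
    if offset <= 0:
        raise ValueError("offset must be positive (object below the horizon)")
    return focal_length * camera_height / offset


def classify(depth, threshold):
    """Label a depth as 'far' or 'near' against a hypothetical threshold."""
    return "far" if depth > threshold else "near"


if __name__ == "__main__":
    camera_height = 2.0  # hypothetical value
    threshold = 6.0      # hypothetical far/near boundary

    # Grounded object: the base offset determines depth uniquely.
    offset = project_base(8.0, camera_height)
    recovered = depth_from_base(offset, camera_height)
    print("grounded:", offset, recovered, classify(recovered, threshold))

    # Ungrounded object: a nearer, elevated object yields the same offset,
    # so the same image cue maps to different depths (and labels).
    same_offset = project_base(4.0, camera_height, object_elevation=1.0)
    print("ungrounded:", same_offset, classify(4.0, threshold))

    # Low camera: offsets for different depths nearly coincide,
    # so the cue carries little depth information.
    low = 0.01
    print("low camera gap:", project_base(4.0, low) - project_base(8.0, low))
```

In this toy setting the offset shrinks toward zero for every depth as the camera height approaches zero, which is consistent with the abstract's condition that the camera be at a sufficient height. It is only a geometric illustration and does not reproduce the convolutional network or the dataset used in the study.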

Metadata
Title: From Pixels to Prepositions: Linking Visual Perception with Spatial Prepositions Far and Near
Authors: Krishna Raj S R, Srinivasa Chakravarthy V, Anindita Sahoo
Publication date: 20-08-2024
Publisher: Springer US
Published in: Cognitive Computation
Print ISSN: 1866-9956
Electronic ISSN: 1866-9964
DOI: https://doi.org/10.1007/s12559-024-10329-6
