2022 | Book

Deep Learning to See

Towards New Foundations of Computer Vision

Authors: Prof. Dr. Alessandro Betti, Prof. Dr. Marco Gori, Prof. Dr. Stefano Melacci

Publisher: Springer International Publishing

Book Series: SpringerBriefs in Computer Science


About this book

The remarkable progress in computer vision over the last few years is, by and large, attributed to deep learning, fueled by the availability of huge sets of labeled data, and paired with the explosive growth of the GPU paradigm. While subscribing to this view, this work criticizes the supposed scientific progress in the field, and proposes the investigation of vision within the framework of information-based laws of nature.

This work poses fundamental questions about vision that remain far from understood, leading the reader on a journey populated by novel challenges resonating with the foundations of machine learning. The central thesis proposed is that for a deeper understanding of visual computational processes, it is necessary to look beyond the applications of general purpose machine learning algorithms, and focus instead on appropriate learning theories that take into account the spatiotemporal nature of the visual signal.

Serving to inspire and stimulate critical reflection and discussion, yet requiring no prior advanced technical knowledge, the text can naturally be paired with classic textbooks on computer vision to better frame the current state of the art, open problems, and novel potential solutions. As such, it will be of great benefit to graduate and advanced undergraduate students in computer science, computational neuroscience, physics, and other related disciplines.

Table of Contents

Chapter 1. Motion Is the Protagonist of Vision
Marco Gori, Alessandro Betti, Stefano Melacci
Chapter 2. Focus of Attention
Since we assume an underlying process of eye movements, we recognize the importance of continuously interacting with motion fields, even in the case of still images. The trajectory of the point on which the agent focuses its attention is a fundamental piece of information in the theory of visual perception that we propose. We discuss how external motion, which arises either from moving objects or from a moving agent, is integrated with the head/eye motion of the agent.
Marco Gori, Alessandro Betti, Stefano Melacci
Chapter 3. Principles of Motion Invariance
Marco Gori, Alessandro Betti, Stefano Melacci
Chapter 4. Foveated Neural Networks
Marco Gori, Alessandro Betti, Stefano Melacci
Chapter 5. Information-Based Laws of Feature Learning
What are the mechanisms behind learning to see? This is the question we address in this chapter, where the underlying computational process in fact characterizes the agent’s life in its own visual environment. The process builds on the neural architecture described in the previous chapter, which is chosen to incorporate the motion consistency and abstraction constraints coming from the I and II Principles. We begin by addressing the simplest case of feature conjugation, which arises when we estimate the optical flow, and continue by considering the canonical set of ODEs that express all the visual constraints.
Marco Gori, Alessandro Betti, Stefano Melacci
Chapter 6. Non-visual Environmental Interactions
Marco Gori, Alessandro Betti, Stefano Melacci