1988 | Book

Motion Understanding

Robot and Human Vision

Edited by: W. N. Martin, J. K. Aggarwal

Publisher: Springer US

Book series: The International Series in Engineering and Computer Science

Included in: Professional Book Archive

About this book

The physical processes which initiate and maintain motion have been a major concern of serious investigation throughout the evolution of scientific thought. As early as the fifth century B.C., questions regarding motion were presented as touchstones for the most fundamental concepts about existence. Such wide-ranging philosophical issues are beyond the scope of this book; however, consider the paradox of the flying arrow attributed to Zeno of Elea: An arrow is shot from point A to point B, requiring a sequence of time instants to traverse the distance. Now, for any time instant, T_i, of the sequence the arrow is at a position, P_i, and at T_{i+1} the arrow is at P_{i+1}, with P_i ≠ P_{i+1}. Clearly, each T_i must be a singular time unit at which the arrow is at rest at P_i, because if the arrow were moving during T_i there would be a further sequence, T_{i_j}, of time instants required for the arrow to traverse the smaller distance. Now, regardless of the level to which this recursive argument is applied, one is left with the flight of the arrow comprising a sequence of positions at which the arrow is at rest. The original intent of presenting this paradox has been interpreted as an argument against the possibility of individuated objects moving in space.

Table of contents

Frontmatter

Chapter 1. Bounding Constraint Propagation for Optical Flow Estimation

Abstract

The velocity field that represents the motion of object points across an image is called the optical flow field. Optical flow results from relative motion between a camera and objects in the scene. One class of techniques for the estimation of optical flow utilizes a relationship between the motion of surfaces and the derivatives of image brightness (Limb and Murphy, 1975; Cafforio and Rocca, 1976; Fennema and Thompson, 1979; Netravali and Robbins, 1979; Schalkoff, 1979; Lucas and Kanade, 1981; Schunck and Horn, 1981; Thompson and Barnard, 1981; and Schalkoff and McVey, 1982). The major difficulty with gradient-based methods is their sensitivity to conditions commonly encountered in real imagery. Highly textured regions, motion boundaries, and depth discontinuities can all be troublesome for gradient-based methods. Fortunately, the areas characterized by these difficult conditions are usually small and localized.

Joseph K. Kearney, William B. Thompson
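
As a rough illustration of the gradient-based relationship mentioned in the Chapter 1 abstract, the sketch below solves the brightness constraint Ix*u + Iy*v + It = 0 by least squares over one small window, in the spirit of the cited Lucas and Kanade (1981) work. It is not the chapter's bounding constraint propagation method. The function names, the synthetic quadratic pattern, and the single-window setup are all assumptions made for this example.

```python
def estimate_flow(frame0, frame1):
    """Least-squares solve of Ix*u + Iy*v + It = 0 over the interior pixels.

    frame0 and frame1 are equal-sized lists of rows of gray values.
    Returns (u, v): displacement per frame along columns (x) and rows (y).
    """
    rows, cols = len(frame0), len(frame0[0])
    sxx = sxy = syy = sxt = syt = 0.0
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central differences for the spatial brightness derivatives.
            ix = (frame0[r][c + 1] - frame0[r][c - 1]) / 2.0
            iy = (frame0[r + 1][c] - frame0[r - 1][c]) / 2.0
            # Forward difference for the temporal derivative.
            it = frame1[r][c] - frame0[r][c]
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        # Gradients all point the same way: the window cannot fix both components.
        raise ValueError("gradient directions are degenerate in this window")
    # Normal equations: [[sxx, sxy], [sxy, syy]] [u, v]^T = -[sxt, syt]^T
    u = (-sxt * syy + syt * sxy) / det
    v = (-syt * sxx + sxt * sxy) / det
    return u, v


def pattern(x, y):
    """Hypothetical smooth brightness pattern used only for this demo."""
    return x * x + y * y


def make_frame(dx, dy, size=7):
    """Sample the pattern shifted by (dx, dy) on a grid centered at the origin."""
    half = size // 2
    return [[pattern((c - half) - dx, (r - half) - dy) for c in range(size)]
            for r in range(size)]


frame0 = make_frame(0.0, 0.0)
frame1 = make_frame(0.25, -0.5)   # pattern moved by u = 0.25, v = -0.5
u, v = estimate_flow(frame0, frame1)
print(u, v)   # close to (0.25, -0.5) for this symmetric pattern
```

The synthetic pattern is deliberately well behaved, so the linearized constraint recovers the shift. In real imagery, the conditions the abstract lists (highly textured regions, motion boundaries, depth discontinuities) can violate the assumptions behind this estimate.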

Chapter 2. Image Flow: Fundamentals and Algorithms

Abstract

This chapter describes work toward understanding the fundamentals of image flow and presents algorithms for estimating the image flow field. Image flow is the velocity field in the image plane that arises due to the projection of moving patterns in the scene onto the image plane. The motion of patterns in the image plane may be due to the motion of the observer, the motion of objects in the scene, or both. The motion may also be apparent motion where a change in the image between frames gives the illusion of motion. The image flow field can be used to solve important vision problems provided that it can be accurately and reliably computed. Potential applications are discussed in Section 2.1.2.

Brian G. Schunck

Chapter 3. A Computational Approach to the Fusion of Stereopsis and Kineopsis

Abstract

Vision research in fields as diverse as computer science, psychology, and neurophysiology has led to the emergence of stereopsis and kineopsis as the two principal views which explain some of the mechanisms of space perception.

Amar Mitiche

Chapter 4. The Empirical Study of Structure from Motion

Abstract

There have been important accomplishments in the computational analysis of the recovery of information about three-dimensional relationships in the environment from dynamic two-dimensional images. As an achievement in artificial intelligence and as a significant application to robotics, these results stand by themselves. There has been increasing interest, however, in applying these analyses to the study of human visual perception. The question of applicability of a particular computational analysis to human vision is an empirical one, and one that can only be answered through rigorous empirical research. In this chapter, I will review the current status of empirical research that addresses four issues of relevance to motion understanding. The first issue is whether there are two paths to the recovery of three-dimensional structure from motion: (a) one that proceeds from the primal sketch to a viewer-centered 2 1/2 dimensional sketch to an object-centered 3-dimensional model as proposed by Marr (1982), and (b) a direct path from the primal sketch to an object-centered representation, with viewer-centered information added from separate sources. The second issue is whether the solution to the correspondence problem must precede the extraction of structure from motion. The third issue is the now familiar controversy in the perception literature about whether a rigidity constraint plays a central role in the recovery of structure from motion.

Myron L. Braunstein

Chapter 5. Motion Estimation Using More Than Two Images

Abstract

The idea of using three or more consecutive frames for motion analysis has been mentioned by many researchers (Ullman, 1979; Lawton, 1980; Yasumoto and Medioni, 1985). However, most approaches formulate the motion problem in such a way that they do not use the time flow information content of the image sequence optimally. The majority of techniques work with only two frames at a time, and even then, they are treated no differently than a stereo pair. Consequently, they do not make use of the fact that these frames are from a time sequence, that is, the third frame was taken T seconds after the second frame and 2T seconds after the first frame. (T being the time between any two successive frames.) As a result, most of the current approaches to motion analysis are formulated in such a way that they must treat a matched set of n features in three frames, as a set of 2n features in two frames. By doing so, they treat a three-frame sequence as two sets of two-frame sequences, and hence, under-utilize the information available to them.

Hormoz Shariat, Keith Price

Chapter 6. An Experimental Investigation of Estimation Approaches for Optical Flow Fields

Abstract

The halftone image of a scene can be described by giving the gray value as a function of the image plane coordinate vector x, i.e., the picture function g(x). In the case of relative motion between the camera and the depicted scene, the picture function will depend not only on the image plane coordinate vector x, but in addition on the time t. During a short enough time interval, Δt, the resulting temporal change of the picture function can be approximated by a local shift of spatial gray value structures. This apparent shift can be described by an optical flow vector field, u(x,t). It maps the gray value g(x − uΔt, t) recorded at time t at the image plane location x − uΔt into the gray value g(x, t + Δt) recorded at location x at time t + Δt. In most cases, this optical flow field is a good approximation to the temporal displacement of the image of a depicted surface element between time t and time t + Δt.

W. Enkelmann, R. Kories, H.-H. Nagel, G. Zimmermann
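
The mapping described in the Chapter 6 abstract, where the gray value at x at the later time is taken from the location displaced backwards by the flow times the time step, can be sketched in one dimension as follows. This is a simplified illustration only: the abstract works with a two-dimensional coordinate vector, and the names `sample` and `predict_next`, the linear interpolation, and the border clamping are assumptions made for this example.

```python
import math


def sample(g, x):
    """Linearly interpolate the 1-D picture function g at a real-valued position x."""
    x = min(max(x, 0.0), len(g) - 1.0)            # clamp to the image border
    i = min(int(math.floor(x)), len(g) - 2)       # left neighbor index
    frac = x - i
    return (1.0 - frac) * g[i] + frac * g[i + 1]


def predict_next(g_t, u, dt):
    """Approximate g(x, t + dt) by g(x - u(x) * dt, t) for every pixel x."""
    return [sample(g_t, x - u[x] * dt) for x in range(len(g_t))]


g_t = [0, 0, 10, 20, 10, 0, 0, 0]     # gray values at time t
u = [2.0] * len(g_t)                  # uniform flow of 2 pixels per time unit
predicted = predict_next(g_t, u, dt=0.5)
print(predicted)   # the bump moves one pixel to the right
```

With a uniform flow the prediction is a pure shift. A spatially varying `u` would stretch or compress the gray value structure in the same way.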

Chapter 7. The Incremental Rigidity Scheme and Long-Range Motion Correspondence

Abstract

The human visual system is capable of extracting three-dimensional shape information from two-dimensional transformations in the image. Experiments employing shadow projections of moving objects and computer generated displays have established that the three-dimensional shape of objects in motion can be perceived when their changing projection is observed, even when each static view is completely devoid of three-dimensional information.

Shimon Ullman

Chapter 8. Some Problems with Correspondence

Abstract

The notion of correspondence underlies many current theories of human and machine visual information processing. Algorithms for both the correspondence process and solutions to the correspondence problem have appeared regularly in the computer vision literature. Algorithms for stereopsis (Marr and Poggio, 1977; Barnard and Thompson, 1980; Mayhew and Frisby, 1980) and for tracking objects through time (Moravec, 1977; Ullman, 1979; Dreschler and Nagel, 1981; Webb, 1981; Jain and Sethi, 1984) have been presented which assume that token matching of separated or successive views is the underlying visual process. This paper will address the notion of token matching as a primitive operation in vision. We will argue that correspondence seems ill suited to the task of accounting for how an object is positioned in time or space, and that some other mechanism may provide a more apt account.

Michael Jenkin, Paul A. Kolers

Chapter 9. Recovering Connectivity from Moving Point-Light Displays

Abstract

An important trend in the visual sciences is the emerging convergence between psychophysical and computational approaches to visual information processing (Beck, Hope, and Rosenfeld, 1983). Each field is concerned with similar issues and problems; however, each applies a sufficiently different approach to provide complementary lines of investigation. One point of convergence is found in current “bootstrapping” procedures that analyze visual information into minimal stimulus conditions and then seek to model processes by which these conditions can be transformed into relevant environmental properties.

Dennis R. Proffitt, Bennett I. Bertenthal

Chapter 10. Algorithms for Motion Estimation Based on Three-Dimensional Correspondences

Abstract

The goal in the motion estimation problem is to determine the transformation between two three-dimensional positions of a rigid body that is undergoing some arbitrary motion. Throughout this discussion it will be assumed that it is possible to acquire three-dimensional positional information of points located on the rigid body at two separate time instances. This may be accomplished through the use of a stereo camera setup (Barnard and Fischler, 1982). Alternatively, three-dimensional positional data can be obtained explicitly from a laser range finder. In general, points selected from the rigid body will tend to correspond to special geometrical features, such as corners or identifiable surface markings, and must often be extracted by special low level processing tasks (Gu and Huang, 1984). Such feature points will be invariant to the particular position sensing techniques used, and thus may be reliably located independent of object position and orientation.

S. D. Blostein, T. S. Huang
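
To make the problem statement in the Chapter 10 abstract concrete, the sketch below recovers a rotation R and translation t from three non-collinear, noise-free point correspondences, so that each later position equals R times the earlier position plus t. It builds an orthonormal frame from the three points at each time instant and aligns the two frames. This is a simple closed-form illustration under those assumptions, not one of the chapter's algorithms; all function names are hypothetical, and noisy data or larger point sets would call for a least-squares formulation instead.

```python
import math


def sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def unit(a):
    n = math.sqrt(sum(c * c for c in a))
    return [c / n for c in a]


def frame(p0, p1, p2):
    """Orthonormal axes (one per row) built from three non-collinear points."""
    e1 = unit(sub(p1, p0))
    e3 = unit(cross(e1, sub(p2, p0)))
    e2 = cross(e3, e1)
    return [e1, e2, e3]


def estimate_rigid_motion(before, after):
    """Return (R, t) such that after[i] = R * before[i] + t for noise-free input."""
    fb = frame(*before)
    fa = frame(*after)
    # R sends each 'before' axis to the matching 'after' axis:
    # R = sum over k of (after axis k) * (before axis k)^T.
    R = [[sum(fa[k][i] * fb[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    rotated_p0 = [sum(R[i][j] * before[0][j] for j in range(3)) for i in range(3)]
    t = sub(after[0], rotated_p0)
    return R, t


# Three feature points before and after a 90-degree rotation about the z axis
# followed by a translation of (1, 2, 3).
before = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
after = [[1.0, 2.0, 3.0], [1.0, 3.0, 3.0], [0.0, 2.0, 3.0]]
R, t = estimate_rigid_motion(before, after)
print(R)
print(t)
```

The frame construction only works when the three points are not collinear, since otherwise the cross product vanishes and the orientation about the line through the points is undetermined.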

Chapter 11. Towards a Theory of Motion Understanding in Man and Machine

Abstract

The design of a system that understands visual motion, whether biological or machine, must adhere to certain constraints. These constraints include the types and numbers of available processors, how those processors are arranged, the nature of the task, as well as the characteristics of the input itself. This chapter examines some of these constraints, and in two parts, presents a framework for research in this area. The first part, Section 11.2, involves time complexity arguments demonstrating that the common attack to this problem, namely, an approach that is spatially parallel (at least conceptually), with temporal considerations strictly subsequent to the spatial ones, cannot possibly succeed. The essence of this claim is not a new one, and was motivated by similar comments by Neisser (1967), among others. What Neisser and others did not do, however, is provide a framework that is plausible. Expanding on the time complexity argument, we show that in addition to spatial parallelism, the basic system elements include hierarchical organization through abstraction of both prototypical visual knowledge as well as early representations of the input, and the separation of input measurements into logical feature maps.

John K. Tsotsos, David J. Fleet, Allan D. Jepson

Backmatter

Title: Motion Understanding
Edited by: W. N. Martin, J. K. Aggarwal
Publisher: Springer US
Electronic ISBN: 978-1-4613-1071-6
Print ISBN: 978-1-4612-8413-0
DOI: https://doi.org/10.1007/978-1-4613-1071-6
