About this book

COMPUTER VISION is a field of research that encompasses many objectives. A primary goal has been to construct visual sensors that can provide general-purpose robots with the same information about their surroundings as we receive from our own visual senses. This book takes an important step towards this goal by describing a working computer vision system named SCERPO. This system can recognize known three-dimensional objects in ordinary black-and-white images taken from unknown viewpoints, even when parts of the object are undetectable or hidden from view. A second major goal of computer vision research is to provide a computational understanding of human vision. The research presented in this book has many implications for our understanding of human vision, particularly in the areas of perceptual organization and knowledge-based recognition. An attempt has been made to relate each computational result to the relevant areas in the psychology of vision. Since the material is meant to be accessible to a wide range of interdisciplinary readers, the book is written in plain language and attempts to explain most concepts from the starting position of the non-specialist. One of the most important conclusions arising from this research is that visual recognition can commonly be achieved directly from the two-dimensional image without any preliminary reconstruction of depth information or surface orientation from the visual input.

Table of contents

Frontmatter

Chapter 1. Introduction

Abstract
THE FIELD of computer vision covers a wide range of topics that are often only loosely related to the capabilities and function of human vision. Computer vision systems may use exotic sensors such as laser rangefinders or make restrictive assumptions regarding the scene that have no counterpart in human vision. However, the following pages will deal mostly with the interpretation of single black-and-white images—in particular with edge descriptions extracted from them—and will not generally make use of restrictive assumptions regarding the scene. This portion of the computer vision problem seems to be one of the most central to the functioning of the human visual system, and much of the following study will be motivated by what is known of human vision.
David G. Lowe

Chapter 2. Previous Research on Perceptual Organization

Abstract
THE HISTORY of research on perceptual organization consists, in its broad outlines, of a search for some underlying principle which would unify the various grouping phenomena of human perception. The Gestalt psychologists thought that this underlying principle was some basic ability of the human mind to proceed from the whole to the part. Later research summarized many of the Gestaltists’ results with the observation that people seem to perceive the simplest possible interpretation for any given data—although simplicity proved to be very difficult to define or quantify. The research described in this book is based on the still more recent principle that it is the degree of non-accidentalness that determines the significance of a grouping. In other words, it is not simplicity itself that determines significance but the extent of surprising simplicity given expectations regarding the distribution of features.
David G. Lowe

Chapter 3. Measuring the Significance of Image Relations

Abstract
PERCEPTUAL ORGANIZATION can be viewed as a process that assigns a degree of significance to each potential grouping of image features. Our goal in this chapter will be to take a unified view of the many grouping phenomena by examining the underlying principles for measuring the significance of each grouping. As was described in Chapter 1, perceptual groupings are useful to the extent that they are unlikely to have arisen by accident of viewpoint or position, and therefore are likely to reflect meaningful structure of the scene. Our basic argument will be that certain image relations are carriers of statistical information indicating that they are non-accidental in origin, and that this degree of non-accidentalness forms the basis for assigning degrees of significance. Note that there are an infinite number of different types of relations that could be considered (e.g., “all pairs of straight line segments at N degrees relative orientation,” for any given N), and a combinatorial number of sets of elements to consider in any given image. Only a small subset of these possible relations are likely to be of any significance or are worth the effort required for detection.
David G. Lowe
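
The idea in the Chapter 3 abstract, that a grouping is significant to the extent that it is unlikely to have arisen by accident, can be illustrated with a small calculation. The sketch below is not the measure developed in the book. It assumes, purely for illustration, that unrelated image features are scattered as a uniform Poisson process, and the function names and parameters (`density`, `radius`) are hypothetical.

```python
import math


def accidental_proximity_probability(density: float, radius: float) -> float:
    """Probability that at least one unrelated feature lies within `radius`
    of a given point.

    Assumption (not from the book): features are distributed as a uniform
    Poisson process with `density` features per unit area.
    """
    if density < 0 or radius < 0:
        raise ValueError("density and radius must be non-negative")
    # Expected number of unrelated features inside a disc of this radius.
    expected_count = density * math.pi * radius ** 2
    # Poisson probability of seeing one or more features in that disc.
    return 1.0 - math.exp(-expected_count)


def proximity_significance(density: float, radius: float) -> float:
    """Illustrative significance score: the probability that a proximity
    relation this tight would NOT have occurred by accident."""
    return 1.0 - accidental_proximity_probability(density, radius)


if __name__ == "__main__":
    # Hypothetical numbers: 0.01 features per square pixel.
    for r in (1.0, 5.0, 20.0):
        print(r, proximity_significance(0.01, r))
```

Under these assumptions a tight proximity relation in a sparse image scores close to 1, while the same relation in a cluttered image, or a looser one, scores lower because it is more easily explained by chance.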

Chapter 4. The Segmentation of Image Curves

Abstract
IN THIS CHAPTER we will apply the methods of perceptual organization to the difficult but important problem of segmenting image curves. Smoothed, segmented image curves are important perceptual structures in themselves, as well as being needed for the subsequent detection of collinearity, parallelism, connectivity, and other perceptual groupings. Most current edge detectors only detect edge points (image locations through which an edge is judged to pass) and possibly link these together into lists of points on the basis of proximity. The gap between the output of edge-detection techniques and the smoothed, segmented curve descriptions needed for model matching and many perceptual grouping operations is a significant missing link in current image-description methodology. One reason for the difficulty of curve segmentation is that it is actually a combination of several different problems: choosing the best scale of description for a curve, deciding where to place tangent discontinuities (corners), and assigning levels of significance to the final segmentations. This chapter will outline the various requirements that an ideal solution to this problem should satisfy, and will demonstrate a computer program that satisfies most of them.
David G. Lowe

Chapter 5. The Use of Viewpoint Invariance

Abstract
Chapter 3 described the properties of the image formation process that determine which image groupings will occur significantly more often than by chance. In particular, only those image groupings that are present over a substantial range of viewpoints of a scene will occur often enough to be reliably separated on statistical grounds from accidental occurrences of the same image features. In addition to determining which image groupings are detectable, these viewpoint invariance conditions lead to specific inferences regarding three-space relations from the two-dimensional image groupings. Figure 5-1 illustrates this form of inference, which can provide some of the same three-dimensional information as processes such as stereo or shape-from-motion. In this chapter we will also consider other forms of imaging invariance, such as the invariance of certain shadow features with respect to positions of the light sources that illuminate the scene.
David G. Lowe

Chapter 6. Model-Based Search and Inference

Abstract
One of the central topics of artificial intelligence research has been the problem of efficient search. For many problems in AI, there are straightforward methods for solving a problem by enumerating over a large set of possible interpretations and looking for those that are consistent with the given data. Unfortunately, in many cases the set of possible interpretations is combinatorially large and cannot be enumerated in a reasonable amount of time. Therefore, a substantial amount of AI research has been devoted to finding methods for speeding up the search process. These methods fall into two classes: the so-called weak methods that look for general purpose algorithms that could be applied to any search problem, and the strong methods that attempt to apply specific knowledge of the problem at hand. Recent AI research has tended to concentrate on the strong methods—in particular, making use of large amounts of world knowledge that can often be applied to find a quick path through what would otherwise be an intractable search space.
David G. Lowe

Chapter 7. The Verification of Image Interpretations

Abstract
One of the central arguments for the importance of perceptual organization is that it reduces the otherwise enormous task of searching for spatial correspondence between image features and prior knowledge of objects. However, this argument is based on the assumption that the final verification of the correctness of a set of correspondences can be carried out quickly and reliably as part of the search process. In this chapter, some practical methods will be described for performing this verification process, in particular for determining the viewpoint and unknown parameters of a three-dimensional object as accurately as possible and measuring the degree to which the spatial information in the image agrees with the predictions of the model.
David G. Lowe

Chapter 8. The SCERPO Vision System

Abstract
THE PREVIOUS CHAPTERS have described a number of separate components that would need to function together to perform visual recognition. In this chapter, an implemented vision system will be described that combines many of these components into a functioning system for performing recognition. The system has been named SCERPO (an acronym for Spatial Correspondence, Evidential Reasoning, and Perceptual Organization). All code for this system was written over a relatively short period by a single person, which required that each component be simplified over the more complete versions previously described. For example, curve segmentation is performed only for straight line segments rather than arbitrary curves, object models are presumed to be fully specified with no unknown internal parameters, perceptual organization has been simplified, and the evidential reasoning component does little more than an exhaustive search. Nevertheless, the system has a high level of performance in comparison to most previous vision systems, and it clearly demonstrates the potential for the methodology as a whole.
David G. Lowe

Chapter 9. Conclusions

Abstract
WE ARE now in a position to look back over the range of material in this book and evaluate its contributions to computer vision research. The direction that has been taken is substantially different from the mainstream of current computer vision research. Our goal at the outset was to develop methods for visual recognition based upon the use of spatial information in the image. The direct formation of depth information from the image has been de-emphasized, and a process of perceptual organization has taken its place as a primary bottom-up descriptive process. The problem of visual recognition has been cast as essentially a problem of search, in which the major research effort needs to be devoted to reducing the size of the search space at each level of the visual hierarchy. A consequence of this search-based methodology is that intermediate levels of description are not required to be highly reliable; rather, it is their average statistical performance in distinguishing useful alternatives which is of importance for reducing the search space. This naturally leads to methods for evaluating relations which are probabilistic in nature rather than being based upon binary decisions.
David G. Lowe

Backmatter

Further information