
1992 | Book

Active Perception and Robot Vision

Edited by: Arun K. Sood, Harry Wechsler

Publisher: Springer Berlin Heidelberg

Book series: NATO ASI Series


About this book

Intelligent robotics has become the focus of extensive research activity. This effort has been motivated by the wide variety of applications that can benefit from the developments. These applications often involve mobile robots, multiple robots working and interacting in the same work area, and operations in hazardous environments like nuclear power plants. Applications in the consumer and service sectors are also attracting interest. These applications have highlighted the importance of performance, safety, reliability, and fault tolerance. This volume is a selection of papers from a NATO Advanced Study Institute held in July 1989 with a focus on active perception and robot vision. The papers deal with such issues as motion understanding, 3-D data analysis, error minimization, object and environment modeling, object detection and recognition, parallel and real-time vision, and data fusion. The paradigm underlying the papers is that robotic systems require repeated and hierarchical application of the perception-planning-action cycle. The primary focus of the papers is the perception part of the cycle. Issues related to complete implementations are also discussed.

Table of contents

Frontmatter
Motion Understanding Systems

This paper presents an integrated framework for vision that includes early, intermediate and high level components. It has been developed as a result of a substantial amount of study of each level of the problem. One of the goals of the work is to use this framework as the perception component of a mobile, task-driven robot. The novel aspects of the model include a significant top-down component, motivated both by recent biological observations and by issues of computational complexity; a new definition of intermediate level vision based on the idea of aggregation, which permits both task-directed and bottom-up analysis modes; and temporal integration of recognition results over image sequences. Implementations of each part have been tested with good results; current work is focusing on extensions to the theoretical aspects of each of the three levels of vision as well as investigations into their integration into a single system.

John K. Tsotsos
Reflections on Active (Machine) Vision

Over the past years, algorithmic approaches for the evaluation of digitized images have become more reliable, provided the type of objects and their arrangement in the depicted scene as well as the scene illumination can be carefully optimized. The evaluation of images of natural scenes, however, still poses many unsolved problems for algorithmic approaches. The seemingly effortless performance of animal and human vision has been experienced as a perpetual challenge to improve image evaluation algorithms by incorporating new approaches. In addition, continuous improvements in the price/performance ratio of sensor and computing components have facilitated the investigation of more complex approaches in order to cope with some of these problems. Attempts to evaluate color images, stereo pairs, and temporal image sequences recorded by monocular, binocular, or trinocular camera configurations, as well as multisensor data fusion, can all be considered examples of this development.

Hans-Hellmut Nagel
Motion Parameter Estimation Using Correspondence Approach

Understanding object motion and estimating the object motion parameters are difficult and computationally complex tasks. In general, approaches based on optic flow and on correspondence have been utilized to address this problem. In this paper we present an approach based on correspondence. We use perturbation analysis and optimal control results to estimate time-varying parameters. In our approach we simultaneously estimate motion corresponding to the six degrees of freedom, and our approach can readily handle long image sequences. We present results on the sensitivity of the solution with respect to noise, initial estimates, and sequence length.

Gwojyh Tseng, Arun K. Sood
Active Vision Through Prediction-Error Minimization

Motion vision and pictorial vision are considered to be fairly separate abilities. In the development of biological species on our planet, motion vision is assumed to be phylogenetically older. Also in the development of human infants, motion vision precedes the capability of pictorial vision, as psychologists have found [Yonas 83]. In computer vision, the development went in the opposite direction. The study of static scenes was the starting point for most of the research into vision. This may be attributable to the fact that the interpretation of a static image seems to be easier than dealing with motion. However, when the task is motion understanding, the best starting point is probably not the platform developed for static image processing but a fundamentally different one that takes a fresh, unbiased view of this distinct type of problem.

Ernst D. Dickmanns
Representing Scenes Along a Route by a Moving Observer

How can a robot autonomously represent and memorize the route it has traversed? As a step toward the solution, this paper presents an intermediate representation, called Panoramic Representation, of scenes along the route, which provides rich information on the route such as a dynamic 2-D projection and a path-oriented 2(1/2)-D sketch. Rather than analyzing local views at discrete locations and integrating them into a model, the robot continuously scans the side-view and yields the Panoramic Representation, based upon an extended Spatiotemporal Volume analysis. Matching of incoming imagery in autonomous navigation with the Panoramic Representation, memorized in a trial navigation guided by a human, is possible by 2-D or 2(1/2)-D shape matching. The Panoramic Representation is further analyzed to obtain a qualitative description of the route by selecting distinctive patterns in color, shape, or depth as landmarks.

Saburo Tsuji, Jiang Yu Zheng
Gaze Behaviors for Robotics

Five control systems loosely corresponding to primate saccadic, vergence, pursuit, vestibulo-ocular, and head control operate on a simulated two-eyed robot head maneuvered by a robot arm. The goal is to gain some qualitative understanding of the interaction of such reflexes under various assumptions. The simulation is meant to be relevant to U. Rochester's robot. Thus it incorporates the kinematics of the robot head but assumes a "tool-coordinate" system available to robot arm commands, so that arm kinematic calculations are unnecessary. Dynamics are not modeled, since they are handled by the commercial controllers currently used in the Rochester robot. Even small delays render the effect of delay-free controllers unstable, but a multi-delay version of a Smith predictor can cope with delays. If each controller acts on the predicted system and ignores other controllers, the situation is improved but still potentially unstable if controllers with different delays act on the same control output. The system's performance is much improved if controllers consider the effect of other controllers, and the resulting system is stable in the presence of a certain amount of stochastic disturbance of control delays and inputs, and also in the presence of systematic error arising from inaccurate plant and world models.

Christopher Brown
On Focus-of-Attention by Active Focusing

Focus-of-attention is extremely important in human visual perception. If computer vision systems are to perform tasks in a complex, dynamic world, they will have to be able to control processing in a way that is analogous to visual attention in humans. In this paper we investigate problems in connection with foveation, that is, examining selected regions of the world at high resolution. We consider a static world viewed by an active observer that has this capability. The task we consider is that of finding and classifying junctions of contours, features that give important information about 3-dimensional structure such as object shape and occlusions. Since they are completely local features, we can study them without treating the problem of integrating local information into global cues. We show that foveation, as simulated by controlled, active zooming, allows robust detection and classification of junctions with very simple algorithms.

Kjell Brunnström, Jan-Olof Eklundh, Akira Kakimoto
Active Perception and 3D Object Recognition

The fundamental problem faced by all vision systems is the ambiguity created by the projection process. An object’s projected shape in an image changes dramatically for small changes in the observer’s viewpoint. This is the basic difficulty in creating a machine vision system that can respond robustly in an unconstrained 3D environment. Our approach to this problem enables the vision system to actively engage its interpretation of the surroundings using a distributed memory system modeled as visual potentials and made up of characteristic view(point)s (Koenderink and van Doorn, 1979).

Harry Wechsler, Lee Zimmerman
Symbolic Representation of Object Models

An approach which combines symbolic and numerical methods for the solution of systems of geometric constraints is described. Such constraints arise from the description of parameterized object models as well as the geometric relationships between objects, cameras and light sources. Typical applications are environmental modeling for photointerpretation and autonomous navigation. The representation of object models in terms of algebraic constraint equations is described. A symbolic triangulation method for reducing systems of polynomial constraint equations is presented. The use of such triangular systems in the solution of constraint equations is discussed in the context of nonlinear programming.

Joseph L. Mundy
Environment Models and Information Assimilation

An autonomous intelligent agent working in a real, unstructured, dynamic environment must have very close interactions among its perceptual, cognitive, and motor components. We believe that by placing the environment model at the heart of these systems this interaction can be facilitated significantly. In our approach, the environment model is responsible for the interaction among different components, for providing temporal coherence, for combining information from multiple sensors, and for the purposive behaviour of the system. The information about the environment is acquired using multiple disparate sensors, from multiple viewpoints, and at multiple time instants. We believe that the combination of information from disparate sensors should be viewed as a problem in information assimilation, rather than sensor integration. The focus in information assimilation is on the physical world being modeled; sensory information is just a means to that end. Sensor integration treats the goal implicitly, misplacing the focus on the processing of sensed information.

Ramesh Jain
Scene Segmentation in the Framework of Active Perception

It has been widely acknowledged in the machine perception community that the scene segmentation problem is ill-defined, and hence difficult. To make our primitives adequately explain our data, we apply feedback to processed sensory data to explore the scene. This is Active Perception: the modeling and control strategies for perception.

Ruzena Bajcsy, Alok Gupta, Helen Anderson
A Neural Network Application for Classifying Surfaces

Using range data in robot vision is attracting increasing interest. Several procedures are currently being investigated for performing a reliable segmentation of range data. In this paper a new method for classifying surface fragments is proposed. The method exploits back-propagation neural networks and triangular decomposition of surfaces.

Goffredo G. Pieroni
Indexing via Color Histograms

The color spectrum of multicolored objects provides a robust, efficient cue for indexing into a large database of models. This paper shows color histograms to be stable object representations over changes in view, and demonstrates that they can differentiate among a large number of objects. It introduces a technique called Histogram Intersection for matching model and image histograms, and a fast incremental version of Histogram Intersection that allows real-time indexing into a large database of stored models using standard vision hardware. Color can also be used to search for the location of an object. An algorithm called Histogram Backprojection performs this task efficiently in crowded scenes.

Michael J. Swain, Dana H. Ballard
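The core of Histogram Intersection is simple enough to sketch directly. The following minimal Python example (the 3-bin toy histograms are invented for illustration; Swain and Ballard use full 3-D color histograms) computes the normalized intersection score:

```python
import numpy as np

def histogram_intersection(model_hist, image_hist):
    """Swain & Ballard's histogram intersection match score.

    The score is the sum of element-wise minima, normalized by the
    number of pixels in the model histogram; it is 1.0 for a perfect
    match and degrades gracefully under partial occlusion.
    """
    return np.minimum(model_hist, image_hist).sum() / model_hist.sum()

# Toy 3-bin color histograms (bin counts, not densities).
model = np.array([50.0, 30.0, 20.0])
same_object = np.array([48.0, 33.0, 19.0])
other_object = np.array([5.0, 10.0, 85.0])

# The matching view scores far higher than the distractor.
assert histogram_intersection(model, same_object) > histogram_intersection(model, other_object)
```

Because the score only requires element-wise minima and a running sum, it can be updated incrementally as bins are compared, which is what makes the fast indexing version feasible on simple hardware.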
On the Design of Robot Vision Systems

The logical structure of robot vision is analyzed. Both sensor data fusion and knowledge representation in a vision system may be broken down into an equal number of hierarchical levels, where at each level knowledge representation in the form of partial models, and data fusion, complement each other. Multi-processor architectures reflecting this logical structure are introduced. Vision systems based on such an architecture are highly modular and clearly structured, because each module contains the hardware, the program, and the knowledge required for performing a well-defined subtask. The modules are thus largely self-contained, which minimizes the need for inter-module communication. This, and the fact that each module may use hardware specialized for its most frequent operations, makes such systems particularly efficient. Results obtained with such systems in real-world experiments with vision-guided robots are given.

Volker Graefe
Parallel Vision

This paper outlines an approach to rapid object recognition based on extraction of primitive image parts, and computation of properties of these parts, using divide-and-conquer techniques implemented on a pyramid or hypercube of processors. Object descriptions (as seen from given viewpoints) in terms of parts, properties, and relations are stored in another set of processors, and object hypotheses are triggered by the occurrence of given sets of part properties. The “object processors” then attempt to verify these hypotheses by checking that the parts satisfy given relations.

Azriel Rosenfeld
Towards Parallel Processing of Multisensed Data

Depending on the application, data may come from many different sources, even simultaneously, as in multisensed environments: this implies fast input channels and, consequently, processing elements able to provide the information required by the specific domain. For instance, in an autonomous vehicle control system, the cameras and other sensors should allow the computer unit of the vehicle to decide on and manage its driving strategy.

C. Guerra, S. Levialdi
Real-Time Vision for Autonomous and Teleoperated Control of Unmanned Vehicles

This paper focuses on two related topics: a control system architecture that unifies autonomous and teleoperated control of unmanned vehicles, and examples of how real-time vision processing fits into this architecture. The NIST hierarchical real-time control system architecture and its application to unmanned vehicles is presented. The paper then discusses recent work at NIST in real-time vision for both teleoperated and autonomous vehicles. For teleoperated vehicles, we describe a system for video compression for low data rate remote vehicle driving. For autonomous vehicles, we describe passive range extraction from optical flow for applications such as target extraction and identification, vehicle driving, and terrain mapping. We also describe how each of these vision systems fits into the control system architecture.

Martin Herman, James S. Albus, Tsai-Hong Hong
Behavioral Knowledge in Sensor/Data Fusion Systems

This paper proposes a behavioral control formalism to explain situations often encountered in complex sensor/data fusion (possibly decentralized) information acquisition, command, and control systems. It involves making the input-output relations of each actor involved in the fusion process conditional on selected behavioral classes, with error creation and repair options. For each behavioral class, the input-output relations are modified by concurrent behavioral rules, while the class itself has visible and hidden attributes. A formalism is also given for interactions between actors. This approach allows one to describe and analyze such situations as: indirect observations, ambiguous controls, hysteresis, deadlocks, and the initiation of attention mechanisms.

L. F. Pau
Group-Theoretic Approach to Motion Analysis for 3D Robotic Tracking

“Motion Understanding” in the field of Computer Vision requires that sequences of 2D images be analysed for changes that result from the relative 3D motions of objects with respect to the observer. Many approaches to motion understanding in computer vision are based first on computing the ‘optical flow’ of a set of time-varying images. Traditionally, this vector field corresponds to the 2D translations of local regions of the intensity profile which shift as a result of the 3D motions in the visible world. However, there are higher-order transformations which occur in the image as a result of the projective geometry. It is proposed that, for the task of 3D visuo-motor tracking, local image-based transformations of moving object surfaces can be modelled by a six-parameter affine group. The six degrees of freedom of the Euclidean rigid-body motion group [given by the semi-direct product of 3D translations and rotations (cf. Carlton and Shepard [9]), E⁺ = ℝ³ Ⓢ SO(3)] project perspectively to a unique six-dimensional vector field group (cf. Blicher and Omohundro [6]). Six velocity-sensitive operator types are presented which estimate the one-parameter groups that decompose the local affine transformations in the moving scenes. The subgroups form a canonical basis for estimating 3D relative motion and surface orientation, as specified by the state of a quaternion.

Roy Eagleson
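The six-parameter local decomposition the abstract refers to can be made concrete with the standard first-order expansion of the image velocity field (a sketch using the conventional flow notation, not necessarily the author's own symbols):

$$\begin{pmatrix} u(x,y) \\ v(x,y) \end{pmatrix}=\begin{pmatrix} u_0 \\ v_0 \end{pmatrix}+\begin{pmatrix} u_x & u_y \\ v_x & v_y \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix},\qquad\begin{pmatrix} u_x & u_y \\ v_x & v_y \end{pmatrix}=\tfrac{1}{2}\,\mathrm{div}\,I+\tfrac{1}{2}\,\mathrm{curl}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}+\tfrac{1}{2}\begin{pmatrix} \mathrm{def}_1 & \mathrm{def}_2 \\ \mathrm{def}_2 & -\mathrm{def}_1 \end{pmatrix}$$

where div = u_x + v_y, curl = v_x − u_y, def₁ = u_x − v_y, and def₂ = u_y + v_x. The two translations (u₀, v₀) together with divergence, curl, and the two deformation components supply the six one-parameter subgroups of the local affine group.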
Multi-Sensor Integration for Robots Interacting with Autonomous Objects

This paper outlines the development of a sound theoretical basis for the treatment of information derived from multiple dissimilar sensors. In particular, a decentralized recursive filtering procedure based on Kalman filtering theory, capable of handling unsynchronized sensory information, is developed. This filtering procedure allows efficient sub-optimal reconstruction of predictive position estimates for possibly autonomous object(s) moving in 3-D space. Possible application environments include collision avoidance and retrieval of autonomous moving objects.

S. Bruder, M. Farooq, M. Bayoumi
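As a minimal, hypothetical sketch of the kind of recursive filtering described (a single 1-D constant-velocity Kalman filter handling unsynchronized measurements, not the authors' decentralized multi-sensor formulation; all parameter values are invented for illustration), each update propagates the state over the sensor-specific elapsed time before correcting:

```python
import numpy as np

class KalmanTracker:
    """1-D constant-velocity Kalman filter; measurements from two
    position sensors arrive at unsynchronized times, so each update
    first propagates the state over the elapsed interval dt."""

    def __init__(self, q=0.01):
        self.x = np.zeros(2)        # state: [position, velocity]
        self.P = np.eye(2) * 10.0   # state covariance
        self.q = q                  # process-noise intensity

    def update(self, z, r, dt):
        F = np.array([[1.0, dt], [0.0, 1.0]])              # motion model
        Q = self.q * np.array([[dt**3 / 3, dt**2 / 2],
                               [dt**2 / 2, dt]])
        # Predict over the (sensor-specific) elapsed time.
        self.x = F @ self.x
        self.P = F @ self.P @ F.T + Q
        # Correct with the position measurement (H = [1, 0]).
        H = np.array([[1.0, 0.0]])
        S = H @ self.P @ H.T + r
        K = self.P @ H.T / S
        self.x = self.x + (K * (z - self.x[0])).ravel()
        self.P = (np.eye(2) - K @ H) @ self.P

# Two sensors with different noise levels observe x(t) = 2t.
rng = np.random.default_rng(0)
kf, t = KalmanTracker(), 0.0
for k in range(200):
    dt = 0.05 if k % 2 == 0 else 0.08   # unsynchronized arrival times
    t += dt
    r = 0.1 if k % 2 == 0 else 0.4      # sensor-dependent noise variance
    kf.update(2.0 * t + rng.normal(0, np.sqrt(r)), r, dt)
```

After 200 asynchronous updates the velocity estimate settles near the true value of 2.0, giving the kind of predictive position estimate the abstract describes.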
Segmentation of Moving Objects

This paper describes a new method for the segmentation of moving objects. The segmentation is done in two steps. First, a segmentation of a displacement vector field leads to rough contours of the objects. Since a displacement vector field provides information about the strength and direction of a movement, it is possible to separate moving objects (moving differently) from each other or from the background. In the second step these rough contours are refined. Within a search area around the contours found in the first segmentation step, the ‘best boundary’ is determined by optimizing a criterion for a ‘good boundary’ with the A*-algorithm. The computational cost is acceptable, because the search areas are very small compared with the whole picture. In contrast to methods based on difference images, it is shown that a separation can be achieved even if the moving objects touch each other or the scene is taken with a panning camera.

Harald Kirchner
Apparent Motions in Three-Dimensional Images

Apparent motions rank among the most interesting and conspicuous optical illusions. They have enabled psychologists to gain valuable insights into the mechanism of vision, although to people not familiar with such phenomena they can appear spectacular and sometimes even terrifying. Completely solid objects seem to deform and penetrate each other, images follow the observer's changing points of view, and motions appear to be falsified and even reversed. Such impressions are, of course, entertaining, and it is for this reason that the most widely known apparent motion, the cinema film, has become particularly popular. A motion picture is a succession of static images of instantaneous stages of motion which appear to fuse into continuous motion.

S. Klein, W. Dultz
Efficient Representation and High-Compression Coding of Image Sequences

Because of the large amount of data represented by an image sequence of even modest resolution, the transmission and storage of these sequences often requires compression ratios exceeding 200:1. Such ratios are near the limits of conventional coding methods. To exceed these limits, we seek to use knowledge of the human visual system, and to code sequences such that only perceptually important information is retained. For still images, techniques of this type have become known as “second generation” coding methods. Five approaches to this problem are described in this paper: 3-D split-and-merge, edge-based coding, Gibbs-Markov random fields, the Gabor decomposition, and transformed QMF filters. Index terms: image sequence coding, second generation, split-and-merge, edge detection, Gibbs distribution, Gabor decomposition, QMF filters.

Todd R. Reed, Touradj Ebrahimi, Patrice Willemin, Ferran Marques, Murat Kunt, Gaetano Giunta, T. George Campbell
A New Paradigm for Computational Stereo

Most research in computational stereo has followed the approach described in Barnard and Fischler's “Computational Stereo” (Computing Surveys, vol. 14, no. 4, 1982). This approach, although conceptually appealing and theoretically elegant, suffers from several limitations. Among these are the difficulties in the matching process, problems with feature localization, restrictive camera geometries, and, perhaps most importantly, the extensive computational effort required to produce depth estimates. By approaching the problem from more of an engineering perspective, a new paradigm for computational stereo has been developed that avoids the problems inherent in the conventional “extract and match” paradigm. The Intensity Gradient Analysis (IGA) technique determines depth values by analyzing temporal intensity gradients arising from the optic flow field induced by known camera motion. IGA assumes nothing about the nature of the environment and places no restrictions on camera orientation (IGA is applicable to sequences obtained using arbitrary translational motion).

Kurt Skifstad, Ramesh Jain
Smoothing Range Data for Curvature Estimation

The calculation of curvature plays a crucial role in the recognition of sculptured objects from range data. The smoothing operation which must precede the calculation of curvature should be considered an important part of this process, as it has the potential to alter its results. We take as our goal the selection of an effective scale and method for smoothing range data which minimizes changes in the shape of the object. Toward this end we have examined the effect of two aspects of the smoothing process on the detection of extremal points of curvature. First, we discuss the use of cross validation as a method of scale selection, and second we look at the results of nonlinear smoothing, in particular the use of anisotropic diffusion. Several recommendations are also made regarding the potential usefulness of smoothing methods which take into account the orientation of the surface.

Gaile G. Gordon
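Anisotropic diffusion in the Perona-Malik sense can be sketched in one dimension (an illustrative toy on a synthetic range profile, not the author's implementation; the parameter values are invented): the edge-stopping conduction coefficient keeps a depth discontinuity sharp while smoothing noise in the flat regions.

```python
import numpy as np

def perona_malik_1d(u, n_iter=50, kappa=0.5, dt=0.2):
    """Perona-Malik anisotropic diffusion on a 1-D range profile.

    The conduction coefficient g = exp(-(du/kappa)^2) shuts diffusion
    off across large gradients, so noise in flat regions is smoothed
    while depth discontinuities (jump edges) are preserved.
    """
    u = u.astype(float).copy()
    for _ in range(n_iter):
        du = np.diff(u)                   # forward differences
        g = np.exp(-(du / kappa) ** 2)    # edge-stopping function
        flux = g * du
        u[1:-1] += dt * (flux[1:] - flux[:-1])
    return u

# Noisy step profile: two planar patches separated by a jump edge.
rng = np.random.default_rng(1)
step = np.where(np.arange(100) < 50, 0.0, 5.0)
noisy = step + rng.normal(0, 0.1, 100)
smoothed = perona_malik_1d(noisy)

# Noise in the flat region is reduced ...
assert smoothed[:40].std() < noisy[:40].std()
# ... while the jump edge survives almost intact.
assert smoothed[55] - smoothed[45] > 4.5
```

This is exactly the shape-preserving property the paper is after: a linear (Gaussian) filter at the same scale would round the step and displace the extremal points of curvature.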
Residual Analysis for Range Image Segmentation and Classification

This paper presents an algorithm for the segmentation and classification of dense range images of industrial parts. Range images are unique in that they directly approximate the physical surfaces of a real-world 3-D scene. The segmentation of images (range or intensity) is typically based on edge detection or region growing techniques. The approach presented in this paper segments range images by combining edge detection and region growing techniques. Jump and roof edges are detected using residual analysis. The residual is defined as the absolute value of the difference between the original image and a filtered version. We show that, at an edge, the difference after smoothing has a maximum in the direction perpendicular to the edge for jump and roof edges. The segmented surfaces are then classified into planar, convex, or concave. The classification is done in two steps. The first step utilizes a variation of the Wald-Wolfowitz runs test to classify the surfaces as planar or curved. The second step further classifies each curved surface as convex or concave using a multi-scale residual computation. The performance of the algorithm on a number of range images of industrial parts is presented.

Ezzet H. Al-Hujazi, Arun Sood
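The residual idea can be illustrated with a 1-D toy (a simplification of the paper's 2-D formulation; the box filter and the synthetic scan are invented for illustration): the residual is near zero on planar interiors and peaks at the discontinuity.

```python
import numpy as np

def residual(signal, width=5):
    """Residual = |original - smoothed|; it peaks at jump and roof
    edges because smoothing displaces the signal most where the
    surface is discontinuous. Edge-padding keeps the image borders
    from producing spurious residual peaks."""
    kernel = np.ones(width) / width
    padded = np.pad(signal, width // 2, mode="edge")
    smoothed = np.convolve(padded, kernel, mode="valid")
    return np.abs(signal - smoothed)

# A range scan of two planar faces meeting at a jump edge at index 50.
scan = np.where(np.arange(100) < 50, 1.0, 4.0)
r = residual(scan)

# The residual is largest right at the discontinuity ...
assert 47 <= int(np.argmax(r)) <= 52
# ... and vanishes on the planar interior.
assert r[20] < 1e-9
```

Thresholding this residual gives edge seeds, which the paper then combines with region growing; computing it at several filter widths yields the multi-scale residual used for the convex/concave decision.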
Using Curvature Information in the Decomposition and Representation of Planar Curves

This paper describes a new symbolic representation for planar curves, based on segmenting the curve into regions of uniform curvature. Rather than smoothing noisy data before doing the decomposition, the technique defines a family of functions that extract the segments of the curve as part of the smoothing process. The representation decomposes the curve at multiple scales, and the parts produced appear to correspond to a natural decomposition of the curve. It also allows for multiple descriptions of some parts of the curve. The final representation can be rendered compact, avoids several common disadvantages of noisy curve description, and should be useful for recognition.

Gregory Dudek, John Tsotsos
Investigation into Building an Invariant Surface Model from Sparse Data

This paper addresses the problem of forming surface depth estimates from sparse information which are invariant to three-dimensional rotations and translations of the surface in the chosen coordinate system. We begin this investigation by examining a simplified version of this problem, that of forming invariant curve estimates from sparse data, to help gain insight into the more complex problem in higher dimensions. Two new algorithms are proposed and studied in detail, and several examples are presented to demonstrate their effectiveness. The extension of these algorithms to surfaces in three-dimensional space is also briefly discussed.

Robert L. Stevenson, Edward J. Delp
Form Features Recognition in a Multi-Level Representation Context

In this paper we propose a system for the automatic extraction and representation of form features in the framework of solid modeling that can be used in different application contexts. This system works in two steps. The first step starts from a boundary model of an object and identifies the so-called generic shape features by considering only their geometric and topological aspects. These features are subdivided into two general classes, protrusions and depressions, extracted as solid volumes and arranged in a multi-level structure. This representation, called the Attributed Structured Face Adjacency Hypergraph (ASFAH), is able to represent the recursive decomposition of an object into its main shape and the set of its form features (possibly compound). In the second step the identified features are classified according to their functional meaning in the application context. Thus, the hierarchical representation is reorganized by gluing and/or grouping sets of shape features which are functionally related in the considered context.

Bianca Falcidieno, Franca Giannini
Orientation and Spatial Occupancy Representations in Shape Analysis

In this paper we propose representations of two-dimensional curves that capture curve orientation and spatial occupancy, and show how they can be used in isolation or jointly to address problems in dynamic shape analysis and model-based matching of occluded objects. To explicitly represent curve orientation, we generalize the notion of the extended circular image to a non-convex curve by representing it as a sequence of extended circular images of its convex and concave parts. Evaluating the similarity of two curves can then be reduced to evaluating the similarity of corresponding segments by directly correlating their extended circular images. To explicitly represent spatial occupancy in a manner that can be used in shape matching, we blur the two-dimensional binary image obtained from the curve. Two curves that are similar in both shape and size, and optimally aligned with respect to each other in both position and orientation, will then yield a correlation coefficient close to one between the respective binary images. In dynamic shape analysis we use the orientation-based representation alone, while in the model-based matching of occluded objects we use orientation for generating hypotheses and spatial occupancy for selecting the correct ones.

Evangelos E. Milios
On Active Contour Models

The use of deformable models to extract features of interest in images was introduced by Kass et al. under the name of snakes, or energy-minimizing curves. We present a model of deformation which can solve some of the problems encountered with the original method. The external forces applied on the curve, derived from the image and pushing it toward high-gradient regions, are modified to give more stable results. The original “snake” model, when it is not subjected to any external force, will find its equilibrium by vanishing either to a point or to a line, according to the boundary conditions. Also, a snake which is not close enough to contours will not be attracted. We define a new model of active contour which makes the curve v(s) = (x(s), y(s)) behave well in these cases. The equation we solve is: $$\frac{\partial {v}}{\partial {t}}- \frac{\partial}{\partial {s}}\left(w_1\frac{\partial {v}}{\partial {s}}\right)+\frac{\partial^2}{\partial {s^2}}\left(w_2\frac{\partial^2 {v}}{\partial {s^2}}\right)=f_1(v)+f_2(v)$$ where f1 is derived from the image to attract the curve to edges, and f2 is an internal force which simulates a pressure force. The initial data v0 no longer needs to be close to the solution in order to converge. The curve behaves like a balloon which is blown up. When it passes by edges, it is stopped if the contour is strong, or passes through if the contour is too weak. We give examples of results of the new algorithm applied to medical images to extract a ventricle.

Laurent D. Cohen
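A minimal sketch of the balloon behavior (image forces f1 omitted, explicit Euler time stepping, bending term dropped, and all parameter values invented for illustration) shows the pressure term f2 inflating a closed curve:

```python
import numpy as np

def evolve_balloon(v, n_steps=100, dt=0.1, w1=0.5, k_pressure=0.2):
    """Explicit-Euler evolution of a closed active contour with an
    elastic internal force (w1 * v_ss) and an inflation force of
    constant magnitude along the outward normal."""
    v = v.copy()
    for _ in range(n_steps):
        # Second derivative along the (periodic) curve: elasticity.
        v_ss = np.roll(v, -1, axis=0) - 2 * v + np.roll(v, 1, axis=0)
        # Outward normal from the tangent (counter-clockwise curve).
        tangent = np.roll(v, -1, axis=0) - np.roll(v, 1, axis=0)
        normal = np.stack([tangent[:, 1], -tangent[:, 0]], axis=1)
        normal /= np.linalg.norm(normal, axis=1, keepdims=True)
        v += dt * (w1 * v_ss + k_pressure * normal)
    return v

# Initialize as a small circle; with no edges to stop it, the
# pressure force inflates the contour like a balloon.
theta = np.linspace(0, 2 * np.pi, 64, endpoint=False)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
inflated = evolve_balloon(circle)

r0 = np.linalg.norm(circle, axis=1).mean()
r1 = np.linalg.norm(inflated, axis=1).mean()
assert r1 > r0 + 0.5  # the curve has grown outward
```

In the full method the image force f1 is added to the update, so the inflating curve halts where the edge response is strong enough to balance the pressure, which is why initialization far from the contour still converges.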
Templates and the Hough Transform

Shape, in both 2 and 3D, provides a primary cue for object recognition and the Hough Transform method has received considerable attention as a shape analysis technique. The relationship between the Hough Transform and template matching has long been recognised but its exploitation has been much neglected. In this paper we introduce a novel result which relates the quantization of the Hough parameter space and image template shapes. We show how the result can be used to construct equivalent feature space templates and demonstrate that this analysis yields valuable information concerning the design of Hough Transforms. The template interpretation explains why several recent hierarchical algorithms for efficient Hough Transform implementation may fail in complex imagery and it gives insight into several other aspects (e.g. filtering) of proposed Hough transform methods.

J. Princen, J. Illingworth, J. Kittler
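A minimal line Hough transform (synthetic points, the rho = x cos θ + y sin θ parameterization, and an arbitrarily chosen quantization, all invented for illustration) makes concrete the accumulator whose cells the paper relates to feature-space templates:

```python
import numpy as np

def hough_lines(points, n_theta=180, n_rho=100, rho_max=10.0):
    """Accumulate votes in (theta, rho) space for the line
    rho = x*cos(theta) + y*sin(theta). Under the template
    interpretation, each accumulator cell corresponds to a
    band-shaped image template whose width is fixed by the
    rho quantization."""
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((n_theta, n_rho), dtype=int)
    for x, y in points:
        rho = x * np.cos(thetas) + y * np.sin(thetas)
        idx = np.round((rho + rho_max) / (2 * rho_max) * (n_rho - 1)).astype(int)
        ok = (idx >= 0) & (idx < n_rho)
        acc[np.arange(n_theta)[ok], idx[ok]] += 1
    return acc, thetas

# Points on the vertical line x = 3 (theta = 0, rho = 3), plus clutter.
line = [(3.0, y) for y in np.linspace(-4, 4, 20)]
clutter = [(-2.0, 1.0), (1.5, -3.0), (0.0, 4.0)]
acc, thetas = hough_lines(line + clutter)

ti, ri = np.unravel_index(np.argmax(acc), acc.shape)
assert abs(thetas[ti]) < 0.05   # theta of the peak is ~0
assert acc[ti, ri] >= 20        # all 20 collinear points vote together
```

Coarsening the rho quantization widens the implicit band template, which is exactly the trade-off between noise tolerance and localization that the template analysis in the paper quantifies.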
Coefficient Focusing for the Moore-Penrose Associative Memory

This paper describes a modification of the Moore-Penrose distributed associative memory (DAM). The modification lies in the inclusion of an iterative scheme to increase the selectivity of the memory, and in the inclusion of a reject option. The reject option is indispensable when the DAM is used in a dynamic vision environment. The DAM can be applied to the recognition of two-dimensional objects. When an unknown object is presented to the memory, it outputs an associated image and a set of association coefficients. Previous applications used the magnitude of these coefficients to indicate the strength of association with each of the stored images. By exploiting the formal similarity of the DAM recollection scheme with multiple linear regression, we suggest using statistical significance measures instead to measure the strength of association. We further suggest an iterative scheme to improve the selectivity of the association coefficients. This scheme is based on discarding from further consideration response vectors deemed to be insignificant. Experimental results demonstrate the feasibility and benefits of the new recognition method.

Wolfgang Pölzleitner, Harry Wechsler
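The recall step of a Moore-Penrose DAM can be sketched as follows. In general the memory is M = Y X⁺, with Y the matrix of stored responses and X⁺ the pseudoinverse of the stimulus matrix; the sketch below assumes orthonormal stimulus vectors, for which X⁺ reduces to the transpose. The vectors and the magnitude-based rejection threshold are illustrative stand-ins (the paper argues for statistical significance measures instead of raw magnitudes):

```python
def dam_recall(keys, responses, probe, reject_thresh=0.5):
    """Recall from a Moore-Penrose DAM, assuming orthonormal key vectors
    so that the pseudoinverse of the stimulus matrix is its transpose.
    Returns (association coefficients, recalled response); the response
    is None when the reject option fires."""
    # association coefficient c_k = <probe, key_k>  (rows of X^T times probe)
    coeffs = [sum(p * k for p, k in zip(probe, key)) for key in keys]
    if max(abs(c) for c in coeffs) < reject_thresh:
        return coeffs, None          # reject: no stored image matches well
    # recalled response = sum_k c_k * response_k   (i.e. Y applied to c)
    dim = len(responses[0])
    recalled = [sum(c * r[i] for c, r in zip(coeffs, responses))
                for i in range(dim)]
    return coeffs, recalled

keys = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]   # orthonormal stimulus vectors
responses = [(2.0, 3.0), (4.0, 5.0)]        # images associated with each key
coeffs, out = dam_recall(keys, responses, (0.9, 0.1, 0.0))
_, rejected = dam_recall(keys, responses, (0.0, 0.0, 1.0))
```

A probe orthogonal to every stored stimulus yields all-zero coefficients and triggers the reject option, the behavior the abstract calls indispensable in a dynamic vision environment.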
Object detection in noisy images

Scale-space representation is a topic of active research in computer vision. Several researchers have studied the behavior of signals in the scale-space domain and how a signal can be reconstructed from its scale-space. However, not much work has been done on the signal detection problem, i.e. detecting the presence or absence of signal models from a given scale-space representation. In this paper we propose a model-based object detection algorithm for separating the objects from the background in the scale-space domain. There are a number of unresolved issues, some of which are discussed here. The algorithm is used to detect an infrared image of a tank in a noisy background. The performance of a multiscale approach is compared with that of a single scale approach by using a synthetic image and adding controlled amounts of noise. A synthetic image of randomly placed blobs of different sizes is used as the clean image. Two classes of noisy images are considered. The first class is obtained by adding clutter (i.e. colored noise) and the second class by adding an equivalent amount of white noise. The multiscale and single scale algorithms are applied to detect the blobs, and performance indices such as the number of detections, the number of false alarms, and delocalization errors are computed. The results indicate that (i) the multiscale approach is better than the single scale approach and (ii) the degradation in performance is greater with clutter than with white noise.

V. Topkar, B. Kjell, A. Sood
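A minimal multiscale detection sketch in 1D follows; the synthetic signal, noise level, scales, and threshold are illustrative choices, not the paper's setup. A position counts as a detection only if it exceeds the threshold at every scale, which suppresses isolated noise responses that a single fine scale would pass:

```python
import math
import random

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at 3 sigma."""
    radius = int(3 * sigma)
    k = [math.exp(-(i * i) / (2.0 * sigma * sigma))
         for i in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]

def smooth(signal, sigma):
    """Convolve with a Gaussian, clamping indices at the borders."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(k):
            idx = min(max(i + j - r, 0), len(signal) - 1)
            acc += w * signal[idx]
        out.append(acc)
    return out

def detect(signal, sigmas, thresh):
    """Keep only positions that exceed the threshold at every scale."""
    smoothed = [smooth(signal, s) for s in sigmas]
    return [i for i in range(len(signal))
            if all(sm[i] > thresh for sm in smoothed)]

random.seed(0)
clean = [1.0 if 40 <= i < 60 else 0.0 for i in range(100)]   # one blob
noisy = [v + random.gauss(0.0, 0.3) for v in clean]          # white noise
hits = detect(noisy, sigmas=(1.0, 2.0, 4.0), thresh=0.5)
```

Running `detect` with a single small sigma instead of the triple typically admits spurious hits far from the blob, which is the single-scale weakness the comparison in the paper quantifies.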
Active Feature Localization

An application of an active method for computing highly accurate 3D localization of point features from few projections is presented. The angle of projection of the image is controlled by the system and directed to extract 3D information from the environment in a manner that leads to accurate localization with less computation. This model is relevant for tomographic reconstruction and for feature-based stereo.

Yaacov Hel-Or, Amir Shmuel, Michael Werman
Haptic Perception with a Robot Hand: Requirements and Realization

This paper first briefly discusses some recent ideas of perceptual psychology on the human haptic system, particularly those of J.J. Gibson and of Klatzky and Lederman. Following this introduction, we present some of the requirements of robotic haptic sensing and the results of experiments using a Utah/MIT dexterous robot hand to derive geometric object information through active sensing.

Paul Michelman, Peter K. Allen
Visual Techniques for the Controlled Movement of Docking

This paper discusses ongoing research in developing vision strategies for the docking behavior of an autonomous mobile robot, concentrating on the needs of the controlled movement of docking in a manufacturing environment. In the controlled movement, a perceptual strategy must provide feedback to the motor behavior in order to make accurate corrections to the mobile robot’s approach trajectory. Two novel techniques have been developed: adaptive tracking of an artificial landmark through a sequence of images, and the use of texture to recover relative depth and orientation. Experimental results are presented. These techniques, in conjunction with an inverse perspective transform technique for the coarse recovery of depth and orientation, form the basis of the perceptual strategy for the controlled movement.

Robin R. Murphy
Towards the Development of a Multisensorial Vision System: An Adaptive Approach to the Low-Level Phase

The use of vision systems to accomplish the task of recognizing and describing 3D outdoor scenes is an open problem faced by current research in Computer Vision. In the past few years, new methodologies have been developed which make it possible to integrate different information sources about a given surrounding environment into a single recognition framework. The system proposed in this paper has been designed to reach a high degree of reliability by adaptively integrating data coming from a variable number of physical and virtual imaging sensors. The problem addressed in this paper lies in optimizing the choice of low-level methods for processing the outputs of this set of sensors, and for extracting data that best describe, at a symbolic level, the scene under examination. The system, which integrates techniques typical of Pattern Recognition, Artificial Intelligence, and Image Processing, is characterized by a modular structure and has the capability of evaluating the data and results obtained during the whole recognition process. To this end, attention has been focused on correlating the symbolic and numerical aspects of the computational process; consequently, it has been made possible to assess the progressive integration results according to accurate evaluation criteria.

Daniele D. Giusto, Carlo S. Regazzoni, Gianni Vernazza
Attributed Parallel Rewriting in Vision

The success of many processes depends upon the interpretation of visual messages sent from an environment to a system that must execute a task as part of a strategy. The accomplishment of the task often requires that the visual message be described according to different (sub-)tasks which must be contextually satisfied by exploiting knowledge from different sources.

P. Mussio, M. Protti
Backmatter
Metadata
Title
Active Perception and Robot Vision
edited by
Arun K. Sood
Harry Wechsler
Copyright year
1992
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-77225-2
Print ISBN
978-3-642-77227-6
DOI
https://doi.org/10.1007/978-3-642-77225-2