
1995 | Book

Automatic Extraction of Man-Made Objects from Aerial and Space Images

edited by: Prof. Dr. Armin Gruen, Prof. Dr. Olaf Kuebler, Dr. Peggy Agouris

Publisher: Birkhäuser Basel

Book series: Monte Verità


About this book

Advances in digital sensor technology, digital image analysis techniques, and computer software and hardware have brought together the fields of computer vision and photogrammetry, which are now converging towards sharing, to a great extent, objectives and algorithms. The potential for mutual benefit from close collaboration and interaction between these two disciplines is great, as photogrammetric know-how can be aided by the most recent image analysis developments in computer vision, while modern quantitative photogrammetric approaches can support computer vision activities. Devising methodologies for automating the extraction of man-made objects (e.g. buildings, roads) from digital aerial or satellite imagery is an application where this cooperation and mutual support is already reaping benefits. The valuable spatial information collected using these interdisciplinary techniques is of improved qualitative and quantitative accuracy. This book offers a comprehensive selection of high-quality and in-depth contributions from leading research institutions worldwide, treating theoretical as well as implementation issues, and representing the state of the art on this subject within the photogrammetric and computer vision communities.

Table of Contents

Frontmatter

General Strategies

Frontmatter
Using Context to Control Computer Vision Algorithms
Abstract
We are investigating the design of an architecture that can be used as the basis for controlling the invocation of image understanding algorithms for cartographic feature extraction. The key research question is whether sufficient contextual constraints are available to choose algorithms and their parameters for aerial photo analysis. Our approach has been to apply the context-based architecture incorporated in CONDOR, an SRI system for automatically constructing scene models of natural terrain from ground-level views. The semiautomated nature of cartographic feature extraction allows access to additional sources of contextual constraints that were not available to CONDOR.
Thomas M. Strat
Inferring Homogeneous Regions from Rich Image Attributes
Abstract
Image segmentation is an important part of any computer vision framework. However, the transition from local low-level representations to useful structures and relations at the intermediate levels has turned out to be a truly difficult problem. This paper addresses the difficult transition from low-level to intermediate-level vision, where the latter deals with producing a description of image and scene attributes in which more global relations are made explicit. We propose to combine a rich attributed contour representation with very general geometric contour relations. The implemented geometric relations, which are proximity, curvilinearity, parallelism and corner-like relations, make it possible to handle general man-made objects whose projected surfaces can be described by combinations of the defined relations. The combination of rich image attributes and geometric relations makes it possible to discriminate between strong and weak contour relations. Strong relations require not only that the geometrical constraints are met but also that the contour attributes (e.g. photometric) are in agreement. We describe the approach and show some preliminary results.
Olof Henricsson
Stereo Reconstruction from Multiple Views
Abstract
We present a unified framework for 3-D shape reconstruction that allows us to combine multiple information sources—such as stereo and shape-from-shading—derived from images whose vantage points may be very different. A formal integration framework is critical because, in recovering complicated surfaces, the information from a single source is often insufficient to provide a unique answer.
We describe two complementary implementations of this paradigm that both rely on deforming a generic object-centered 3-D representation of the surface so as to minimize an objective function. The first implementation models the surface as a triangulated mesh. The second models it as a set of particles that interact with each other through forces that tend to align them.
P. Fua
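The surface-deformation idea in the abstract above can be illustrated with a deliberately simplified sketch: a 1-D height profile stands in for the triangulated mesh, and a data term plus a curvature-smoothness term are minimized by plain gradient descent. The function name, energy weights and step size here are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def deform_profile(obs, lam=0.5, lr=0.1, steps=400):
    """Deform a 1-D height profile (a stand-in for a triangulated mesh)
    by gradient descent on a data term (squared distance to the
    observations) plus a curvature smoothness term."""
    obs = np.asarray(obs, float)
    z = obs.copy()
    for _ in range(steps):
        grad = 2.0 * (z - obs)                     # data term gradient
        d2 = np.zeros_like(z)
        d2[1:-1] = z[:-2] - 2.0 * z[1:-1] + z[2:]  # discrete curvature
        g = np.zeros_like(z)                       # gradient of sum(d2**2)
        g[1:-1] += -4.0 * d2[1:-1]
        g[:-2] += 2.0 * d2[1:-1]
        g[2:] += 2.0 * d2[1:-1]
        z -= lr * (grad + lam * g)
    return z
```

Run on a height profile with an isolated spike, the smoothness term pulls the spike down and its neighbours up, which is the qualitative behaviour of the objective-minimizing deformation the abstract describes.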
Model Registration and Validation
Abstract
An important application of machine vision is to provide a means to monitor a scene over a period of time and report changes in its content. We have developed a validation mechanism that implements the first step towards a system for detecting changes in images of aerial scenes. Validation seeks to confirm the presence of model objects in the image. Our system uses a 3-D site model of the scene as a basis for model validation, and eventually for detecting changes and updating the site model. The validation process is implemented in three steps: resection, fine registration of the image to the model by matching model features to image features, and validation of the objects in the model. Where necessary, our system uses shadows to help validate the model. The system has been tested using a hand-generated site model and several images of a 500:1 scale model of the site, acquired from several viewpoints.
Andres Huertas, Mathias Bejanin, Ramakant Nevatia
Automatic Matching of 3-D Models to Imagery
Abstract
This paper presents a technique for automatically locating known control features in uncontrolled imagery. The automatic matching process exploits the available image support data and information concerning the imaging process, including sensor operation, environmental factors, and control feature information. One key feature of the algorithm is the radiometric modeling which is used to apply sensor characteristics, object reflectance, and illumination conditions. The importance of radiometric modeling is demonstrated.
Walter Mueller, James Olson
Structural 3D-Analysis of Aerial Images with a Blackboard-based Production System
Abstract
A model-based method for the analysis of structures in aerial images is described. The knowledge about object structures is represented by a set of productions. Together with the object concepts these productions form a net. Such production systems are implemented in a blackboard architecture. The database is realized in the form of an associative memory. A simple example net is used to explain the 3D analysis and the detection of a high building. The result of the image analysis and partial results obtained from a derivation graph are displayed. Problems with the associative access are discussed and an extended hardware concept is proposed.
Uwe Stilla, Eckart Michaelsen, Karl Lütjen
Semi-Automatic Detection and Extraction of Man-Made Objects in Multispectral Aerial and Satellite Images
Abstract
Extraction of man-made objects from images is an important problem in computer vision. In this paper a general procedure to extract man-made objects in colour images of terrain using the ‘Background Discriminant Transformation (BDT)’ is presented. BDT is simple to use, robust and application oriented. The procedure does not require extensive knowledge of the terrain. To illustrate the procedure an example is shown from an airborne colour image. However, the process has also been tested, and reported elsewhere, on spaceborne multispectral images such as SPOT images. At this stage the process is semi-automatic.
The procedure is implemented on an Image Processing CAD workstation. The workstation has Image Processing, CAD graphics and GIS modules interacting closely. This integrated system has the capability to perform the high-level image processing needed for image interpretation.
Vittala K. Shettigara, Siegfried G. Kempinger, Robert Aitchison
Information Extraction from Digital Images: A KTH Approach
Abstract
The research on Digital Photogrammetry at our Department focuses on the automated gathering of cartographic information from multiple digital images. Topics include the description of features and objects to be expected in two-dimensional data. When appropriate, the reconstruction is performed in three-dimensional object space. The objects are buildings in aerial imagery. This approach has potential impact on image coding, image compression and resampling. All major steps of the developed concept are described, experience with alternative approaches is presented, and the current state is demonstrated on examples selected from the latest publications of researchers at the Department.
Eberhard Gülch
Linear Feature Extraction with Dynamic Programming and Globally Enforced Least Squares Matching
Abstract
In this paper we present a method for semi-automatic extraction of road networks from SPOT imagery using wavelet-transformed images and cost functions expressing local gray value variations and global continuity constraints. For larger scale images and for various object types, we propose a global approach for semi-automatic object outline detection which uses local gray value variations to precisely identify edge locations. In this new procedure of globally enforced least squares template matching, the mathematical foundation is provided by least squares matching, while global continuity is enforced through the introduction of object-type-dependent shape constraints.
Armin Gruen, Peggy Agouris, Haihong Li
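The dynamic-programming side of such methods can be sketched generically: a linear feature is traced through a cost image column by column, with the allowed vertical step acting as the global continuity constraint. This is a minimal minimum-cost-path sketch, not the authors' actual wavelet-based cost functions or least squares matching machinery.

```python
import numpy as np

def trace_linear_feature(cost, max_step=1):
    """Trace a minimum-cost path across `cost` (rows x cols), one pixel
    per column. The vertical move between columns is limited to
    `max_step` pixels, acting as a continuity constraint."""
    rows, cols = cost.shape
    acc = np.full((rows, cols), np.inf)      # accumulated path cost
    back = np.zeros((rows, cols), dtype=int) # backpointers
    acc[:, 0] = cost[:, 0]
    for c in range(1, cols):
        for r in range(rows):
            lo, hi = max(0, r - max_step), min(rows, r + max_step + 1)
            k = int(np.argmin(acc[lo:hi, c - 1]))
            acc[r, c] = cost[r, c] + acc[lo + k, c - 1]
            back[r, c] = lo + k
    # Backtrack from the cheapest endpoint in the last column.
    path = [int(np.argmin(acc[:, -1]))]
    for c in range(cols - 1, 0, -1):
        path.append(int(back[path[-1], c]))
    return path[::-1]
```

In a real road tracer the cost image would encode local grey-value variation (e.g. negative edge strength along the hypothesized road direction) rather than a hand-built array.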
Semi-Automatic Feature Extraction by Snakes
Abstract
The implementation of a semi-automatic method of feature extraction, based on the active contour method or ‘snakes’, will be described and results obtained from the procedure demonstrated. The paper will define the method of snakes for two-dimensional features on single images, and three-dimensional features on two or more images. The results of tests on SPOT satellite images are described in terms of the accuracy of the extracted features and the pull-in range, for a range of features, for both two-dimensional and three-dimensional geometry.
John C. Trinder, Haihong Li
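A minimal greedy variant of the snake idea moves each vertex of a closed contour to the neighbouring pixel that minimises an internal spacing energy plus an image energy. The neighbourhood search, the energy weight and the edge-map formulation are illustrative assumptions; they are not the specific energy model or optimisation scheme used in the paper.

```python
import numpy as np

def greedy_snake_step(pts, edge_map, alpha=0.5):
    """One greedy snake iteration: move each vertex of a closed contour
    to the 8-neighbourhood position minimising an internal (spacing)
    energy plus an image energy (negative edge strength)."""
    pts = np.asarray(pts)
    new = pts.copy()
    h, w = edge_map.shape
    n = len(pts)
    for i in range(n):
        r, c = pts[i]
        prev_pt = new[i - 1]            # already-moved predecessor
        next_pt = pts[(i + 1) % n]      # closed contour
        best, best_e = (r, c), np.inf
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if not (0 <= rr < h and 0 <= cc < w):
                    continue
                spacing = (np.hypot(rr - prev_pt[0], cc - prev_pt[1])
                           + np.hypot(rr - next_pt[0], cc - next_pt[1]))
                e = alpha * spacing - edge_map[rr, cc]
                if e < best_e:
                    best_e, best = e, (rr, cc)
        new[i] = best
    return new
```

Iterating such steps until no vertex moves is what pulls an approximately digitised feature onto the nearby image edge, which is the origin of the pull-in range discussed in the abstract.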
From Ziplock Snakes to Velcro™ Surfaces
Abstract
The use of energy minimizing deformable models in various applications has become very popular. The issue of initializing such models, however, has not received much attention although the model’s performance depends critically on its initial state. We aim at obtaining good convergence and segmentation properties from a minimum of a priori information.
We present a new approach to the segmentation of 2-D and 3-D shapes that initializes and then optimizes a deformable model given only the data and a very small number of 2-D or 3-D seed points respectively. This is a valuable capability for medical, robotic and cartographic applications where such seed points can be naturally supplied. In effect, the 2-D “snake” and the 3-D surface model are clamped onto the object boundary in a manner reminiscent of a ziplock or velcro being closed.
We develop the method’s mathematical framework and show results using 2-D cartographic data. Preliminary results in 3-D using volumetric medical data are shown as well.
W. Neuenschwander, P. Fua, G. Székely, O. Kübler

Building Extraction

Frontmatter
A Layered Abduction Model of Building Recognition
Abstract
This paper addresses the problem of recognizing buildings from large-scale aerial images on a rather conceptual level. After defining the problem and describing the assumptions and constraints, the object recognition problem is decomposed into different layers, beginning with the preprocessed images and progressing through intermediate levels of raw and segmented surfaces towards a geometric and semantic description of buildings. The interaction of the tasks on every layer is cast as a layered abduction model based on the hypothesis that visual perception is layered abduction. A brief motivation of this hypothesis is provided in the third section, together with a basic background of abduction. The remaining part of the paper is concerned with applying the general abduction model to the problem of building recognition.
Toni Schenk
Detection of Buildings from Monocular Images
Abstract
We describe a system for the detection and description of buildings in aerial scenes. This is a difficult task as aerial images contain a variety of objects. Low-level segmentation processes give highly fragmented segments for a number of reasons. We use a perceptual grouping approach to collect these fragments and discard those that come from other sources, exploiting the shape properties of buildings. We use shadows and walls to help form and verify the hypotheses generated by the grouping process. This latter step also provides 3-D descriptions of the buildings. Our system has been tested on a number of examples and is able to work with overhead or oblique views.
Chungan Lin, Andres Huertas, Ramakant Nevatia
High-resolution Stereo for the Detection of Buildings
Abstract
After a brief overview of the research going on in the Pastis group at Inria, which covers retrieving depth and recovering symbolic information from remotely sensed data, we concentrate in this paper on a particularly difficult task, high-resolution stereo, which deals with images with a resolution of one meter or less, containing smooth textured natural areas as well as discontinuities, or even occlusions, of man-made structures. Two approaches are proposed for extracting reliable dense depth maps: one relies on the use of several aerial images with increasing disparities, and the other on an adaptive window correlation scheme, which prevents the correlation window from extending over radiometric, and thus depth, discontinuities.
M. Berthod, L. Gabet, G. Giraudon, J. L. Lotti
3-D Reconstruction of Urban Scenes from Sequences of Images
Abstract
In this paper, we address the problem of recovering the Euclidean geometry of a scene from a sequence of images without any prior knowledge either about the parameters of the cameras or about the motion of the camera(s). We do not require any knowledge of the absolute coordinates of some control points in the scene to achieve this goal. Using various computer vision tools, we establish correspondences between images and recover the epipolar geometry of the set of images, from which we show how to compute the complete set of perspective projection matrices for each camera position. These being known, we proceed to reconstruct the scene. This reconstruction is defined up to an unknown projective transformation (i.e. is parameterized with 15 arbitrary parameters). Next we show how to go from this reconstruction to a more constrained class of reconstructions, defined up to an unknown affine transformation (i.e. parameterized with 12 arbitrary parameters), by exploiting known geometric relations between features in the scene such as parallelism. Finally, we show how to go from this reconstruction to another class, defined up to an unknown similitude (i.e. parameterized with 7 arbitrary parameters). This means that in a Euclidean frame attached to the scene or to one of the cameras, the reconstruction depends only upon one parameter, the global scale. This parameter is easily fixed as soon as one absolute length measurement is known. We see this vision system as a building block, a vision server, of a CAD system that is used by a human to model a scene for such applications as simulation and virtual or augmented reality. We believe that such a system can save the human observer a great deal of tedious work, as well as play a leading role in keeping the geometric database accurate and coherent.
Olivier Faugeras, Stéphane Laveau, Luc Robert, Gabriella Csurka, Cyril Zeller
Automatic Extraction of Buildings and Terrain from Aerial Images
Abstract
A system has been developed to acquire, extend and refine 3D geometric site models from aerial imagery. The system hypothesizes potential building roofs in an image, automatically locates supporting geometric evidence in other images, and determines the precise shape and position of the new buildings via multi-image triangulation. Model-to-image registration techniques are applied to align new images with the site model, and model extension and refinement procedures are performed to acquire previously unseen buildings and improve the geometric accuracy of the existing 3D models. A correlation-based terrain recovery algorithm provides complementary information about the site, in the form of a digital elevation map.
Robert T. Collins, Allen R. Hanson, Edward M. Riseman, Howard Schultz
Mid-Level Vision Processes for Automatic Building Extraction
Abstract
Mid-level processes in vision are understood to produce structured descriptions of images without relying on very specific semantic scene knowledge. Automatic building extraction can use geometric models to a large extent. Geometric hypotheses may be inferred from the given data in 2D or 3D and represent elementary constraints such as incidence or collinearity, or more specific relations such as symmetries. The inferred hypotheses may lead to difficulties during spatial inference due to noise and to inconsistent and mutually dependent constraints. The paper discusses the selection of mutually non-contradicting constraints via robust estimation and the selection of a set of independent constraints as a prerequisite for an optimal estimation of the object’s shape. Examples from the analysis of image and range data are given.
Wolfgang Förstner
Geometric versus Texture Detail in 3-D Models of Real World Buildings
Abstract
The creation of three-dimensional CAD models of real world objects and their rendering using photorealistic texture are current topics of investigation. In the case of buildings and other objects of urban environments the need for this technology is evident from various applications. The primary source material consists of images which serve for both the reconstruction of the object’s geometry and the creation of texture.
Depending on the scale of the digital source images and on the specifics of an application, one may require more or less geometric detail. The use of photorealistic texture enhances the perceived detail even in the absence of a detailed geometric model; however, one needs to overcome correspondence problems between texture and geometry, which may partly be caused by issues of illumination.
We present in this paper a progress report on one aspect of urban models, namely the reconstruction of roofs. To improve the automation of data collection we propose the fusion of digital map data and aerial images. We discuss an affine matching procedure. In addition we illustrate the correspondence problem between geometry and photo-texture.
Our initial experiences are based on results from two test sites, one in Vienna and the other in the city of Graz.
Michael Gruber, Marko Pasko, Franz Leberl
Use of DTMs/DSMs and Orthoimages to Support Building Extraction
Abstract
The acquisition of 3D models of buildings and other man-made objects is currently an issue of high importance to many users of geoinformation, including planners, geographers, architects, etc. Aerial imagery has proven to be a valuable data source for these models. The project AMOBE, a joint research effort between the photogrammetry and image sciences groups at ETH, aims firstly at developing practical algorithms to support the semi-automatic reconstruction of man-made objects from aerial imagery and secondly at developing improved techniques for automatic digital terrain and surface model generation. In the latter case it is important to differentiate between terrain models (DTMs), which model the terrain, and surface models (DSMs), which model all 3D objects. In this paper, we explore the roles of both DTMs and DSMs, and of their derived products, orthoimages, in supporting the extraction of buildings from aerial imagery. In Section 3 the quality of commercially available automatic DTM/DSM generation software is investigated. In Section 4 a number of methods for automatically detecting buildings in DSMs are presented and evaluated. In Section 5 the use of DSMs in deriving coarse building models is described. Applications of orthoimages are discussed in Section 6. First, however, the fundamental assumptions and strategy employed in AMOBE are outlined. Note finally that the techniques and tests described in this paper constitute preliminary investigations. Promising directions, as noted, are the subject of ongoing research.
Emmanuel Baltsavias, Scott Mason, Dirk Stallmann
Data fusion for the detection and reconstruction of buildings
Abstract
Research in detection and reconstruction of man-made objects from aerial images has made significant progress in the past two or three years. Two important reasons for that are: (1) data fusion of different sources provides more information for the ill-posed image analysis processes and (2) more sophisticated algorithms are developed which apply grouping and reasoning processes using a model of the object class of interest.
This paper presents an algorithm for the automatic detection and reconstruction of buildings using height and image data. A given Digital Height Model (DHM), in the experiments computed by automatic stereo matching, is used to focus attention on regions where buildings are expected. Detection relies on the heuristic that buildings are represented in a DHM by regions with local height maxima. Object contours of buildings can be modelled by straight lines. Therefore, three-dimensional line segments are extracted from the image pair by stereo matching of grey-value edges. Again, the DHM is used to provide approximate parallaxes for the line segments. A building can be reconstructed by matching these observed three-dimensional lines to the lines of a model of the building. The position and shape of a building are estimated by minimizing the distances between the observed lines and the corresponding lines of a parameterized building model. The resulting least squares estimation error provides a measure of how well the observed lines fit the model, and can thus be used to evaluate the result of the reconstruction. As a second quality check, the extracted roof regions of a building in a stereo image pair are matched by area-based matching.
Norbert Haala, Michael Hahn
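The least-squares fit and its residual-based quality measure can be sketched in miniature. Here the model-to-observation matching is reduced to estimating a pure 3-D translation between corresponding line midpoints; this stands in for the full parameterized building model, whose details the abstract does not give, and the function name and point data are illustrative.

```python
import numpy as np

def fit_translation(model_pts, observed_pts):
    """Least-squares translation aligning model line midpoints with the
    midpoints of observed 3-D lines. The RMS residual serves as the
    goodness-of-fit measure for evaluating the reconstruction."""
    model_pts = np.asarray(model_pts, float)
    observed_pts = np.asarray(observed_pts, float)
    t = (observed_pts - model_pts).mean(axis=0)   # closed-form LS solution
    residuals = observed_pts - (model_pts + t)
    rms = float(np.sqrt((residuals ** 2).sum(axis=1).mean()))
    return t, rms
```

A large RMS after fitting signals that the observed lines do not belong to the hypothesized building shape, which is exactly how the abstract uses the estimation error as an evaluation criterion.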
Building Extraction and Verification from Spaceborne and Aerial Imagery using Image Understanding Fusion Techniques
Abstract
For automatic building extraction, the need for fusing several image understanding cues has been emphasised. In this paper, the fusion of a stereoscopic technique (pyramidal matching) and a monoscopic technique (building detection) is investigated. In the proposed fusion technique, the building detection output is used to delineate the boundaries for height interpolation, and the pyramidal matching output serves as the source of height information. After height interpolation, buildings are reconstructed in 3D space. This fusion technique was tested on several examples and proved to be very powerful. The results of this paper support the role of fusion in more complete image understanding.
Taejung Kim, Jan-Peter Muller
Building Extraction from Stereo Pairs of Aerial Images: Accuracy and Productivity Constraint of a Topographic Production Line
Abstract
This paper presents the research that has been conducted for several years at IGN on automatic building extraction from aerial photographs. A description of the operational context and of the intended topographic application introduces the main constraints to be overcome. We then give a quick sketch of the building extraction process that has been worked out so far, and develop its quality assessment on a test site. The analysis of the results underlines the part played by the planar approximation of disparities, resulting from an area-based correlation process, in fulfilling the accuracy requirements. Further research will focus on improving reliability and on handling more complex shapes.
Olivier Jamet, Olivier Dissard, Sylvain Airault

Road Extraction

Frontmatter
Tracking Roads in Satellite Images by Playing Twenty Questions
Abstract
We present new experiments on tracking roads from SPOT satellite images. The principle of the algorithm is as follows: we choose “tests” (matched filters for short road segments) one at a time in order to remove as much uncertainty as possible about the road position given the results of the previous tests. The tests are chosen based on a statistical model for the joint distribution of tests and road positions. On-line, we alternate between data collection and optimization: at each iteration new image data is examined and a minimization problem is solved, resulting in a new image location to inspect, and so forth. We report experiments using panchromatic SPOT satellite imagery with a ground resolution of ten meters: given a starting point and starting direction, we are able to track in real time mountain highways in southern France over large distances without manual intervention.
Bruno Jedynak, Jean-Philippe Rozé
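The "twenty questions" principle of the abstract above (pick the test that removes the most uncertainty about the road position) can be sketched as a greedy expected-entropy minimisation over a discrete set of position hypotheses. The probability tables below are toy assumptions, not the paper's statistical model of matched-filter responses.

```python
import math

def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def pick_test(prior, likelihoods):
    """Greedy 'twenty questions' step: choose the test whose expected
    posterior entropy over the road-position hypotheses is smallest.

    prior       : list of P(h) over hypotheses
    likelihoods : dict test_name -> list over results of [P(result | h)]
    """
    best_test, best_exp = None, float('inf')
    for name, per_result in likelihoods.items():
        expected = 0.0
        for lik in per_result:                     # one possible outcome
            joint = [l * p for l, p in zip(lik, prior)]
            pz = sum(joint)                        # P(this outcome)
            if pz > 0:
                posterior = [j / pz for j in joint]
                expected += pz * entropy(posterior)
        if expected < best_exp:
            best_test, best_exp = name, expected
    return best_test
```

Iterating this selection, running the chosen matched filter on the image, and updating the posterior yields the alternation between data collection and optimization described in the abstract.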
New Geometric Stochastic Technology for Finding and Recognizing Roads and Their Features in Aerial Images
Abstract
This paper presents an automated approach to finding main roads in aerial images. The approach is to build geometric-probabilistic models for road image generation, using Gibbs distributions. Then, given an image, roads are found by MAP (maximum a posteriori probability) estimation. The MAP estimation is handled by partitioning the image into windows, realizing the estimation in each window through the use of dynamic programming, and then, starting with the windows containing high-confidence estimates, using dynamic programming again to obtain optimal global estimates of the roads present. The approach is model-based from the outset and is completely different from those appearing in the published literature. It produces two boundaries for each road, or four boundaries when a mid-road barrier is present.
Meir Barzohar, David B. Cooper
Road tracing by profile matching and Kalman filtering
Abstract
Road tracing is a promising technique to increase the efficiency of road mapping. In this paper a new road tracing algorithm is presented. Road positions are computed by matching the average grey value profile of a reference road segment with profiles taken from the image. The road parameters are estimated by a recursive Kalman filter. By utilizing the prediction step of the Kalman filter, the road tracer is able to continue following the road despite temporary failures of the profile matching due to road crossings, exits and cars.
George Vosselman, Jurrien de Knecht
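The coasting behaviour described above can be sketched with a minimal one-dimensional Kalman filter tracking the road's lateral offset. A failed profile match is modelled as a missing measurement, during which the filter only predicts; the noise variances q and r are illustrative assumptions, not values from the paper.

```python
def kalman_road_tracer(measurements, q=0.01, r=0.25):
    """Track the road's lateral offset with a 1-D Kalman filter.
    A measurement of None models a failed profile match (crossing,
    exit, car): the filter skips the update and coasts on its
    prediction."""
    x, p = 0.0, 1.0              # state estimate and its variance
    track = []
    for z in measurements:
        p += q                   # predict: offset unchanged, uncertainty grows
        if z is not None:        # update only when profile matching succeeded
            k = p / (p + r)      # Kalman gain
            x += k * (z - x)
            p *= 1.0 - k
        track.append(x)
    return track
```

Because the predicted state carries through missing updates while its variance grows, the tracer keeps following the road across short disturbances but becomes increasingly receptive to the next successful profile match.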
Model-Based Road Extraction from Images
Abstract
In this paper we present an approach to the automatic extraction of roads from aerial images. We argue that a model for road extraction is needed in every step of the image interpretation process. The model needs to include knowledge about different aspects of roads, like geometry, radiometry, topology, and context. The main part of this paper discusses the parts of that knowledge that we have implemented so far. It is shown that roads can be successfully detected at various resolution levels of the same image. Furthermore, we show that combining the results obtained in each level helps to eliminate false hypotheses typical for each level. The approach has been successfully applied to a variety of images.
Carsten Steger, Clemens Glock, Wolfgang Eckstein, Helmut Mayer, Bernd Radig

Map-Based Extraction

Frontmatter
Automatic Extraction and Structuring of Objects from Scanned Topographical Maps — An Alternative to the Extraction from Aerial and Space Images?
Abstract
This paper presents investigations and developments in the area of automatic extraction of cartographic features from scanned topographic maps. It focuses on the recognition and extraction of buildings for the establishment of topographic and cartographic information systems or for the provision of information supporting the automatic extraction of man-made objects from aerial or space images. The described solution combines knowledge-based pattern recognition techniques, raster data processing operations and raster-vector conversion procedures based on robust estimation and constrained adjustment techniques. Following a stringent quality control process the structured results can be exported in a selection of formats. The implemented solution allows largely automatic processing of entire map sheets with very high success rates within a few hours using standard UNIX workstations. Finally, the paper presents results from applications of the technology such as the recognition of buildings for the simulation and design of mobile communication networks for Swiss PTT.
Stephan Nebiker, Alessandro Carosio
Cooperative use of aerial images and maps for the interpretation of urban scenes
Abstract
We present a methodology for the interpretation of aerial images of urban areas. It is based on a careful examination of the most relevant elements used by a human interpreter to understand such complex scenes. These elements are well reflected in a geographical map. For this reason it appears particularly valuable to manage cartographic and pictorial information simultaneously.
Henri Maître, Isabelle Bloch, Henri Moissinac, Christophe Gouinaud
Map—based semantic modeling for the extraction of objects from aerial images
Abstract
Images taken from satellite or airborne platforms usually do not represent isolated information about man’s environment. In most countries, valuable context data are available which may be integrated successfully into the image interpretation procedure. This paper presents the verification phase of a Map Oriented SEmantic image underStanding process (MOSES). It is implemented as a model-driven process in which semantic networks are used as modeling tools. In a three-stage scheme, the models are successively refined, and for image analysis an automatically generated semantic network, specialized in the analysis of the underlying scene, is used. Digitized topographic maps serve as the principal knowledge source.
Franz Quint, Manfred Sties
Backmatter
Metadata
Title
Automatic Extraction of Man-Made Objects from Aerial and Space Images
edited by
Prof. Dr. Armin Gruen
Prof. Dr. Olaf Kuebler
Dr. Peggy Agouris
Copyright year
1995
Publisher
Birkhäuser Basel
Electronic ISBN
978-3-0348-9242-1
Print ISBN
978-3-0348-9958-1
DOI
https://doi.org/10.1007/978-3-0348-9242-1