We present a novel method for 3D reconstruction of urban scenes extending a recently introduced heightmap model. Our model has several advantages for 3D modeling of urban scenes: it naturally enforces vertical surfaces, has no holes, leads to an efficient algorithm, and is compact in size. We remove the major limitation of the heightmap by enabling modeling of overhanging structures. Our method is based on an an
-layer heightmap with each layer representing a surface between full and empty space. The configuration of layers can be computed optimally using a dynamic programming method. Our cost function is derived from probabilistic occupancy, and incorporates the Bayesian Information Criterion (BIC) for selecting the number of layers to use at each pixel. 3D surface models are extracted from the heightmap. We show results from a variety of datasets including Internet photo collections. Our method runs on the GPU and the complete system processes video at 13 Hz.