Location awareness on the Internet and 3D models of our habitat (as produced by Microsoft (Bing) or Google (Google Earth)) are a major driving force for creating 3D models from image data. A key factor for these models is a highly accurate, fully automated stereo matching pipeline that produces highly accurate 3D point clouds. Such fully automatic pipelines are possible because images can be acquired with high redundancy (i.e., a single point is projected into many images). Highly overlapping images also yield highly redundant range images. This paper proposes a novel method to fuse these range images. The proposed method is based on the recently introduced total generalized variation method (
TGV). The second-order variant of this functional is ideally suited for piecewise affine surfaces and is therefore a natural fit for buildings, which can be well approximated by piecewise planar surfaces. In this paper we first present the functional, which consists of a robust data term based on the Huber-L1 norm and the TGV regularization term. We then derive a numerical algorithm based on a primal-dual formulation that can be efficiently implemented on the GPU. We present experimental results on synthetic data as well as on a city-scale data set, where we compare the method to other methods.
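
For orientation, a fusion functional of the kind described above can be sketched as follows. This is only an illustrative form under assumed notation, not necessarily the exact formulation used in the paper: $u$ is the fused range image on the domain $\Omega$, $f_1, \dots, f_K$ are the input range images, $v$ is an auxiliary vector field, $\mathcal{E}(v)$ is its symmetrized derivative, and $\alpha_0$, $\alpha_1$, $\delta$ are positive parameters. All of these symbols are assumptions introduced here.

```latex
\min_{u,\,v}\;
  \alpha_1 \int_\Omega |\nabla u - v| \, dx
  + \alpha_0 \int_\Omega |\mathcal{E}(v)| \, dx
  + \sum_{l=1}^{K} \int_\Omega |u - f_l|_{\delta} \, dx,
\qquad
|t|_{\delta} =
\begin{cases}
  \dfrac{t^2}{2\delta}, & |t| \le \delta, \\[6pt]
  |t| - \dfrac{\delta}{2}, & \text{otherwise.}
\end{cases}
```

In this sketch, the first two integrals form the second-order total generalized variation regularizer and the sum is the robust Huber data term. For an affine $u$, choosing $v = \nabla u$ (a constant field) makes both regularization integrals vanish, which is why the second-order variant does not penalize planar surfaces.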