Matching 2D and 3D articulated shapes using the eccentricity transform

https://doi.org/10.1016/j.cviu.2011.02.006Get rights and content

Abstract

This paper presents a novel method for 2D and 3D shape matching that is insensitive to articulation. It uses the eccentricity transform, which is based on the computation of geodesic distances. Geodesic distances computed over a 2D or 3D shape are articulation insensitive. The eccentricity transform considers the length of the longest geodesics. Histograms of the eccentricity transform characterize the compactness of a shape, in a way insensitive to rotation, scaling, and articulation. To characterize the structure of a shape, a histogram of the connected components of the level-sets of the transform is used. These two histograms make up a highly compact descriptor and the resulting method for shape matching is straightforward. Experimental results on established 2D and 3D benchmarks show results similar to more complex state of the art methods, especially when considering articulation. The connection between the geometrical modification of a shape and the corresponding impact on its histogram representation is explained. The influence of the number of bins in the two histograms and the respective importance of each histogram is studied in detail.

Highlights

► We propose a method for matching 2D and 3D shapes. ► The method builds on the computation of longest geodesic distances in a shape. ► A shape is described by two histograms: one for compactness and one for structure. ► Competitive results are achieved for connected shapes undergoing articulation.

Introduction

The recent increase in available 3D models and acquisition systems has created the need for efficient retrieval of stored models, making 3D shape matching gain attention also outside the computer vision community. Together with its 2D counterpart, 3D shape matching is useful for identification and retrieval in classical vision tasks, but can also be found in Computer Aided Design/Computer Aided Manufacturing (CAD/CAM), virtual reality (VR), medicine, molecular biology, security, and entertainment [1].

Shape matching requires to set up a signature that characterizes the properties of interest for the recognition [2]. Depending on the task, the invariance of this signature to local deformations such as articulation is important for the identification of 2D and 3D shapes. Matching can then be carried out over this (usually lower dimensional) space of signatures.

Most shape descriptors are computed over a transformed domain that amplifies the important features of the shape while throwing away ambiguities such as translation, rotation or local deformations. For 2D shapes, the Fourier transform of the boundary curve [3] is an example of such a transformed-domain descriptor adapted to smooth shapes. Shape transformations computed with geodesic distances [4], [5] lead to signatures invariant to isometric deformations such as bending or articulation. To capture salient features of 2D shapes, local quantities such as curvature [6] or shape contexts [7] can be computed. They can be extended to bending invariant signatures using geodesic distances [8], [9]. Another possibility is to represent a shape as a collection of modestly overlapping disk components, which contain both local geometric and structural information [10]. More global features include the Laplace spectra [11] and the skeleton [12]. Contour flexibility [13] is a novel shape descriptor of planar contours, which obtains both local and global features from a contour. It represents the deformable potential at each point along a contour. The rolling penetrate descriptor [14] combines the advantages of contour-based and region-based methods, and provides a unified scheme to handle various shapes, geometrical transforms, noise, distortion and occlusion. Some transformations involve the computation of a function defined on the shape, for instance the solution to a linear partial differential equation [15] or geometric quantities [16].

Among approaches matching 3D shapes, existing methods can be divided into [1]: statistical descriptors, like for example geometric 3D moments employed by [17], [18], 3D moment invariants [19], and the shape distribution [16], [20]. A novel shape representation based on [19] is 3D gray level moment invariants [21], which are independent of translation, scaling and rotation. Extension-based descriptors are calculated from features sampled along certain directions from a position within the shape [22], [23]. Volume-based descriptors use the volumetric representation of a 3D shape to extract features (examples are Shape histograms [24], Model Voxelization [25], and point set methods [26]). The 3D shape impact descriptor [27] considers a 3D object as a distributed 3D mass and it is indirectly computed from the resulting fields. The field is described using both Newton’s and general relativity laws. In [28], 3D shapes are sampled using a technique based on critical points of the eigenfunctions of the Laplace-Beltrami operator. A point-based statistical descriptor is used that incorporates an approximation of the geodesic shape distribution and other geometric information describing the surface at that point. Matching is carried out using Bipartite graph matching. Descriptors using the surface geometry compute curvature measures and/or the distribution of surface normal vectors [29], [30]. Image-based descriptors reduce the problem of 3D shape matching to an image similarity problem by comparing 2D projections of the 3D shapes [31], [32], [33]. Reeb graphs have been used to match the topology of two shapes [34], [35]. Skeletons are intuitive shape descriptions and can be obtained from a 3D shape by applying a thinning algorithm on the voxelization of a solid object like in [36]. Descriptors using spin images work with a set of 2D histograms of the shape geometry and a search for point-to-point correspondences is done to match 3D objects [37]. In [38], 3D shapes are automatically decomposed into parts using topological features of the Laplace-Beltrami Eigenfunctions. The parts of near isometric shapes are registered to each other.

A generic class of approaches for 2D and 3D shape retrieval is to define a metric between pairs of points on the shape, and compare either directly these metric spaces or compare features extracted from these spaces. An important goal of this metric space design is to make the shape retrieval more or less invariant to bending and articulations. The Gromov-Hausdorff framework directly compares metric spaces [39], and can be used for shape retrieval in conjunction with geodesic metric spaces [4] or diffusion spaces that are more robust to topological noise [40], [41]. Associated to the diffusion metric, it is possible to define a metric using a dimensionality reduction within a few eigenvectors of the Laplacian [42]. To speed-up retrieval applications, one can consider low dimensional features extracted from these metric spaces. This can be for instance the set of geodesic distances between critical points of the Laplace eigenfunction [28], local distributions of geodesic curves [8], statistical moments of a diffusion distance [15]. A popular class of approaches considers histograms, such as the distribution of Euclidean distances [16], of the mean geodesic distance [5] or the maximum geodesic distance [43], [44], or bags of features [45]. This article elaborates on this idea of building compact geodesic descriptors using two different kinds of histograms to represent faithfully the distribution of geodesic distances.

Considering the underlying transform on which our descriptor is built, probably the most similar works in shape matching are [5], [34], where instead of the length of the geodesic to the point furthest away (this work), the mean or the sum over all points are considered. Distance based transforms are also used in [15], [12], [46], where the length of random or shortest paths to an (existing) boundary are computed. Describing a shape as a normalized histogram of the values of a function at all points of the shape, as in the case of the normalized ECC histogram in this paper, is conceptually similar to the shape distributions in [16], where different functions and histogram comparison methods are considered.

In [8] a model of articulated objects is presented. It is defined as a union of (rigid) parts Oi and joints (named ‘junctions’ by the authors). An articulation is defined as a transformation that is rigid when limited to any part Oi, but can be non-rigid on the junctions. An articulated instance of an object is an articulated object itself (actually the same object) that can be articulated back to the original one. The term articulated shape refers to the shape of an articulated object in a certain pose. In the context of shape matching the concept of articulated shape means that shapes that belong to articulations of the same object, belong to the same class. Assuming that the size of the junctions is very small compared to the size of the parts Oi, it is shown that the variation of the geodesic distance1 during articulation is small and that geodesic distances are articulation insensitive.

Similar to the method in [8], our method does not explicitly involve any part models. In [8] the part based articulation model is used to support the analysis of the properties of the geodesic distance under articulation. We have briefly recalled it to aid the definition of an articulated shape. Finding the correspondences between all the parts of two shapes is an NP-complete problem in graph theory (known also as the ‘matching’ of two graphs) and requires the correct decomposition of the unknown object into parts. A one-to-one correspondence (bijection) for all parts is not always possible as some parts might be missing (e.g. due to segmentation errors).

The eccentricity transform of a shape associates to each of its points the distance to the point furthest away. It is based on the computation of geodesic distances and thus robust with respect to articulation. It is robust against minor segmentation errors, and salt and pepper like noise [47], and stable in the presence of holes (i.e. it is defined in the same way for shapes with and without holes, and does not require pre-selection of for e.g. a single closed boundary to be processed).

In a common framework, we propose histograms built on the eccentricity transform as a descriptor for 2D and 3D shape matching. The descriptor consists of two histograms: the ECC histogram h which is the normalized histogram of the eccentricity transform of the shape, and the ECC structure histogram s which is the histogram of the number of connected components of the level-sets of the eccentricity transform. The descriptor is invariant to changes in orientation, scale, and articulation. It requires only a simple representation and can be efficiently matched.We present an in-depth study of the properties of the approach (relation between shape and descriptor, parameters), supported by experimental results, and an analysis of the results and possibilities for improvement. The descriptor is computed on four different domains using geodesic distances: inside 2D shapes, inside the volume, inside the border voxels, or on the surface meshes of 3D shapes, and compared to state of the art methods. These numerical results support that the proposed approach performs similarly and in some case better than the state of the art on databases of articulated 2D and 3D shapes. Initial results using only the ECC histogram h have been presented for 2D in [43], and for 3D (volumetric representation only) in [44].

To the best of our knowledge, this is the first approach applying the eccentricity transform to the problem of shape matching.

The paper is organized as follows: Section 2 recalls the eccentricity transform and discusses used variants and computation. Section 3 explains the proposed matching method and discusses pros and cons of the descriptor (Section 3.4). In Section 4 experiments, and the effect of the parameters are presented, followed by future work (Section 5) and conclusion (Section 6). The Appendix recalls the algorithm used to compute the eccentricity transform.

Section snippets

Eccentricity transform

The following definitions and properties follow [47], [48], and are extended to n-dimensional domains.

Let the shape S be a closed set in Rn. A path π in S is the continuous mapping from the interval [0,1] to S. Let Π(p1, p2) be the set of all paths between two points p1,p2S, within the set S.

The geodesic distance d(p1, p2) between two points p1,p2S is defined as the length λ(π) of the shortest path π  Π(p1, p2)d(p1,p2)=min{λ(π)|πΠ(p1,p2)},where the length λ(π) isλ(π(t))=01π˙(t)dt,π(t) is a

Eccentricity histogram matching

To match two shapes we first create a shape descriptor for each of them and then match these descriptors to obtain a similarity measure. The proposed shape descriptor is made out of two components: the first one characterizes the compactness (geometry) of the shape (Section 3.1) and the second one its structure (Section 3.2).

The descriptor is highly compact, which is an advantage for real time retrieving and low memory devices, it is invariant under many natural deformations, it can handle

Matching experiments in 2D and 3D

This section shows results on popular benchmarks and comparison with state of the art methods. When comparing the results, keep in mind that the proposed method is simple and matching the computed descriptors is fast. An approximation of the ECC can be computed for many shapes with as few as 50 distance propagations (e.g. the average number for the ECCmesh on the McGill database is 54), and determining δ between two computed descriptors (L2-norm) has practically no CPU time consumption.

Potential extensions

Topological changes in the shape. Geodesic distances are sensitive to changes in the topology of the shape. For example, it is enough to touch the index fingers of two hands in one point to drastically change the geodesic distance between the points of the two palms: before we had to go over the arms and torso, now we only have to travel over two fingers. This problem has been approached in [42], [41] by using diffusion distance as an alternative to geodesic distance. Diffusion distances

Conclusion

We have presented a method for matching 2D and 3D shapes. The method is based on the eccentricity transform, which uses maximal geodesic distances and is insensitive to articulation. Descriptors are composed of two terms: a normalized histogram of the eccentricity transform and a histogram of the connected components of the level-sets of the eccentricity transform. They characterize the compactness and structure of the shape, are compact and easy to match. The method is straightforward but

Acknowledgments

The presented work was partially supported by the Austrian Science Fund under Grants S9103-N13 and P18716-N13. Adrian Ion was supported, in part, by the European Commission, under Project MCEXT-025481.

References (75)

  • S. Belongie et al.

    Shape matching and object recognition using shape contexts

    IEEE Transactions on Pattern Analysis and Machine Intelligence

    (2002)
  • H. Ling et al.

    Shape classification using the inner-distance

    IEEE Transactions on Pattern Analysis and Machine Intelligence

    (2007)
  • K. Nasreddine, A. Benzinou, R. Fablet, Shape geodesics for boundary-based object recognition and retrieval, 2009, pp....
  • J. Xu, Shape matching using morphological structural shape components, 2008, pp....
  • M. Reuter et al.

    Laplace-spectra as fingerprints for shape matching

  • K. Siddiqi et al.

    Shock graphs and shape matching

    International Journal of Computer Vision

    (1999)
  • C. Xu et al.

    2d shape matching by contour flexibility, Pattern Analysis and Machine Intelligence

    IEEE Transactions on

    (2009)
  • L. Gorelick, M. Galun, E. Sharon, R. Basri, A. Brandt, Shape representation and classification using the poisson...
  • R. Osada et al.

    Shape distributions

    ACM Transactions on Graphics

    (2002)
  • M. Elad et al.

    Content based retrieval of vrml objects: an iterative and interactive approach

  • F.A. Sadjadi et al.

    Three-dimensional moment invariants, Pattern Analysis and Machine Intelligence

    IEEE Transactions on PAMI-2

    (1980)
  • C.Y. Ip et al.

    Automated learning of model classifications

  • K. Guo, M. Li, 3D gray level moment invariants: a novel shape representation, In: Chinese Conference on Pattern...
  • D. Vranic et al.

    Description of 3D-shape using a complex function on the sphere

  • D. Vranic, D. Saupe, 3D model retrieval with spherical harmonics and moments, in: 23rd DAGM-Symposium on Pattern...
  • M. Ankerst et al.

    3D shape histograms for similarity search and classification in spatial databases

  • D. Vranic, D. Saupe, 3D shape descriptor based on 3D fourier transform, in: EURASIP Conference on Digital Signal...
  • J.W.H. Tangelder et al.

    Polyhedral model retrieval using weighted point sets

  • M.R. Ruggeri et al.

    Spectral-driven isometry-invariant matching of 3D shapes

    International Journal of Computer Vision

    (2009)
  • E. Paquet et al.

    Nefertiti: A tool for 3-D shape databases management

    SAE transactions

    (1999)
  • T. Zaharia, F. Preux, Three-dimensional shape-based retrieval within the mpeg-7 framework, in: SPIE Conference on...
  • T. Ansary, J.-P. Vandeborre, S. Mahmoudi, M. Daoudi, A bayesian framework for 3D models retrieval based on...
  • C.M. Cyr et al.

    A similarity-based aspect-graph approach to 3D object recognition

    International Journal of Computer Vision

    (2004)
  • D.-Y. Chen, M. Ouhyoung, X.-P. Tian, Y.-T. Shen, M. Ouhyoung, On visual similarity based 3D model retrieval, in:...
  • M. Hilaga et al.

    Topology matching for fully automatic similarity estimation of 3D shapes

  • Y. Shinagawa et al.

    Surface coding based on morse theory, Computer Graphics and Applications

    IEEE

    (1991)
  • H. Sundar et al.

    Skeleton based shape matching and retrieval

    Shape Modeling International

    (2003)
  • Cited by (33)

    • Contour segment grouping for object detection

      2017, Journal of Visual Communication and Image Representation
    • VIV: Using visible internal volume to compute junction-aware shape descriptor of 3D articulated models

      2016, Neurocomputing
      Citation Excerpt :

      Non-rigid shape analysis has been receiving a growing amount of attention in many applications in computer vision, computer graphics, pattern recognition and molecular structure analysis [1,2]. A simplified approach to non-rigid shape analysis is based on articulated shapes [1,3–6], which assumes that the non-rigid shape consists a set of rigid parts connected by flexible junctions (or named joints/hinges [1,7]). Each of rigid parts has a certain degree of freedom to move, and junctions are relatively small compared to the parts they connect.

    • Distance interior ratio: A new shape signature for 2D shape retrieval

      2016, Pattern Recognition Letters
      Citation Excerpt :

      Methods such as thin-plate spline (TPS) [2] and dynamic programming (DP) [15] are applied in order to obtain the optimum assignment of correspondence points of the two objects. Recent local descriptors considered both shape contour and region that are insensitive to articulated shape [11] and occluded shape [13]. Both methods apply the geodesic distance in order to find the shortest path inside the shape region to represent a shape’s geometric structure.

    View all citing articles on Scopus
    View full text