Bag of contour fragments for robust shape classification

doi:10.1016/j.patcog.2013.12.008

Pattern Recognition

Volume 47, Issue 6, June 2014, Pages 2116-2125

https://doi.org/10.1016/j.patcog.2013.12.008 Get rights and content

Highlights

•
A new shape representation is proposed by encoding contour fragments in shape.
•
The proposed shape representation is compact yet informative.
•
The proposed shape representation is robust to shape deformation and conclusion.
•
We obtain the state-of-the-art shape classification performance on several bench-mark datasets.

Abstract

Shape representation is a fundamental problem in computer vision. Current approaches to shape representation mainly focus on designing low-level shape descriptors which are robust to rotation, scaling and deformation of shapes. In this paper, we focus on mid-level modeling of shape representation. We develop a new shape representation called Bag of Contour Fragments (BCF) inspired by classical Bag of Words (BoW) model. In BCF, a shape is decomposed into contour fragments each of which is then individually described using a shape descriptor, e.g., the Shape Context descriptor, and encoded into a shape code. Finally, a compact shape representation is built by pooling shape codes in the shape. Shape classification with BCF only requires an efficient linear SVM classifier. In our experiments, we fully study the characteristics of BCF, show that BCF achieves the state-of-the-art performance on several well-known shape benchmarks, and can be applied to real image classification problem.

Introduction

Shape is an intrinsic feature for image understanding, which is stable to illumination and variations in object color and texture. Because of these advantages, shape is widely considered for object recognition. In particular, with the recent advance in contour detection proposed by Arbelaez et al. in [1], shape based object recognition in natural image is becoming more practical and attracts more attention in computer vision community. Main challenges in shape based object recognition include deformation, occlusion and viewpoint variation of objects. Various shape descriptors have been proposed to address these challenges, e.g., [2], [3], [4], [5]. Shape based object recognition is usually considered as a classification problem. Given a set of training shapes and category label of each training shape, we need to determine which category a testing shape belongs to. Traditional shape classification methods are usually based on matching shape descriptors from two different shapes: for every training shape, we find correspondences between its shape descriptors and the shape descriptors in the testing shape using matching algorithms, such as Hungarian algorithm, dynamic programming algorithm; then we compute matching costs according to the matching results; finally, we rank training shapes based on the matching costs and classify the testing shape using the nearest neighbor (NN) classifier. This exemplar-based shape classification strategy has been widely used, for example, in [2], [3], [6]. However, it has its own limitations. With few training samples, it is difficult to capture the large intra-class variation using these algorithms. For large training samples, it is extremely time consuming to perform shape matching one-by-one.

Different from exemplar-based shape matching, in this paper, we propose a compact shape representation and handle the large intra-class variation by discriminative learning. Inspired by the huge progress in image classification and representation with Bag-of-Words (BoW) [7], [8], we decompose shape into contour fragments and quantize the contour fragments into shape codes. The contour fragments under different scales contain both local and global shape information which can be encoded utilizing coding strategies for local descriptors [9], [10]. Then, a statistical histogram of shape codes is used to represent each shape and similarity of shapes can be directly computed from these histograms. Matching shapes based on this new shape representation does not explicitly give correspondences between contour fragments. But using a classifier for shape classification is much more efficient than using the typical matching algorithms such as Hungarian, thin plate spline (TPS), dynamic programming, dynamic time warping, and so on. In fact, BoW model is a natural solution for finding correspondences between two sets of features and can be used efficiently for recognition tasks. However, it has seldom been successfully applied to shape analysis, since the popular image descriptors such as SIFT [11] and LBP [12] are mainly designed for describing the local texture/appearance variations. These image features are not good at capturing the intrinsic structure in shape. Toward this end, we directly work on shape contour by decomposing it into contour fragments. We name our method Bag of Contour Fragments (BCF), which can not only provide a compact and informative representation, but also achieve the state-of-the-art classification performance on several popular shape benchmarks.

Pipeline of building shape representation in BCF is shown in Fig. 1. The outer contour of each shape is decomposed into salient contour fragments using a well-known contour decomposition method named discrete contour evolution (DCE) [13]. Each contour fragment is then described by collecting the shape context features [2] on its reference points, and encoded into shape codes. Finally, the shape codes are pooled into a compact image representation with spatial pyramid. We utilize the current advances in image classification, such as local-constrained linear coding (LLC) [9] for feature coding and spatial pyramid matching (SPM) [14] in our BCF shape classification framework. Both LLC and SPM are seldom used in shape analysis. LLC utilizes the locality constraints and encodes each descriptor with its local-coordinate system in a codebook. In practice, it first performs k-nearest-neighbor search to find local-coordinates for feature to be encoded, and then solves a constrained least square fitting problem on the local-coordinates. The state-of-the-art performance on PASCAL VOC [15] image classification has shown effectiveness of LLC. SPM is a simple and computationally efficient extension of the orderless BoW model for image representation. It works by partitioning the image into increasingly fine sub-regions and computing histograms of local features found inside each sub-region. Histograms of different sub-regions are concatenated as final image representation. SPM can capture the spatial information in contour fragments which are useful for shape recognition. BCF naturally utilizes LLC and SPM to improve the accuracy of shape classification.

One of the major difficulties involved in shape classification for many shape-matching based algorithms is to directly match two shapes with large deformation since shapes are only partially similar to each other. BCF can easily solve this problem caused by large shape deformation, and is good at classifying shapes with partial similarity. As each shape contour is divided into contour fragments in BCF, the contour fragments contain partial shape information. After coding, a discriminative classifier such as SVM or Adaboost can be used to select the representative and informative contour parts for each shape category. Fig. 4 shows some contour fragments selected by linear SVM in four shape categories in our experiments. We can see that even though contour fragments are parts of the shapes, they are very informative for recognizing shape category. Thus, BCF is able to deal with partial occlusion in shape, especially, in the edge map extracted from real image. Besides, we find that BCF is also robust to noisy contour in our experiments.

In summary, the proposed BCF has several good properties:

1.
It provides a very compact shape representation which is a single vector rather than a set of feature vectors used in many other methods.
2.
It precisely preserves information of individual shape contour via LLC and spatial layout of contour fragments in one shape via SPM.
3.
For shape classification it avoids pairwise matching between local shape descriptors and significantly reduces the time cost.
4.
It is robust to the shapes with occlusions or parts missing, and can be easily applied to real image classification.

The rest of the paper is organized as follows. We review the related works in Section 2. Then, we introduce the details of our shape representation with BCF in Section 3, including extracting, encoding and pooling contour fragments, and so on. We evaluate the proposed method on several popular shape benchmarks, illustrate good properties of BCF in applications, and demonstrate its effectiveness in shape classification in various datasets in Section 4. Finally, we conclude this paper in Section 5.

Section snippets

Related work

Here, we briefly review the recent progress in shape classification. Sun and Super [16] proposed a shape classification framework for recognizing contour shapes using class contour segments as input features with Bayesian classifier. Bai et al. [6] adopted contour segments and skeleton paths as the input features for shape classification with a Gaussian mixture model. Daliri and Torre [17], [18] transformed the contour points into a symbol representation, and then used the edit distance between

Bag of contour fragments

In this section, given a shape S, we show how to build BCF shape representation $f (S)$ for S and use $f (S)$ for shape classification step by step.

Experiments

In this section we test our method for shape classification on a variety of shape datasets and compare results of our method with the state-of-the-art shape classification approaches, available for those datasets in the literature. We also study the robustness of our method. First of all, we give implementation details as following.¹

Conclusions

In this paper, we present a novel shape representation called BCF for shape classification. To the best of our knowledge, this is the first paper that introduces the idea of BoW together with LLC and SPM for shape representation. Since BCF is a part-based model, it is intrinsically robust to occlusion and deformation of shape. In the experiments, we have extensively tested the performance of BCF; all these experimental results on shape benchmarks show that BCF is able to achieve the

Conflict of interest

None declared.

Acknowledgements

We would like to thank the anonymous reviewer for their helpful suggestions. This work was supported by National Natural Science Foundation of China (NSFC) Grants 61222308 and 61173120, the Program for New Century Excellent Talents in University in China, National Science Foundation (NSF) Grants OIA-1027897 and IIS-1302164, and Fundamental Research Funds for the Central Universities (HUST 2013TS115). We thank Mr. Ho Simon Wang for assisting to improve the language of this paper.

References (60)

N. Alajlan et al.
Shape retrieval using triangle-area representation and dynamic space warping
Pattern Recognit.
(2007)
L. Latecki et al.
Convexity rule for shape decomposition based on discrete contour evolution
Comput. Vis. Image Underst.
(1999)
M. Daliri et al.
Robust symbolic representation for shape recognition and retrieval
Pattern Recognit.
(2008)
M. Daliri et al.
Shape recognition based on kernel-edit distance
Comput. Vis. Image Underst.
(2010)
A. Erdem et al.
A similarity-based approach for shape classification using Aslan skeletons
Pattern Recognit. Lett.
(2010)
A. Torsello et al.
A skeletal measure of 2d shape similarity
Comput. Vis. Image Underst.
(2004)
E. Baseski et al.
Dissimilarity between two skeletal trees in a context
Pattern Recognit.
(2009)
W. Kim et al.
A region-based shape descriptor using Zernike moments
Signal Process.Image Commun.
(2000)
S. Lewin et al.
Perceptually motivated shape evolution with shape-preserving property
Pattern Recognit. Lett.
(2010)
X. Bai et al.
Detection and recognition of contour parts based on shape similarity
Pattern Recognit.
(2008)

E. Attalla et al.

Robust shape similarity retrieval based on contour segmentation polygonal multiresolution and elastic matching

Pattern Recognit.

(2005)

P. Arbelaez et al.

Contour detection and hierarchical image segmentation

IEEE Trans. Pattern Anal. Mach. Intell.

(2011)

S. Belongie et al.

Shape matching and object recognition using shape contexts

IEEE Trans. Pattern Anal. Mach. Intell.

(2002)

H. Ling et al.

Shape classification using the inner-distance

IEEE Trans. Pattern Anal. Mach. Intell.

(2007)

F. Mokhtarian et al.

Efficient and robust retrieval by shape content through curvature scale space

Ser. Softw. Eng. Knowl. Eng.

(1997)

X. Bai, W. Liu, Z. Tu, Integrating contour and skeleton for shape classification, in: International Conference on...

J. Sivic, A. Zisserman, Video google: a text retrieval approach to object matching in videos, in: International...

G. Csurka, C. Dance, L. Fan, J. Willamowski, C. Bray, Visual categorization with bags of keypoints, in: Workshop on...

J. Wang, J. Yang, K. Yu, F. Lv, T. Huang, Y. Gong, Locality-constrained linear coding for image classification, in:...

F. Perronnin, J. Sánchez, T. Mensink, Improving the fisher kernel for large-scale image classification, in: Europe...

D. Lowe

Distinctive image features from scale-invariant keypoints

Int. J. Comput. Vis.

(2004)

T. Ahonen et al.

Face description with local binary patternsapplication to face recognition

IEEE Trans. Pattern Anal. Mach. Intell.

(2006)

S. Lazebnik, C. Schmid, J. Ponce, Beyond bags of features: spatial pyramid matching for recognizing natural scene...

M. Everingham et al.

The Pascal visual object classes (voc) challenge

Int. J. Comput. Vis.

(2010)

K. Sun, B. Super, Classification of contour shapes using class segment sets, in: IEEE Conference on Computer Vision and...

B. Wang, W. Shen, W. Liu, X.-G. You, X. Bai, Shape classification using tree-unions, in: International Conference on...

A. Torsello et al.

Learning shape-classes using a mixture of tree-unions

IEEE Trans. Pattern Anal. Mach. Intell.

(2006)

K. Siddiqi et al.

Shock graphs and shape matching

Int. J. Comput. Vis.

(1999)

X. Bai et al.

Path similarity skeleton graph matching

IEEE Trans. Pattern Anal. Mach. Intell.

(2008)

D. Zhang, G. Lu, Generic fourier descriptor for shape-based image retrieval, in: IEEE International Conference on...

Cited by (169)

Shape classification using a new shape descriptor and multi-view learning
2024, Displays
Shape classification is considered as a vital task in solving many computer vision problems. Different factors such as affine transformations, scaling, rotations, variation in perspective, noise and occlusion have made the shape classification problem to be a hard problem. This work investigates a new shape descriptor that extracts different features from each boundary pixel. This makes it to be more informative and discriminant in comparison with other descriptors. After feature extraction, the “bag-of-features (BoF)” model is employed to construct the final representation for each image. To enhance the functionality of the BoF model, a novel codebook generation approach is presented. The proposed approach tends to derive a more meaningful visual codebook. Consequently, the produced feature vectors can handle inter- and intra-class variations more effectively. Comprehensive experiments conducted on the various complicated shape datasets show the supremacy of our approach compared to other methods.
Octagonal lattice-based triangulated shape descriptor engaging second-order derivatives supplementing image retrieval
2024, Journal of Visual Communication and Image Representation
Erstwhile shape description schemes lack primarily in establishing trade-offs with accuracy and computational load. Accordingly, a lightweight shape descriptor offering precise definition and compaction of high-frequency features is contributed in this paper using a simple geometrical shape for localization and shape characterization. Initially, the input image is octagonally tessellated and triangularly decomposed into sub-regions whose side-wise differences are evaluated and subjected to second-order differentiation to produce three high-frequency values representing triangle corners. The resultant is processed by the law of sines to yield localized shape features exhibiting congruence and is reiterated on the residual regions, followed by a novel octal encoding scheme encompassing maximal variations in the localized regions. The resulting features are globally fabricated into shape histograms in a non-overlapping manner representing the shape vector. This scheme validated on widely popular benchmark shape datasets demonstrates superior retrieval and recognition accuracies greater than 93% which is lacking in its competitors.
An algorithm for extracting similar segments of moving target trajectories based on shape matching
2024, Engineering Applications of Artificial Intelligence
Trajectory similarity analysis of moving target is the foundation for mining high-value and regular behavioral information such as motion preferences, activity hotspots and frequent paths. Unlike most trajectory similarity analysis methods aimed at discovering correlations of target activities in time, space or spatio-temporal domains, this paper focuses on the shape matching of target trajectories. If some specific shapes frequently appear in historical trajectories, extracting these local shapes would be beneficial for analyzing the target motion templates and behavior modes. Trajectory segments with similar shapes may not have spatio-temporal correlation, and the shapes also have geometric transformation characteristics such as rotation, scaling and translation. Since the existing trajectory similarity analysis methods cannot be directly applied, an algorithm for extracting similar segments based on shape matching is proposed. First, a new shape descriptor based on signed barycenter distance (SBD) is established. It describes a trajectory as a one-dimensional shape feature sequence, which has the advantage of low computational complexity. Then, the distributed nearest neighbor search strategy is used in the particle swarm optimization (PSO) method, which aims to accelerate the retrieval of trajectory segments with similar shapes and improve the matching accuracy. Experiments on MPEG-7, handwritten character and maneuvering target simulation trajectory data sets show that compared with the existing typical shape descriptors, SBD shape descriptor has advantages in accuracy and noise insensitivity, and the improved PSO method can efficiently and accurately obtain the local shape matching results.
Capacitive empirical risk function-based bag-of-words and pattern classification processes
2023, Pattern Recognition
This paper proposes capacitive bag-of-words (Cap BoW) modeling and capacitive pattern classification processes for improved generalization ability. The Cap BoW and capacitive pattern classification are realized by introducing the notion of the capacitive empirical risk function (CERF). The CERF is used as a cost function to build a capacitive pattern classification process. The resulted capacitive pattern classification is used in the classification stage of the bag-of-words process, realizing thereby the Cap BoW process. The use of the CERF in building the capacitive pattern classification and Cap BoW processes is demonstrated to achieve simultaneous reduction to the empirical risk function (ERF) and confidence interval, reducing potential overfitting and enhancing the generalization ability of considered models. To have a thorough evaluation, two groups of experiments are carried out. The first group addresses the capacitive pattern classification process for 4 datasets. The second group addresses a more holistic impact of the CERF by using Cap Bow model to build capacitive dual ergodicity limits-based bag-of-words (Cap DEL-BoW) process, which is applied to 5 image datasets. In both groups of experiments, remarkable enhancement is demonstrated with the use of CERF-based pattern classification and Cap BoW processes. Comparison with corresponding conventional non-capacitive models demonstrates the tangible enhancement with the use of CERF-based models.
Assessing polygonal approximations: A new measurement and a comparative study
2023, Pattern Recognition
Two proposals related to the evaluation of polygonal approximations are presented in this document. First, a new measurement, called normalized compression ratio and adjustment error ( $NCA$ ), to provide a fair evaluation of the performance of the polygonal approximations of 2D closed curves is proposed. Second, a new methodology for evaluation of measurements for assessing polygonal approximations is also proposed. This methodology is based on the optimal quality curve concept, which can characterize the performance of the measurements. A simple visual analysis of the optimal quality curve allows possible drawbacks or weaknesses of the measurement to be detected. The new evaluation methodology is used to compare the performance of the proposed $NCA$ and the most popular measurements, such as Rosin’s Merit, $FOM$ or versions of $FOM$ . Experiments show that $NCA$ obtains the best results and, therefore, may be used to fairly evaluate the performance of polygonal approximations.
Multi-level contour combination features for shape recognition
2023, Computer Vision and Image Understanding
We present a novel multi-level contour combination feature for shape recognition. This combination feature effectively solves large intra-class changes and nonlinear deformations of object shapes, thereby enhancing the performance of shape recognition. First, we divide the shape contour into two levels: the sampling points and the contour fragments, where sampling points are used to describe the detailed information of a shape and contour fragments are used to represent the global feature of a shape. Second, we employ the Fisher vector (FV) approach to encode the local sampling point feature and contour fragment feature as high-level characteristics. Finally, we combine the high-level characteristics after FV encoding and perform shape recognition through a linear support vector machine (SVM) model. The proposed method has been assessed on three benchmark shape datasets, including the Animal, MPEG-7,and ETH-80 datasets. Our method achieves 92.70%, 99.26% and 98.32% classification accuracy on the Animal, MPEG-7, and ETH-80 datasets, respectively. In addition, our method can also be applied to the classification of objects in real-word scenes. We combine the Weizmann Horse and the ETHZ Cow real-world scene datasets, and our method achieves 99.25% classification accuracy on the combined dataset. The recognition results of our approach are better than prior state-of-the-art shape recognition methods, which demonstrate the effectiveness and superiority of our approach.

View all citing articles on Scopus

Xinggang Wang received the B.E. degree in electronic information engineering from Huazhong University of Technology and Science, Wuhan, China, in 2009. He is currently working toward the Ph.D. degree in Department of Electronic Information Engineering from Huazhong University of Technology and Science. From May 2010 to July 2011, he was with the Department of Computer and Information Science, Temple University, Philadelphia, PA., as a visiting scholar. From February 2013 to present, he is with the University of California, Los Angeles, as a visiting graduate researcher. He is a reviewer of Pattern Recognition. His research interests include object recognition and shape analysis in computer vision, and machine learning.

Bin Feng received the B.S. and Ph.D. degrees from Huazhong University of Science and Technology (HUST), Wuhan, China, in 2001 and 2006, respectively, all in electronics and information engineering. He is currently an Associate Professor with the Department of Electronics and Information Engineering, HUST. From February 2009 to January 2010, he was with the Department of Electronics and Computer Science, Hongkong University of Science and Technology (HKUST) as a visiting researcher. His research interests include computer vision and intelligent video processing.

Xiang Bai received the B.S., M.S., and Ph.D. degrees from Huazhong University of Science and Technology (HUST), Wuhan, China, in 2003, 2005, and 2009, respectively, all in electronics and information engineering. He is currently an Associate Professor with the Department of Electronics and Information Engineering, HUST. From January 2006 to May 2007, he was with the Department of Computer and Information Science, Temple University, Philadelphia, PA. From October 2007 to October 2008, he was with the University of California, Los Angeles, as a joint Ph.D. student. His research interests include computer graphics, computer vision, and pattern recognition.

Wenyu Liu received the B.S. degree in computer science from Tsinghua University, Beijing, China, in 1986 and the M.S. and Ph.D. degrees in electronics and information engineering from Huazhong University of Science and Technology (HUST), Wuhan, China, in 1991 and 2001, respectively. He is currently a Professor and Associate Dean of the Department of Electronics and Information Engineering, HUST. His current research interests include multimedia information processing and computer vision.

Longin Jan Latecki received the Ph.D. degree in computer science from Hamburg University, Germany, in 1992. He is a professor of computer science at Temple University, Philadelphia. His main research interests include shape representation and similarity, object detection and recognition in images, robot perception, machine learning, and digital geometry. He has published 200 research papers and books. He is an editorial board member of Pattern Recognition and the International Journal of Mathematical Imaging. He received the annual Pattern Recognition Society Award, together with Azriel Rosenfeld, for the best article published in the journal Pattern Recognition in 1998. He is the recipient of the 2000 Olympus Prize, the main annual award from the German Society for Pattern Recognition (DAGM). He is a senior member of the IEEE.

View full text

Bag of contour fragments for robust shape classification

Highlights

Abstract

Introduction

Section snippets

Related work

Bag of contour fragments

Experiments

Conclusions

Conflict of interest

Acknowledgements

Pattern Recognit.

Comput. Vis. Image Underst.

Pattern Recognit.

Comput. Vis. Image Underst.

Pattern Recognit. Lett.

Comput. Vis. Image Underst.

Pattern Recognit.

Signal Process.Image Commun.

Pattern Recognit. Lett.

Pattern Recognit.

Pattern Recognit.

Contour detection and hierarchical image segmentation

IEEE Trans. Pattern Anal. Mach. Intell.

Shape matching and object recognition using shape contexts

IEEE Trans. Pattern Anal. Mach. Intell.

Shape classification using the inner-distance

IEEE Trans. Pattern Anal. Mach. Intell.

Efficient and robust retrieval by shape content through curvature scale space

Ser. Softw. Eng. Knowl. Eng.

Distinctive image features from scale-invariant keypoints

Int. J. Comput. Vis.

Face description with local binary patternsapplication to face recognition

IEEE Trans. Pattern Anal. Mach. Intell.

The Pascal visual object classes (voc) challenge

Int. J. Comput. Vis.

Learning shape-classes using a mixture of tree-unions

IEEE Trans. Pattern Anal. Mach. Intell.

Shock graphs and shape matching

Int. J. Comput. Vis.

Path similarity skeleton graph matching

IEEE Trans. Pattern Anal. Mach. Intell.