Elsevier

Pattern Recognition

Volume 44, Issue 5, May 2011, Pages 1089-1105

Features extraction from hand images based on new detection operators

https://doi.org/10.1016/j.patcog.2010.08.007

Abstract

Extracting human hand shape features from image frame sequences is a key step in 2D/3D human hand tracking systems and human hand shape recognition systems. To meet the needs of real-time hand tracking, this paper puts forward a fast and accurate method for acquiring edge features from human hand images, without considering the case of a hand over a face. The proposed approach proceeds from coarse to fine in two steps: the coarse location phase (CLP) and the refined location phase (RLP). In the CLP, the hand contour is approximately described by a concave–convex polygon, and an approach to obtaining this hand shape polygon using locating points and locating lines is discussed in detail. A coarse location (CL) algorithm is then proposed for extracting hand shape features of interest, such as the contour, fingertips, finger roots, joints and the intersections of knuckles on different fingers. In the RLP, a multi-scale approach is introduced to refine the features obtained by the CL algorithm. By defining the response strength of different types of features, a refined location (RL) algorithm is proposed. The major contribution of this paper is the novel detection operators for hand image features presented in these two steps, which have been successfully applied to our 3D hand shape tracking system and 2D hand shape recognition system. A number of comparative studies with real images and online videos demonstrate that the proposed method can extract the three defined human hand image features with high accuracy and high speed.

Introduction

Selecting good features has received extensive attention in the past few years in the fields of human hand tracking [1], [2], [3], posture recognition [4], sign language recognition [5], [6] and TV control [7]. For example, posture recognition provides an attractive alternative to cumbersome interface devices for human–computer interaction. Vision-based recognition of hand postures, in particular, promises natural and unobtrusive human–computer interaction [8].

The motivation for the feature extraction problem in this paper is that we need geometric features of hand images, especially the fingertips and finger roots, in our 2D real-time hand recognition system and our 3D freehand real-time tracking system. In these systems, feature extraction is an essential component because it affects their accuracy and performance. The main contribution of this paper is a set of effective detection operators for different types of features of hand images; their main advantage is that they can extract the requested features with high accuracy and high speed, independently of datasets, hand postures or masks of hand images.

Related work

It is hard to extract desirable information from posture images since features, such as fingertips, finger directions and hand contours, are not always available or reliable due to self-occlusion and lighting conditions. As a result, algorithms that require direct extraction of high-level features often rely on markers to extract fingertip and joint locations or some anchor points on the palm [9], [10]. Dorfmuller-Ulhaas et al. [11] used special equipment, and users must wear gloves with

Problem definition

For a given human hand image, seek a hand feature set $\Pi = \{\Gamma^{(k)}\}$, $k = 1, 2, \ldots, K$. $\Pi$ is the contour of the image and is composed of $K$ subsets, each of which is a type of feature, with $\Gamma^{(k)} = \{r^{(k)}(i) \mid i = 1, 2, \ldots, N^{(k)}\}$, where $N^{(k)}$ is the number of elements of the subset $\Gamma^{(k)}$ and $r^{(k)}(i)$ takes the following form:

$$r^{(k)}(i) = \max_{i,\ (P_i \in W) \wedge (W \subset \Pi)} \{\wp(P_i)\}$$

in which $\wp$ is a detection operator and $W$ is a window.
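One way to read the definition above is: within each window W on the contour, keep the point whose operator response is largest. The following Python sketch illustrates only that reading. The `k_cosine` operator is a stand-in chosen for illustration and is not the paper's detection operator; the function names, the index-range representation of a window and the parameter `k` are all assumptions.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Operator = Callable[[Sequence[Point], int], float]


def k_cosine(contour: Sequence[Point], i: int, k: int = 2) -> float:
    """Hypothetical operator: cosine of the angle at point i between the
    vectors to its k-th neighbours. Near 1 at a sharp tip, -1 on a straight run."""
    n = len(contour)
    px, py = contour[i]
    ax, ay = contour[(i - k) % n]
    bx, by = contour[(i + k) % n]
    v1 = (ax - px, ay - py)
    v2 = (bx - px, by - py)
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        return -1.0  # degenerate neighbourhood: treat as no response
    return (v1[0] * v2[0] + v1[1] * v2[1]) / norm


def strongest_response(contour: Sequence[Point], window: range,
                       operator: Operator) -> int:
    """Index inside `window` whose contour point maximizes the operator."""
    return max(window, key=lambda i: operator(contour, i))


def extract_features(contour: Sequence[Point], windows: Sequence[range],
                     operator: Operator) -> List[Point]:
    """One feature point per window: the point of maximum response."""
    return [contour[strongest_response(contour, w, operator)] for w in windows]


# Toy contour: a straight run with a single spike at index 5.
toy_contour = [(float(x), 3.0 if x == 5 else 0.0) for x in range(11)]
spike_index = strongest_response(toy_contour, range(2, 9), k_cosine)
```

Keeping the operator as a plain callable means different feature types (the K subsets in the definition) can reuse the same window search with different operators.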

The problem is to find a uniform mathematical description of operators ℘ and windows W, and each feature r

Hypothesis on human-hand image contours

Based on the physiological features of human hand structure and characteristics of camera imaging, it is useful to approximately make the following two hypotheses on human hand image contours.

  • (1) The edge of a finger knuckle is a segment, and an edge of a fingertip is an arc, so the contour of a hand shape is mainly composed of segments and arcs.

  • (2) The contour of a hand shape can be described approximately in terms of a concave–convex polygon model (hand shape polygon for short in this paper). The
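To make the concave–convex polygon model concrete, the sketch below labels each vertex of a simple polygon as convex or concave using the sign of the cross product of its two adjacent edges. This is a standard geometric test, not the paper's algorithm for building the hand shape polygon; the function name and the toy polygon are assumptions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def classify_vertices(polygon: Sequence[Point]) -> List[str]:
    """Label each vertex of a simple polygon as 'convex', 'concave' or 'flat'."""
    n = len(polygon)
    # Twice the signed area (shoelace formula); positive means counter-clockwise.
    area2 = sum(
        polygon[i][0] * polygon[(i + 1) % n][1]
        - polygon[(i + 1) % n][0] * polygon[i][1]
        for i in range(n)
    )
    orientation = 1 if area2 > 0 else -1
    labels = []
    for i in range(n):
        ax, ay = polygon[i - 1]
        bx, by = polygon[i]
        cx, cy = polygon[(i + 1) % n]
        # Cross product of the incoming edge (a->b) and the outgoing edge (b->c).
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        if cross == 0:
            labels.append("flat")
        elif cross * orientation > 0:
            labels.append("convex")
        else:
            labels.append("concave")
    return labels


# Toy shape: a square whose top edge is dented inward at (2, 1).
toy_polygon = [(0, 0), (4, 0), (4, 4), (2, 1), (0, 4)]
toy_labels = classify_vertices(toy_polygon)
```

In a hand shape polygon one would plausibly expect fingertips near convex vertices and finger roots near concave ones, but that mapping is an assumption here, not something this sketch establishes.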

Refined location for features

Because of occlusion, lighting, cluttered backgrounds and the unbalanced distribution of observable image features, image features always depend on scale. A real object exists meaningfully only within a certain range of scales; scale-space theory therefore provides a reasonable theoretical foundation for treating image structure at different scales. The multi-scale method is used to refine the results of the CL algorithm.
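A coarse-to-fine refinement of this kind can be sketched as follows: start from the contour index returned by a coarse stage, then at each successively finer scale move to the strongest response in a small neighbourhood. This is a generic illustration rather than the paper's RL algorithm; `response`, `scales`, `radius` and the synthetic response function are all hypothetical.

```python
from typing import Callable, Sequence


def refine_location(response: Callable[[int, float], float], start: int,
                    scales: Sequence[float], radius: int, n: int) -> int:
    """Track a feature from coarse to fine scale along a closed contour.

    response(j, s) is the (assumed) response strength at contour index j
    and scale s; n is the number of contour points.
    """
    idx = start
    for s in scales:  # ordered from coarse to fine
        # Search only a small neighbourhood around the current estimate.
        candidates = [(idx + d) % n for d in range(-radius, radius + 1)]
        idx = max(candidates, key=lambda j: response(j, s))
    return idx


def toy_response(j: int, s: float) -> float:
    """Synthetic response peaking at index 12, flatter at coarse scales."""
    return -abs(j - 12) / s


# Coarse estimate is off by three points; refinement recovers the peak.
refined = refine_location(toy_response, start=9, scales=[4.0, 2.0, 1.0],
                          radius=2, n=30)
```

Restricting the search to a neighbourhood at each scale keeps the cost per feature small, which matters for the real-time setting the paper targets.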

The way to differentiate human hand from human face

The skin color model proposed by Malassiotis et al. [46] is used to segment skin from frame images, and the bounding window of the hand in the current image is obtained using the CamShift algorithm [47]. The human hand can be distinguished from the human face according to the ratio of length to width of the hand, which is 0.5–0.85 in this paper. Feature extraction can therefore be performed in the tight bounding box of hand postures.
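The ratio test described above can be sketched in a few lines of Python. Only the 0.5–0.85 range comes from the text; the function name, the inclusive bounds and the choice of which bounding-box side counts as "length" and which as "width" are assumptions.

```python
def looks_like_hand(length: float, width: float,
                    low: float = 0.5, high: float = 0.85) -> bool:
    """Return True if the length-to-width ratio of a skin-colored
    bounding window falls in the range the text associates with hands."""
    if length <= 0 or width <= 0:
        raise ValueError("bounding window dimensions must be positive")
    return low <= length / width <= high


# A 70 x 100 window passes the test; a square window does not.
hand_candidate = looks_like_hand(70, 100)
square_candidate = looks_like_hand(100, 100)
```

In a pipeline like the one described, this check would be applied to the window returned by the tracker before any feature extraction is attempted.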

Experimental results and analysis

The experiments were conducted on a personal computer in a basic configuration, with one 1.8 GHz Pentium 4 CPU and 256 MB of memory.

Applications of OM

The feature extraction approach put forward in this paper has been applied to our online real-time 2D hand shape recognition system shown in Fig. 17(a) and our online 3D freehand tracking system shown in Fig. 17(b).

Fig. 17(c) and (d) show that both the accuracy and the time cost of OM are superior to those of CM over 40 consecutive frames, which were randomly selected in this experiment. In Fig. 17(c), the average accuracy of OM remains steady while that of CM decreases compared with

Conclusion and future prospect

Hand postures are a powerful human-to-computer communication modality. The expressiveness of hand postures and natural freehand tracking can be explored to achieve natural HCI.

Research on fast and accurate approaches to obtaining observation features is also an essential part of our 3D freehand tracking system and 2D hand recognition system. This paper presents a preliminary discussion of this question, both in theory and in practice.

Unlike adopting a certain techniques in

Acknowledgements

This paper is supported by NSFC (No.60973093, No.61070130), the Natural Science Foundation for Distinguished Youth Scholar of Shandong Province (No.JQ200820), the Natural Science Foundation of Shandong Province (Y2007G39), Key Project of Natural Science Foundation of Shandong Province (2006G03) and the Science and Technology Plan of Shandong Province Education Department (J07YJ18).

References (47)

  • T. Shanableh et al., Spatio-temporal feature-extraction techniques for isolated posture recognition in Arabic sign language, IEEE Transactions on Systems, Man, and Cybernetics, Part B (2007)
  • S. Lenman et al., Computer Vision Based Hand Posture Interfaces for Human–Computer Interaction[R] (2002)
  • Ali Erol et al., A Review on Vision-Based Full DOF Hand Motion Estimation, Proceedings of the IEEE Workshop on Vision for Human-Computer Interaction (V4HCI) (2005)
  • Klaus Dorfmuller-Ulhaas, Dieter Schmalstieg, Finger tracking for interaction in augmented environments, in:...
  • C. Maggioni
  • J.H. Usabiaga, Global hand pose estimation by multiple camera ellipse tracking, Master Thesis, Department of Computer...
  • H. Kim et al., Interaction with hand posture for a back projection wall, Computer Graphics International (2004)
  • Robert Y. Wang et al., Real-time hand-tracking with a color glove, ACM Transactions on Graphics (SIGGRAPH 2009) (2009)
  • J. Letessier et al., Visual Tracking of Bare Fingers for Interactive Surfaces, UIST ’04, 17th Annual ACM Symposium on User Interface Software and Technology (2004)
  • H. Koike et al., Integrating paper and digital information on enhanced desk: a method for real time finger tracking on an augmented desk system, ACM Transactions on Computer-Human Interaction (2001)
  • R. O’Hagan, A. Zelinsky, Finger track—a robust and real-time posture interface, in: Proceedings of the 10th Australian...
  • J. Crowley et al., Finger tracking as an input device for augmented reality, International Workshop on Posture and Face Recognition (1995)
  • C. Von Hardenberg et al., Bare-hand human–computer interaction, Proceedings of Perceptual User Interfaces (2001)

Zhiquan Feng is a professor at the School of Information Science and Engineering, Jinan University. He received his Master's degree from Northwestern Polytechnical University, China, in 1995, and his Ph.D. degree from the Computer Science and Engineering Department, Shandong University, in 2006. He has published more than 50 papers in international journals, national journals and conferences in recent years. His research interests include human hand tracking/recognition/interaction, virtual reality, human–computer interaction and image processing.

Bo Yang is a professor and vice-president of the University of Jinan, Jinan, China. He is the Director of the Provincial Key Laboratory for Network-based Intelligent Computing and also serves as the Associate Director of the Shandong Computer Federation and as a Member of the Technical Committee of Intelligent Control of the Chinese Association of Automation. His main research interests include computer networks, artificial intelligence, machine learning, knowledge discovery and data mining. He has published numerous papers and received several important scientific awards in this area.

Yuehui Chen received his B.Sc. degree from the Department of Mathematics (major in Control Theory) at Shandong University in 1985, and his Master's and Ph.D. degrees from the School of Electrical Engineering and Computer Science at Kumamoto University, Japan, in 1999 and 2001. From 2001 to 2003, he worked as a Senior Researcher at the Memory-Tech Corporation, Tokyo. Since 2003 he has been a faculty member of the School of Information Science and Engineering, Jinan University, where he currently heads the Computational Intelligence Laboratory. His research interests include Evolutionary Computation, Neural Networks, Fuzzy Logic Systems, Hybrid Computational Intelligence, Computational Intelligence Grid and their applications in time-series prediction, system identification, intelligent control, intrusion detection systems, web intelligence, bioinformatics and systems biology. He is the author or co-author of more than 100 technical papers.