Features extraction from hand images based on new detection operators

doi:10.1016/j.patcog.2010.08.007

Pattern Recognition

Volume 44, Issue 5, May 2011, Pages 1089-1105

https://doi.org/10.1016/j.patcog.2010.08.007

Abstract

Extracting hand shape features from image frame sequences is a key step in 2D/3D human hand tracking systems and hand shape recognition systems. To meet the needs of real-time hand tracking, this paper puts forward a fast and accurate method for acquiring edge features from human hand images; the case of a hand over the face is not considered. The proposed approach proceeds from coarse to fine in two steps: the coarse location phase (CLP) and the refined location phase (RLP). In the CLP, the hand contour is approximated by a concave–convex polygon, and an approach to obtaining this hand shape polygon using locating points and locating lines is discussed in detail. A coarse location (CL) algorithm is then proposed for extracting the hand shape features of interest, such as the contour, fingertips, finger roots, joints and the intersections of knuckles on different fingers. In the RLP, a multi-scale approach is introduced to refine the features obtained by the CL algorithm. By defining the response strength of different types of features, a refined location (RL) algorithm is proposed. The major contribution of this paper is the novel detection operators for hand image features presented in these two steps, which have been successfully applied to our 3D hand shape tracking system and 2D hand shape recognition system. A number of comparative studies with real images and online videos demonstrate that the proposed method can extract the three defined human hand image features with high accuracy and high speed.

Introduction

Selecting good features has received extensive attention in the past few years in the fields of human hand tracking [1], [2], [3], posture recognition [4], sign language recognition [5], [6] and TV control [7]. For example, posture recognition provides an attractive alternative to cumbersome interface devices for human–computer interaction. Vision-based recognition of hand postures, in particular, promises natural and unobtrusive human–computer interaction [8].

The feature extraction problem in this paper is motivated by our need for geometric features of hand images, especially the fingertips and finger roots, in our 2D real-time hand recognition system and our 3D real-time freehand tracking system. In these systems, feature extraction is an essential step because it affects both accuracy and performance. The main contribution of this paper is a set of effective detection operators for different types of hand image features; their main advantage is that they can extract the requested feature with high accuracy and high speed, independently of datasets, hand postures or masks of hand images.

Related work

It is hard to extract desirable information from posture images because features such as fingertips, finger directions and hand contours are not always available or reliable, owing to self-occlusion and lighting conditions. Algorithms that require direct extraction of high-level features therefore often rely on markers to extract fingertip joint locations or some anchor points on the palm [9], [10]. Dorfmuller-Ulhaas et al. [11] used special equipment, and users must wear gloves with

Problem definition

For a given human hand image, seek a hand feature set $ℕ \subseteq Π$, $ℕ = \{ℕ_{(k)}\}$, $k = 1, 2, \ldots, K$, where $Π$ is the contour of the image and $ℕ$ is composed of $K$ subsets, each of which is one type of feature. Each subset is $ℕ_{(k)} = \{r_{(k)}^{(i)} \mid i = 1, 2, \ldots, N_{(k)}\}$, where $N_{(k)}$ is the number of elements of $ℕ_{(k)}$ and $r_{(k)}^{(i)}$ has the form $r_{(k)}^{(i)} = \max_{i,\,(P_{i} \in W) \land (W \subseteq Π)} ℘\{P_{i}\}$, in which $℘$ is a detection operator and $W$ is a window.
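The definition above can be read as "slide a window W along the contour and keep the point where the operator responds most strongly". The following is a minimal sketch of that reading only. The operator used here, `k_cosine`, is a stand-in chosen for illustration and is not the detection operator defined in the paper; the function names and the parameter `k` are likewise assumptions.

```python
import math
from typing import Callable, Iterable, Sequence, Tuple

Point = Tuple[float, float]


def k_cosine(contour: Sequence[Point], i: int, k: int = 2) -> float:
    """Hypothetical detection operator (not the paper's): cosine of the angle
    at contour[i] between the vectors to its k-th neighbours on a closed
    contour. Near +1 means a sharp turn, -1 means a straight run."""
    n = len(contour)
    px, py = contour[i]
    ax, ay = contour[(i - k) % n]
    bx, by = contour[(i + k) % n]
    ux, uy = ax - px, ay - py
    vx, vy = bx - px, by - py
    norm = math.hypot(ux, uy) * math.hypot(vx, vy)
    return (ux * vx + uy * vy) / norm if norm else -1.0


def strongest_in_window(
    contour: Sequence[Point],
    window: Iterable[int],
    operator: Callable[[Sequence[Point], int], float],
) -> int:
    """Return the contour index inside the window W with the largest response."""
    return max(window, key=lambda i: operator(contour, i))


# A closed contour with a right-angle corner at index 4.
contour = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0),
           (4, 1), (4, 2), (4, 3), (4, 4), (0, 4)]
corner = strongest_in_window(contour, range(2, 7), k_cosine)
```

Here the window covers indices 2 to 6, and the response peaks at the corner point, so `corner` is 4. Passing the operator as a function argument mirrors the idea that each feature type has its own operator while the maximisation over a window stays the same.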

The problem is to find a uniform mathematical description of operators ℘ and windows W, and each feature $r$

Hypothesis on human-hand image contours

Based on the physiological structure of the human hand and the characteristics of camera imaging, we make the following two approximate hypotheses about human hand image contours.

(1) The edge of a finger knuckle is a segment, and an edge of a fingertip is an arc, so the contour of a hand shape is mainly composed of segments and arcs.
(2) The contour of a hand shape can be described approximately in terms of a concave–convex polygon model (hand shape polygon for short in this paper). The
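To make the concave–convex polygon idea concrete, the sketch below labels each vertex of a simple polygon as convex or concave using the sign of the cross product of its two adjacent edges. This is a standard geometric test and not the paper's locating-point or locating-line procedure; treating convex vertices as fingertip candidates and concave vertices as finger-root candidates is an assumption made here for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def signed_area(polygon: Sequence[Point]) -> float:
    """Shoelace formula; positive when vertices are in counter-clockwise order."""
    n = len(polygon)
    return 0.5 * sum(
        polygon[i][0] * polygon[(i + 1) % n][1]
        - polygon[(i + 1) % n][0] * polygon[i][1]
        for i in range(n)
    )


def classify_vertices(polygon: Sequence[Point]) -> List[str]:
    """Label each vertex of a simple polygon 'convex', 'concave' or 'flat'."""
    n = len(polygon)
    # Normalise the turn direction so the result does not depend on vertex order.
    orientation = 1.0 if signed_area(polygon) > 0 else -1.0
    labels = []
    for i in range(n):
        (ax, ay), (bx, by), (cx, cy) = polygon[i - 1], polygon[i], polygon[(i + 1) % n]
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        turn = cross * orientation
        labels.append("convex" if turn > 0 else "concave" if turn < 0 else "flat")
    return labels


# Toy "two-finger" outline: two peaks at (4, 4) and (0, 4), one valley at (2, 1).
outline = [(0, 0), (4, 0), (4, 4), (2, 1), (0, 4)]
labels = classify_vertices(outline)
```

For this outline only the valley vertex (2, 1) is labelled concave; the other four vertices are convex.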

Refined location for features

Because of occlusion, lighting, cluttered backgrounds and the unbalanced distribution of observable image features, image features always depend on scale. A real object exists meaningfully only within a certain range of scales, so scale-space theory provides a reasonable theoretical foundation for treating image structure at different scales. A multi-scale method is used to refine the accuracy of the CL algorithm.
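As a rough illustration of multi-scale refinement, the sketch below smooths a response signal sampled along a closed contour at several Gaussian scales and moves a coarse location to the nearby index with the largest summed response. It is not the paper's RL algorithm, whose response strength definitions are not given in this excerpt; the function names, the choice of scales and the search radius are all assumptions.

```python
import math
from typing import List, Sequence


def gaussian_smooth(signal: Sequence[float], sigma: float) -> List[float]:
    """Circular Gaussian smoothing of a response sampled along a closed contour."""
    n = len(signal)
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in offsets]
    total = sum(weights)
    return [
        sum(w * signal[(i + d) % n] for w, d in zip(weights, offsets)) / total
        for i in range(n)
    ]


def refine(
    signal: Sequence[float],
    coarse_index: int,
    sigmas: Sequence[float] = (1.0, 2.0),
    search_radius: int = 3,
) -> int:
    """Move a coarse location to the neighbouring index whose response,
    summed over all scales, is largest (an assumed combination rule)."""
    n = len(signal)
    layers = [gaussian_smooth(signal, s) for s in sigmas]
    candidates = [(coarse_index + d) % n for d in range(-search_radius, search_radius + 1)]
    return max(candidates, key=lambda i: sum(layer[i] for layer in layers))


# A response that peaks at index 10; the coarse estimate is two samples off.
response = [0.0] * 20
response[9], response[10], response[11] = 0.5, 1.0, 0.5
refined = refine(response, coarse_index=8)
```

In this toy case the coarse estimate at index 8 is moved to the true peak at index 10. Summing across scales is only one possible way to combine them; a weighted sum or a coarse-to-fine search would fit the same outline.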

The way to differentiate human hand from human face

The skin color model proposed by Malassiotis et al. [46] is used to segment skin from frame images, and the bounding window of the hand in the current image is obtained using the CamShift algorithm [47]. The human hand is distinguished from the human face by the length-to-width ratio of the hand, which is 0.5–0.85 in this paper. Feature extraction can therefore be performed within the tight bounding box of the hand posture.
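The ratio test can be sketched as a single predicate on the bounding window. The snippet below assumes the window's length and width are already available, for example from a CamShift tracker, and it does not perform skin segmentation or tracking itself. The function name is hypothetical, and which side of the window counts as "length" is an assumption the caller has to settle.

```python
def looks_like_hand(length: float, width: float,
                    low: float = 0.5, high: float = 0.85) -> bool:
    """Return True when length / width falls inside the hand range.

    The default bounds 0.5 and 0.85 are the values quoted in the text;
    the mapping of window sides to 'length' and 'width' is assumed.
    """
    if length <= 0 or width <= 0:
        raise ValueError("length and width must be positive")
    return low <= length / width <= high


hand_candidate = looks_like_hand(70, 100)    # ratio 0.70, inside the range
face_candidate = looks_like_hand(100, 100)   # ratio 1.00, outside the range
```

A skin-colored region whose ratio falls outside the range would be rejected as a hand candidate before any feature extraction is attempted.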

Experimental results and analysis

The experiments were conducted on a personal computer in its basic configuration, with one 1.8 GHz Pentium 4 CPU and 256 MB of memory.

Applications of OM

The feature extraction approach put forward in this paper has been applied to our online real-time 2D hand shape recognition system shown in Fig. 17(a) and our online 3D freehand tracking system shown in Fig. 17(b).

Fig. 17(c) and (d) show that OM is superior to CM in both accuracy and time cost over 40 consecutive frames, which were selected at random in this experiment. In Fig. 17(c), the average accuracy of OM remains steady while that of CM decreases compared with

Conclusion and future prospect

Hand postures are a powerful human-to-computer communication modality. The expressiveness of hand postures and natural freehand tracking can be explored to achieve natural HCI.

Research on fast and accurate approaches to obtaining observation features is also essential to our 3D freehand tracking system and 2D hand recognition system. This paper presents a preliminary discussion of this question in both theory and practice.

Unlike adopting a certain techniques in

Acknowledgements

This paper is supported by NSFC (No.60973093, No.61070130), the Natural Science Foundation for Distinguished Youth Scholar of Shandong Province (No.JQ200820), the Natural Science Foundation of Shandong Province (Y2007G39), Key Project of Natural Science Foundation of Shandong Province (2006G03) and the Science and Technology Plan of Shandong Province Education Department (J07YJ18).

Zhiquan Feng is a professor at the School of Information Science and Engineering, Jinan University. He received his Master's degree from Northwestern Polytechnical University, China, in 1995, and his Ph.D. degree from the Computer Science and Engineering Department, Shandong University, in 2006. He has published more than 50 papers in international journals, national journals and conferences in recent years. His research interests include: human hand tracking/recognition/interaction, virtual reality,

References (47)

M. Bray et al. Smart particle filtering for high-dimensional tracking. Computer Vision and Image Understanding (2007)
S. Malassiotis et al. Real-time hand posture recognition using range data. Image and Vision Computing (2008)
S.S. Ge et al. Hand posture recognition and tracking based on distributed locally linear embedding. Image and Vision Computing (2008)
R.G. O’Hagan et al. Visual Posture Interfaces for Virtual Environments. Interacting with Computers (2002)
Xiaoming Yin et al. Finger identification and hand posture recognition for human–robot interaction. Image and Vision Computing (2007)
Paul Smith et al. Resolving hand over face occlusion. Image and Vision Computing (2007)
Motomasa Tomida, Kiyoshi Hoshino. 3D hand posture estimation with single camera by two-stage searches from database,...
Feng Zhiquan et al. Research on human hand tracking aiming at improving its accurateness. Journal of Computer Research and Development (2008)
K. Oka, Y. Sato, H. Koike, Real-time tracking of multiple fingertips and posture recognition for augmented desk...
Chunku Wang, Wen Gao, Shiguang Shan, An approach based on phonemes to large vocabulary Chinese sign language...
T. Shanableh et al. Spatio-temporal feature-extraction techniques for isolated posture recognition in Arabic sign language. IEEE Transactions on Systems, Man, and Cybernetics, Part B (2007)
S. Lenman et al. Computer Vision Based Hand Posture Interfaces for Human–Computer Interaction[R] (2002)
Ali Erol et al. A Review on Vision-Based Full DOF Hand Motion Estimation. Proceedings of the IEEE Workshop on Vision for Human-Computer Interaction (V4HCI) (2005)
Klaus Dorfmuller-Ulhaas, Dieter Schmalstieg, Finger tracking for interaction in augmented environments, in:...
C. Maggioni
J.H. Usabiaga, Global hand pose estimation by multiple camera ellipse tracking, Master Thesis, Department of Computer...
H. Kim et al. Interaction with hand posture for a back projection wall. Computer Graphics International (2004)
Robert Y. Wang et al. Real-time hand-tracking with a color glove. ACM Transactions on Graphics (SIGGRAPH 2009) (2009)
J. Letessier et al. Visual Tracking of Bare Fingers for Interactive Surfaces. UIST ’04. 17th Annual ACM Symposium on User Interface Software and Technology (2004)
H. Koike et al. Integrating paper and digital information on enhanced desk: a method for real time finger tracking on an augmented desk system. ACM Transactions on Computer-Human Interaction (2001)
R. O’Hagan, A. Zelinsky, Finger track—a robust and real-time posture interface, in: Proceedings of the 10th Australian...

J. Crowley et al. Finger tracking as an input device for augmented reality. International Workshop on Posture and Face Recognition (1995)
C. Von Hardenberg et al. Bare-hand human–computer interaction. Proceedings of Perceptual User Interfaces (2001)

Bo Yang is a professor and vice-president of University of Jinan, Jinan, China. He is the Director of the Provincial Key Laboratory for Network-based Intelligent Computing, the Associate Director of Shandong Computer Federation, and a Member of the Technical Committee of Intelligent Control of Chinese Association of Automation. His main research interests include computer networks, artificial intelligence, machine learning, knowledge discovery, and data mining. He has published numerous papers and received several important scientific awards in this area.

Yuehui Chen received his B.Sc. degree from the Department of Mathematics (major in Control Theory) at Shandong University in 1985, and his Master and Ph.D. degrees from the School of Electrical Engineering and Computer Science at Kumamoto University, Japan, in 1999 and 2001. From 2001 to 2003, he worked as a Senior Researcher at the Memory-Tech Corporation, Tokyo. Since 2003 he has been a faculty member of the School of Information Science and Engineering, Jinan University, where he currently heads the Computational Intelligence Laboratory. His research interests include Evolutionary Computation, Neural Networks, Fuzzy Logic Systems, Hybrid Computational Intelligence, Computational Intelligence Grid and their applications in time-series prediction, system identification, intelligent control, intrusion detection systems, web intelligence, bioinformatics and systems biology. He is the author or co-author of more than 100 technical papers.
