Using gestures for natural human-computer interaction has become an important research topic in recent decades, with applications throughout everyday life. Sign languages are commonly used among people to convey a specific meaning or deliver a message; gestures therefore motivate the simulation of natural human-to-human interaction, this time between human and computer, by modeling and analysing the gesture and finally recognizing it. This paper presents a study of hand gesture recognition systems, investigating the algorithms and tools used at the various system stages: image pre-processing to segment the hand, methods applied to capture the hand shape and extract the necessary features, and a suitable recognition algorithm. Gesture modeling, analysis, and recognition are demonstrated in detail, together with the challenges that hinder the performance of recognition systems. Comparisons of different gesture recognition factors are presented as well.
Keywords
Human Computer Interaction (HCI), Hand Gesture, Segmentation, Geometric Feature, Features Extraction, Gesture Recognition.
I. INTRODUCTION
With the spread of modern virtual applications, there is a growing need to understand and explore the field of human gesture recognition, a science with numerous applications in daily life. Gestures are used with various domestic appliances [1][18], such as television control using hand gestures [2], domestic robot control [1], digital photo albums [1], and other household devices such as the washing machine, refrigerator, vacuum cleaner, and stove top, as mentioned in [1]. TV channels can be switched, the set turned on and off, and the volume changed [2]. The domestic robot is an example of controlling a robot using speech and gestures [1]: vocal commands are interpreted according to a stored speech vocabulary, and gestures are detected by tracking the hand [1][22]. In the digital photo album (GIA), the fingertip is used to manipulate photos with commands (next, edit, slide) on a touch-sensitive screen [1]. A washing machine has commands to set, such as the rotation speed [1][7][8], wash timer, and spin-dry timer; gestures are used to control these commands at close distance [1].
In this work, many recent algorithms and tools used in various gesture recognition systems are demonstrated, along with comparisons of different gesture recognition system factors. A short version of this work was published in the AITCC Conference [33].
The outline of the paper is as follows: Section II explains the gesture classification system with a detailed description of its stages. Some recent studies are then discussed in detail, with a demonstration of the recognition system stages, to give a close view of the gesture classification steps. Finally, the discussion and conclusions are presented.
II. GESTURE CLASSIFICATION SYSTEM
The main stages of any gesture recognition system are: the extraction technique, feature analysis, and finally the classification tool. Figure 2 demonstrates these steps.
Many internal sub-stages can be included in these steps according to the application [12][14], and in each phase different processing steps can be used [14]. Different processing steps are needed depending on whether glove-based or vision-based acquisition is applied [14], whether geometric or non-geometric feature extraction methods are used, and which posture and gesture classification tools are employed. In the following subsections, we discuss the main two stages only.
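To make the three stages concrete, the following minimal sketch (in Python, with illustrative function names that do not come from any surveyed system) shows how segmentation, feature analysis, and classification fit together in a recognition pipeline.

```python
import numpy as np

def segment_hand(frame: np.ndarray) -> np.ndarray:
    """Stage 1: return a binary mask isolating the hand region (placeholder)."""
    raise NotImplementedError

def extract_features(mask: np.ndarray) -> np.ndarray:
    """Stage 2: turn the segmented hand into a feature vector (placeholder)."""
    raise NotImplementedError

def classify(features: np.ndarray) -> str:
    """Stage 3: map the feature vector to a gesture label (placeholder)."""
    raise NotImplementedError

def recognize(frame: np.ndarray) -> str:
    # Extraction -> feature analysis -> classification, as in Figure 2.
    return classify(extract_features(segment_hand(frame)))
```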
III. HAND EXTRACTION TECHNIQUES: RELATED WORK |
For any gesture system, the first step is to extract the hand object from the entire image. This is performed by first deciding on the input device required to collect the data necessary to accomplish a specific task, and second by segmenting the hand from the unrelated background objects in order to model it. For hand posture and gesture recognition systems, different technologies are used for acquiring the input data: instrumented glove-based technologies [9][16][19], computer vision [16][9][20], and marked gloves or color markers [14]. The instrumented glove-based technology requires the user to wear a special data-glove device [9][16] that provides measurements of hand location [4][6][14], position [4][9], orientation angles [4][9][16], and degrees of freedom (DOF) with high accuracy [9][14]. Computer vision technology uses an input hand image acquired from one or more cameras [9][16][14]. Finally, the colored-marker technology requires the user to wear a color glove; depending on the colors, hand localization, the fingertips, and the hand blob are determined easily [14]. These techniques are also used for modeling the hand. Figure 3 shows a pictorial representation of these technologies.
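As a brief illustration of the computer-vision acquisition option, a single camera frame can be captured as follows (a hypothetical sketch using OpenCV; the device index 0 is an assumption):

```python
import cv2

cap = cv2.VideoCapture(0)   # open the default camera (index 0 assumed)
ok, frame = cap.read()      # one BGR frame to pass to the segmentation stage
cap.release()
```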
IV. SEGMENTATION TECHNIQUES |
After acquiring the desired hand image with one of the methods mentioned above, the hand needs to be segmented from the other objects. The extraction of the segmented hand is an important step for the success of any recognition system [14], and the hand modeling process relies on a correct and robust segmentation method [14]. Undoubtedly, the color model used and the background have a great impact on the success of the segmentation process. Pixel color is the most important cue utilized for separating human skin pigment [14], but other cues can also be used for detecting the hand. These methods include [23]: color pixel information, motion information [10], or a combination of both to achieve robust hand detection [26].
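The following sketch illustrates, in a generic way, how the two cues mentioned above (skin color and motion) can be combined into a single hand mask; the color space and thresholds below are assumed example values, not parameters taken from the cited studies.

```python
import cv2
import numpy as np

def hand_mask(frame_bgr: np.ndarray, prev_gray: np.ndarray) -> np.ndarray:
    # Skin cue: threshold the chrominance channels in YCrCb (assumed range).
    ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)
    skin = cv2.inRange(ycrcb, (0, 135, 85), (255, 180, 135))

    # Motion cue: frame differencing against the previous grey frame.
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    _, motion = cv2.threshold(cv2.absdiff(gray, prev_gray), 25, 255,
                              cv2.THRESH_BINARY)

    # Combining both cues gives a more robust detection than either alone [26].
    return cv2.bitwise_and(skin, motion)
```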
Different studies have addressed the segmentation problem [27]. In [24], a normalized RGB color model was applied for skin color detection, using the chrominance components only to minimize the effect of illumination changes. The authors in [25] integrated parametric and non-parametric models, namely a Gaussian Mixture Model (GMM) and histogram-based methods, to locate the hands: the system was trained offline using a GMM in the normalized r-g color space and tested online using a histogram and the HS color model [25]. Since the GMM is affected by lighting variations, the histogram is applied to overcome this problem [25]. It is noticeable that the HSI color space performs effectively with the histogram [25].
V. GESTURE ANALYSIS |
To analyse the gesture, the system first has to perform gesture detection and then extract the important features. To detect the gesture, the input image should be localized [12] by segmenting the region of interest from the unwanted background objects, as explained previously in the segmentation process. Features should be distinguishable and should not interfere with one another, so that they can be classified clearly in the recognition stage and represented well in the feature space [14] with minimum error during the testing step [14]. Parameters can be estimated with appearance-based or 3D-model-based approaches [12]. For 3D-model-based approaches, the two crucial parameters are the joint angles [12] and the palm dimensions [12]; these require estimating the initial parameters [12] and updating them as the hand gesture develops over time [12]. Appearance-based parameter estimation methods include shape analysis [12], active contours [12], and image motion estimation [10][11].
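As a generic example of the shape-analysis option listed above (not a method from the cited works), the hand contour can be extracted from the segmented mask and summarized by Hu moments:

```python
import cv2
import numpy as np

def shape_descriptor(mask: np.ndarray) -> np.ndarray:
    contours, _ = cv2.findContours(mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return np.zeros(7)
    hand = max(contours, key=cv2.contourArea)       # largest blob = hand
    hu = cv2.HuMoments(cv2.moments(hand)).flatten()
    # Log-scale the Hu moments so their magnitudes are comparable.
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)
```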
Hasan [28] recognized hand gestures using geometrical features. The input image is segmented using the HSV color model, and four features are extracted: the Perpendicular Casted Distance (PD), which represents the distance between the casted finger's base and the palm center; the Base Angle (BA), which is the angle between the line formed from the finger's base to the palm center and the hand direction line (HDL); the Base(s) Angle (BsA), which is similar to the BA feature but replaces the HDL with the line formed from the palm to the finger's base; and the Base Border (BB), which represents the distance between the vertical casting of the particular boundary pixel that has the greatest distance from the mentioned line and the nearest finger's base. In [30], we proposed a hand gesture recognition system that applied an innovative approach to model the hand using a variable-length-chromosome genetic algorithm, where the outcome of the algorithm is the detailed extraction of the hand structure (palm, fingers, and wrist). The palm center coordinate is located using a GA with a decreasing population size. The wrist and the fingers' reference points are determined to facilitate the extraction of the important features required for classification purposes [30].
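Features of this geometric kind are essentially angles and distances computed from 2-D reference points such as the palm centre and the fingers' bases. The following sketch shows how a BA-style angle and a simple base-to-centre distance could be computed; the point values and the hand direction vector are illustrative assumptions, and the exact feature definitions are those given in [28] and [30].

```python
import numpy as np

def angle_between(p_from: np.ndarray, p_to: np.ndarray,
                  direction: np.ndarray) -> float:
    """Angle in degrees between the line p_from -> p_to and a direction vector."""
    v = p_to - p_from
    cos_a = np.dot(v, direction) / (np.linalg.norm(v) * np.linalg.norm(direction))
    return float(np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0))))

palm_centre = np.array([120.0, 150.0])      # assumed example coordinates
finger_base = np.array([100.0, 90.0])
hand_direction = np.array([0.0, -1.0])      # assumed hand direction line (HDL)

ba_like = angle_between(finger_base, palm_centre, hand_direction)  # BA-style angle
dist = np.linalg.norm(finger_base - palm_centre)   # a simple base-to-centre distance
```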
Stergiopoulou [15] recognized hand gestures by applying the SGONG algorithm to capture the hand shape and extracted three features: the RC Angle, which is the angle formed by the hand slope line and the line that joins the root with the centre of the hand; the TC Angle, which is the angle formed by the hand slope line and the line that joins the fingertip with the hand centre; and the distance from the palm centre. For recognition, the systems in [28] and [30] applied a mixture of Gaussian classifiers, whereas [15] applied a Gaussian distribution.
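The classifiers used by these systems are probabilistic: each gesture class is described by a Gaussian (or a mixture of Gaussians), and a test sample is assigned to the class with the highest likelihood. A minimal single-Gaussian-per-class sketch, which is a simplification and not the exact formulation of [15], [28], or [30], is shown below.

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_gaussians(features: np.ndarray, labels: np.ndarray) -> dict:
    """Fit one multivariate Gaussian per gesture class."""
    models = {}
    for c in np.unique(labels):
        x = features[labels == c]
        models[c] = multivariate_normal(mean=x.mean(axis=0),
                                        cov=np.cov(x, rowvar=False),
                                        allow_singular=True)
    return models

def predict(models: dict, x: np.ndarray):
    """Return the gesture class whose Gaussian gives the highest likelihood."""
    return max(models, key=lambda c: models[c].pdf(x))
```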
Yang [31] extracted and classified two-dimensional motion in an image sequence using motion trajectories. Yang relies on the idea that the shapes of the human hand and face are approximately elliptic; the motion regions and the skin color are merged to extract the shape of the hand and the face, and the merging continues until the merged areas approximately take a closed elliptic shape [31]. Chen et al. [32] extracted the palm and finger areas to recognize the fingers and identify the gesture accordingly, where a rule classifier is applied to identify the fingers' labels; they used 1300 images as a data set.
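Rule-based finger identification of this general kind is often implemented by examining convexity defects of the hand contour; the sketch below is a common generic technique shown only for illustration, not the specific rule classifier of [32], and its depth and angle thresholds are assumptions.

```python
import cv2
import numpy as np

def count_fingers(mask: np.ndarray) -> int:
    contours, _ = cv2.findContours(mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return 0
    hand = max(contours, key=cv2.contourArea)
    hull = cv2.convexHull(hand, returnPoints=False)
    defects = cv2.convexityDefects(hand, hull)
    if defects is None:
        return 0
    gaps = 0
    for start, end, far, depth in defects[:, 0]:
        a = np.linalg.norm(hand[end][0] - hand[start][0])
        b = np.linalg.norm(hand[far][0] - hand[start][0])
        c = np.linalg.norm(hand[end][0] - hand[far][0])
        angle = np.arccos((b ** 2 + c ** 2 - a ** 2) / (2 * b * c + 1e-6))
        # A deep defect with an acute angle is taken as a gap between two fingers.
        if angle < np.pi / 2 and depth > 10000:   # depth is fixed point (x256)
            gaps += 1
    return gaps + 1 if gaps else 0
```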
VI. DISCUSSION AND CONCLUSIONS |
Gesture systems have the potential to simplify interaction with various life applications [17], ranging from sign language to virtual environments [29] and different household appliances. The segmentation process plays a major role in hand detection and gesture recognition. Various techniques have been applied for segmenting the hand; the most common method is to extract the skin color from the input image. Other studies applied hand motion, or both skin color and motion, to extract the hand. The extraction of the hand shape provides a great benefit in the feature extraction step. [15], [28], and [30] applied different methods for capturing the shape of the hand in order to extract hand geometric features. Other studies represented the hand with non-geometric features such as the silhouette and the contour. Various input devices are available for acquiring hand images, and the choice of a specific input device depends on the demanding application. Gesture analysis is explained, and the parameter modeling and estimation required to achieve a robust gesture recognition system are provided as well [11]. Table 2 shows some recognition factors, such as the recognition rate and recognition time, together with the number of recognized gestures, for some selected gesture recognition methods.