Hand Gesture Recognition Analysis of Various Techniques, Methods and Their Algorithms | Open Access Journals

ISSN: Online (2319-8753), Print (2347-6710)

Hand Gesture Recognition – Analysis of Various Techniques, Methods and Their Algorithms

R. Pradipa1, Ms. S. Kavitha2
  1. PG Scholar, Dept. of Computer Science and Engineering, Velammal College of Engineering Technology, Madurai, Tamil Nadu, India
  2. Assistant Professor, Dept. of Computer Science and Engineering, Velammal College of Engineering Technology, Madurai, Tamil Nadu, India

Visit for more related articles at International Journal of Innovative Research in Science, Engineering and Technology


Human Computer Interaction (HCI) aims to let computers understand human communication and to provide user-friendly interfaces. Gestures, a non-verbal form of communication, offer one such interface. The goal of gesture recognition is to create a system that can identify specific human gestures and use them to convey information or to control devices. Real-time vision-based hand gesture recognition is becoming increasingly feasible for HCI thanks to recent advances in computer vision and pattern recognition. This survey paper discusses various techniques, methods and algorithms related to gesture recognition. Hand gestures are among the easiest and most natural ways of communicating. Hand gesture recognition lets users communicate with technology through basic sign language, and it can reduce reliance on the hardware devices most commonly used to control a computer.


Glove, Vision-based, CamShift, Segmentation


Hand gestures are a spontaneous and powerful communication mode for Human Computer Interaction (HCI). Traditional input devices such as the keyboard, mouse, joystick and touch screen are available for interacting with a computer; however, they do not provide a natural interface. The proposed system will consist of a desktop or laptop interface; to capture hand gestures, users may either wear a data glove or use a web camera that captures images of the hand. The initial step in any hand gesture recognition system is hand tracking and segmentation.
Data-glove based methods use sensor devices to digitize hand and finger motions into multiparametric data. Additional sensors collect hand configuration and hand movements. In contrast, vision-based methods require only a camera, thus realizing natural interaction between humans and computers without any extra devices. These systems tend to complement biological vision by describing artificial vision systems that are implemented in software and/or hardware. The challenge is that such systems need to be background invariant, lighting insensitive, and person and camera independent while still achieving real-time performance. This paper discusses various algorithmic techniques for recognizing hand postures and gestures, together with the advantages and disadvantages of each. Segmentation is the process of finding a connected region within the image that has a specific property, such as color or intensity, or a relationship between pixels (that is, a pattern); segmentation algorithms should be adaptable.


The raw input data can be collected in three basic ways.
The first is to use input devices worn by the user. This setup consists of one or two instrumented gloves that measure the various joint angles of the hand, and a six degree of freedom (6 DOF) tracking device that gathers hand position and orientation data. The second way to collect raw hand data is a computer-vision-based approach, in which one or more cameras collect images of the user's hands. The cameras grab an arbitrary number of images per second and send them to image processing routines that perform posture and gesture recognition as well as 3D triangulation to find the hands' position in space. The third way is a hybrid approach that combines the previous two methods, with the aim of achieving more accurate recognition by using the two data streams to reduce each other's error.
A. Instrumented Gloves
Instrumented gloves measure finger movement through various kinds of sensor technology. The sensors are embedded in the glove or placed on it, usually on the back of the hand.
Glove-based input devices are broadly categorized by the commercial products available and by the companies that produced them.
The 'Sayre Glove' was developed by Thomas DeFanti and Daniel Sandin in 1977 for the National Endowment for the Arts.
This glove used light-based sensors: flexible tubes with a light source at one end and a photocell at the other. The amount of light that hit the photocells varied as the fingers were bent, thus providing a measure of finger flexion. The glove could measure the metacarpophalangeal joints of the four fingers and thumb along with the interphalangeal joints of the index and middle fingers, for a total of 7 DOF.
Gary Grimes at Bell Telephone Laboratories invented the 'Digital Data Entry Glove'. Designed in 1981, it was intended specifically for manual data entry using the Single-Hand Manual Alphabet.
It used touch or proximity sensors, “knuckle-bend sensors”, tilt sensors, and inertial sensors to replace a traditional keyboard. The touch or proximity sensors determined whether the user's thumb was touching another part of the hand or fingers; they were made of silver-filled conductive rubber pads that sent an electrical signal when they made contact. The four knuckle-bend sensors measured the flexion of the joints in the thumb, index finger, and pinkie finger.
The two tilt sensors measured the tilt of the hand in the horizontal plane, and the two inertial sensors measured the twisting of the forearm and the flexing of the wrist. The drawback of this glove was that it was developed for a specific task in which the recognition of hand signs was based strictly on hardware. Therefore, it was not generic enough to perform robust hand posture and gesture recognition in any application other than entry of ASCII characters.
The Data-Glove and Z-Glove were developed by VPL Research.
They were designed as general-purpose interface devices for applications that required direct object manipulation with the hand, finger spelling, and evaluation of hand impairment.
Five to fifteen sensors were used to measure the flexion of both the metacarpophalangeal joints and the proximal interphalangeal joints of the four fingers and thumb, for a total of 10 DOF. In some cases, abduction sensors were used to measure the angles between adjacent fingers. These sensors were made of flexible tubes with a reflective interior wall, a light source at one end and a photosensitive detector at the other that detected both direct and reflected light rays. Depending on the bending of the tubes, the detector would change its electrical resistance as a function of light intensity.
The Space Glove, developed by W Industries in 1991, was unique in that the user placed his fingers and thumb through plastic rings that sat between the proximal interphalangeal and the metacarpophalangeal joints. The glove used sensors with twelve-bit analog-to-digital converters that measured the flexion of the metacarpophalangeal joints and the interphalangeal joint of the thumb, for a total of six DOF.
The SuperGlove, developed by Nissho Electronics, has a minimum of 10 and a maximum of 16 bend sensors that use a special resistive ink applied to flexible boards sewn into the glove. With its minimal and standard configuration, the SuperGlove measures flexion of both the metacarpophalangeal and proximal interphalangeal joints for all four fingers and the thumb. The glove comes in two different sizes and is available for both the left and right hand.
The CyberGlove can be equipped with either 18 or 22 bend sensors. With 18 sensors, the CyberGlove measures the flexion of the proximal interphalangeal and the metacarpophalangeal joints of the four fingers and the thumb, the abduction/adduction angles between the fingers, radial and palmar abduction, wrist roll, and wrist pitch.
B. Vision-Based Technology
The main difficulty in using glove-based input devices to collect raw posture and gesture recognition data is that the user must wear the gloves and remain attached to the computer.
This restricts freedom of movement, much as traditional interaction methods do.
A vision-based solution for collecting hand posture and gesture recognition data consists of four equally important components. The first is the placement and number of cameras used. Camera placement is critical because the visibility of the hand or hands being tracked must be maximized for robust recognition. Visibility is important because of the many occlusion problems present in vision-based tracking. The number of cameras used for tracking is another important issue [1].
The second component in a vision-based solution for hand posture and gesture recognition is to make the hands more visible to the camera for simpler extraction of hand data [10].
The third component of a vision-based solution for hand gesture and posture recognition is the extraction of features from the stream or streams of raw image data; the fourth component is to apply recognition algorithms to these extracted features.


The raw data collected from a vision- or glove-based data collection system must be analyzed to determine whether any postures or gestures have been recognized. Various algorithms for doing so are as follows.
A. Template Matching
One of the simplest methods for recognizing hand postures is template matching [19]. Template matching checks whether a given data record can be classified as a member of a set of stored data records. Recognizing hand postures using template matching has two parts. The first is to create the templates by collecting data values for each posture in the posture set. The second is to find the posture template that most closely matches the current data record, by comparing the current sensor readings with the stored set.
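The two parts described above can be sketched in a few lines of Python. This is a minimal illustration rather than the method of any cited system: the template values, the use of Euclidean distance, and the `max_distance` rejection threshold are all assumptions made for the example.

```python
import math

# Hypothetical posture templates: posture name -> averaged sensor readings
# (for example, joint flexion in degrees). The values are illustrative only.
TEMPLATES = {
    "fist": [90.0, 90.0, 90.0, 90.0, 90.0],
    "open_hand": [0.0, 0.0, 0.0, 0.0, 0.0],
    "point": [90.0, 0.0, 90.0, 90.0, 90.0],
}


def match_posture(reading, templates=TEMPLATES, max_distance=60.0):
    """Return the name of the closest template, or None if none is close enough.

    Euclidean distance and the rejection threshold are assumptions of this
    sketch; other distance measures could be substituted.
    """
    best_name, best_dist = None, float("inf")
    for name, template in templates.items():
        dist = math.dist(reading, template)  # Euclidean distance
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None
```

A reading such as `[85, 5, 88, 92, 80]` lies closest to the hypothetical "point" template, while a reading halfway between all templates is rejected by the threshold.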
B. Feature Extraction Analysis
Feature extraction and analysis is the process by which low-level information from the raw data is analyzed to produce higher-level semantic information, which is then used to recognize postures and gestures. One reported system recognized its gestures with over 97% accuracy [26]. It is a robust way to recognize hand postures and gestures [1], and it can be used to recognize both simple and complex hand postures and gestures.
C. Active Shapes Model
Active shape models, or “smart snakes”, are a technique for locating a feature within a still image. A contour that is roughly the shape of the feature to be tracked is placed on the image.
The contour is then manipulated by moving it iteratively toward nearby edges, which deform the contour to fit the feature.
For tracking, the active shape model is applied to each frame, and the position of the feature in that frame is used as an initial approximation for the next frame.
D. Principal Component Analysis
Principal Component Analysis (PCA) is a statistical technique for reducing the dimensionality of a data set in which there are many interrelated variables, while retaining as much as possible of the variation in the data set [6].
The data set is reduced by transforming the old data to a new set of variables that are ordered so that the first few variables contain most of the variation present in the original variables.
The original data set is transformed by computing the eigenvectors and eigenvalues of the data set's covariance matrix.
A drawback of PCA when dealing with image data is that it is highly sensitive to the position, orientation, and scaling of the hand in the image.
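As a rough illustration of the covariance and eigenvector steps, the sketch below finds only the first principal component, using power iteration in plain Python. Power iteration is a simplification chosen for this example (a full PCA would compute all eigenvectors, typically with a linear algebra library), and the function names are hypothetical.

```python
def covariance_matrix(rows):
    """Return (covariance matrix, mean-centered rows) for a list of samples."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return cov, centered


def leading_eigenvector(matrix, iterations=200):
    """Approximate the eigenvector with the largest eigenvalue.

    Assumes the all-ones start vector is not orthogonal to that eigenvector
    and that the matrix is not all zeros (otherwise the norm becomes zero).
    """
    d = len(matrix)
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def project_onto_first_component(rows):
    """Reduce each sample to one coordinate along the first principal component."""
    cov, centered = covariance_matrix(rows)
    pc1 = leading_eigenvector(cov)
    scores = [sum(c[j] * pc1[j] for j in range(len(pc1))) for c in centered]
    return pc1, scores
```

For points lying on the line y = 2x, the recovered component points along (1, 2), so a single coordinate per sample retains all of the variation.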
E. Linear Fingertip Models
This model assumes that most finger movements are linear and comprise very little rotational movement. It uses only the fingertips as input data, which permits a model that represents each fingertip trajectory through space as a simple vector [13].
Once the fingertips are detected, their trajectories are calculated using motion correspondence. The postures themselves are modeled from a small training set by storing a motion code, the gesture name, and direction and magnitude vectors for each of the fingertips. A posture is recognized if all the direction and magnitude vectors match (within some threshold) a gesture record in the training set. System testing showed good recognition accuracy (greater than 90%), but the system did not run in real time, and the posture and gesture set should be expanded to determine whether the technique is robust.
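The threshold-based matching step can be sketched as follows. The record layout (one direction angle and one magnitude per fingertip), the tolerance values, and the function names are assumptions made for illustration; the cited system's actual data format is not given here.

```python
import math


def direction_and_magnitude(start, end):
    """Summarize a fingertip trajectory as (direction in radians, length)."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    return math.atan2(dy, dx), math.hypot(dx, dy)


def angle_difference(a, b):
    """Smallest absolute difference between two angles, in radians."""
    diff = abs(a - b) % (2 * math.pi)
    return min(diff, 2 * math.pi - diff)


def matches_record(observed, record, angle_tol=0.35, magnitude_tol=15.0):
    """True if every fingertip vector is within tolerance of the stored record.

    observed, record: lists of (direction, magnitude), one entry per fingertip.
    The tolerances are hypothetical values.
    """
    if len(observed) != len(record):
        return False
    return all(
        angle_difference(od, rd) <= angle_tol and abs(om - rm) <= magnitude_tol
        for (od, om), (rd, rm) in zip(observed, record)
    )


def recognize(observed, training_set, **tolerances):
    """Return the first gesture name whose record matches, else None."""
    for name, record in training_set.items():
        if matches_record(observed, record, **tolerances):
            return name
    return None
```

Two fingertips that both move roughly 100 pixels to the right would match a stored "swipe right" record under these tolerances, while the same motion to the left would match nothing.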
F. Causal Analysis
Causal analysis is a vision-based recognition technique that stems from work in scene analysis. The technique extracts information from a video stream by using high-level knowledge about actions in the scene and how they relate to one another and to the physical environment. The system captures information on shoulder, elbow and wrist joint positions in the image plane. From these positions, the system extracts a feature set that includes wrist acceleration and deceleration, work done against gravity, size of gesture, area between arms, angle between forearms, nearness to body, and verticality. Gesture filters normalize and combine the features and use causal knowledge of how humans interact with objects in the physical world to recognize gestures such as opening, lifting, patting, pushing, stopping, and clutching. It is not clear how accurate this method is. The system also has the disadvantage of not using data from the fingers. More research is needed to determine whether this technique is robust enough to be used in any nontrivial applications.


A. Hidden Markov Model
Hidden Markov Models (HMMs) deal with the dynamic aspects of gestures [25]. Gestures are extracted from a sequence of video images by tracking the skin-colour blobs corresponding to the hand in a body-face space centered on the face of the user. The goal is to recognize two classes of gestures: deictic and symbolic. The image is filtered using a fast look-up indexing table.
After filtering, skin colour pixels are gathered into blobs. Blobs are statistical objects based on the location (x,y) and the colourimetry (Y,U,V) of the skin colour pixels in order to determine homogeneous areas.
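To show how an HMM can score the dynamic part of a gesture, the sketch below runs the standard forward algorithm on a sequence of discrete motion symbols and picks the gesture class whose model gives the highest likelihood. The two models, their probabilities, and the symbol coding (0 for a still hand, 1 for a moving hand) are invented for illustration and are not taken from the cited work.

```python
def forward_likelihood(observations, start_p, trans_p, emit_p):
    """Probability of an observation sequence under a discrete HMM.

    start_p[s]: initial probability of state s
    trans_p[p][s]: probability of moving from state p to state s
    emit_p[s][o]: probability of emitting symbol o in state s
    """
    n_states = len(start_p)
    alpha = [start_p[s] * emit_p[s][observations[0]] for s in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            emit_p[s][obs] * sum(alpha[p] * trans_p[p][s] for p in range(n_states))
            for s in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state models, one per gesture class.
MODELS = {
    # Pointing: the hand moves briefly, then holds still.
    "deictic": ([0.9, 0.1], [[0.6, 0.4], [0.1, 0.9]], [[0.3, 0.7], [0.9, 0.1]]),
    # Symbolic: the hand keeps moving throughout the gesture.
    "symbolic": ([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]], [[0.1, 0.9], [0.2, 0.8]]),
}


def classify(observations, models=MODELS):
    """Return the class whose HMM assigns the sequence the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(observations, *models[name]))
```

In practice the model parameters would be learned from training sequences, and the observations would come from the tracked blob positions rather than a hand-coded symbol stream.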
B. YUV colour space and camshift algorithm
This method recognizes hand gestures in the following five steps [1].
First, a digital camera records a video stream of hand gestures.
All frames are taken into consideration, and skin-colour based segmentation is performed using the YUV colour space.
The YUV colour system is employed for separating chrominance and intensity. The symbol Y indicates intensity while UV specifies chrominance components.
The hand is then separated using the CAMSHIFT algorithm. Since the hand is the largest connected region, it can be segmented from the body.
Next, the position of the hand centroid is calculated in each frame, by first calculating the zeroth and first moments and then deriving the centroid from them.
Finally, the centroid points are joined to form a trajectory. This trajectory shows the path of the hand movement, and thus the hand is tracked.
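The centroid and trajectory steps can be written out directly from the moment definitions: the zeroth moment M00 is the number of hand pixels, the first moments M10 and M01 are the sums of their x and y coordinates, and the centroid is (M10/M00, M01/M00). The sketch below assumes each frame has already been segmented into a binary mask (1 for hand pixels); the function names are hypothetical.

```python
def centroid(mask):
    """Centroid (x, y) of a binary mask given as a 2D list, or None if empty."""
    m00 = m10 = m01 = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            m00 += value       # zeroth moment: area of the hand region
            m10 += x * value   # first moment along x
            m01 += y * value   # first moment along y
    if m00 == 0:
        return None  # no hand pixels in this frame
    return (m10 / m00, m01 / m00)


def trajectory(masks):
    """Join the per-frame centroids into a path, skipping empty frames."""
    points = (centroid(m) for m in masks)
    return [p for p in points if p is not None]
```

Skipping frames without any hand pixels is a design choice of this sketch; a tracker could instead interpolate or stop the trajectory at that point.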
C. Using Time-of-Flight Camera
This approach uses x- and y-projections of the image and optional depth features for gesture classification. The system uses a 3-D time-of-flight (TOF) sensor, which has the big advantage of simplifying hand segmentation. The gestures used in the system show good separation potential along the two image axes. Hence, the projections of the hand onto the x- and y-axes are used as features for the classification.
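A minimal sketch of the projection features follows, assuming the hand has already been segmented into a binary mask (1 for hand pixels). The depth features mentioned in the text are omitted, and the function name is hypothetical.

```python
def axis_projections(mask):
    """Return (x_projection, y_projection) of a binary mask.

    x_projection[i] counts the hand pixels in column i;
    y_projection[j] counts the hand pixels in row j.
    """
    x_proj = [sum(col) for col in zip(*mask)]
    y_proj = [sum(row) for row in mask]
    return x_proj, y_proj
```

The two lists can be concatenated into one feature vector for a classifier; a horizontally extended hand and a vertically extended hand produce clearly different profiles along the two axes.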
D. Naïve Bayes’ Classifier
The Naïve Bayes' classifier is an effective and fast method for static hand gesture recognition. It classifies the different gestures according to geometry-based invariants that are obtained from the image data after segmentation; thus, unlike many other recognition methods, this method does not depend on skin colour. The gestures are extracted from each frame of the video, with a static background [15]. The first step is to segment and label the objects of interest and to extract geometric invariants from them. The next step is the classification of gestures, using a K-nearest neighbor algorithm aided by a distance weighting algorithm (KNNDW) to provide suitable data for a locally weighted Naïve Bayes' classifier. The invariants of each region of interest form the input vector for this classifier, while the output is the type of the gesture. The final step is to locate the specific properties of the gesture that are needed for processing in the system.
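The distance-weighting idea can be illustrated with a plain distance-weighted K-nearest-neighbor vote. This sketch covers only that part: it does not implement the locally weighted Naïve Bayes' stage described above, and the feature vectors, labels, and inverse-distance weighting are assumptions made for the example.

```python
import math
from collections import defaultdict


def knn_distance_weighted(query, samples, k=3):
    """Classify `query` by an inverse-distance-weighted vote of its k nearest samples.

    samples: list of (feature_vector, label) pairs, for example geometric
    invariants extracted from segmented regions (hypothetical data).
    """
    nearest = sorted(samples, key=lambda s: math.dist(query, s[0]))[:k]
    votes = defaultdict(float)
    for features, label in nearest:
        # Closer neighbors contribute more; the small constant avoids
        # division by zero when the query coincides with a sample.
        votes[label] += 1.0 / (math.dist(query, features) + 1e-9)
    return max(votes, key=votes.get)
```

The weighting matters when the nearest sample disagrees with the majority of the k neighbors: a single very close "fist" sample can outvote two distant "palm" samples.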
E. 3D Hand Model Based Approach
This approach relies on a 3D kinematic hand model with considerable DOFs and tries to estimate the hand parameters by comparing the input images with the possible 2D appearances projected by the 3D hand model [18]. It is ideal for realistic interactions in virtual environments.
F. Appearance Based Approach
This method uses image features to model the visual appearance of the hand and compares these parameters with the image features extracted from the video input. It achieves real-time performance because the 2D image features employed are easier to compute. It is a straightforward and simple approach that is often utilized [18].


Efficient hand tracking and segmentation is the key to success in any gesture recognition system. Vision-based methods face challenges such as varying lighting conditions, complex backgrounds and skin color detection, and variation in human skin color complexion requires robust algorithms for a natural interface. Color is a very powerful descriptor for object detection.
A. Anticipated Static Gesture Set
A static gesture is a specific posture that has been assigned a meaning. The proposed system defines a set of static gestures, each with a specific meaning. Once a specified posture is recognized, the application interface performs the corresponding action. Simplicity and user friendliness were taken into consideration in designing the anticipated posture set. For mouse cursor movement, the center of the hand gesture window was passed as the mouse cursor position.
B. Hand Segmentation Using HSV Color Space And Sampled Storage Approach
A novel image segmentation algorithm has been developed and tested for a green glove. In this approach, color-based segmentation was attempted using the HSV color space.
The H, S and V separation was done using the following equations.
δ = V − min{R, G, B}
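For context, the quantity δ appears in the standard RGB-to-HSV conversion, where V = max{R, G, B}, S = δ/V, and H depends on which channel is the maximum. The sketch below writes out that standard conversion and adds a simple test for green pixels; it is not necessarily the exact formulation used by the authors, and the hue range and the saturation and value thresholds are hypothetical.

```python
def rgb_to_hsv(r, g, b):
    """Standard conversion for r, g, b in [0, 1].

    Returns (h, s, v) with h in degrees [0, 360) and s, v in [0, 1].
    """
    v = max(r, g, b)
    delta = v - min(r, g, b)        # the quantity called delta in the text
    s = 0.0 if v == 0 else delta / v
    if delta == 0:
        h = 0.0                     # hue is undefined for grays; use 0
    elif v == r:
        h = 60.0 * (((g - b) / delta) % 6)
    elif v == g:
        h = 60.0 * ((b - r) / delta + 2)
    else:
        h = 60.0 * ((r - g) / delta + 4)
    return h, s, v


def is_green(r, g, b, hue_range=(90.0, 150.0), min_s=0.4, min_v=0.2):
    """True if the pixel falls inside a hypothetical 'green glove' range."""
    h, s, v = rgb_to_hsv(r, g, b)
    return hue_range[0] <= h <= hue_range[1] and s >= min_s and v >= min_v
```

Applying `is_green` to every pixel yields a binary mask of the glove; in practice the thresholds would be tuned from sampled glove pixels under the actual lighting.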
C. Hand Segmentation Using Lab Color Space
The input RGB image is converted to the Lab color space. In CIE L*a*b* coordinates, L* defines lightness, a* represents the red/green value and b* denotes the blue/yellow value. Along the a* axis, movement in the +a direction shifts toward red, while along the b* axis, movement in the +b direction shifts toward yellow. Once the image has been converted into a* and b* planes, thresholding is applied. A convolution operation is applied to the binary images for segmentation, and morphological processing is done to obtain a better hand shape. This algorithm works for skin color detection but is sensitive to complex backgrounds. The steps are as follows:
Capture the image.
Read the input image.
Convert the RGB image into the Lab color space.
Convert the color values in the image I into the color space specified by the color transformation structure cform.
Compute the threshold value.
Convert the intensity image into a binary image.
Perform morphological operations such as erosion.
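The last three steps can be sketched in plain Python on a single channel. The sketch assumes the a* plane has already been computed (the RGB-to-Lab conversion itself is omitted), uses a fixed threshold range supplied by the caller instead of a computed one, and applies a 3x3 erosion; the function names are hypothetical.

```python
def threshold(channel, low, high):
    """Binary image: 1 where low <= pixel <= high, else 0."""
    return [[1 if low <= p <= high else 0 for p in row] for row in channel]


def erode(mask):
    """3x3 erosion of a binary image given as a 2D list.

    A pixel stays 1 only if it and all eight neighbors are 1. Border pixels
    are set to 0, which treats the area outside the image as background.
    """
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = int(all(mask[y + dy][x + dx]
                                for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
    return out
```

Erosion removes isolated pixels that passed the threshold by chance and shrinks the hand region slightly; a following dilation would restore the region's size without bringing the noise back.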
D. Hand Tracking And Segmentation (HTS) Algorithm
The objective of this algorithm was robust skin color detection and removal of complex backgrounds. Both hand detection and segmentation were attempted. Hand tracking was done using the mean shift algorithm. Only odd-numbered frames were processed, for speed. For better performance, a sample of the user's skin color was passed in and an HSV histogram was created. The CamShift function within the OpenCV library is used for tracking and detection. An edge traversal algorithm is used to obtain a fine contour of the hand shape. Because a dynamic background was considered while capturing the user's gesture, edge detection could pick up unwanted edges from the background. To identify only the boundary of the user's hand, an edge traversal algorithm was devised.
Capture the image frames from camera.
Process odd frames and track the hand using the CamShift function, providing skin color samples at run time.
Create the HSV histogram and pass the experimentally chosen threshold value to the CamShift function to track the required hand portion.
Segment the required hand portion from Image.
Find the edges by using Canny edge detection.
Dilate the image.
Erode the image.
Apply edge traversal algorithm to get final contour.
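The tracking step rests on mean shift, which repeatedly moves a search window to the centroid of the skin-probability values it covers. The sketch below implements that plain mean shift step on a 2D list of weights (for example, a histogram back-projection). It is not OpenCV's CamShift function, which additionally adapts the window size and orientation, and the function name and parameters are hypothetical.

```python
import math


def mean_shift(weights, window, max_iter=20):
    """Move a fixed-size window toward the local peak of `weights`.

    weights: 2D list of non-negative values (e.g. skin-color probabilities).
    window: (x, y, w, h) with (x, y) the top-left corner.
    Returns the final window.
    """
    x, y, w, h = window
    rows, cols = len(weights), len(weights[0])
    for _ in range(max_iter):
        m00 = m10 = m01 = 0.0
        for yy in range(max(y, 0), min(y + h, rows)):
            for xx in range(max(x, 0), min(x + w, cols)):
                p = weights[yy][xx]
                m00 += p
                m10 += xx * p
                m01 += yy * p
        if m00 == 0:
            break  # nothing to track inside the window
        # Re-center the window on the weighted centroid.
        new_x = math.floor(m10 / m00 - (w - 1) / 2 + 0.5)
        new_y = math.floor(m01 / m00 - (h - 1) / 2 + 0.5)
        if (new_x, new_y) == (x, y):
            break  # converged
        x, y = new_x, new_y
    return x, y, w, h
```

Starting the window of each processed frame at the previous frame's result gives frame-to-frame tracking, as long as the hand does not move farther than the window can reach between frames.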


Hand gestures provide an interesting interaction paradigm for a variety of computer applications. A central issue for these recognition techniques is which technology to use for collecting raw data from the hand. Generally, two types of technology are available for collecting this raw data. The first is a glove input device, which measures a number of joint angles in the hand. The accuracy of a glove input device depends on the type of bend sensor technology used; usually, the more accurate the glove is, the more expensive it is. The second way of collecting raw data is to use computer vision. In a vision-based solution, one or more cameras placed in the environment record hand movement. With a vision-based hand posture or gesture interface, the user does not have to wear a device or be physically attached to the computer. If vision-based solutions can overcome some of their difficulties and disadvantages, they appear to be the best choice for raw data collection. A number of recognition techniques are available, such as template matching, feature extraction and active shape models. Segmentation approaches such as the Anticipated Static Gesture Set, Hand Segmentation Using HSV Color Space and Sampled Storage Approach, and the Hand Tracking and Segmentation (HTS) algorithm segment the given input so that it can be passed on for recognition with as little noise as possible. Hand gesture recognition models such as the Hidden Markov Model, the YUV colour space model, the 3D model and the appearance model then detect the input and process it for recognition.


  1. Aarti Malik and Ruchika, "Gesture Technology: A Review," Department of Electronics and Communication Engineering, International Journal of Electronics and Computer Science Engineering, ISSN 2277-1956.
  2. Akira Utsumi, Tsutomu Miyasato, Fumio Kishino and Ryohei Nakatsu, "Real-time Hand Gesture Recognition System," Proc. of ACCV '95, vol. 11, pp. 249-253, Singapore, 1995.
  3. Attila Licsár and Tamás Szirányi, "Dynamic Training of Hand Gesture Recognition System," Department of Image Processing and Neurocomputing, University of Veszprém, H-8200 Veszprém, 23-26 Aug. 2004.
  4. L. Bretzner and T. Lindeberg, "Relative orientation from extended sequences of sparse point and line correspondences using the affine trifocal tensor," in Proc. 5th Eur. Conf. Computer Vision, Berlin, Germany, June 1998, vol. 1406, Lecture Notes in Computer Science, pp. 141–157, Springer Verlag.
  5. S. Y. Chen, Y. F. Li, and J. W. Zhang, "Vision processing for real-time 3D data acquisition based on coded structured light," IEEE Trans. Image Process., vol. 17, no. 2, pp. 167–176, Feb. 2008.
  6. N. H. Dardas and E. M. Petriu, "Hand gesture detection and recognition using principal component analysis," 2011 IEEE International Conference on Computational Intelligence for Measurement Systems and Applications (CIMSA), 19-21 Sept. 2011.
  7. Doe-Hyung Lee and Kwang-Seok Hong, "A Hand Gesture Recognition System Based on Difference Image Entropy," School of Information and Communication Engineering, Sungkyunkwan University.
  8. P. Garg, N. Aggarwal, and S. Sofat, "Vision based hand gesture recognition," in Proc. World Acad. Sci., Eng. Technol., vol. 49, pp. 972–977, 2009.
  9. M. Keck and J. W. Davis, "3D occlusion recovery using few cameras," in Proc. IEEE Conf. Comput. Vision Pattern Recog., Anchorage, AK, Jun. 2008, pp. 1–8.
  10. C. Manresa, J. Varona, R. Mas and F. Perales, "Real-Time Hand Tracking and Gesture Recognition for Human-Computer Interaction," Electronic Letters on Computer Vision and Image Analysis, vol. 5, no. 3, 2000, pp. 96-104.
  11. Y. Ma and X. Ding, "Robust real-time face detection based on cost-sensitive AdaBoost method," in Proc. ICME, vol. 2, pp. 465–472, 2003.
  12. J. Neumann, C. Fermuller, and Y. Aloimonos, "A hierarchy of cameras for 3D photography," in Proc. Int. Symp. 3D Data Process. Vis. Transmiss., pp. 2–11, 2002.
  13. K. Oka, Y. Sato and H. Koike, "Real-time tracking of multiple fingertips and gesture recognition for augmented desk interface systems," in IEEE International Conference on Automatic Face and Gesture Recognition, 2002.
  14. P. Premaratne and Q. Nguyen, "Consumer electronics control system based on hand gesture moment invariants," Computer Vision, IET, pp. 35–41, 2007.
  15. Pujan Ziaie, Thomas Müller, Mary Ellen Foster, and Alois Knoll, "A Naïve Bayes Classifier with Distance Weighting for Hand-Gesture Recognition," Technical University of Munich, Dept. of Informatics VI, Robotics and Embedded Systems, Boltzmannstr. 3, DE-85748 Garching, Germany.
  16. Ruize Xu, Shengli Zhou, and Wen J. Li, "MEMS Accelerometer Based Nonspecific-User Hand Gesture Recognition," IEEE Sensors Journal, vol. 12, no. 5, May 2012.
  17. Sheng-Yu Peng, Kanoksak, Hwei-Jen Lin and Kuan-Ching Li, "A Real-Time Hand Gesture Recognition System for Daily Information Retrieval from Internet," Fourth International Conference on Ubi-Media Computing, 2011.
  18. Sidenbladh, M. Black, and D. Fleet, "Stochastic tracking of 3D human figures using 2D image motion," in Sixth European Conference on Computer Vision, pp. II:702–718, Dublin, Ireland, 2000.
  19. Stenger, "Template based Hand Pose recognition using multiple cues," in Proc. 7th Asian Conference on Computer Vision: ACCV 2006.
  20. H. I. Suk, B. K. Sin and S. W. Lee, "Robust Modeling and Recognition of Hand Gestures with Dynamic Bayesian Network," IEEE International Conference on Pattern Recognition, pp. 1-4, Dec. 2008.
  21. J. Triesch and C. von der Malsburg, "Robust classification of hand postures against complex background," in Int. Conf. on Face and Gesture Recognition, pp. 170–175, Killington, Vermont, 1996.
  22. P. K. Turaga, R. Chellappa, V. S. Subrahmanian, and O. Udrea, "Machine recognition of human activities: A survey," IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 11, pp. 1473–1488, Nov. 2008.
  23. Wne-Pinn Fang, "An Intelligent Hand Gesture Extraction and Recognition System for Home Care Application," Department of Computer Science and Information Engineering, Yuanpei University, Taiwan, R.O.C., 2012 Sixth International Conference on Genetic and Evolutionary Computing.
  24. J. Wu, G. Pan, D. Zhang, G. Qi, and S. Li, "Gesture recognition with a 3-D accelerometer," in Proc. Ubiquitous Intell. Comput.: Lecture Notes Comput. Sci., 2009, vol. 5585/2009, pp. 25–38.
  25. T. Yang and Y. Xu, "Hidden Markov Model for Gesture Recognition," CMU-RI-TR-94-10, Robotics Institute, Carnegie Mellon Univ., Pittsburgh, PA, May 1994.
  26. Yinghui Zhou, Lei Jing, Junbo Wang and Zixue Cheng, "Analysis and Selection of Features for Gesture Recognition Based on a Micro Wearable Device," Graduate School of Computer Science and Engineering, University of Aizu Wakamatsu, Japan, (IJACSA) International Journal of Advanced Computer Science and Applications, vol. 3, no. 1, 2012.
  27. Yona Falinie Abdul Gaus and Farrah Wong, "Hidden Markov Model-based Gesture Recognition with Overlapping Hand-Head/Hand-Hand Estimated using Kalman Filter," School of Engineering and Information Technology, University Malaysia Sabah, Jalan UMS, 88400 Kota Kinabalu, Sabah, Malaysia, 2012.
  28. Yoshio Iwai, Yasushi Yagi and Masahiko Yachida, "Estimation of Hand Motion and Position from Monocular Image Sequence," Proc. of ACCV '95, vol. 11, pp. 230-234, Singapore, 1995.
  29. S. Zhou, Q. Shan, F. Fei, W. J. Li, C. P. Kwong, C. K. Wu et al., "Gesture recognition for interactive controllers using MEMS motion sensors," in Proc. IEEE Int. Conf. Nano/Micro Engineered and Molecular Systems, Jan. 2009, pp. 935–940.
  30. S. Zhou, Z. Dong, W. J. Li, and C. P. Kwong, "Hand-written character recognition using MEMS motion sensing technology," in Proc. IEEE/ASME Int. Conf. Advanced Intelligent Mechatronics, 2008, pp. 1418–1423.