ISSN ONLINE(2320-9801) PRINT (2320-9798)

Robust Shape Representation Using Minimum Near-Convex Decomposition

Ms.Sathya Prabha.P1, Mr.Santhosh.R2
  1. PG Scholar, Karpagam University, Coimbatore
  2. Assistant Professor/CSE, Karpagam University, Coimbatore

Visit for more related articles at International Journal of Innovative Research in Computer and Communication Engineering


It is natural to represent an object by its parts, and there is strong evidence for part-based representations in human vision. Part-based representations allow for recognition that is robust in the presence of occlusion, movement, deletion, or growth of portions of an object. In the task of forming high-level object-centered models from low-level image-based features, parts serve as an intermediate representation.


It is natural to represent an object by its parts, and there is strong evidence for part-based representations in human vision. Part-based representations allow for recognition that is robust in the presence of occlusion, movement, deletion, or growth of portions of an object. In the task of forming high-level object-centered models from low-level image-based features, parts serve as an intermediate representation. Given an arbitrary shape, it is thus of great interest to decompose it into a number of natural parts, where each part satisfies a certain geometric constraint. The most popular constraint is the convexity constraint, because a convex part is visually natural and geometrically simple and can thus serve as a satisfactory primitive for recognition; many operators that are too complicated to apply to the original object can easily be applied to its convex parts.
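As a small illustration of the convexity constraint, the following sketch tests whether a single part is strictly convex. It assumes the part is a simple (non-self-intersecting) polygon given as an ordered list of (x, y) vertices; the function name is_convex is hypothetical and is not taken from the paper.

```python
def is_convex(polygon):
    """Return True if a simple polygon (ordered list of (x, y) vertices) is convex.

    Assumption: the polygon does not self-intersect. The test checks that
    every turn along the boundary has the same orientation.
    """
    n = len(polygon)
    if n < 3:
        return False
    sign = 0
    for i in range(n):
        ax, ay = polygon[i]
        bx, by = polygon[(i + 1) % n]
        cx, cy = polygon[(i + 2) % n]
        # z-component of the cross product of edges (a -> b) and (b -> c)
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        if cross != 0:
            current = 1 if cross > 0 else -1
            if sign == 0:
                sign = current
            elif current != sign:
                return False  # the boundary turns both left and right
    return sign != 0  # all-collinear input is not treated as a convex part


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
print(is_convex(square), is_convex(l_shape))
```

A test of this kind accepts only strictly convex parts, so even a one-pixel dent in the boundary makes it fail.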
Two methods present themselves by which a given gesture could be recognised from two-dimensional “silhouette” information.
Direct method based on geometry: Knowing that the hand is made up of bones of fixed width connected by joints which can only flex in certain directions and by limited angles, it would be possible to calculate the silhouettes for a large number of hand gestures. Thus, it would be possible to take the silhouette information provided by the detection method and find the most likely gesture that corresponds to it by direct comparison. The advantages of this method are that it would require very little training and would be easy to extend to any number of gestures as required. However, the model for calculating the silhouette for any given gesture would be hard to construct, and in order to attain a high degree of accuracy it would be necessary to model the effect of all light sources in the room on the shadows cast on the hand by itself.
Learning method: With this method the gesture set to be recognised would be “taught” to the system beforehand. Any given gesture could then be compared with the stored gestures and a match score calculated. The highest scoring gesture could then be displayed if its score is greater than some match quality threshold.
The advantage of this system is that no prior information is required about the lighting conditions or the geometry of the hand for the system to work, as this information would be encoded into the system during training. The system would be faster than the above method if the gesture set was kept small. The disadvantage with this system is that each gesture would need to be trained at least once and for any degree of accuracy, several times. The gesture set is also likely to be user specific. It was decided to proceed with the learning method for reasons of computation speed and ease of implementation.
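The learning method described above can be sketched as follows. The text does not say how the match score is computed, so this example assumes silhouettes are equally sized binary grids and uses intersection-over-union as a stand-in score; the names match_score and recognise, the threshold value, and the toy templates are all hypothetical.

```python
def match_score(a, b):
    """Intersection-over-union of two equally sized binary silhouettes.

    Assumption: IoU is used only as an illustrative score; the source does
    not specify the actual match measure.
    """
    inter = 0
    union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            if pa and pb:
                inter += 1
            if pa or pb:
                union += 1
    return inter / union if union else 0.0


def recognise(silhouette, templates, threshold=0.8):
    """Return the label of the best-scoring stored gesture, or None if the
    best score does not exceed the match quality threshold."""
    best_label, best_score = None, 0.0
    for label, examples in templates.items():
        for example in examples:  # each gesture may be trained several times
            score = match_score(silhouette, example)
            if score > best_score:
                best_label, best_score = label, score
    return best_label if best_score > threshold else None


# Toy 3x3 "training set" (made-up data, for illustration only).
templates = {
    "fist": [[[0, 1, 0], [1, 1, 1], [0, 1, 0]]],
    "flat": [[[1, 1, 1], [0, 0, 0], [0, 0, 0]]],
}
print(recognise([[0, 1, 0], [1, 1, 1], [0, 1, 0]], templates))
```

Because every stored example is compared in turn, the cost grows with the size of the gesture set, which matches the remark that the approach is fast only while the set is kept small.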


A simplification used in this project, which was not found in any recognition methods researched, is the use of a wrist band to remove several degrees of freedom. This enabled three new recognition methods to be devised. The recognition frame rate achieved is comparable to most of the systems in existence (after allowance for processor speed) but the number of different gestures recognised and the recognition accuracy are amongst the best found. Figure 3 shows several of the existing gesture recognition systems along with recognition statistics and method.


A robust shape representation via our method. The first column shows the same objects with different degrees of local distortion; the second column shows the near-convex decomposition results using our method; the third column shows the shape representations obtained by replacing each part with its convex hull; and the fourth column shows the graph representations obtained by replacing each part with a node. Despite severe local distortions, because our method decomposes a shape into a minimum number of near-convex parts, it avoids introducing redundant parts and thus yields consistent decomposition results. The last two columns are the results of existing near-convex decomposition methods [10] and [12], respectively.


To this end, strict convex decomposition has been a well-studied problem in computational geometry. However, in practice, strict convex decomposition is not robust because it is sensitive to small


To evaluate our method on 2D shapes, we test it on the MPEG-7 shape dataset [9]. Excluding simple shapes such as the heart shape, which can be easily decomposed, we select 20 complex shape categories from the MPEG-7 dataset, each containing 20 shapes (20×20 = 400 shapes). Fig. 4 shows an image for each selected category.
Decomposing a shape into visually meaningful parts comes naturally to human vision, but recreating this fundamental operation in computers has been shown to be difficult. Similar challenges have puzzled researchers in shape reconstruction for decades. In this paper, we recognize the strong connections between shape reconstruction and shape decomposition at a fundamental level and propose a method called α-decomposition.


Our algorithm has two parameters: a user-specified concavity tolerance for near-convex decomposition, and a parameter that introduces visual naturalness. The concavity tolerance determines how small a concave feature must be for the user to ignore it in near-convex decomposition.
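To make the role of the concavity tolerance concrete, here is a minimal sketch of a near-convexity test. It assumes a simple stand-in concavity measure, the largest distance from any contour vertex to the boundary of the convex hull, which is not necessarily the measure used by the authors. All function names and the example polygon are hypothetical.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def concavity(polygon):
    """Assumed measure: largest vertex-to-hull-boundary distance (0 if convex)."""
    hull = convex_hull(polygon)
    edges = list(zip(hull, hull[1:] + hull[:1]))
    return max(
        min(point_segment_distance(p, a, b) for a, b in edges) for p in polygon
    )


def is_near_convex(polygon, tolerance):
    """A part is accepted when its concavity does not exceed the tolerance."""
    return concavity(polygon) <= tolerance


# A square with a shallow notch of depth 0.5 on its top edge (made-up data).
notched = [(0, 0), (4, 0), (4, 4), (2, 3.5), (0, 4)]
print(is_near_convex(notched, 1.0), is_near_convex(notched, 0.1))
```

With a tolerance of 1.0 the notch is ignored and the shape counts as a single part. With a tolerance of 0.1 the shape would have to be split further.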


One advantage of our method is that it does not introduce redundant parts, as it decomposes the shape into a minimum number of parts. In terms of the number of parts, Table 2 presents the average reduction rate of our method compared with ACD [10] and CSD [12] at 4 different parameter settings on the MPEG-7 dataset. The average reduction rate scores are defined as:
As the table shows, we produce the fewest parts. Compared with ACD [10], up to 32.7% of the parts are eliminated as redundant, and up to 30.7% compared with CSD [12]. On average, the number of parts is reduced by 19.18% compared with ACD and by 10.62% compared with CSD. Thus, the efficiency of further applications on the decomposed parts can be greatly improved.
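The defining formula for the reduction rate is not reproduced in this text. As a rough sketch only, the example below assumes the rate for one shape is the relative drop in part count with respect to the baseline method, (N_baseline - N_MNCD) / N_baseline, averaged over all shapes. The function name and the part counts are made up for illustration and are not results from the paper.

```python
def average_reduction_rate(baseline_counts, mncd_counts):
    """Average relative reduction in the number of parts.

    Assumption: per-shape rate = (baseline - mncd) / baseline. This is a
    plausible reading of the score, not the paper's confirmed definition.
    """
    if not baseline_counts or len(baseline_counts) != len(mncd_counts):
        raise ValueError("need two non-empty lists of equal length")
    rates = [(b - m) / b for b, m in zip(baseline_counts, mncd_counts)]
    return sum(rates) / len(rates)


# Hypothetical part counts for two shapes (illustrative values only).
print(average_reduction_rate([4, 5], [3, 5]))
```

Under this assumed definition, a score above 0 means the baseline produced more parts than MNCD on average.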
On the other hand, the table shows that all the ACD↓ and CSD↓ scores are greater than 0 for every shape category and every parameter setting, which means that MNCD always produces the minimum number of parts, as proved in Theorem 1.

To further evaluate the visual naturalness of our decomposition, Fig. 7 compares our method with the method proposed by Mi and DeCarlo [14]. Mi’s method is specifically designed to decompose 2D shapes into natural parts.


For HCI based on hand gesture recognition [5], color, texture, shading, and context information are usually not robust enough for successful recognition, while the shape feature alone is often sufficient. However, vision-based hand gesture recognition is extremely hard because of two primary problems: 1. It is hard to segment the hand out of an image with a cluttered background; 2. Even with the shape of a hand, existing representations are not robust enough for gesture recognition. For example, the contour-based and skeleton-based representations can be affected by large local noise.
With the advent of the Kinect depth camera [1], we can accurately segment the hand shape using both image and depth information, as shown in Fig. 8. After that, we can use MNCD to robustly represent the hand shape for gesture recognition. With the Kinect depth camera, we collect a new hand gesture dataset with both color images and depth maps. Our dataset contains 3 hand gesture categories, namely Rock, Paper and Scissors; each category has 50 samples. For each category, an example is shown in the first two columns of Fig. 8.
However, even with the help of the Kinect depth camera, the image segmentation of the hand is not perfect. Due to the low resolution, it easily introduces large local distortions or other types of noise on the contour, as shown in the third column of Fig. 8.
Nevertheless, our MNCD is robust to most of these variations and decomposes hand shapes into natural primitives such as fingers and palm. We can recognize the hand gesture among Rock, Paper and Scissors simply by counting the number of parts. Letting k be the number of parts, we classify a gesture as Rock if k ≤ 2, Paper if k ≥ 5, and Scissors otherwise. Fig. 9 shows some recognition results using MNCD under various scale, orientation and illumination conditions.
The last column shows some imperfect results caused by unsatisfactory hand image segmentation. Our hand gesture recognition method using MNCD is robust to local distortions and to scale and orientation changes.
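The part-counting rule above is simple enough to write down directly. In this sketch the decomposition itself is assumed to have been done elsewhere, and only the resulting number of parts k is passed in; the function name classify_gesture is hypothetical.

```python
def classify_gesture(num_parts):
    """Map the number of near-convex parts k to a gesture label.

    Rule from the text: Rock if k <= 2, Paper if k >= 5, Scissors otherwise.
    """
    if num_parts <= 2:
        return "Rock"
    if num_parts >= 5:
        return "Paper"
    return "Scissors"


for k in range(1, 7):
    print(k, classify_gesture(k))
```

Because the rule depends only on k, a redundant part introduced by contour noise can change the label. This is why a decomposition that avoids redundant parts matters for this application.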


• Man-machine interface: Using hand gestures to control the computer mouse and/or keyboard functions. An example of this, which has been implemented in this project, controls various keyboard and mouse functions using gestures alone.
• 3D animation: Rapid and simple conversion of hand movements into 3D computer space for the purposes of computer animation.
• Visualisation: Just as objects can be visually examined by rotating them with the hand, so it would be advantageous if virtual 3D objects (displayed on the computer screen) could be manipulated by rotating the hand in space [Bretzner & Lindeberg, 1998].
• Computer games: Using the hand to interact with computer games would be more natural for many applications.


In this paper, we proposed a novel near-convex shape decomposition approach for robust shape representation, which decomposes 2D and 3D shapes into a minimum number of parts with high visual naturalness. Using the convexity constraint and the non-overlapping constraint, and by imposing the perception rules, we formulate the shape decomposition problem as a combinatorial optimization problem, whose globally optimal solution is found by a dynamic subgradient-based branch-and-bound search.


[1] Microsoft Corp., Redmond, WA. Kinect for Xbox 360.

[2] E. Balas and M. C. Carrera. A dynamic subgradient-based branch-and-bound procedure for set covering. Operations Research, 44:875–890, 1996.

[3] I. Biederman. Recognition-by-components: A theory of human image understanding. Psychological Rev., 94:115–147, 1987.

[4] T. A. Cass. Robust affine structure matching for 3D object recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20:1265–1274, 1998.

[5] A. Erol, G. Bebis, M. Nicolescu, R. D. Boyle, and X. Twombly. Vision-based hand pose estimation: A review. Computer Vision and Image Understanding, 108:52–73, 2007.

[6] D. D. Hoffman and M. Singh. Salience of visual parts. Cognition, 14:29–78, 1997.

[7] J. M. Keil and J. Snoeyink. Minimum convex decomposition.