ISSN: 2320-9801 (Online), 2320-9798 (Print)
Snehal Dongre¹, Sachine Patil²
International Journal of Innovative Research in Computer and Communication Engineering
ABSTRACT
This paper presents an application capable of replacing the mouse with the human face for interaction with the PC. Facial features (the nose tip and the eyes) are detected and tracked, and their movements are translated into mouse events. The coordinates of the nose tip in the video feed are interpreted as the coordinates of the mouse pointer on the screen, and right/left eye blinks trigger right/left click events. The only external device required is a webcam that supplies the video stream.
Keywords
Cursor control system, Eye detection, Eye movement, Face detection, Human-Computer interaction, Support Vector Machine (SVM).
I. INTRODUCTION
In the past few years, computing technology has advanced and become less costly. With high-performance processors and inexpensive cameras, real-time image-processing applications are within everyone's reach, and vision-based interaction is one of them. Our work aims to use the human face to interact with the machine: the system captures facial features with a camera and interprets their movements as events that communicate with the machine. We target people with hand disabilities that prevent them from using a mouse; our system lets them drive the pointer with facial movements instead. The nose tip was selected as the pointing device. The reason behind that decision is the location and shape of the nose: it sits in the middle of the face, which makes it comfortable to use as the feature that moves the mouse pointer and defines its coordinates; it lies on the axis the face rotates about; and its convex shape makes it easier to track as the face moves. Eye blinks were used to simulate clicks, so the user fires events by blinking. While other systems rely on special devices (e.g. infrared cameras, sensors, microphones), we used an off-the-shelf webcam that affords a moderate resolution and frame rate as the capturing device, in order to make the program affordable for all individuals. We present an algorithm that detects and tracks the desired facial features and distinguishes intentional eye blinks from involuntary ones.
II. RELATED WORK
With the growing attention to machine vision, interest in this kind of application has increased proportionally. As mentioned before, many different human features and monitoring devices have been used for this task, but during our research we focused only on works that involved facial features and webcams. We noticed a large diversity in the facial features that were selected, in the way they were detected and tracked, and in the functionality they provided. Researchers chose different facial features: eye pupils, eyebrows, nose tip, lips, eyelid corners, mouth corners; for each, they provided a reason for choosing that particular one. Different detection techniques were applied (e.g. feature-based, image-based), the goal always being more accurate results with less processing time. To control the mouse pointer, various points were tracked, ranging from the midpoint between the eyes and the midpoint between the eyebrows to the nose tip. To simulate mouse clicks, eye blinks, mouth opening/closing, and sometimes eyebrow movements were used. Each method we read about had some drawbacks: some used expensive equipment, some were not fast enough for real-time execution, and others were not robust and precise enough to replace the mouse. We tried to profit from the experience other researchers have gained in this field and added our own ideas to produce an application that is fast, robust, and usable.
III. ALGORITHM |
A. Design Considerations: |
Face Detection Algorithm Overview: |
Two main categories: |
1. Feature-based methods |
2. Image-based methods. |
Feature-based methods: first locate individual facial features (e.g. eyebrows, lips, eye pupils) and then infer the presence of a face from them.
Image-based methods: scan the image of interest with a window that looks for faces at all scales and locations.
B. Description of the Algorithm: |
Our system builds on the observation that people of different races share a similar skin color but differ in the brightness of that color (e.g., in the HSB color model, skin pixels have close H and S values but different B values).
In order to derive a suitable skin color model based on this idea, we use the pure r and g values, which are the R and G values of the RGB color model with brightness removed; they are calculated with the following equations:
r = R / (R + G + B)
g = G / (R + G + B)
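As a rough illustration of this skin test, the sketch below computes the pure r and g values and applies a threshold window. The r_range and g_range values are placeholder assumptions, not the thresholds derived later in the paper.

```python
import numpy as np

def skin_mask(frame_rgb, r_range=(0.35, 0.55), g_range=(0.25, 0.35)):
    """Return a boolean mask of skin-colored pixels.

    frame_rgb: H x W x 3 uint8 array. The r/g ranges here are
    placeholder values, not the paper's derived thresholds.
    """
    rgb = frame_rgb.astype(np.float64)
    total = rgb.sum(axis=2) + 1e-6           # avoid division by zero
    r = rgb[..., 0] / total                  # pure r: brightness removed
    g = rgb[..., 1] / total                  # pure g: brightness removed
    return ((r_range[0] <= r) & (r <= r_range[1]) &
            (g_range[0] <= g) & (g <= g_range[1]))
```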
We use feature-based face detection to reduce the area in which we look for the face, and thereby decrease the execution time.
To find face candidates, the Six-Segmented Rectangular (SSR) filter is used in the following way: first we calculate the integral image, making one pass over the video frame using these equations [3]:
s(x, y) = s(x, y-1) + i(x, y) ........ (7)
ii(x, y) = ii(x-1, y) + s(x, y) ...... (8)
Here s(x, y) is the cumulative row sum, i(x, y) is the pixel value at (x, y), s(x, -1) = 0, and ii(-1, y) = 0. Figure 2 shows an ideal location of the SSR filter, where its center is considered a face candidate.
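A minimal NumPy sketch of this step: the cumulative sums below are equivalent to recurrences (7) and (8), and rect_sum shows the standard four-lookup rectangle sum that the SSR filter relies on.

```python
import numpy as np

def integral_image(gray):
    """Compute ii such that ii[y, x] = sum of gray[0:y+1, 0:x+1].

    Equivalent to recurrences (7) and (8): a cumulative row sum
    followed by a cumulative column sum.
    """
    return gray.astype(np.int64).cumsum(axis=1).cumsum(axis=0)

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y) and size
    w x h, obtained from the integral image in four lookups."""
    total = ii[y + h - 1, x + w - 1]
    if x > 0:
        total -= ii[y + h - 1, x - 1]
    if y > 0:
        total -= ii[y - 1, x + w - 1]
    if x > 0 and y > 0:
        total += ii[y - 1, x - 1]
    return total
```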
(x, y) is the location of the filter (upper left corner). |
The plus sign is the center of the filter which is the face candidate. |
We can notice that in this ideal position the eyes fall in sectors S1 and S3, while the nose falls in sector S5. |
Since the eyes and eyebrows are darker than the between-the-eyes (BTE) region and the cheekbones, we deduce that, for a true face candidate:
sum(S1) < sum(S2) and sum(S3) < sum(S2) (the eyes are darker than the BTE region)
sum(S1) < sum(S4) and sum(S3) < sum(S6) (the eyes are darker than the cheekbones)
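Reusing integral_image and rect_sum from the sketch above, a hypothetical SSR candidate test could look as follows; the equal-thirds/equal-halves sector proportions are our assumption, not necessarily the paper's exact geometry.

```python
def ssr_is_face_candidate(ii, x, y, w, h):
    """Test the SSR conditions at filter position (x, y), size w x h.

    The filter is split into a 3 x 2 grid of sectors:
        S1 S2 S3
        S4 S5 S6
    Eyes should fall in S1/S3, the nose in S5.
    """
    sw, sh = w // 3, h // 2
    s = {}
    for idx, (col, row) in enumerate([(0, 0), (1, 0), (2, 0),
                                      (0, 1), (1, 1), (2, 1)], start=1):
        s[idx] = rect_sum(ii, x + col * sw, y + row * sh, sw, sh)
    # Eyes (S1, S3) darker than the BTE region (S2)...
    eyes_darker_than_bte = s[1] < s[2] and s[3] < s[2]
    # ...and darker than the cheekbones (S4, S6).
    eyes_darker_than_cheeks = s[1] < s[4] and s[3] < s[6]
    return eyes_darker_than_bte and eyes_darker_than_cheeks
```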
We use a threshold proportional to the size of the currently used SSR filter to eliminate clusters that are too small. The center of each sufficiently large cluster is computed with the following equations:
x_c = (1/n) Σ x_i, y_c = (1/n) Σ y_i
where i ranges over the pixels of the cluster and n is the cluster's area in pixels.
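A sketch of this filtering step, assuming the dark-pixel clusters are labeled with SciPy's connected-component labeling (our choice of tool, not necessarily the authors'):

```python
import numpy as np
from scipy import ndimage

def cluster_centers(dark_mask, min_area):
    """Return the centroid (x_c, y_c) of every cluster whose area is
    at least min_area; smaller clusters are discarded."""
    labels, count = ndimage.label(dark_mask)
    centers = []
    for lbl in range(1, count + 1):
        ys, xs = np.nonzero(labels == lbl)
        n = xs.size                              # cluster area in pixels
        if n >= min_area:
            centers.append((xs.mean(), ys.mean()))  # (1/n) * sum
    return centers
```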
Find Nose Tip: |
Now that we have located the eyes, the final step is to find the nose tip.
From figure 3 we can see that the blue lines define a square whose corners are the pupils and the corners of the mouth; the nose tip should fall inside this square, so this square becomes our region of interest (ROI) for finding the nose tip (see fig. 4). The first step is therefore to extract the ROI; in case the face is rotated, we rotate the ROI back to a horizontal alignment of the eyes.
Using the previous idea, we tried to locate the nose tip with intensity profiles. In the horizontal intensity profile we add to each line, vertically, the values of the lines that precede it in the ROI (see fig. 5). Since the nose bridge is brighter than the surrounding features, the values accumulate faster at the bridge location; in other words, the maximum value of the horizontal profile gives us the 'x' coordinate of the nose tip.
IV. PSEUDO CODE |
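As a rough outline of the whole pipeline, consider the sketch below. It is an illustration under stated assumptions, not the authors' implementation: OpenCV's bundled Haar cascade stands in for the SSR-based detector described above, the face-box center stands in for the tracked nose tip, blink detection is omitted, and the third-party pyautogui library moves the real cursor.

```python
import cv2
import pyautogui

# Haar cascade as a stand-in for the SSR-based detector in the paper.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
screen_w, screen_h = pyautogui.size()

cap = cv2.VideoCapture(0)                     # off-the-shelf webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.3, 5)
    if len(faces) > 0:
        x, y, w, h = faces[0]
        # Stand-in for the tracked nose tip: the face-box center.
        nose_x, nose_y = x + w // 2, y + h // 2
        # Map video-frame coordinates to screen coordinates.
        pyautogui.moveTo(nose_x * screen_w // frame.shape[1],
                         nose_y * screen_h // frame.shape[0])
    cv2.imshow("face mouse", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):     # press 'q' to quit
        break
cap.release()
cv2.destroyAllWindows()
```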
V. SIMULATION RESULTS |
To digitize the (a, b) space we round values to multiples of 0.01, so we get 101 values for 'a' and 88 values for 'b'. Finally, to derive the skin color model we extracted 735 skin pixel samples from each of 771 face images taken from [6, 7], so the total number of skin pixel samples was 566,685. For each sample we calculated the 'a' and 'b' values and incremented the counter of that value ('a' has 101 counters, 'b' has 88 counters). The most frequent values are considered skin pixel values, so after plotting the 'a' and 'b' results (see figs. 5 & 6) we deduced thresholds on 'a' and 'b' for considering a pixel a skin pixel.
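A sketch of this counting procedure for one channel (the 0.01 quantization and counter sizes follow the description above; the variable names are ours):

```python
import numpy as np

def skin_value_histogram(samples, n_bins):
    """Count how often each quantized value occurs among the samples.

    samples: 1-D array of 'a' (or 'b') values of skin pixels.
    n_bins:  101 for 'a', 88 for 'b', per the description above.
    """
    bins = np.round(samples / 0.01).astype(int)   # quantize to 0.01 steps
    counters = np.bincount(bins, minlength=n_bins)
    return counters  # peaks indicate typical skin values
```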
For the nose tip, the vertical profile is built the same way: we add to each column, horizontally, the values of the columns that precede it in the ROI (see fig. 7); as in the horizontal profile, the values accumulate faster at the nose tip position, so the maximum value gives us the 'y' coordinate of the nose tip. From the horizontal and vertical profiles together we were able to locate the nose tip position (see fig. 8). Unfortunately, this method did not give accurate results, because there may be several maxima in a profile that are close to each other, and choosing the correct maximum, the one that really points at the nose tip, is a difficult task.
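A sketch of both accumulated profiles over a grayscale ROI; as noted above, the bare argmax is fragile when several near-equal maxima occur:

```python
import numpy as np

def nose_tip_from_profiles(roi_gray):
    """Estimate the nose tip (x, y) in a grayscale ROI from the
    accumulated intensity profiles (the naive argmax version)."""
    roi = roi_gray.astype(np.float64)
    # Accumulate rows downward; the bottom row of the cumulative sum
    # is the per-column total (the "horizontal profile" in the text).
    col_profile = roi.cumsum(axis=0)[-1]      # == roi.sum(axis=0)
    # Accumulate columns rightward; the last column is the per-row
    # total (the "vertical profile").
    row_profile = roi.cumsum(axis=1)[:, -1]   # == roi.sum(axis=1)
    x = int(np.argmax(col_profile))           # bright nose-bridge column
    y = int(np.argmax(row_profile))
    return x, y
```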
Each nose bridge point (NBP) contributes, through its S2 sector, a certain amount to the accumulated sum in the horizontal profile, but the NBP at the nostrils' location contributes a smaller amount (the S2 sector of the NBP at the nostrils is darker than the S2 sectors of other NBPs). So if we calculate the first derivative of the values of the NBPs' S2 sectors (the first derivative of the maximum value of the horizontal profile at each ROI line), we notice a local minimum at the nostrils' location (see fig. 9); we take the NBP corresponding to this local minimum as the nostrils' location, and the next step is to look for the nose tip above the nostrils. Since the nose tip is brighter than other features, it contributes more to the accumulated sum through its S2 sector than other NBPs do, which produces a local maximum in the first derivative (see fig. 9); so the nose tip's location is that of the NBP corresponding to the local maximum that lies above the local minimum in the first derivative.
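A sketch of this refinement, assuming s2_values[i] holds the S2-sector sum of the NBP at ROI line i, top to bottom (the array name is ours); the derivative is taken with np.diff:

```python
import numpy as np

def locate_nose_tip_line(s2_values):
    """Find the nostrils line (local minimum of the first derivative
    of the per-line S2 sums) and return the nose-tip line above it
    (local maximum of the derivative)."""
    d = np.diff(np.asarray(s2_values, dtype=np.float64))
    nostrils = int(np.argmin(d))           # strongest darkening step
    if nostrils == 0:
        return 0                           # degenerate ROI: give up
    tip = int(np.argmax(d[:nostrils]))     # brightest step above nostrils
    return tip
```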
In tracking mode the results were very robust when the frame rate was 20 fps or higher: the user can move very quickly without the program losing his facial features. Glasses, however, reflect light and cause bright spots that sometimes force our program to lose track of the eyes. For detection and tracking to be accurate and robust, the lighting must be frontal, so that it spreads evenly over the face; side light causes false face detections and eventually disrupts the tracking process.
VI. CONCLUSION AND FUTURE WORK |
The results are very promising; the following conclusions were drawn from the experiments that were held during the expo. In detection mode the eyes and nose tip were located accurately when the following conditions were fulfilled:
1. The face is not rotated more than 5° around the axis that passes through the nose tip (as long as the eyes fall in sectors S1 and S3 of the SSR filter).
2. The face is not rotated more than 30° around the axis that passes through the neck (toward a profile view).
3. Wearing glasses does not affect our detection process. |
4. As for scale, it is best to sit about 35 cm from the camera; when the face is farther from the screen, the program may report a false positive, especially when the background is cluttered.
Future work may include improving tracking robustness under difficult lighting conditions, perhaps by using more sophisticated and expensive capturing devices, such as infrared cameras, which can operate in the absence of light and give more accurate tracking results; adding double-left-click (detecting a double left-eye blink) and drag-mode (enabled/disabled with a double right-eye blink) functionality; and adding voice commands to launch the program, start the detection process, and enable/disable controlling the mouse with the face.
References |
|