
ISSN ONLINE(2319-8753)PRINT(2347-6710)

A Comparative Study of Object Tracking Techniques

Meha J. Patel1, Bhumika Bhatt2
  1. P.G Student, Dept of Computer Engineering, Sarvjanik College of Engineering and Technology, Surat, Gujarat, India
  2. Professor, Dept of Computer Engineering, Sarvjanik College of Engineering and Technology, Surat, Gujarat, India

Visit for more related articles at International Journal of Innovative Research in Science, Engineering and Technology


Object tracking is an essential task in many computer vision applications, such as surveillance, vehicle navigation, and autonomous robot navigation. It involves detecting moving objects of interest and tracking them from frame to frame; the main task is to find and follow one or more moving objects in an image sequence. Video analysis normally has three stages: object detection, object tracking, and object recognition. This paper presents a brief survey of video object tracking techniques, namely point tracking, kernel tracking, and silhouette tracking algorithms, together with a comparative study of these techniques.


Object tracking, point tracking, kernel tracking, silhouette tracking.


Object tracking is an important and challenging task in the field of computer vision. The rise of high-powered computers, the availability of high-quality and inexpensive video cameras, and the growing need for automated video analysis have generated a great deal of interest in object tracking algorithms.
Object tracking in video can be defined as the problem of estimating the trajectory of an object in the image plane as it moves around a scene [1]. It is a process of segmenting an object of interest from a video scene and keeping track of its motion, orientation, occlusion, etc. Moreover, depending on the tracking domain, a tracker can also provide object-centric information, such as the orientation, region, or silhouette of an object. Video object detection and tracking have a wide variety of applications, such as video compression, video surveillance, vision-based control, human-computer interfaces, medical imaging, augmented reality, and robotics.
Choosing a tracking approach involves decisions about the object representation, the features used, the object detection technique, and the tracking algorithm itself. Common object representations are points, primitive geometric shapes, object silhouettes and contours, articulated shape models, and skeletal models. Common visual features are color, texture, optical flow, and edges. Common object detection methods are point detectors, background subtraction, segmentation, and supervised learning.


The aim of object tracking is to generate the trajectory of an object over time by locating its position in each frame of a video sequence. Detecting the object and establishing correspondence between object instances across frames can be accomplished either separately or jointly. In the first case, the regions of interest in each frame are obtained by means of an object detection algorithm, and the tracker then establishes correspondence between objects across frames. In the second case, the object region and the correspondence are estimated jointly by iteratively updating the object location obtained from previous frames. Object tracking methods can be classified as point tracking, kernel-based tracking, and silhouette-based tracking [1].

A. Point tracking

Tracking can be formulated as the correspondence of detected objects, represented by points, across frames. Point correspondence is a complicated problem, especially in the presence of occlusions, misdetections, entries, and exits of objects. Overall, point correspondence methods can be divided into two broad categories: deterministic and statistical methods. Deterministic methods use qualitative motion heuristics to constrain the correspondence problem. Statistical (probabilistic) methods, on the other hand, explicitly take the measurement and model uncertainties into account to establish correspondence.

1) Deterministic Method

Deterministic methods for point correspondence associate each object in frame t − 1 with a single object in frame t. They do so by defining a cost for each possible association using a set of motion constraints. Minimization of the total correspondence cost is formulated as a combinatorial optimization problem.
Such approaches have been developed for different applications and purposes. Veenman et al. proposed a tracking algorithm and applied it to a rotating dish sequence, in which colour segmentation was used to detect black dots on a white dish [4]. Shafique and Shah proposed another tracking algorithm and applied it to sequences in which birds are detected using background subtraction [5].
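As an illustration of the cost-minimization formulation described above, the following minimal sketch matches points between two frames. It is not the algorithm of [4] or [5]: it assumes a single motion constraint (proximity, meaning points move little between frames), an equal number of points in both frames, and an exhaustive search that is only practical for a handful of points. The function name `match_points` and the sample coordinates are hypothetical.

```python
from itertools import permutations
from math import hypot


def match_points(prev_pts, curr_pts):
    """Return the one-to-one assignment with the lowest total displacement.

    prev_pts and curr_pts are equal-length lists of (x, y) tuples. In the
    returned list `assignment`, prev_pts[i] corresponds to
    curr_pts[assignment[i]].
    """
    if len(prev_pts) != len(curr_pts):
        raise ValueError("this sketch assumes no entries, exits, or misdetections")

    def cost(i, j):
        # Proximity constraint (an assumption of this sketch): the cost of
        # associating two points is the distance between them.
        (x0, y0), (x1, y1) = prev_pts[i], curr_pts[j]
        return hypot(x1 - x0, y1 - y0)

    n = len(prev_pts)
    # Exhaustive search over all n! assignments; only practical for small n.
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(cost(i, perm[i]) for i in range(n)),
    )
    return list(best)


prev_frame = [(10, 10), (50, 40), (90, 15)]   # points detected in frame t - 1
curr_frame = [(52, 43), (88, 18), (12, 11)]   # points detected in frame t
assignment = match_points(prev_frame, curr_frame)
for i, j in enumerate(assignment):
    print(f"{prev_frame[i]} -> {curr_frame[j]}")
```

For larger point sets, the same cost matrix would normally be passed to an optimal assignment method such as the Hungarian algorithm, or to a greedy search, rather than enumerated exhaustively.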

2) Statistical Method

Statistical (probabilistic) methods estimate the state of an object from measurements of its position in each frame, which are obtained by a detection mechanism. They can be applied to both single-object and multi-object tracking. Measurements obtained from video sensors invariably contain noise. Moreover, object motion can undergo random perturbations, for instance in maneuvering vehicles. Statistical correspondence methods address these problems by taking the measurement and model uncertainties into account during object state estimation. They use the state space approach to model object properties such as position, velocity, and acceleration.
Statistical methods thus extend deterministic methods by using different kinds of filters. These filters are also used within other approaches, such as deterministic, kernel tracking, and contour tracking methods. Commonly used filters are the Kalman filter, the particle filter, the joint probabilistic data association filter (JPDAF), and multiple hypothesis tracking (MHT).
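To make the state space idea concrete, the sketch below tracks a single object along one image axis with a Kalman filter. The constant-velocity motion model, the noise parameters `q` and `r`, the initial covariance, and the function name `kalman_track` are all assumptions chosen for illustration; they are not taken from the surveyed papers. The 2×2 matrix algebra is written out by hand so that the example needs no external library.

```python
def kalman_track(measurements, dt=1.0, q=1e-3, r=1.0):
    """Track a 1-D position with a constant-velocity Kalman filter.

    measurements: noisy positions, one per frame.
    q: process noise variance added to each state component (assumed).
    r: measurement noise variance (assumed).
    Returns a list of (position, velocity) estimates, one per measurement.
    """
    p, v = 0.0, 0.0                           # state estimate
    P00, P01, P10, P11 = 1e3, 0.0, 0.0, 1e3   # covariance (large = uncertain)
    estimates = []
    for z in measurements:
        # Predict: x' = F x and P' = F P F^T + Q, with F = [[1, dt], [0, 1]].
        p = p + dt * v
        n00 = P00 + dt * (P10 + P01) + dt * dt * P11 + q
        n01 = P01 + dt * P11
        n10 = P10 + dt * P11
        n11 = P11 + q
        # Update: only the position is measured, so H = [1, 0].
        innovation = z - p
        s = n00 + r                    # innovation variance
        k0, k1 = n00 / s, n10 / s      # Kalman gain
        p += k0 * innovation
        v += k1 * innovation
        P00 = (1 - k0) * n00
        P01 = (1 - k0) * n01
        P10 = n10 - k1 * n00
        P11 = n11 - k1 * n01
        estimates.append((p, v))
    return estimates


# An object moving about 2 pixels per frame, observed with small errors.
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.3, -0.3, 0.1]
observed = [2.0 * (k + 1) + noise[k] for k in range(len(noise))]
position, velocity = kalman_track(observed)[-1]
print(f"position ~ {position:.2f}, velocity ~ {velocity:.2f}")
```

The filter recovers a velocity estimate even though only positions are measured, which is what allows it to predict where the object should appear in the next frame.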

B. Kernel tracking

Kernel tracking is typically performed by computing the motion of the object, which is represented by a primitive object region, from one frame to the next. Kernel tracking algorithms differ in the appearance representation used, the number of objects tracked, and the method used to estimate the object motion.

1) Template based models

Template-based methods search the current image for the object template defined in the previous frame. Templates and density-based appearance models have been widely used because of their relative simplicity and low computational cost. Trackers in this category fall into two subcategories, depending on whether objects are tracked individually or jointly: single-object tracking and multiple-object tracking [1].
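The search described above can be sketched as a brute-force scan. This minimal example assumes grayscale images stored as nested lists and uses the sum of squared differences (SSD) as the similarity measure; the paper does not prescribe a particular measure, and the function names are hypothetical.

```python
def ssd(frame, template, top, left):
    """Sum of squared differences between the template and one frame window."""
    return sum(
        (frame[top + r][left + c] - template[r][c]) ** 2
        for r in range(len(template))
        for c in range(len(template[0]))
    )


def match_template(frame, template):
    """Return (top, left) of the window in `frame` that best matches `template`."""
    rows = len(frame) - len(template) + 1
    cols = len(frame[0]) - len(template[0]) + 1
    candidates = ((top, left) for top in range(rows) for left in range(cols))
    return min(candidates, key=lambda pos: ssd(frame, template, *pos))


# Template taken from the previous frame (intensity values are made up).
template = [[9, 8],
            [7, 9]]
# Current frame, in which the object has moved to row 2, column 3.
frame = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 9, 8],
    [0, 0, 0, 7, 9],
]
print(match_template(frame, template))  # prints (2, 3)
```

In practice the search is usually restricted to a neighbourhood of the previous object position, which reduces the cost of the scan.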

2) Multi-view appearance models

Multi-view appearance models address objects whose appearance differs from view to view across frames, which other methods find difficult to track. The appearance models in the previous tracking methods (for example histograms and templates) are usually generated online, so they represent only the information gathered from the most recent observations of the object. Objects may appear different from different views, and if the object view changes dramatically during tracking, the appearance model may no longer be valid and the object track might be lost. To overcome this difficulty, different views of the object can be learned offline and used for tracking.

C. Silhouette tracking

Silhouette tracking is used when the complete region of an object is required. Objects with complex shapes, for example hands, heads, and the human body, can be described accurately by silhouette-based methods. The objective of these methods is to find the object region in each frame by means of an object model generated from the previous frames. This model can take the form of the object edges, the object contour, or a color histogram. Silhouette tracking falls into two main categories: shape matching and contour tracking [1].

1) Shape Matching

Shape matching searches the current frame for the object silhouette and its associated model, in a manner similar to tracking based on template matching. The search is performed by computing the similarity between candidate regions in the current frame and the model generated from the hypothesized object silhouette in the previous frame.
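One possible similarity measure for this kind of search is the Hausdorff distance between edge point sets. The paper does not name a specific measure, so this choice is an assumption made for illustration. The sketch below scans a small window of translations and keeps the one that minimizes the directed distance from the shifted model to the edges of the current frame; the function names and sample points are hypothetical.

```python
from math import hypot


def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(hypot(ax - bx, ay - by) for bx, by in b) for ax, ay in a)


def match_shape(model, frame_edges, search=3):
    """Return the (dx, dy) shift that best aligns `model` with `frame_edges`.

    Only the model-to-frame direction is scored, so unrelated edges in the
    frame (clutter) do not penalize a good alignment.
    """
    shifts = [
        (dx, dy)
        for dx in range(-search, search + 1)
        for dy in range(-search, search + 1)
    ]

    def score(shift):
        dx, dy = shift
        moved = [(x + dx, y + dy) for x, y in model]
        return directed_hausdorff(moved, frame_edges)

    return min(shifts, key=score)


# Silhouette model from the previous frame: corners of a rectangular outline.
model = [(0, 0), (4, 0), (4, 3), (0, 3)]
# Edge points in the current frame: the same outline shifted by (2, 1),
# plus one unrelated clutter point.
frame_edges = [(2, 1), (6, 1), (6, 4), (2, 4), (10, 10)]
print(match_shape(model, frame_edges))  # prints (2, 1)
```

This sketch handles translation only; silhouettes that deform between frames would require the model to be updated after each match.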

2) Contour Tracking

In contrast to shape matching, contour tracking iteratively evolves an initial contour from its position in the previous frame to its new position in the current frame. This contour evolution requires that some part of the object in the current frame overlap with the object region in the previous frame. Tracking by evolving a contour can be performed using two different approaches. The first uses state space models to model the contour shape and its motion. The second evolves the contour directly by minimizing the contour energy using direct minimization techniques such as gradient descent.


A comparison of object tracking techniques is presented in the table below. The table shows that different tracking techniques have been applied to object tracking in different challenging situations.


In this paper, a broad literature survey of video object tracking techniques has been presented. Video analysis is based mainly on object detection, object tracking, and object recognition. The comparative analysis covers the basic object tracking algorithms: point tracking, kernel tracking, and silhouette tracking. Point tracking involves detection in every frame, whereas kernel-based or contour-based tracking requires detection only when the object first appears in the scene. Point trackers are suitable for tracking very small objects that can be represented by a single point. In the kernel tracking approach, different estimation methods are used to find the region that corresponds to the target object. Color histograms are efficient, but they lose spatial information; if additional features such as texture are included, kernel tracking gives better results. Silhouette tracking methods differ in the type of representation used, which can be motion models or appearance models. This paper provides a comparative analysis of tracking algorithms for researchers in the area of object tracking.


1. Yilmaz, A., O. Javed, and M. Shah. "Object tracking: A survey." ACM Computing Surveys, 2006.

2. Comaniciu, Dorin, Visvanathan Ramesh, and Peter Meer."Kernel-based object tracking." Pattern Analysis and Machine Intelligence, IEEE Transactions on 25.5, pp. 564-577, 2003.

3. Comaniciu, Dorin, and Peter Meer. "Mean shift: A robust approach toward feature space analysis." Pattern Analysis and Machine Intelligence, IEEE Transactions on 24.5, pp. 603-619, 2002.

4. Veenman, Cor J., Marcel JT Reinders, and Eric Backer. "Resolving motion correspondence for densely moving points." Pattern Analysis and Machine Intelligence, IEEE Transactions on 23.1, pp. 54-72, 2001.

5. Shafique, K., and M. Shah. "A non-iterative greedy algorithm for multi-frame point correspondence." In IEEE International Conference on Computer Vision (ICCV), 2003.

6. Black, Michael J., and Allan D. Jepson. "EigenTracking: Robust matching and tracking of articulated objects using a view-based representation." International Journal of Computer Vision 26.1, pp. 63-84, 1998.

7. Avidan, Shai. "Support vector tracking." Pattern Analysis and Machine Intelligence, IEEE Transactions on 26.8, pp. 1064-1072, 2004.

8. Yilmaz, A., X. Li, and M. Shah. "Contour-based object tracking with occlusion handling in video acquired using mobile cameras." Pattern Analysis and Machine Intelligence, IEEE Transactions on, 2004.