
ISSN ONLINE(2320-9801) PRINT (2320-9798)

Seam Carving for Content Aware Video Compression

G. Suganya1, S. Lavanya1, G. Sheeba Farin1, G. Karthick2
  1. Student, Department of Computer Science and Engineering, K.S.R College of Engineering, Tiruchengode, India
  2. Assistant Professor, Department of Computer Science and Engineering, K.S.R College of Engineering, Tiruchengode,

Visit for more related articles at International Journal of Innovative Research in Computer and Communication Engineering


Abstract: Seam carving is the most widely used of the content-aware resizing methods that have emerged in recent years. This paper proposes a video retargeting method that reduces both the aspect ratio and the size of a video while maintaining spatial and temporal coherence. The proposed algorithm is based on matching-area-based temporal energy adjustment. Each frame is seam carved and the pixels on the optimal seam are removed. Temporal energy adjustment is used to track the object that was carved in the previous frame, which greatly reduces the chance that seams are carved through different objects in two consecutive frames. The result is a sequence of spatially and temporally continuous resized frames. The seam-carved frames are then enhanced by global and local processing of the luminance and chrominance components of the image using the virtual histogram distribution method. The output video is reduced in both size and memory footprint and has better visual quality.


Keywords: Seam carving, matching area, key point, image enhancement, virtual histogram


Image processing is one of the most actively researched areas today, owing to the growth of the Internet and of display devices and the accompanying increase in the number of images and videos. Video compression is a related area of concern. Compression reduces the amount of data required to represent a digital image by removing redundant data, so that files become smaller while the necessary information is retained. Most existing applications require videos to be compressed efficiently because of limitations in storage space and other resources. Larger videos also require more bandwidth to transmit, which raises the cost of communication. Data compression has therefore become increasingly important for saving storage space and bandwidth [1]. Video compression has been studied extensively in past years. Among the existing compression techniques, the most commonly used are MPEG-4 and H.264, which use inter-frame prediction to reduce video data across a series of frames. This involves techniques such as difference coding, where a frame is compared with a reference frame and only the pixels that have changed with respect to the reference frame are coded. In this way, the number of pixel values that are coded and sent is reduced. When such an encoded sequence is displayed, the images appear as in the original video sequence.
However, with the increasing variety of video devices, playback must be supported at varying resolutions, sizes and aspect ratios. Image resizing forms the basis for video resizing. Conventional image shrinking treats all pixels equally, using up- and down-sampling to achieve the desired size. This can lead to distortion and also means that every object in the image is scaled down proportionally. As the sizes of portable devices such as laptops, PDAs, Internet video players and mobile phones continue to diversify, the existing coding schemes cannot be applied directly, and an additional video retargeting process (e.g., scaling, cropping or warping) is needed. Scaling a video to fit the display device distorts the salient objects, because scaling can only be applied uniformly across the whole image and disregards its content [2]. Cropping is unacceptable if the edges contain essential content. Effective video resizing should therefore consider not only the geometry but also the content of the video. Seam carving (SC) performs content-aware video resizing [3]. However, on resource-limited mobile devices it is not always possible or economical to perform sophisticated content-aware video resizing. Content-based spatially scalable video compression for arbitrary resolutions is therefore becoming one of the emerging challenges for universal access, i.e., access to any information over any network from anywhere through any type of display device.
Video retargeting approaches can be classified into two main types, based on scaling and on cropping [4]. In scaling-based approaches, the errors from important pixels are distributed to less important ones through non-uniform warping. Warping methods divide the image into quads with a grid and then compute a new geometry for the grid that fits the target aspect ratio of the image [5]. During resizing, the quads with salient content remain untouched while the other quads absorb more distortion. Since human vision is less sensitive to the distortion of homogeneous content, such as background grass or sky, image warping concentrates the distortion in homogeneous regions. Image warping methods are made content aware by incorporating content-aware constraints into a scale-and-stretch optimization.
Because these optimization constraints are oblivious to motion, the retargeted videos often show incoherent or discontinuous artifacts, especially during sudden camera pans or object motion. Since the motion of the camera and of objects is crucial to the quality of retargeted videos, recent work has increasingly concentrated on motion-aware constraints for video retargeting. Among the cropping-based techniques is image seam carving.


Seam carving keeps the important features intact and only inserts or removes seams that contain little information. Seam carving can be applied to image resizing, image object removal and partial enhancement, but only the resizing application is considered here. Compared with content-aware cropping and scaling, seam carving removes pixels more precisely. There are two types of seam carving, vertical and horizontal. The fundamental idea is to remove one pixel from every row (vertical seam) or every column (horizontal seam) of the image at each step without causing notable visual distortion. Vertical seam carving reduces the width of the image and horizontal seam carving reduces its height.
For an image, the "energy" of each pixel is computed. It measures how much the pixel stands out from its surroundings and therefore indicates its importance. The energy value is computed as

$$E = \sqrt{x_{energy}^2 + y_{energy}^2} \qquad (1)$$

where $x_{energy}$ and $y_{energy}$ are the gradients of the image in the x and y directions. A dynamic programming (DP) algorithm is then used to find the seam with the least total energy. To change the size of an image from n×m to n×m', where m−m'=c, c vertical seams are successively removed from the original image.
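The steps described here can be illustrated with a minimal NumPy sketch. It assumes a 2-D grayscale frame, uses `np.gradient` as one possible gradient estimate for Eq. (1), and restricts seams to 8-connected paths, as is usual in seam carving. The function names are illustrative only and are not taken from the paper.

```python
import numpy as np

def energy_map(gray):
    """Eq. (1): gradient magnitude of a 2-D grayscale image."""
    gy, gx = np.gradient(gray.astype(np.float64))  # derivatives along rows, then columns
    return np.sqrt(gx ** 2 + gy ** 2)

def find_vertical_seam(energy):
    """Return seam[r] = column of the minimum-energy vertical seam on row r."""
    h, w = energy.shape
    cost = energy.astype(np.float64).copy()
    # Forward pass: cheapest cumulative cost of reaching each pixel from the top row.
    for r in range(1, h):
        prev = cost[r - 1]
        left = np.concatenate(([np.inf], prev[:-1]))
        right = np.concatenate((prev[1:], [np.inf]))
        cost[r] += np.minimum(np.minimum(left, prev), right)
    # Backward pass: trace the cheapest path from the bottom row upwards.
    seam = np.empty(h, dtype=np.int64)
    seam[-1] = int(np.argmin(cost[-1]))
    for r in range(h - 2, -1, -1):
        c = seam[r + 1]
        lo, hi = max(c - 1, 0), min(c + 2, w)
        seam[r] = lo + int(np.argmin(cost[r, lo:hi]))
    return seam

def remove_vertical_seam(img, seam):
    """Delete one pixel per row; works for (H, W) and (H, W, C) arrays."""
    h, w = img.shape[:2]
    keep = np.ones((h, w), dtype=bool)
    keep[np.arange(h), seam] = False
    return img[keep].reshape((h, w - 1) + img.shape[2:])

def shrink_width(gray, c):
    """Reduce the width by c columns by removing c vertical seams in succession."""
    out = gray
    for _ in range(c):
        out = remove_vertical_seam(out, find_vertical_seam(energy_map(out)))
    return out
```

The energy map is recomputed after every removal because deleting a seam changes the neighbourhood of the pixels next to it.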


In video retargeting, carving non-salient regions may cause some distortion or discontinuity in the video content, which leads to both temporal and spatial artifacts [7]. These should be reduced by considering the salient content of the target video and enforcing spatial and temporal coherence simultaneously. In some situations, spatial and temporal coherence contradict each other. For example, when a salient object moves into a region where a seam was carved in the previous frame, spatial coherence requires the seam to avoid the object, but temporal coherence requires the seam to stay in its location. The balance between the two is therefore crucial in seam-carving-based video retargeting. In this paper, we propose a video retargeting algorithm that builds on existing seam-carving approaches and enhances them with several novel ideas. Our algorithm is based on per-frame seam carving, because we observe that strict geometric continuity between the seams in two consecutive frames may not be necessary to obtain temporal coherence. Instead, we propose the concept of a matching area and modify the energy map (EM) [8]. The basic idea is to divide the pixels in the current frame into reward and punish regions according to their similarity to the pixels on the previous seam. The EM is modified based on the reward and punish regions, and the regular image seam-carving algorithm is then used to find the seam in the current frame. This provides greater flexibility than continuous seam carving.
The proposed method uses a universal energy adjustment scheme to preserve both spatial and temporal coherence. This paper concentrates on carving a single vertical seam per frame; the same procedure can be applied to multiple or horizontal seams.

A. Energy Value Estimation

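The Sobel operator is used to compute the gradient of an image and thereby determine the importance of each pixel. Technically, it is a discrete differentiation operator that computes an approximation of the gradient of the image intensity function. It convolves the image with the two standard 3×3 Sobel kernels:

$$G_x = \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix} * A, \qquad G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ +1 & +2 & +1 \end{bmatrix} * A$$

where * denotes the 2-dimensional convolution operation and A is the source image. $G_x$ and $G_y$ are two images which at each point contain the horizontal and vertical derivative approximations. The gradient magnitude of the image is computed using

$$G = \sqrt{G_x^2 + G_y^2}$$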
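A minimal NumPy sketch of this energy estimate is given below. It assumes the standard 3×3 Sobel kernels and edge-replicated borders; the border handling is not specified in the text, and the helper names are illustrative.

```python
import numpy as np

# Standard 3x3 Sobel kernels for the horizontal and vertical derivatives.
KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
KY = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=np.float64)

def filter3x3(img, kernel):
    """Slide a 3x3 kernel over a 2-D image with edge-replicated borders.

    This is cross-correlation; true convolution flips the kernel, which only
    changes the sign of a Sobel response and leaves the magnitude unchanged.
    """
    h, w = img.shape
    padded = np.pad(img.astype(np.float64), 1, mode="edge")
    out = np.zeros((h, w))
    for dr in range(3):
        for dc in range(3):
            out += kernel[dr, dc] * padded[dr:dr + h, dc:dc + w]
    return out

def sobel_energy(img):
    """Gradient magnitude sqrt(Gx^2 + Gy^2) at every pixel."""
    gx = filter3x3(img, KX)
    gy = filter3x3(img, KY)
    return np.sqrt(gx ** 2 + gy ** 2)
```

A flat region yields zero energy, while a vertical step edge yields a high response in the two columns on either side of the step.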
B. Matching Area Computation
A video is a sequence of frames, so every video is divided into frames before processing. The image seam-carving method is applied directly to the first frame, and for the remaining frames the matching area is computed. The positions of the seam in the ith frame are denoted by the set

$$S_i = \{(k, p_k)\}_{k=1}^{H}$$

where $p_k$ denotes the y-coordinate of the seam point on the kth row and H denotes the height of the image, so that $(k, p_k)$ is the coordinate of the kth point of the seam. For each seam, the points with the maximum energy value are selected as key points (KP).
The seam is divided vertically into $N_{KP}$ equal parts by the intervals $R_1$ to $R_{N_{KP}}$ on the x-dimension:

$$R_j = \{\, x \in \mathbb{N} \mid (j-1)\,H/N_{KP} < x \le j\,H/N_{KP} \,\}, \quad j = 1, \dots, N_{KP}$$

where $N_{KP}$ represents the number of key points, which is set to 10 here, and $\mathbb{N}$ is the integer domain. The key point of each interval is the seam point with the maximum energy in that interval:

$$KP_j = \arg\max_{(x,y) \in S_i,\; x \in R_j} EM(x, y)$$

where EM(x, y) denotes the value of the EM of the source image at the coordinate (x, y). The pixels at the key points are used as the reference pixels in the following frame. For each frame, the search areas (SA) are the square areas centered on the KPs of the previous seam, and the search range gives the radius of the search area. The search range is set to $H/N_{KP}$ in this paper so that the search areas together cover every row of the frame.
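Key point selection can be sketched as follows. The sketch assumes the seam is stored as one column index per row and that the rows are split into nearly equal contiguous intervals with `np.array_split`. The function names and the rounding of the search range are assumptions made for illustration.

```python
import math
import numpy as np

def select_key_points(seam, energy, n_kp=10):
    """Pick the highest-energy seam point in each of n_kp row intervals.

    seam[k] is the column of the seam on row k; the result is a list of
    (row, col) pairs, one per non-empty interval.
    """
    rows = np.arange(len(seam))
    seam_energy = energy[rows, seam]           # energy along the seam
    key_points = []
    for interval in np.array_split(rows, n_kp):
        if interval.size == 0:                 # frame has fewer rows than n_kp
            continue
        k = interval[np.argmax(seam_energy[interval])]
        key_points.append((int(k), int(seam[k])))
    return key_points

def search_range(height, n_kp=10):
    """Radius H / N_KP of the square search area (rounded up here, an assumption)."""
    return math.ceil(height / n_kp)
```

Because a key point can lie anywhere inside its interval, a radius of one interval length is enough for its search area to span the whole interval, which is why the search areas together cover every row.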
The matching area of the ith frame, $MA_i(P)$, is the square area centered at $P(x_P, y_P)$; its size is given by the match window (MW), which is set to 3 in this paper.
For every pixel P in the SA of a key point KP, the match index between P in the current frame (i+1) and KP in the previous frame (i), denoted $MI^*(P, KP)$, is calculated from the SAD between their matching areas, where SAD is the sum of absolute differences between the two square areas. The match index (MI) indicates the degree of match between a pixel and a key point on the previous seam: the matching area of the key point in the previous frame is compared with the matching area of every pixel in the search area in the current frame, and the match index of each pixel is determined by the SAD value between the two matching areas. If a pixel belongs to several search areas, the minimum value is taken as the pixel's MI:

$$MI(P) = \min_{KP:\; P \in SA(KP)} MI^*(P, KP)$$
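The match index computation can be sketched as below under several stated assumptions: MW is treated as the half-width of the matching area, the SAD is normalized by 255 times the window area so that MI lies in [0, 1] for 8-bit frames, borders are edge-replicated, and pixels outside every search area are marked with `np.inf`. None of these details are fixed by the text, and the function name is illustrative.

```python
import numpy as np

def match_index_map(prev_frame, cur_frame, key_points, search_range, mw=3):
    """Minimum normalized SAD between each pixel's window and the key-point windows.

    key_points are (row, col) positions on the previous seam. Pixels outside
    every search area keep the value np.inf (a convention of this sketch).
    """
    h, w = cur_frame.shape
    side = 2 * mw + 1                              # assumed window side length
    prev_p = np.pad(prev_frame.astype(np.float64), mw, mode="edge")
    cur_p = np.pad(cur_frame.astype(np.float64), mw, mode="edge")
    norm = 255.0 * side * side                     # assumed normalization
    mi = np.full((h, w), np.inf)
    for r0, c0 in key_points:
        ref = prev_p[r0:r0 + side, c0:c0 + side]   # matching area of the key point
        for r in range(max(r0 - search_range, 0), min(r0 + search_range + 1, h)):
            for c in range(max(c0 - search_range, 0), min(c0 + search_range + 1, w)):
                sad = np.abs(cur_p[r:r + side, c:c + side] - ref).sum()
                # Keep the best (smallest) index when search areas overlap.
                mi[r, c] = min(mi[r, c], sad / norm)
    return mi
```

When the content under a key point shifts by one pixel between frames, the shifted position receives a match index of zero, so the previous seam location can be followed as the object moves.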
C. Energy Adjustment
After the matching areas are computed, the pixels are divided into rewarded pixels (R-region) and punished pixels (P-region) by a match threshold (MTH) on the match index: pixels that match a key point well, i.e. whose MI is below MTH, are rewarded, and the remaining pixels in the search areas are punished. MTH is empirically set to 0.2 to balance spatial and temporal coherence, because a large MTH enhances temporal coherence and weakens spatial coherence. The resulting reward/punish map holds one value $RPMap(P_{x,y})$ for each pixel $P_{x,y}$; the value is set to 1 for pixels that do not belong to any SA.
This step plays a vital role because it is where the energy values are adjusted. The adjusted value is obtained by multiplying the original energy value by the RPMap value:

$$EM'(P_{x,y}) = EM(P_{x,y}) \times RPMap(P_{x,y})$$

where $EM'(P_{x,y})$ represents the adjusted energy value and $EM(P_{x,y})$ represents the original energy value. The adjusted energy map EM' is used for further seam carving. This process is repeated until the last frame is reached.
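The energy adjustment can be sketched as follows. The multipliers `reward` and `punish` are hypothetical placeholders, since the RPMap values are not given in the text; only the threshold of 0.2, the value 1 outside the search areas and the multiplication itself come from the description above. The sketch assumes that pixels outside every search area are marked with `np.inf` in the match index map.

```python
import numpy as np

def adjust_energy(energy, mi, mth=0.2, reward=0.5, punish=2.0):
    """Scale the energy map by a reward/punish map derived from the match index.

    reward (< 1) and punish (> 1) are hypothetical values for illustration.
    """
    rp_map = np.ones(energy.shape, dtype=np.float64)  # 1 outside every search area
    in_sa = np.isfinite(mi)
    rp_map[in_sa & (mi < mth)] = reward    # R-region: good match, energy lowered
    rp_map[in_sa & (mi >= mth)] = punish   # P-region: poor match, energy raised
    return energy * rp_map, rp_map
```

Lowering the energy of well-matched pixels makes the next seam more likely to pass through the same content as the previous seam, while raising the energy of poorly matched pixels steers it away from different objects.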
D. Image Enhancement
Seam carving can leave edges that are not visually pleasing, so a hybrid image enhancement approach is applied to the seam-carved frames to improve the sharpness and overall contrast of the image. The enhancement is driven by both global and local processes on the luminance and chrominance components of the image [9], [11] and is based on the parameter-controlled virtual histogram distribution method. This approach also increases the visibility of specified portions or aspects of the image while better preserving the image color.
Finally, the processed frames are assembled to obtain a more pleasing video sequence with reduced size and aspect ratio.


This paper presents an improved seam carving method based on the matching area and a universal energy adjustment scheme. The same technique can be applied to horizontal or multiple seams with little modification. It reduces the size and aspect ratio of the video file. The approach thus produces a visually pleasing video that fits a variety of display devices, so that any information can be accessed over any network from anywhere through any type of display device.


[1] Roger J. Clarke, “Image and video compression: A survey,” International Journal of Imaging Systems and Technology, Vol. 10, Issue 1, pages 20–32, 1999.

[2] E. Salma & J. P. Josh Kuma, “Efficient Image Compression based on Seam Carving for Arbitrary Resolution Display Devices,” IJCA Journal, Vol.68, No. 4, January 2013.

[3] S. Avidan and A. Shamir, “Seam carving for content-aware image resizing,” ACM Trans. Graphics, article no. 10, 2007.

[4] M. Rubinstein, A. Shamir, and S. Avidan, “Improved seam carving for video retargeting,” ACM Trans. Graph. (TOG), vol. 27, no. 3, pp. 1–9, 2008.

[5] Y. Wang, H. Fu, O. Sorkine, T. Lee, and H. Seidel, “Motion-aware temporal coherence for video resizing,” ACM Trans. Graph. (TOG), vol. 28, no. 5, p. 127, 2009.

[6] Tzu-Hua Chao, Jin-Jang Leou, Han-Hui Hsiao, “An enhanced seam carving approach for video retargeting,” Signal & Information Processing Association Annual Summit and Conference (APSIPA ASC), pages 1–5, 2012.

[7] L. Wolf, M. Guttmann, and D. Cohen-Or, “Non-homogeneous content driven video-retargeting,” in Proc. IEEE 11th ICCV, pp. 1–6, Oct. 2007.

[8] Bo Yan, Kairan Sun & Liu Liu, “Matching-area-based Seam Carving For Video Retargeting,” IEEE Transactions on Circuits and Systems for Video Technology, Vol. 23, No. 2, February 2013.

[9] Zhengya Xu, Hong Ren Wu, Xinghuo Yu & Bin Qiu, “Colour Image Enhancement by Virtual Histogram Approach,” IEEE Transactions on Consumer Electronics, Vol. 56, No. 2, May 2010.

[10] Wilhelm Burger and Mark J. Burge, Digital Image Processing, Springer, 2008.

[11] Cristian Munteanu and Agostinho Rosa, “Gray-Scale Image Enhancement as an Automatic Process Driven by Evolution,” IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 34, no. 2, April 2004.

[12] William K. Pratt, Digital Image Processing, John Wiley & Sons, 2008.

[13] Z. Wang, Scalable foveated image and video communications, PhD thesis, Dept. of ECE, The University of Texas at Austin, Dec. 2001.

[14] Y. Wang, H. Lin, O. Sorkine, and T. Lee, “Motion-based video retargeting with optimized crop-and-warp,” ACM Trans. Graph. (TOG), vol. 29, no. 4, p. 90, 2010.