Content Based Image Retrieval by Online and Offline

Swati Killikatt¹, Vidya Kulkarni², Madhuri Bijjal³

M.Tech Student, Dept. of CSE, KLSGIT Belgaum, Karnataka, India¹
Associate Professor, Dept. of MCA, KLSGIT Belgaum, India²
M.Tech Student, Dept. of CSE, KLSGIT Belgaum, India³

Visit for more related articles at International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering

Abstract

With many potential practical applications, content-based image retrieval (CBIR) has attracted substantial attention during the past few years. A variety of relevance feedback (RF) schemes have been developed as a powerful tool to bridge the semantic gap between low-level visual features and high-level semantic concepts, and thus to improve the performance of CBIR systems. Among the various RF approaches, support vector machine (SVM)-based RF is one of the most popular techniques in CBIR. Despite its success, directly using SVM as an RF scheme has two main drawbacks. First, it treats positive and negative feedbacks equally, which is not appropriate since the two groups of training feedbacks have distinct properties. Second, as the size of the image database increases, search and retrieval become slow, which degrades the performance of the system. To overcome these two drawbacks, the CBIR system is implemented in both offline and online modes, making use of the color, texture, and edge properties of images.

Keywords

Content-based image retrieval (CBIR), graph embedding, relevance feedback (RF), support vector machine (SVM), Hue Saturation Value (HSV).

INTRODUCTION

During the past few years, content-based image retrieval (CBIR) has gained much attention for its potential applications in multimedia management [1]-[2]. CBIR, also known as query by image content (QBIC), is a technique that uses visual contents to search for images in large-scale image databases according to user interests, and it has been an active and fast-advancing research area since the 1990s. It is motivated by the explosive growth of image records and the online accessibility of remotely stored images, which makes an effective search scheme urgently necessary for managing huge image databases. Unlike a traditional search engine, a CBIR system describes an image query by one or more example images, and low-level visual features (e.g., color [3]-[5], texture [5]-[7], shape [8]-[10]) are automatically extracted to represent the images in the database. However, the low-level features captured from the images may not accurately characterize the high-level semantic concepts [1], [2]. To reduce this inconsistency, retrieval is carried out according to the image contents themselves; this strategy is called content-based image retrieval. In the content-based approach, images are searched according to visual features such as color, texture, and edge information, as shown in Fig. 1.

To narrow this so-called semantic gap, relevance feedback (RF) was introduced as a powerful tool to enhance the performance of CBIR [14], and several RF algorithms have been proposed. In one approach, a self-organizing map was used to construct the RF algorithm. In another, a one-class support vector machine (SVM) estimated the density of the positive feedback samples [13]. Derived from the one-class SVM, a biased SVM inherited its merits while also incorporating the negative feedback samples. Considering the geometric structure of the low-level visual features, manifold-learning-based approaches were proposed to find the intrinsic structure of images and improve retrieval performance. Based on the observation that “all positive examples are alike; each negative example is negative in its own way,” RF was also formulated as a biased subspace learning problem in which there is an unknown number of classes but the user is concerned only with the positive class.

Traditional SVM-based RF approaches ignore the basic difference between the two groups of feedbacks: all positive feedbacks share a similar concept, while each negative feedback usually differs from the others. A typical set of feedback samples in an RF iteration is shown in Fig. 1. Because traditional SVM RF techniques treat positive and negative feedbacks equally, directly using SVM as an RF scheme can damage the performance of CBIR systems. A further problem stems from the fact that different semantic concepts live in different subspaces and each image can live in many different subspaces; it is the goal of RF schemes to figure out “which one.” It is a burden for traditional SVM-based RF schemes to tune their internal parameters to adapt to changes of the subspace. Such difficulties have severely degraded the effectiveness of traditional SVM RF approaches for CBIR. The present work addresses this problem by implementing CBIR in both offline and online modes.

RELATED WORK

CBIR has become an active and fast-advancing research area in image retrieval in the last decade. Existing approaches can be grouped into those based on global image properties, local image properties, region-level features, relevance feedback, and semantics. The difference between the user’s information need and the image representation is called the semantic gap in CBIR systems. The limited retrieval accuracy of image-centric retrieval systems is essentially due to this inherent semantic gap. To reduce the gap, interactive relevance feedback is introduced into CBIR. The basic idea behind relevance feedback is to incorporate the subjectivity of human perception into the query process and to give users the opportunity to evaluate the retrieval results. The similarity measures are automatically refined on the basis of these evaluations. Various approaches to content-based image retrieval exist; some of the important literature is discussed below.

A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain [1] explain that “content-based” means that the search analyzes the actual contents of the image rather than metadata such as keywords, tags, or descriptions associated with the image. The term “content” in this context may refer to colors, shapes, textures, or any other information that can be derived from the image itself. CBIR is desirable because most web-based image search engines rely purely on metadata, which produces many irrelevant results. Moreover, having humans manually enter keywords for images in a large database can be inefficient and expensive, and may not capture every keyword that describes the image. Thus, a system that can filter images based on their content would provide better indexing and return more accurate results.

Gulfishan Firdose Ahmed and Raju Barskar [2] introduce the basic components of a content-based image retrieval system. Image retrieval methods based on color, texture, shape, and semantics are discussed, analyzed, and compared. Because semantic-based image retrieval is a better way to address the “semantic gap” problem, that method is stressed in the paper. Related techniques such as relevance feedback and performance evaluation are also discussed. In many areas of commerce, government, academia, and hospitals, large collections of digital images are being created. Many of these collections are the product of digitizing existing collections of analogue photographs, diagrams, drawings, paintings, and prints. Previously, the only way of searching these collections was by keyword indexing or simply by browsing. Digital image databases, however, open the way to content-based searching. The paper surveys some technical aspects of current content-based image retrieval systems.

X. Zhou and T. Huang [3] analyze the nature of the relevance feedback problem in a continuous representation space in the context of multimedia information retrieval. Emphasis is put on exploring the uniqueness of the problem and comparing the assumptions, implementations, and merits of various solutions in the literature. An attempt is made to compile a list of critical issues to consider when designing a relevance feedback algorithm. With a comprehensive review as its main portion, the paper also offers some novel solutions and perspectives. Relevance feedback is a feature of some information retrieval systems: the idea is to take the results initially returned for a given query and to use information about whether or not those results are relevant to perform a new query.

Chun et al. [12] proposed a CBIR method based on an efficient combination of multiresolution color and texture features. As color features, color autocorrelograms of the hue and saturation component images in HSV color space are used. As texture features, block difference of inverse probabilities and block variation of local correlation coefficient moments of the value component image are adopted. The color and texture features are extracted in the multiresolution wavelet domain and then combined.

S. Tong and E. Chang [13] use support vector machines (SVMs, also called support vector networks), which are supervised learning models with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis. The basic SVM takes a set of input data and predicts, for each given input, which of two possible classes forms the output, making it a non-probabilistic binary linear classifier. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other. An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on.

D. Tao, X. Tang, and X. Li [14] make theoretical and practical comparisons between the principal and complement components of image features in CBIR RF. Most previous RF approaches treat positive and negative feedbacks equivalently, although this assumption is not appropriate since the two groups of training feedbacks have very different properties: all positive feedbacks share a homogeneous concept, while negative feedbacks do not. The authors address this problem by proposing an orthogonal complement component analysis. Experimental results on a real-world image collection demonstrate that the proposed complement components method consistently outperforms the conventional principal components method in both linear and kernel spaces when users want to retrieve images with a homogeneous concept.

PROPERTIES OF IMAGE

A. Color Feature Extraction

A color image can be represented using the three primaries of a color space. Since the RGB color space does not correspond to the human way of perceiving colors, the HSV color space is used in this approach. HSV is an intuitive color space in the sense that each component contributes directly to visual perception, and it is common in image retrieval systems. Hue distinguishes colors, that is, it represents a pure color, whereas saturation measures the amount of white light added to a pure color. Value refers to the perceived light intensity, i.e., brightness. The main advantages of the HSV color space are its good compatibility with human intuition and the separability of its chromatic and achromatic components. The color distribution of the pixels in an image contains useful information about its content. An image has both global and local color properties. Two features can be used to represent the global properties: the mean of the pixel colors, which states the principal color of the image, and the standard deviation of the pixel colors, which represents the variation of pixel colors in the image. The degree of variation of pixel colors in an image is called the color complexity of the image.

Global Color Properties:

The mean (μ) and the standard deviation (σ) of a color image are defined as follows:

where μ = [μ_H, μ_S, μ_V]^T and σ = [σ_H, σ_S, σ_V]^T; each component of μ and σ carries the corresponding HSV information, and P_i denotes the i-th pixel of the image.
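The global color features can be illustrated with a short sketch. It assumes the usual per-channel definitions, μ = (1/N) Σ P_i and σ = sqrt((1/N) Σ (P_i − μ)²) over the N pixels of the image. The 1/N normalization, the function name hsv_mean_std, and the use of Python's colorsys module for the RGB-to-HSV conversion are assumptions made for illustration; they are not taken from the implementation described in this paper.

```python
import colorsys
import math


def hsv_mean_std(rgb_pixels):
    """Return (mu, sigma), each a 3-element list for the H, S and V channels.

    rgb_pixels: sequence of (r, g, b) tuples with components in 0..255.
    Assumption: population standard deviation (division by N).
    """
    # colorsys expects floats in [0, 1] and returns (h, s, v) in [0, 1].
    hsv = [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
           for r, g, b in rgb_pixels]
    n = len(hsv)
    if n == 0:
        raise ValueError("image has no pixels")
    # Per-channel mean: the principal color of the image.
    mu = [sum(p[c] for p in hsv) / n for c in range(3)]
    # Per-channel standard deviation: the color complexity of the image.
    sigma = [math.sqrt(sum((p[c] - mu[c]) ** 2 for p in hsv) / n)
             for c in range(3)]
    return mu, sigma


# Two-pixel toy image: one pure red pixel and one black pixel.
mu, sigma = hsv_mean_std([(255, 0, 0), (0, 0, 0)])
```

The six numbers in μ and σ can be concatenated into the global part of an image's feature vector.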

Local Color Properties:

The local color properties of an image also play an important role in improving retrieval performance. Hence, a feature called the binary bitmap can be used to capture the local color information of an image.
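The text does not spell out how the binary bitmap is built. One plausible construction, assumed here purely for illustration, divides a single color channel into square blocks and sets a block's bit to 1 when the block mean is at least the global mean of that channel. The function name binary_bitmap and the block parameter are hypothetical.

```python
def binary_bitmap(channel, block):
    """Return one bit per block, in row-major block order.

    channel: 2-D list of numbers (rows of equal length), e.g. the V channel.
    block:   side length of the square blocks (assumed parameter).
    """
    rows, cols = len(channel), len(channel[0])
    global_mean = sum(sum(row) for row in channel) / (rows * cols)
    bits = []
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            # Collect the pixels of this block, clipping at the image border.
            vals = [channel[r][c]
                    for r in range(r0, min(r0 + block, rows))
                    for c in range(c0, min(c0 + block, cols))]
            block_mean = sum(vals) / len(vals)
            bits.append(1 if block_mean >= global_mean else 0)
    return bits


# 4x4 toy channel with bright top-left and bottom-right quadrants.
toy = [[9, 9, 1, 1],
       [9, 9, 1, 1],
       [1, 1, 9, 9],
       [1, 1, 9, 9]]
bits = binary_bitmap(toy, 2)
```

Two bitmaps built with the same block size can then be compared bit by bit, for example by counting the positions at which they differ.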

B. Texture Properties

Texture is an important attribute that refers to the innate surface properties of an object and their relationship to the surrounding environment. A gray-level co-occurrence matrix (GLCM), which is a simple and effective method for representing texture, can be used. The GLCM is built for given directions and distances between pixel pairs, and meaningful statistics are then extracted from the matrix as texture features. The GLCM represents the probability p(i, j; d, θ) that two pixels in an image, located at distance d and angle θ from each other, have gray levels i and j.
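A minimal sketch of the GLCM computation follows. The pixel offset (dx, dy) encodes the distance d and angle θ; for example, (d, 0) corresponds to distance d at 0°. The text does not say which statistics are extracted, so contrast and energy are used here as two common choices. The function names and the toy image are illustrative assumptions.

```python
def glcm(gray, levels, dx, dy):
    """Normalized co-occurrence matrix p[i][j] for the pixel offset (dx, dy).

    gray:   2-D list of integer gray levels in 0..levels-1.
    levels: number of gray levels.
    """
    rows, cols = len(gray), len(gray[0])
    counts = [[0] * levels for _ in range(levels)]
    pairs = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Count the pair only if the neighbor lies inside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[gray[r][c]][gray[r2][c2]] += 1
                pairs += 1
    if pairs == 0:
        raise ValueError("offset leaves no valid pixel pairs")
    return [[v / pairs for v in row] for row in counts]


def glcm_contrast(p):
    """Sum of (i - j)^2 * p[i][j]: large for strong local gray-level changes."""
    return sum((i - j) ** 2 * v
               for i, row in enumerate(p) for j, v in enumerate(row))


def glcm_energy(p):
    """Sum of p[i][j]^2: large for uniform or regular textures."""
    return sum(v * v for row in p for v in row)


toy = [[0, 0, 1],
       [0, 0, 1],
       [0, 2, 2]]
p = glcm(toy, levels=3, dx=1, dy=0)  # horizontal neighbors, d = 1, angle 0
features = [glcm_contrast(p), glcm_energy(p)]
```

In practice the matrix is often computed for several angles and the resulting statistics are averaged or concatenated.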

C. Edge Feature Extraction

Edges in images constitute an important feature for representing their content, and human eyes are very sensitive to edge features in image perception. An edge histogram in the image space represents the frequency and directionality of the brightness changes in the image. The edge histogram descriptor (EHD) is adopted to describe the edge distribution with a histogram based on the local edge distribution in an image.
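The idea of an edge histogram can be sketched in simplified form. The version below classifies each 2×2 pixel block into one of five edge types using small directional filters and counts the types over the whole image. The full EHD additionally partitions the image into sub-images and keeps one histogram per sub-image, which is omitted here. The filter coefficients, the threshold value, and all names are assumptions for illustration, not the implementation used in this work.

```python
import math

# Assumed 2x2 directional filters, applied to (top-left, top-right,
# bottom-left, bottom-right) of each block.
EDGE_FILTERS = {
    "vertical": (1.0, -1.0, 1.0, -1.0),
    "horizontal": (1.0, 1.0, -1.0, -1.0),
    "diag_45": (math.sqrt(2), 0.0, 0.0, -math.sqrt(2)),
    "diag_135": (0.0, math.sqrt(2), -math.sqrt(2), 0.0),
    "non_directional": (2.0, -2.0, -2.0, 2.0),
}


def edge_histogram(gray, threshold=11.0):
    """Return the fraction of 2x2 blocks assigned to each edge type.

    gray: 2-D list of gray levels. Blocks whose strongest filter response
    is below `threshold` (an assumed value) are counted as edge-free.
    """
    names = list(EDGE_FILTERS)
    counts = dict.fromkeys(names, 0)
    blocks = 0
    for r in range(0, len(gray) - 1, 2):
        for c in range(0, len(gray[0]) - 1, 2):
            block = (gray[r][c], gray[r][c + 1],
                     gray[r + 1][c], gray[r + 1][c + 1])
            # Absolute response of every filter on this block.
            strength = {name: abs(sum(k * v for k, v in zip(f, block)))
                        for name, f in EDGE_FILTERS.items()}
            best = max(names, key=strength.get)
            if strength[best] >= threshold:
                counts[best] += 1
            blocks += 1
    if blocks == 0:
        raise ValueError("image is smaller than one 2x2 block")
    return [counts[name] / blocks for name in names]


# A dark-to-bright transition from left to right is a vertical edge.
hist = edge_histogram([[0, 100], [0, 100]])
```

The resulting five-bin histogram can be appended to the color and texture features of an image.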

IMPLEMENTATION

The figure shows the flow of the work carried out. There are two modules:

• Login module
• CBIR module
   o Online
   o Offline

A. Login Module

Login or logon (also called logging in or on and signing in or on) is the process by which individual access to a computer system is controlled by identification of the user using credentials provided by the user. A user can log in to a system and can then log out or log off (perform a logout / logoff) when the access is no longer needed. Logging out may be done explicitly by the user performing some action, such as entering the appropriate command, or clicking a website link labeled as such. It can also be done implicitly, such as by powering the machine off, closing a web browser window, leaving a website, or not refreshing a webpage within a defined period.

B. CBIR Module

In the content-based approach, images are searched based on visual features such as color, texture, and edge information, and the similarities between images are computed. This module has two submodules.

1) Offline Module: Once the user has logged in, he or she can load the desired images. The user then submits a query image, and the system extracts low-level visual features for the query image and the database images. The system computes the similarity between the query image and each database image according to these features, and retrieves and presents a sequence of images ranked in decreasing order of similarity. As a result, the user finds relevant images by viewing the top-ranked images first, and the retrieval performance can be calculated.

2) Online Module: As the database size increases, image search and retrieval become slow, which affects performance. Moreover, it is usually not possible to keep all images in the database, since the database would become huge. The user is therefore given one more option, Broader Search, with which similar images can also be searched and retrieved online.

METHODOLOGY

We develop a content-based image retrieval system using color, texture, and edge features, as shown in Fig. 4.

The system operates in four phases:

1) Querying: The user provides a sample image as the query for the system.

2) Feature Extraction: The system extracts low-level visual features from the user query image and the database images.

3) Similarity Computation: The system computes the similarity between the user query image and each database image according to the low-level visual features.

Euclidean Distance Measure:

Similarity is measured using the Euclidean distance between an image D in the database and the query image Q, which is given by

\[ d(D, Q) = \sqrt{\sum_{i=1}^{n} (D_i - Q_i)^2} \]

where D_i and Q_i are the i-th components of the feature vectors of image D and query image Q, respectively, each of size n.

4) Retrieval: The system retrieves and presents a sequence of images ranked in decreasing order of similarity, that is, in increasing order of Euclidean distance. The user can thus find relevant images by viewing the top-ranked images first, and the retrieval performance can then be calculated.
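The similarity computation and retrieval phases can be sketched as follows, assuming that every image has already been reduced to a fixed-length feature vector. The dictionary-based database, the function names, and the use of precision as the retrieval performance measure are assumptions made for illustration; the text does not state which performance measure is calculated.

```python
import math


def euclidean_distance(d, q):
    """Euclidean distance between two feature vectors of equal length n."""
    if len(d) != len(q):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((di - qi) ** 2 for di, qi in zip(d, q)))


def retrieve(query_vec, database, top_k):
    """Return the top_k (name, distance) pairs, most similar first.

    database: dict mapping an image name to its feature vector.
    A smaller distance means a higher similarity.
    """
    scored = [(name, euclidean_distance(vec, query_vec))
              for name, vec in database.items()]
    scored.sort(key=lambda item: item[1])
    return scored[:top_k]


def precision(retrieved_names, relevant_names):
    """Fraction of the retrieved images that are relevant."""
    if not retrieved_names:
        return 0.0
    hits = sum(1 for name in retrieved_names if name in relevant_names)
    return hits / len(retrieved_names)


# Toy database with 2-D feature vectors (hypothetical values).
db = {"a": [0.0, 0.0], "b": [3.0, 4.0], "c": [1.0, 0.0]}
ranked = retrieve([0.0, 0.0], db, top_k=2)
score = precision([name for name, _ in ranked], {"a", "b"})
```

Because the features have different ranges, it is common to normalize each feature before computing distances so that no single feature dominates the ranking.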

EXPERIMENTAL SETUP AND RESULTS

This section describes the experimental setup used for result analysis. The experiments were run on a PC with a Pentium 4 or higher processor, 2 GB or more of RAM, and a 40 GB hard disk drive, running Windows 7 with OpenCV 2.4.3 and Microsoft Visual Studio 2010.

CONCLUSION AND FUTURE WORK

Content-based image retrieval (CBIR) is one of the most popular approaches to retrieving images. In the content-based approach, images are searched based on visual features such as color, texture, and edge information. Contemporary computers can perform simple matching of hundreds of images in near real time. It is widely recognized that most current CBIR systems work with low-level features (color, texture, shape) and that next-generation systems should operate at a higher semantic level. Relevance feedback (RF) is a tool used to improve the performance of a CBIR system. The basic idea behind RF is to incorporate the subjectivity of human perception into the query process and to give users the opportunity to evaluate the retrieval results. By using color, texture, shape, and other properties of the images, the system differentiates between positive and negative feedback and displays only the images relevant to the query image. Images can be retrieved both online and offline based on the query image, which improves the performance of the CBIR system.

In the present work, content-based image retrieval has been implemented in both online and offline modes, which makes it easy and fast to search and retrieve images. As future work, the same approach can be extended to voice recording and voice searching.

References

[1] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, “Content-based image retrieval at the end of the early years,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 12, pp. 1349–1380, Dec. 2000.
[2] G. F. Ahmed and R. Barskar, “A study on different image retrieval techniques in image processing,” Sep. 2011.
[3] X. Zhou and T. Huang, “Relevance feedback for image retrieval: A comprehensive review,” Multimedia Syst., vol. 8, no. 6, pp. 536–544, Apr. 2003.
[4] R. Datta, D. Joshi, J. Li, and J. Z. Wang, “Image retrieval: Ideas, influences, and trends of the new age,” ACM Comput. Surv., vol. 40, no. 2, pp. 1–60, Apr. 2008.
[5] M. J. Swain and D. H. Ballard, “Color indexing,” Int. J. Comput. Vis., vol. 7, no. 1, pp. 11–32, Nov. 1991.
[6] G. Pass, R. Zabih, and J. Miller, “Comparing images using color coherence vectors,” in Proc. ACM Multimedia, 1996, pp. 65–73.
[7] H. Tamura, S. Mori, and T. Yamawaki, “Texture features corresponding to visual perception,” IEEE Trans. Syst., Man, Cybern., vol. SMC-8, no. 6, pp. 460–473, Jun. 1978.
[8] J. Mao and A. Jain, “Texture classification and segmentation using multiresolution simultaneous autoregressive models,” Pattern Recognit., vol. 25, no. 2, pp. 173–188, Feb. 1992.
[9] A. Jain and A. Vailaya, “Image retrieval using color and shape,” Pattern Recognit., vol. 29, no. 8, pp. 1233–1244, Aug. 1996.
[10] W. Niblack, R. Barber, W. Equitz, M. Flickner, E. Glasman, D. Petkovic, P. Yanker, C. Faloutsos, and G. Taubin, “The QBIC project: Querying images by content using color, texture, and shape,” in Proc. SPIE Storage and Retrieval for Images and Video Databases, Feb. 1993, pp. 173–181.
[11] A. Jain and A. Vailaya, “Shape-based retrieval: A case study with trademark image databases,” Pattern Recognit., vol. 31, no. 9, pp. 1369–1390, Sep. 1998.
[12] Y. D. Chun, N. C. Kim, and I. H. Jang, “Content-based image retrieval using multiresolution color and texture features,” IEEE Trans. Multimedia, vol. 10, no. 6, pp. 1073–1084, Oct. 2008.
[13] S. Tong and E. Chang, “Support vector machine active learning for image retrieval,” in Proc. ACM Multimedia, 2001, pp. 107–118.
[14] D. Tao, X. Tang, and X. Li, “Which components are important for interactive image searching?,” IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 1, pp. 3–11, Jan. 2008.