Survey on Web-Scale Image Search and Re-Ranking With Semantic Signatures

ISSN ONLINE(2320-9801) PRINT (2320-9798)

Darshana C. Chaudhari¹ and Prof. Priti Subramanium²
  1. M.E. Student, Department of CSE, Shri Sant Gadge Baba College of Engineering & Technology, Bhusawal, North
    Maharashtra University, India
  2. Assistant Professor, Department of CSE, Shri Sant Gadge Baba College of Engineering & Technology, Bhusawal,
    North Maharashtra University, India

International Journal of Innovative Research in Computer and Communication Engineering


Web-scale image search results can be improved by image re-ranking. Many commercial search engines, such as Google, Yahoo and Bing, have adopted this strategy. First, the user gives a query keyword and text-based image retrieval is performed. The user is then asked to select a query image from the retrieved pool with minimal effort, by just one click, and the images in the pool are re-ranked based on their visual resemblance to the query image. This method faces some challenges. Visual feature vectors need to be short to achieve high matching efficiency, but some well-known visual features are large. Another challenge is that the resemblance of visual features may not correlate well with the images’ high-level semantic meanings. To overcome these problems, a new technique for web-scale image re-ranking is proposed in this paper. Instead of manually defining an entire concept glossary, different semantic spaces for different query keywords are found offline, independently and automatically. Semantic signatures of the images are obtained by projecting their visual features into their related semantic spaces, and these signatures can be compacted using hashing techniques. At the online stage, the compact semantic signatures are compared to re-rank the images. This significantly improves the efficiency and accuracy of web image search and re-ranking.


Image re-ranking, query keyword, query image, keyword expansion, image search, semantic space, semantic signature, Hashing.


The volume of images on the Internet has grown steadily over the last decade as Internet access has become available to more and more people. Finding just the images that a user needs is a major challenge in image retrieval. Many commercial Internet-scale image search engines use only keywords as queries. The keywords provided by users tend to be short and cannot describe the actual visual content of the target images. As a result, text-based search results are noisy and ambiguous. For example, if a user enters “apple” as a query keyword, the search results may belong to different categories such as “green apple,” “red apple,” “apple logo,” “apple laptop” and “apple iphone” because of the ambiguity of the word “apple”. To overcome this ambiguity, another approach, content-based image retrieval with relevance feedback, is commonly used. Here, users have to select multiple relevant and irrelevant image examples. Visual similarity metrics are learned from these examples through online training, and the images are re-ranked based on those metrics. However, this approach requires a lot of user intervention and is not suitable for commercial web-scale search engines.
To improve the search results effectively, online image re-ranking limits the user’s effort to just one-click feedback. This strategy has been adopted by most commercial web-scale image search engines. It is shown in Fig. 1.
When the user gives a query keyword to a web image search engine, a collection of images related to the keyword is retrieved based on textual information. The user then selects a query image from this image set. The query image reflects the user’s search intention, and the remaining images in the set are re-ranked according to their visual similarity to it. The visual features of the images are extracted offline in advance and stored, so comparing visual features is the major online computational cost. There are two major challenges in this method. First, the visual feature vectors should be short and fast to match in order to achieve high efficiency, but some popular visual features are high-dimensional and cannot be matched directly. Second, the resemblance of low-level visual features does not correlate well with the images’ high-level semantic meanings, which are needed to capture the user’s search intention. Many studies have tried to reduce this semantic gap.
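The one-click re-ranking step described above can be sketched as follows. This is only an illustration under stated assumptions: the `pool` layout, the image ids and the three-dimensional feature values are hypothetical, and plain Euclidean distance stands in for whatever visual similarity a real engine uses.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rerank_by_query(pool, query_id):
    """Order the remaining pool images by visual distance to the query image.

    `pool` maps an image id to its pre-extracted feature vector
    (a hypothetical layout chosen for this sketch).
    """
    query = pool[query_id]
    others = [img for img in pool if img != query_id]
    return sorted(others, key=lambda img: euclidean(pool[img], query))


# Toy pool returned by a text search for "apple"; feature values are made up.
pool = {
    "red_apple_1": [0.9, 0.1, 0.0],
    "red_apple_2": [0.8, 0.2, 0.1],
    "apple_logo": [0.1, 0.1, 0.9],
    "apple_laptop": [0.2, 0.7, 0.6],
}

# The user clicks "red_apple_1"; the other images are re-ranked around it.
ranking = rerank_by_query(pool, "red_apple_1")
print(ranking)
```

With real features of hundreds or thousands of dimensions, this comparison is the online cost that the first challenge refers to.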


Computing visual similarities that reflect the semantic relevance of images is the key component of image re-ranking. Many visual features have been developed in the last decade. However, the low-level visual features that work well differ from one query image to another.
W. Ma et al. [2] presented a prototype image retrieval system called NeTra, which uses color, shape, texture and spatial location information in segmented image regions to search for and retrieve similar regions from the database. Its integration with a robust automated image segmentation algorithm is a distinctive characteristic of this system. It permits search based on objects or regions, and it also improves retrieval quality when images contain many complex objects. X. Zhu et al. [3] presented a semi-supervised learning approach based on a Gaussian random field model and proposed a random-walk model on graph manifolds to create “smoothed” similarity scores, which are useful for ranking the remaining images once one of them is selected as the query image. The goal is not classification; instead, the model uses centrality in a graph as a means of ranking images.
Keyword-based or text-based Internet image search suffers from considerable ambiguity. Many text-based Internet image search methods are limited by the fact that query keywords do not properly describe the content of images. Jing and Baluja [6] proposed VisualRank, which examines the visual link structure among images and finds visual themes in order to re-rank them. The VisualRank approach studies the distribution of visual similarities among the images. Common visual features are identified across a set of images, and the image most similar to the others in the set is found. Similarity is measured with an image-to-image distance function, that is, the distance between images from the same class should be smaller than the distance between images from different classes.
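A PageRank-style iteration over a visual similarity graph, in the spirit of VisualRank, might look like the sketch below. It is not the implementation from [6]: the similarity values, the damping factor of 0.85 and the fixed iteration count are all assumptions made for illustration.

```python
def visual_rank(similarity, damping=0.85, iterations=50):
    """PageRank-style scores over a symmetric image similarity matrix.

    `similarity[i][j]` is the visual similarity between images i and j.
    The damping factor and iteration count are illustrative assumptions.
    """
    n = len(similarity)
    # Column-normalise so each image spreads its score over its neighbours.
    col_sums = [sum(similarity[i][j] for i in range(n)) for j in range(n)]
    norm = [
        [similarity[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        rank = [
            damping * sum(norm[i][j] * rank[j] for j in range(n)) + (1 - damping) / n
            for i in range(n)
        ]
    return rank


# Three mutually similar images (0, 1, 2) and one visual outlier (3); values are made up.
similarity = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.85, 0.1],
    [0.8, 0.85, 0.0, 0.05],
    [0.1, 0.1, 0.05, 0.0],
]
scores = visual_rank(similarity)
order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(order)
```

Images that many other images resemble accumulate score, so the outlier ends up at the bottom of the ranking.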
Most pseudo-relevance feedback techniques [4] limit the user’s effort by expanding the query image with the most visually similar images. The top N images that best match the query image visually are taken as expanded positive examples for learning a similarity metric. Because the top N images are not necessarily semantically related to the query image, the learned similarity metric may not reliably reflect semantic relevance and may even degrade re-ranking performance. This semantic gap between the query image and visually inconsistent images results in poor performance.
Cui et al. [5] classified query images into eight predefined intention classes and gave different feature weights to different types of query images. However, eight weighting schemes could hardly cover the large variety of web images. In addition, a query image could be categorized into the wrong class.
Cai et al. [7] proposed to match images in semantic spaces and re-rank them using attributes or reference classes that were manually defined and learned from manually labeled training examples. They assumed that there was one main semantic class for a query keyword. Images were re-ranked using this main category together with visual and textual features. Still, it is hard and inefficient to learn a universal visual semantic space that describes the highly varied images on the web.


The diagram of the approach is shown in Fig. 2. It has an offline part and an online part. In the offline part, keyword expansions related to a query keyword are found automatically (e.g., “red apple” and “apple macbook” for the query keyword “apple”) to capture the user’s search intention accurately. These keyword expansions are called the reference classes of the query keyword. Redundant reference classes with similar semantic meanings (like “apple macbook” and “apple laptop”) are removed in order to improve the efficiency of image re-ranking. The user then selects a query image from the image set, which contains the images retrieved with the keyword expansions.
The semantic signatures are then derived from the visual features of the images using a trained multiclass classifier. Finally, at the online stage, all the images in the image set are re-ranked by comparing their pre-computed semantic signatures.
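As a rough illustration of how a semantic signature could be derived, the sketch below treats the signature as the vector of class scores over the reference classes of a query keyword. The classifier here is a deliberately simple stand-in (a softmax over negative squared distances to hypothetical class centroids), not the trained multiclass classifier of the approach, and all names and numbers are assumptions.

```python
import math


def semantic_signature(feature, centroids):
    """Project a visual feature vector onto the reference classes.

    `centroids` maps each reference class name to a representative feature
    vector (hypothetical). The result holds one score per class, in sorted
    class-name order, and the scores sum to 1.
    """
    classes = sorted(centroids)
    # Stand-in classifier: closer centroids receive higher scores.
    scores = [
        -sum((f - c) ** 2 for f, c in zip(feature, centroids[name]))
        for name in classes
    ]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical reference classes discovered offline for the keyword "apple".
centroids = {
    "red apple": [0.9, 0.1, 0.0],
    "apple logo": [0.1, 0.1, 0.9],
    "apple macbook": [0.2, 0.7, 0.6],
}

# Sorted class order is: "apple logo", "apple macbook", "red apple".
signature = semantic_signature([0.85, 0.15, 0.05], centroids)
print(signature)
```

The signature has one entry per reference class, so its length depends on the number of reference classes rather than on the dimensionality of the original visual features.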


The existing system can be improved in several ways. The proposed method requires less time and less memory than the existing method. In the proposed method, when the user gives a query keyword, keyword expansion related to it is performed. After that, visual query expansion is done automatically to obtain multiple positive example images specific to the query image, so that the user’s intention is captured more accurately and more relevant results are returned. The new image re-ranking framework focuses on the semantic signatures associated with the images, which are derived using a trained multiclass classifier. The proposed method also uses a hashing algorithm. Although semantic signatures are already much shorter than visual features, they can be reduced further with block-mean perceptual hashing techniques to increase their matching efficiency. P-hash is reported to be a reliable and fast algorithm for web-based applications.
All the images in the image set have pre-computed hash values. At the online stage, the images in this set are re-ranked by comparing their hash values, using the Euclidean distance to compute their similarity to the query image. The final re-ranked images are then displayed to the user.
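The online comparison of pre-computed hash values can be sketched as below. The hashing step is a simplified assumption: it applies a block-mean idea directly to a signature vector (average each block, then threshold against the median of the block means), which is not necessarily how P-hash or the proposed method computes its hashes. The signatures, the block size and the image ids are made up.

```python
import math
import statistics


def block_mean_hash(signature, block_size=2):
    """Compact a signature into bits, one bit per block of values.

    Simplified block-mean scheme (an assumption for this sketch): a bit is 1
    when its block mean is above the median of all block means.
    """
    blocks = [signature[i:i + block_size] for i in range(0, len(signature), block_size)]
    means = [sum(block) / len(block) for block in blocks]
    threshold = statistics.median(means)
    return [1 if m > threshold else 0 for m in means]


def euclidean(a, b):
    """Euclidean distance between two equal-length hash vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Offline: hash values are pre-computed for every image in the set.
signatures = {
    "a": [0.35, 0.35, 0.15, 0.05, 0.05, 0.05, 0.0, 0.0],
    "b": [0.0, 0.0, 0.05, 0.05, 0.1, 0.1, 0.4, 0.3],
    "c": [0.3, 0.3, 0.0, 0.0, 0.2, 0.2, 0.0, 0.0],
}
hashes = {img: block_mean_hash(sig) for img, sig in signatures.items()}

# Online: hash the query image's signature and re-rank by distance to it.
query_hash = block_mean_hash([0.4, 0.3, 0.1, 0.1, 0.05, 0.05, 0.0, 0.0])
ranking = sorted(hashes, key=lambda img: euclidean(hashes[img], query_hash))
print(query_hash, ranking)
```

For binary hashes the Euclidean distance is the square root of the Hamming distance, so both measures produce the same ordering.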


In this paper, we have studied an Internet-based image search approach. We have also discussed conventional web-based image search techniques and pointed out their shortcomings. The proposed image re-ranking framework can overcome the shortcomings of the previous methods, considerably improves both the accuracy and the efficiency of re-ranking, and can give better results in less time.


1. X. Wang, S. Qiu, K. Liu, X. Tang, “Web Image Re-Ranking Using Query-Specific Semantic Signatures,” IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 36, pp. 810-823, 2014.

2. W. Y. Ma and B. S. Manjunath, “A toolbox for navigating large image databases,” Multimedia Systems, Vol. 3, pp. 184-198, 1999.

3. X. Zhu and J. D. Lafferty, “Semi-supervised Learning using Gaussian Fields and Harmonic Functions,” Proc. 20th Int’l Conf. Machine Learning, pp. 912-919, 2003.

4. R. Yan, E. Hauptmann, and R. Jin, “Multimedia Search with Pseudo-Relevance Feedback,” in Proc. Int. Conf. Image and Video Retrieval, 2003.

5. J. Cui, F. Wen, and X. Tang, “Real Time Google and Live Image Search Re-Ranking,” in Proc. 16th ACM Int. Conf. Multimedia, 2008.

6. Y. Jing and S. Baluja, “VisualRank: Applying PageRank to Large-Scale Image Search,” IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 30, pp. 1877-1890, 2008.

7. J. Cai, Z. Zha, W. Zhou, and Q. Tian, “Attribute-Assisted Reranking for Web Image Retrieval,” in Proc. 20th ACM Int. Conf. Multimedia, 2012.

8. Suresh Kumar Nagarajan, Shanmugam Saravanan, “Content-Based Medical Image Annotation and Retrieval using Perceptual Hashing Algorithm,” IOSR Journal of Engineering, Vol. 2, pp. 814-818, 2012.