nips nips2004 nips2004-85 nips2004-85-reference knowledge-graph by maker-knowledge-mining

85 nips-2004-Instance-Based Relevance Feedback for Image Retrieval


Source: pdf

Author: Giorgio Gia\-cin\-to, Fabio Roli

Abstract: High retrieval precision in content-based image retrieval can be attained by adopting relevance feedback mechanisms. These mechanisms require that the user judges the quality of the results of the query by marking all the retrieved images as being either relevant or not. Then, the search engine exploits this information to adapt the search to better meet user’s needs. At present, the vast majority of proposed relevance feedback mechanisms are formulated in terms of search model that has to be optimized. Such an optimization involves the modification of some search parameters so that the nearest neighbor of the query vector contains the largest number of relevant images. In this paper, a different approach to relevance feedback is proposed. After the user provides the first feedback, following retrievals are not based on knn search, but on the computation of a relevance score for each image of the database. This score is computed as a function of two distances, namely the distance from the nearest non-relevant image and the distance from the nearest relevant one. Images are then ranked according to this score and the top k images are displayed. Reported results on three image data sets show that the proposed mechanism outperforms other state-of-the-art relevance feedback mechanisms. 1 In t rod u ct i on A large number of content-based image retrieval (CBIR) systems rely on the vector representation of images in a multidimensional feature space representing low-level image characteristics, e.g., color, texture, shape, etc. [1]. Content-based queries are often expressed by visual examples in order to retrieve from the database the images that are “similar” to the examples. This kind of retrieval is often referred to as K nearest-neighbor retrieval. It is easy to see that the effectiveness of content-based image retrieval systems (CBIR) strongly depends on the choice of the set of visual features, on the choice of the “metric” used to model the user’s perception of image similarity, and on the choice of the image used to query the database [1]. Typically, if we allow different users to mark the images retrieved with a given query as relevant or non-relevant, different subsets of images will be marked as relevant. Accordingly, the need for mechanisms to adapt the CBIR system response based on some feedback from the user is widely recognized. It is interesting to note that while relevance feedback mechanisms have been first introduced in the information retrieval field [2], they are receiving more attention in the CBIR field (Huang). The vast majority of relevance feedback techniques proposed in the literature is based on modifying the values of the search parameters as to better represent the concept the user bears in mind. To this end, search parameters are computed as a function of the relevance values assigned by the user to all the images retrieved so far. As an example, relevance feedback is often formulated in terms of the modification of the query vector, and/or in terms of adaptive similarity metrics. [3]-[7]. Recently, pattern classification paradigms such as SVMs have been proposed [8]. Feedback is thus used to model the concept of relevant images and adjust the search consequently. Concept modeling may be difficult on account of the distribution of relevant images in the selected feature space. “Narrow domain” image databases allows extracting good features, so that images bearing similar concepts belong to compact clusters. On the other hand, “broad domain” databases, such as image collection used by graphic professionals, or those made up of images from the Internet, are more difficult to subdivide in cluster because of the high variability of concepts [1]. In these cases, it is worth extracting only low level, non-specialized features, and image retrieval is better formulated in terms of a search problem rather then concept modeling. The present paper aims at offering an original contribution in this direction. Rather then modeling the concept of “relevance” the user bears in mind, feedback is used to assign each image of the database a relevance score. Such a score depends only from two dissimilarities (distances) computed against the images already marked by the user: the dissimilarity from the set of relevant images, and the dissimilarity from the set of non-relevant images. Despite its computational simplicity, this mechanism allows outperforming state-of-the-art relevance feedback mechanisms both on “narrow domain” databases, and on “broad domain” databases. This paper is organized as follows. Section 2 illustrates the idea behind the proposed mechanism and provides the basic assumptions. Section 3 details the proposed relevance feedback mechanism. Results on three image data sets are presented in Section 4, where performances of other relevance feedback mechanisms are compared. Conclusions are drawn in Section 5. 2 In st an ce- b ased rel evan ce est i m at i on The proposed mechanism has been inspired by classification techniques based on the “nearest case” [9]-[10]. Nearest-case theory provided the mechanism to compute the dissimilarity of each image from the sets of relevant and non–relevant images. The ratio between the nearest relevant image and the nearest non-relevant image has been used to compute the degree of relevance of each image of the database [11]. The present section illustrates the rationale behind the use of the nearest-case paradigm. Let us assume that each image of the database has been represented by a number of low-level features, and that a (dis)similarity measure has been defined so that the proximity between pairs of images represents some kind of “conceptual” similarity. In other words, the chosen feature space and similarity metric is meaningful at least for a restricted number of users. A search in image databases is usually performed by retrieving the k most similar images with respect to a given query. The dimension of k is usually small, to avoid displaying a large number of images at a time. Typical values for k are between 10 and 20. However, as the “relevant” images that the user wishes to retrieve may not fit perfectly with the similarity metric designed for the search engine, the user may be interested in exploring other regions of the feature space. To this end, the user marks the subset of “relevant” images out of the k retrieved. Usually, such relevance feedback is used to perform a new k-nn search by modifying some search parameters, i.e., the position of the query point, the similarity metric, and other tuning parameters [1]-[7]. Recent works proposed the use of support vector machine to learn the distribution of relevant images [8]. These techniques require some assumption about the general form of the distribution of relevant images in the feature space. As it is difficult to make any assumption about such a distribution for broad domain databases, we propose to exploit the information about the relevance of the images retrieved so far in a nearest-neighbor fashion. Nearest-neighbor techniques, as used in statistical pattern recognition, case-based reasoning, or instance-based learning, are effective in all applications where it is difficult to produce a high-level generalization of a “class” of objects [9]-[10],[12][13]. Relevance learning in content base image retrieval may well fit into this definition, as it is difficult to provide a general model that can be adapted to represent different concepts of similarity. In addition, the number of available cases may be too small to estimate the optimal set of parameters for such a general model. On the other hand, it can be more effective to use each “relevant” image as well as each “non-relevant” image, as “cases” or “instances” against which the images of the database should be compared. Consequently, we assume that an image is as much as relevant as much as its dissimilarity from the nearest relevant image is small. Analogously, an image is as much as non-relevant as much as its dissimilarity from the nearest non-relevant image is small. 3 Rel evan ce S core Com p u t ati on According to previous section, each image of the database can be thus characterized by a “degree of relevance” and a “degree of non-relevance” according to the dissimilarities from the nearest relevant image, and from the nearest non-relevant image, respectively. However, it should be noted that these degrees should be treated differently because only “relevant” images represent a “concept” in the user’s mind, while “non-relevant” images may represent a number of other concepts different from user’s interest. In other words, while it is meaningful to treat the degree of relevance as a degree of membership to the class of relevant images, the same does not apply to the degree of non-relevance. For this reason, we propose to use the “degree of non-relevance” to weight the “degree of relevance”. Let us denote with R the subset of indexes j ∈ {1,...,k} related to the set of relevant images retrieved so far and the original query (that is relevant by default), and with NR the subset of indexes j ∈ (1,...,k} related to the set of non-relevant images retrieved so far. For each image I of the database, according to the nearest neighbor rule, let us compute the dissimilarity from the nearest image in R and the dissimilarity from the nearest image in NR. Let us denote these dissimilarities as dR(I) and dNR(I), respectively. The value of dR(I) can be clearly used to measure the degree of relevance of image I, assuming that small values of dR(I) are related to very relevant images. On the other hand, the hypothesis that image I is relevant to the user’s query can be supported by a high value of dNR(I). Accordingly, we defined the relevance score ! dR ( I ) $ relevance ( I ) = # 1 + dN ( I ) &


reference text

[1] Smeulders A.W.M., Worring M., Santini S., Gupta A., Jain R.: Content-based image retrieval at the end of the early years. IEEE Trans. on Pattern Analysis and Machine Intelligence 22(12) (2000) 1349-1380

[2] G. Salton and M.J. McGill, Introduction to modern information retrieval, New York, McGraw-Hill, 1988.

[3] Ishikawa Y., Subramanys R., Faloutsos C.: MindReader: Querying databases through multiple examples. In Proceedings. of the 24 th VLDB Conference (1998) 433-438

[4] Santini S., Jain R.: Integrated browsing and querying for image databases. IEEE Multimedia 7(3) (2000) 26-39

[5] Rui Y., Huang T.S.: Relevance Feedback Techniques in Image retrieval. In Lew M.S. (ed.): Principles of Visual Information Retrieval. Springer, London, (2001) 219-258

[6] Sclaroff S., La Cascia M., Sethi S., Taycher L.: Mix and Match Features in the ImageRover search engine. In Lew M.S. (ed.): Principles of Visual Information Retrieval. Springer-Verlag, London (2001) 219-258

[7] Giacinto G., Roli F.: Bayesian relevance feedback for content-based image retrieval. Pattern Recognition 37(7) (2004) 1499-1508

[8] Zhou X.S. and Huang T.S.: Relevance feedback in image retrieval: a comprehensive review, Multimedia Systems 8(6) (2003) 536-544

[9] Aha D.W., Kibler D., Albert M.K. Instance Based learning Algorithms. Machine Learning, 6, (1991) 37-66

[10] Althoff K-D. Case-Based Reasoning. In Chang S.K. (ed.) Handbook on Software Engineering and Knowledge Engineering, World Scientific (2001), 549-588.

[11] Bloch I. Information Combination Operators for Data Fusion: A Comparative Review with Classification. IEEE Trans. on System, Man and Cybernetics - Part A, 26(1) (1996) 52-67

[12] Duda R.O., Hart P.E., and Stork D.G.: Pattern Classification. John Wiley and Sons, Inc., New York, 2001

[13] Hastie T., Tibshrirani R., and Friedman J.: The Elements of Statistical Learning. Springer, New York, 2001

[14] Peng J., Bhanu B., Qing S., Probabilistic feature relevance learning for content-based image retrieval, Computer Vision and Image Understanding 75 (1999) 150-164.

[15] He J., Li M., Zhang H-J, Tong H., Zhang C, Mean Version Space: a New Active Learning Method for Content-Based Image Retrieval, Proc. of MIR 2004, New York, USA. (2004) 15-22.