nips nips2004 nips2004-196 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Suvrit Sra, Joel Tropp, Inderjit S. Dhillon
Abstract: Various problems in machine learning, databases, and statistics involve pairwise distances among a set of objects. It is often desirable for these distances to satisfy the properties of a metric, especially the triangle inequality. Applications where metric data is useful include clustering, classification, metric-based indexing, and approximation algorithms for various graph problems. This paper presents the Metric Nearness Problem: Given a dissimilarity matrix, find the “nearest” matrix of distances that satisfy the triangle inequalities. For ℓp nearness measures, this paper develops efficient triangle fixing algorithms that compute globally optimal solutions by exploiting the inherent structure of the problem. Empirically, the algorithms have time and storage costs that are linear in the number of triangle constraints. The methods can also be easily parallelized for additional speed.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract Various problems in machine learning, databases, and statistics involve pairwise distances among a set of objects. [sent-12, score-0.168]
2 It is often desirable for these distances to satisfy the properties of a metric, especially the triangle inequality. [sent-13, score-0.573]
3 Applications where metric data is useful include clustering, classification, metric-based indexing, and approximation algorithms for various graph problems. [sent-14, score-0.326]
4 This paper presents the Metric Nearness Problem: Given a dissimilarity matrix, find the “nearest” matrix of distances that satisfy the triangle inequalities. [sent-15, score-0.869]
5 For ℓp nearness measures, this paper develops efficient triangle fixing algorithms that compute globally optimal solutions by exploiting the inherent structure of the problem. [sent-16, score-0.866]
6 Empirically, the algorithms have time and storage costs that are linear in the number of triangle constraints. [sent-17, score-0.455]
7 1 Introduction Imagine that a lazy graduate student has been asked to measure the pairwise distances among a group of objects in a metric space. [sent-19, score-0.525]
8 Obviously, all the distances need to be consistent, but the student does not know very much about the space in which the objects are embedded. [sent-21, score-0.229]
9 One way to solve his problem is to find the “nearest” complete set of distances that satisfy the triangle inequalities. [sent-22, score-0.603]
10 This procedure respects the measurements that have already been taken while forcing the missing numbers to behave like distances. [sent-23, score-0.106]
11 More charitably, suppose that the student has finished the experiment, but—measurements being what they are—the numbers do not satisfy the triangle inequality. [sent-24, score-0.466]
12 Once again, the solution seems to require the “nearest” set of distances that satisfy the triangle inequalities. [sent-26, score-0.599]
13 Matrix nearness problems [6] offer a natural framework for developing this idea. [sent-27, score-0.41]
14 If there are n points, we may collect the measurements into an n × n symmetric matrix whose (j, k) entry represents the dissimilarity between the j-th and k-th points. [sent-28, score-0.365]
15 Then, we seek to approximate this matrix by another whose entries satisfy the triangle inequalities. [sent-29, score-0.562]
16 That is, mik ≤ mij + mjk for every triple (i, j, k). [sent-30, score-0.179]
17 Any such matrix will represent the distances among n points in some metric space. [sent-31, score-0.559]
18 For example, one might prefer to change a few entries significantly or to change all the entries a little. [sent-33, score-0.112]
19 We call the problem of approximating general dissimilarity data by metric data the Metric Nearness (MN) Problem. [sent-34, score-0.525]
20 To solve the problem we present triangle-fixing algorithms that take advantage of its structure to produce globally optimal solutions. [sent-38, score-0.094]
21 They observe that machine learning applications often require metric data, and they propose a technique for metrizing dissimilarity data. [sent-44, score-0.566]
22 Their method, constant-shift embedding, increases all the dissimilarities by an equal amount to produce a set of Euclidean distances (i. [sent-45, score-0.222]
23 , a set of numbers that can be realized as the pairwise distances among an ensemble of points in a Euclidean space). [sent-47, score-0.196]
24 The size of the translation depends on the data, so the relative and absolute changes to the dissimilarity values can be large. [sent-48, score-0.297]
25 We seek a consistent set of distances that deviates as little as possible from the original measurements. [sent-50, score-0.202]
26 In our approach, the resulting set of distances can arise from an arbitrary metric space; we do not restrict our attention to obtaining Euclidean distances. [sent-51, score-0.495]
27 In consequence, we expect metric nearness to provide superior denoising. [sent-52, score-0.706]
28 Moreover, our techniques can also learn distances that are missing entirely. [sent-53, score-0.168]
29 That is, a metric dist(x, y) = ((x − y)^T G (x − y))^{1/2}, where G is an s × s positive semi-definite matrix. [sent-57, score-0.296]
30 Then the matrix G is computed by minimizing the total squared distances between similar points while forcing the total distances between dissimilar points to exceed one. [sent-59, score-0.521]
31 In comparison, the metric nearness problem is not restricted to Mahalanobis distances; it can learn a general discrete metric. [sent-61, score-0.706]
32 It also allows us to use specific distance measurements and to indicate our confidence in those measurements (by means of a weight matrix), rather than forcing a binary choice of “similar” or “dissimilar. [sent-62, score-0.208]
33 ” The Metric Nearness Problem may appear similar to metric Multi-Dimensional Scaling (MDS) [8], but we emphasize that the two problems are distinct. [sent-63, score-0.296]
34 The MDS problem endeavors to find an ensemble of points in a prescribed metric space (usually a Euclidean space) such that the distances between these points are close to the set of input distances. [sent-64, score-0.52]
35 In fact MN does not impose any hypotheses on the underlying space other than requiring it to be a metric space. [sent-66, score-0.296]
36 In Section 3, we present algorithms that allow us to solve MN problems with ℓp nearness measures. [sent-69, score-0.47]
37 We define a dissimilarity matrix to be a nonnegative, symmetric matrix with zero diagonal. [sent-73, score-0.363]
38 Meanwhile, a distance matrix is defined to be a dissimilarity matrix whose entries satisfy the triangle inequalities. [sent-74, score-0.882]
39 That is, M is a distance matrix if and only if it is a dissimilarity matrix and mik ≤ mij + mjk for every triple of distinct indices (i, j, k). [sent-75, score-0.6]
40 Distance matrices arise from measuring the distances among n points in a pseudo-metric space (i.e., a space in which distinct points may lie at zero distance from one another). [sent-76, score-0.261]
41 A distance matrix contains N = n (n − 1)/2 free parameters, so we denote the collection of all distance matrices by MN . [sent-79, score-0.217]
42 The metric nearness problem requests a distance matrix M that is closest to a given dissimilarity matrix D with respect to some measure of “closeness. [sent-81, score-1.127]
43 Specifically, we seek a distance matrix M so that M ∈ argmin_{X ∈ MN} ‖W ⊙ (X − D)‖, (2.1) where ⊙ denotes elementwise (Hadamard) multiplication.
44 The weight matrix W reflects our confidence in the entries of D. [sent-85, score-0.123]
45 If, in addition, the norm is strictly convex and the weight matrix has no zeros or infinities off its diagonal, then there is a unique global minimum. [sent-92, score-0.199]
46 It is possible to use any norm in the metric nearness problem. [sent-96, score-0.765]
47 The associated Metric Nearness Problems are
min_{X ∈ MN} ( Σ_{j≠k} wjk |xjk − djk|^p )^{1/p} for 1 ≤ p < ∞, and (2.2)
min_{X ∈ MN} max_{j≠k} wjk |xjk − djk| for p = ∞. (2.3)
48 Note that the ℓp norms are strictly convex for 1 < p < ∞, and therefore the solution to (2.2) is unique in that range. [sent-101, score-0.28]
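To make the measures concrete, here is a small sketch that evaluates the weighted ℓp deviation of a candidate matrix X from D over the off-diagonal entries. The function name and the exact placement of the weights wjk relative to the p-th power are assumptions of this sketch, not details quoted from the text.

```python
import numpy as np

def lp_nearness(X, D, W, p=2.0):
    """Weighted l_p nearness measure in the spirit of (2.2)/(2.3).

    X: candidate distance matrix, D: dissimilarity matrix, W: confidence
    weights; all symmetric with zero diagonal.  Only off-diagonal entries
    contribute, and the weight placement is one plausible reading of (2.2).
    """
    off = ~np.eye(D.shape[0], dtype=bool)
    dev = np.abs(X - D)[off]
    w = W[off]
    if np.isinf(p):
        return float(np.max(w * dev))                     # l_inf measure, cf. (2.3)
    return float(np.sum(w * dev ** p) ** (1.0 / p))       # l_p measure, cf. (2.2)
```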
49 The ℓ1 norm gives the absolute sum of the (weighted) changes to the input matrix, while the ℓ∞ norm only reflects the maximum absolute change. [sent-104, score-0.157]
50 At first, it may appear that one should use quadratic programming (QP) software when p = 2, linear programming (LP) software when p = 1 or p = ∞, and convex programming software for the remaining p. [sent-111, score-0.203]
51 An efficient algorithm must exploit the structure of the triangle inequalities. [sent-113, score-0.361]
52 This method examines each triple of points in turn and optimally enforces any triangle inequality that fails. [sent-115, score-0.463]
53 (The definition of “optimal” depends on the ℓp nearness measure.) [sent-116, score-0.41]
54 To each matrix X of dissimilarities or distances, we associate the vector x formed by stacking the columns of the lower triangle, left to right. [sent-120, score-0.121]
55 Define a constraint matrix A so that M is a distance matrix if and only if Am ≤ 0. [sent-122, score-0.222]
56 3.1 MN for the ℓ2 norm. We first develop a triangle-fixing algorithm for solving (2.2) with p = 2.
57 Given a dissimilarity vector d, we wish to find its orthogonal projection m onto the cone MN . [sent-129, score-0.301]
58 The negative entries of b indicate how much each triangle inequality is violated. [sent-132, score-0.456]
59 The problem becomes min_e ‖e‖_2, subject to Ae ≤ b. (3.1) [sent-133, score-0.101]
60 Suppose that the (i, j, k) triangle inequality is violated, i.e., eij − ejk − eki > bijk. [sent-138, score-0.4]
61 We wish to remedy this violation by making an ℓ2-minimal adjustment of eij, ejk, and eki. [sent-141, score-0.708]
62 In other words, the vector e is projected orthogonally onto the constraint set {e : eij − ejk − eki ≤ bijk }. [sent-142, score-0.87]
63 This is tantamount to solving min_{e'} (1/2)[(e'ij − eij)² + (e'jk − ejk)² + (e'ki − eki)²], subject to e'ij − e'jk − e'ki = bijk. (3.2) [sent-143, score-1.616]
64 It is easy to check that the solution is given by eij ← eij − µijk, ejk ← ejk + µijk, and eki ← eki + µijk, (3.3) where µijk = (eij − ejk − eki − bijk)/3. [sent-145, score-1.382]
65 The updates in (3.3) show that the largest edge weight in the triangle is decreased, while the other two edge weights are increased. [sent-149, score-0.361]
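A minimal sketch of the single-triangle repair (3.3), assuming the (i, j, k) inequality is currently violated; the helper name is invented here, and the explicit value of µijk (one third of the violation) follows from the projection (3.2) rather than from a line quoted above.

```python
def fix_one_triangle(e_ij, e_jk, e_ki, b_ijk):
    """l2-optimal repair of one violated triangle constraint (cf. eq. 3.3).

    Assumes e_ij - e_jk - e_ki > b_ijk.  mu is one third of the violation,
    so the offending (largest) edge shrinks while the other two edges grow
    by the same amount.
    """
    mu = (e_ij - e_jk - e_ki - b_ijk) / 3.0
    return e_ij - mu, e_jk + mu, e_ki + mu
```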
66 In turn, we fix each violated triangle inequality using (3.3). [sent-150, score-0.439]
67 Each dual variable corresponds to the violation in a single triangle inequality, and each individual correction results in a decrease in the violation. [sent-155, score-0.454]
68 We continue until no triangle receives a significant update. [sent-156, score-0.361]
69 Algorithm 3.1 displays the complete iterative scheme that performs triangle fixing along with the appropriate corrections. [sent-158, score-0.361]
70 TRIANGLE FIXING(D, ε). Input: dissimilarity matrix D, tolerance ε. Output: M = argmin_{X∈MN} ‖X − D‖_2. [sent-161, score-0.296]
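Reading the scheme as a Dykstra/Bregman-style cyclic projection with one correction scalar per triangle constraint, a readable end-to-end sketch looks like the following. It is one possible interpretation of the iteration described above, not the paper's optimized implementation, and all names are illustrative.

```python
import itertools
import numpy as np

def triangle_fixing_l2(D, tol=1e-8, max_sweeps=1000):
    """Dykstra-style sketch of l2 triangle fixing.

    D: symmetric, nonnegative dissimilarity matrix with zero diagonal.
    Returns M = D + E, where E is built by cyclically fixing violated
    triangles while maintaining a correction scalar per constraint.
    """
    n = D.shape[0]
    E = np.zeros_like(D, dtype=float)   # error matrix, M = D + E
    Z = {}                              # correction scalar per triangle constraint

    for _ in range(max_sweeps):
        max_change = 0.0
        for i, j, k in itertools.combinations(range(n), 3):
            # each triple yields three inequalities of the form m_ab <= m_bc + m_ca
            for a, b, c in ((i, j, k), (j, k, i), (k, i, j)):
                rhs = D[b, c] + D[c, a] - D[a, b]      # b_abc in the text
                z = Z.get((a, b, c), 0.0)
                # add back the stored correction, then project onto the halfspace
                viol = (E[a, b] - E[b, c] - E[c, a] - rhs) + 3.0 * z
                mu = max(viol, 0.0) / 3.0
                delta = z - mu                         # net change applied this visit
                if delta != 0.0:
                    E[a, b] = E[b, a] = E[a, b] + delta
                    E[b, c] = E[c, b] = E[b, c] - delta
                    E[c, a] = E[a, c] = E[c, a] - delta
                    max_change = max(max_change, abs(delta))
                Z[(a, b, c)] = mu
        if max_change < tol:
            break
    return D + E
```

If D already satisfies every triangle inequality, no correction is ever applied and the routine returns D after one sweep.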
71 By itself, Bregman’s method would suffer the same storage and computation costs as a general convex optimization algorithm. [sent-164, score-0.102]
72 Our triangle fixing operations allow us to compactly represent and compute the intermediate variables required to solve the problem. [sent-165, score-0.391]
73 3.2 MN for the ℓ1 and ℓ∞ norms. The basic triangle-fixing algorithm succeeds only when the norm used in (2.1) is the ℓ2 norm. [sent-170, score-0.506]
74 First, observe that the problem of minimizing the ℓ1 norm of the changes can be written as an LP: min_{e,f} 0^T e + 1^T f subject to Ae ≤ b, −f ≤ e ≤ f. (3.4)
75 Similarly, minimizing the ℓ∞ norm of the changes can be accomplished with the LP min_{e,ζ} 0^T e + ζ subject to Ae ≤ b, −ζ1 ≤ e ≤ ζ1. (3.5) We interpret ζ = ‖e‖_∞.
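For very small instances, the ℓ1 formulation (3.4) can also be handed to a generic LP solver, which makes the formulation concrete even though it ignores the structure that triangle fixing exploits. The sketch below uses scipy.optimize.linprog; the auxiliary-variable constraint −f ≤ e ≤ f is part of this sketch's reading of (3.4), and the function names are invented here.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def l1_metric_nearness_lp(D):
    """Solve the l1 MN problem for a small D by forming the LP (3.4) explicitly.

    Variables are stacked as (e, f), each of length N = n(n-1)/2; the objective
    is 0^T e + 1^T f, subject to the triangle inequalities and -f <= e <= f.
    """
    n = D.shape[0]
    pairs = list(itertools.combinations(range(n), 2))
    idx = {p: t for t, p in enumerate(pairs)}
    N = len(pairs)

    rows, rhs = [], []
    for i, j, k in itertools.combinations(range(n), 3):
        for a, b, c in ((i, j, k), (j, k, i), (k, i, j)):
            row = np.zeros(2 * N)
            row[idx[tuple(sorted((a, b)))]] = 1.0      # +e_ab
            row[idx[tuple(sorted((b, c)))]] = -1.0     # -e_bc
            row[idx[tuple(sorted((c, a)))]] = -1.0     # -e_ca
            rows.append(row)
            rhs.append(D[b, c] + D[c, a] - D[a, b])    # b_abc

    I = np.eye(N)                                      # -f <= e <= f
    rows += list(np.hstack([I, -I])) + list(np.hstack([-I, -I]))
    rhs += [0.0] * (2 * N)

    c = np.concatenate([np.zeros(N), np.ones(N)])      # 0^T e + 1^T f
    res = linprog(c, A_ub=np.array(rows), b_ub=np.array(rhs),
                  bounds=[(None, None)] * N + [(0, None)] * N)

    M = D.astype(float).copy()
    for (i, j), t in idx.items():
        M[i, j] = M[j, i] = D[i, j] + res.x[t]
    return M
```

The number of constraint rows grows like n³, which is exactly why the surrounding text argues for triangle fixing rather than generic LP software on large problems.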
76 Moreover, the solutions are not unique because the ℓ1 and ℓ∞ norms are not strictly convex. [sent-181, score-0.152]
77 Instead, we replace the LP by a quadratic program (QP) that is strictly convex and returns the solution of the LP that has minimum ℓ2-norm. [sent-182, score-0.147]
78 As in the ℓ2 case, the triangle constraints are enforced using (3.3). [sent-201, score-0.391]
79 Each remaining constraint is enforced by computing an orthogonal projection onto the corresponding halfspace. [sent-203, score-0.132]
80 3.3 MN for ℓp norms (1 < p < ∞). Next, we explain how to use triangle fixing to solve the MN problem for the remaining ℓp norms, 1 < p < ∞. [sent-206, score-0.477]
81 The problem may be phrased as min_e (1/p)‖e‖_p^p subject to Ae ≤ b. (3.7) [sent-208, score-0.101]
82 To enforce a triangle constraint optimally in the ℓp norm, we need to compute a projection of the vector e onto the constraint set. [sent-210, score-0.493]
83 The projection of e onto the (i, j, k) violating constraint is the solution of min_{e'} ϕ(e') − ϕ(e) − ⟨∇ϕ(e), e' − e⟩ subject to aijk^T e' = bijk, where aijk is the row of the constraint matrix corresponding to the triangle inequality (i, j, k). [sent-212, score-1.041]
84 The projection may be determined by solving ∇ϕ(e') = ∇ϕ(e) + µijk aijk so that aijk^T e' = bijk. [sent-213, score-0.254]
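Since ϕ(e) = (1/p)‖e‖_p^p has the coordinate-wise gradient sign(ei)|ei|^{p−1}, the condition ∇ϕ(e') = ∇ϕ(e) + µijk aijk can be inverted coordinate by coordinate, leaving a one-dimensional search for µijk. The sketch below solves it by bisection on the three affected coordinates; the bracketing strategy and names are assumptions of this sketch, not the paper's numerical procedure.

```python
import numpy as np

def lp_triangle_fix(e_ij, e_jk, e_ki, b_ijk, p, tol=1e-10):
    """Bregman projection of (e_ij, e_jk, e_ki) onto e_ij - e_jk - e_ki = b_ijk
    for phi(e) = (1/p)||e||_p^p with 1 < p < inf, via bisection on mu."""
    a = np.array([1.0, -1.0, -1.0])
    e = np.array([e_ij, e_jk, e_ki], dtype=float)
    grad = np.sign(e) * np.abs(e) ** (p - 1.0)          # gradient of phi at e

    def inv_grad(g):                                    # inverse gradient map, coordinate-wise
        return np.sign(g) * np.abs(g) ** (1.0 / (p - 1.0))

    def gap(mu):                                        # a^T e'(mu) - b_ijk, increasing in mu
        return float(a @ inv_grad(grad + mu * a) - b_ijk)

    lo, hi = -1.0, 1.0                                  # bracket the root, then bisect
    while gap(lo) > 0.0:
        lo *= 2.0
    while gap(hi) < 0.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return inv_grad(grad + 0.5 * (lo + hi) * a)
```

For p = 2 the gradient map is the identity and the routine reproduces the same repaired values as the closed-form update (3.3).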
85 4 Applications and Experiments Replacing a general graph (dissimilarity matrix) by a metric graph (distance matrix) can enable us to use efficient approximation algorithms for NP-Hard graph problems (MAX-CUT clustering) that have guaranteed error for metric data, for example, see [7]. [sent-219, score-0.622]
86 As an example, constant factor approximation algorithms for MAX-CUT exist for metric graphs [3], and can be used for clustering applications. [sent-221, score-0.354]
87 Applications that use dissimilarity values, such as clustering, classification, searching, and indexing, could potentially be sped up if the data is metric. [sent-223, score-0.262]
88 MN is a natural candidate for enforcing metric properties on the data to permit these speedups. [sent-224, score-0.323]
89 This problem involves approximating mPAM matrices, which are a derivative of mutation probability matrices [2] that arise in protein sequencing. [sent-226, score-0.099]
90 They represent a certain measure of dissimilarity for an application in protein sequencing. [sent-227, score-0.263]
91 Query operations in biological databases have the potential to be dramatically sped up if the data were metric (using a metric based indexing scheme). [sent-229, score-0.706]
92 Thus, one approach is to find the nearest distance matrix to each mPAM matrix and use that approximation in the metric based indexing scheme. [sent-230, score-0.574]
93 We approximated various mPAM matrices by their nearest distance matrices. [sent-231, score-0.128]
94 Figure 1 shows log–log plots of the running time of our algorithms for solving the ℓ1 and ℓ2 MN problems. (Figure 1 panels: runtime behavior of ℓ1 MN; running time of ℓ2 MN.)
95 The results plotted in the figure were obtained by executing the algorithms on random dissimilarity matrices. [sent-275, score-0.259]
96 5 Discussion In this paper, we have introduced the Metric Nearness problem, and we have developed algorithms for solving it for ℓp nearness measures. [sent-280, score-0.476]
97 The algorithms proceed by fixing violated triangles in turn, while introducing correction terms to guide the algorithm to the global optimum. [sent-281, score-0.132]
98 Thus one may check whether the N distances satisfy the metric properties in O(APSP) time, where APSP denotes the cost of an all-pairs shortest-path computation. [sent-286, score-0.508]
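One way to read the O(APSP) remark: a symmetric, nonnegative matrix with zero diagonal satisfies every triangle inequality exactly when an all-pairs shortest-path computation on the complete graph it defines leaves all entries unchanged. A small sketch (names assumed) of that check:

```python
import numpy as np

def satisfies_triangle_inequalities(D):
    """True iff the symmetric dissimilarity matrix D (zero diagonal, nonnegative)
    already satisfies all triangle inequalities: shortest paths cannot improve on
    any direct edge when the triangle inequalities hold."""
    S = D.astype(float).copy()
    n = S.shape[0]
    for k in range(n):                                  # Floyd-Warshall relaxation
        S = np.minimum(S, S[:, [k]] + S[[k], :])
    return bool(np.allclose(S, D))
```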
99 Some other possibilities include putting box constraints on the distances (l ≤ m ≤ u), allowing λ triangle inequalities (mij ≤ λ1 mik +λ2 mkj ), or enforcing order constraints (dij < dkl implies mij < mkl ). [sent-289, score-0.667]
100 Distance metric learning, with application to clustering with side constraints. [sent-396, score-0.324]
wordName wordTfidf (topN-words)
[('nearness', 0.41), ('triangle', 0.361), ('mn', 0.326), ('metric', 0.296), ('ejk', 0.246), ('eki', 0.246), ('dissimilarity', 0.229), ('eij', 0.186), ('distances', 0.168), ('ijk', 0.13), ('xing', 0.124), ('bijk', 0.123), ('norms', 0.086), ('mpam', 0.082), ('zijk', 0.082), ('austin', 0.071), ('mine', 0.071), ('matrix', 0.067), ('lp', 0.063), ('correction', 0.063), ('aijk', 0.062), ('dhillon', 0.062), ('forcing', 0.062), ('sra', 0.062), ('student', 0.061), ('norm', 0.059), ('distance', 0.058), ('ae', 0.057), ('mij', 0.057), ('dij', 0.057), ('entries', 0.056), ('djk', 0.054), ('dissimilarities', 0.054), ('mik', 0.054), ('indexing', 0.05), ('texas', 0.046), ('fixing', 0.046), ('running', 0.045), ('satisfy', 0.044), ('measurements', 0.044), ('bregman', 0.043), ('atlab', 0.041), ('metrizing', 0.041), ('wjk', 0.041), ('onto', 0.039), ('inequality', 0.039), ('violated', 0.039), ('qp', 0.039), ('storage', 0.038), ('convex', 0.038), ('changes', 0.038), ('argmin', 0.038), ('solving', 0.036), ('nearest', 0.036), ('foreach', 0.036), ('strictly', 0.035), ('triple', 0.035), ('seek', 0.034), ('protein', 0.034), ('globally', 0.034), ('matrices', 0.034), ('projection', 0.033), ('xjk', 0.033), ('sped', 0.033), ('mjk', 0.033), ('corrections', 0.033), ('solutions', 0.031), ('databases', 0.031), ('arise', 0.031), ('algorithms', 0.03), ('constraint', 0.03), ('arbor', 0.03), ('enforced', 0.03), ('award', 0.03), ('violation', 0.03), ('absolute', 0.03), ('subject', 0.03), ('solve', 0.03), ('software', 0.029), ('ann', 0.029), ('clustering', 0.028), ('points', 0.028), ('enforcing', 0.027), ('mds', 0.027), ('euclidean', 0.027), ('costs', 0.026), ('mahalanobis', 0.026), ('solution', 0.026), ('programming', 0.026), ('replace', 0.025), ('entry', 0.025), ('auxiliary', 0.025), ('log', 0.025), ('ax', 0.024), ('roth', 0.024), ('minimizer', 0.024), ('returns', 0.023), ('nonzero', 0.023), ('obermayer', 0.023), ('ut', 0.023)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 196 nips-2004-Triangle Fixing Algorithms for the Metric Nearness Problem
Author: Suvrit Sra, Joel Tropp, Inderjit S. Dhillon
Abstract: Various problems in machine learning, databases, and statistics involve pairwise distances among a set of objects. It is often desirable for these distances to satisfy the properties of a metric, especially the triangle inequality. Applications where metric data is useful include clustering, classification, metric-based indexing, and approximation algorithms for various graph problems. This paper presents the Metric Nearness Problem: Given a dissimilarity matrix, find the “nearest” matrix of distances that satisfy the triangle inequalities. For p nearness measures, this paper develops efficient triangle fixing algorithms that compute globally optimal solutions by exploiting the inherent structure of the problem. Empirically, the algorithms have time and storage costs that are linear in the number of triangle constraints. The methods can also be easily parallelized for additional speed. 1
2 0.082057089 22 nips-2004-An Investigation of Practical Approximate Nearest Neighbor Algorithms
Author: Ting Liu, Andrew W. Moore, Ke Yang, Alexander G. Gray
Abstract: This paper concerns approximate nearest neighbor searching algorithms, which have become increasingly important, especially in high dimensional perception areas such as computer vision, with dozens of publications in recent years. Much of this enthusiasm is due to a successful new approximate nearest neighbor approach called Locality Sensitive Hashing (LSH). In this paper we ask the question: can earlier spatial data structure approaches to exact nearest neighbor, such as metric trees, be altered to provide approximate answers to proximity queries and if so, how? We introduce a new kind of metric tree that allows overlap: certain datapoints may appear in both the children of a parent. We also introduce new approximate k-NN search algorithms on this structure. We show why these structures should be able to exploit the same randomprojection-based approximations that LSH enjoys, but with a simpler algorithm and perhaps with greater efficiency. We then provide a detailed empirical evaluation on five large, high dimensional datasets which show up to 31-fold accelerations over LSH. This result holds true throughout the spectrum of approximation levels. 1
3 0.078751341 62 nips-2004-Euclidean Embedding of Co-Occurrence Data
Author: Amir Globerson, Gal Chechik, Fernando Pereira, Naftali Tishby
Abstract: Embedding algorithms search for low dimensional structure in complex data, but most algorithms only handle objects of a single type for which pairwise distances are specified. This paper describes a method for embedding objects of different types, such as images and text, into a single common Euclidean space based on their co-occurrence statistics. The joint distributions are modeled as exponentials of Euclidean distances in the low-dimensional embedding space, which links the problem to convex optimization over positive semidefinite matrices. The local structure of our embedding corresponds to the statistical correlations via random walks in the Euclidean space. We quantify the performance of our method on two text datasets, and show that it consistently and significantly outperforms standard methods of statistical correspondence modeling, such as multidimensional scaling and correspondence analysis. 1
4 0.076891698 127 nips-2004-Neighbourhood Components Analysis
Author: Jacob Goldberger, Geoffrey E. Hinton, Sam T. Roweis, Ruslan Salakhutdinov
Abstract: In this paper we propose a novel method for learning a Mahalanobis distance measure to be used in the KNN classification algorithm. The algorithm directly maximizes a stochastic variant of the leave-one-out KNN score on the training set. It can also learn a low-dimensional linear embedding of labeled data that can be used for data visualization and fast classification. Unlike other methods, our classification model is non-parametric, making no assumptions about the shape of the class distributions or the boundaries between them. The performance of the method is demonstrated on several data sets, both for metric learning and linear dimensionality reduction. 1
5 0.069523126 85 nips-2004-Instance-Based Relevance Feedback for Image Retrieval
Author: Giorgio Gia\-cin\-to, Fabio Roli
Abstract: High retrieval precision in content-based image retrieval can be attained by adopting relevance feedback mechanisms. These mechanisms require that the user judges the quality of the results of the query by marking all the retrieved images as being either relevant or not. Then, the search engine exploits this information to adapt the search to better meet user’s needs. At present, the vast majority of proposed relevance feedback mechanisms are formulated in terms of search model that has to be optimized. Such an optimization involves the modification of some search parameters so that the nearest neighbor of the query vector contains the largest number of relevant images. In this paper, a different approach to relevance feedback is proposed. After the user provides the first feedback, following retrievals are not based on knn search, but on the computation of a relevance score for each image of the database. This score is computed as a function of two distances, namely the distance from the nearest non-relevant image and the distance from the nearest relevant one. Images are then ranked according to this score and the top k images are displayed. Reported results on three image data sets show that the proposed mechanism outperforms other state-of-the-art relevance feedback mechanisms. 1 In t rod u ct i on A large number of content-based image retrieval (CBIR) systems rely on the vector representation of images in a multidimensional feature space representing low-level image characteristics, e.g., color, texture, shape, etc. [1]. Content-based queries are often expressed by visual examples in order to retrieve from the database the images that are “similar” to the examples. This kind of retrieval is often referred to as K nearest-neighbor retrieval. It is easy to see that the effectiveness of content-based image retrieval systems (CBIR) strongly depends on the choice of the set of visual features, on the choice of the “metric” used to model the user’s perception of image similarity, and on the choice of the image used to query the database [1]. Typically, if we allow different users to mark the images retrieved with a given query as relevant or non-relevant, different subsets of images will be marked as relevant. Accordingly, the need for mechanisms to adapt the CBIR system response based on some feedback from the user is widely recognized. It is interesting to note that while relevance feedback mechanisms have been first introduced in the information retrieval field [2], they are receiving more attention in the CBIR field (Huang). The vast majority of relevance feedback techniques proposed in the literature is based on modifying the values of the search parameters as to better represent the concept the user bears in mind. To this end, search parameters are computed as a function of the relevance values assigned by the user to all the images retrieved so far. As an example, relevance feedback is often formulated in terms of the modification of the query vector, and/or in terms of adaptive similarity metrics. [3]-[7]. Recently, pattern classification paradigms such as SVMs have been proposed [8]. Feedback is thus used to model the concept of relevant images and adjust the search consequently. Concept modeling may be difficult on account of the distribution of relevant images in the selected feature space. “Narrow domain” image databases allows extracting good features, so that images bearing similar concepts belong to compact clusters. 
On the other hand, “broad domain” databases, such as image collection used by graphic professionals, or those made up of images from the Internet, are more difficult to subdivide in cluster because of the high variability of concepts [1]. In these cases, it is worth extracting only low level, non-specialized features, and image retrieval is better formulated in terms of a search problem rather then concept modeling. The present paper aims at offering an original contribution in this direction. Rather then modeling the concept of “relevance” the user bears in mind, feedback is used to assign each image of the database a relevance score. Such a score depends only from two dissimilarities (distances) computed against the images already marked by the user: the dissimilarity from the set of relevant images, and the dissimilarity from the set of non-relevant images. Despite its computational simplicity, this mechanism allows outperforming state-of-the-art relevance feedback mechanisms both on “narrow domain” databases, and on “broad domain” databases. This paper is organized as follows. Section 2 illustrates the idea behind the proposed mechanism and provides the basic assumptions. Section 3 details the proposed relevance feedback mechanism. Results on three image data sets are presented in Section 4, where performances of other relevance feedback mechanisms are compared. Conclusions are drawn in Section 5. 2 In st an ce- b ased rel evan ce est i m at i on The proposed mechanism has been inspired by classification techniques based on the “nearest case” [9]-[10]. Nearest-case theory provided the mechanism to compute the dissimilarity of each image from the sets of relevant and non–relevant images. The ratio between the nearest relevant image and the nearest non-relevant image has been used to compute the degree of relevance of each image of the database [11]. The present section illustrates the rationale behind the use of the nearest-case paradigm. Let us assume that each image of the database has been represented by a number of low-level features, and that a (dis)similarity measure has been defined so that the proximity between pairs of images represents some kind of “conceptual” similarity. In other words, the chosen feature space and similarity metric is meaningful at least for a restricted number of users. A search in image databases is usually performed by retrieving the k most similar images with respect to a given query. The dimension of k is usually small, to avoid displaying a large number of images at a time. Typical values for k are between 10 and 20. However, as the “relevant” images that the user wishes to retrieve may not fit perfectly with the similarity metric designed for the search engine, the user may be interested in exploring other regions of the feature space. To this end, the user marks the subset of “relevant” images out of the k retrieved. Usually, such relevance feedback is used to perform a new k-nn search by modifying some search parameters, i.e., the position of the query point, the similarity metric, and other tuning parameters [1]-[7]. Recent works proposed the use of support vector machine to learn the distribution of relevant images [8]. These techniques require some assumption about the general form of the distribution of relevant images in the feature space. 
As it is difficult to make any assumption about such a distribution for broad domain databases, we propose to exploit the information about the relevance of the images retrieved so far in a nearest-neighbor fashion. Nearest-neighbor techniques, as used in statistical pattern recognition, case-based reasoning, or instance-based learning, are effective in all applications where it is difficult to produce a high-level generalization of a “class” of objects [9]-[10],[12][13]. Relevance learning in content base image retrieval may well fit into this definition, as it is difficult to provide a general model that can be adapted to represent different concepts of similarity. In addition, the number of available cases may be too small to estimate the optimal set of parameters for such a general model. On the other hand, it can be more effective to use each “relevant” image as well as each “non-relevant” image, as “cases” or “instances” against which the images of the database should be compared. Consequently, we assume that an image is as much as relevant as much as its dissimilarity from the nearest relevant image is small. Analogously, an image is as much as non-relevant as much as its dissimilarity from the nearest non-relevant image is small. 3 Rel evan ce S core Com p u t ati on According to previous section, each image of the database can be thus characterized by a “degree of relevance” and a “degree of non-relevance” according to the dissimilarities from the nearest relevant image, and from the nearest non-relevant image, respectively. However, it should be noted that these degrees should be treated differently because only “relevant” images represent a “concept” in the user’s mind, while “non-relevant” images may represent a number of other concepts different from user’s interest. In other words, while it is meaningful to treat the degree of relevance as a degree of membership to the class of relevant images, the same does not apply to the degree of non-relevance. For this reason, we propose to use the “degree of non-relevance” to weight the “degree of relevance”. Let us denote with R the subset of indexes j ∈ {1,...,k} related to the set of relevant images retrieved so far and the original query (that is relevant by default), and with NR the subset of indexes j ∈ (1,...,k} related to the set of non-relevant images retrieved so far. For each image I of the database, according to the nearest neighbor rule, let us compute the dissimilarity from the nearest image in R and the dissimilarity from the nearest image in NR. Let us denote these dissimilarities as dR(I) and dNR(I), respectively. The value of dR(I) can be clearly used to measure the degree of relevance of image I, assuming that small values of dR(I) are related to very relevant images. On the other hand, the hypothesis that image I is relevant to the user’s query can be supported by a high value of dNR(I). Accordingly, we defined the relevance score ! dR ( I ) $ relevance ( I ) = # 1 + dN ( I ) &
6 0.068907097 110 nips-2004-Matrix Exponential Gradient Updates for On-line Learning and Bregman Projection
7 0.068906553 1 nips-2004-A Cost-Shaping LP for Bellman Error Minimization with Performance Guarantees
8 0.0688693 150 nips-2004-Proximity Graphs for Clustering and Manifold Learning
9 0.068209663 115 nips-2004-Maximum Margin Clustering
10 0.067475885 160 nips-2004-Seeing through water
11 0.064849176 207 nips-2004-ℓ₀-norm Minimization for Basis Selection
12 0.062290654 113 nips-2004-Maximum-Margin Matrix Factorization
13 0.05957197 134 nips-2004-Object Classification from a Single Example Utilizing Class Relevance Metrics
14 0.05664622 92 nips-2004-Kernel Methods for Implicit Surface Modeling
15 0.056140672 142 nips-2004-Outlier Detection with One-class Kernel Fisher Discriminants
16 0.050844461 125 nips-2004-Multiple Relational Embedding
17 0.045373335 3 nips-2004-A Feature Selection Algorithm Based on the Global Minimization of a Generalization Error Bound
18 0.04526763 11 nips-2004-A Second Order Cone programming Formulation for Classifying Missing Data
19 0.04411618 161 nips-2004-Self-Tuning Spectral Clustering
20 0.043640405 133 nips-2004-Nonparametric Transforms of Graph Kernels for Semi-Supervised Learning
topicId topicWeight
[(0, -0.145), (1, 0.056), (2, -0.009), (3, -0.008), (4, -0.009), (5, -0.003), (6, -0.067), (7, 0.013), (8, 0.043), (9, -0.037), (10, -0.037), (11, -0.111), (12, -0.053), (13, -0.011), (14, -0.051), (15, 0.067), (16, 0.017), (17, -0.03), (18, -0.149), (19, 0.024), (20, 0.07), (21, 0.069), (22, 0.006), (23, 0.045), (24, 0.091), (25, -0.057), (26, -0.039), (27, 0.053), (28, 0.107), (29, 0.054), (30, -0.113), (31, 0.015), (32, -0.086), (33, -0.034), (34, -0.047), (35, -0.083), (36, -0.09), (37, -0.058), (38, 0.111), (39, 0.078), (40, 0.049), (41, 0.019), (42, -0.056), (43, 0.15), (44, 0.164), (45, 0.025), (46, 0.033), (47, -0.004), (48, 0.076), (49, -0.058)]
simIndex simValue paperId paperTitle
same-paper 1 0.94813311 196 nips-2004-Triangle Fixing Algorithms for the Metric Nearness Problem
Author: Suvrit Sra, Joel Tropp, Inderjit S. Dhillon
Abstract: Various problems in machine learning, databases, and statistics involve pairwise distances among a set of objects. It is often desirable for these distances to satisfy the properties of a metric, especially the triangle inequality. Applications where metric data is useful include clustering, classification, metric-based indexing, and approximation algorithms for various graph problems. This paper presents the Metric Nearness Problem: Given a dissimilarity matrix, find the “nearest” matrix of distances that satisfy the triangle inequalities. For p nearness measures, this paper develops efficient triangle fixing algorithms that compute globally optimal solutions by exploiting the inherent structure of the problem. Empirically, the algorithms have time and storage costs that are linear in the number of triangle constraints. The methods can also be easily parallelized for additional speed. 1
2 0.51635748 22 nips-2004-An Investigation of Practical Approximate Nearest Neighbor Algorithms
Author: Ting Liu, Andrew W. Moore, Ke Yang, Alexander G. Gray
Abstract: This paper concerns approximate nearest neighbor searching algorithms, which have become increasingly important, especially in high dimensional perception areas such as computer vision, with dozens of publications in recent years. Much of this enthusiasm is due to a successful new approximate nearest neighbor approach called Locality Sensitive Hashing (LSH). In this paper we ask the question: can earlier spatial data structure approaches to exact nearest neighbor, such as metric trees, be altered to provide approximate answers to proximity queries and if so, how? We introduce a new kind of metric tree that allows overlap: certain datapoints may appear in both the children of a parent. We also introduce new approximate k-NN search algorithms on this structure. We show why these structures should be able to exploit the same randomprojection-based approximations that LSH enjoys, but with a simpler algorithm and perhaps with greater efficiency. We then provide a detailed empirical evaluation on five large, high dimensional datasets which show up to 31-fold accelerations over LSH. This result holds true throughout the spectrum of approximation levels. 1
3 0.49862641 127 nips-2004-Neighbourhood Components Analysis
Author: Jacob Goldberger, Geoffrey E. Hinton, Sam T. Roweis, Ruslan Salakhutdinov
Abstract: In this paper we propose a novel method for learning a Mahalanobis distance measure to be used in the KNN classification algorithm. The algorithm directly maximizes a stochastic variant of the leave-one-out KNN score on the training set. It can also learn a low-dimensional linear embedding of labeled data that can be used for data visualization and fast classification. Unlike other methods, our classification model is non-parametric, making no assumptions about the shape of the class distributions or the boundaries between them. The performance of the method is demonstrated on several data sets, both for metric learning and linear dimensionality reduction. 1
4 0.47334543 62 nips-2004-Euclidean Embedding of Co-Occurrence Data
Author: Amir Globerson, Gal Chechik, Fernando Pereira, Naftali Tishby
Abstract: Embedding algorithms search for low dimensional structure in complex data, but most algorithms only handle objects of a single type for which pairwise distances are specified. This paper describes a method for embedding objects of different types, such as images and text, into a single common Euclidean space based on their co-occurrence statistics. The joint distributions are modeled as exponentials of Euclidean distances in the low-dimensional embedding space, which links the problem to convex optimization over positive semidefinite matrices. The local structure of our embedding corresponds to the statistical correlations via random walks in the Euclidean space. We quantify the performance of our method on two text datasets, and show that it consistently and significantly outperforms standard methods of statistical correspondence modeling, such as multidimensional scaling and correspondence analysis. 1
5 0.46864524 134 nips-2004-Object Classification from a Single Example Utilizing Class Relevance Metrics
Author: Michael Fink
Abstract: We describe a framework for learning an object classifier from a single example. This goal is achieved by emphasizing the relevant dimensions for classification using available examples of related classes. Learning to accurately classify objects from a single training example is often unfeasible due to overfitting effects. However, if the instance representation provides that the distance between each two instances of the same class is smaller than the distance between any two instances from different classes, then a nearest neighbor classifier could achieve perfect performance with a single training example. We therefore suggest a two stage strategy. First, learn a metric over the instances that achieves the distance criterion mentioned above, from available examples of other related classes. Then, using the single examples, define a nearest neighbor classifier where distance is evaluated by the learned class relevance metric. Finding a metric that emphasizes the relevant dimensions for classification might not be possible when restricted to linear projections. We therefore make use of a kernel based metric learning algorithm. Our setting encodes object instances as sets of locality based descriptors and adopts an appropriate image kernel for the class relevance metric learning. The proposed framework for learning from a single example is demonstrated in a synthetic setting and on a character classification task. 1
6 0.46589005 85 nips-2004-Instance-Based Relevance Feedback for Image Retrieval
7 0.43943137 207 nips-2004-ℓ₀-norm Minimization for Basis Selection
8 0.40830338 2 nips-2004-A Direct Formulation for Sparse PCA Using Semidefinite Programming
9 0.40221894 1 nips-2004-A Cost-Shaping LP for Bellman Error Minimization with Performance Guarantees
10 0.38718414 61 nips-2004-Efficient Out-of-Sample Extension of Dominant-Set Clusters
11 0.37856448 107 nips-2004-Making Latin Manuscripts Searchable using gHMM's
12 0.37574735 3 nips-2004-A Feature Selection Algorithm Based on the Global Minimization of a Generalization Error Bound
13 0.35128102 150 nips-2004-Proximity Graphs for Clustering and Manifold Learning
14 0.32784611 113 nips-2004-Maximum-Margin Matrix Factorization
15 0.32470948 145 nips-2004-Parametric Embedding for Class Visualization
16 0.32460612 160 nips-2004-Seeing through water
17 0.31768748 110 nips-2004-Matrix Exponential Gradient Updates for On-line Learning and Bregman Projection
18 0.3135398 126 nips-2004-Nearly Tight Bounds for the Continuum-Armed Bandit Problem
19 0.30583805 27 nips-2004-Bayesian Regularization and Nonnegative Deconvolution for Time Delay Estimation
20 0.29627272 52 nips-2004-Discrete profile alignment via constrained information bottleneck
topicId topicWeight
[(13, 0.11), (15, 0.103), (26, 0.047), (31, 0.022), (33, 0.16), (35, 0.02), (39, 0.036), (50, 0.02), (54, 0.355), (76, 0.015)]
simIndex simValue paperId paperTitle
1 0.83451772 155 nips-2004-Responding to Modalities with Different Latencies
Author: Fredrik Bissmarck, Hiroyuki Nakahara, Kenji Doya, Okihide Hikosaka
Abstract: Motor control depends on sensory feedback in multiple modalities with different latencies. In this paper we consider within the framework of reinforcement learning how different sensory modalities can be combined and selected for real-time, optimal movement control. We propose an actor-critic architecture with multiple modules, whose output are combined using a softmax function. We tested our architecture in a simulation of a sequential reaching task. Reaching was initially guided by visual feedback with a long latency. Our learning scheme allowed the agent to utilize the somatosensory feedback with shorter latency when the hand is near the experienced trajectory. In simulations with different latencies for visual and somatosensory feedback, we found that the agent depended more on feedback with shorter latency. 1
same-paper 2 0.77443749 196 nips-2004-Triangle Fixing Algorithms for the Metric Nearness Problem
Author: Suvrit Sra, Joel Tropp, Inderjit S. Dhillon
Abstract: Various problems in machine learning, databases, and statistics involve pairwise distances among a set of objects. It is often desirable for these distances to satisfy the properties of a metric, especially the triangle inequality. Applications where metric data is useful include clustering, classification, metric-based indexing, and approximation algorithms for various graph problems. This paper presents the Metric Nearness Problem: Given a dissimilarity matrix, find the “nearest” matrix of distances that satisfy the triangle inequalities. For p nearness measures, this paper develops efficient triangle fixing algorithms that compute globally optimal solutions by exploiting the inherent structure of the problem. Empirically, the algorithms have time and storage costs that are linear in the number of triangle constraints. The methods can also be easily parallelized for additional speed. 1
3 0.66725194 125 nips-2004-Multiple Relational Embedding
Author: Roland Memisevic, Geoffrey E. Hinton
Abstract: We describe a way of using multiple different types of similarity relationship to learn a low-dimensional embedding of a dataset. Our method chooses different, possibly overlapping representations of similarity by individually reweighting the dimensions of a common underlying latent space. When applied to a single similarity relation that is based on Euclidean distances between the input data points, the method reduces to simple dimensionality reduction. If additional information is available about the dataset or about subsets of it, we can use this information to clean up or otherwise improve the embedding. We demonstrate the potential usefulness of this form of semi-supervised dimensionality reduction on some simple examples. 1
4 0.53805637 163 nips-2004-Semi-parametric Exponential Family PCA
Author: Sajama Sajama, Alon Orlitsky
Abstract: We present a semi-parametric latent variable model based technique for density modelling, dimensionality reduction and visualization. Unlike previous methods, we estimate the latent distribution non-parametrically which enables us to model data generated by an underlying low dimensional, multimodal distribution. In addition, we allow the components of latent variable models to be drawn from the exponential family which makes the method suitable for special data types, for example binary or count data. Simulations on real valued, binary and count data show favorable comparison to other related schemes both in terms of separating different populations and generalization to unseen samples. 1
5 0.53670734 131 nips-2004-Non-Local Manifold Tangent Learning
Author: Yoshua Bengio, Martin Monperrus
Abstract: We claim and present arguments to the effect that a large class of manifold learning algorithms that are essentially local and can be framed as kernel learning algorithms will suffer from the curse of dimensionality, at the dimension of the true underlying manifold. This observation suggests to explore non-local manifold learning algorithms which attempt to discover shared structure in the tangent planes at different positions. A criterion for such an algorithm is proposed and experiments estimating a tangent plane prediction function are presented, showing its advantages with respect to local manifold learning algorithms: it is able to generalize very far from training data (on learning handwritten character image rotations), where a local non-parametric method fails. 1
6 0.53550941 102 nips-2004-Learning first-order Markov models for control
7 0.534329 70 nips-2004-Following Curved Regularized Optimization Solution Paths
8 0.53430033 181 nips-2004-Synergies between Intrinsic and Synaptic Plasticity in Individual Model Neurons
9 0.53293073 124 nips-2004-Multiple Alignment of Continuous Time Series
10 0.53048396 204 nips-2004-Variational Minimax Estimation of Discrete Distributions under KL Loss
11 0.53009397 189 nips-2004-The Power of Selective Memory: Self-Bounded Learning of Prediction Suffix Trees
12 0.52928483 116 nips-2004-Message Errors in Belief Propagation
13 0.52890289 60 nips-2004-Efficient Kernel Machines Using the Improved Fast Gauss Transform
14 0.52809715 178 nips-2004-Support Vector Classification with Input Data Uncertainty
15 0.52784783 174 nips-2004-Spike Sorting: Bayesian Clustering of Non-Stationary Data
16 0.52738351 22 nips-2004-An Investigation of Practical Approximate Nearest Neighbor Algorithms
17 0.52720958 93 nips-2004-Kernel Projection Machine: a New Tool for Pattern Recognition
18 0.5271793 187 nips-2004-The Entire Regularization Path for the Support Vector Machine
19 0.52679592 142 nips-2004-Outlier Detection with One-class Kernel Fisher Discriminants
20 0.52631837 25 nips-2004-Assignment of Multiplicative Mixtures in Natural Images