nips nips2009 nips2009-202 knowledge-graph by maker-knowledge-mining

202 nips-2009-Regularized Distance Metric Learning: Theory and Algorithm


Source: pdf

Author: Rong Jin, Shijun Wang, Yang Zhou

Abstract: In this paper, we examine the generalization error of regularized distance metric learning. We show that with appropriate constraints, the generalization error of regularized distance metric learning could be independent from the dimensionality, making it suitable for handling high dimensional data. In addition, we present an efficient online learning algorithm for regularized distance metric learning. Our empirical studies with data classification and face recognition show that the proposed algorithm is (i) effective for distance metric learning when compared to the state-of-the-art methods, and (ii) efficient and robust for high dimensional data.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract. In this paper, we examine the generalization error of regularized distance metric learning. [sent-7, score-1.12]

2 We show that with appropriate constraints, the generalization error of regularized distance metric learning could be independent from the dimensionality, making it suitable for handling high dimensional data. [sent-8, score-1.254]

3 In addition, we present an efficient online learning algorithm for regularized distance metric learning. [sent-9, score-1.156]

4 Our empirical studies with data classification and face recognition show that the proposed algorithm is (i) effective for distance metric learning when compared to the state-of-the-art methods, and (ii) efficient and robust for high dimensional data. [sent-10, score-1.143]

5 1 Introduction. Distance metric learning is a fundamental problem in machine learning and pattern recognition. [sent-11, score-0.602]

6 Numerous algorithms have been proposed and examined for distance metric learning. [sent-13, score-0.805]

7 They are usually classified into two categories: unsupervised metric learning and supervised metric learning. [sent-14, score-1.134]

8 Unsupervised distance metric learning, sometimes referred to as manifold learning, aims to learn an underlying low-dimensional manifold where the distance between most pairs of data points is preserved. [sent-15, score-1.076]

9 Supervised metric learning attempts to learn distance metrics from side information such as labeled instances and pairwise constraints. [sent-17, score-0.938]

10 It searches for the optimal distance metric that (a) keeps data points of the same class close, and (b) keeps data points from different classes far apart. [sent-18, score-0.981]

11 In this work, we focus on supervised distance metric learning. [sent-20, score-0.837]

12 Although a large number of studies were devoted to supervised distance metric learning (see the survey in [18] and references therein), few studies address the generalization error of distance metric learning. [sent-21, score-1.845]

13 In this paper, we examine the generalization error for regularized distance metric learning. [sent-22, score-1.12]

14 Following the idea of stability analysis [1], we show that with appropriate constraints, the generalization error of regularized distance metric learning is independent from the dimensionality of data, making it suitable for handling high dimensional data. [sent-23, score-1.371]

15 In addition, we present an online learning algorithm for regularized distance metric learning, and show its regret bound. [sent-24, score-1.215]

16 To verify the efficacy and efficiency of the proposed algorithm for regularized distance metric learning, we conduct experiments with data classification and face recognition. [sent-26, score-1.165]

17 Our empirical results show that the proposed online algorithm is (1) effective for metric learning compared to the state-of-the-art methods, and (2) robust and efficient for high dimensional data. [sent-27, score-0.835]

18 In our study, we assume that the norm of any example is upper bounded by R, i. [sent-38, score-0.032]

19 Let A ∈ S_+^{d×d} be the distance metric to be learned, where the distance between two data points x and x′ is calculated as |x − x′|_A^2 = (x − x′)^⊤ A (x − x′). [sent-41, score-1.076]

20 Following the idea of maximum margin classifiers, we have the following framework for regularized distance metric learning: min_A { (1/2)|A|_F^2 + (2C/(n(n−1))) Σ_{i<j} g( y_{i,j} [1 − |x_i − x_j|_A^2] ) : A ⪰ 0, tr(A) ≤ η(d) }  (1) [sent-42, score-1.007]
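To make the objective in (1) concrete, here is a minimal Python sketch of the Mahalanobis distance and the regularized empirical objective. It is an illustration under stated assumptions, not the authors' code: the hinge-type choice of g and all function names are hypothetical, and the PSD and trace constraints of (1) are not enforced here.

```python
import numpy as np

def mahalanobis_sq(A, x, xp):
    """Squared distance |x - x'|_A^2 = (x - x')^T A (x - x')."""
    d = x - xp
    return float(d @ A @ d)

def regularized_objective(A, X, y, C, g=lambda z: max(0.0, 1.0 - z)):
    """Objective of (1): (1/2)|A|_F^2 + (2C/(n(n-1))) * sum_{i<j} g(y_ij [1 - |x_i - x_j|_A^2]).

    g is assumed here to be a hinge loss; the analysis only requires g to be Lipschitz.
    The constraints A >= 0 (PSD) and tr(A) <= eta(d) are not enforced in this sketch.
    """
    n = len(X)
    obj = 0.5 * np.sum(A * A)                      # (1/2) |A|_F^2
    pair_loss = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            y_ij = 1.0 if y[i] == y[j] else -1.0   # +1 for same-class pairs, -1 otherwise
            pair_loss += g(y_ij * (1.0 - mahalanobis_sq(A, X[i], X[j])))
    return obj + 2.0 * C / (n * (n - 1)) * pair_loss
```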

21 We will also show how this constraint could affect the generalization error of distance metric learning. [sent-43, score-0.908]

22 3 Generalization Error. Let A_D be the distance metric learned by the algorithm in (1) from the training examples D. [sent-44, score-0.947]

23 Given that a distance metric learning algorithm A has uniform stability κ/n, we have the following inequality for E(D_D): E(D_D) ≤ 2κ/n  (6), where n is the number of training examples in D. [sent-50, score-1.049]

24 The result in the following lemma shows that the condition in McDiarmid's inequality holds. [sent-51, score-0.089]

25 Let D be a collection of n randomly selected training examples, and let D^{i,z} be the collection of examples obtained by replacing z_i in D with example z. [sent-53, score-0.204]

26 We have |D_D − D_{D^{i,z}}| bounded as follows: |D_D − D_{D^{i,z}}| ≤ (2κ + 8Lη(d) + 2g_0)/n  (7), where g_0 = sup_{z,z′} |V(0, z, z′)| measures the largest loss when the distance metric A is 0. [sent-54, score-0.805]

27 Combining the results in Lemmas 1 and 2, we can now derive the bound for the generalization error by using the McDiarmid inequality. [sent-55, score-0.139]

28 Let D denote a collection of n randomly selected training examples, and A_D be the distance metric learned by the algorithm in (1) whose uniform stability is κ/n. [sent-57, score-1.064]

29 With probability 1 − δ, we have the following bound for I(A_D): I(A_D) − I_D(A_D) ≤ 2κ/n + (2κ + 4Lη(d) + 2g_0) √(ln(2/δ)/(2n))  (8) [sent-58, score-0.036]

30 To bound the uniform stability, we need the following proposition (Proposition 2). [sent-65, score-0.088]

31 Let D denote a collection of n randomly selected training examples, and denote by z = (x, y) a randomly selected example. [sent-69, score-0.128]

32 Let A_D be the distance metric learned by the algorithm in (1). [sent-70, score-0.88]

33 We have |A_D − A_{D^{i,z}}|_F ≤ 8CLR²/n  (11). The proof of the above lemma can be found in Appendix A. [sent-71, score-0.062]

34 By combining the results in Lemma 3 and Proposition 2, we have the following theorem for the stability of the Frobenius norm based regularizer. [sent-72, score-0.165]

35 Let D be a collection of n randomly selected examples, and A_D be the distance metric learned by the algorithm in (1) with h(A) = |A|_F^2. [sent-75, score-0.947]

36 With probability 1 − δ, we have the following bound for the true loss function I(A_D), where A_D is learned from (1) using the Frobenius norm regularizer: I(A_D) − I_D(A_D) ≤ 32CL²R⁴/n + (32CL²R⁴ + 4Ls(d) + 2g_0) √(ln(2/δ)/(2n))  (13), where s(d) = min(√(2dg_0 C), η(d)). [sent-76, score-0.15]

37 Remark: The most important feature of the estimation error is that it converges in the order of O(s(d)/√n). [sent-77, score-0.038]

38 When η(d) grows slowly with d (e.g., η(d) ∼ d^p with p ≪ 1), the proposed framework for regularized distance metric learning will be robust to high dimensional data. [sent-80, score-1.135]

39 In the extreme case, by setting η(d) to be a constant, the estimation error will be independent from the dimensionality of data. [sent-81, score-0.073]

40 4 Algorithm. In this section, we discuss an efficient algorithm for solving (1). [sent-82, score-0.034]

41 To design an online learning algorithm for regularized distance metric learning, we follow the theory of gradient based online learning [2] by defining the potential function Φ(A) = |A|_F^2 / 2. [sent-86, score-1.3]

42 The theorem below shows the regret bound for the online learning algorithm in Figure 1. [sent-88, score-0.324]

43 Let the online learning algorithm in Figure 1 run with learning rate λ > 0 on a sequence ((x_t, x′_t), y_t), t = 1, …, T. [sent-90, score-0.459]

44 3: for t = 1, …, T do  4: Receive a pair of training examples {(x_t^1, y_t^1), (x_t^2, y_t^2)}  5: Compute the class label y_t: y_t = +1 if y_t^1 = y_t^2, and y_t = −1 otherwise. [sent-98, score-1.796]

45 6: if the training pair ((x_t^1, x_t^2), y_t) is classified correctly, i.e. [sent-99, score-0.282]

46 y_t (1 − |x_t^1 − x_t^2|_{A_{t−1}}^2) > 0, then 7: A_t = A_{t−1}. [sent-101, score-0.247]

47 10: end if  11: end for. The proof of this theorem can be found in Appendix B. [sent-103, score-0.051]

48 Note that the above online learning algorithm requires computing π_{S_+}(M), i.e. [sent-104, score-0.178]

49 projecting the matrix M onto the SDP cone, which is expensive for high dimensional data. [sent-106, score-0.122]

50 The optimal solution λ_t to the problem in (14) is expressed as λ_t = λ if y_t = −1, and λ_t = min(λ, [(x_t − x′_t)^⊤ A_{t−1}^{−1} (x_t − x′_t)]^{−1}) if y_t = +1. The proof of this theorem can be found in the supplementary materials. [sent-110, score-0.545]
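The online update of Figure 1 and the projection π_{S_+} can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' implementation: it assumes a unit margin b = 1, a fixed step size λ instead of the adaptive λ_t of (14), and identity initialization for A_0.

```python
import numpy as np

def project_psd(M):
    """pi_{S+}(M): project a symmetric matrix onto the PSD cone by zeroing negative eigenvalues."""
    w, V = np.linalg.eigh((M + M.T) / 2.0)
    return (V * np.clip(w, 0.0, None)) @ V.T

def online_reg(pairs, lam, dim):
    """Sketch of the online algorithm; `pairs` is an iterable of (x1, y1, x2, y2) training pairs.

    Assumptions: margin b = 1, fixed learning rate lam > 0, A_0 = identity."""
    A = np.eye(dim)
    for x1, y1, x2, y2 in pairs:
        y = 1.0 if y1 == y2 else -1.0              # step 5: pair label
        diff = x1 - x2
        if y * (1.0 - diff @ A @ diff) > 0:        # step 6: pair already classified correctly
            continue                               # step 7: A_t = A_{t-1}
        # otherwise take a gradient step on the hinge-type loss and project back onto the PSD cone
        A = project_psd(A - lam * y * np.outer(diff, diff))
    return A
```

Capping the step size for same-class pairs, as the adaptive λ_t in (14) appears to do, is one way to keep A_t positive semidefinite without the explicit eigen-projection used in this sketch.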

51 5 Experiments. We conducted an extensive study to verify both the efficiency and the efficacy of the proposed algorithms for metric learning. [sent-113, score-0.565]

52 For the convenience of discussion, we refer to the proposed online distance metric learning algorithm as online-reg. [sent-114, score-0.983]

53 To examine the efficacy of the learned distance metric, we employed the k Nearest Neighbor (k-NN) classifier. [sent-115, score-0.351]

54 Our hypothesis is that the better the distance metric is, the higher the classification accuracy of k-NN will be. [sent-116, score-0.836]
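As a usage sketch (not the authors' code), a learned PSD metric A can be plugged into k-NN by factoring A = L^⊤L and running ordinary Euclidean k-NN on the transformed points; the helper below and its names are illustrative.

```python
import numpy as np

def knn_predict(A, X_train, y_train, x, k=3):
    """Predict the label of x with k-NN under a learned PSD metric A.

    Uses the symmetric square root L of A so that |x_i - x|_A = |L (x_i - x)|."""
    w, V = np.linalg.eigh(A)
    L = (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T   # L = A^{1/2}
    dists = np.linalg.norm((X_train - x) @ L, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]
```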

55 We compare our algorithm to the following six state-of-the-art algorithms for distance metric learning as baselines: (1) Euclidean distance metric; (2) Mahalanobis distance metric, which is computed as the inverse of the covariance matrix of the training samples, i.e. [sent-118, score-1.476]

56 (Σ_{i=1}^n x_i x_i^⊤)^{−1}; (3) Xing's algorithm proposed in [17]; (4) LMNN, a distance metric learning algorithm based on the large margin nearest neighbor classifier [15]; (5) ITML, an information-theoretic metric learning approach based on [4]; and (6) Relevance Component Analysis (RCA) [10]. [sent-120, score-1.57]
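For reference, baseline (2) can be sketched as below; the formula follows the line above ((Σ_i x_i x_i^⊤)^{−1}), and the small ridge term is an added assumption for numerical stability only, not part of the paper's definition.

```python
import numpy as np

def mahalanobis_baseline(X, ridge=1e-6):
    """Baseline (2): metric A = (sum_i x_i x_i^T)^{-1} computed from the training samples X (n x d).

    ridge * I is not in the paper's formula; it only keeps the inverse well defined."""
    S = X.T @ X
    return np.linalg.inv(S + ridge * np.eye(X.shape[1]))
```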

57 The number of target neighbors in LMNN and the parameter γ in ITML ... Table 1: Classification error (%) of a k-NN (k = 3) classifier on the ten UCI data sets using seven different metrics. [sent-122, score-0.038]

58 Table 1 shows the classification errors of all the metric learning methods over 9 datasets averaged over 10 runs, together with the standard deviation. [sent-307, score-0.597]

59 We observe that the proposed metric learning algorithm delivers performance comparable to the state-of-the-art methods. [sent-308, score-0.602]

60 In particular, for almost all datasets, the classification accuracy of the proposed algorithm is close to that of LMNN, which has yielded the best overall performance among the six baseline algorithms. [sent-309, score-0.065]

61 This is consistent with the results of the other studies, which show LMNN is among the most effective algorithms for distance metric learning. [sent-310, score-0.805]

62 To further verify whether the proposed method performs statistically better than the baseline methods, we conduct a statistical test using the Wilcoxon signed-rank test [3]. [sent-311, score-0.068]

63 It is known to be safer than the Student’s t-test because it does not assume normal distributions. [sent-313, score-0.026]

64 From Table 2, we find that regularized distance metric learning improves the classification accuracy significantly compared to the Mahalanobis distance, Xing's method and RCA at significance level 0. [sent-314, score-1.043]
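A minimal sketch of the Wilcoxon signed-rank comparison described above; the per-dataset error values below are made-up placeholders for illustration only, not the numbers of Table 1.

```python
from scipy.stats import wilcoxon

# Placeholder per-dataset error rates (%) for the proposed method and one baseline,
# one value per dataset -- NOT the actual values reported in Table 1.
err_online_reg = [17.6, 2.6, 12.6, 26.2, 2.2, 0.5, 9.5, 4.8, 1.8]
err_baseline   = [19.7, 3.1, 14.5, 29.8, 2.4, 0.8, 11.0, 6.2, 2.3]

stat, p = wilcoxon(err_online_reg, err_baseline)
print(f"Wilcoxon signed-rank statistic = {stat:.1f}, p-value = {p:.3f}")
# The improvement is called significant when p falls below the chosen significance level.
```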

66 Figure 1: (a) Face recognition accuracy of kNN and (b) running time of LMNN, ITML, RCA and online-reg algorithms on the “att-face” dataset with varying image sizes (x-axis: image resize ratio; y-axes: classification accuracy and running time in seconds). [sent-338, score-0.313]

67 5.2 Experiment (II): Results for High Dimensional Data. To evaluate the dependence of the regularized metric learning algorithms on data dimensionality, we tested them on the task of face recognition. [sent-340, score-0.826]

68 It consists of grey images of faces from 40 distinct subjects, with ten pictures for each subject. [sent-342, score-0.076]

69 For every subject, the images were taken at different times, with varying lighting conditions, different facial expressions (open/closed eyes, smiling/not smiling) and facial details (glasses/no glasses). [sent-343, score-0.118]

70 The original size of each image is 112 × 92 pixels, with 256 grey levels per pixel. [sent-344, score-0.089]

71 To examine the sensitivity to data dimensionality, we vary the data dimension (i. [sent-345, score-0.039]

72 , the size of images) by compressing the original images into size different sizes with the image aspect ratio preserved. [sent-347, score-0.149]

73 The image compression is achieved by bicubic interpolation (the output pixel value is a weighted average of pixels in the nearest 4-by-4 neighborhood). [sent-348, score-0.09]
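A sketch of this resizing step with Pillow, assuming the 112 × 92 grey-level att-face images mentioned above; the function and argument names are illustrative assumptions.

```python
from PIL import Image

def resize_face(path, ratio):
    """Downsample a face image by `ratio` using bicubic interpolation, preserving the aspect ratio."""
    img = Image.open(path).convert("L")      # 8-bit grey levels
    w, h = img.size                          # (width, height); 92 x 112 for the original att-face images
    target = (max(1, int(round(w * ratio))), max(1, int(round(h * ratio))))
    return img.resize(target, Image.BICUBIC)
```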

74 For each subject, we randomly split its face images into a training set and a test set with ratio 4 : 6. [sent-349, score-0.19]

75 A distance metric is learned from the collection of training face images, and is used by the kNN classifier (k = 3) to predict the subject ID of the test images. [sent-350, score-1.007]

76 We conduct each experiment 10 times, and report the classification accuracy by averaging over 40 subjects and 10 runs. [sent-351, score-0.068]
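The protocol in the last few sentences can be summarized by the sketch below; `learn_metric` stands for any metric learner (for instance a wrapper around the online_reg sketch above), `knn_predict` is the helper sketched earlier, and all names here are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def face_recognition_accuracy(X, subjects, learn_metric, n_runs=10, train_frac=0.4, k=3):
    """Per-subject 4:6 train/test split, metric learning on the training faces,
    then k-NN prediction of the subject ID; accuracy averaged over runs."""
    rng = np.random.default_rng(0)
    accs = []
    for _ in range(n_runs):
        train_idx, test_idx = [], []
        for s in np.unique(subjects):
            idx = rng.permutation(np.where(subjects == s)[0])
            cut = int(round(train_frac * len(idx)))
            train_idx.extend(idx[:cut])
            test_idx.extend(idx[cut:])
        train_idx, test_idx = np.array(train_idx), np.array(test_idx)
        A = learn_metric(X[train_idx], subjects[train_idx])
        preds = [knn_predict(A, X[train_idx], subjects[train_idx], x, k) for x in X[test_idx]]
        accs.append(np.mean(np.array(preds) == subjects[test_idx]))
    return float(np.mean(accs))
```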

77 Figure 1 (a) shows the average classification accuracy of the kNN classifier using different distance metric learning algorithms. [sent-352, score-0.87]

78 The running times of the different metric learning algorithms on the same dataset are shown in Figure 1 (b). [sent-353, score-0.568]

79 We observed that with increasing image size (dimensionality), the regularized distance metric learning algorithm yields stable performance, indicating that it is resilient to high dimensional data. [sent-355, score-1.22]

80 In contrast, the performance of almost all the baseline methods except ITML varied significantly as the size of the input image changed. [sent-356, score-0.053]

81 Although ITML yields stable performance with respect to different image sizes, its high computational cost (Figure 1), arising from solving a Bregman optimization problem in each iteration, makes it unsuitable for high-dimensional data. [sent-357, score-0.026]

82 6 Conclusion. In this paper, we analyze the generalization error of regularized distance metric learning. [sent-358, score-0.038]

83 We show that with an appropriate constraint, regularized distance metric learning can be robust to high dimensional data. [sent-359, score-1.135]

84 We also present efficient learning algorithms for solving the related optimization problems. [sent-360, score-0.034]

85 Empirical studies with face recognition and data classification show that the proposed approach is (i) robust and efficient for high dimensional data, and (ii) comparable to the state-of-the-art approaches for distance metric learning. [sent-361, score-0.541]

86 In the future, we plan to investigate different regularizers and their effect on distance metric learning. [sent-362, score-0.805]

87 Since Σ_{t=1}^n I( y_t(1 − |x_t − x′_t|_{A_{t−1}}^2) < 0 ) |x_t − x′_t|^4 ≤ (16R^4/b) Σ_{t=1}^n max(0, b − y_t(1 − |x_t − x′_t|_{A_{t−1}}^2)) = (16R^4/b) L_n, we thus have the result in the theorem. References: [1] Bousquet, Olivier, and André Elisseeff. [sent-377, score-0.545]

88 Learning distance metrics with contextual constraints for image retrieval. [sent-414, score-0.359]

89 Think globally, fit locally: Unsupervised learning of low dimensional manifolds. [sent-421, score-0.103]

90 Distance metric learning for large margin nearest neighbor classification. [sent-464, score-0.663]

91 Distance metric learning from uncertain side information with application to automated photo tagging. [sent-469, score-0.568]

92 Distance metric learning, with application to clustering with side-information. [sent-479, score-0.534]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('metric', 0.534), ('distance', 0.271), ('itml', 0.263), ('yt', 0.247), ('ad', 0.244), ('lmnn', 0.194), ('xt', 0.186), ('regularized', 0.173), ('rca', 0.17), ('dd', 0.12), ('online', 0.11), ('xing', 0.099), ('mahala', 0.09), ('zu', 0.09), ('zv', 0.09), ('face', 0.085), ('stability', 0.082), ('hoi', 0.078), ('steven', 0.078), ('sdp', 0.078), ('id', 0.075), ('dimensional', 0.069), ('mahalanobis', 0.068), ('classi', 0.068), ('ln', 0.068), ('generalization', 0.065), ('instances', 0.064), ('lemma', 0.062), ('rong', 0.061), ('att', 0.06), ('bregmen', 0.06), ('eclidean', 0.06), ('resize', 0.06), ('regret', 0.059), ('knn', 0.058), ('wilcoxon', 0.058), ('mcdiarmid', 0.058), ('frobenius', 0.057), ('classes', 0.055), ('image', 0.053), ('proposition', 0.052), ('theorem', 0.051), ('cone', 0.048), ('army', 0.048), ('cacy', 0.046), ('jin', 0.045), ('handling', 0.044), ('regularizer', 0.041), ('collection', 0.041), ('learned', 0.041), ('vd', 0.04), ('images', 0.04), ('facial', 0.039), ('examine', 0.039), ('error', 0.038), ('michigan', 0.037), ('mistake', 0.037), ('multimedia', 0.037), ('nearest', 0.037), ('conduct', 0.037), ('yang', 0.036), ('tr', 0.036), ('grey', 0.036), ('bound', 0.036), ('training', 0.035), ('dimensionality', 0.035), ('metrics', 0.035), ('algorithm', 0.034), ('learning', 0.034), ('studies', 0.033), ('keeps', 0.033), ('appendix', 0.033), ('wei', 0.032), ('norm', 0.032), ('examples', 0.032), ('supervised', 0.032), ('accuracy', 0.031), ('advantageous', 0.031), ('features', 0.031), ('verify', 0.031), ('ratio', 0.03), ('ef', 0.029), ('zi', 0.029), ('neighbor', 0.029), ('margin', 0.029), ('recognition', 0.029), ('datasets', 0.029), ('collaborative', 0.028), ('robust', 0.028), ('projecting', 0.027), ('inequality', 0.027), ('high', 0.026), ('selected', 0.026), ('cheung', 0.026), ('compressing', 0.026), ('optdigits', 0.026), ('safer', 0.026), ('nenghai', 0.026), ('resilient', 0.026), ('puted', 0.026)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999928 202 nips-2009-Regularized Distance Metric Learning:Theory and Algorithm

Author: Rong Jin, Shijun Wang, Yang Zhou

Abstract: In this paper, we examine the generalization error of regularized distance metric learning. We show that with appropriate constraints, the generalization error of regularized distance metric learning could be independent from the dimensionality, making it suitable for handling high dimensional data. In addition, we present an efficient online learning algorithm for regularized distance metric learning. Our empirical studies with data classification and face recognition show that the proposed algorithm is (i) effective for distance metric learning when compared to the state-of-the-art methods, and (ii) efficient and robust for high dimensional data.

2 0.30749479 223 nips-2009-Sparse Metric Learning via Smooth Optimization

Author: Yiming Ying, Kaizhu Huang, Colin Campbell

Abstract: In this paper we study the problem of learning a low-rank (sparse) distance matrix. We propose a novel metric learning model which can simultaneously conduct dimension reduction and learn a distance matrix. The sparse representation involves a mixed-norm regularization which is non-convex. We then show that it can be equivalently formulated as a convex saddle (min-max) problem. From this saddle representation, we develop an efficient smooth optimization approach [17] for sparse metric learning, although the learning model is based on a nondifferentiable loss function. Finally, we run experiments to validate the effectiveness and efficiency of our sparse metric learning model on various datasets.

3 0.27054253 126 nips-2009-Learning Bregman Distance Functions and Its Application for Semi-Supervised Clustering

Author: Lei Wu, Rong Jin, Steven C. Hoi, Jianke Zhu, Nenghai Yu

Abstract: Learning distance functions with side information plays a key role in many machine learning and data mining applications. Conventional approaches often assume a Mahalanobis distance function. These approaches are limited in two aspects: (i) they are computationally expensive (even infeasible) for high dimensional data because the size of the metric is in the square of dimensionality; (ii) they assume a fixed metric for the entire input space and therefore are unable to handle heterogeneous data. In this paper, we propose a novel scheme that learns nonlinear Bregman distance functions from side information using a nonparametric approach that is similar to support vector machines. The proposed scheme avoids the assumption of fixed metric by implicitly deriving a local distance from the Hessian matrix of a convex function that is used to generate the Bregman distance function. We also present an efficient learning algorithm for the proposed scheme for distance function learning. The extensive experiments with semi-supervised clustering show the proposed technique (i) outperforms the state-of-the-art approaches for distance function learning, and (ii) is computationally efficient for high dimensional data. 1

4 0.18028453 22 nips-2009-Accelerated Gradient Methods for Stochastic Optimization and Online Learning

Author: Chonghai Hu, Weike Pan, James T. Kwok

Abstract: Regularized risk minimization often involves non-smooth optimization, either because of the loss function (e.g., hinge loss) or the regularizer (e.g., ℓ1 -regularizer). Gradient methods, though highly scalable and easy to implement, are known to converge slowly. In this paper, we develop a novel accelerated gradient method for stochastic optimization while still preserving their computational simplicity and scalability. The proposed algorithm, called SAGE (Stochastic Accelerated GradiEnt), exhibits fast convergence rates on stochastic composite optimization with convex or strongly convex objectives. Experimental results show that SAGE is faster than recent (sub)gradient methods including FOLOS, SMIDAS and SCD. Moreover, SAGE can also be extended for online learning, resulting in a simple algorithm but with the best regret bounds currently known for these problems. 1

5 0.16912982 191 nips-2009-Positive Semidefinite Metric Learning with Boosting

Author: Chunhua Shen, Junae Kim, Lei Wang, Anton Hengel

Abstract: The learning of appropriate distance metrics is a critical problem in image classification and retrieval. In this work, we propose a boosting-based technique, termed B OOST M ETRIC, for learning a Mahalanobis distance metric. One of the primary difficulties in learning such a metric is to ensure that the Mahalanobis matrix remains positive semidefinite. Semidefinite programming is sometimes used to enforce this constraint, but does not scale well. B OOST M ETRIC is instead based on a key observation that any positive semidefinite matrix can be decomposed into a linear positive combination of trace-one rank-one matrices. B OOST M ETRIC thus uses rank-one positive semidefinite matrices as weak learners within an efficient and scalable boosting-based learning process. The resulting method is easy to implement, does not require tuning, and can accommodate various types of constraints. Experiments on various datasets show that the proposed algorithm compares favorably to those state-of-the-art methods in terms of classification accuracy and running time. 1

6 0.13280611 27 nips-2009-Adaptive Regularization of Weight Vectors

7 0.12869801 178 nips-2009-On Stochastic and Worst-case Models for Investing

8 0.12340339 144 nips-2009-Lower bounds on minimax rates for nonparametric regression with additive sparsity and smoothness

9 0.11573362 32 nips-2009-An Online Algorithm for Large Scale Image Similarity Learning

10 0.11176599 207 nips-2009-Robust Nonparametric Regression with Metric-Space Valued Output

11 0.11127857 215 nips-2009-Sensitivity analysis in HMMs with application to likelihood maximization

12 0.097283207 154 nips-2009-Modeling the spacing effect in sequential category learning

13 0.096171111 177 nips-2009-On Learning Rotations

14 0.091927655 125 nips-2009-Learning Brain Connectivity of Alzheimer's Disease from Neuroimaging Data

15 0.091703579 116 nips-2009-Information-theoretic lower bounds on the oracle complexity of convex optimization

16 0.086896658 220 nips-2009-Slow Learners are Fast

17 0.085825764 20 nips-2009-A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers

18 0.081340395 57 nips-2009-Conditional Random Fields with High-Order Features for Sequence Labeling

19 0.080373801 90 nips-2009-Factor Modeling for Advertisement Targeting

20 0.079750635 63 nips-2009-DUOL: A Double Updating Approach for Online Learning


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.249), (1, 0.197), (2, 0.009), (3, 0.012), (4, 0.153), (5, 0.159), (6, -0.056), (7, 0.085), (8, -0.032), (9, 0.098), (10, 0.129), (11, 0.023), (12, 0.133), (13, 0.067), (14, 0.031), (15, -0.296), (16, 0.152), (17, -0.056), (18, 0.041), (19, 0.025), (20, -0.207), (21, -0.087), (22, -0.12), (23, -0.008), (24, 0.036), (25, 0.15), (26, -0.086), (27, -0.138), (28, -0.059), (29, -0.037), (30, -0.056), (31, 0.097), (32, 0.042), (33, -0.031), (34, -0.149), (35, -0.029), (36, 0.02), (37, 0.058), (38, -0.088), (39, -0.018), (40, -0.139), (41, -0.056), (42, -0.051), (43, 0.108), (44, -0.042), (45, -0.03), (46, 0.074), (47, 0.041), (48, 0.006), (49, 0.063)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97626555 202 nips-2009-Regularized Distance Metric Learning:Theory and Algorithm

Author: Rong Jin, Shijun Wang, Yang Zhou

Abstract: In this paper, we examine the generalization error of regularized distance metric learning. We show that with appropriate constraints, the generalization error of regularized distance metric learning could be independent from the dimensionality, making it suitable for handling high dimensional data. In addition, we present an efficient online learning algorithm for regularized distance metric learning. Our empirical studies with data classification and face recognition show that the proposed algorithm is (i) effective for distance metric learning when compared to the state-of-the-art methods, and (ii) efficient and robust for high dimensional data.

2 0.82572675 223 nips-2009-Sparse Metric Learning via Smooth Optimization

Author: Yiming Ying, Kaizhu Huang, Colin Campbell

Abstract: In this paper we study the problem of learning a low-rank (sparse) distance matrix. We propose a novel metric learning model which can simultaneously conduct dimension reduction and learn a distance matrix. The sparse representation involves a mixed-norm regularization which is non-convex. We then show that it can be equivalently formulated as a convex saddle (min-max) problem. From this saddle representation, we develop an efficient smooth optimization approach [17] for sparse metric learning, although the learning model is based on a nondifferentiable loss function. Finally, we run experiments to validate the effectiveness and efficiency of our sparse metric learning model on various datasets.

3 0.77718496 191 nips-2009-Positive Semidefinite Metric Learning with Boosting

Author: Chunhua Shen, Junae Kim, Lei Wang, Anton Hengel

Abstract: The learning of appropriate distance metrics is a critical problem in image classification and retrieval. In this work, we propose a boosting-based technique, termed B OOST M ETRIC, for learning a Mahalanobis distance metric. One of the primary difficulties in learning such a metric is to ensure that the Mahalanobis matrix remains positive semidefinite. Semidefinite programming is sometimes used to enforce this constraint, but does not scale well. B OOST M ETRIC is instead based on a key observation that any positive semidefinite matrix can be decomposed into a linear positive combination of trace-one rank-one matrices. B OOST M ETRIC thus uses rank-one positive semidefinite matrices as weak learners within an efficient and scalable boosting-based learning process. The resulting method is easy to implement, does not require tuning, and can accommodate various types of constraints. Experiments on various datasets show that the proposed algorithm compares favorably to those state-of-the-art methods in terms of classification accuracy and running time. 1

4 0.72551227 126 nips-2009-Learning Bregman Distance Functions and Its Application for Semi-Supervised Clustering

Author: Lei Wu, Rong Jin, Steven C. Hoi, Jianke Zhu, Nenghai Yu

Abstract: Learning distance functions with side information plays a key role in many machine learning and data mining applications. Conventional approaches often assume a Mahalanobis distance function. These approaches are limited in two aspects: (i) they are computationally expensive (even infeasible) for high dimensional data because the size of the metric is in the square of dimensionality; (ii) they assume a fixed metric for the entire input space and therefore are unable to handle heterogeneous data. In this paper, we propose a novel scheme that learns nonlinear Bregman distance functions from side information using a nonparametric approach that is similar to support vector machines. The proposed scheme avoids the assumption of fixed metric by implicitly deriving a local distance from the Hessian matrix of a convex function that is used to generate the Bregman distance function. We also present an efficient learning algorithm for the proposed scheme for distance function learning. The extensive experiments with semi-supervised clustering show the proposed technique (i) outperforms the state-of-the-art approaches for distance function learning, and (ii) is computationally efficient for high dimensional data. 1

5 0.56696784 32 nips-2009-An Online Algorithm for Large Scale Image Similarity Learning

Author: Gal Chechik, Uri Shalit, Varun Sharma, Samy Bengio

Abstract: Learning a measure of similarity between pairs of objects is a fundamental problem in machine learning. It stands in the core of classification methods like kernel machines, and is particularly useful for applications like searching for images that are similar to a given image or finding videos that are relevant to a given video. In these tasks, users look for objects that are not only visually similar but also semantically related to a given object. Unfortunately, current approaches for learning similarity do not scale to large datasets, especially when imposing metric constraints on the learned similarity. We describe OASIS, a method for learning pairwise similarity that is fast and scales linearly with the number of objects and the number of non-zero features. Scalability is achieved through online learning of a bilinear model over sparse representations using a large margin criterion and an efficient hinge loss cost. OASIS is accurate at a wide range of scales: on a standard benchmark with thousands of images, it is more precise than state-of-the-art methods, and faster by orders of magnitude. On 2.7 million images collected from the web, OASIS can be trained within 3 days on a single CPU. The nonmetric similarities learned by OASIS can be transformed into metric similarities, achieving higher precisions than similarities that are learned as metrics in the first place. This suggests an approach for learning a metric from data that is larger by orders of magnitude than was handled before. 1

6 0.53302705 27 nips-2009-Adaptive Regularization of Weight Vectors

7 0.45135257 22 nips-2009-Accelerated Gradient Methods for Stochastic Optimization and Online Learning

8 0.44336316 90 nips-2009-Factor Modeling for Advertisement Targeting

9 0.43170676 63 nips-2009-DUOL: A Double Updating Approach for Online Learning

10 0.39165449 207 nips-2009-Robust Nonparametric Regression with Metric-Space Valued Output

11 0.39143988 74 nips-2009-Efficient Bregman Range Search

12 0.39012063 177 nips-2009-On Learning Rotations

13 0.38937014 73 nips-2009-Dual Averaging Method for Regularized Stochastic Learning and Online Optimization

14 0.38179716 33 nips-2009-Analysis of SVM with Indefinite Kernels

15 0.37123322 144 nips-2009-Lower bounds on minimax rates for nonparametric regression with additive sparsity and smoothness

16 0.36733475 178 nips-2009-On Stochastic and Worst-case Models for Investing

17 0.35294995 116 nips-2009-Information-theoretic lower bounds on the oracle complexity of convex optimization

18 0.35036048 127 nips-2009-Learning Label Embeddings for Nearest-Neighbor Multi-class Classification with an Application to Speech Recognition

19 0.34384018 146 nips-2009-Manifold Regularization for SIR with Rate Root-n Convergence

20 0.33843431 91 nips-2009-Fast, smooth and adaptive regression in metric spaces


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(7, 0.012), (24, 0.064), (25, 0.042), (35, 0.049), (36, 0.133), (39, 0.047), (44, 0.204), (58, 0.161), (71, 0.06), (81, 0.02), (86, 0.108), (91, 0.013)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.91391987 121 nips-2009-Know Thy Neighbour: A Normative Theory of Synaptic Depression

Author: Jean-pascal Pfister, Peter Dayan, Máté Lengyel

Abstract: Synapses exhibit an extraordinary degree of short-term malleability, with release probabilities and effective synaptic strengths changing markedly over multiple timescales. From the perspective of a fixed computational operation in a network, this seems like a most unacceptable degree of added variability. We suggest an alternative theory according to which short-term synaptic plasticity plays a normatively-justifiable role. This theory starts from the commonplace observation that the spiking of a neuron is an incomplete, digital, report of the analog quantity that contains all the critical information, namely its membrane potential. We suggest that a synapse solves the inverse problem of estimating the pre-synaptic membrane potential from the spikes it receives, acting as a recursive filter. We show that the dynamics of short-term synaptic depression closely resemble those required for optimal filtering, and that they indeed support high quality estimation. Under this account, the local postsynaptic potential and the level of synaptic resources track the (scaled) mean and variance of the estimated presynaptic membrane potential. We make experimentally testable predictions for how the statistics of subthreshold membrane potential fluctuations and the form of spiking non-linearity should be related to the properties of short-term plasticity in any particular cell type. 1

2 0.89133191 86 nips-2009-Exploring Functional Connectivities of the Human Brain using Multivariate Information Analysis

Author: Barry Chai, Dirk Walther, Diane Beck, Li Fei-fei

Abstract: In this study, we present a new method for establishing fMRI pattern-based functional connectivity between brain regions by estimating their multivariate mutual information. Recent advances in the numerical approximation of highdimensional probability distributions allow us to successfully estimate mutual information from scarce fMRI data. We also show that selecting voxels based on the multivariate mutual information of local activity patterns with respect to ground truth labels leads to higher decoding accuracy than established voxel selection methods. We validate our approach with a 6-way scene categorization fMRI experiment. Multivariate information analysis is able to find strong information sharing between PPA and RSC, consistent with existing neuroscience studies on scenes. Furthermore, an exploratory whole-brain analysis uncovered other brain regions that share information with the PPA-RSC scene network.

same-paper 3 0.8640101 202 nips-2009-Regularized Distance Metric Learning:Theory and Algorithm

Author: Rong Jin, Shijun Wang, Yang Zhou

Abstract: In this paper, we examine the generalization error of regularized distance metric learning. We show that with appropriate constraints, the generalization error of regularized distance metric learning could be independent from the dimensionality, making it suitable for handling high dimensional data. In addition, we present an efficient online learning algorithm for regularized distance metric learning. Our empirical studies with data classification and face recognition show that the proposed algorithm is (i) effective for distance metric learning when compared to the state-of-the-art methods, and (ii) efficient and robust for high dimensional data.

4 0.83200473 44 nips-2009-Beyond Categories: The Visual Memex Model for Reasoning About Object Relationships

Author: Tomasz Malisiewicz, Alyosha Efros

Abstract: The use of context is critical for scene understanding in computer vision, where the recognition of an object is driven by both local appearance and the object’s relationship to other elements of the scene (context). Most current approaches rely on modeling the relationships between object categories as a source of context. In this paper we seek to move beyond categories to provide a richer appearancebased model of context. We present an exemplar-based model of objects and their relationships, the Visual Memex, that encodes both local appearance and 2D spatial context between object instances. We evaluate our model on Torralba’s proposed Context Challenge against a baseline category-based system. Our experiments suggest that moving beyond categories for context modeling appears to be quite beneficial, and may be the critical missing ingredient in scene understanding systems. 1

5 0.79482281 126 nips-2009-Learning Bregman Distance Functions and Its Application for Semi-Supervised Clustering

Author: Lei Wu, Rong Jin, Steven C. Hoi, Jianke Zhu, Nenghai Yu

Abstract: Learning distance functions with side information plays a key role in many machine learning and data mining applications. Conventional approaches often assume a Mahalanobis distance function. These approaches are limited in two aspects: (i) they are computationally expensive (even infeasible) for high dimensional data because the size of the metric is in the square of dimensionality; (ii) they assume a fixed metric for the entire input space and therefore are unable to handle heterogeneous data. In this paper, we propose a novel scheme that learns nonlinear Bregman distance functions from side information using a nonparametric approach that is similar to support vector machines. The proposed scheme avoids the assumption of fixed metric by implicitly deriving a local distance from the Hessian matrix of a convex function that is used to generate the Bregman distance function. We also present an efficient learning algorithm for the proposed scheme for distance function learning. The extensive experiments with semi-supervised clustering show the proposed technique (i) outperforms the state-of-the-art approaches for distance function learning, and (ii) is computationally efficient for high dimensional data. 1

6 0.77422625 191 nips-2009-Positive Semidefinite Metric Learning with Boosting

7 0.77273864 70 nips-2009-Discriminative Network Models of Schizophrenia

8 0.76512772 104 nips-2009-Group Sparse Coding

9 0.75882113 254 nips-2009-Variational Gaussian-process factor analysis for modeling spatio-temporal data

10 0.75560004 61 nips-2009-Convex Relaxation of Mixture Regression with Efficient Algorithms

11 0.7548207 158 nips-2009-Multi-Label Prediction via Sparse Infinite CCA

12 0.75249308 224 nips-2009-Sparse and Locally Constant Gaussian Graphical Models

13 0.74733955 177 nips-2009-On Learning Rotations

14 0.74609131 169 nips-2009-Nonlinear Learning using Local Coordinate Coding

15 0.74470496 108 nips-2009-Heterogeneous multitask learning with joint sparsity constraints

16 0.74359888 35 nips-2009-Approximating MAP by Compensating for Structural Relaxations

17 0.7428717 252 nips-2009-Unsupervised Feature Selection for the $k$-means Clustering Problem

18 0.74283797 79 nips-2009-Efficient Recovery of Jointly Sparse Vectors

19 0.74241501 30 nips-2009-An Integer Projected Fixed Point Method for Graph Matching and MAP Inference

20 0.74241102 100 nips-2009-Gaussian process regression with Student-t likelihood