iccv iccv2013 iccv2013-227 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Zheyun Feng, Rong Jin, Anil Jain
Abstract: One of the key challenges in search-based image annotation models is to define an appropriate similarity measure between images. Many kernel distance metric learning (KML) algorithms have been developed in order to capture the nonlinear relationships between visual features and semantics of the images. One fundamental limitation in applying KML to image annotation is that it requires converting image annotations into binary constraints, leading to a significant information loss. In addition, most KML algorithms suffer from high computational cost due to the requirement that the learned matrix has to be positive semi-definite (PSD). In this paper, we propose a robust kernel metric learning (RKML) algorithm based on the regression technique that is able to directly utilize image annotations. The proposed method is also computationally more efficient because the PSD property is automatically ensured by regression. We provide the theoretical guarantee for the proposed algorithm, and verify its efficiency and effectiveness for image annotation by comparing it to state-of-the-art approaches for both distance metric learning and image annotation.
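The regression idea summarized in the abstract can be illustrated with a small numerical sketch. It assumes NumPy, an RBF kernel, a tag-overlap similarity S = Y Yᵀ (PSD by construction), and the closed form A = Kr⁻¹ S Kr⁻¹, where Kr is the best rank-r approximation of the kernel matrix K, as given in the extracted sentences further down. The function names (`rbf_kernel`, `rank_r_pinv`, `rkml_fit`) and the toy data are hypothetical and are not taken from the authors' implementation.

```python
import numpy as np


def rbf_kernel(X, width):
    """RBF kernel matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 * width^2))."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * width ** 2))


def rank_r_pinv(K, r):
    """Pseudo-inverse of the best rank-r approximation of a symmetric PSD K."""
    eigvals, eigvecs = np.linalg.eigh(K)      # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:r]       # indices of the r largest
    # Divide each kept eigenvector (column) by its eigenvalue, then recombine.
    return (eigvecs[:, top] / eigvals[top]) @ eigvecs[:, top].T


def rkml_fit(K, S, r):
    """Assumed closed form A = K_r^{-1} S K_r^{-1}; PSD whenever S is PSD."""
    P = rank_r_pinv(K, r)
    return P @ S @ P


# Toy data (hypothetical): 6 images, 3-d features, 4 possible tags.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
Y = rng.integers(0, 2, size=(6, 4)).astype(float)

K = rbf_kernel(X, width=1.0)
S = Y @ Y.T                      # assumption: similarity = number of shared tags
A = rkml_fit(K, S, r=3)

# No projection step: the smallest eigenvalue of A is non-negative up to round-off.
min_eig = np.linalg.eigvalsh((A + A.T) / 2.0).min()
print(min_eig >= -1e-8 * max(1.0, np.abs(A).max()))
```

Because A equals (P Y)(P Y)ᵀ for the symmetric matrix P = Kr⁻¹, it is a Gram matrix and therefore PSD without any eigenvalue clipping. In this sketch the rank r is the only regularization knob.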
Reference: text
sentIndex sentText sentNum sentScore
1 Many kernel distance metric learning (KML) algorithms have been developed in order to capture the nonlinear relationships between visual features and semantics of the images. [sent-2, score-0.48]
2 One fundamental limitation in applying KML to image annotation is that it requires converting image annotations into binary constraints, leading to a significant information loss. [sent-3, score-0.274]
3 In this paper, we propose a robust kernel metric learning (RKML) algorithm based on the regression technique that is able to directly utilize image annotations. [sent-5, score-0.354]
4 We provide the theoretical guarantee for the proposed algorithm, and verify its efficiency and effectiveness for image annotation by comparing it to state-of-the-art approaches for both distance metric learning and image annotation. [sent-7, score-0.495]
5 Introduction The objective of image annotation is to automatically annotate an image with appropriate keywords, often referred to as tags, which reflect its visual content. [sent-9, score-0.209]
6 Their key idea is to annotate a test image I with the common tags shared by the subset of training images that are visually similar to I. [sent-11, score-0.146]
7 Distance metric learning (DML) tackles this problem by learning a metric that pulls semantically similar images close and pushes semantically dissimilar images far apart. [sent-14, score-0.358]
8 Many studies on DML are restricted to learning a linear Mahalanobis distance metric, failing to capture the nonlinear relationships. [sent-15, score-0.227]
9 Several nonlinear DML algorithms have been proposed to overcome this limitation. [sent-18, score-0.087]
10 In the case of image annotation, it could be difficult to construct these binary constraints as two images with different annotations may still share several common keywords. [sent-24, score-0.099]
11 In particular, to ensure that the learned metric is Positive Semi-Definite (PSD), the existing methods need to project the learned matrix onto the PSD cone, at a computational cost of O(d^3). [sent-29, score-0.21]
12 Finally, the high dimensionality of KML may lead to the overfitting of training data [18]. [sent-30, score-0.138]
13 In this paper, we propose a regression based approach for KML, termed Regression based Kernel Metric Learning (RKML), that explicitly addresses the challenges arising from high dimensionality and limitations of binary constraints. [sent-32, score-0.117]
14 RKML directly utilizes image tags to compute a real-valued semantic similarity, and therefore does not need to construct the binary constraints. [sent-33, score-0.189]
15 The projection step is avoided by exploiting the special property of regression, and the overfitting risk is alleviated by appropriately regularizing the rank of the learned kernel metric. [sent-34, score-0.32]
16 We demonstrate the robustness of the proposed RKML algorithm to high dimensionality by proving the theoretical guarantee of the learned kernel metric. [sent-35, score-0.242]
17 We also verify the efficiency and effectiveness of RKML for search-based image annotation by comparing it to the state-of-the-art approaches for both DML and image annotation on several benchmark datasets. [sent-36, score-0.414]
18 Related Work In this section we review the related work on image annotation and distance metric learning. [sent-38, score-0.368]
19 Recent studies on image annotation show that search based approaches are more effective than both generative and discriminative models. [sent-41, score-0.212]
20 Distance Metric Learning Many algorithms have been developed to learn a linear DML from pairwise constraints [35], and some of them are designed exclusively for image annotation [17, 32, 34]. [sent-47, score-0.261]
21 Recently, a number of nonlinear DML approaches have been developed to handle nonlinear and multimodal patterns. [sent-48, score-0.222]
22 They are usually classified into two categories, boosting based approaches [14, 15, 26] and kernel based approaches, depending on how the nonlinear mapping is constructed. [sent-49, score-0.263]
23 Many KML algorithms, such as Kernel DCA [16], KLMCA [28] and Kernel ITML [7], directly extend their linear counterparts to KML using the kernel trick. [sent-50, score-0.159]
24 To handle the high dimensionality challenge in KML, a common approach is to apply dimensionality reduction before learning the metric [5, 28]. [sent-51, score-0.263]
25 Although these studies show dimensionality reduction helps alleviate the overfitting risk in KML, no theoretical support is provided. [sent-52, score-0.183]
26 Let κ(·, ·) : R^d × R^d → R be a kernel function, and Hκ be the corresponding Reproducing Kernel Hilbert Space. [sent-66, score-0.138]
27 Without a metric, the similarity between two instances xa and xb could be assessed by the kernel function as ? [sent-67, score-0.381]
28 The objective of KML is to learn a PSD linear operator T that is consistent with the class assignments of training examples. [sent-74, score-0.084]
29 Note that this is different from similarity learning [4] because we require T to be PSD. [sent-75, score-0.082]
30 Regression based Kernel Metric Learning The proposed RKML is a kernel metric learning algorithm based on the regression technique. [sent-79, score-0.354]
31 Let si,j ∈ R be the similarity measure between two images xi and xj based on their annotations yi and yj. [sent-80, score-0.161]
32 We adopt a regression model to learn a kernel distance metric consistent with the similarity measure si,j by solving the optimization problem: T? [sent-85, score-0.402]
33 Following the representer theorem of kernel learning [24], it is sufficient to assume that T? [sent-91, score-0.179]
34 |2F, (2) where K = [κ(xi, xj)]n×n is the kernel matrix and S = [si,j]n×n includes all the pairwise semantic similarities between any two training images. [sent-107, score-0.159]
35 Note that when the semantic similarity matrix S is PSD, A will also be PSD, thus no additional projection is needed to enforce the linear operator T to be PSD. [sent-109, score-0.115]
36 Let Kr be the best rank r approximation of K, and express A as A = Kr^{-1} S Kr^{-1}. (3) [sent-112, score-0.092]
37 Evidently, the rank r controls the tradeoff between bias and variance in estimating A: the larger the rank r, the lower the bias and the higher the variance. [sent-113, score-0.12]
38 , the similarity between any two data instances xa and x? [sent-116, score-0.161]
39 Theoretical Guarantee of RKML We will show that the linear operator learned by the proposed algorithm is stochastically consistent, i. [sent-133, score-0.089]
40 , the linear operator learned from finite samples provides a good approximation to the optimal one learned from an infinite number of samples. [sent-135, score-0.146]
41 To simplify our analysis, we assume that the semantic similarity measure si,j = yi? [sent-136, score-0.073]
42 Let gk(·) be the prediction function for the k-th ? [sent-161, score-0.098]
43 We make the following assumption for gk(·) in our analysis: A1: gk(·) ∈ Hκ, k = 1, . [sent-165, score-0.073]
44 Assumption A1 essentially assumes that it is possible to accurately learn the prediction function gk(·) given a sufficiently large number of training examples. [sent-169, score-0.119]
45 We also note that assumption A1 holds if gk(·) is a smooth function and κ(·, ·) is a universal kernel [23]. [sent-170, score-0.23]
46 , (6) where Krs is the best rank r approximation of Ks = [κ( x? [sent-193, score-0.092]
47 2 ≤ O(1/√ns), implying that K̃r is an accurate approximation of Kr provided the number of samples ns is sufficiently large. [sent-200, score-0.087]
48 , kernel matrix K can be well approximated by the Nyström method when ns is a few thousand. [sent-203, score-0.193]
49 According to our implementation, we observe that further approximating Kb in (6) to rank r usually yields more accurate prediction for tags. [sent-204, score-0.112]
50 Three benchmark datasets for image annotation are used in our study and their statistics are summarized in Table 1. [sent-217, score-0.182]
51 Given a test image, we first identify the k most visually similar images from the training set using the learned distance metric, and then rank the tags by a majority vote over the k nearest neighbors, where k is chosen by cross-validation. [sent-227, score-0.274]
52 An RBF kernel is used in our study for all KML algorithms. [sent-228, score-0.138]
53 38m based on our experience, and determine the kernel width and rank r by cross-validation. [sent-231, score-0.198]
54 Besides, annotation based on the Euclidean distance, denoted by Euclid, is used as a reference in our comparison. [sent-233, score-0.182]
55 Since most DMLs are developed against must-links and cannot-links, we apply the procedure described in [32] to generate the binary constraints by performing a probabilistic clustering over the images based on their tags. [sent-234, score-0.096]
56 We evaluate the annotation accuracy by the average precision for the top ranked image tags. [sent-236, score-0.211]
57 Following [33, 34], we first compute the precision for each test image by comparing the top 10 annotated tags with the ground truth, and then take the average over the test set. [sent-237, score-0.177]
58 Comparison with State-of-the-art Distance Metric Learning Algorithms Comparison to nonlinear DML algorithms. [sent-243, score-0.087]
59 , Distance Boost (DBoost) [14], Kernel Boost (KBoost) [15], and metric learning with boosting (BoostM) [26], for comparison. [sent-247, score-0.217]
60 Figure 1 shows the average precision for the top t annotated tags obtained by nonlinear DML baselines and the proposed RKML. [sent-252, score-0.264]
61 Surprisingly, we observe that most of the nonlinear DML algorithms are only able to yield performance similar to that based on the Euclidean distance, and more disturbingly, some of the nonlinear DML algorithms even perform significantly worse than the Euclidean distance. [sent-253, score-0.201]
62 As described before, all DML algorithms require converting image annotations into binary constraints, which does not make full use of the annotation information. [sent-257, score-0.274]
63 To verify this point, we run RKML with similarity measure si,j computed from the binary constraints that are generated for the baseline DML algorithms, and denote this method by RKMLH. [sent-258, score-0.134]
64 Comparison of various extensions of RKML for the top t annotated tags on the IAPR TC12. [sent-268, score-0.148]
65 Figure 3 shows the average annotation precision for the linear DML baselines. [sent-272, score-0.232]
66 Similar to KML, we observe that even the best linear DML algorithm is only slightly better than the Euclidean distance, while RKML significantly outperforms all linear DML baselines. [sent-273, score-0.069]
67 Again, we believe that the failure of linear DML is likely due to the binary constraints generated from image annotations. [sent-274, score-0.089]
68 Since none of the baseline algorithms, neither linear nor nonlinear DML, is able to significantly outperform the Euclidean distance, it remains unclear if kernel DML is advantageous to a linear DML. [sent-275, score-0.267]
69 It is clear that RKML significantly outperforms its linear counterpart RLML, verifying the advantage of using kernel in DML. [sent-278, score-0.159]
70 Average precision for the first tag predicted by RKML using different values of rank r on IAPR TC12 data. [sent-281, score-0.134]
71 To make the overfitting effect clearer, we turn off the Nyström approximation in this experiment. [sent-282, score-0.107]
72 We finally examine the role of rank r in the proposed algorithm by evaluating the prediction accuracy with varied r on the IAPR TC12 dataset for both training and testing images (Figure 2). [sent-284, score-0.13]
73 We observe that while the average accuracy of test images initially improves significantly with increasing rank r, it saturates after a certain rank. [sent-286, score-0.087]
74 On the other hand, the prediction accuracy of training data increases almost linearly with respect to the rank, and becomes almost 1 for very large r, a clear indication of overfitting the training data. [sent-287, score-0.142]
75 and ns can be found in the supplementary document. [sent-293, score-0.082]
76 We include Pop as a comparison reference which simply ranks tags based on their occurring frequency in the training set. [sent-298, score-0.14]
77 Average precision for the top t annotated tags using nonlinear distance metrics. [sent-301, score-0.312]
78 Average precision for the top t annotated tags using linear distance metrics. [sent-303, score-0.246]
79 Figure 4 shows the comparison of average precision obtained by different image annotation models. [sent-307, score-0.211]
80 It is not surprising to observe that most annotation methods significantly outperform Pop, while the proposed RKML method outperforms all the state-of-the-art image annotation methods on IAPR TC12 and ESP Game datasets, and only performs slightly worse than TP-D on the Flickr 1M dataset. [sent-308, score-0.391]
81 The running time includes the time for both learning a distance metric and predicting image tags. [sent-322, score-0.227]
82 We observe that compared to the other annotation methods, the proposed RKML algorithm is particularly efficient for large datasets (i. [sent-323, score-0.209]
83 Conclusions and Future Work In this paper, we propose a robust and efficient method for kernel metric learning (KML). [sent-327, score-0.317]
84 The proposed method addresses (i) high computational cost by avoiding the projection into PSD cone, (ii) limitation of binary constraints in tags by adopting a real-valued similarity measure, as well as (iii) the overfitting problem by appropriately regularizing the learned kernel metric. [sent-328, score-0.488]
85 Experiments with large-scale image annotation demonstrate the effectiveness and efficiency of the proposed algorithm by comparing it to the state-of-the-art approaches for DML and image annotation. [sent-329, score-0.207]
86 In the future, we plan to improve the annotation performance by developing a more robust semantic similarity measure. [sent-330, score-0.255]
87 Supervised learning of semantic classes for image annotation and retrieval. [sent-352, score-0.255]
88 Large scale online learning of image similarity through ranking. [sent-359, score-0.082]
89 On the Nyström method for approximating a Gram matrix for improved kernel-based learning. [sent-387, score-0.086]
90 Multi-level annotation of natural scenes using dominant image components and semantic concepts. [sent-393, score-0.214]
91 TagProp: discriminative metric learning in nearest neighbor models for image auto-annotation. [sent-417, score-0.204]
92 Boosting margin based distance functions for clustering. [sent-431, score-0.071]
93 Learning a kernel function for classification with small training samples. [sent-438, score-0.159]
94 Learning distance metrics with contextual constraints for image retrieval. [sent-445, score-0.078]
95 Labeling images by integrating sparse multiple distance learning and semantic context modeling. [sent-452, score-0.121]
96 Positive semidefinite metric learning with boosting. [sent-510, score-0.201]
97 Dimensionality reduction of multimodal labeled data by local fisher discriminant analysis. [sent-515, score-0.08]
98 Distance metric learning for large margin nearest neighbor classification. [sent-536, score-0.227]
99 Distance metric learning from uncertain side information with application to automated photo tagging. [sent-551, score-0.179]
100 Mining social images with distance metric learning for automated image tagging. [sent-567, score-0.227]
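The annotation and evaluation protocol described in the sentences above (majority vote over the k nearest training images, then precision over the top t predicted tags) can be sketched as follows. The helper names, the toy tags, and the similarity scores are hypothetical; in the paper the similarities would come from the learned metric. Breaking ties alphabetically is an assumption made here only to keep the ranking deterministic.

```python
from collections import Counter


def rank_tags(similarities, train_tags, k):
    """Rank tags by majority vote over the k most similar training images."""
    neighbours = sorted(range(len(similarities)),
                        key=lambda i: similarities[i], reverse=True)[:k]
    votes = Counter(tag for i in neighbours for tag in train_tags[i])
    # Most votes first; ties broken alphabetically (an assumption of this sketch).
    return [tag for tag, _ in sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))]


def precision_at_t(ranked_tags, true_tags, t):
    """Number of correct tags among the top t predictions, divided by t."""
    return sum(tag in true_tags for tag in ranked_tags[:t]) / t


def mean_precision_at_t(all_ranked, all_true, t):
    """Average of precision_at_t over a test set."""
    scores = [precision_at_t(r, y, t) for r, y in zip(all_ranked, all_true)]
    return sum(scores) / len(scores)


# Toy example (hypothetical): four training images and one test image.
train_tags = [{"sky", "sea"}, {"sky", "tree"}, {"car"}, {"sea", "boat"}]
similarities = [0.9, 0.8, 0.1, 0.7]   # similarity of the test image to each training image
true_tags = {"sky", "sea", "cloud"}

ranked = rank_tags(similarities, train_tags, k=3)
print(ranked)
print(precision_at_t(ranked, true_tags, t=2))
```

Dividing by t rather than by the number of tags actually returned penalizes a test image whose neighbours supply fewer than t distinct tags, which is one common convention. The source does not say which convention the authors used.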
wordName wordTfidf (topN-words)
[('dml', 0.545), ('rkml', 0.461), ('kml', 0.39), ('annotation', 0.182), ('kernel', 0.138), ('metric', 0.138), ('iapr', 0.13), ('psd', 0.121), ('tags', 0.119), ('esp', 0.103), ('xa', 0.087), ('nonlinear', 0.087), ('nystr', 0.086), ('rlml', 0.084), ('rmkl', 0.084), ('xb', 0.082), ('overfitting', 0.075), ('gk', 0.073), ('flickr', 0.073), ('hertz', 0.063), ('rank', 0.06), ('kr', 0.057), ('tagprop', 0.056), ('game', 0.055), ('ns', 0.055), ('jin', 0.054), ('distance', 0.048), ('om', 0.047), ('tag', 0.045), ('hoi', 0.044), ('dimensionality', 0.042), ('operator', 0.042), ('dmls', 0.042), ('hillel', 0.042), ('krs', 0.042), ('rkmlh', 0.042), ('rong', 0.042), ('learning', 0.041), ('similarity', 0.041), ('theoretic', 0.039), ('boosting', 0.038), ('binary', 0.038), ('regression', 0.037), ('rca', 0.037), ('keywords', 0.037), ('jmlr', 0.036), ('theoretical', 0.036), ('discriminant', 0.036), ('dca', 0.034), ('instances', 0.033), ('verbeek', 0.032), ('approximation', 0.032), ('semantic', 0.032), ('annotations', 0.031), ('euclidean', 0.031), ('michigan', 0.031), ('constraints', 0.03), ('studies', 0.03), ('itml', 0.03), ('precision', 0.029), ('annotated', 0.029), ('developed', 0.028), ('mensink', 0.028), ('observe', 0.027), ('pop', 0.027), ('annotate', 0.027), ('supplementary', 0.027), ('kb', 0.026), ('learned', 0.026), ('prediction', 0.025), ('multimedia', 0.025), ('yj', 0.025), ('efficiency', 0.025), ('verify', 0.025), ('neighbor', 0.025), ('fisher', 0.024), ('examine', 0.024), ('weinberger', 0.024), ('xj', 0.024), ('margin', 0.023), ('lkopf', 0.023), ('rd', 0.023), ('clearer', 0.023), ('mahalanobis', 0.023), ('guillaumin', 0.023), ('converting', 0.023), ('rbf', 0.022), ('semidefinite', 0.022), ('appropriately', 0.021), ('linear', 0.021), ('xi', 0.021), ('training', 0.021), ('infinite', 0.02), ('website', 0.02), ('multimodal', 0.02), ('cone', 0.02), ('feng', 0.02), ('universal', 0.019), ('yi', 0.019), ('stands', 0.019)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999982 227 iccv-2013-Large-Scale Image Annotation by Efficient and Robust Kernel Metric Learning
2 0.12537494 177 iccv-2013-From Point to Set: Extend the Learning of Distance Metrics
Author: Pengfei Zhu, Lei Zhang, Wangmeng Zuo, David Zhang
Abstract: Most of the current metric learning methods are proposed for point-to-point distance (PPD) based classification. In many computer vision tasks, however, we need to measure the point-to-set distance (PSD) and even set-to-set distance (SSD) for classification. In this paper, we extend the PPD based Mahalanobis distance metric learning to PSD and SSD based ones, namely point-to-set distance metric learning (PSDML) and set-to-set distance metric learning (SSDML), and solve them under a unified optimization framework. First, we generate positive and negative sample pairs by computing the PSD and SSD between training samples. Then, we characterize each sample pair by its covariance matrix, and propose a covariance kernel based discriminative function. Finally, we tackle the PSDML and SSDML problems by using standard support vector machine solvers, making the metric learning very efficient for multiclass visual classification tasks. Experiments on gender classification, digit recognition, object categorization and face recognition show that the proposed metric learning methods can effectively enhance the performance of PSD and SSD based classification.
3 0.12380076 191 iccv-2013-Handling Uncertain Tags in Visual Recognition
Author: Arash Vahdat, Greg Mori
Abstract: Gathering accurate training data for recognizing a set of attributes or tags on images or videos is a challenge. Obtaining labels via manual effort or from weakly-supervised data typically results in noisy training labels. We develop the FlipSVM, a novel algorithm for handling these noisy, structured labels. The FlipSVM models label noise by “flipping” labels on training examples. We show empirically that the FlipSVM is effective on images-and-attributes and video tagging datasets.
4 0.10661036 295 iccv-2013-On One-Shot Similarity Kernels: Explicit Feature Maps and Properties
Author: Stefanos Zafeiriou, Irene Kotsia
Abstract: Kernels have been a common tool of machine learning and computer vision applications for modeling nonlinearities and/or the design of robust similarity measures between objects (robustness may refer either to the presence of outliers and noise or to robustness to a class of transformations, e.g., translation). Arguably, the class of positive semidefinite (psd) kernels, widely known as Mercer’s Kernels, constitutes one of the most well-studied cases. For every psd kernel there exists an associated feature map to an arbitrary dimensional Hilbert space H, the so-called feature space. The main reason behind psd kernels’ popularity is the fact that classification/regression techniques (such as Support Vector Machines (SVMs)) and component analysis algorithms (such as Kernel Principal Component Analysis (KPCA)) can be devised in H, without an explicit definition of the feature map, only by using the kernel (the so-called kernel trick). Recently, due to the development of very efficient solutions for large scale linear SVMs and for incremental linear component analysis, the research towards finding feature map approximations for classes of kernels has attracted significant interest. In this paper, we attempt the derivation of explicit feature maps of a recently proposed class of kernels, the so-called one-shot similarity kernels. We show that for this class of kernels either there exists an explicit representation in feature space or the kernel can be expressed in such a form that allows for exact incremental learning. We theoretically explore the properties of these kernels and show how these kernels can be used for the development of robust visual tracking, recognition and deformable fitting algorithms.
5 0.10400878 392 iccv-2013-Similarity Metric Learning for Face Recognition
Author: Qiong Cao, Yiming Ying, Peng Li
Abstract: Recently, a considerable amount of effort has been devoted to the problem of unconstrained face verification, where the task is to predict whether pairs of images are from the same person or not. This problem is challenging and difficult due to the large variations in face images. In this paper, we develop a novel regularization framework to learn similarity metrics for unconstrained face verification. We formulate its objective function by incorporating the robustness to the large intra-personal variations and the discriminative power of novel similarity metrics. In addition, our formulation is a convex optimization problem which guarantees the existence of its global solution. Experiments show that our proposed method achieves the state-of-the-art results on the challenging Labeled Faces in the Wild (LFW) database [10].
6 0.090259686 10 iccv-2013-A Framework for Shape Analysis via Hilbert Space Embedding
7 0.074132308 290 iccv-2013-New Graph Structured Sparsity Model for Multi-label Image Annotations
9 0.06052715 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation
10 0.058847357 431 iccv-2013-Unbiased Metric Learning: On the Utilization of Multiple Datasets and Web Images for Softening Bias
11 0.058613013 332 iccv-2013-Quadruplet-Wise Image Similarity Learning
12 0.05802599 126 iccv-2013-Dynamic Label Propagation for Semi-supervised Multi-class Multi-label Classification
13 0.05728123 222 iccv-2013-Joint Learning of Discriminative Prototypes and Large Margin Nearest Neighbor Classifiers
14 0.053814773 293 iccv-2013-Nonparametric Blind Super-resolution
15 0.053374849 26 iccv-2013-A Practical Transfer Learning Algorithm for Face Verification
16 0.053303819 327 iccv-2013-Predicting an Object Location Using a Global Image Representation
17 0.052711599 85 iccv-2013-Compositional Models for Video Event Detection: A Multiple Kernel Learning Latent Variable Approach
18 0.051904399 81 iccv-2013-Combining the Right Features for Complex Event Recognition
19 0.04782718 23 iccv-2013-A New Image Quality Metric for Image Auto-denoising
20 0.045728561 35 iccv-2013-Accurate Blur Models vs. Image Priors in Single Image Super-resolution
topicId topicWeight
[(0, 0.121), (1, 0.045), (2, -0.03), (3, -0.049), (4, -0.011), (5, 0.049), (6, 0.01), (7, 0.008), (8, 0.016), (9, -0.042), (10, -0.02), (11, -0.069), (12, -0.032), (13, -0.061), (14, 0.01), (15, -0.011), (16, -0.007), (17, -0.031), (18, -0.029), (19, -0.04), (20, -0.015), (21, 0.01), (22, 0.003), (23, 0.039), (24, 0.027), (25, 0.053), (26, 0.07), (27, 0.059), (28, -0.004), (29, 0.083), (30, -0.001), (31, -0.013), (32, -0.041), (33, -0.023), (34, 0.025), (35, 0.063), (36, 0.044), (37, -0.021), (38, -0.014), (39, -0.022), (40, 0.044), (41, -0.021), (42, -0.026), (43, 0.033), (44, 0.148), (45, -0.02), (46, 0.057), (47, -0.052), (48, -0.047), (49, 0.019)]
simIndex simValue paperId paperTitle
same-paper 1 0.93497759 227 iccv-2013-Large-Scale Image Annotation by Efficient and Robust Kernel Metric Learning
2 0.83211935 177 iccv-2013-From Point to Set: Extend the Learning of Distance Metrics
Author: Pengfei Zhu, Lei Zhang, Wangmeng Zuo, David Zhang
Abstract: Most of the current metric learning methods are proposed for point-to-point distance (PPD) based classification. In many computer vision tasks, however, we need to measure the point-to-set distance (PSD) and even set-to-set distance (SSD) for classification. In this paper, we extend the PPD based Mahalanobis distance metric learning to PSD and SSD based ones, namely point-to-set distance metric learning (PSDML) and set-to-set distance metric learning (SSDML), and solve them under a unified optimization framework. First, we generate positive and negative sample pairs by computing the PSD and SSD between training samples. Then, we characterize each sample pair by its covariance matrix, and propose a covariance kernel based discriminative function. Finally, we tackle the PSDML and SSDML problems by using standard support vector machine solvers, making the metric learning very efficient for multiclass visual classification tasks. Experiments on gender classification, digit recognition, object categorization and face recognition show that the proposed metric learning methods can effectively enhance the performance of PSD and SSD based classification.
Author: Jiwen Lu, Gang Wang, Pierre Moulin
Abstract: This paper presents a new approach for image set classification, where each training and testing example contains a set of image instances of an object captured from varying viewpoints or under varying illuminations. While a number of image set classification methods have been proposed in recent years, most of them model each image set as a single linear subspace or mixture of linear subspaces, which may lose some discriminative information for classification. To address this, we propose exploring multiple order statistics as features of image sets, and develop a localized multikernel metric learning (LMKML) algorithm to effectively combine different order statistics information for classification. Our method achieves the state-of-the-art performance on four widely used databases including the Honda/UCSD, CMU Mobo, and Youtube face datasets, and the ETH-80 object dataset.
4 0.73906124 222 iccv-2013-Joint Learning of Discriminative Prototypes and Large Margin Nearest Neighbor Classifiers
Author: Martin Köstinger, Paul Wohlhart, Peter M. Roth, Horst Bischof
Abstract: In this paper, we raise important issues concerning the evaluation complexity of existing Mahalanobis metric learning methods. The complexity scales linearly with the size of the dataset. This is especially cumbersome on large scale or for real-time applications with limited time budget. To alleviate this problem we propose to represent the dataset by a fixed number of discriminative prototypes. In particular, we introduce a new method that jointly chooses the positioning of prototypes and also optimizes the Mahalanobis distance metric with respect to these. We show that choosing the positioning of the prototypes and learning the metric in parallel leads to a drastically reduced evaluation effort while maintaining the discriminative essence of the original dataset. Moreover, for most problems our method performing k-nearest prototype (k-NP) classification on the condensed dataset leads to even better generalization compared to k-NN classification using all data. Results on a variety of challenging benchmarks demonstrate the power of our method. These include standard machine learning datasets as well as the challenging Public Figures Face Database. On the competitive machine learning benchmarks we are comparable to the state-of-the-art while being more efficient. On the face benchmark we clearly outperform the state-of-the-art in Mahalanobis metric learning with drastically reduced evaluation effort.
5 0.7300877 332 iccv-2013-Quadruplet-Wise Image Similarity Learning
Author: Marc T. Law, Nicolas Thome, Matthieu Cord
Abstract: This paper introduces a novel similarity learning framework. Working with inequality constraints involving quadruplets of images, our approach aims at efficiently modeling similarity from rich or complex semantic label relationships. From these quadruplet-wise constraints, we propose a similarity learning framework relying on a convex optimization scheme. We then study how our metric learning scheme can exploit specific class relationships, such as class ranking (relative attributes), and class taxonomy. We show that classification using the learned metrics achieves improved performance over state-of-the-art methods on several datasets. We also evaluate our approach in a new application to learn similarities between webpage screenshots in a fully unsupervised way.
6 0.71269011 295 iccv-2013-On One-Shot Similarity Kernels: Explicit Feature Maps and Properties
7 0.70303589 126 iccv-2013-Dynamic Label Propagation for Semi-supervised Multi-class Multi-label Classification
8 0.69685829 10 iccv-2013-A Framework for Shape Analysis via Hilbert Space Embedding
9 0.68239385 194 iccv-2013-Heterogeneous Image Features Integration via Multi-modal Semi-supervised Learning Model
10 0.67457235 431 iccv-2013-Unbiased Metric Learning: On the Utilization of Multiple Datasets and Web Images for Softening Bias
11 0.66777813 25 iccv-2013-A Novel Earth Mover's Distance Methodology for Image Matching with Gaussian Mixture Models
12 0.62691939 392 iccv-2013-Similarity Metric Learning for Face Recognition
13 0.62168384 142 iccv-2013-Ensemble Projection for Semi-supervised Image Classification
14 0.60107946 259 iccv-2013-Manifold Based Face Synthesis from Sparse Samples
15 0.58217126 248 iccv-2013-Learning to Rank Using Privileged Information
16 0.57840908 290 iccv-2013-New Graph Structured Sparsity Model for Multi-label Image Annotations
18 0.56727928 312 iccv-2013-Perceptual Fidelity Aware Mean Squared Error
19 0.55546427 48 iccv-2013-An Adaptive Descriptor Design for Object Recognition in the Wild
20 0.55427814 347 iccv-2013-Recursive Estimation of the Stein Center of SPD Matrices and Its Applications
topicId topicWeight
[(2, 0.096), (7, 0.025), (10, 0.012), (26, 0.071), (31, 0.045), (42, 0.101), (64, 0.056), (73, 0.014), (78, 0.013), (89, 0.136), (97, 0.307), (98, 0.021)]
simIndex simValue paperId paperTitle
1 0.85744274 373 iccv-2013-Saliency and Human Fixations: State-of-the-Art and Study of Comparison Metrics
Author: Nicolas Riche, Matthieu Duvinage, Matei Mancas, Bernard Gosselin, Thierry Dutoit
Abstract: Visual saliency has been an increasingly active research area in the last ten years with dozens of saliency models recently published. Nowadays, one of the big challenges in the field is to find a way to fairly evaluate all of these models. In this paper, on human eye fixations, we compare the ranking of 12 state-of-the-art saliency models using 12 similarity metrics. The comparison is done on Jian Li's database containing several hundred natural images. Based on the Kendall concordance coefficient, it is shown that some of the metrics are strongly correlated, leading to a redundancy in the performance metrics reported in the available benchmarks. On the other hand, other metrics provide a more diverse picture of models' overall performance. As a recommendation, three similarity metrics should be used to obtain a complete point of view of saliency model performance.
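The Kendall concordance coefficient mentioned in the abstract above measures how strongly several rankings of the same items agree. The sketch below uses the standard formula for untied ranks, W = 12 S / (m^2 (n^3 - n)). Treating each similarity metric as one "rater" that ranks the saliency models is an assumption made for illustration, and the function name `kendall_w` is hypothetical.

```python
def kendall_w(rankings):
    """Kendall's coefficient of concordance W for untied rankings.

    rankings: list of m lists; each inner list holds the ranks 1..n that
    one rater assigns to the same n items. Returns a value in [0, 1].
    """
    m = len(rankings)
    n = len(rankings[0])
    # Total rank each item receives across all raters.
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    # S: sum of squared deviations of the rank sums from their mean.
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Three hypothetical metrics ranking three models identically.
w_agree = kendall_w([[1, 2, 3], [1, 2, 3], [1, 2, 3]])   # perfect agreement
# Two hypothetical metrics ranking three models in opposite order.
w_oppose = kendall_w([[1, 2, 3], [3, 2, 1]])             # no concordance
```

A value of W near 1 for a group of metrics suggests they rank models almost identically, so reporting all of them adds little information.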
2 0.7837612 347 iccv-2013-Recursive Estimation of the Stein Center of SPD Matrices and Its Applications
Author: Hesamoddin Salehian, Guang Cheng, Baba C. Vemuri, Jeffrey Ho
Abstract: Symmetric positive-definite (SPD) matrices are ubiquitous in Computer Vision, Machine Learning and Medical Image Analysis. Finding the center/average of a population of such matrices is a common theme in many algorithms such as clustering, segmentation, principal geodesic analysis, etc. The center of a population of such matrices can be defined using a variety of distance/divergence measures as the minimizer of the sum of squared distances/divergences from the unknown center to the members of the population. It is well known that the computation of the Karcher mean for the space of SPD matrices, which is a negatively curved Riemannian manifold, is computationally expensive. Recently, the LogDet divergence-based center was shown to be a computationally attractive alternative. However, the LogDet-based mean of more than two matrices cannot be computed in closed form, which makes it computationally less attractive for large populations. In this paper we present a novel recursive estimator for the center based on the Stein distance, which is the square root of the LogDet divergence, that is significantly faster than the batch mode computation of this center. The key theoretical contribution is a closed-form solution for the weighted Stein center of two SPD matrices, which is used in the recursive computation of the Stein center for a population of SPD matrices. Additionally, we show experimental evidence of the convergence of our recursive Stein center estimator to the batch mode Stein center. We present applications of our recursive estimator to K-means clustering and image indexing depicting significant time gains over corresponding algorithms that use the batch mode computations. For the latter application, we develop novel hashing functions using the Stein distance and apply it to publicly available data sets, and experimental results have shown favorable comparisons to other competing methods.
∗This research was funded in part by the NIH grant NS066340 to BCV.
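As a small illustration of the Stein distance named in the abstract above, the sketch below computes it for two SPD matrices. It assumes the symmetrized form of the LogDet divergence, S(X, Y) = log det((X + Y)/2) - (1/2) log det(X) - (1/2) log det(Y), and takes its square root. It does not implement the paper's recursive estimator or its closed-form weighted center; the helper names are hypothetical.

```python
import math


def logdet_spd(a):
    """log det(A) of an SPD matrix via a plain Cholesky factorization."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    logdet = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
                # det(A) = prod(diag(L))^2, so add 2 * log of each pivot.
                logdet += 2.0 * math.log(low[i][j])
            else:
                low[i][j] = s / low[j][j]
    return logdet


def stein_distance(x, y):
    """Square root of the symmetrized LogDet divergence between SPD x and y."""
    n = len(x)
    mid = [[(x[i][j] + y[i][j]) / 2.0 for j in range(n)] for i in range(n)]
    div = logdet_spd(mid) - 0.5 * (logdet_spd(x) + logdet_spd(y))
    return math.sqrt(max(div, 0.0))  # clamp tiny negative rounding error


eye = [[1.0, 0.0], [0.0, 1.0]]
scaled = [[4.0, 0.0], [0.0, 4.0]]
d = stein_distance(eye, scaled)
```

Only determinants of SPD matrices are needed, which is part of why divergences of this family are cheaper to evaluate than geodesic distances that require matrix logarithms or eigendecompositions.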
3 0.73997015 412 iccv-2013-Synergistic Clustering of Image and Segment Descriptors for Unsupervised Scene Understanding
Author: Daniel M. Steinberg, Oscar Pizarro, Stefan B. Williams
Abstract: With the advent of cheap, high fidelity, digital imaging systems, the quantity and rate of generation of visual data can dramatically outpace a human's ability to label or annotate it. In these situations there is scope for the use of unsupervised approaches that can model these datasets and automatically summarise their content. To this end, we present a totally unsupervised, and annotation-less, model for scene understanding. This model can simultaneously cluster whole-image and segment descriptors, thereby forming an unsupervised model of scenes and objects. We show that this model outperforms other unsupervised models that can only cluster one source of information (image or segment) at once. We are able to compare unsupervised and supervised techniques using standard measures derived from confusion matrices and contingency tables. This shows that our unsupervised model is competitive with current supervised and weakly-supervised models for scene understanding on standard datasets. We also demonstrate our model operating on a dataset with more than 100,000 images collected by an autonomous underwater vehicle.
same-paper 4 0.72997475 227 iccv-2013-Large-Scale Image Annotation by Efficient and Robust Kernel Metric Learning
Author: Zheyun Feng, Rong Jin, Anil Jain
Abstract: One of the key challenges in search-based image annotation models is to define an appropriate similarity measure between images. Many kernel distance metric learning (KML) algorithms have been developed in order to capture the nonlinear relationships between visual features and semantics of the images. One fundamental limitation in applying KML to image annotation is that it requires converting image annotations into binary constraints, leading to a significant information loss. In addition, most KML algorithms suffer from high computational cost due to the requirement that the learned matrix has to be positive semi-definite (PSD). In this paper, we propose a robust kernel metric learning (RKML) algorithm based on the regression technique that is able to directly utilize image annotations. The proposed method is also computationally more efficient because the PSD property is automatically ensured by regression. We provide the theoretical guarantee for the proposed algorithm, and verify its efficiency and effectiveness for image annotation by comparing it to state-of-the-art approaches for both distance metric learning and image annotation.
5 0.70520741 20 iccv-2013-A Max-Margin Perspective on Sparse Representation-Based Classification
Author: Zhaowen Wang, Jianchao Yang, Nasser Nasrabadi, Thomas Huang
Abstract: Sparse Representation-based Classification (SRC) is a powerful tool in distinguishing signal categories which lie on different subspaces. Despite its wide application to visual recognition tasks, current understanding of SRC is solely based on a reconstructive perspective, which neither offers any guarantee on its classification performance nor provides any insight on how to design a discriminative dictionary for SRC. In this paper, we present a novel perspective towards SRC and interpret it as a margin classifier. The decision boundary and margin of SRC are analyzed in local regions where the support of sparse code is stable. Based on the derived margin, we propose a hinge loss function as the gauge for the classification performance of SRC. A stochastic gradient descent algorithm is implemented to maximize the margin of SRC and obtain more discriminative dictionaries. Experiments validate the effectiveness of the proposed approach in predicting classification performance and improving dictionary quality over reconstructive ones. Classification results competitive with other state-of-the-art sparse coding methods are reported on several data sets.
6 0.69254446 372 iccv-2013-Saliency Detection via Dense and Sparse Reconstruction
7 0.6914767 50 iccv-2013-Analysis of Scores, Datasets, and Models in Visual Saliency Prediction
8 0.68369353 425 iccv-2013-Tracking via Robust Multi-task Multi-view Joint Sparse Representation
9 0.6621815 369 iccv-2013-Saliency Detection: A Boolean Map Approach
10 0.64495653 91 iccv-2013-Contextual Hypergraph Modeling for Salient Object Detection
11 0.63503301 371 iccv-2013-Saliency Detection via Absorbing Markov Chain
12 0.63396466 71 iccv-2013-Category-Independent Object-Level Saliency Detection
13 0.60628623 396 iccv-2013-Space-Time Robust Representation for Action Recognition
14 0.60258603 180 iccv-2013-From Where and How to What We See
15 0.59961474 137 iccv-2013-Efficient Salient Region Detection with Soft Image Abstraction
16 0.59258956 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning
17 0.588422 338 iccv-2013-Randomized Ensemble Tracking
18 0.58084589 381 iccv-2013-Semantically-Based Human Scanpath Estimation with HMMs
19 0.5766536 257 iccv-2013-Log-Euclidean Kernels for Sparse Representation and Dictionary Learning
20 0.57581615 45 iccv-2013-Affine-Constrained Group Sparse Coding and Its Application to Image-Based Classifications