iccv iccv2013 iccv2013-332 knowledge-graph by maker-knowledge-mining

332 iccv-2013-Quadruplet-Wise Image Similarity Learning


Source: pdf

Author: Marc T. Law, Nicolas Thome, Matthieu Cord

Abstract: This paper introduces a novel similarity learning framework. Working with inequality constraints involving quadruplets of images, our approach aims at efficiently modeling similarity from rich or complex semantic label relationships. From these quadruplet-wise constraints, we propose a similarity learning framework relying on a convex optimization scheme. We then study how our metric learning scheme can exploit specific class relationships, such as class ranking (relative attributes), and class taxonomy. We show that classification using the learned metrics gets improved performance over state-of-the-art methods on several datasets. We also evaluate our approach in a new application to learn similarities between webpage screenshots in a fully unsupervised way.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Working with inequality constraints involving quadruplets of images, our approach aims at efficiently modeling similarity from rich or complex semantic label relationships. [sent-5, score-0.475]

2 We then study how our metric learning scheme can exploit specific class relationships, such as class ranking (relative attributes), and class taxonomy. [sent-7, score-0.344]

3 We also evaluate our approach in a new application to learn similarities between webpage screenshots in a fully unsupervised way. [sent-9, score-0.16]

4 The key ingredients of a similarity learning framework are (i) the data representation, including both the feature space and the similarity function, and (ii) the learning framework. [sent-12, score-0.17]

5 Instead of working on pairwise relations, which present some flaws (see text), the Qwise strategy defines quadruplet-wise constraints to express that dissimilarities between examples from (f) and (g) should be smaller than dissimilarities between examples from (e) and (h). [sent-20, score-0.485]

6 Metrics are learned with training data to minimize dissimilarities between similar pairs while separating dissimilar ones. [sent-26, score-0.171]

7 Recently, some attempts have been made to go beyond learning metrics with pairwise constraints generated from binary class membership labels. [sent-28, score-0.309]

8 On the one hand, triplet-wise constraints have been considered to learn metrics [6, 15, 28]. [sent-29, score-0.145]

9 Triplet constraints may be generated from a class hierarchy: an image should be closer to another image from a sibling class than to any image from a distant class in the hierarchy. [sent-30, score-0.374]

10 For instance, relative attributes have been introduced in [20]: different classes (e. [sent-32, score-0.151]

11 face images from class (x) smile more than (or as much as) face images from class (y). [sent-40, score-0.234]

12 Instead of pairwise or triplet-wise techniques, we propose to investigate relations between quadruplets of images. [sent-43, score-0.426]

13 Fig. 1 illustrates a case in which enforcing strong pairwise equivalence constraints as in [20] may be problematic: (second row) Owen (f) is smiling more than Rodriguez (g) although their classes are annotated as smiling as much as each other. [sent-46, score-0.295]

14 To overcome this limitation, one can consider relations on quadruplets: noting that the difference between the surrounding classes (e) and (h) is always greater than between (f) and (g), we express inequality constraints on dissimilarities (Fig. [sent-47, score-0.373]

15 We then demonstrate the advantage of our approach for image classification with respect to pairwise and triplet-wise strategies (Sections 4 and 5), and also for a new emerging context about webpage visual screenshot comparison (Section 6). [sent-53, score-0.243]

16 The Mahalanobis-like distance DW is definitely the most investigated metric in metric learning: DW²(xi, xj) = (xi − xj)⊤ W (xi − xj), [sent-59, score-0.182]

17 where W is a positive semi-definite (PSD) matrix (W ⪰ 0) and (xi, xj) ∈ Rd × Rd are representations of images pi and pj. [sent-62, score-0.48]
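
As a quick illustration of this distance, here is a minimal numpy sketch of the squared Mahalanobis-like form; the function name and the way W is factored as L⊤L are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def mahalanobis_sq(x_i, x_j, W):
    """Squared Mahalanobis-like distance D_W^2(x_i, x_j) = (x_i - x_j)^T W (x_i - x_j)."""
    d = x_i - x_j
    return float(d @ W @ d)

# W must be positive semi-definite (PSD); building it as L^T L guarantees this.
rng = np.random.default_rng(0)
L = rng.standard_normal((5, 5))
W = L.T @ L
x_i, x_j = rng.standard_normal(5), rng.standard_normal(5)
print(mahalanobis_sq(x_i, x_j, W))  # always >= 0 since W is PSD
```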

18 A hinge loss or a generalized logistic loss function may be used to express all the constraints (over S and D) in a single functional [18]. [sent-83, score-0.197]

19 This type of constraints is easy to generate in classification: (pi ,pi+) are sampled from the same class and (pi ,pi− ) from different classes [10, 15, 25, 28]. [sent-90, score-0.199]

20 Other approaches investigate different dataset labels or semantic relationships to build pairwise or triplet-wise metric learning schemes. [sent-99, score-0.221]

21 They consider totally ordered sets of classes that describe relations among classes. [sent-106, score-0.132]

22 Based on these rich relations, they learn image representations by exploiting only pairwise class relations. [sent-107, score-0.175]

23 We propose to explore this type of data knowledge in metric learning for image comparison. [sent-108, score-0.137]

24 Noting that pairwise or triplet-wise approaches may, sometimes, be limited (see Section 1), our learning framework is based on constraints on quadruplets. [sent-109, score-0.166]

25 As illustrated in Fig. 1, pair or triplet constraints may be noisy or irrelevant, leading to a suboptimal learning scheme when provided at a class level. [sent-112, score-0.208]

26 On the other hand, working on dissimilarities between quadruplets of images limits the risk of incorporating misleading annotations. [sent-113, score-0.443]

27 One can find other Computer Vision applications where pairwise dissimilarities Dij might be hard or meaningless for humans to annotate, while quadruplet-wise relations are easy or meaningful. [sent-114, score-0.166]

28 We are interested in comparing pairs of dissimilarities (Dij , Dkl) that involve up to four different images (pi, pj , pk ,pl). [sent-116, score-0.485]

29 Two types of relations R are considered between Dij and Dkl: (1) strict inequality between dissimilarities: Dij < Dkl, and (2) non-strict inequality: Dij ≤ Dkl. [sent-118, score-0.139]

30 We then define two training sets: A, composed of quadruplets (pi, pj, pk, pl) satisfying Dij < Dkl (Dkl is a strict upper bound of Dij), and B, composed of quadruplets satisfying Dij ≤ Dkl. [sent-120, score-0.68]

31 Following Eq. (1), the dissimilarity considered for learning in this paper is built on Φ(pi, pj). [sent-132, score-0.142]

32 Quadruplet-wise constraints can be generated even if (pi, pj) and (pk, pl) are both similar or dissimilar. [sent-146, score-0.312]

33 Only the order of similarity between (pi , pj ) and (pk , pl ) is required. [sent-147, score-0.351]

34 Such annotations lead to the optimization of M different dissimilarity functions, where each of them represents a relative ordering focused on a given criterion (e.g. [sent-156, score-0.164]

35 pi is more smiling than pj, pi is younger than pj . [sent-158, score-0.763]

36 Without loss of generality, we then consider optimizing the following dissimilarity function (with Ψ equal to Φ or Φ²): Dw(pi, pj) = w⊤Ψ(pi, pj). [sent-166, score-0.137]

37 Our goal is to learn the parameters of Dw such that the maximum number of the following constraints is satisfied: ∀(pi, pj, pk, pl) ∈ A : Dw(pk, pl) ≥ Dw(pi, pj) + 1 (5), and ∀(pi, pj, pk, pl) ∈ B : Dw(pk, pl) ≥ Dw(pi, pj) (6). [sent-170, score-0.99]
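
To make constraints (5) and (6) concrete, the sketch below counts how many quadruplet constraints a candidate weight vector satisfies, assuming the linear form Dw(pi, pj) = w⊤Ψ(pi, pj); the data layout (one pair of Ψ vectors per quadruplet) and all names are hypothetical.

```python
import numpy as np

def d_w(w, psi_ij):
    """Linear dissimilarity D_w(p_i, p_j) = w^T Psi(p_i, p_j)."""
    return float(w @ psi_ij)

def count_satisfied(w, A, B):
    """A and B are lists of (psi_ij, psi_kl) pairs, one per quadruplet (p_i, p_j, p_k, p_l)."""
    ok_A = sum(d_w(w, kl) >= d_w(w, ij) + 1.0 for ij, kl in A)  # Eq. (5): with safety margin
    ok_B = sum(d_w(w, kl) >= d_w(w, ij) for ij, kl in B)        # Eq. (6): without margin
    return ok_A, ok_B

# toy data: 10 quadruplets of 8-dimensional Psi vectors in each set
rng = np.random.default_rng(1)
A = [(rng.standard_normal(8), rng.standard_normal(8)) for _ in range(10)]
B = [(rng.standard_normal(8), rng.standard_normal(8)) for _ in range(10)]
w = rng.standard_normal(8)
print(count_satisfied(w, A, B))
```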

38 Eq. (5) is similar to the constraints used in triplet-wise approaches [10, 25, 28], with the exception that we use quadruplets of images. [sent-174, score-0.367]

39 Let zq be the vector of differences of quadruplet q = (pi, pj , pk, pl) : zq = zijkl = Ψ(pk ,pl) −Ψ(pi, pj). [sent-181, score-0.636]

40 w⊤zq ≥ 0 (8). As explained in Section 2, we can use a loss function over the training set A ∪ B to define our objective function. [sent-187, score-0.202]

41 Since the constraints over A and B are different, we define the loss function L1h over the quadruplets in A. [sent-189, score-0.304]

42 We apply it to w⊤zq (∀q ∈ A) and use the following differentiable loss function³: L1h(t) = 0 if t > 1 + h, (1 + h − t)²/(4h) if |1 − t| ≤ h, and 1 − t otherwise (9). We define L0h as an adaptation of L1h that considers the absence of safety margin in Eq. [sent-195, score-0.287]

43 Applied to w⊤zq, L0h(t) = 0 if t > h, (h − t)²/(4h) if |t| ≤ h, and −t otherwise (10). To avoid overfitting, we introduce a regularization term over w. [sent-198, score-0.161]
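
A minimal sketch of the smoothed losses (9)-(10) and of a regularized objective over A ∪ B follows; the case boundaries are taken from the standard Huberized hinge of [4], while the ℓ2 regularizer, the value of h, and the function names are assumptions, so this is an illustration rather than the authors' exact implementation.

```python
import numpy as np

def loss_l1h(t, h=0.5):
    """Smoothed hinge L_h^1: 0 if t > 1+h, (1+h-t)^2/(4h) if |1-t| <= h, 1-t otherwise."""
    if t > 1.0 + h:
        return 0.0
    if abs(1.0 - t) <= h:
        return (1.0 + h - t) ** 2 / (4.0 * h)
    return 1.0 - t

def loss_l0h(t, h=0.5):
    """Same loss without the safety margin (hinge located at 0 instead of 1)."""
    return loss_l1h(t + 1.0, h)

def qwise_objective(w, ZA, ZB, lam=0.1, h=0.5):
    """Regularized objective over the quadruplet difference vectors z_q of A and B."""
    obj = sum(loss_l1h(float(w @ z), h) for z in ZA)   # margin constraints, Eq. (5)
    obj += sum(loss_l0h(float(w @ z), h) for z in ZB)  # margin-free constraints, Eq. (6)
    return obj + lam * float(w @ w)                    # l2 regularization over w

# toy usage on random difference vectors
rng = np.random.default_rng(2)
ZA = [rng.standard_normal(8) for _ in range(20)]
ZB = [rng.standard_normal(8) for _ in range(20)]
print(qwise_objective(rng.standard_normal(8), ZA, ZB))
```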

44 In Fig. 1, the attribute am = "Presence of smile" allows ranking 4 celebrity classes from the least to the most smiling. [sent-227, score-0.15]

45 Instead of considering attributes as boolean values (the concept is present in the image or not), Parikh and Grauman [20] learn for each attribute am a vector wm ∈ Rd and use the score wm⊤xi. [sent-228, score-0.25]

46 The score wm⊤xi represents the degree of presence of am in pi (xi ∈ Rd is the feature vector of pi). [sent-229, score-0.295]

47 To learn wm, they use original training sets describing relative orderings between classes. (As described in [4], L1h is a differentiable approximation of the hinge loss when h → 0.) [sent-230, score-0.25]

48 As explained in Section 1, the learning information is provided at a class level: pairwise constraints may be noisy or irrelevant, leading to a suboptimal learning scheme. [sent-249, score-0.281]

49 To further exploit the available ordered set of classes and overcome these limitations, we consider relations on quadruplets. [sent-252, score-0.132]

50 By sampling such quadruplets from the whole set of relative orderings on classes (e. [sent-257, score-0.413]

51 When (f) ∼ (g), we use a slightly different assumption: Dkl > |Dij|, to take into account the fact that pi and pj are not ranked. [sent-262, score-0.48]

52 This unfolds into Dkl ≥ Dij + 1 and Dkl ≥ Dji + 1 (13); we thus generate two quadruplets in A from Eq. (13). [sent-264, score-0.304]
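
The sampling step can be sketched as follows: quadruplets are built around a tied class pair using the surrounding classes of the ordering, as in the (e) ≺ (f) ∼ (g) ≺ (h) example, and each sampled tuple yields the two quadruplets discussed above; the dict-based data layout and the helper names are assumptions.

```python
import itertools
import random

def quadruplets_for_tied_pair(images_by_class, lower, tied_a, tied_b, upper, n=2, seed=0):
    """Build quadruplets (p_i, p_j, p_k, p_l) expressing D(p_k, p_l) > |D(p_i, p_j)|:
    (p_i, p_j) come from the tied classes, (p_k, p_l) from the surrounding classes."""
    rng = random.Random(seed)
    pi_pool = rng.sample(images_by_class[tied_a], n)
    pj_pool = rng.sample(images_by_class[tied_b], n)
    pk_pool = rng.sample(images_by_class[lower], n)
    pl_pool = rng.sample(images_by_class[upper], n)
    quads = []
    for pi, pj, pk, pl in itertools.product(pi_pool, pj_pool, pk_pool, pl_pool):
        # D_kl >= D_ij + 1 and D_kl >= D_ji + 1: two quadruplets per sampled tuple
        quads.append((pi, pj, pk, pl))
        quads.append((pj, pi, pk, pl))
    return quads

# toy usage with integer "images" grouped into 4 ordered classes (e) < (f) ~ (g) < (h)
toy = {c: list(range(10 * k, 10 * k + 5)) for k, c in enumerate(["e", "f", "g", "h"])}
print(len(quadruplets_for_tied_pair(toy, "e", "f", "g", "h", n=2)))  # 2 * 2^4 = 32
```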

53 Once the optimal weight vectors wm are learned for all am, each image pi is described by a high-level feature representation: hi = [w1⊤xi, …, wM⊤xi]. [sent-269, score-0.333]

54 It is not necessary to discuss the sign of Dkl since pk was annotated to have a stronger presence of am than pl. [sent-282, score-0.186]

55 Additional information is also available for both datasets: relative orderings of classes according to some semantic attributes (see Table 1 for OSR). [sent-290, score-0.178]

56 Baselines: We use three baselines: (1) the linear transformation learned with LMNN [28], which uses only class membership information; (2) the relative attribute learning method of Parikh and Grauman [20], which uses relative attribute annotations on classes (e.g. [sent-291, score-0.501]

57 Table 1), unlike LMNN, to generate and exploit only pairwise constraints; and (3) a combination of the first two baselines that first uses relative attribute annotations to learn a representation of images in attribute space, and then learns a metric in attribute space with LMNN. [sent-293, score-0.402]

58 The Qwise scheme only uses relative attribute information to learn a linear transformation. [sent-298, score-0.146]

59 This linear transformation can be exploited by other linear transformation learning methods that use class membership information. [sent-299, score-0.21]

60 To learn the projection direction wm of attribute am, we select pairs of classes. [sent-304, score-0.208]

61 From each selected pair of classes, we extract N × N image pairs or quadruplets to create training constraints. [sent-313, score-0.304]

62 Once all the M projection directions wm are learned, a Gaussian distribution is learned for each class cs of images: the mean μs ∈ RM and covariance matrix Σs ∈ RM×M are estimated using the hi of all training images pi ∈ cs. [sent-317, score-0.432]

63 A test image pt is then assigned to the class corresponding to the highest likelihood. [sent-318, score-0.217]
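
This classification stage can be sketched as follows: project images onto the learned attribute directions wm, fit one Gaussian per class, and assign a test image to the class with the highest log-likelihood; the small ridge added to the covariance and all names are assumptions introduced for illustration.

```python
import numpy as np
from scipy.stats import multivariate_normal

def attribute_representation(X, W_dirs):
    """h_i = [w_1^T x_i, ..., w_M^T x_i] for every row x_i of X (W_dirs has shape M x d)."""
    return X @ W_dirs.T

def fit_class_gaussians(H, labels, ridge=1e-6):
    """Estimate a mean and covariance in attribute space for each class label."""
    models = {}
    for c in np.unique(labels):
        Hc = H[labels == c]
        mu = Hc.mean(axis=0)
        cov = np.cov(Hc, rowvar=False) + ridge * np.eye(H.shape[1])
        models[c] = (mu, cov)
    return models

def classify(h_test, models):
    """Assign the test representation to the class with the highest log-likelihood."""
    scores = {c: multivariate_normal.logpdf(h_test, mean=mu, cov=cov)
              for c, (mu, cov) in models.items()}
    return max(scores, key=scores.get)

# toy usage: 3 classes, 16-d features, M = 4 learned attribute directions
rng = np.random.default_rng(3)
X, labels = rng.standard_normal((60, 16)), np.repeat(np.arange(3), 20)
W_dirs = rng.standard_normal((4, 16))
H = attribute_representation(X, W_dirs)
models = fit_class_gaussians(H, labels)
print(classify(H[0], models))
```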

64 Relative attribute annotations (used for Qwise learning) and class membership information (used for LMNN) then seem complementary. [sent-329, score-0.192]

65 We also investigated how our quadruplets are sampled from the set of ordering relations. [sent-330, score-0.33]

66 For instance, if we have (k) ≺ (i) ≺ (e) ≺ (f) ∼ (g) ≺ (h) ≺ (j) ≺ (l) and focus on the class pair (f) ∼ (g), in all our experiments we only sampled quadruplets from the 4 surrounding classes (e) ≺ (f) ∼ (g) ≺ (h) (step 1). [sent-331, score-0.304]

67 Hierarchical Metric Learning: Another classification context with rich annotations is metric learning using a semantic taxonomy structure. [sent-337, score-0.268]

68 We study in this section how our model can exploit complex relations from a class hierarchy, as proposed in [26]. [sent-338, score-0.173]

69 Our objective is to learn a metric such that images from close (sibling) classes with respect to the class semantic hierarchy are more similar than images from more distant classes. [sent-339, score-0.367]

70 Ψ, A: Qwise formulation. Given a semantic taxonomy expressed by a tree of classes, let us consider two sibling classes ca and cb and one of their cousin classes cd. [sent-342, score-0.316]

71 We generate two types of quadruplet-wise constraints in order to: • Enforce the dissimilarity between two images from the same class to be smaller than the dissimilarity between two images from sibling classes. [sent-343, score-0.238]

72 • Enforce the dissimilarity between two images from sibling classes to be smaller than the dissimilarity between two images from cousin classes. [sent-345, score-0.175]

73 We use Dw(pi, pj) = w⊤Ψ(pi, pj), where Ψ(pi, pj) = (xi − xj) ◦ (xi − xj) and w = Diag(W). [sent-351, score-0.251]
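
A rough sketch of this formulation is given below: Ψ is the elementwise squared difference (so w⊤Ψ is a diagonal Mahalanobis-like dissimilarity), and the two constraint types are sampled from explicitly given sibling and cousin classes; the helper names and data layout are hypothetical.

```python
import random
import numpy as np

def psi(x_i, x_j):
    """Psi(p_i, p_j) = (x_i - x_j) ∘ (x_i - x_j): elementwise squared difference,
    so that w^T Psi(p_i, p_j) is a Mahalanobis-like dissimilarity with w = Diag(W)."""
    d = np.asarray(x_i) - np.asarray(x_j)
    return d * d

def hierarchy_quadruplets(images_by_class, ca, cb, cd, n=2, seed=0):
    """Quadruplets (p_i, p_j, p_k, p_l) meaning D(p_i, p_j) < D(p_k, p_l):
    type 1: same-class pair (c_a, c_a) vs. sibling pair (c_a, c_b);
    type 2: sibling pair (c_a, c_b) vs. cousin pair (c_a, c_d)."""
    rng = random.Random(seed)
    a = rng.sample(images_by_class[ca], n)
    b = rng.sample(images_by_class[cb], n)
    d = rng.sample(images_by_class[cd], n)
    quads = [(a[0], a[1], a[0], pb) for pb in b]             # type 1
    quads += [(a[0], pb, a[0], pd) for pb in b for pd in d]  # type 2
    return quads
```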

74 Experiments: To validate the Qwise ability to learn a powerful metric using a class hierarchy, we focus on the local subtree classification task described in [26]. [sent-359, score-0.293]

75 [26] which also use class taxonomy information to learn hierarchical similarity metrics. [sent-365, score-0.207]

76 It is worth mentioning that they learn a local metric for each class (leaf of the subtree), parameterized by a full PSD matrix. [sent-366, score-0.24]

77 Our Qwise-learning model is simpler since we learn a global metric for each subtree. [sent-367, score-0.197]

78 Even though our metric learning strategy is not significantly better than an SVM scheme alone, the results are encouraging. [sent-382, score-0.162]

79 The sampling strategy to get useful quadruplet constraints from these hierarchies has to be further investigated. [sent-383, score-0.151]

80 Significant changes between successive versions of the same webpage mean that a robot has to revisit the page and index it. [sent-387, score-0.251]

81 In this study, we focus on news websites, where changes in advertisements or menus are not significant whereas changes in the news content are. [sent-388, score-0.201]

82 In this context, having a metric able to properly identify significant changes between webpage versions is crucial. [sent-389, score-0.264]

83 We intend to learn (1) a semantic dissimilarity between versions, and (2) region weights that help interpret the results. [sent-394, score-0.185]

84 Ψ, A, B: Qwise formulation. Let pi here be a screen capture of a specific webpage at time i. [sent-411, score-0.368]

85 Following Section 3, we use a diagonal matrix metric model and express our metric as: Dw(pi, pj) = w⊤Ψ(pi, pj), [sent-412, score-0.209]

86 where the dth component of Ψ(pi, pj) is the L2 distance between GIST descriptors in the dth regions of pi and pj (see Section 6. [sent-415, score-0.507]
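
Assuming per-region GIST descriptors have already been extracted (here as a D × g array per screenshot), the region-wise feature Ψ and the weighted dissimilarity can be computed as in this sketch; the array layout, dimensions, and function names are assumptions.

```python
import numpy as np

def psi_regions(gist_i, gist_j):
    """Psi(p_i, p_j): the d-th component is the L2 distance between the GIST
    descriptors of the d-th region of p_i and p_j. Inputs have shape (D, g)."""
    return np.linalg.norm(gist_i - gist_j, axis=1)

def d_w_webpage(w, gist_i, gist_j):
    """D_w(p_i, p_j) = w^T Psi(p_i, p_j); w holds one interpretable weight per region."""
    return float(w @ psi_regions(gist_i, gist_j))

# toy example with D = 16 regions and a 512-dimensional GIST descriptor per region
rng = np.random.default_rng(4)
gist_t, gist_t1 = rng.standard_normal((16, 512)), rng.standard_normal((16, 512))
w = np.ones(16) / 16.0
print(d_w_webpage(w, gist_t, gist_t1))
```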

87 To generate our constraints, we assume that the dissimilarity between two successive screen captures pt and pt+1 is smaller than the dissimilarity between a previous version (pr) and a later (distant enough) version (pr+γ): D(pr, pr+γ) > D(pt, pt+1) if r ≤ t ≤ r + γ − 1. [sent-418, score-0.308]

88 First, quadruplets in B that violate the constraint (i. [sent-429, score-0.304]

89 Second, quadruplets in A that violate their corresponding constraint penalize content that does not change much in some region, although a change in the whole page is expected. [sent-434, score-0.346]

90 Experimental Results. Dataset: To evaluate the Qwise learning scheme, we provide a new webpage dataset. [sent-438, score-0.157]

91 For evaluation purposes, we manually annotate successive versions (pt, pt+1) as dissimilar if a significant (semantic) change [sent-441, score-0.13]

92 occurs between pt and pt+1, and as similar otherwise. [sent-454, score-0.148]

93 For each split, a distance is learned on versions crawled during 5 successive days, and the successive versions of the 45 remaining days are used for testing. [sent-458, score-0.254]

94 Visual descriptors: We consider screen captures of page versions as images. [sent-459, score-0.132]

95 2-distance between bins that fall into the same cell of the grids of pi and pj. [sent-464, score-0.48]

96 Parameters to generate Qwise constraints: To generate constraints, we sample version quadruplets (pt, pt+1, pr, ps) in a temporal window (of 5 days) where t varies and so that r ≥ t − 6, s ≤ t + 7, and γ = 4. [sent-466, score-0.304]
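
One possible way to enumerate such temporal quadruplets is sketched below: for every time step t, the successive pair (pt, pt+1) is constrained to be more similar than a pair (pr, pr+γ) that straddles it, with r restricted to the temporal window; the window bounds mirror the parameters quoted above but the exact indexing is an assumption.

```python
def temporal_quadruplets(n_versions, gamma=4, back=6, forward=7):
    """Return index quadruplets (t, t+1, r, r+gamma) expressing
    D(p_r, p_{r+gamma}) > D(p_t, p_{t+1}) for r <= t <= r + gamma - 1."""
    quads = []
    for t in range(n_versions - 1):
        lo = max(0, t - back)                                   # r >= t - back
        hi = min(n_versions - 1 - gamma, t + forward - gamma)   # r + gamma stays in range
        for r in range(lo, hi + 1):
            if r <= t <= r + gamma - 1:  # the distant pair must straddle (p_t, p_{t+1})
                quads.append((t, t + 1, r, r + gamma))
    return quads

print(len(temporal_quadruplets(20)))
```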

97 Conclusion and Perspectives: In this paper, we introduce our Qwise framework to learn similarities from quadruplets of images. [sent-488, score-0.353]

98 The proposed metric parameterization makes the approach robust to overfitting, and the convexity of the objective function makes the learning effective. [sent-490, score-0.137]

99 Our Qwise approach has been successfully evaluated in three different scenarios: relative attribute learning, metric learning on a class hierarchy, and the study of webpage changes. [sent-491, score-0.414]

100 Distance metric learning for large margin nearest neighbor classification. [sent-675, score-0.182]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('qwise', 0.536), ('quadruplets', 0.304), ('pj', 0.251), ('dkl', 0.249), ('pi', 0.229), ('dij', 0.187), ('zq', 0.161), ('lmnn', 0.153), ('pt', 0.148), ('dw', 0.139), ('pk', 0.125), ('webpage', 0.111), ('dissimilarities', 0.109), ('wm', 0.104), ('dissimilarity', 0.096), ('metric', 0.091), ('osr', 0.088), ('sibling', 0.079), ('thome', 0.071), ('class', 0.069), ('classes', 0.067), ('xi', 0.066), ('relations', 0.065), ('quadruplet', 0.063), ('menus', 0.063), ('dwm', 0.063), ('verma', 0.063), ('constraints', 0.063), ('versions', 0.062), ('pl', 0.061), ('pubfig', 0.061), ('cord', 0.059), ('subtree', 0.057), ('pairwise', 0.057), ('pr', 0.055), ('attribute', 0.055), ('smiling', 0.054), ('taxonomy', 0.05), ('learn', 0.049), ('screenshot', 0.048), ('clas', 0.047), ('news', 0.047), ('psd', 0.046), ('learning', 0.046), ('xj', 0.045), ('margin', 0.045), ('advertisements', 0.044), ('rd', 0.044), ('smile', 0.042), ('inequality', 0.042), ('relative', 0.042), ('page', 0.042), ('attributes', 0.042), ('bbc', 0.041), ('loss', 0.041), ('membership', 0.041), ('semantical', 0.04), ('safety', 0.04), ('hierarchy', 0.039), ('similarity', 0.039), ('successive', 0.036), ('crawling', 0.036), ('oasis', 0.036), ('webpages', 0.036), ('metrics', 0.033), ('dissimilar', 0.032), ('oicft', 0.032), ('owen', 0.032), ('parameterized', 0.031), ('learned', 0.03), ('working', 0.03), ('gist', 0.03), ('triplet', 0.03), ('cnn', 0.029), ('mahalanobis', 0.029), ('screen', 0.028), ('days', 0.028), ('rm', 0.028), ('celebrity', 0.028), ('overfitting', 0.027), ('express', 0.027), ('dth', 0.027), ('face', 0.027), ('semantic', 0.027), ('irrelevant', 0.027), ('annotations', 0.027), ('transformation', 0.027), ('classification', 0.027), ('thee', 0.027), ('tco', 0.026), ('baselines', 0.026), ('grauman', 0.026), ('ordering', 0.026), ('web', 0.026), ('cb', 0.026), ('hinge', 0.025), ('hour', 0.025), ('marc', 0.025), ('distant', 0.025), ('strategy', 0.025)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000001 332 iccv-2013-Quadruplet-Wise Image Similarity Learning

Author: Marc T. Law, Nicolas Thome, Matthieu Cord

Abstract: This paper introduces a novel similarity learning framework. Working with inequality constraints involving quadruplets of images, our approach aims at efficiently modeling similarity from rich or complex semantic label relationships. From these quadruplet-wise constraints, we propose a similarity learning framework relying on a convex optimization scheme. We then study how our metric learning scheme can exploit specific class relationships, such as class ranking (relative attributes), and class taxonomy. We show that classification using the learned metrics gets improved performance over state-of-the-art methods on several datasets. We also evaluate our approach in a new application to learn similarities between webpage screenshots in a fully unsupervised way.

2 0.15438306 319 iccv-2013-Point-Based 3D Reconstruction of Thin Objects

Author: Benjamin Ummenhofer, Thomas Brox

Abstract: 3D reconstruction deals with the problem of finding the shape of an object from a set of images. Thin objects that have virtually no volume pose a special challenge for reconstruction with respect to shape representation and fusion of depth information. In this paper we present a dense point-based reconstruction method that can deal with this special class of objects. We seek to jointly optimize a set of depth maps by treating each pixel as a point in space. Points are pulled towards a common surface by pairwise forces in an iterative scheme. The method also handles the problem of opposed surfaces by means of penalty forces. Efficient optimization is achieved by grouping points to superpixels and a spatial hashing approach for fast neighborhood queries. We show that the approach is on a par with state-of-the-art methods for standard multi view stereo settings and gives superior results for thin objects.

3 0.11998735 380 iccv-2013-Semantic Transform: Weakly Supervised Semantic Inference for Relating Visual Attributes

Author: Sukrit Shankar, Joan Lasenby, Roberto Cipolla

Abstract: Relative (comparative) attributes are promising for thematic ranking of visual entities, which also aids in recognition tasks [19, 23]. However, attribute rank learning often requires a substantial amount of relational supervision, which is highly tedious, and apparently impractical for real-world applications. In this paper, we introduce the Semantic Transform, which under minimal supervision, adaptively finds a semantic feature space along with a class ordering that is related in the best possible way. Such a semantic space is found for every attribute category. To relate the classes under weak supervision, the class ordering needs to be refined according to a cost function in an iterative procedure. This problem is ideally NP-hard, and we thus propose a constrained search tree formulation for the same. Driven by the adaptive semantic feature space representation, our model achieves the best results to date for all of the tasks of relative, absolute and zero-shot classification on two popular datasets.

4 0.11030713 370 iccv-2013-Saliency Detection in Large Point Sets

Author: Elizabeth Shtrom, George Leifman, Ayellet Tal

Abstract: While saliency in images has been extensively studied in recent years, there is very little work on saliency of point sets. This is despite the fact that point sets and range data are becoming ever more widespread and have myriad applications. In this paper we present an algorithm for detecting the salient points in unorganized 3D point sets. Our algorithm is designed to cope with extremely large sets, which may contain tens of millions of points. Such data is typical of urban scenes, which have recently become commonly available on the web. No previous work has handled such data. For general data sets, we show that our results are competitive with those of saliency detection of surfaces, although we do not have any connectivity information. We demonstrate the utility of our algorithm in two applications: producing a set of the most informative viewpoints and suggesting an informative city tour given a city scan.

5 0.11019668 392 iccv-2013-Similarity Metric Learning for Face Recognition

Author: Qiong Cao, Yiming Ying, Peng Li

Abstract: Recently, there is a considerable amount of efforts devoted to the problem of unconstrained face verification, where the task is to predict whether pairs of images are from the same person or not. This problem is challenging and difficult due to the large variations in face images. In this paper, we develop a novel regularization framework to learn similarity metrics for unconstrained face verification. We formulate its objective function by incorporating the robustness to the large intra-personal variations and the discriminative power of novel similarity metrics. In addition, our formulation is a convex optimization problem which guarantees the existence of its global solution. Experiments show that our proposed method achieves the state-of-the-art results on the challenging Labeled Faces in the Wild (LFW) database [10].

6 0.08449544 177 iccv-2013-From Point to Set: Extend the Learning of Distance Metrics

7 0.082602322 222 iccv-2013-Joint Learning of Discriminative Prototypes and Large Margin Nearest Neighbor Classifiers

8 0.081388921 52 iccv-2013-Attribute Adaptation for Personalized Image Search

9 0.080788434 31 iccv-2013-A Unified Probabilistic Approach Modeling Relationships between Attributes and Objects

10 0.078468636 6 iccv-2013-A Convex Optimization Framework for Active Learning

11 0.069661744 256 iccv-2013-Locally Affine Sparse-to-Dense Matching for Motion and Occlusion Estimation

12 0.06895832 431 iccv-2013-Unbiased Metric Learning: On the Utilization of Multiple Datasets and Web Images for Softening Bias

13 0.068164818 107 iccv-2013-Deformable Part Descriptors for Fine-Grained Recognition and Attribute Prediction

14 0.067526706 399 iccv-2013-Spoken Attributes: Mixing Binary and Relative Attributes to Say the Right Thing

15 0.066500604 445 iccv-2013-Visual Reranking through Weakly Supervised Multi-graph Learning

16 0.065564871 290 iccv-2013-New Graph Structured Sparsity Model for Multi-label Image Annotations

17 0.06433998 397 iccv-2013-Space-Time Tradeoffs in Photo Sequencing

18 0.064128906 204 iccv-2013-Human Attribute Recognition by Rich Appearance Dictionary

19 0.062989101 238 iccv-2013-Learning Graphs to Match

20 0.062407136 165 iccv-2013-Find the Best Path: An Efficient and Accurate Classifier for Image Hierarchies


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.156), (1, 0.057), (2, -0.031), (3, -0.085), (4, 0.015), (5, 0.032), (6, -0.017), (7, -0.047), (8, 0.041), (9, -0.005), (10, -0.033), (11, 0.013), (12, -0.01), (13, -0.028), (14, 0.034), (15, -0.017), (16, -0.016), (17, 0.002), (18, -0.006), (19, 0.014), (20, 0.015), (21, -0.002), (22, -0.005), (23, -0.019), (24, -0.007), (25, 0.011), (26, 0.084), (27, 0.044), (28, 0.01), (29, 0.078), (30, 0.047), (31, -0.031), (32, -0.029), (33, -0.012), (34, 0.011), (35, 0.024), (36, 0.042), (37, -0.032), (38, 0.019), (39, -0.082), (40, 0.061), (41, -0.011), (42, -0.047), (43, 0.044), (44, 0.063), (45, -0.06), (46, 0.013), (47, -0.026), (48, -0.023), (49, 0.089)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.93921238 332 iccv-2013-Quadruplet-Wise Image Similarity Learning

Author: Marc T. Law, Nicolas Thome, Matthieu Cord

Abstract: This paper introduces a novel similarity learning framework. Working with inequality constraints involving quadruplets of images, our approach aims at efficiently modeling similarity from rich or complex semantic label relationships. From these quadruplet-wise constraints, we propose a similarity learning framework relying on a convex optimization scheme. We then study how our metric learning scheme can exploit specific class relationships, such as class ranking (relative attributes), and class taxonomy. We show that classification using the learned metrics gets improved performance over state-of-the-art methods on several datasets. We also evaluate our approach in a new application to learn similarities between webpage screenshots in a fully unsupervised way.

2 0.82757294 222 iccv-2013-Joint Learning of Discriminative Prototypes and Large Margin Nearest Neighbor Classifiers

Author: Martin Köstinger, Paul Wohlhart, Peter M. Roth, Horst Bischof

Abstract: In this paper, we raise important issues concerning the evaluation complexity of existing Mahalanobis metric learning methods. The complexity scales linearly with the size of the dataset. This is especially cumbersome on large scale or for real-time applications with limited time budget. To alleviate this problem we propose to represent the dataset by a fixed number of discriminative prototypes. In particular, we introduce a new method that jointly chooses the positioning of prototypes and also optimizes the Mahalanobis distance metric with respect to these. We show that choosing the positioning of the prototypes and learning the metric in parallel leads to a drastically reduced evaluation effort while maintaining the discriminative essence of the original dataset. Moreover, for most problems our method performing k-nearest prototype (k-NP) classification on the condensed dataset leads to even better generalization compared to k-NN classification using all data. Results on a variety of challenging benchmarks demonstrate the power of our method. These include standard machine learning datasets as well as the challenging Public Fig- ures Face Database. On the competitive machine learning benchmarks we are comparable to the state-of-the-art while being more efficient. On the face benchmark we clearly outperform the state-of-the-art in Mahalanobis metric learning with drastically reduced evaluation effort.

3 0.78708732 227 iccv-2013-Large-Scale Image Annotation by Efficient and Robust Kernel Metric Learning

Author: Zheyun Feng, Rong Jin, Anil Jain

Abstract: One of the key challenges in search-based image annotation models is to define an appropriate similarity measure between images. Many kernel distance metric learning (KML) algorithms have been developed in order to capture the nonlinear relationships between visual features and semantics of the images. One fundamental limitation in applying KML to image annotation is that it requires converting image annotations into binary constraints, leading to a significant information loss. In addition, most KML algorithms suffer from high computational cost due to the requirement that the learned matrix has to be positive semi-definite (PSD). In this paper, we propose a robust kernel metric learning (RKML) algorithm based on the regression technique that is able to directly utilize image annotations. The proposed method is also computationally more efficient because the PSD property is automatically ensured by regression. We provide the theoretical guarantee for the proposed algorithm, and verify its efficiency and effectiveness for image annotation by comparing it to state-of-the-art approaches for both distance metric learning and image annotation.

4 0.76649749 177 iccv-2013-From Point to Set: Extend the Learning of Distance Metrics

Author: Pengfei Zhu, Lei Zhang, Wangmeng Zuo, David Zhang

Abstract: Most of the current metric learning methods are proposed for point-to-point distance (PPD) based classification. In many computer vision tasks, however, we need to measure the point-to-set distance (PSD) and even set-to-set distance (SSD) for classification. In this paper, we extend the PPD based Mahalanobis distance metric learning to PSD and SSD based ones, namely point-to-set distance metric learning (PSDML) and set-to-set distance metric learning (SSDML), and solve them under a unified optimization framework. First, we generate positive and negative sample pairs by computing the PSD and SSD between training samples. Then, we characterize each sample pair by its covariance matrix, and propose a covariance kernel based discriminative function. Finally, we tackle the PSDML and SSDMLproblems by using standard support vector machine solvers, making the metric learning very efficient for multiclass visual classification tasks. Experiments on gender classification, digit recognition, object categorization and face recognition show that the proposed metric learning methods can effectively enhance the performance of PSD and SSD based classification.

5 0.75895804 142 iccv-2013-Ensemble Projection for Semi-supervised Image Classification

Author: Dengxin Dai, Luc Van_Gool

Abstract: This paper investigates the problem of semi-supervised classification. Unlike previous methods that regularize classifying boundaries with unlabeled data, our method learns a new image representation from all available data (labeled and unlabeled) and performs plain supervised learning with the new feature. In particular, an ensemble of image prototype sets are sampled automatically from the available data, to represent a rich set of visual categories/attributes. Discriminative functions are then learned on these prototype sets, and images are represented by the concatenation of their projected values onto the prototypes (similarities to them) for further classification. Experiments on four standard datasets show three interesting phenomena: (1) our method consistently outperforms previous methods for semi-supervised image classification; (2) our method lets itself combine well with these methods; and (3) our method works well for self-taught image classification where unlabeled data are not coming from the same distribution as labeled ones, but rather from a random collection of images.

6 0.75702769 431 iccv-2013-Unbiased Metric Learning: On the Utilization of Multiple Datasets and Web Images for Softening Bias

7 0.72322696 194 iccv-2013-Heterogeneous Image Features Integration via Multi-modal Semi-supervised Learning Model

8 0.71965575 212 iccv-2013-Image Set Classification Using Holistic Multiple Order Statistics Features and Localized Multi-kernel Metric Learning

9 0.69246113 126 iccv-2013-Dynamic Label Propagation for Semi-supervised Multi-class Multi-label Classification

10 0.69017828 248 iccv-2013-Learning to Rank Using Privileged Information

11 0.68343484 380 iccv-2013-Semantic Transform: Weakly Supervised Semantic Inference for Relating Visual Attributes

12 0.67796373 25 iccv-2013-A Novel Earth Mover's Distance Methodology for Image Matching with Gaussian Mixture Models

13 0.6533404 392 iccv-2013-Similarity Metric Learning for Face Recognition

14 0.64203161 29 iccv-2013-A Scalable Unsupervised Feature Merging Approach to Efficient Dimensionality Reduction of High-Dimensional Visual Data

15 0.6169818 416 iccv-2013-The Interestingness of Images

16 0.61604363 312 iccv-2013-Perceptual Fidelity Aware Mean Squared Error

17 0.61539352 290 iccv-2013-New Graph Structured Sparsity Model for Multi-label Image Annotations

18 0.6151607 158 iccv-2013-Fast High Dimensional Vector Multiplication Face Recognition

19 0.61240226 125 iccv-2013-Drosophila Embryo Stage Annotation Using Label Propagation

20 0.60699862 449 iccv-2013-What Do You Do? Occupation Recognition in a Photo via Social Context


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(2, 0.089), (7, 0.026), (26, 0.068), (31, 0.038), (34, 0.013), (42, 0.108), (48, 0.017), (64, 0.071), (73, 0.032), (78, 0.011), (89, 0.152), (97, 0.019), (98, 0.025), (99, 0.219)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.81340843 332 iccv-2013-Quadruplet-Wise Image Similarity Learning

Author: Marc T. Law, Nicolas Thome, Matthieu Cord

Abstract: This paper introduces a novel similarity learning framework. Working with inequality constraints involving quadruplets of images, our approach aims at efficiently modeling similarity from rich or complex semantic label relationships. From these quadruplet-wise constraints, we propose a similarity learning framework relying on a convex optimization scheme. We then study how our metric learning scheme can exploit specific class relationships, such as class ranking (relative attributes), and class taxonomy. We show that classification using the learned metrics gets improved performance over state-of-the-art methods on several datasets. We also evaluate our approach in a new application to learn similarities between webpage screenshots in a fully unsupervised way.

2 0.72569084 386 iccv-2013-Sequential Bayesian Model Update under Structured Scene Prior for Semantic Road Scenes Labeling

Author: Evgeny Levinkov, Mario Fritz

Abstract: Semantic road labeling is a key component of systems that aim at assisted or even autonomous driving. Considering that such systems continuously operate in the realworld, unforeseen conditions not represented in any conceivable training procedure are likely to occur on a regular basis. In order to equip systems with the ability to cope with such situations, we would like to enable adaptation to such new situations and conditions at runtime. Existing adaptive methods for image labeling either require labeled data from the new condition or even operate globally on a complete test set. None of this is a desirable mode of operation for a system as described above where new images arrive sequentially and conditions may vary. We study the effect of changing test conditions on scene labeling methods based on a new diverse street scene dataset. We propose a novel approach that can operate in such conditions and is based on a sequential Bayesian model update in order to robustly integrate the arriving images into the adapting procedure.

3 0.71456265 180 iccv-2013-From Where and How to What We See

Author: S. Karthikeyan, Vignesh Jagadeesh, Renuka Shenoy, Miguel Ecksteinz, B.S. Manjunath

Abstract: Eye movement studies have confirmed that overt attention is highly biased towards faces and text regions in images. In this paper we explore a novel problem of predicting face and text regions in images using eye tracking data from multiple subjects. The problem is challenging as we aim to predict the semantics (face/text/background) only from eye tracking data without utilizing any image information. The proposed algorithm spatially clusters eye tracking data obtained in an image into different coherent groups and subsequently models the likelihood of the clusters containing faces and text using a fully connected Markov Random Field (MRF). Given the eye tracking data from a test image, it predicts potential face/head (humans, dogs and cats) and text locations reliably. Furthermore, the approach can be used to select regions of interest for further analysis by object detectors for faces and text. The hybrid eye position/object detector approach achieves better detection performance and reduced computation time compared to using only the object detection algorithm. We also present a new eye tracking dataset on 300 images selected from ICDAR, Street-view, Flickr and Oxford-IIIT Pet Dataset from 15 subjects.

4 0.71300352 188 iccv-2013-Group Sparsity and Geometry Constrained Dictionary Learning for Action Recognition from Depth Maps

Author: Jiajia Luo, Wei Wang, Hairong Qi

Abstract: Human action recognition based on the depth information provided by commodity depth sensors is an important yet challenging task. The noisy depth maps, different lengths of action sequences, and free styles in performing actions, may cause large intra-class variations. In this paper, a new framework based on sparse coding and temporal pyramid matching (TPM) is proposed for depth-based human action recognition. Especially, a discriminative class-specific dictionary learning algorithm is proposed for sparse coding. By adding the group sparsity and geometry constraints, features can be well reconstructed by the sub-dictionary belonging to the same class, and the geometry relationships among features are also kept in the calculated coefficients. The proposed approach is evaluated on two benchmark datasets captured by depth cameras. Experimental results show that the proposed algorithm repeatedly achieves superior performance to the state of the art algorithms. Moreover, the proposed dictionary learning method also outperforms classic dictionary learning approaches.

5 0.71237087 425 iccv-2013-Tracking via Robust Multi-task Multi-view Joint Sparse Representation

Author: Zhibin Hong, Xue Mei, Danil Prokhorov, Dacheng Tao

Abstract: Combining multiple observation views has proven beneficial for tracking. In this paper, we cast tracking as a novel multi-task multi-view sparse learning problem and exploit the cues from multiple views including various types of visual features, such as intensity, color, and edge, where each feature observation can be sparsely represented by a linear combination of atoms from an adaptive feature dictionary. The proposed method is integrated in a particle filter framework where every view in each particle is regarded as an individual task. We jointly consider the underlying relationship between tasks across different views and different particles, and tackle it in a unified robust multi-task formulation. In addition, to capture the frequently emerging outlier tasks, we decompose the representation matrix to two collaborative components which enable a more robust and accurate approximation. We show that the proposed formulation can be efficiently solved using the Accelerated Proximal Gradient method with a small number of closed-form updates. The presented tracker is implemented using four types of features and is tested on numerous benchmark video sequences. Both the qualitative and quantitative results demonstrate the superior performance of the proposed approach compared to several state-of-the-art trackers.

6 0.71211344 194 iccv-2013-Heterogeneous Image Features Integration via Multi-modal Semi-supervised Learning Model

7 0.71195936 227 iccv-2013-Large-Scale Image Annotation by Efficient and Robust Kernel Metric Learning

8 0.71035272 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning

9 0.70957524 406 iccv-2013-Style-Aware Mid-level Representation for Discovering Visual Connections in Space and Time

10 0.70930183 253 iccv-2013-Linear Sequence Discriminant Analysis: A Model-Based Dimensionality Reduction Method for Vector Sequences

11 0.70928079 338 iccv-2013-Randomized Ensemble Tracking

12 0.70906568 427 iccv-2013-Transfer Feature Learning with Joint Distribution Adaptation

13 0.70887554 179 iccv-2013-From Subcategories to Visual Composites: A Multi-level Framework for Object Detection

14 0.70880562 445 iccv-2013-Visual Reranking through Weakly Supervised Multi-graph Learning

15 0.70858312 384 iccv-2013-Semi-supervised Robust Dictionary Learning via Efficient l-Norms Minimization

16 0.70832402 328 iccv-2013-Probabilistic Elastic Part Model for Unsupervised Face Detector Adaptation

17 0.70800871 44 iccv-2013-Adapting Classification Cascades to New Domains

18 0.7077657 20 iccv-2013-A Max-Margin Perspective on Sparse Representation-Based Classification

19 0.70732784 126 iccv-2013-Dynamic Label Propagation for Semi-supervised Multi-class Multi-label Classification

20 0.70732325 340 iccv-2013-Real-Time Articulated Hand Pose Estimation Using Semi-supervised Transductive Regression Forests