cvpr cvpr2013 cvpr2013-241 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Zeynep Akata, Florent Perronnin, Zaid Harchaoui, Cordelia Schmid
Abstract: Attributes are an intermediate representation, which enables parameter sharing between classes, a must when training data is scarce. We propose to view attribute-based image classification as a label-embedding problem: each class is embedded in the space of attribute vectors. We introduce a function which measures the compatibility between an image and a label embedding. The parameters of this function are learned on a training set of labeled samples to ensure that, given an image, the correct classes rank higher than the incorrect ones. Results on the Animals With Attributes and Caltech-UCSD-Birds datasets show that the proposed framework outperforms the standard Direct Attribute Prediction baseline in a zero-shot learning scenario. The label embedding framework offers other advantages such as the ability to leverage alternative sources of information in addition to attributes (e.g. class hierarchies) or to transition smoothly from zero-shot learning to learning with large quantities of data.
Reference: text
sentIndex sentText sentNum sentScore
1 We propose to view attribute-based image classification as a label-embedding problem: each class is embedded in the space of attribute vectors. [sent-2, score-0.375]
2 The parameters of this function are learned on a training set of labeled samples to ensure that, given an image, the correct classes rank higher than the incorrect ones. [sent-4, score-0.267]
3 The label embedding framework offers other advantages such as the ability to leverage alternative sources of information in addition to attributes (e. [sent-6, score-0.761]
4 A solution to zero-shot learning which has recently gained in popularity in the computer vision community consists in introducing an intermediate space A referred to as attribute layer [16, 8]. [sent-14, score-0.361]
5 Image embedding (left): how to extract suitable features from an image? [sent-17, score-0.307]
6 We focus on label embedding (right): how to embed class labels in a Euclidean space? [sent-18, score-0.501]
7 We use attributes as side information for the label embedding and measure the “compatibility” between the embedded inputs and outputs with a function F. [sent-19, score-0.753]
8 As an example, if the classes correspond to animals, possible attributes include “has paws”, “has stripes” or “is black”. [sent-21, score-0.357]
9 To classify a new image, its attributes are predicted using the learned classifiers and the attribute scores are combined into class-level scores. [sent-23, score-0.596]
10 In other words, since attribute classifiers are learned independently of the end-task they might be optimal at predicting attributes but not necessarily at predicting classes. [sent-27, score-0.596]
11 which can perform zero-shot prediction if no labeled samples are available for some classes, but which can also leverage new labeled samples for these classes as they become available. [sent-30, score-0.419]
12 Third, while attributes can be a useful source of prior information, other sources of information could be leveraged for zero-shot learning. [sent-32, score-0.410]
13 Indeed, images of classes which are close in a semantic hierarchy are usually more similar than images of classes which are far [6]. [sent-34, score-0.284]
14 This paper proposes such a solution by making use of the label embedding framework. [sent-38, score-0.395]
15 We underline that, while there is an abundant literature in the computer vision community on image embedding (how to describe an image? [sent-39, score-0.307]
16 ), comparatively less work has been devoted to label embedding in the Y space (how to describe a class? [sent-40, score-0.425]
17 We embed each class y ∈ Y in the space of attribute vectors and thus refer to our approach as Attribute Label Embedding (ALE). [sent-42, score-0.395]
18 Third, the label embedding framework is generic and not restricted to attributes. [sent-50, score-0.395]
19 Related Work We now review related work on attributes, zero-shot learning and label embedding (three research areas which strongly overlap) with an emphasis on the latter. [sent-58, score-0.436]
20 It has been proposed to improve the standard DAP model to take into account the correlation between attributes or between attributes and classes [38, 39, 43, 19]. [sent-62, score-0.652]
21 Zero-shot learning Zero-shot learning requires the ability to transfer knowledge from classes for which we have training data to classes for which we do not. [sent-74, score-0.299]
22 Possible sources of prior information include attributes [16, 8, 25, 28, 27], semantic class taxonomies [27, 22] or text features [25, 28, 27]. [sent-75, score-0.452]
23 It is unclear, however, how such an embedding could be extrapolated to the case of generic visual categories. [sent-79, score-0.307]
24 Label embedding In computer vision, a vast amount of work has been devoted to input embedding, i. [sent-83, score-0.337]
25 This includes works on patch encoding (see [4] for a recent comparison), on kernel-based methods [32] with a recent focus on explicit embeddings [20, 35], on dimensionality reduction [32] and on compression [13, 30, 36]. [sent-86, score-0.281]
26 Provided that the embedding function ϕ is chosen correctly, label embedding can be an effective way to share parameters between classes. [sent-88, score-0.702]
27 While this taxonomy is valid for both input θ and output embeddings ϕ, we focus here on output embeddings. [sent-91, score-0.353]
28 A possible strategy consists in learning directly an embedding from the input to the output (or from the output to the input) as is the case of regression [25]. [sent-98, score-0.426]
29 This setting is particularly relevant when little training data is available, as side information and the derived embeddings can compensate for the lack of data. [sent-106, score-0.357]
30 In our work, we focus on embeddings derived from side information but we also consider the case where they are learned from labeled data, using side information as a prior. [sent-110, score-0.46]
31 Learning with attributes as label embedding. Given a training set S = {(xn, yn), n = 1, . . . , N}. [sent-112, score-0.693]
32 In machine learning, a common strategy is to use embedding functions θ : X → X̃ and ϕ : Y → Ỹ for the inputs and outputs and then to learn on the transformed input/output pairs. [sent-118, score-0.386]
33 We then explain how to leverage attributes to compute label embeddings. [sent-122, score-0.389]
34 Finally, we show that the label embedding framework is generic enough to accommodate for other sources of side information. [sent-124, score-0.575]
35 It is generally assumed that F is linear in some combined feature embedding of inputs/outputs ψ(x, y): F(x, y; w) = w⊤ψ(x, y) (2) [sent-129, score-0.307]
36 and that the joint embedding ψ can be written as the tensor product between the image embedding θ : X → X̃ = R^D and the label embedding ϕ : Y → Ỹ = R^E: ψ(x, y) = θ(x) ⊗ ϕ(y) (3), where ⊗ : R^D × R^E → R^DE, so ψ(x, y) ∈ R^DE. [sent-130, score-1.009]
37 Attribute label embedding We now consider the problem of computing label embeddings ϕA from attributes which we refer to as Attribute Label Embedding (ALE). [sent-144, score-1.005]
38 , C} and that we have a set of E attributes A = {ai, i = 1, . . . , E}. [sent-150, score-0.265]
39 We also assume that we are provided with an association measure ρy,i between each attribute ai and each class y. [sent-154, score-0.36]
40 In this work, we focus on binary relevance although one advantage of the label embedding framework is that it can easily accommodate real-valued relevances. [sent-156, score-0.443]
41 We embed class y in the E-dim attribute space as follows: ϕA(y) = [ρy,1, . . . , ρy,E] (7) [sent-157, score-0.395]
42 and denote by ΦA the E × C matrix of attribute embeddings which stacks the individual ϕA(y)'s. [sent-160, score-0.567]
43 We note that in equation (4) the image and label embeddings play symmetric roles. [sent-161, score-0.345]
44 Also, in the case where attributes are redundant, it might be advantageous to decorrelate them. [sent-165, score-0.284]
45 We will study the effect of attribute decorrelation in our experiments. [sent-171, score-0.312]
46 The simplest learning strategy is to maximize directly the compatibility between the input and output embeddings, (1/N) Σ_{n=1}^N F(xn, yn; W). [sent-175, score-0.405]
47 Therefore, we draw inspiration from the WSABIE algorithm [41] which learns jointly image and label embeddings from data to optimize classification accuracy. [sent-179, score-0.408]
48 The crucial difference between WSABIE and ALE is the fact that the latter uses attributes as side information. [sent-180, score-0.332]
49 In what follows, Φ is the matrix which stacks the embeddings ϕ(y). [sent-183, score-0.278]
50 In WSABIE, the label embedding space dimensionality is a parameter to tune. [sent-198, score-0.419]
51 In such a case, we want to learn the class embeddings using as prior information ΦA. [sent-210, score-0.368]
52 Beyond attributes While attributes make sense in the label embedding framework, we note that label embedding is more general and can accommodate for other sources of side information. [sent-218, score-1.5]
53 The hierarchy embedding ϕH(y) can be defined as the C-dimensional vector: ϕH(y) = [ξy,1, . . . , ξy,C] (11) [sent-221, score-0.387]
54 We later refer to this embedding as Hierarchy Label Embedding (HLE) and we compare ϕA and ϕH as sources of prior information in our experiments. [sent-225, score-0.398]
55 In the case where classes are not organized in a tree structure but form a graph, then other types of embeddings could be used, for instance by performing a kernel PCA on the commute time kernel [29]. [sent-226, score-0.391]
56 Different embeddings can be easily combined in the label embedding framework, e. [sent-227, score-0.652]
57 through simple concatenation of the different embeddings or through more complex operations such as a CCA of the embeddings. [sent-229, score-0.289]
58 Each class was annotated with 85 attributes by 10 students [24] and the result was binarized. [sent-243, score-0.317]
59 Hence, there is a significant difference in the number and quality of attributes between the two datasets. [sent-255, score-0.265]
60 What is the best way to encode/normalize the attribute embeddings? [sent-273, score-0.289]
61 How do attributes compare to a class hierarchy as prior information? [sent-276, score-0.423]
62 The first baseline is Ridge Regression (RR) which was used in [25] to map input features to output attribute labels. [sent-280, score-0.338]
63 Comparison of different attribute embeddings: {0, 1} embedding, {−1, +1} embedding and mean-centered embedding, with and without ℓ2-normalization. [sent-295, score-0.596]
64 For these experiments, the attribute vectors are encoded in a binary fashion (using {0, 1}) and ℓ2-normalized. [sent-299, score-0.308]
65 We experiment with a {0, 1} embedding, a {−1, +1} embedding and a mean-centered embedding. [sent-307, score-0.614]
66 Underlying the {0, 1} embedding is the assumption that the presence of the same attribute in two classes should contribute to their similarity, but not its absence. [sent-310, score-0.707]
67 Underlying the {−1, 1} embedding is the assumption that the presence or the absence of the same attribute in two classes should contribute equally to their similarity. [sent-311, score-0.707]
68 For instance, if an attribute appears in almost all classes, then in the mean-centered embedding, its absence will contribute more to the similarity than its presence. [sent-313, score-0.308]
69 In what follows, we make use of the simple {0, 1} embedding with ℓ2-normalization. [sent-320, score-0.307]
70 In DAP, given a new image x, we assign it to the class y with the highest score according to equation (12). (Footnote 3: Here we assume a dot-product similarity between attribute embeddings, which is consistent with our linear compatibility function (4).) [sent-325, score-0.656]
71 Our DAP results on AWA are lower than those reported in [16] because we use only half of the data to train the attribute classifiers. [sent-342, score-0.308]
72 Right 2 columns: attribute prediction accuracy (AUC in %) on the 85 AWA and 312 CUB attributes. [sent-343, score-0.352]
73 ∏_{e=1}^{E} p(ae = ρy,e|x) (12) where ρy,e is the association measure between attribute ae and class y, and p(ae = 1|x) is the probability that image x contains attribute e. [sent-345, score-0.668]
74 We train for each attribute one linear classifier on the FVs. [sent-346, score-0.289]
75 We use a (regularized) logistic loss which provides an attribute classification accuracy similar to the SVM but with the added benefit that its output is already a probability. [sent-347, score-0.352]
76 Hence, our approach seems to be more beneficial when the attribute quality is higher. [sent-349, score-0.289]
77 In ALE, each column of W can be interpreted as an attribute classifier, and θ(x)⊤W as a vector of attribute scores for image x. [sent-352, score-0.289]
78 However, one major difference with DAP is that we do not optimize for attribute classification accuracy. [sent-354, score-0.352]
79 We therefore measured the attribute prediction accuracy of DAP and ALE. [sent-356, score-0.352]
80 As expected, the attribute prediction accuracy of DAP is higher than that of our approach. [sent-359, score-0.352]
81 Thus, our learned attribute classifiers should still be interpretable. [sent-363, score-0.331]
82 Classification accuracy on AWA (left) and CUB (right) as a function of the label embedding dimensionality. [sent-366, score-0.395]
83 We compare the baseline which uses all attributes, with an SVD dimensionality reduction and a sampling of attributes (we report the mean and standard deviation over 10 samplings). [sent-367, score-0.309]
84 Comparison of attributes (ALE) and hierarchies (HLE) for label embedding. [sent-377, score-0.381]
85 We explore two different techniques: Singular Value Decomposition (SVD) and attribute sampling. [sent-382, score-0.289]
86 From these experiments, we can conclude that there is a significant amount of correlation between attributes and that the output space dimensionality can be significantly reduced with little accuracy loss. [sent-389, score-0.348]
87 As expected, SVD outperforms a random sampling of the attribute dimensions. [sent-396, score-0.289]
88 As mentioned earlier, while attributes can be a useful source of prior information to embed classes, other sources exist. [sent-398, score-0.41]
89 For each attribute we show the images ranked highest. [sent-413, score-0.289]
90 We explore different alternatives such as the concatenation of the embeddings or performing CCA on the embeddings. [sent-418, score-0.289]
91 On AWA, the combination performs better than attributes or the hierarchy alone, while on CUB, there is no improvement through the combination, certainly because the hierarchy adds little additional information. [sent-424, score-0.425]
92 We compare ALE with WSABIE [41] which performs label embedding and therefore “shares” samples between classes but does not use prior information. [sent-453, score-0.553]
93 One advantage of WSABIE with respect to ALE is that the embedding space dimensionality can be tuned, thus giving more flexibility when larger amounts of training data become available. [sent-457, score-0.364]
94 As an example, ALE with 2 training samples performs on par with WSABIE with 20 training samples, showing that attributes can compensate for limited training data. [sent-460, score-0.423]
95 We compare three embedding techniques: ALE (attributes only), HLE (hierarchy only), AHLE (attributes and hierarchy). [sent-465, score-0.307]
96 Second, our model can leverage labeled training data (if available) to update the label embedding, using the attribute embedding as a prior. [sent-482, score-0.8]
97 Third, the label embedding framework is not restricted to attributes and can accommodate other sources of prior information such as class taxonomies. [sent-483, score-0.851]
98 In the few-shots setting, we showed improvements with respect to WSABIE, which learns the label embedding from labeled data but does not leverage prior information. [sent-485, score-0.504]
99 Learning to detect unseen object classes by between-class attribute transfer. [sent-596, score-0.381]
100 A joint learning framework for attribute models and object descriptions. [sent-613, score-0.33]
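The following sketches illustrate the methods described in the sentence excerpts above. First, a minimal numpy sketch of the bilinear compatibility model and the attribute label embedding of sentences 35-37 and 41-42: F(x, y; W) = θ(x)⊤ W ϕA(y), with ΦA the E × C matrix of per-class attribute vectors. This is not the authors' code; the dimensions, random data and ℓ2-normalization choice are illustrative assumptions.

```python
import numpy as np

# Example sizes (assumptions): E attributes, C classes, D-dim image features.
E, C, D = 85, 50, 4096
rng = np.random.default_rng(0)

# Attribute label embedding: Phi_A is the E x C matrix stacking the per-class
# attribute vectors phi_A(y) = [rho_{y,1}, ..., rho_{y,E}] (binary {0,1} here),
# l2-normalized column-wise as in the experiments described above.
Phi_A = rng.integers(0, 2, size=(E, C)).astype(float)
Phi_A /= np.linalg.norm(Phi_A, axis=0, keepdims=True) + 1e-12

W = rng.normal(scale=0.01, size=(D, E))   # parameters of the compatibility function

def compatibility(theta_x, W, Phi_A):
    """F(x, y; W) = theta(x)^T W phi_A(y), evaluated for all classes at once -> (C,)."""
    return theta_x @ W @ Phi_A

def predict(theta_x, W, Phi_A):
    """Zero-shot prediction: assign x to the class with maximal compatibility."""
    return int(np.argmax(compatibility(theta_x, W, Phi_A)))

theta_x = rng.normal(size=D)              # image embedding theta(x), e.g. a Fisher Vector
print(predict(theta_x, W, Phi_A))
```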
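Sentences 2 and 46-47 state that W is learned so that, given an image, the correct class ranks higher than the incorrect ones, drawing inspiration from WSABIE. The sketch below implements only a plain multiclass ranking (hinge) SGD step, not the exact WSABIE weighted-approximate-rank objective; the learning rate and margin are assumed values.

```python
import numpy as np

def ranking_sgd_step(W, theta_x, y_true, Phi_A, lr=0.01, margin=1.0):
    """One SGD step on a multiclass ranking hinge loss: push F(x, y_true; W) above
       F(x, y; W) + margin for the most violating incorrect class y."""
    scores = theta_x @ W @ Phi_A              # compatibilities F(x, y; W) for all classes
    scores_wrong = scores.copy()
    scores_wrong[y_true] = -np.inf            # exclude the correct class
    y_bad = int(np.argmax(scores_wrong))      # most violating incorrect class
    if margin + scores[y_bad] - scores[y_true] > 0:
        # subgradient of the hinge w.r.t. W: theta(x) (phi(y_bad) - phi(y_true))^T
        grad = np.outer(theta_x, Phi_A[:, y_bad] - Phi_A[:, y_true])
        W = W - lr * grad
    return W
```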
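A minimal sketch of the DAP decision rule of sentence 73 (equation 12): per-attribute probabilities p(ae = 1|x) are combined multiplicatively into class-level scores. The original DAP formulation additionally normalizes by attribute priors; that step is omitted here, and the data below is synthetic.

```python
import numpy as np

def dap_scores(p_attr, rho):
    """p_attr: (E,) probabilities p(a_e = 1 | x) from the E attribute classifiers.
       rho:    (E, C) binary association matrix with entries rho_{y,e}.
       Returns (C,) class scores prod_e p(a_e = rho_{y,e} | x), cf. equation (12)."""
    # p(a_e = rho_{y,e} | x) equals p(a_e = 1 | x) when rho = 1, else 1 - p(a_e = 1 | x)
    per_attr = rho * p_attr[:, None] + (1.0 - rho) * (1.0 - p_attr[:, None])
    # sum log-probabilities to avoid numerical underflow over many attributes
    return np.exp(np.log(per_attr + 1e-12).sum(axis=0))

rng = np.random.default_rng(1)
p_attr = rng.uniform(size=85)                       # e.g. outputs of logistic attribute classifiers
rho = rng.integers(0, 2, size=(85, 40)).astype(float)
print(int(np.argmax(dap_scores(p_attr, rho))))      # DAP class prediction
```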
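A sketch of the hierarchy label embedding of sentences 53-54 (equation 11), assuming ξy,z indicates that class z is an ancestor of (or equal to) class y in the class hierarchy; the excerpts do not spell out the definition of ξ, so this is an assumption. As sentences 56-57 note, ϕA and ϕH can then be combined, e.g. by simple concatenation.

```python
import numpy as np

def hierarchy_embedding(y, parent):
    """parent: dict mapping each of the C class indices to its parent index (None at the root).
       Returns the C-dim vector [xi_{y,1}, ..., xi_{y,C}], with xi_{y,z} = 1 when z is an
       ancestor of (or equal to) y -- an assumed definition, see the note above."""
    C = len(parent)
    phi = np.zeros(C)
    node = y
    while node is not None:
        phi[node] = 1.0
        node = parent[node]
    return phi

# Toy 5-class hierarchy: 0 is the root, 1 and 2 its children, 3 and 4 children of 2.
parent = {0: None, 1: 0, 2: 0, 3: 2, 4: 2}
print(hierarchy_embedding(3, parent))   # -> [1. 0. 1. 1. 0.]
```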
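A sketch of the SVD-based reduction of the output-space dimensionality discussed in sentences 82-87: project the E × C attribute embedding matrix onto its top-k left singular vectors to obtain decorrelated, lower-dimensional class embeddings. The paper does not give this exact implementation in the excerpts, so this is one plausible realization.

```python
import numpy as np

def reduce_attribute_embeddings(Phi_A, k):
    """Project the E x C attribute embedding matrix onto its top-k left singular vectors,
       giving decorrelated k-dimensional class embeddings (k x C)."""
    U, S, Vt = np.linalg.svd(Phi_A, full_matrices=False)
    return U[:, :k].T @ Phi_A

rng = np.random.default_rng(2)
Phi_A = rng.integers(0, 2, size=(85, 50)).astype(float)
print(reduce_attribute_embeddings(Phi_A, k=20).shape)   # (20, 50)
```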
wordName wordTfidf (topN-words)
[('wsabie', 0.376), ('cub', 0.338), ('awa', 0.319), ('embedding', 0.307), ('attribute', 0.289), ('attributes', 0.265), ('ale', 0.257), ('embeddings', 0.257), ('dap', 0.249), ('classes', 0.092), ('ahle', 0.089), ('label', 0.088), ('hierarchy', 0.08), ('cca', 0.073), ('yn', 0.072), ('hle', 0.069), ('side', 0.067), ('svd', 0.066), ('sources', 0.065), ('ovr', 0.063), ('prediction', 0.063), ('compatibility', 0.058), ('embed', 0.054), ('zeroshot', 0.054), ('class', 0.052), ('ranking', 0.051), ('weston', 0.05), ('accommodate', 0.048), ('xn', 0.048), ('ssvm', 0.047), ('labeled', 0.047), ('objective', 0.042), ('learning', 0.041), ('samples', 0.04), ('fv', 0.039), ('multiclass', 0.038), ('taxonomy', 0.038), ('auc', 0.037), ('leverage', 0.036), ('dataindependent', 0.036), ('fvs', 0.036), ('wintoh', 0.036), ('xrce', 0.036), ('classification', 0.034), ('training', 0.033), ('rank', 0.033), ('learn', 0.033), ('ecml', 0.032), ('concatenation', 0.032), ('intermediate', 0.031), ('correlation', 0.03), ('ridge', 0.03), ('devoted', 0.03), ('lear', 0.029), ('fisher', 0.029), ('output', 0.029), ('optimize', 0.029), ('hierarchies', 0.028), ('mahajan', 0.026), ('perronnin', 0.026), ('animals', 0.026), ('prior', 0.026), ('inputs', 0.026), ('larochelle', 0.025), ('dimensionality', 0.024), ('kulkarni', 0.024), ('wordnet', 0.024), ('taxonomies', 0.024), ('hsu', 0.024), ('mensink', 0.024), ('rohrbach', 0.024), ('decorrelation', 0.023), ('amit', 0.022), ('rr', 0.022), ('learned', 0.022), ('wah', 0.021), ('kernel', 0.021), ('stark', 0.021), ('comparatively', 0.021), ('stacks', 0.021), ('principled', 0.02), ('anchez', 0.02), ('strategy', 0.02), ('baseline', 0.02), ('branson', 0.02), ('semantic', 0.02), ('classifiers', 0.02), ('france', 0.02), ('ae', 0.019), ('douze', 0.019), ('half', 0.019), ('verbeek', 0.019), ('advantageous', 0.019), ('nips', 0.019), ('association', 0.019), ('contribute', 0.019), ('par', 0.019), ('fashion', 0.019), ('regularized', 0.018), ('vedaldi', 0.018)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999988 241 cvpr-2013-Label-Embedding for Attribute-Based Classification
Author: Zeynep Akata, Florent Perronnin, Zaid Harchaoui, Cordelia Schmid
Abstract: Attributes are an intermediate representation, which enables parameter sharing between classes, a must when training data is scarce. We propose to view attribute-based image classification as a label-embedding problem: each class is embedded in the space of attribute vectors. We introduce a function which measures the compatibility between an image and a label embedding. The parameters of this function are learned on a training set of labeled samples to ensure that, given an image, the correct classes rank higher than the incorrect ones. Results on the Animals With Attributes and Caltech-UCSD-Birds datasets show that the proposed framework outperforms the standard Direct Attribute Prediction baseline in a zero-shot learning scenario. The label embedding framework offers other advantages such as the ability to leverage alternative sources of information in addition to attributes (e.g. class hierarchies) or to transition smoothly from zero-shot learning to learning with large quantities of data.
2 0.36487374 116 cvpr-2013-Designing Category-Level Attributes for Discriminative Visual Recognition
Author: Felix X. Yu, Liangliang Cao, Rogerio S. Feris, John R. Smith, Shih-Fu Chang
Abstract: Attribute-based representation has shown great promises for visual recognition due to its intuitive interpretation and cross-category generalization property. However, human efforts are usually involved in the attribute designing process, making the representation costly to obtain. In this paper, we propose a novel formulation to automatically design discriminative “category-level attributes ”, which can be efficiently encoded by a compact category-attribute matrix. The formulation allows us to achieve intuitive and critical design criteria (category-separability, learnability) in a principled way. The designed attributes can be used for tasks of cross-category knowledge transfer, achieving superior performance over well-known attribute dataset Animals with Attributes (AwA) and a large-scale ILSVRC2010 dataset (1.2M images). This approach also leads to state-ofthe-art performance on the zero-shot learning task on AwA.
3 0.22966821 36 cvpr-2013-Adding Unlabeled Samples to Categories by Learned Attributes
Author: Jonghyun Choi, Mohammad Rastegari, Ali Farhadi, Larry S. Davis
Abstract: We propose a method to expand the visual coverage of training sets that consist of a small number of labeled examples using learned attributes. Our optimization formulation discovers category specific attributes as well as the images that have high confidence in terms of the attributes. In addition, we propose a method to stably capture example-specific attributes for a small sized training set. Our method adds images to a category from a large unlabeled image pool, and leads to significant improvement in category recognition accuracy evaluated on a large-scale dataset, ImageNet.
4 0.22280677 229 cvpr-2013-It's Not Polite to Point: Describing People with Uncertain Attributes
Author: Amir Sadovnik, Andrew Gallagher, Tsuhan Chen
Abstract: Visual attributes are powerful features for many different applications in computer vision such as object detection and scene recognition. Visual attributes present another application that has not been examined as rigorously: verbal communication from a computer to a human. Since many attributes are nameable, the computer is able to communicate these concepts through language. However, this is not a trivial task. Given a set of attributes, selecting a subset to be communicated is task dependent. Moreover, because attribute classifiers are noisy, it is important to find ways to deal with this uncertainty. We address the issue of communication by examining the task of composing an automatic description of a person in a group photo that distinguishes him from the others. We introduce an efficient, principled method for choosing which attributes are included in a short description to maximize the likelihood that a third party will correctly guess to which person the description refers. We compare our algorithm to computer baselines and human describers, and show the strength of our method in creating effective descriptions.
5 0.2042508 461 cvpr-2013-Weakly Supervised Learning for Attribute Localization in Outdoor Scenes
Author: Shuo Wang, Jungseock Joo, Yizhou Wang, Song-Chun Zhu
Abstract: In this paper, we propose a weakly supervised method for simultaneously learning scene parts and attributes from a collection of images associated with attributes in text, where the precise localization of each attribute is left unknown. Our method includes three aspects. (i) Compositional scene configuration. We learn the spatial layouts of the scene by Hierarchical Space Tiling (HST) representation, which can generate an excessive number of scene configurations through the hierarchical composition of a relatively small number of parts. (ii) Attribute association. The scene attributes contain nouns and adjectives corresponding to the objects and their appearance descriptions respectively. We assign the nouns to the nodes (parts) in HST using nonmaximum suppression of their correlation, then train an appearance model for each noun+adjective attribute pair. (iii) Joint inference and learning. For an image, we compute the most probable parse tree with the attributes as an instantiation of the HST by dynamic programming. Then update the HST and attribute association based on the inferred parse trees. We evaluate the proposed method by (i) showing the improvement of attribute recognition accuracy; and (ii) comparing the average precision of localizing attributes to the scene parts.
6 0.20250143 85 cvpr-2013-Complex Event Detection via Multi-source Video Attributes
7 0.20110162 101 cvpr-2013-Cumulative Attribute Space for Age and Crowd Density Estimation
8 0.19115734 293 cvpr-2013-Multi-attribute Queries: To Merge or Not to Merge?
9 0.18065017 48 cvpr-2013-Attribute-Based Detection of Unfamiliar Classes with Humans in the Loop
10 0.15909271 396 cvpr-2013-Simultaneous Active Learning of Classifiers & Attributes via Relative Feedback
11 0.15463108 179 cvpr-2013-From N to N+1: Multiclass Transfer Incremental Learning
12 0.1541459 348 cvpr-2013-Recognizing Activities via Bag of Words for Attribute Dynamics
13 0.14656073 146 cvpr-2013-Enriching Texture Analysis with Semantic Data
14 0.14281867 310 cvpr-2013-Object-Centric Anomaly Detection by Attribute-Based Reasoning
15 0.11091744 462 cvpr-2013-Weakly Supervised Learning of Mid-Level Features with Beta-Bernoulli Process Restricted Boltzmann Machines
16 0.10978945 99 cvpr-2013-Cross-View Image Geolocalization
17 0.10683892 153 cvpr-2013-Expanded Parts Model for Human Attribute and Action Recognition in Still Images
18 0.10406834 112 cvpr-2013-Dense Segmentation-Aware Descriptors
19 0.10111265 223 cvpr-2013-Inductive Hashing on Manifolds
20 0.097601034 323 cvpr-2013-POOF: Part-Based One-vs.-One Features for Fine-Grained Categorization, Face Verification, and Attribute Estimation
topicId topicWeight
[(0, 0.159), (1, -0.127), (2, -0.059), (3, -0.016), (4, 0.133), (5, 0.124), (6, -0.29), (7, 0.057), (8, 0.052), (9, 0.21), (10, -0.058), (11, 0.103), (12, -0.078), (13, -0.005), (14, 0.053), (15, 0.044), (16, -0.077), (17, -0.049), (18, -0.036), (19, 0.09), (20, -0.004), (21, 0.003), (22, -0.016), (23, 0.022), (24, 0.026), (25, -0.002), (26, -0.028), (27, 0.044), (28, -0.021), (29, 0.005), (30, -0.003), (31, -0.011), (32, 0.029), (33, -0.031), (34, -0.019), (35, -0.01), (36, -0.005), (37, -0.016), (38, 0.018), (39, 0.042), (40, -0.023), (41, 0.017), (42, 0.021), (43, -0.022), (44, -0.024), (45, -0.014), (46, -0.014), (47, 0.052), (48, 0.015), (49, -0.038)]
simIndex simValue paperId paperTitle
1 0.95807832 116 cvpr-2013-Designing Category-Level Attributes for Discriminative Visual Recognition
Author: Felix X. Yu, Liangliang Cao, Rogerio S. Feris, John R. Smith, Shih-Fu Chang
Abstract: Attribute-based representation has shown great promises for visual recognition due to its intuitive interpretation and cross-category generalization property. However, human efforts are usually involved in the attribute designing process, making the representation costly to obtain. In this paper, we propose a novel formulation to automatically design discriminative “category-level attributes ”, which can be efficiently encoded by a compact category-attribute matrix. The formulation allows us to achieve intuitive and critical design criteria (category-separability, learnability) in a principled way. The designed attributes can be used for tasks of cross-category knowledge transfer, achieving superior performance over well-known attribute dataset Animals with Attributes (AwA) and a large-scale ILSVRC2010 dataset (1.2M images). This approach also leads to state-ofthe-art performance on the zero-shot learning task on AwA.
same-paper 2 0.9400838 241 cvpr-2013-Label-Embedding for Attribute-Based Classification
Author: Zeynep Akata, Florent Perronnin, Zaid Harchaoui, Cordelia Schmid
Abstract: Attributes are an intermediate representation, which enables parameter sharing between classes, a must when training data is scarce. We propose to view attribute-based image classification as a label-embedding problem: each class is embedded in the space of attribute vectors. We introduce a function which measures the compatibility between an image and a label embedding. The parameters of this function are learned on a training set of labeled samples to ensure that, given an image, the correct classes rank higher than the incorrect ones. Results on the Animals With Attributes and Caltech-UCSD-Birds datasets show that the proposed framework outperforms the standard Direct Attribute Prediction baseline in a zero-shot learning scenario. The label embedding framework offers other advantages such as the ability to leverage alternative sources of information in addition to attributes (e.g. class hierarchies) or to transition smoothly from zero-shot learning to learning with large quantities of data.
3 0.92089325 48 cvpr-2013-Attribute-Based Detection of Unfamiliar Classes with Humans in the Loop
Author: Catherine Wah, Serge Belongie
Abstract: Recent work in computer vision has addressed zero-shot learning or unseen class detection, which involves categorizing objects without observing any training examples. However, these problems assume that attributes or defining characteristics of these unobserved classes are known, leveraging this information at test time to detect an unseen class. We address the more realistic problem of detecting categories that do not appear in the dataset in any form. We denote such a category as an unfamiliar class; it is neither observed at train time, nor do we possess any knowledge regarding its relationships to attributes. This problem is one that has received limited attention within the computer vision community. In this work, we propose a novel approach to the unfamiliar class detection task that builds on attribute-based classification methods, and we empirically demonstrate how classification accuracy is impacted by attribute noise and dataset “difficulty,” as quantified by the separation of classes in the attribute space. We also present a method for incorporating human users to overcome deficiencies in attribute detection. We demonstrate results superior to existing methods on the challenging CUB-200-2011 dataset.
4 0.86920995 310 cvpr-2013-Object-Centric Anomaly Detection by Attribute-Based Reasoning
Author: Babak Saleh, Ali Farhadi, Ahmed Elgammal
Abstract: When describing images, humans tend not to talk about the obvious, but rather mention what they find interesting. We argue that abnormalities and deviations from typicalities are among the most important components that form what is worth mentioning. In this paper we introduce the abnormality detection as a recognition problem and show how to model typicalities and, consequently, meaningful deviations from prototypical properties of categories. Our model can recognize abnormalities and report the main reasons of any recognized abnormality. We also show that abnormality predictions can help image categorization. We introduce the abnormality detection dataset and show interesting results on how to reason about abnormalities.
5 0.86760652 229 cvpr-2013-It's Not Polite to Point: Describing People with Uncertain Attributes
Author: Amir Sadovnik, Andrew Gallagher, Tsuhan Chen
Abstract: Visual attributes are powerful features for many different applications in computer vision such as object detection and scene recognition. Visual attributes present another application that has not been examined as rigorously: verbal communication from a computer to a human. Since many attributes are nameable, the computer is able to communicate these concepts through language. However, this is not a trivial task. Given a set of attributes, selecting a subset to be communicated is task dependent. Moreover, because attribute classifiers are noisy, it is important to find ways to deal with this uncertainty. We address the issue of communication by examining the task of composing an automatic description of a person in a group photo that distinguishes him from the others. We introduce an efficient, principled method for choosing which attributes are included in a short description to maximize the likelihood that a third party will correctly guess to which person the description refers. We compare our algorithm to computer baselines and human describers, and show the strength of our method in creating effective descriptions.
6 0.86574024 293 cvpr-2013-Multi-attribute Queries: To Merge or Not to Merge?
7 0.85250318 396 cvpr-2013-Simultaneous Active Learning of Classifiers & Attributes via Relative Feedback
8 0.81303418 461 cvpr-2013-Weakly Supervised Learning for Attribute Localization in Outdoor Scenes
9 0.76107413 36 cvpr-2013-Adding Unlabeled Samples to Categories by Learned Attributes
10 0.74261642 101 cvpr-2013-Cumulative Attribute Space for Age and Crowd Density Estimation
11 0.70843422 85 cvpr-2013-Complex Event Detection via Multi-source Video Attributes
12 0.68041295 348 cvpr-2013-Recognizing Activities via Bag of Words for Attribute Dynamics
13 0.64426816 146 cvpr-2013-Enriching Texture Analysis with Semantic Data
14 0.5828703 99 cvpr-2013-Cross-View Image Geolocalization
15 0.51000178 463 cvpr-2013-What's in a Name? First Names as Facial Attributes
18 0.42305344 353 cvpr-2013-Relative Hidden Markov Models for Evaluating Motion Skill
19 0.40526125 153 cvpr-2013-Expanded Parts Model for Human Attribute and Action Recognition in Still Images
20 0.38328567 174 cvpr-2013-Fine-Grained Crowdsourcing for Fine-Grained Recognition
topicId topicWeight
[(10, 0.094), (15, 0.23), (16, 0.03), (26, 0.042), (28, 0.018), (33, 0.316), (67, 0.057), (69, 0.03), (72, 0.01), (77, 0.01), (80, 0.014), (87, 0.061)]
simIndex simValue paperId paperTitle
1 0.8933984 41 cvpr-2013-An Iterated L1 Algorithm for Non-smooth Non-convex Optimization in Computer Vision
Author: Peter Ochs, Alexey Dosovitskiy, Thomas Brox, Thomas Pock
Abstract: Natural image statistics indicate that we should use nonconvex norms for most regularization tasks in image processing and computer vision. Still, they are rarely used in practice due to the challenge of optimizing them. Recently, iteratively reweighted ℓ1 minimization has been proposed as a way to tackle a class of non-convex functions by solving a sequence of convex ℓ2-ℓ1 problems. Here we extend the problem class to linearly constrained optimization of a Lipschitz continuous function, which is the sum of a convex function and a function being concave and increasing on the non-negative orthant (possibly non-convex and nonconcave on the whole space). This allows us to apply the algorithm to many computer vision tasks. We show the effect of non-convex regularizers on image denoising, deconvolution, optical flow, and depth map fusion. Non-convexity is particularly interesting in combination with total generalized variation and learned image priors. Efficient optimization is made possible by some important properties that are shown to hold.
2 0.88491631 21 cvpr-2013-A New Perspective on Uncalibrated Photometric Stereo
Author: Thoma Papadhimitri, Paolo Favaro
Abstract: We investigate the problem of reconstructing normals, albedo and lights of Lambertian surfaces in uncalibrated photometric stereo under the perspective projection model. Our analysis is based on establishing the integrability constraint. In the orthographic projection case, it is well-known that when such a constraint is imposed, a solution can be identified only up to 3 parameters, the so-called generalized bas-relief (GBR) ambiguity. We show that in the perspective projection case the solution is unique. We also propose a closed-form solution which is simple, efficient and robust. We test our algorithm on synthetic data and publicly available real data. Our quantitative tests show that our method outperforms all prior work on uncalibrated photometric stereo under orthographic projection.
same-paper 3 0.86829448 241 cvpr-2013-Label-Embedding for Attribute-Based Classification
Author: Zeynep Akata, Florent Perronnin, Zaid Harchaoui, Cordelia Schmid
Abstract: Attributes are an intermediate representation, which enables parameter sharing between classes, a must when training data is scarce. We propose to view attribute-based image classification as a label-embedding problem: each class is embedded in the space of attribute vectors. We introduce a function which measures the compatibility between an image and a label embedding. The parameters of this function are learned on a training set of labeled samples to ensure that, given an image, the correct classes rank higher than the incorrect ones. Results on the Animals With Attributes and Caltech-UCSD-Birds datasets show that the proposed framework outperforms the standard Direct Attribute Prediction baseline in a zero-shot learning scenario. The label embedding framework offers other advantages such as the ability to leverage alternative sources of information in addition to attributes (e.g. class hierarchies) or to transition smoothly from zero-shot learning to learning with large quantities of data.
4 0.85838288 38 cvpr-2013-All About VLAD
Author: unkown-author
Abstract: The objective of this paper is large scale object instance retrieval, given a query image. A starting point of such systems is feature detection and description, for example using SIFT. The focus of this paper, however, is towards very large scale retrieval where, due to storage requirements, very compact image descriptors are required and no information about the original SIFT descriptors can be accessed directly at run time. We start from VLAD, the state-of-the-art compact descriptor introduced by Jégou et al. [8] for this purpose, and make three novel contributions: first, we show that a simple change to the normalization method significantly improves retrieval performance; second, we show that vocabulary adaptation can substantially alleviate problems caused when images are added to the dataset after initial vocabulary learning. These two methods set a new state-of-the-art over all benchmarks investigated here for both mid-dimensional (20k-D to 30k-D) and small (128-D) descriptors. Our third contribution is a multiple spatial VLAD representation, MultiVLAD, that allows the retrieval and localization of objects that only extend over a small part of an image (again without requiring use of the original image SIFT descriptors).
5 0.85517418 438 cvpr-2013-Towards Pose Robust Face Recognition
Author: Dong Yi, Zhen Lei, Stan Z. Li
Abstract: Most existing pose robust methods are too computationally complex to meet practical applications and their performance under unconstrained environments is rarely evaluated. In this paper, we propose a novel method for pose robust face recognition towards practical applications, which is fast, pose robust and can work well under unconstrained environments. Firstly, a 3D deformable model is built and a fast 3D model fitting algorithm is proposed to estimate the pose of the face image. Secondly, a group of Gabor filters are transformed according to the pose and shape of the face image for feature extraction. Finally, PCA is applied on the pose adaptive Gabor features to remove the redundancies and the Cosine metric is used to evaluate the similarity. The proposed method has three advantages: (1) The pose correction is applied in the filter space rather than image space, which makes our method less affected by the precision of the 3D model; (2) By combining the holistic pose transformation and local Gabor filtering, the final feature is robust to pose and other negative factors in face recognition; (3) The 3D structure and facial symmetry are successfully used to deal with self-occlusion. Extensive experiments on FERET and PIE show the proposed method outperforms state-of-the-art methods significantly; meanwhile, the method works well on LFW.
6 0.82691896 82 cvpr-2013-Class Generative Models Based on Feature Regression for Pose Estimation of Object Categories
7 0.82684982 206 cvpr-2013-Human Pose Estimation Using Body Parts Dependent Joint Regressors
8 0.82679904 43 cvpr-2013-Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs
9 0.82670224 202 cvpr-2013-Hierarchical Saliency Detection
10 0.82638812 284 cvpr-2013-Mesh Based Semantic Modelling for Indoor and Outdoor Scenes
11 0.82582033 355 cvpr-2013-Representing Videos Using Mid-level Discriminative Patches
12 0.82577819 299 cvpr-2013-Multi-source Multi-scale Counting in Extremely Dense Crowd Images
13 0.82572746 380 cvpr-2013-Scene Coordinate Regression Forests for Camera Relocalization in RGB-D Images
14 0.82528633 196 cvpr-2013-HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences
15 0.82514405 450 cvpr-2013-Unsupervised Joint Object Discovery and Segmentation in Internet Images
16 0.82513899 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds
17 0.82502085 124 cvpr-2013-Determining Motion Directly from Normal Flows Upon the Use of a Spherical Eye Platform
18 0.82500225 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases
19 0.82497966 168 cvpr-2013-Fast Object Detection with Entropy-Driven Evaluation
20 0.82490051 36 cvpr-2013-Adding Unlabeled Samples to Categories by Learned Attributes