iccv iccv2013 iccv2013-53 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Naman Turakhia, Devi Parikh
Abstract: When we look at an image, some properties or attributes of the image stand out more than others. When describing an image, people are likely to describe these dominant attributes first. Attribute dominance is a result of a complex interplay between the various properties present or absent in the image. Which attributes in an image are more dominant than others reveals rich information about the content of the image. In this paper we tap into this information by modeling attribute dominance. We show that this helps improve the performance of vision systems on a variety of human-centric applications such as zero-shot learning, image search and generating textual descriptions of images.
Reference: text
sentIndex sentText sentNum sentScore
1 When describing an image, people are likely to describe these dominant attributes first. [sent-4, score-0.65]
2 Attribute dominance is a result of a complex interplay between the various properties present or absent in the image. [sent-5, score-0.823]
3 Which attributes in an image are more dominant than others reveals rich information about the content of the image. [sent-6, score-0.535]
4 While all attributes are by definition semantic visual concepts we care about, different attributes dominate different images or categories. [sent-16, score-0.843]
5 An attribute may be dominant in a visual concept due to a variety of reasons such as strong presence, unusualness, absence of other more dominant attributes, etc. [sent-18, score-0.605]
6 It is relatively uncommon for people to have a beard and wear glasses, making these attributes dominant in Figure 1 (f). [sent-22, score-0.594]
7 In general, attribute dominance is different from the relative strength of an attribute in an image. [sent-134, score-1.376]
8 Relative attributes [24] compare the strength of an attribute across images. [sent-135, score-0.72]
9 Attribute dominance compares the relative importance of different attributes within an image or category. [sent-136, score-1.209]
10 Attribute dominance is an image- or category-specific phenomenon: a manifestation of a complex interplay among all attributes present (or absent) in the image or category. [sent-137, score-1.196]
11 Attribute dominance is worth modeling because it affects how humans perceive and describe images. [sent-139, score-1.11]
12 Attribute dominance affects which attributes humans tend to name in these scenarios, and in which order. [sent-142, score-1.226]
13 Tapping into this information by modeling attribute dominance is a step towards enhancing communication between humans and machines, and can lead to improved performance of computer vision systems in humancentric applications such as zero-shot learning and image search. [sent-144, score-1.095]
14 Machine generated textual descriptions of images that reason about attribute dominance are also more likely to be natural and easily understandable by humans. [sent-145, score-1.196]
15 In our work, by modeling the dominance of attributes in images, we enhance this channel of communication. [sent-159, score-1.163]
16 Our work takes an orthogonal perspective: rather than using a more detailed (and hence likely more cumbersome) mode of supervision, we propose to model attribute dominance to extract more information from existing modes of supervision. [sent-164, score-1.068]
17 Modeling attribute dominance allows us to inject the user’s subjectivity into the search results without explicitly eliciting feedback or more detailed queries. [sent-167, score-1.084]
18 We focus on the constrained problem of predicting which attributes are dominant or important. [sent-185, score-0.563]
19 Unlike [1], in addition to predicting which attributes are likely to be named, we also model the order in which the attributes are likely to be named. [sent-186, score-0.908]
20 Note that attribute dominance is not meant to capture user-specific preferences (e. [sent-187, score-1.06]
21 Reading between the lines: We propose exploiting the order in which humans name attributes when describing an image. [sent-191, score-0.505]
22 This can be thought of as reading between the lines of what the user is saying, and not simply taking the description (which attributes are stated to be present or absent) at face value. [sent-192, score-0.556]
23 Combining object and attribute dominance is an obvious direction for future work. [sent-195, score-1.04]
24 Approach. We first describe how we annotate attribute dominance in images (Section 3. [sent-199, score-1.073]
25 We then present our model for predicting attribute dominance in a novel image (Section 3. [sent-201, score-1.068]
26 Finally, we describe how we use our attribute dominance predictors for three applications: zero-shot learning (Section 3. [sent-203, score-1.096]
27 Annotating Attribute Dominance. We annotate attribute dominance in images to train our attribute dominance predictor, and use it as ground truth at test time to evaluate our approach. [sent-209, score-2.129]
28 We collect dominance annotations at the category level, although our approach trivially generalizes to image-level dominance annotations as well. [sent-211, score-1.552]
29 If the user had to describe the category using one of these two attributes, "$a_m$ is present" or "$a_m$? [sent-239, score-0.518]
30 If a subject's responses include a statement that is inconsistent with the category, we remove his responses from the data. (Note: the presence or absence of an attribute in a category is distinct from whether that attribute is dominant in the category or not.) [sent-246, score-0.92]
31 Since the presence of an attribute may be dominant in some images, but its absence may be dominant in others, we model them separately. [sent-261, score-0.64]
32 The dominance $d_{nm}$ of attribute $a_m$ in category $C_n$ is defined to be the number of subjects that selected $a_m$ when it appeared as one of the options. [sent-274, score-1.14]
33 We now have the ground truth dominance value for all 2M attributes in all N categories. [sent-286, score-1.212]
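A rough sketch of this counting definition (Equation 1), assuming a simple data layout of our own for the annotation responses (the names and format below are hypothetical, not from the paper):

```python
from collections import Counter

def dominance_counts(responses):
    """Compute d_nm for one category: the number of subjects that selected
    attribute a_m when it appeared as one of the two offered options."""
    counts = Counter()
    for option_a, option_b, chosen in responses:
        assert chosen in (option_a, option_b)
        counts[chosen] += 1
    return counts

# Toy example: three responses for one category.
responses = [
    ("has beard", "smiling", "has beard"),
    ("has beard", "no glasses", "has beard"),
    ("smiling", "no glasses", "smiling"),
]
print(dominance_counts(responses))  # Counter({'has beard': 2, 'smiling': 1})
```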
34 We now describe our approach to predicting dominance of an attribute in a novel image. [sent-290, score-1.101]
35 Modeling Attribute Dominance. Given a novel image $x_t$, we predict the dominance $\hat{d}_{tm}$ of attribute $m$ in that image using $\hat{d}_{tm} = w_m^T \phi(x_t)$ (Equation 2). We represent image $x_t$ via an image descriptor. [sent-293, score-1.102]
36 This exposes the complex interplay among attributes discussed in the introduction that leads to the dominance of certain attributes in an image and not others. [sent-295, score-1.608]
37 For training, we project the category-level attribute dominance annotations to each training image. [sent-298, score-1.065]
38 This gives us image and attribute dominance pairs $\{(x_p, d_{pm})\}$ for each attribute $a_m$. [sent-308, score-1.329]
39 The learnt parameters $w_m$ allow us to predict the dominance value of all attributes in a new image $x_t$ (Equation 2). [sent-312, score-1.225]
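A minimal sketch of the per-attribute linear predictor (Equation 2) with scikit-learn. The paper only specifies a linear model; the ridge regularizer and its hyperparameter are our assumption:

```python
import numpy as np
from sklearn.linear_model import Ridge

def train_dominance_predictors(phi, d, alpha=1.0):
    """Fit one linear regressor w_m per attribute in the expanded vocabulary.

    phi: (P, D) image descriptors phi(x_p) for the training images.
    d:   (P, 2M) dominance labels d_pm, projected from category level.
    """
    return [Ridge(alpha=alpha).fit(phi, d[:, m]) for m in range(d.shape[1])]

def predict_dominance(models, x):
    """Predicted dominance d_hat_tm for one image descriptor x (shape (D,))."""
    return np.array([m.predict(x[None, :])[0] for m in models])
```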
40 We sort all 2M attributes in descending order of their predicted dominance values $\hat{d}_{tm}$. Let the rank of attribute $m$ for image $x_t$ be $r_m(x_t)$. [sent-313, score-1.558]
41 Then the probability $p_{d_{km}}(x_t)$ that attribute $m$ is the $k$-th most dominant in image $x_t$ is computed as [sent-314, score-0.508]
42 $p_{d_{km}}(x_t) = s_{km}(x_t) / \sum_{k'=1}^{2M} s_{k'm}(x_t)$ (3), where $s_{km}(x_t) = 1 / (\log(|r_m(x_t) - k| + 1) + 1)$ (4). $s_{km}(x_t)$ is a score that drops as the estimated rank $r_m(x_t)$ of the attribute, in terms of its dominance in the image, gets further away from $k$. [sent-316, score-1.103]
43 Each attribute is at most the 2M-th most dominant in an image, since there are only 2M attributes in the vocabulary. [sent-319, score-0.824]
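A sketch of Equations 3 and 4 as reconstructed above (the reciprocal form of $s_{km}$ is our reading of the garbled extraction, chosen so the score drops with $|r_m(x_t) - k|$; the notation is ours):

```python
import numpy as np

def rank_probabilities(d_hat):
    """P[m, k-1] = p_{d_km}(x_t): probability that attribute m is the
    k-th most dominant, for k = 1..2M (Equations 3 and 4)."""
    two_m = len(d_hat)
    order = np.argsort(-d_hat)                 # descending predicted dominance
    rank = np.empty(two_m, dtype=int)
    rank[order] = np.arange(1, two_m + 1)      # r_m(x_t), 1 = most dominant
    ks = np.arange(1, two_m + 1)
    s = 1.0 / (np.log(np.abs(rank[:, None] - ks[None, :]) + 1.0) + 1.0)
    return s / s.sum(axis=1, keepdims=True)    # normalize over ranks k
```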
44 Note that although the dominance of each attribute is predicted independently, the model is trained on an attribute-based representation of the image. [sent-321, score-1.059]
45 As our experiments demonstrate, even our straightforward treatment of attribute dominance results in significant improvements in performance in a variety of human-centric applications. [sent-324, score-1.04]
46 It assumes that since being striped is a dominant attribute for zebras, a test image is more likely to be a zebra if it is striped and being striped is dominant in that image. [sent-359, score-0.674]
47 When classifying a test image, our approach not only verifies how well its appearance matches the specified attributes' presence / absence, but also verifies how well the predicted ordering of attributes according to their dominance matches the order of attributes used by the supervisor when describing the test category. [sent-376, score-2.175]
48 $p_{a_n}(x)$ is the appearance term computed using Equation 5, and $p_{d_n}(x)$ is the dominance term. [sent-382, score-0.751]
49 (Recall that our vocabulary of 2M attributes is over-complete and redundant, since it includes both the presence and absence of attributes.) [sent-383, score-0.563]
50 $p_{d_{k m_k}}(x)$ is the probability that attribute $a_{m_k}$ is the $k$-th most dominant attribute in image $x$, and is computed using Equations 3 and 4. [sent-391, score-0.724]
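A sketch, under the notation above, of how the appearance and dominance terms might combine for one category in zero-shot classification (the appearance term from Equation 5 is abstracted to per-attribute classifier probabilities; the function and argument names are ours):

```python
import numpy as np

def zero_shot_score(attr_probs, P, ordered_attrs):
    """Score a test image against a category described by an ordered
    attribute list (most dominant first), as in Equation 6 up to
    normalization.

    attr_probs:    (2M,) P(attribute present) from attribute classifiers.
    P:             (2M, 2M) output of rank_probabilities().
    ordered_attrs: indices [m_1, ..., m_K] named by the supervisor.
    """
    appearance = np.prod([attr_probs[m] for m in ordered_attrs])
    dominance = np.prod([P[m, k] for k, m in enumerate(ordered_attrs)])
    return appearance * dominance
```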
51 Image Search. We consider the image search scenario where a user has a target category in mind, and provides as query a list of attributes that describe that category. [sent-397, score-0.546]
52 (S)he is likely to use the attributes dominant in the target concept, naming the most dominant attributes first. [sent-399, score-1.098]
53 In our approach, the probability that a target image satisfies the given query depends on whether its appearance matches the presence/absence of attributes specified, and whether the predicted dominance of attributes in the image satisfies the order used by the user in the query. [sent-400, score-1.635]
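Under the same assumptions, ranking a database against an ordered query reuses the score above; a sketch with a hypothetical data layout:

```python
def search(db, ordered_query):
    """Rank database images by appearance match and dominance-order match.

    db: dict image_id -> (attr_probs, P), precomputed per image as above.
    Returns image ids sorted from best to worst match.
    """
    scores = {i: zero_shot_score(a, P, ordered_query) for i, (a, P) in db.items()}
    return sorted(scores, key=scores.get, reverse=True)
```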
54 Again, if humans are asked to describe an image, they will describe some attributes before others, and may not describe some attributes at all. [sent-421, score-0.98]
55 If a machine is given similar abilities, we expect the resultant description to characterize the image better than an approach that lists attributes in an arbitrary order [8] and chooses a random subset of K out of M attributes to describe the image [24]. [sent-422, score-0.922]
56 We sort all attributes in descending order of their predicted dominance score for this image. [sent-424, score-1.215]
57 Note that since dominance is predicted for the expanded vocabulary, the resultant descriptions can specify the presence as well as absence of attributes. [sent-427, score-0.999]
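A sketch of the resulting description generator, listing the K attributes with highest predicted dominance (the sentence template is our own):

```python
import numpy as np

def describe(d_hat, names, k=5):
    """List the K attributes with highest predicted dominance. `names`
    covers the expanded vocabulary, so entries like 'has beard' and
    'no beard' can both appear in a description."""
    top = np.argsort(-np.asarray(d_hat))[:k]
    return "This image shows: " + ", ".join(names[m] for m in top) + "."
```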
58 We then provide an analysis of the dominance annotations we collected to gain better insights into the phenomenon and validate our assumptions. [sent-430, score-0.776]
59 We worked with a vocabulary of 13 attributes (26 in the expanded vocabulary including both presence and absence of attributes). [sent-436, score-0.642]
60 These attributes were selected to ensure (1) a variety in their presence / absence across the categories and (2) ease of use for lay people on MTurk to comment on. [sent-437, score-0.585]
61 We worked with a vocabulary of 27 attributes (54 in the expanded vocabulary). These were picked to ensure that lay people on MTurk can understand them. [sent-446, score-0.515]
62 We used a held out set of 2429 validation images from those 40 categories to train our dominance predictor. [sent-450, score-0.795]
63 We collected attribute dominance annotations for each attribute across all categories as described in Section 3. [sent-453, score-1.373]
64 The probability of a combined attribute was computed by training a classifier using the individual attributes as features. [sent-459, score-0.701]
65 Figure 3: Ground truth dominance scores of all attributes (columns) in all categories (rows) in PubFig (left) and AWA (right). [sent-461, score-1.233]
66 The dominance values fall in [0,70] for PubFig and [0,143] for AWA. [sent-463, score-0.751]
67 [20] for PubFig and AWA respectively to train our attribute dominance predictor described in Section 3. [sent-466, score-1.057]
68 Dominance Analysis In Figure 3 we show the ground truth dominance scores of all attributes (expanded vocabulary) in all categories as computed using Equation 1. [sent-470, score-1.256]
69 We make three observations: (1) different categories do in fact have different attributes that are dominant in them; (2) even when the same attribute is present in different categories, it need not be dominant in all of them. [sent-472, score-0.991]
70 For instance, “Has tough skin” is present in 23 animal categories but has high dominance values in only 12 of them. [sent-473, score-0.851]
71 To analyze whether dominance simply captures the relative strength of an attribute in an image, we compare the ground truth dominance of an attribute across categories with relative attributes [24]. [sent-476, score-2.66]
72 For a given category, we sort the attributes using our ground truth dominance score as well as using the ground truth relative strength of the attributes in the categories. [sent-479, score-1.736]
73 To put this number in perspective, the rank correlation between a random ordering of attributes and the dominance score is 0. [sent-482, score-1.227]
74 The inter-human rank correlation computed by comparing the dominance score obtained using responses from half the subjects with the scores from the other half is 0. [sent-484, score-0.861]
75 The rank correlation between our predicted dominance score and the ground truth is 0. [sent-486, score-0.862]
76 The rank correlation between a fixed ordering of attributes (based on their average dominance across all categories) and the ground truth is 0. [sent-488, score-1.26]
77 This shows that (1) dominance captures more than the relative strength of an attribute in the image; (2) our attribute dominance predictor is quite reliable; (3) inter-human agreement is high, i.e. [sent-490, score-2.165]
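A minimal sketch, with made-up numbers rather than the paper's data, of the Spearman rank correlation used throughout this analysis:

```python
from scipy.stats import spearmanr

# Hypothetical per-attribute scores for one category.
dominance = [0.9, 0.1, 0.5, 0.7, 0.3]
relative_strength = [0.8, 0.2, 0.6, 0.3, 0.4]

rho, _ = spearmanr(dominance, relative_strength)
print(f"rank correlation: {rho:.2f}")
```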
78 One could argue that the rare attributes are the more dominant ones, and that TFIDF (Term Frequency - Inverse Document Frequency) would capture attribute dominance. [sent-503, score-0.824]
79 Rank correlation between attribute TFIDF and the ground truth attribute dominance is only 0.69 for both PubFig and AWA, significantly lower than inter-human agreement on attribute dominance (0. [sent-504, score-1.378] [sent-505, score-1.061]
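TF-IDF over attributes can be defined several ways; the paper does not specify its variant, so the sketch below over a binary category-by-attribute presence matrix is our assumption, for illustration only:

```python
import numpy as np

def attribute_tfidf(presence):
    """presence: (N, M) binary matrix, presence[n, m] = 1 iff attribute m
    is present in category n. Returns an (N, M) TF-IDF matrix."""
    tf = presence / np.maximum(presence.sum(axis=1, keepdims=True), 1)
    idf = np.log(presence.shape[0] / np.maximum(presence.sum(axis=0), 1))
    return tf * idf
```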
81 We compare our proposed approach of using both appearance and dominance information (Equation 6) to the baseline approach of Lampert et al. [sent-511, score-0.751]
82 We also compare to an approach that uses dominance information alone (i. [sent-513, score-0.751]
83 We see that the incorporation of dominance can provide a notable boost in performance compared to the appearance-only approach of Lampert et al. [sent-525, score-0.751]
84 We also see that the expanded vocabulary for modeling dominance performs better than the compressed version. [sent-527, score-0.83]
85 Note that for a fixed value of K (x-axis), different categories use their respective K most dominant attributes that a user is likely to list, which are typically different for different categories. [sent-534, score-0.648]
86 When experimenting with a scenario where the user provides queries containing K attributes, for each target, we use the K attributes selected most often by the users to describe the target category (Equation 1). [sent-543, score-0.545]
87 In the first case, we check what percentage of the attributes present in our descriptions are also present in the ground truth descriptions of the images. [sent-557, score-0.591]
88 The ground truth descriptions are generated by selecting the K most dominant attributes using the ground truth dominance score of attributes (Equation 1). [sent-558, score-1.877]
89 To make the baselines even stronger, we first predict the presence / absence of attributes in the image using attribute classifiers, and then pick K attributes from those randomly or using the compressed dominance regressor. [sent-563, score-1.969]
90 Our improved performance over the global baseline demonstrates that our approach reliably captures image-specific dominance patterns. [sent-565, score-0.751]
91 We presented the three descriptions: dominance-based (our approach), global dominance based (same attributes for all images) and random, along with the image being described to human subjects on Amazon Mechanical Turk. [sent-568, score-1.212]
92 Clearly, modeling attribute dominance leads to significantly more natural image descriptions. [sent-572, score-1.04]
93 We repeated this study, but this time with ground truth dominance and ground truth presence / absence of attributes. [sent-573, score-0.954]
94 This validates our basic assumption that users use dominant attributes when describing images. [sent-575, score-0.592]
95 This is not surprising because we collected the dominance annotations by asking subjects which attributes they would use to describe the image (Figure 2). [sent-576, score-1.27]
96 When people naturally describe images, they tend to name a subset of all possible attributes, in a certain consistent order that reflects the dominance of attributes in the image. [sent-579, score-1.658]
97 We model attribute dominance and demonstrate resultant improvements in performance for human-centric applications of computer vision such as zero-shot learning, image search and automatic generation of textual descriptions of images in two domains: faces and animals. [sent-582, score-1.222]
98 Future work involves incorporating the notion of dominance for relative attributes [24]. [sent-583, score-1.191]
99 When the user says “I want shoes that are shinier than these” or “This image is not a forest because it is too open to be a forest”, perhaps users name attributes that are dominant in the images. [sent-585, score-0.629]
100 When asking comparative questions, the responses from human subjects may be more consistent if we ensure that the two images being compared have equal dominance of attribute $a_m$. [sent-588, score-1.107]
wordName wordTfidf (topN-words)
[('dominance', 0.751), ('attributes', 0.412), ('attribute', 0.289), ('gmn', 0.136), ('dominant', 0.123), ('awa', 0.122), ('pubfig', 0.116), ('supervisor', 0.083), ('absence', 0.07), ('descriptions', 0.065), ('textual', 0.063), ('xt', 0.062), ('subjects', 0.049), ('vocabulary', 0.046), ('pdn', 0.045), ('categories', 0.044), ('user', 0.041), ('absent', 0.039), ('description', 0.039), ('striped', 0.037), ('humans', 0.037), ('lampert', 0.037), ('sentences', 0.036), ('presence', 0.035), ('beard', 0.035), ('furry', 0.035), ('pdkm', 0.034), ('zebras', 0.034), ('interplay', 0.033), ('expanded', 0.033), ('describe', 0.033), ('animal', 0.033), ('legs', 0.033), ('category', 0.032), ('describing', 0.03), ('likely', 0.028), ('relative', 0.028), ('predicting', 0.028), ('bearded', 0.028), ('tendencies', 0.028), ('search', 0.028), ('users', 0.027), ('rank', 0.027), ('cn', 0.026), ('parikh', 0.026), ('resultant', 0.026), ('name', 0.026), ('truth', 0.026), ('annotations', 0.025), ('reading', 0.025), ('equation', 0.024), ('people', 0.024), ('generating', 0.023), ('ground', 0.023), ('predictors', 0.023), ('animals', 0.023), ('lipstick', 0.023), ('pdkmk', 0.023), ('supervisors', 0.023), ('tough', 0.023), ('wolves', 0.023), ('kumar', 0.022), ('pan', 0.022), ('pop', 0.022), ('stripes', 0.022), ('talk', 0.022), ('wearing', 0.021), ('agreement', 0.021), ('smiling', 0.021), ('ordering', 0.021), ('asked', 0.02), ('userspecific', 0.02), ('skm', 0.02), ('xp', 0.02), ('berg', 0.019), ('predicted', 0.019), ('dominate', 0.019), ('language', 0.019), ('dnm', 0.019), ('glasses', 0.019), ('strength', 0.019), ('responses', 0.018), ('importance', 0.018), ('communication', 0.018), ('pam', 0.017), ('tfidf', 0.017), ('classifiers', 0.017), ('descending', 0.017), ('experimented', 0.017), ('predictor', 0.017), ('interface', 0.017), ('stand', 0.017), ('naphade', 0.017), ('teeth', 0.017), ('hair', 0.016), ('feedback', 0.016), ('score', 0.016), ('eyebrows', 0.016), ('mso', 0.016), ('teach', 0.016)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 53 iccv-2013-Attribute Dominance: What Pops Out?
Author: Naman Turakhia, Devi Parikh
Abstract: When we look at an image, some properties or attributes of the image stand out more than others. When describing an image, people are likely to describe these dominant attributes first. Attribute dominance is a result of a complex interplay between the various properties present or absent in the image. Which attributes in an image are more dominant than others reveals rich information about the content of the image. In this paper we tap into this information by modeling attribute dominance. We show that this helps improve the performance of vision systems on a variety of human-centric applications such as zero-shot learning, image search and generating textual descriptions of images.
2 0.39834744 31 iccv-2013-A Unified Probabilistic Approach Modeling Relationships between Attributes and Objects
Author: Xiaoyang Wang, Qiang Ji
Abstract: This paper proposes a unified probabilistic model to model the relationships between attributes and objects for attribute prediction and object recognition. As a list of semantically meaningful properties of objects, attributes generally relate to each other statistically. In this paper, we propose a unified probabilistic model to automatically discover and capture both the object-dependent and objectindependent attribute relationships. The model utilizes the captured relationships to benefit both attribute prediction and object recognition. Experiments on four benchmark attribute datasets demonstrate the effectiveness of the proposed unified model for improving attribute prediction as well as object recognition in both standard and zero-shot learning cases.
3 0.35001692 52 iccv-2013-Attribute Adaptation for Personalized Image Search
Author: Adriana Kovashka, Kristen Grauman
Abstract: Current methods learn monolithic attribute predictors, with the assumption that a single model is sufficient to reflect human understanding of a visual attribute. However, in reality, humans vary in how they perceive the association between a named property and image content. For example, two people may have slightly different internal models for what makes a shoe look “formal”, or they may disagree on which of two scenes looks “more cluttered”. Rather than discount these differences as noise, we propose to learn user-specific attribute models. We adapt a generic model trained with annotations from multiple users, tailoring it to satisfy user-specific labels. Furthermore, we propose novel techniques to infer user-specific labels based on transitivity and contradictions in the user’s search history. We demonstrate that adapted attributes improve accuracy over both existing monolithic models as well as models that learn from scratch with user-specific data alone. In addition, we show how adapted attributes are useful to personalize image search, whether with binary or relative attributes.
4 0.30802521 399 iccv-2013-Spoken Attributes: Mixing Binary and Relative Attributes to Say the Right Thing
Author: Amir Sadovnik, Andrew Gallagher, Devi Parikh, Tsuhan Chen
Abstract: In recent years, there has been a great deal of progress in describing objects with attributes. Attributes have proven useful for object recognition, image search, face verification, image description, and zero-shot learning. Typically, attributes are either binary or relative: they describe either the presence or absence of a descriptive characteristic, or the relative magnitude of the characteristic when comparing two exemplars. However, prior work fails to model the actual way in which humans use these attributes in descriptive statements of images. Specifically, it does not address the important interactions between the binary and relative aspects of an attribute. In this work we propose a spoken attribute classifier which models a more natural way of using an attribute in a description. For each attribute we train a classifier which captures the specific way this attribute should be used. We show that as a result of using this model, we produce descriptions about images of people that are more natural and specific than past systems.
5 0.22446992 54 iccv-2013-Attribute Pivots for Guiding Relevance Feedback in Image Search
Author: Adriana Kovashka, Kristen Grauman
Abstract: In interactive image search, a user iteratively refines his results by giving feedback on exemplar images. Active selection methods aim to elicit useful feedback, but traditional approaches suffer from expensive selection criteria and cannot predict informativeness reliably due to the imprecision of relevance feedback. To address these drawbacks, we propose to actively select “pivot” exemplars for which feedback in the form of a visual comparison will most reduce the system’s uncertainty. For example, the system might ask, “Is your target image more or less crowded than this image? ” Our approach relies on a series of binary search trees in relative attribute space, together with a selection function that predicts the information gain were the user to compare his envisioned target to the next node deeper in a given attribute ’s tree. It makes interactive search more efficient than existing strategies—both in terms of the system ’s selection time as well as the user’s feedback effort.
6 0.22086805 380 iccv-2013-Semantic Transform: Weakly Supervised Semantic Inference for Relating Visual Attributes
7 0.21750046 204 iccv-2013-Human Attribute Recognition by Rich Appearance Dictionary
8 0.17607684 7 iccv-2013-A Deep Sum-Product Architecture for Robust Facial Attributes Analysis
9 0.15683553 378 iccv-2013-Semantic-Aware Co-indexing for Image Retrieval
10 0.15543766 449 iccv-2013-What Do You Do? Occupation Recognition in a Photo via Social Context
11 0.15053941 276 iccv-2013-Multi-attributed Dictionary Learning for Sparse Coding
12 0.14027075 238 iccv-2013-Learning Graphs to Match
13 0.1377884 237 iccv-2013-Learning Graph Matching: Oriented to Category Modeling from Cluttered Scenes
14 0.13111952 246 iccv-2013-Learning the Visual Interpretation of Sentences
15 0.12827028 213 iccv-2013-Implied Feedback: Learning Nuances of User Behavior in Image Search
16 0.12537447 192 iccv-2013-Handwritten Word Spotting with Corrected Attributes
17 0.12463665 107 iccv-2013-Deformable Part Descriptors for Fine-Grained Recognition and Attribute Prediction
18 0.12200733 445 iccv-2013-Visual Reranking through Weakly Supervised Multi-graph Learning
19 0.12126986 451 iccv-2013-Write a Classifier: Zero-Shot Learning Using Purely Textual Descriptions
20 0.11580338 123 iccv-2013-Domain Adaptive Classification
topicId topicWeight
[(0, 0.13), (1, 0.176), (2, -0.053), (3, -0.19), (4, 0.177), (5, -0.05), (6, -0.102), (7, -0.214), (8, 0.28), (9, 0.22), (10, -0.055), (11, 0.045), (12, 0.051), (13, 0.018), (14, -0.071), (15, -0.01), (16, 0.008), (17, 0.035), (18, -0.044), (19, -0.027), (20, -0.005), (21, 0.07), (22, 0.022), (23, -0.037), (24, -0.044), (25, -0.022), (26, 0.036), (27, -0.009), (28, -0.041), (29, 0.041), (30, -0.028), (31, -0.007), (32, -0.005), (33, 0.008), (34, 0.017), (35, -0.041), (36, -0.007), (37, -0.018), (38, -0.034), (39, -0.015), (40, -0.006), (41, 0.052), (42, -0.027), (43, 0.054), (44, -0.035), (45, 0.032), (46, 0.036), (47, 0.006), (48, 0.009), (49, -0.008)]
simIndex simValue paperId paperTitle
same-paper 1 0.98548937 53 iccv-2013-Attribute Dominance: What Pops Out?
Author: Naman Turakhia, Devi Parikh
Abstract: When we look at an image, some properties or attributes of the image stand out more than others. When describing an image, people are likely to describe these dominant attributes first. Attribute dominance is a result of a complex interplay between the various properties present or absent in the image. Which attributes in an image are more dominant than others reveals rich information about the content of the image. In this paper we tap into this information by modeling attribute dominance. We show that this helps improve the performance of vision systems on a variety of human-centric applications such as zero-shot learning, image search and generating textual descriptions of images.
2 0.96847498 399 iccv-2013-Spoken Attributes: Mixing Binary and Relative Attributes to Say the Right Thing
Author: Amir Sadovnik, Andrew Gallagher, Devi Parikh, Tsuhan Chen
Abstract: In recent years, there has been a great deal of progress in describing objects with attributes. Attributes have proven useful for object recognition, image search, face verification, image description, and zero-shot learning. Typically, attributes are either binary or relative: they describe either the presence or absence of a descriptive characteristic, or the relative magnitude of the characteristic when comparing two exemplars. However, prior work fails to model the actual way in which humans use these attributes in descriptive statements of images. Specifically, it does not address the important interactions between the binary and relative aspects of an attribute. In this work we propose a spoken attribute classifier which models a more natural way of using an attribute in a description. For each attribute we train a classifier which captures the specific way this attribute should be used. We show that as a result of using this model, we produce descriptions about images of people that are more natural and specific than past systems.
3 0.93672663 31 iccv-2013-A Unified Probabilistic Approach Modeling Relationships between Attributes and Objects
Author: Xiaoyang Wang, Qiang Ji
Abstract: This paper proposes a unified probabilistic model to model the relationships between attributes and objects for attribute prediction and object recognition. As a list of semantically meaningful properties of objects, attributes generally relate to each other statistically. In this paper, we propose a unified probabilistic model to automatically discover and capture both the object-dependent and objectindependent attribute relationships. The model utilizes the captured relationships to benefit both attribute prediction and object recognition. Experiments on four benchmark attribute datasets demonstrate the effectiveness of the proposed unified model for improving attribute prediction as well as object recognition in both standard and zero-shot learning cases.
4 0.89339209 52 iccv-2013-Attribute Adaptation for Personalized Image Search
Author: Adriana Kovashka, Kristen Grauman
Abstract: Current methods learn monolithic attribute predictors, with the assumption that a single model is sufficient to reflect human understanding of a visual attribute. However, in reality, humans vary in how they perceive the association between a named property and image content. For example, two people may have slightly different internal models for what makes a shoe look “formal”, or they may disagree on which of two scenes looks “more cluttered”. Rather than discount these differences as noise, we propose to learn user-specific attribute models. We adapt a generic model trained with annotations from multiple users, tailoring it to satisfy user-specific labels. Furthermore, we propose novel techniques to infer user-specific labels based on transitivity and contradictions in the user’s search history. We demonstrate that adapted attributes improve accuracy over both existing monolithic models as well as models that learn from scratch with user-specific data alone. In addition, we show how adapted attributes are useful to personalize image search, whether with binary or relative attributes.
5 0.80399972 380 iccv-2013-Semantic Transform: Weakly Supervised Semantic Inference for Relating Visual Attributes
Author: Sukrit Shankar, Joan Lasenby, Roberto Cipolla
Abstract: Relative (comparative) attributes are promising for thematic ranking of visual entities, which also aids in recognition tasks [19, 23]. However, attribute rank learning often requires a substantial amount of relational supervision, which is highly tedious, and apparently impracticalfor realworld applications. In this paper, we introduce the Semantic Transform, which under minimal supervision, adaptively finds a semantic feature space along with a class ordering that is related in the best possible way. Such a semantic space is found for every attribute category. To relate the classes under weak supervision, the class ordering needs to be refined according to a cost function in an iterative procedure. This problem is ideally NP-hard, and we thus propose a constrained search tree formulation for the same. Driven by the adaptive semantic feature space representation, our model achieves the best results to date for all of the tasks of relative, absolute and zero-shot classification on two popular datasets.
6 0.73075366 7 iccv-2013-A Deep Sum-Product Architecture for Robust Facial Attributes Analysis
7 0.721385 449 iccv-2013-What Do You Do? Occupation Recognition in a Photo via Social Context
8 0.72024888 54 iccv-2013-Attribute Pivots for Guiding Relevance Feedback in Image Search
9 0.60502601 350 iccv-2013-Relative Attributes for Large-Scale Abandoned Object Detection
10 0.59840465 192 iccv-2013-Handwritten Word Spotting with Corrected Attributes
11 0.58360618 204 iccv-2013-Human Attribute Recognition by Rich Appearance Dictionary
12 0.52801728 246 iccv-2013-Learning the Visual Interpretation of Sentences
13 0.51568264 445 iccv-2013-Visual Reranking through Weakly Supervised Multi-graph Learning
14 0.49000552 285 iccv-2013-NEIL: Extracting Visual Knowledge from Web Data
15 0.47852615 213 iccv-2013-Implied Feedback: Learning Nuances of User Behavior in Image Search
16 0.46808618 378 iccv-2013-Semantic-Aware Co-indexing for Image Retrieval
17 0.39761505 276 iccv-2013-Multi-attributed Dictionary Learning for Sparse Coding
18 0.39464742 191 iccv-2013-Handling Uncertain Tags in Visual Recognition
19 0.38827401 416 iccv-2013-The Interestingness of Images
20 0.38032192 451 iccv-2013-Write a Classifier: Zero-Shot Learning Using Purely Textual Descriptions
topicId topicWeight
[(2, 0.074), (7, 0.02), (8, 0.02), (12, 0.042), (26, 0.04), (31, 0.037), (34, 0.336), (42, 0.104), (64, 0.029), (73, 0.025), (89, 0.133), (98, 0.011)]
simIndex simValue paperId paperTitle
same-paper 1 0.79730153 53 iccv-2013-Attribute Dominance: What Pops Out?
Author: Naman Turakhia, Devi Parikh
Abstract: When we look at an image, some properties or attributes of the image stand out more than others. When describing an image, people are likely to describe these dominant attributes first. Attribute dominance is a result of a complex interplay between the various properties present or absent in the image. Which attributes in an image are more dominant than others reveals rich information about the content of the image. In this paper we tap into this information by modeling attribute dominance. We show that this helps improve the performance of vision systems on a variety of human-centric applications such as zero-shot learning, image search and generating textual descriptions of images.
2 0.78459066 202 iccv-2013-How Do You Tell a Blackbird from a Crow?
Author: Thomas Berg, Peter N. Belhumeur
Abstract: How do you tell a blackbirdfrom a crow? There has been great progress toward automatic methods for visual recognition, including fine-grained visual categorization in which the classes to be distinguished are very similar. In a task such as bird species recognition, automatic recognition systems can now exceed the performance of non-experts – most people are challenged to name a couple dozen bird species, let alone identify them. This leads us to the question, “Can a recognition system show humans what to look for when identifying classes (in this case birds)? ” In the context of fine-grained visual categorization, we show that we can automatically determine which classes are most visually similar, discover what visual features distinguish very similar classes, and illustrate the key features in a way meaningful to humans. Running these methods on a dataset of bird images, we can generate a visual field guide to birds which includes a tree of similarity that displays the similarity relations between all species, pages for each species showing the most similar other species, and pages for each pair of similar species illustrating their differences.
3 0.70913339 278 iccv-2013-Multi-scale Topological Features for Hand Posture Representation and Analysis
Author: Kaoning Hu, Lijun Yin
Abstract: In this paper, we propose a multi-scale topological feature representation for automatic analysis of hand posture. Such topological features have the advantage of being posture-dependent while being preserved under certain variations of illumination, rotation, personal dependency, etc. Our method studies the topology of the holes between the hand region and its convex hull. Inspired by the principle of Persistent Homology, which is the theory of computational topology for topological feature analysis over multiple scales, we construct the multi-scale Betti Numbers matrix (MSBNM) for the topological feature representation. In our experiments, we used 12 different hand postures and compared our features with three popular features (HOG, MCT, and Shape Context) on different data sets. In addition to hand postures, we also extend the feature representations to arm postures. The results demonstrate the feasibility and reliability of the proposed method.
4 0.66188085 230 iccv-2013-Latent Data Association: Bayesian Model Selection for Multi-target Tracking
Author: Aleksandr V. Segal, Ian Reid
Abstract: We propose a novel parametrization of the data association problem for multi-target tracking. In our formulation, the number of targets is implicitly inferred together with the data association, effectively solving data association and model selection as a single inference problem. The novel formulation allows us to interpret data association and tracking as a single Switching Linear Dynamical System (SLDS). We compute an approximate posterior solution to this problem using a dynamic programming/message passing technique. This inference-based approach allows us to incorporate richer probabilistic models into the tracking system. In particular, we incorporate inference over inliers/outliers and track termination times into the system. We evaluate our approach on publicly available datasets and demonstrate results competitive with, and in some cases exceeding the state of the art.
5 0.65359879 64 iccv-2013-Box in the Box: Joint 3D Layout and Object Reasoning from Single Images
Author: Alexander G. Schwing, Sanja Fidler, Marc Pollefeys, Raquel Urtasun
Abstract: In this paper we propose an approach to jointly infer the room layout as well as the objects present in the scene. Towards this goal, we propose a branch and bound algorithm which is guaranteed to retrieve the global optimum of the joint problem. The main difficulty resides in taking into account occlusion in order to not over-count the evidence. We introduce a new decomposition method, which generalizes integral geometry to triangular shapes, and allows us to bound the different terms in constant time. We exploit both geometric cues and object detectors as image features and show large improvements in 2D and 3D object detection over state-of-the-art deformable part-based models.
6 0.65285003 335 iccv-2013-Random Faces Guided Sparse Many-to-One Encoder for Pose-Invariant Face Recognition
7 0.64876378 399 iccv-2013-Spoken Attributes: Mixing Binary and Relative Attributes to Say the Right Thing
8 0.64697623 31 iccv-2013-A Unified Probabilistic Approach Modeling Relationships between Attributes and Objects
9 0.60875952 138 iccv-2013-Efficient and Robust Large-Scale Rotation Averaging
10 0.60104948 52 iccv-2013-Attribute Adaptation for Personalized Image Search
11 0.56064743 7 iccv-2013-A Deep Sum-Product Architecture for Robust Facial Attributes Analysis
12 0.55591011 449 iccv-2013-What Do You Do? Occupation Recognition in a Photo via Social Context
13 0.55183601 54 iccv-2013-Attribute Pivots for Guiding Relevance Feedback in Image Search
14 0.54613954 451 iccv-2013-Write a Classifier: Zero-Shot Learning Using Purely Textual Descriptions
15 0.54204977 107 iccv-2013-Deformable Part Descriptors for Fine-Grained Recognition and Attribute Prediction
16 0.53613818 169 iccv-2013-Fine-Grained Categorization by Alignments
17 0.53242373 62 iccv-2013-Bird Part Localization Using Exemplar-Based Models with Enforced Pose and Subcategory Consistency
18 0.52859342 272 iccv-2013-Modifying the Memorability of Face Photographs
19 0.52707094 117 iccv-2013-Discovering Details and Scene Structure with Hierarchical Iconoid Shift
20 0.52355218 248 iccv-2013-Learning to Rank Using Privileged Information