iccv iccv2013 iccv2013-142 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Dengxin Dai, Luc Van Gool
Abstract: This paper investigates the problem of semi-supervised classification. Unlike previous methods that regularize classifying boundaries with unlabeled data, our method learns a new image representation from all available data (labeled and unlabeled) and performs plain supervised learning with the new feature. In particular, an ensemble of image prototype sets is sampled automatically from the available data to represent a rich set of visual categories/attributes. Discriminative functions are then learned on these prototype sets, and images are represented by the concatenation of their projected values onto the prototypes (similarities to them) for further classification. Experiments on four standard datasets show three interesting phenomena: (1) our method consistently outperforms previous methods for semi-supervised image classification; (2) our method combines well with these methods; and (3) our method works well for self-taught image classification, where unlabeled data do not come from the same distribution as labeled ones, but rather from a random collection of images.
Reference: text
sentIndex sentText sentNum sentScore
1 Unlike previous methods that regularize classifying boundaries with unlabeled data, our method learns a new image representation from all available data (labeled and unlabeled) and performs plain supervised learning with the new feature. [sent-5, score-0.449]
2 In particular, an ensemble of image prototype sets is sampled automatically from the available data to represent a rich set of visual categories/attributes. [sent-6, score-0.753]
3 Discriminative functions are then learned on these prototype sets, and images are represented by the concatenation of their projected values onto the prototypes (similarities to them) for further classification. [sent-7, score-0.928]
4 Most of the classification systems [3, 17] heavily rely on manually labeled training data, which is expensive and sometimes impossible to acquire. [sent-11, score-0.216]
5 As a result, numerous techniques such as semi-supervised learning [10], active learning [12], transfer learning [24], and self-taught learning [25] have been developed. [sent-13, score-0.265]
6 In this paper, we are interested in the problem of semisupervised learning (SSL) for image classification. [sent-14, score-0.123]
7 The task is to design a method that can make use of unlabeled images, while learning classifiers from labeled ones. [sent-15, score-0.48]
8 This assumption allows the geometrical structure of unlabeled data to regularize the classifying functions. [sent-21, score-0.367]
9 First of all, these methods only exploit the local-consistency assumption in image feature space, and ignore other prior information. [sent-23, score-0.14]
10 Furthermore, most previous methods design specialized learning algorithms to leverage the structure of unlabeled data [2, 15, 18], so users often need to change their learning methods in order to utilize the cheap unlabeled data. [sent-26, score-0.638]
11 Last but not least, previous methods assume that the unlabeled data come from more or less the same distribution as the labeled data. [sent-28, score-0.38]
12 This is part of Eleanor Rosch’s prototype theory [27], which states that an object’s class is determined by its similarity to prototypes that represent object categories. [sent-34, score-0.763]
13 The theory is suitable for transfer learning [24], where labeled data of other categories are available. [sent-35, score-0.222]
14 An important question is whether the theory can also be used for SSL, with its huge amount of unlabeled data. [sent-36, score-0.248]
15 To use this paradigm, we first need to create the prototypes automatically from unlabeled data. [sent-38, score-0.634]
16 For feature learning, we sample an ensemble of T diverse prototype sets from all known images and learn discriminative classifiers on them for the projection functions. [sent-40, score-0.924]
17 For classification, we train plain classifiers on labeled images with the learned features to classify the unlabeled ones. [sent-42, score-0.566]
18 Nearest neighbors can be “good” prototypes (defining one visual category/attribute), and prototypes that are far apart from each other can play the roles of different categories. [sent-43, score-0.698]
19 According to this observation, we design a method to sample the prototype set from all available data. [sent-44, score-0.44]
20 Discriminative learning is then used, logistic regression in our implementation, to learn projection functions tuned to the prototypes. [sent-45, score-0.176]
21 Images are linked to the prototypes via their projection values (classification scores). [sent-46, score-0.404]
22 Since information carried by one single prototype set is limited and can be noisy, we borrow ideas from ensemble learning [26] to create an ensemble of diverse prototype sets, which in turn leads to an ensemble of projection functions. [sent-47, score-1.7]
23 Images are then represented by the concatenation of their projected values (similarities) to all the image prototypes, in keeping with prototype theory [27]. [sent-48, score-0.488]
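The plain supervised step of this pipeline — train a standard classifier on the labeled images using the learned EP feature, then classify the unlabeled ones — can be sketched as follows. This is a minimal sketch under stated assumptions: the paper uses logistic regression and SVMs, so the 1-nearest-neighbour rule and the function name `classify_with_ep` here are hypothetical stand-ins of our own.

```python
import numpy as np

def classify_with_ep(F, labeled_idx, labels, unlabeled_idx):
    """Plain supervised step of the pipeline: given the learned EP
    feature matrix F for ALL images (one row per image), fit a
    1-nearest-neighbour classifier (a stand-in for the paper's
    LR / SVM) on the labeled rows and predict the unlabeled ones."""
    Fl, Fu = F[labeled_idx], F[unlabeled_idx]
    # pairwise Euclidean distances between unlabeled and labeled rows
    d = np.linalg.norm(Fu[:, None, :] - Fl[None, :, :], axis=-1)
    # each unlabeled image takes the label of its nearest labeled one
    return labels[np.argmin(d, axis=1)]
```

Any classifier operating on fixed-length feature vectors could be dropped in here; the point is that once the EP representation is built, the semi-supervised problem reduces to plain supervised learning.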
24 Related Work Our method is generally relevant to semi-supervised learning, ensemble learning, and image feature learning. [sent-68, score-0.258]
25 SSL aims at enhanced learning by exploiting available, unlabeled data. [sent-71, score-0.334]
26 Another group of methods utilize the unlabeled data to regularize the classifying functions enforcing the boundaries to pass through regions with a low density of data samples. [sent-76, score-0.355]
27 [29] presented two methods in a self-supervised manner, in which unlabeled images with high classification confidence are included in the training set for the next round of learning. [sent-81, score-0.398]
28 Our method learns the representation from an ensemble of prototype sets, thus sharing aspects of ensemble learning (EL). [sent-84, score-0.928]
29 Popular ensemble methods that have been extended to semi-supervised scenarios are Boosting [15] and Random Forest [18]. [sent-86, score-0.214]
30 They focus on the problem of improving classifiers by using unlabeled data. [sent-88, score-0.304]
31 The reason we use EL is to capture rich visual attributes from a series of prototype sets. [sent-91, score-0.547]
32 They presented an ensemble partitioning framework for unsupervised image categorization, where weak training sets are sampled to train base learners. [sent-94, score-0.407]
33 The whole dataset is classified by all the base learners in order to obtain a bagged proximity matrix for further clustering. [sent-95, score-0.133]
34 A similar idea was also proposed in Random Ensemble Metrics [14], where images are projected to randomly subsampled training categories for supervised distance learning. [sent-96, score-0.137]
35 While getting pleasing results, these methods all require additional labeled training data, which is exactly what we want to avoid. [sent-101, score-0.13]
36 The method also shares similarity with Self-taught learning [25], where sparse coding is employed to construct higher-level features using unlabeled data. [sent-104, score-0.306]
37 Our Approach: The training data consist of both labeled data Dl = {(xi, yi)}, i = 1, …, l, and unlabeled data Du = {xj}, j = l+1, …, l+u, where xi denotes the feature vector of image i and yi ∈ {1, …, c} its class label. [sent-107, score-0.13]
38 Most previous semi-supervised learning (SSL) methods learn a classifier φ : X → Y from Dl with a regularization term learned from Du. [sent-111, score-0.167]
39 Our method instead learns a new image representation f from all known data D = Dl ∪ Du, and trains a plain classifier φ on f. [sent-112, score-0.123]
40 Assume that EP learns knowledge from T prototype sets Pt, t ∈ {1, …, T}. [sent-114, score-0.484]
41 Each image in Pt carries a pseudo-label in {1, …, r} indicating which prototype it belongs to. [sent-123, score-0.414]
42 Here r is the number of prototypes (analogous to the number of object classes) in Pt, and n the number of images sampled for each prototype. [sent-124, score-0.841]
43 Below, we first present our sampling method for creating a single prototype set Pt in the t-th trial, followed by EP. [sent-128, score-0.446]
44 Max-Min Sampling As stated, we want the prototypes to be inter-distinct and intra-compact, so that each one represents a different visual concept. [sent-131, score-0.349]
45 In particular, we first sample a skeleton of the prototype set, by looking for image candidates that are strongly spread out, i.e., far apart from each other. [sent-134, score-0.503]
46 We then enrich the skeleton into a prototype set by including the closest neighbors of the skeleton images. [sent-137, score-0.632]
47 For the skeleton, we randomly sample m hypotheses – each consisting of r randomly sampled images – and keep the one having the largest mutual distance. [sent-140, score-0.162]
48 Once the skeleton is created, the Min-step extends each seed image to an image prototype by introducing its n nearest neighbors (including itself), in order to enrich the characteristics of each image prototype and reduce the risk of introducing noisy images. [sent-142, score-1.01]
49 For one thing, we do not need the optimal one – we only need the prototypes to be far apart, not farthest apart. [sent-146, score-0.349]
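The Max-Min Sampling described above can be sketched roughly as follows. It is a sketch under stated assumptions, not the authors' implementation: Euclidean distance and the total pairwise distance of a skeleton as the "spread-out" score are our own choices, and the function name `max_min_sample` is hypothetical.

```python
import numpy as np

def max_min_sample(X, r, n, m, rng):
    """Sample one prototype set: r prototypes of n images each.

    Max-step: among m random skeleton hypotheses (r seed images each),
    keep the one whose seeds are most spread out (here, largest total
    pairwise distance). Min-step: grow each seed into a prototype by
    adding its n nearest neighbours (the seed itself included)."""
    N = X.shape[0]
    best_seeds, best_score = None, -np.inf
    for _ in range(m):                          # Max-step
        seeds = rng.choice(N, size=r, replace=False)
        S = X[seeds]
        d = np.linalg.norm(S[:, None] - S[None, :], axis=-1)
        score = d.sum()                         # total mutual distance
        if score > best_score:
            best_score, best_seeds = score, seeds
    members, labels = [], []
    for k, s in enumerate(best_seeds):          # Min-step
        dist = np.linalg.norm(X - X[s], axis=1)
        nn = np.argsort(dist)[:n]               # seed itself is nearest
        members.append(nn)
        labels.append(np.full(n, k))
    return np.concatenate(members), np.concatenate(labels)
```

The returned pseudo-labels in {0, …, r−1} are exactly what the discriminative base learner is trained on in the next step.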
50 Ensemble Projection: We now explore the use of the image prototype sets created in § 3. [sent-153, score-0.456]
51 Because the prototypes are compact in feature space, each one represents a coherent visual concept. [sent-155, score-0.349]
52 Since the information carried by a single prototype set Pt is quite limited, we borrow ideas from ensemble learning (EL) to create an ensemble of T such sets. [sent-158, score-0.759]
53 EL is well known to benefit from the precision of its base learners and their diversity. [sent-159, score-0.133]
54 For good precision, a discriminative learning method is employed as the base learner φt(·). [sent-160, score-0.148]
55 For large diversity, randomness is introduced in different trials of Max-Min Sampling to create an ensemble of diverse prototype sets, so that a rich set of image attributes is captured. [sent-162, score-0.878]
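Ensemble Projection as a whole can be sketched as below, again only under stated assumptions: plain random seeding stands in for Max-Min Sampling, a small hand-rolled multinomial logistic regression plays the role of the discriminative base learner, and all names (`ensemble_projection`, `softmax`) are ours, not the authors'.

```python
import numpy as np

def softmax(Z):
    """Row-wise softmax with a max-shift for numerical stability."""
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def ensemble_projection(X, T=5, r=3, n=4, steps=200, lr=0.5, seed=0):
    """Learn an EP feature: for each of T prototype sets, fit a
    multinomial logistic regression on (prototype members, pseudo-
    labels) and record every image's class scores; the new feature
    is the concatenation of the T score vectors (length T * r)."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    feats = []
    for _ in range(T):
        # one prototype set: r prototypes of n images each
        # (random seeding here, where the paper uses Max-Min Sampling)
        seeds = rng.choice(N, size=r, replace=False)
        idx, y = [], []
        for k, s in enumerate(seeds):
            nn = np.argsort(np.linalg.norm(X - X[s], axis=1))[:n]
            idx.append(nn)
            y.append(np.full(n, k))
        idx, y = np.concatenate(idx), np.concatenate(y)
        Y = np.eye(r)[y]                       # one-hot pseudo-labels
        W = np.zeros((D, r))
        for _ in range(steps):                 # batch gradient descent
            P = softmax(X[idx] @ W)
            W -= lr * X[idx].T @ (P - Y) / len(idx)
        feats.append(softmax(X @ W))           # project ALL images
    return np.hstack(feats)
```

Each block of r columns is a probability vector (its entries sum to one), so the concatenated feature is exactly the image's similarities to the T prototype sets.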
56 Furthermore, we built a random image collection by sampling 20,000 images from the ImageNet dataset [6], to evaluate our method on the task of self-taught image classification. [sent-178, score-0.129]
57 Competing methods: Four classifiers were adopted to evaluate the method: two inductive classifiers, logistic regression (LR) and linear SVMs, and two transductive classifiers, Harmonic-Function (HF) [34] and LapSVM (LSVM) [1]. [sent-190, score-0.292]
58 Since our method builds up a new feature representation, we illustrate the performance of all methods working with normal features and our learned features. [sent-193, score-0.169]
59 The top panel evaluates the performance of our learned features when fed into LR and SVMs. [sent-205, score-0.18]
60 All methods were tested with two feature inputs: the concatenation of GIST, PHOG and LBP, and our learned feature from them (indicated by “+ EP”). [sent-207, score-0.281]
61 All methods were tested with two feature inputs: the concatenation of GIST, PHOG, and LBP, and our learned feature from it (indicated by “+ EP”). [sent-218, score-0.281]
62 EP needs a discriminative feature to learn precise projection functions. [sent-220, score-0.128]
63 Fig. 2 shows all the results, and Table 1 lists the results obtained with 5 labeled training images per class. [sent-235, score-0.196]
64 From Fig. 2, it is easy to observe that the two plain classifiers, LR and SVMs, working with our feature perform better than the two sophisticated SSL methods, LapSVM and Harmonic-Function, working with the original feature, while having comparable variance. [sent-237, score-0.237]
65 The advantages can be ascribed to two factors: (1) in addition to the local-consistency assumption, our method also exploits the exotic-inconsistency assumption; and (2) the discriminative projections abstract high-level attributes from the sampled prototypes. [sent-239, score-0.231]
66 Note that our features are learned exactly from the original features, but go beyond any single image. [sent-243, score-0.109]
67 LR was employed with 5 labeled training images per class. [sent-247, score-0.196]
68 LR was used as the classifier with 5 labeled training images per class. [sent-251, score-0.24]
69 This suggests that our scheme for exploiting unlabeled data and previous such schemes capture complementary information. [sent-254, score-0.303]
70 They are the total number of prototype sets T, the number of prototypes in each set r, the number of images in each prototype n, and the number of skeleton hypotheses m used in Max-Min Sampling. [sent-260, score-1.334]
71 It implies that the method benefits from exploiting more “novel” visual attributes (image prototypes). [sent-266, score-0.13]
72 When T exceeds a certain value (around 50 for the four datasets), the newly exploited attributes have already been captured, and thus stop boosting the performance much. [sent-269, score-0.155]
73 A large r would lead to confusing attributes, because prototypes may start overlapping with each other. [sent-273, score-0.349]
74 For n, a similar trend was observed: as n increases, the characteristics of the prototypes are enriched, thus boosting the performance. [sent-274, score-0.428]
75 This can be explained from the perspective of ensemble learning (EL). [sent-280, score-0.272]
76 EL benefits from the strength of its base learners and their diversity. [sent-281, score-0.133]
77 Too large an m brings all prototype skeletons close to the optimal one, thus decreasing the diversity of the sampled prototype sets. [sent-282, score-0.88]
78 The classifiers were tested with two feature inputs: the concatenation of GIST, PHOG, and LBP, and our learned feature from it (indicated by “+ EP”). [sent-297, score-0.337]
79 Comparison of our learned feature with the normal image feature against different LR models. [sent-301, score-0.153]
80 LR was again used as the classifier, and we compared our learned features with the corresponding original ones, namely GIST, PHOG, and LBP. [sent-302, score-0.153]
81 Robustness Against Classifier Models: In this section, we evaluate the robustness of our learned features against classifier models. [sent-307, score-0.154]
82 This property is important for SSL, as labeled data is limited and probably cannot accommodate a model selection technique such as Cross-Validation. [sent-319, score-0.123]
83 Self-taught Image Classification In order to evaluate the applicability of our method, we tested it in a more general scenario, where the unlabeled data is the set of 20, 000 random images from ImageNet. [sent-322, score-0.36]
84 Projection functions were learned from images in this set plus the labeled training images in corresponding evaluation dataset, and performance was measured on the unlabeled images. [sent-323, score-0.521]
85 Fig. 5 shows the classification performance with different numbers of labeled training images per class, and Table 2 lists the results when 5 training images per class are used. [sent-325, score-0.386]
86 From the figure and table, it can be found that our learned feature from the random image collection still outperforms the original feature. [sent-326, score-0.18]
87 The success could be ascribed to the fact that the “universal visual world” (the random image collection) contains abundant high-level, valuable visual attributes such as “blue and open” in some image clusters and “textured and man-made” in others. [sent-328, score-0.182]
88 Exploiting these “hidden” visual attributes is very beneficial for narrowing down the semantic gap between low-level features and high-level classification tasks. [sent-329, score-0.188]
89 From the figure, we can also find that as the number of labeled training images increases, the advantage of our learned feature may decrease. [sent-330, score-0.265]
90 It comes without much surprise as the method is designed to improve classification systems by exploiting ‘unknowledgeable’ (unlabeled) data. [sent-331, score-0.114]
91 Therefore, when a sufficient number of labeled images are available, introducing additional unlabeled ones may hurt the system. [sent-332, score-0.393]
92 This is a general, open problem for semisupervised learning (self-taught learning) [20]. [sent-333, score-0.123]
93 One possible solution is to study when the classification systems should switch from semi-supervised learning to fully supervised learning. [sent-334, score-0.178]
94 Conclusion: This paper has tackled the problem of semi-supervised image classification from a novel perspective – rather than regularizing classifying functions like previous methods, we learn a new, high-level image representation. [sent-336, score-0.154]
95 We proposed the exotic-inconsistency assumption as a novel concept, and designed a simple yet effective feature learning method to use it along with local-consistency to exploit the available data. [sent-337, score-0.14]
96 (Table 2) All methods were tested with two feature inputs: the concatenation of GIST, PHOG, and LBP, and our learned feature from the 20,000-image random collection (indicated by “+ EP”). [sent-343, score-0.352]
97 By doing so, images are represented with their affinities to a rich set of discovered image attributes for classification. [sent-345, score-0.159]
98 Manifold regularization: A geometric framework for learning from labeled and unlabeled examples. [sent-353, score-0.398]
99 Object bank: A highlevel image representation for scene classification & semantic feature sparsification. [sent-484, score-0.13]
100 Transfer learning for image classification with sparse prototype representations. [sent-515, score-0.558]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 142 iccv-2013-Ensemble Projection for Semi-supervised Image Classification
2 0.42152026 222 iccv-2013-Joint Learning of Discriminative Prototypes and Large Margin Nearest Neighbor Classifiers
Author: Martin Köstinger, Paul Wohlhart, Peter M. Roth, Horst Bischof
Abstract: In this paper, we raise important issues concerning the evaluation complexity of existing Mahalanobis metric learning methods. The complexity scales linearly with the size of the dataset. This is especially cumbersome on large scale or for real-time applications with limited time budget. To alleviate this problem we propose to represent the dataset by a fixed number of discriminative prototypes. In particular, we introduce a new method that jointly chooses the positioning of prototypes and also optimizes the Mahalanobis distance metric with respect to these. We show that choosing the positioning of the prototypes and learning the metric in parallel leads to a drastically reduced evaluation effort while maintaining the discriminative essence of the original dataset. Moreover, for most problems our method performing k-nearest prototype (k-NP) classification on the condensed dataset leads to even better generalization compared to k-NN classification using all data. Results on a variety of challenging benchmarks demonstrate the power of our method. These include standard machine learning datasets as well as the challenging Public Figures Face Database. On the competitive machine learning benchmarks we are comparable to the state-of-the-art while being more efficient. On the face benchmark we clearly outperform the state-of-the-art in Mahalanobis metric learning with drastically reduced evaluation effort.
3 0.1669745 6 iccv-2013-A Convex Optimization Framework for Active Learning
Author: Ehsan Elhamifar, Guillermo Sapiro, Allen Yang, S. Shankar Sastry
Abstract: In many image/video/web classification problems, we have access to a large number of unlabeled samples. However, it is typically expensive and time consuming to obtain labels for the samples. Active learning is the problem of progressively selecting and annotating the most informative unlabeled samples, in order to obtain a high classification performance. Most existing active learning algorithms select only one sample at a time prior to retraining the classifier. Hence, they are computationally expensive and cannot take advantage of parallel labeling systems such as Mechanical Turk. On the other hand, algorithms that allow the selection of multiple samples prior to retraining the classifier, may select samples that have significant information overlap or they involve solving a non-convex optimization. More importantly, the majority of active learning algorithms are developed for a certain classifier type such as SVM. In this paper, we develop an efficient active learning framework based on convex programming, which can select multiple samples at a time for annotation. Unlike the state of the art, our algorithm can be used in conjunction with any type of classifiers, including those of the family of the recently proposed Sparse Representation-based Classification (SRC). We use the two principles of classifier uncertainty and sample diversity in order to guide the optimization program towards selecting the most informative unlabeled samples, which have the least information overlap. Our method can incorporate the data distribution in the selection process by using the appropriate dissimilarity between pairs of samples. We show the effectiveness of our framework in person detection, scene categorization and face recognition on real-world datasets.
4 0.12659603 338 iccv-2013-Randomized Ensemble Tracking
Author: Qinxun Bai, Zheng Wu, Stan Sclaroff, Margrit Betke, Camille Monnier
Abstract: We propose a randomized ensemble algorithm to model the time-varying appearance of an object for visual tracking. In contrast with previous online methods for updating classifier ensembles in tracking-by-detection, the weight vector that combines weak classifiers is treated as a random variable and the posterior distribution for the weight vector is estimated in a Bayesian manner. In essence, the weight vector is treated as a distribution that reflects the confidence among the weak classifiers used to construct and adapt the classifier ensemble. The resulting formulation models the time-varying discriminative ability among weak classifiers so that the ensembled strong classifier can adapt to the varying appearance, backgrounds, and occlusions. The formulation is tested in a tracking-by-detection implementation. Experiments on 28 challenging benchmark videos demonstrate that the proposed method can achieve results comparable to and often better than those of stateof-the-art approaches.
5 0.12347181 194 iccv-2013-Heterogeneous Image Features Integration via Multi-modal Semi-supervised Learning Model
Author: Xiao Cai, Feiping Nie, Weidong Cai, Heng Huang
Abstract: Automatic image categorization has become increasingly important with the development of Internet and the growth in the size of image databases. Although the image categorization can be formulated as a typical multiclass classification problem, two major challenges have been raised by the real-world images. On one hand, though using more labeled training data may improve the prediction performance, obtaining the image labels is a time consuming as well as biased process. On the other hand, more and more visual descriptors have been proposed to describe objects and scenes appearing in images and different features describe different aspects of the visual characteristics. Therefore, how to integrate heterogeneous visual features to do the semi-supervised learning is crucial for categorizing large-scale image data. In this paper, we propose a novel approach to integrate heterogeneous features by performing multi-modal semi-supervised classification on unlabeled as well as unsegmented images. Considering each type of feature as one modality, taking advantage of the large amount of unlabeled data information, our new adaptive multimodal semi-supervised classification (AMMSS) algorithm learns a commonly shared class indicator matrix and the weights for different modalities (image features) simultaneously.
6 0.11449631 126 iccv-2013-Dynamic Label Propagation for Semi-supervised Multi-class Multi-label Classification
7 0.10238067 285 iccv-2013-NEIL: Extracting Visual Knowledge from Web Data
8 0.099761888 52 iccv-2013-Attribute Adaptation for Personalized Image Search
9 0.098997377 41 iccv-2013-Active Learning of an Action Detector from Untrimmed Videos
10 0.090396143 233 iccv-2013-Latent Task Adaptation with Large-Scale Hierarchies
11 0.087352291 156 iccv-2013-Fast Direct Super-Resolution by Simple Functions
12 0.084404141 31 iccv-2013-A Unified Probabilistic Approach Modeling Relationships between Attributes and Objects
13 0.084117167 238 iccv-2013-Learning Graphs to Match
14 0.083043039 123 iccv-2013-Domain Adaptive Classification
15 0.083016895 336 iccv-2013-Random Forests of Local Experts for Pedestrian Detection
16 0.078242116 234 iccv-2013-Learning CRFs for Image Parsing with Adaptive Subgradient Descent
17 0.07792969 404 iccv-2013-Structured Forests for Fast Edge Detection
18 0.077438615 204 iccv-2013-Human Attribute Recognition by Rich Appearance Dictionary
19 0.076909512 378 iccv-2013-Semantic-Aware Co-indexing for Image Retrieval
20 0.076101273 380 iccv-2013-Semantic Transform: Weakly Supervised Semantic Inference for Relating Visual Attributes
simIndex simValue paperId paperTitle
same-paper 1 0.92003858 142 iccv-2013-Ensemble Projection for Semi-supervised Image Classification
2 0.86796808 222 iccv-2013-Joint Learning of Discriminative Prototypes and Large Margin Nearest Neighbor Classifiers
3 0.70725495 332 iccv-2013-Quadruplet-Wise Image Similarity Learning
Author: Marc T. Law, Nicolas Thome, Matthieu Cord
Abstract: This paper introduces a novel similarity learning framework. Working with inequality constraints involving quadruplets of images, our approach aims at efficiently modeling similarity from rich or complex semantic label relationships. From these quadruplet-wise constraints, we propose a similarity learning framework relying on a convex optimization scheme. We then study how our metric learning scheme can exploit specific class relationships, such as class ranking (relative attributes), and class taxonomy. We show that classification using the learned metrics gets improved performance over state-of-the-art methods on several datasets. We also evaluate our approach in a new application to learn similarities between webpage screenshots in a fully unsupervised way.
4 0.70515507 6 iccv-2013-A Convex Optimization Framework for Active Learning
Author: Ehsan Elhamifar, Guillermo Sapiro, Allen Yang, S. Shankar Sastry
Abstract: In many image/video/web classification problems, we have access to a large number of unlabeled samples. However, it is typically expensive and time consuming to obtain labels for the samples. Active learning is the problem of progressively selecting and annotating the most informative unlabeled samples, in order to obtain a high classification performance. Most existing active learning algorithms select only one sample at a time prior to retraining the classifier. Hence, they are computationally expensive and cannot take advantage of parallel labeling systems such as Mechanical Turk. On the other hand, algorithms that allow the selection of multiple samples prior to retraining the classifier, may select samples that have significant information overlap or they involve solving a non-convex optimization. More importantly, the majority of active learning algorithms are developed for a certain classifier type such as SVM. In this paper, we develop an efficient active learning framework based on convex programming, which can select multiple samples at a time for annotation. Unlike the state of the art, our algorithm can be used in conjunction with any type of classifiers, including those of the family of the recently proposed Sparse Representation-based Classification (SRC). We use the two principles of classifier uncertainty and sample diversity in order to guide the optimization program towards selecting the most informative unlabeled samples, which have the least information overlap. Our method can incorporate the data distribution in the selection process by using the appropriate dissimilarity between pairs of samples. We show the effectiveness of our framework in person detection, scene categorization and face recognition on real-world datasets.
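The two selection principles named in the abstract, classifier uncertainty and sample diversity, can be illustrated with a toy greedy loop. The paper itself solves a convex program; the greedy scheme and the `alpha` trade-off below are illustrative assumptions only.

```python
import numpy as np

def select_batch(probs, X, k=2, alpha=0.5):
    """Toy greedy stand-in for batch active learning: trade off classifier
    uncertainty (entropy of predicted class probabilities) against diversity
    (distance to samples already picked). Not the paper's convex program."""
    eps = 1e-12
    entropy = -(probs * np.log(probs + eps)).sum(axis=1)  # uncertainty
    chosen = []
    for _ in range(k):
        scores = entropy.copy()
        if chosen:
            # penalize information overlap: prefer points far from chosen ones
            d = np.min(((X[:, None, :] - X[chosen][None, :, :]) ** 2).sum(-1),
                       axis=1)
            scores = alpha * entropy + (1 - alpha) * d
        scores[chosen] = -np.inf  # never re-pick a sample
        chosen.append(int(np.argmax(scores)))
    return chosen
```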
5 0.70329422 177 iccv-2013-From Point to Set: Extend the Learning of Distance Metrics
Author: Pengfei Zhu, Lei Zhang, Wangmeng Zuo, David Zhang
Abstract: Most of the current metric learning methods are proposed for point-to-point distance (PPD) based classification. In many computer vision tasks, however, we need to measure the point-to-set distance (PSD) and even set-to-set distance (SSD) for classification. In this paper, we extend the PPD based Mahalanobis distance metric learning to PSD and SSD based ones, namely point-to-set distance metric learning (PSDML) and set-to-set distance metric learning (SSDML), and solve them under a unified optimization framework. First, we generate positive and negative sample pairs by computing the PSD and SSD between training samples. Then, we characterize each sample pair by its covariance matrix, and propose a covariance kernel based discriminative function. Finally, we tackle the PSDML and SSDML problems by using standard support vector machine solvers, making the metric learning very efficient for multiclass visual classification tasks. Experiments on gender classification, digit recognition, object categorization and face recognition show that the proposed metric learning methods can effectively enhance the performance of PSD and SSD based classification.
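A minimal sketch of a point-to-set distance (PSD) under a Mahalanobis metric M follows. The nearest-member variant used here is an illustrative simplification; PSD formulations can also use, e.g., the distance to the set's affine hull, so this is not necessarily the paper's exact definition.

```python
import numpy as np

def point_to_set_distance(x, S, M=None):
    """Distance from point x to set S, taken here as the smallest Mahalanobis
    distance to any member of S. M is an assumed positive semi-definite metric
    matrix; the identity (plain Euclidean distance) is used if omitted."""
    S = np.asarray(S, dtype=float)
    if M is None:
        M = np.eye(S.shape[1])
    diffs = S - x
    # squared Mahalanobis distance to each set member
    d2 = np.einsum('ij,jk,ik->i', diffs, M, diffs)
    return float(np.sqrt(d2.min()))
```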
6 0.69407994 194 iccv-2013-Heterogeneous Image Features Integration via Multi-modal Semi-supervised Learning Model
7 0.68628091 431 iccv-2013-Unbiased Metric Learning: On the Utilization of Multiple Datasets and Web Images for Softening Bias
8 0.67835754 126 iccv-2013-Dynamic Label Propagation for Semi-supervised Multi-class Multi-label Classification
10 0.62318426 227 iccv-2013-Large-Scale Image Annotation by Efficient and Robust Kernel Metric Learning
11 0.57066613 125 iccv-2013-Drosophila Embryo Stage Annotation Using Label Propagation
12 0.52432597 392 iccv-2013-Similarity Metric Learning for Face Recognition
13 0.51463646 234 iccv-2013-Learning CRFs for Image Parsing with Adaptive Subgradient Descent
14 0.51171923 338 iccv-2013-Randomized Ensemble Tracking
15 0.51028043 451 iccv-2013-Write a Classifier: Zero-Shot Learning Using Purely Textual Descriptions
16 0.50914568 352 iccv-2013-Revisiting Example Dependent Cost-Sensitive Learning with Decision Trees
17 0.50885236 136 iccv-2013-Efficient Pedestrian Detection by Directly Optimizing the Partial Area under the ROC Curve
18 0.50831586 233 iccv-2013-Latent Task Adaptation with Large-Scale Hierarchies
19 0.50068617 285 iccv-2013-NEIL: Extracting Visual Knowledge from Web Data
20 0.50054073 290 iccv-2013-New Graph Structured Sparsity Model for Multi-label Image Annotations
topicId topicWeight
[(2, 0.064), (4, 0.016), (7, 0.021), (12, 0.029), (13, 0.011), (26, 0.105), (31, 0.046), (40, 0.017), (42, 0.095), (48, 0.017), (64, 0.034), (73, 0.036), (77, 0.261), (89, 0.165)]
simIndex simValue paperId paperTitle
1 0.79912198 350 iccv-2013-Relative Attributes for Large-Scale Abandoned Object Detection
Author: Quanfu Fan, Prasad Gabbur, Sharath Pankanti
Abstract: Effective reduction of false alarms in large-scale video surveillance is rather challenging, especially for applications where abnormal events of interest rarely occur, such as abandoned object detection. We develop an approach to prioritize alerts by ranking them, and demonstrate its great effectiveness in reducing false positives while keeping good detection accuracy. Our approach benefits from a novel representation of abandoned object alerts by relative attributes, namely staticness, foregroundness and abandonment. The relative strengths of these attributes are quantified using a ranking function[19] learnt on suitably designed low-level spatial and temporal features.These attributes of varying strengths are not only powerful in distinguishing abandoned objects from false alarms such as people and light artifacts, but also computationally efficient for large-scale deployment. With these features, we apply a linear ranking algorithm to sort alerts according to their relevance to the end-user. We test the effectiveness of our approach on both public data sets and large ones collected from the real world.
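The final sorting step described above, scoring each alert with a learned linear ranking function and returning the most relevant first, can be sketched as follows. Here `w` is a hypothetical pre-learned weight vector over the attribute features, not the paper's actual model.

```python
import numpy as np

def rank_alerts(features, w):
    """Sketch of alert prioritization by a linear ranking function: each
    alert's attribute features (e.g. staticness, foregroundness, abandonment
    strengths) are scored by weights w, and alerts come back most-relevant
    first so that likely false alarms sink to the bottom of the queue."""
    scores = features @ w
    order = np.argsort(-scores)  # indices of alerts, descending relevance
    return order, scores
```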
same-paper 2 0.76670146 142 iccv-2013-Ensemble Projection for Semi-supervised Image Classification
Author: Dengxin Dai, Luc Van_Gool
Abstract: This paper investigates the problem of semi-supervised classification. Unlike previous methods that regularize classifying boundaries with unlabeled data, our method learns a new image representation from all available data (labeled and unlabeled) and performs plain supervised learning with the new feature. In particular, an ensemble of image prototype sets is sampled automatically from the available data to represent a rich set of visual categories/attributes. Discriminative functions are then learned on these prototype sets, and images are represented by the concatenation of their projected values onto the prototypes (similarities to them) for further classification. Experiments on four standard datasets show three interesting phenomena: (1) our method consistently outperforms previous methods for semi-supervised image classification; (2) our method combines well with these methods; and (3) our method works well for self-taught image classification, where unlabeled data do not come from the same distribution as labeled ones but rather from a random collection of images.
3 0.75414658 83 iccv-2013-Complementary Projection Hashing
Author: Zhongming Jin, Yao Hu, Yue Lin, Debing Zhang, Shiding Lin, Deng Cai, Xuelong Li
Abstract: Recently, hashing techniques have been widely applied to solve the approximate nearest neighbors search problem in many vision applications. Generally, these hashing approaches generate 2c buckets, where c is the length of the hash code. A good hashing method should satisfy the following two requirements: 1) mapping nearby data points into the same bucket or into nearby buckets (measured by the Hamming distance); 2) all the data points are evenly distributed among all the buckets. In this paper, we propose a novel algorithm named Complementary Projection Hashing (CPH) to find the optimal hashing functions which explicitly considers the above two requirements. Specifically, CPH aims at sequentially finding a series of hyperplanes (hashing functions) which cross the sparse region of the data. At the same time, the data points are evenly distributed in the hypercubes generated by these hyperplanes. The experiments comparing with the state-of-the-art hashing methods demonstrate the effectiveness of the proposed method.
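A plain random-hyperplane baseline shows the bucket mechanics that the two requirements above refer to. CPH's actual contribution (steering hyperplanes through sparse regions and balancing bucket occupancy) is not implemented in this sketch.

```python
import numpy as np

def hash_codes(X, c=4, seed=0):
    """Random-hyperplane hashing baseline (plain LSH, not the CPH
    optimization): each of the c hyperplanes contributes one bit, so every
    point falls into one of 2**c buckets indexed by its packed bit string."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], c))  # c random hyperplane normals
    b = rng.standard_normal(c)                # random offsets
    bits = (X @ W + b > 0).astype(int)        # one bit per hyperplane
    # pack the c bits into a bucket index in [0, 2**c)
    return bits @ (1 << np.arange(c))
```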
4 0.73578823 180 iccv-2013-From Where and How to What We See
Author: S. Karthikeyan, Vignesh Jagadeesh, Renuka Shenoy, Miguel Eckstein, B.S. Manjunath
Abstract: Eye movement studies have confirmed that overt attention is highly biased towards faces and text regions in images. In this paper we explore a novel problem of predicting face and text regions in images using eye tracking data from multiple subjects. The problem is challenging as we aim to predict the semantics (face/text/background) only from eye tracking data without utilizing any image information. The proposed algorithm spatially clusters eye tracking data obtained in an image into different coherent groups and subsequently models the likelihood of the clusters containing faces and text using a fully connected Markov Random Field (MRF). Given the eye tracking data from a test image, it predicts potential face/head (humans, dogs and cats) and text locations reliably. Furthermore, the approach can be used to select regions of interest for further analysis by object detectors for faces and text. The hybrid eye position/object detector approach achieves better detection performance and reduced computation time compared to using only the object detection algorithm. We also present a new eye tracking dataset on 300 images selected from ICDAR, Street-view, Flickr and Oxford-IIIT Pet Dataset from 15 subjects.
5 0.72202921 181 iccv-2013-Frustratingly Easy NBNN Domain Adaptation
Author: Tatiana Tommasi, Barbara Caputo
Abstract: Over the last years, several authors have signaled that state of the art categorization methods fail to perform well when trained and tested on data from different databases. The general consensus in the literature is that this issue, known as domain adaptation and/or dataset bias, is due to a distribution mismatch between data collections. Methods addressing it go from max-margin classifiers to learning how to modify the features and obtain a more robust representation. The large majority of these works use BOW feature descriptors, and learning methods based on image-to-image distance functions. Following the seminal work of [6], in this paper we challenge these two assumptions. We experimentally show that using the NBNN classifier over existing domain adaptation databases always achieves very strong performance. We build on this result, and present an NBNN-based domain adaptation algorithm that learns a class metric iteratively while inducing, for each sample, a large margin separation among classes. To the best of our knowledge, this is the first work casting the domain adaptation problem within the NBNN framework. Experiments show that our method achieves the state of the art, both in the unsupervised and semi-supervised settings.
6 0.70030308 141 iccv-2013-Enhanced Continuous Tabu Search for Parameter Estimation in Multiview Geometry
8 0.66206264 156 iccv-2013-Fast Direct Super-Resolution by Simple Functions
9 0.66065568 150 iccv-2013-Exemplar Cut
10 0.65945882 127 iccv-2013-Dynamic Pooling for Complex Event Recognition
11 0.65845108 95 iccv-2013-Cosegmentation and Cosketch by Unsupervised Learning
12 0.65767694 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation
13 0.65728211 414 iccv-2013-Temporally Consistent Superpixels
14 0.65606052 349 iccv-2013-Regionlets for Generic Object Detection
15 0.65447044 61 iccv-2013-Beyond Hard Negative Mining: Efficient Detector Learning via Block-Circulant Decomposition
16 0.65401512 196 iccv-2013-Hierarchical Data-Driven Descent for Efficient Optimal Deformation Estimation
17 0.65256172 21 iccv-2013-A Method of Perceptual-Based Shape Decomposition
18 0.65250814 4 iccv-2013-ACTIVE: Activity Concept Transitions in Video Event Classification
19 0.65218484 137 iccv-2013-Efficient Salient Region Detection with Soft Image Abstraction
20 0.65157759 245 iccv-2013-Learning a Dictionary of Shape Epitomes with Applications to Image Labeling