nips nips2013 nips2013-335 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Marcus Rohrbach, Sandra Ebert, Bernt Schiele
Abstract: Category models for objects or activities typically rely on supervised learning requiring sufficiently large training sets. Transferring knowledge from known categories to novel classes with no or only a few labels is far less researched even though it is a common scenario. In this work, we extend transfer learning with semi-supervised learning to exploit unlabeled instances of (novel) categories with no or only a few labeled instances. Our proposed approach, Propagated Semantic Transfer, combines three techniques. First, we transfer information from known to novel categories by incorporating external knowledge, such as linguistic or expert-specified information, e.g., by a mid-level layer of semantic attributes. Second, we exploit the manifold structure of novel classes. More specifically, we adapt a graph-based learning algorithm – so far only used for semi-supervised learning – to zero-shot and few-shot learning. Third, we improve the local neighborhood in such graph structures by replacing the raw feature-based representation with a mid-level object- or attribute-based representation. We evaluate our approach on three challenging datasets in two different applications, namely on Animals with Attributes and ImageNet for image classification and on MPII Composites for activity recognition. Our approach consistently outperforms state-of-the-art transfer and semi-supervised approaches on all datasets.
Reference: text
sentIndex sentText sentNum sentScore
1 Category models for objects or activities typically rely on supervised learning requiring sufficiently large training sets. [sent-3, score-0.224]
2 Transferring knowledge from known categories to novel classes with no or only a few labels is far less researched even though it is a common scenario. [sent-4, score-0.563]
3 In this work, we extend transfer learning with semi-supervised learning to exploit unlabeled instances of (novel) categories with no or only a few labeled instances. [sent-5, score-0.749]
4 First, we transfer information from known to novel categories by incorporating external knowledge, such as linguistic or expert-specified information, e.g., by a mid-level layer of semantic attributes. [sent-7, score-0.762]
5 Third, we improve the local neighborhood in such graph structures by replacing the raw feature-based representation with a mid-level object- or attribute-based representation. [sent-12, score-0.335]
6 Our approach consistently outperforms state-of-the-art transfer and semi-supervised approaches on all datasets. [sent-14, score-0.357]
7 This development reflects the psychological point of view that humans are able to generalize to novel¹ categories with only a few training samples [17, 1]. [sent-16, score-0.249]
8 This has recently gained increased interest in the computer vision and machine learning literature, which looks at zero-shot recognition (with no training instances for a class) [11, 19, 9, 22, 16], and one- or few-shot recognition [29, 1, 21]. [sent-17, score-0.291]
9 Knowledge transfer is particularly beneficial when scaling to large numbers of classes [23, 16], distinguishing fine-grained categories [6], or analyzing compositional activities in videos [9, 22]. [sent-18, score-0.805]
10 Recognizing categories with no or only a few labeled training instances is challenging. [sent-19, score-0.38]
11 To improve existing transfer learning approaches, we exploit several sources of information. [sent-20, score-0.413]
12 Our approach allows using (1) trained category models, (2) external knowledge, (3) instance similarity, and (4) labeled instances of the novel classes if available. [sent-21, score-0.534]
13 More specifically, we learn category or attribute models based on labeled training data for known categories y (see also Figure 1) using supervised training. [sent-22, score-0.602]
14 These trained models are then associated with the novel categories z using, e.g., [sent-23, score-0.31]
15 expert or automatically mined semantic relatedness (cyan lines in Figure 1). [sent-25, score-0.402]
16 Similar to unsupervised learning [32, 28] our approach exploits similarities in the data space via a graph structure to discover dense regions that are associated with coherent categories or concepts (orange graph structure in Figure 1). [sent-26, score-0.421]
17 However, rather than using the raw input space, we map our data into a semantic output space using the trained attribute and category models. (Footnote 1: We use “novel” throughout the paper to denote categories with no or few labeled training instances.) [sent-27, score-0.624]
18 Known categories y, novel categories z, instances x (colors denote predicted category affiliation). [sent-29, score-0.6]
19 Given the uncertain predictions and the graph structure we adapt semi-supervised label propagation [34, 33] to generate more reliable predictions. [sent-32, score-0.32]
20 Note, attribute or category models do not have to be retrained if novel classes are added, which is an important aspect, e.g., when scaling to large numbers of classes. [sent-34, score-0.506]
21 First, we propose a novel approach that extends semantic knowledge transfer to the transductive setting, exploiting similarities in the unlabeled data distribution. [sent-38, score-0.902]
22 The approach enables zero-shot recognition but also smoothly integrates labels for novel classes (Section 3). [sent-39, score-0.388]
23 Second, we improve the local neighborhood structure in the raw feature space by mapping the data into a low-dimensional semantic output space using the trained attribute and category models. [sent-40, score-0.766]
24 Third, we validate our approach on three challenging datasets for two different applications, namely on Animals with Attributes and ImageNet for image classification and on MPII Composites for activity recognition (Section 4). [sent-41, score-0.243]
25 2 Related work: Knowledge transfer or transfer learning aims to transfer information from learned models to changing or unknown data distributions while reducing the need and effort to collect new training labels. [sent-44, score-1.14]
26 In this work we focus on transferring knowledge from known categories with sufficient training instances to novel categories with limited training data. [sent-46, score-0.773]
27 In the computer vision and machine learning literature this setting is typically referred to as zero-shot learning [11, 19, 24, 9, 16] if no instances of the test classes are available, and one- or few-shot learning [16, 9, 8] if one or a few instances are available for the novel classes. [sent-47, score-0.387]
28 To recognize novel categories, zero-shot recognition uses additional information, typically in the form of an intermediate attribute representation [11, 9], direct similarity [24] between categories, or hierarchical structures of categories [35]. [sent-48, score-0.95]
29 The information can either be manually specified [11, 9] or mined automatically from knowledge bases [24, 22]. [sent-49, score-0.239]
30 Our approach builds on these works by using a semantic knowledge transfer approach as the first step. [sent-50, score-0.644]
31 In contrast to related work, our approach uses the above-mentioned semantic knowledge transfer also when a few training examples are available, reducing the dependency on the quality of the samples. [sent-52, score-0.74]
32 Additionally, we exploit the neighborhood structure of the unlabeled instances to improve zero- and few-shot recognition. [sent-54, score-0.384]
33 This is in contrast to previous works, with the exception of the zero-shot approach of [9] that learns a discriminative, latent attribute representation and applies self-training on the unseen categories. [sent-55, score-0.285]
34 To transfer labels from labeled to unlabeled data, label propagation is widely used [34, 33] and has been shown to work successfully in several applications [13, 7]. [sent-60, score-0.706]
35 In this work, we extend transfer learning by considering the neighborhood structure of the novel classes. [sent-61, score-0.584]
36 We thus improve the graph structure by replacing the noisy raw input space with the more compact semantic output space, which has been shown to improve recognition [26, 22]. [sent-65, score-0.545]
37 To improve image classification with reduced training data, [4, 27] use attributes as an intermediate layer and incorporate unlabeled data; however, both works are in a classical semi-supervised learning setting similar to [5], while our setting is transfer learning. [sent-66, score-0.86]
38 In contrast, we use attributes for transfer and exploit the similarity between instances of the novel classes. [sent-69, score-0.826]
39 [4] automatically discover a discriminative attribute representation, while incorporating unlabeled data. [sent-70, score-0.307]
40 This notion of attributes is different from ours, as we want to use semantic attributes to enable transfer from other classes. [sent-71, score-1.0]
41 3 Propagated Semantic Transfer (PST): Our main objective is to robustly recognize novel categories by transferring knowledge from known classes and exploiting the similarity of the test instances. [sent-73, score-0.629]
42 More specifically, our novel approach, called Propagated Semantic Transfer, consists of the following four components: we employ semantic knowledge transfer from known classes to novel classes (Sec. 3.1); [sent-74, score-1.13]
43 we combine the transferred predictions with labels for the novel classes (Sec. 3.2); [sent-76, score-0.367]
44 a similarity metric is defined to achieve a robust graph structure (Sec. 3.3); [sent-78, score-0.227]
45 we propagate this information within the novel classes (Sec. 3.4). [sent-80, score-0.243]
46 3.1 Semantic knowledge transfer: We first transfer knowledge using a semantic representation. [sent-84, score-1.071]
47 We use two strategies to achieve this transfer: i) an attribute representation that employs an intermediate representation of a_1, …, a_M attributes, or ii) direct similarities calculated among the known object classes. [sent-102, score-0.308]
49 Both work without any training examples for z_n, i.e., in a zero-shot setting. [sent-106, score-0.252]
50 An intermediate level of M attribute classifiers p(a_m|x) is trained on the known classes y_k to estimate the presence of attribute a_m in the instance x. [sent-111, score-0.635]
51 The subsequent knowledge transfer requires an external knowledge source that provides class-attribute associations $a_m^{z_n} \in \{0, 1\}$ indicating whether attribute a_m is associated with class z_n. [sent-112, score-1.127]
52 Given this information the probability of the novel class z_n to be present in the instance x can then be estimated [24]: $p(z_n \mid x) \propto \prod_{m=1}^{M} \bigl(2\,p(a_m \mid x)\bigr)^{a_m^{z_n}}$ (1). [sent-115, score-0.609]
53 Direct similarity instead uses the known classes y_1, …, y_U as a predictor for novel class z_n given an instance x [24]: $p(z_n \mid x) \propto \prod_{u=1}^{U} \bigl(2\,p(y_u \mid x)\bigr)^{y_u^{z_n}}$ (2), where $y_u^{z_n}$ provides continuous normalized weights for the strength of the similarity between the novel class z_n and the known class y_u [24]. [sent-120, score-0.762]
54 In normalized (sum) form this reads: for attributes $p(z_n \mid x) \propto \frac{1}{M}\sum_{m=1}^{M} p(a_m \mid x)\, a_m^{z_n}$, and for direct similarity $p(z_n \mid x) \propto \frac{1}{U}\sum_{u=1}^{U} p(y_u \mid x)\, y_u^{z_n}$. [sent-123, score-0.344]
55 For class z_n the semantic knowledge transfer provides $p(z_n \mid x) \in [0, 1]$ for all instances x. [sent-127, score-0.899]
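To make the transfer step concrete, the following is a minimal NumPy sketch of Equations (1) and (2); the function names, the log-domain computation, and the per-instance normalization are our own illustrative choices, not the authors' implementation.

```python
import numpy as np

def attribute_transfer(attr_probs, class_attr):
    """Eq. (1): p(z_n|x) proportional to prod_m (2 p(a_m|x))^(a_m^{z_n}).
    attr_probs: (N, M) attribute classifier outputs p(a_m|x).
    class_attr: (Z, M) binary class-attribute associations a_m^{z_n}.
    Returns (N, Z) scores normalized per instance."""
    p = np.clip(attr_probs, 1e-6, 1 - 1e-6)
    log_scores = np.log(2.0 * p) @ class_attr.T          # sum of logs = log of product
    scores = np.exp(log_scores - log_scores.max(axis=1, keepdims=True))
    return scores / scores.sum(axis=1, keepdims=True)

def direct_similarity_transfer(class_probs, sim_weights):
    """Eq. (2): p(z_n|x) proportional to prod_u (2 p(y_u|x))^(y_u^{z_n}).
    class_probs: (N, U) known-class classifier outputs p(y_u|x).
    sim_weights: (Z, U) continuous normalized similarities y_u^{z_n}."""
    p = np.clip(class_probs, 1e-6, 1 - 1e-6)
    log_scores = np.log(2.0 * p) @ sim_weights.T
    scores = np.exp(log_scores - log_scores.max(axis=1, keepdims=True))
    return scores / scores.sum(axis=1, keepdims=True)
```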
56 3.3 Similarity metric based on discriminative models for graph construction: We enhance transfer learning by also exploiting the neighborhood structure within the novel classes, i.e., the similarities between their unlabeled instances. [sent-135, score-0.68]
57 The k-NN graph is usually built on the raw feature descriptors of the data. [sent-140, score-0.228]
58 We note that the visual representation used for label propagation can be independent of the visual representation used for transfer. [sent-142, score-0.297]
59 While the visual representation for transfer is required to provide good generalization abilities in conjunction with the employed supervised learning strategy, the visual representation for label propagation should induce a good neighborhood structure. [sent-143, score-0.784]
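A possible sketch of this graph construction, assuming a Gaussian kernel on pairwise distances in the chosen representation (e.g., attribute classifier scores rather than raw features) and the symmetric normalization $S = D^{-1/2} W D^{-1/2}$ commonly used with the label propagation of [33]; the kernel choice and bandwidth are our assumptions:

```python
import numpy as np
from scipy.spatial.distance import cdist

def knn_affinity(features, k=10, sigma=1.0):
    """Symmetric normalized k-NN affinity S = D^{-1/2} W D^{-1/2}, built on
    whichever representation is chosen for label propagation."""
    d = cdist(features, features)                    # pairwise Euclidean distances
    W = np.exp(-d ** 2 / (2 * sigma ** 2))           # Gaussian kernel weights
    np.fill_diagonal(W, 0.0)                         # no self-loops
    nn = np.argsort(d, axis=1)[:, 1:k + 1]           # k nearest neighbours, skipping self
    mask = np.zeros_like(W, dtype=bool)
    mask[np.arange(len(W))[:, None], nn] = True
    W = np.where(mask | mask.T, W, 0.0)              # symmetrized k-NN sparsification
    d_inv = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    return W * d_inv[:, None] * d_inv[None, :]
```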
60 3.4 Label propagation with certain and uncertain labels: In this work, we build upon the label propagation of [33]. [sent-155, score-0.294]
61 For each class, labels are propagated through this graph structure, converging to the closed-form solution $L_n^{*} = (I - \alpha S)^{-1} L_n^{(0)}$ for $1 \le n \le N$ (9), with the regularization parameter $\alpha \in (0, 1]$. [sent-171, score-0.313]
62 The resulting framework makes use of the manifold structure underlying the novel classes to regularize the predictions from transfer learning. [sent-172, score-0.683]
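A minimal sketch of Eq. (9), combining the uncertain transferred predictions with the few certain labels (Sec. 3.2) in the initialization $L^{(0)}$; the dense linear solve is our simplification, and a real implementation would likely exploit graph sparsity:

```python
import numpy as np

def propagate(S, transfer_scores, labels=None, alpha=0.8):
    """Closed-form label propagation L* = (I - alpha S)^{-1} L^(0), Eq. (9).
    S:               (N, N) normalized graph affinity.
    transfer_scores: (N, Z) uncertain predictions p(z_n|x) from the transfer step.
    labels:          optional (N,) int array, -1 where unlabeled; certain
                     few-shot labels overwrite the transferred scores."""
    L0 = transfer_scores.copy()
    if labels is not None:
        has_label = labels >= 0
        L0[has_label] = 0.0
        L0[has_label, labels[has_label]] = 1.0   # certain labels as one-hot rows
    n = S.shape[0]
    L_star = np.linalg.solve(np.eye(n) - alpha * S, L0)  # one solve for all classes
    return L_star.argmax(axis=1)                         # predicted novel class
```

With `labels=None` this reduces to zero-shot recognition, which matches the claim that labels for novel classes can be integrated smoothly when available.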
63 AwA: The Animals with Attributes dataset (AwA) [11] is one of the first and most widely used datasets for semantic knowledge transfer and zero-shot recognition. [sent-176, score-0.678]
64 It consists of 1000 image categories which are split into 800 training and 200 test categories according to [23]. [sent-180, score-0.524]
65 It consists of a total of 256 videos; 44 are used for training the attribute representation and 170 as test data. [sent-183, score-0.263]
66 External knowledge sources and similarity measures: Our approach incorporates external knowledge to enable semantic knowledge transfer from known classes y to unseen classes z. [sent-186, score-1.347]
67 We use the class-attribute associations $a_m^{z_n}$ for attribute-based transfer (Equation 1) or the inter-class similarities $y_u^{z_n}$ for direct-similarity-based transfer (Equation 2), provided with the datasets. [sent-187, score-1.057]
68 Manual (AwA): AwA is accompanied by a set of 85 attributes and associations to all 40 training and all 10 test classes. [sent-189, score-0.386]
69 Furthermore, the 370 inner nodes can group several classes into attributes [23]. [sent-192, score-0.34]
70 [Table: results for DAP [11], IAP [11], Zero-Shot Learning [9], and PST (ours), on image descriptors and on attributes.] [sent-194, score-0.351]
71 Predictions with attributes and manually defined associations, in %. [sent-211, score-0.253]
72 Linguistic knowledge bases (AwA, ImageNet): An alternative to manual associations are automatically mined associations. [sent-216, score-0.296]
73 We use the provided similarity matrices which are extracted using different linguistic similarity measures. [sent-217, score-0.27]
74 One can distinguish basic web search (Yahoo Web), web search refined to part associations (Yahoo Holonyms), image search (Yahoo Image and Flickr Image), or use the information of the summary snippets returned by web search (Yahoo Snippets). [sent-219, score-0.407]
75 Script data (MPII Composites): To associate composite cooking activities such as preparing carrots with attributes of fine-grained activities (e.g., peel or cut), we use script data. [sent-221, score-0.644]
76 On all datasets we train attribute or object classifiers (for direct similarity) with one-vs-all SVMs using Mean Stochastic Gradient Descent [23] and, for AwA and MPII Composites, with a χ2 kernel approximation as in [22]. [sent-232, score-0.326]
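A plausible scikit-learn sketch of this training setup, pairing an explicit additive χ² feature map with hinge-loss SGD in one-vs-all mode; the estimator classes, hyperparameters, and random data are our assumptions, not the exact configuration of [23, 22]:

```python
import numpy as np
from sklearn.kernel_approximation import AdditiveChi2Sampler
from sklearn.linear_model import SGDClassifier
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline

# X: non-negative histogram features (N, D); y: attribute or class labels (N,)
rng = np.random.RandomState(0)
X = rng.rand(200, 50)
y = rng.randint(0, 5, size=200)

model = make_pipeline(
    AdditiveChi2Sampler(sample_steps=2),               # explicit chi^2 feature map
    OneVsRestClassifier(SGDClassifier(loss="hinge")),  # one-vs-all linear SVMs via SGD
)
model.fit(X, y)
attribute_scores = model.decision_function(X)          # per-class SVM scores
```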
77 We validate our claim that the classifier output space induces a better neighborhood structure than the raw features by examining the k-Nearest-Neighbour (kNN) quality for both. [sent-243, score-0.254]
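This neighborhood-quality check can be emulated with a leave-one-out kNN majority vote on each representation, using the ground-truth labels only for evaluation; a sketch (our construction, not the authors' evaluation code):

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.stats import mode

def knn_quality(features, y_true, k=10):
    """Leave-one-out kNN majority-vote accuracy; a higher value indicates a
    neighborhood structure better suited to label propagation."""
    d = cdist(features, features)
    np.fill_diagonal(d, np.inf)                  # exclude each point itself
    nn = np.argsort(d, axis=1)[:, :k]            # indices of the k nearest neighbours
    votes = y_true[nn]                           # (N, k) neighbour labels
    pred = np.asarray(mode(votes, axis=1).mode).ravel()
    return float((pred == y_true).mean())

# e.g., compare knn_quality(raw_features, y) against knn_quality(attribute_scores, y)
```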
78 In the following, we compare the performance of the raw features with the attribute classifier representation. [sent-253, score-0.293]
79 [Figure legend: PST (ours), Hierarchy (inner nodes); PST (ours), Yahoo Img direct; LP + object classifiers; x-axis: # training samples per class; panel (b): Few-Shot.] [sent-255, score-0.258]
80 Our Propagated Semantic Transfer, using the raw image descriptors to build a neighborhood structure, achieves 81. [sent-272, score-0.353]
81 To understand the difference in performance between the attribute and the image descriptor space we examine the neighborhood quality used for propagating labels shown in Figure 5b. [sent-278, score-0.485]
82 The k-NN accuracy, measured on the ground truth labels, is significantly higher for the attribute space (green dashed curve) compared to the raw features (solid green). [sent-279, score-0.293]
83 We also compute LP in combination with the similarity metric based on the attribute classifier scores (blue curves). [sent-290, score-0.323]
84 This transfer of knowledge residing in the classifier trained on the known classes already gives a significant improvement in performance. [sent-291, score-0.602]
85 The dashed lines in Figure 3b provide results for automatically mined associations $a_m^{z_n}$ between attributes and classes. [sent-302, score-0.53]
86 It is interesting to note that these automatically mined associations achieve performance very close to the manually defined associations (dashed vs. solid curves). [sent-303, score-0.44]
87 In this plot we use Yahoo Image as the basis for the semantic relatedness, but we also provide the improvements of PST for the other linguistic knowledge sources in the supplemental material. [sent-305, score-0.311]
88 ImageNet - large-scale image classification: In this section we evaluate our Propagated Semantic Transfer approach on a large image classification task with 200 unseen image categories, using the setup proposed by [23]. [sent-308, score-0.516]
89 [Figure: accuracy of the majority vote from kNN (kNN-Classifier) on the test sets' ground truth; x-axis: k nearest neighbours; curves: AwA attribute classifiers, AwA raw features, ImageNet object classifiers, ImageNet raw features.] [sent-314, score-0.626]
90 MPII composite - activity recognition: In the last two subsections, we showed the benefit of Propagated Semantic Transfer on two image classification challenges. [sent-326, score-0.291]
91 This is especially impressive as it reaches the level of supervised training: for the same set of attributes (and very few, ≤ 7 training categories per class) [22] achieve 32. [sent-339, score-0.476]
92 We find these results encouraging, as it is much more difficult to collect and label training examples for this domain than for image classification, and the complexity and compositional nature of activities frequently requires recognizing unseen categories [9]. [sent-343, score-0.612]
93 Novel classes have no or only a few labeled training samples. [sent-346, score-0.227]
94 We propose a novel approach named Propagated Semantic Transfer, which integrates semantic knowledge transfer with the visual similarities of unlabeled instances within the novel classes. [sent-347, score-1.079]
95 We adapt a semi-supervised label-propagation approach by building the neighborhood graph on an expressive, low-dimensional semantic output space and by initializing it with predictions from knowledge transfer. [sent-348, score-0.538]
96 7% multi-class accuracy on the Animals with Attributes dataset for zero-shot recognition, scale to 200 unseen classes on ImageNet, and achieve up to 34. [sent-352, score-0.223]
97 We show that our approach consistently improves performance independent of factors such as (1) the specific datasets and descriptors, (2) different transfer approaches: direct vs. [sent-355, score-0.437]
98 attributes, (3) types of transfer association: manually defined, linguistic knowledge bases, or script data, (4) domain: image and video activity recognition, or (5) model: probabilistic vs. [sent-356, score-0.787]
99 Single-example learning of novel classes using representation by similarity. [sent-363, score-0.283]
100 Remember and transfer what you have learned - recognizing composite activities based on activity spotting. [sent-376, score-0.641]
wordName wordTfidf (topN-words)
[('transfer', 0.357), ('awa', 0.272), ('pst', 0.272), ('mpii', 0.235), ('semantic', 0.217), ('attributes', 0.196), ('attribute', 0.194), ('zn', 0.183), ('categories', 0.18), ('yahoo', 0.163), ('script', 0.16), ('imagenet', 0.15), ('propagated', 0.145), ('classes', 0.144), ('activities', 0.124), ('associations', 0.121), ('composites', 0.112), ('mined', 0.109), ('similarity', 0.102), ('raw', 0.099), ('neighborhood', 0.099), ('novel', 0.099), ('image', 0.095), ('classifiers', 0.091), ('lp', 0.089), ('propagation', 0.085), ('composite', 0.082), ('unlabeled', 0.081), ('recognition', 0.075), ('azn', 0.072), ('instances', 0.072), ('knowledge', 0.07), ('labels', 0.07), ('training', 0.069), ('graph', 0.069), ('category', 0.069), ('linguistic', 0.066), ('rohrbach', 0.064), ('cooking', 0.064), ('external', 0.06), ('descriptors', 0.06), ('snippets', 0.059), ('labeled', 0.059), ('manual', 0.057), ('eccv', 0.057), ('carrots', 0.054), ('hierachy', 0.054), ('idf', 0.054), ('label', 0.054), ('predictions', 0.054), ('object', 0.052), ('unseen', 0.051), ('wordnet', 0.05), ('classi', 0.049), ('yun', 0.048), ('dap', 0.048), ('hierarchy', 0.047), ('auc', 0.047), ('direct', 0.046), ('similarities', 0.045), ('relatedness', 0.044), ('tf', 0.044), ('web', 0.044), ('animals', 0.043), ('xj', 0.04), ('representation', 0.04), ('recognizing', 0.039), ('activity', 0.039), ('visual', 0.039), ('ap', 0.038), ('yk', 0.038), ('cvpr', 0.037), ('ebert', 0.036), ('freqs', 0.036), ('holonyms', 0.036), ('prepare', 0.036), ('knn', 0.034), ('transferring', 0.034), ('datasets', 0.034), ('enable', 0.034), ('intermediate', 0.034), ('transductive', 0.033), ('img', 0.032), ('ers', 0.032), ('automatically', 0.032), ('trained', 0.031), ('supervised', 0.031), ('wikipedia', 0.031), ('literal', 0.029), ('adapt', 0.029), ('structure', 0.029), ('xi', 0.028), ('accuracy', 0.028), ('bases', 0.028), ('improve', 0.028), ('sources', 0.028), ('acc', 0.028), ('quality', 0.027), ('metric', 0.027), ('bars', 0.027)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999928 335 nips-2013-Transfer Learning in a Transductive Setting
Author: Marcus Rohrbach, Sandra Ebert, Bernt Schiele
Abstract: Category models for objects or activities typically rely on supervised learning requiring sufficiently large training sets. Transferring knowledge from known categories to novel classes with no or only a few labels is far less researched even though it is a common scenario. In this work, we extend transfer learning with semi-supervised learning to exploit unlabeled instances of (novel) categories with no or only a few labeled instances. Our proposed approach, Propagated Semantic Transfer, combines three techniques. First, we transfer information from known to novel categories by incorporating external knowledge, such as linguistic or expert-specified information, e.g., by a mid-level layer of semantic attributes. Second, we exploit the manifold structure of novel classes. More specifically, we adapt a graph-based learning algorithm – so far only used for semi-supervised learning – to zero-shot and few-shot learning. Third, we improve the local neighborhood in such graph structures by replacing the raw feature-based representation with a mid-level object- or attribute-based representation. We evaluate our approach on three challenging datasets in two different applications, namely on Animals with Attributes and ImageNet for image classification and on MPII Composites for activity recognition. Our approach consistently outperforms state-of-the-art transfer and semi-supervised approaches on all datasets.
2 0.2428896 356 nips-2013-Zero-Shot Learning Through Cross-Modal Transfer
Author: Richard Socher, Milind Ganjoo, Christopher D. Manning, Andrew Ng
Abstract: This work introduces a model that can recognize objects in images even if no training data is available for the object class. The only necessary knowledge about unseen visual categories comes from unsupervised text corpora. Unlike previous zero-shot learning models, which can only differentiate between unseen classes, our model can operate on a mixture of seen and unseen classes, simultaneously obtaining state of the art performance on classes with thousands of training images and reasonable performance on unseen classes. This is achieved by seeing the distributions of words in texts as a semantic space for understanding what objects look like. Our deep learning model does not require any manually defined semantic or visual features for either words or images. Images are mapped to be close to semantic word vectors corresponding to their classes, and the resulting image embeddings can be used to distinguish whether an image is of a seen or unseen class. We then use novelty detection methods to differentiate unseen classes from seen classes. We demonstrate two novelty detection strategies; the first gives high accuracy on unseen classes, while the second is conservative in its prediction of novelty and keeps the seen classes’ accuracy high.
3 0.21906406 81 nips-2013-DeViSE: A Deep Visual-Semantic Embedding Model
Author: Andrea Frome, Greg S. Corrado, Jon Shlens, Samy Bengio, Jeff Dean, Marc'Aurelio Ranzato, Tomas Mikolov
Abstract: Modern visual recognition systems are often limited in their ability to scale to large numbers of object categories. This limitation is in part due to the increasing difficulty of acquiring sufficient training data in the form of labeled images as the number of object categories grows. One remedy is to leverage data from other sources – such as text data – both to train visual models and to constrain their predictions. In this paper we present a new deep visual-semantic embedding model trained to identify visual objects using both labeled image data as well as semantic information gleaned from unannotated text. We demonstrate that this model matches state-of-the-art performance on the 1000-class ImageNet object recognition challenge while making more semantically reasonable errors, and also show that the semantic information can be exploited to make predictions about tens of thousands of image labels not observed during training. Semantic knowledge improves such zero-shot predictions achieving hit rates of up to 18% across thousands of novel labels never seen by the visual model.
4 0.16651127 93 nips-2013-Discriminative Transfer Learning with Tree-based Priors
Author: Nitish Srivastava, Ruslan Salakhutdinov
Abstract: High capacity classifiers, such as deep neural networks, often struggle on classes that have very few training examples. We propose a method for improving classification performance for such classes by discovering similar classes and transferring knowledge among them. Our method learns to organize the classes into a tree hierarchy. This tree structure imposes a prior over the classifier’s parameters. We show that the performance of deep neural networks can be improved by applying these priors to the weights in the last layer. Our method combines the strength of discriminatively trained deep neural networks, which typically require large amounts of training data, with tree-based priors, making deep neural networks work well on infrequent classes as well. We also propose an algorithm for learning the underlying tree structure. Starting from an initial pre-specified tree, this algorithm modifies the tree to make it more pertinent to the task being solved, for example, removing semantic relationships in favour of visual ones for an image classification task. Our method achieves state-of-the-art classification results on the CIFAR-100 image data set and the MIR Flickr image-text data set.
5 0.14459194 349 nips-2013-Visual Concept Learning: Combining Machine Vision and Bayesian Generalization on Concept Hierarchies
Author: Yangqing Jia, Joshua T. Abbott, Joseph Austerweil, Thomas Griffiths, Trevor Darrell
Abstract: Learning a visual concept from a small number of positive examples is a significant challenge for machine learning algorithms. Current methods typically fail to find the appropriate level of generalization in a concept hierarchy for a given set of visual examples. Recent work in cognitive science on Bayesian models of generalization addresses this challenge, but prior results assumed that objects were perfectly recognized. We present an algorithm for learning visual concepts directly from images, using probabilistic predictions generated by visual classifiers as the input to a Bayesian generalization model. As no existing challenge data tests this paradigm, we collect and make available a new, large-scale dataset for visual concept learning using the ImageNet hierarchy as the source of possible concepts, with human annotators to provide ground truth labels as to whether a new image is an instance of each concept using a paradigm similar to that used in experiments studying word learning in children. We compare the performance of our system to several baseline algorithms, and show a significant advantage results from combining visual classifiers with the ability to identify an appropriate level of abstraction using Bayesian generalization.
6 0.13056384 138 nips-2013-Higher Order Priors for Joint Intrinsic Image, Objects, and Attributes Estimation
7 0.10691684 216 nips-2013-On Flat versus Hierarchical Classification in Large-Scale Taxonomies
8 0.092999667 285 nips-2013-Robust Transfer Principal Component Analysis with Rank Constraints
9 0.088205233 149 nips-2013-Latent Structured Active Learning
10 0.086266577 276 nips-2013-Reshaping Visual Datasets for Domain Adaptation
11 0.084112093 278 nips-2013-Reward Mapping for Transfer in Long-Lived Agents
12 0.081390575 274 nips-2013-Relevance Topic Model for Unstructured Social Group Activity Recognition
13 0.075105153 75 nips-2013-Convex Two-Layer Modeling
14 0.075040482 190 nips-2013-Mid-level Visual Element Discovery as Discriminative Mode Seeking
15 0.071898684 83 nips-2013-Deep Fisher Networks for Large-Scale Image Classification
16 0.070685536 33 nips-2013-An Approximate, Efficient LP Solver for LP Rounding
17 0.069761485 289 nips-2013-Scalable kernels for graphs with continuous attributes
18 0.069201089 211 nips-2013-Non-Linear Domain Adaptation with Boosting
19 0.068372801 84 nips-2013-Deep Neural Networks for Object Detection
20 0.068082899 119 nips-2013-Fast Template Evaluation with Vector Quantization
topicId topicWeight
[(0, 0.173), (1, 0.089), (2, -0.139), (3, -0.092), (4, 0.172), (5, -0.086), (6, -0.027), (7, -0.03), (8, -0.09), (9, 0.057), (10, -0.116), (11, -0.014), (12, 0.005), (13, -0.036), (14, -0.024), (15, -0.059), (16, 0.014), (17, 0.039), (18, 0.054), (19, 0.067), (20, 0.033), (21, 0.017), (22, 0.039), (23, -0.005), (24, -0.024), (25, 0.125), (26, 0.081), (27, -0.019), (28, 0.008), (29, 0.036), (30, -0.059), (31, -0.001), (32, -0.038), (33, 0.056), (34, -0.047), (35, -0.012), (36, 0.066), (37, -0.084), (38, -0.123), (39, 0.059), (40, 0.037), (41, 0.018), (42, 0.079), (43, 0.104), (44, 0.067), (45, 0.054), (46, 0.104), (47, 0.092), (48, -0.013), (49, -0.013)]
simIndex simValue paperId paperTitle
same-paper 1 0.96936977 335 nips-2013-Transfer Learning in a Transductive Setting
Author: Marcus Rohrbach, Sandra Ebert, Bernt Schiele
Abstract: Category models for objects or activities typically rely on supervised learning requiring sufficiently large training sets. Transferring knowledge from known categories to novel classes with no or only a few labels is far less researched even though it is a common scenario. In this work, we extend transfer learning with semi-supervised learning to exploit unlabeled instances of (novel) categories with no or only a few labeled instances. Our proposed approach, Propagated Semantic Transfer, combines three techniques. First, we transfer information from known to novel categories by incorporating external knowledge, such as linguistic or expert-specified information, e.g., by a mid-level layer of semantic attributes. Second, we exploit the manifold structure of novel classes. More specifically, we adapt a graph-based learning algorithm – so far only used for semi-supervised learning – to zero-shot and few-shot learning. Third, we improve the local neighborhood in such graph structures by replacing the raw feature-based representation with a mid-level object- or attribute-based representation. We evaluate our approach on three challenging datasets in two different applications, namely on Animals with Attributes and ImageNet for image classification and on MPII Composites for activity recognition. Our approach consistently outperforms state-of-the-art transfer and semi-supervised approaches on all datasets.
2 0.86692768 356 nips-2013-Zero-Shot Learning Through Cross-Modal Transfer
Author: Richard Socher, Milind Ganjoo, Christopher D. Manning, Andrew Ng
Abstract: This work introduces a model that can recognize objects in images even if no training data is available for the object class. The only necessary knowledge about unseen visual categories comes from unsupervised text corpora. Unlike previous zero-shot learning models, which can only differentiate between unseen classes, our model can operate on a mixture of seen and unseen classes, simultaneously obtaining state of the art performance on classes with thousands of training images and reasonable performance on unseen classes. This is achieved by seeing the distributions of words in texts as a semantic space for understanding what objects look like. Our deep learning model does not require any manually defined semantic or visual features for either words or images. Images are mapped to be close to semantic word vectors corresponding to their classes, and the resulting image embeddings can be used to distinguish whether an image is of a seen or unseen class. We then use novelty detection methods to differentiate unseen classes from seen classes. We demonstrate two novelty detection strategies; the first gives high accuracy on unseen classes, while the second is conservative in its prediction of novelty and keeps the seen classes’ accuracy high.
3 0.79889286 81 nips-2013-DeViSE: A Deep Visual-Semantic Embedding Model
Author: Andrea Frome, Greg S. Corrado, Jon Shlens, Samy Bengio, Jeff Dean, Marc'Aurelio Ranzato, Tomas Mikolov
Abstract: Modern visual recognition systems are often limited in their ability to scale to large numbers of object categories. This limitation is in part due to the increasing difficulty of acquiring sufficient training data in the form of labeled images as the number of object categories grows. One remedy is to leverage data from other sources – such as text data – both to train visual models and to constrain their predictions. In this paper we present a new deep visual-semantic embedding model trained to identify visual objects using both labeled image data as well as semantic information gleaned from unannotated text. We demonstrate that this model matches state-of-the-art performance on the 1000-class ImageNet object recognition challenge while making more semantically reasonable errors, and also show that the semantic information can be exploited to make predictions about tens of thousands of image labels not observed during training. Semantic knowledge improves such zero-shot predictions achieving hit rates of up to 18% across thousands of novel labels never seen by the visual model.
4 0.73520708 349 nips-2013-Visual Concept Learning: Combining Machine Vision and Bayesian Generalization on Concept Hierarchies
Author: Yangqing Jia, Joshua T. Abbott, Joseph Austerweil, Thomas Griffiths, Trevor Darrell
Abstract: Learning a visual concept from a small number of positive examples is a significant challenge for machine learning algorithms. Current methods typically fail to find the appropriate level of generalization in a concept hierarchy for a given set of visual examples. Recent work in cognitive science on Bayesian models of generalization addresses this challenge, but prior results assumed that objects were perfectly recognized. We present an algorithm for learning visual concepts directly from images, using probabilistic predictions generated by visual classifiers as the input to a Bayesian generalization model. As no existing challenge data tests this paradigm, we collect and make available a new, large-scale dataset for visual concept learning using the ImageNet hierarchy as the source of possible concepts, with human annotators to provide ground truth labels as to whether a new image is an instance of each concept using a paradigm similar to that used in experiments studying word learning in children. We compare the performance of our system to several baseline algorithms, and show a significant advantage results from combining visual classifiers with the ability to identify an appropriate level of abstraction using Bayesian generalization.
5 0.71770483 138 nips-2013-Higher Order Priors for Joint Intrinsic Image, Objects, and Attributes Estimation
Author: Vibhav Vineet, Carsten Rother, Philip Torr
Abstract: Many methods have been proposed to solve the problems of recovering intrinsic scene properties such as shape, reflectance and illumination from a single image, and object class segmentation separately. While these two problems are mutually informative, in the past not many papers have addressed this topic. In this work we explore such joint estimation of intrinsic scene properties recovered from an image, together with the estimation of the objects and attributes present in the scene. In this way, our unified framework is able to capture the correlations between intrinsic properties (reflectance, shape, illumination), objects (table, tv-monitor), and materials (wooden, plastic) in a given scene. For example, our model is able to enforce the condition that if a set of pixels take same object label, e.g. table, most likely those pixels would receive similar reflectance values. We cast the problem in an energy minimization framework and demonstrate the qualitative and quantitative improvement in the overall accuracy on the NYU and Pascal datasets.
6 0.65097117 216 nips-2013-On Flat versus Hierarchical Classification in Large-Scale Taxonomies
7 0.62061179 93 nips-2013-Discriminative Transfer Learning with Tree-based Priors
8 0.58493775 226 nips-2013-One-shot learning by inverting a compositional causal process
9 0.57627952 84 nips-2013-Deep Neural Networks for Object Detection
10 0.57403606 279 nips-2013-Robust Bloom Filters for Large MultiLabel Classification Tasks
11 0.57360286 166 nips-2013-Learning invariant representations and applications to face verification
12 0.56969577 27 nips-2013-Adaptive Multi-Column Deep Neural Networks with Application to Robust Image Denoising
13 0.56342977 223 nips-2013-On the Relationship Between Binary Classification, Bipartite Ranking, and Binary Class Probability Estimation
14 0.55957627 182 nips-2013-Manifold-based Similarity Adaptation for Label Propagation
15 0.55035704 343 nips-2013-Unsupervised Structure Learning of Stochastic And-Or Grammars
16 0.54300147 195 nips-2013-Modeling Clutter Perception using Parametric Proto-object Partitioning
17 0.5422703 12 nips-2013-A Novel Two-Step Method for Cross Language Representation Learning
18 0.54002339 163 nips-2013-Learning a Deep Compact Image Representation for Visual Tracking
19 0.52624184 119 nips-2013-Fast Template Evaluation with Vector Quantization
20 0.52429855 183 nips-2013-Mapping paradigm ontologies to and from the brain
topicId topicWeight
[(16, 0.026), (33, 0.203), (34, 0.074), (41, 0.018), (49, 0.074), (56, 0.083), (69, 0.194), (70, 0.045), (85, 0.07), (89, 0.042), (93, 0.057), (95, 0.018)]
simIndex simValue paperId paperTitle
1 0.91001189 98 nips-2013-Documents as multiple overlapping windows into grids of counts
Author: Alessandro Perina, Nebojsa Jojic, Manuele Bicego, Andrzej Truski
Abstract: In text analysis documents are often represented as disorganized bags of words; models of such count features are typically based on mixing a small number of topics [1, 2]. Recently, it has been observed that for many text corpora documents evolve into one another in a smooth way, with some features dropping and new ones being introduced. The counting grid [3] models this spatial metaphor literally: it is a grid of word distributions learned in such a way that a document’s own distribution of features can be modeled as the sum of the histograms found in a window into the grid. The major drawback of this method is that it is essentially a mixture and all the content must be generated by a single contiguous area on the grid. This may be problematic especially for lower dimensional grids. In this paper, we overcome this issue by introducing the Componential Counting Grid which brings the componential nature of topic models to the basic counting grid. We evaluated our approach on document classification and multimodal retrieval obtaining state of the art results on standard benchmarks.
same-paper 2 0.86446315 335 nips-2013-Transfer Learning in a Transductive Setting
Author: Marcus Rohrbach, Sandra Ebert, Bernt Schiele
Abstract: Category models for objects or activities typically rely on supervised learning requiring sufficiently large training sets. Transferring knowledge from known categories to novel classes with no or only a few labels is far less researched even though it is a common scenario. In this work, we extend transfer learning with semi-supervised learning to exploit unlabeled instances of (novel) categories with no or only a few labeled instances. Our proposed approach, Propagated Semantic Transfer, combines three techniques. First, we transfer information from known to novel categories by incorporating external knowledge, such as linguistic or expert-specified information, e.g., by a mid-level layer of semantic attributes. Second, we exploit the manifold structure of novel classes. More specifically, we adapt a graph-based learning algorithm – so far only used for semi-supervised learning – to zero-shot and few-shot learning. Third, we improve the local neighborhood in such graph structures by replacing the raw feature-based representation with a mid-level object- or attribute-based representation. We evaluate our approach on three challenging datasets in two different applications, namely on Animals with Attributes and ImageNet for image classification and on MPII Composites for activity recognition. Our approach consistently outperforms state-of-the-art transfer and semi-supervised approaches on all datasets.
3 0.78002983 331 nips-2013-Top-Down Regularization of Deep Belief Networks
Author: Hanlin Goh, Nicolas Thome, Matthieu Cord, Joo-Hwee Lim
Abstract: Designing a principled and effective algorithm for learning deep architectures is a challenging problem. The current approach involves two training phases: a fully unsupervised learning followed by a strongly discriminative optimization. We suggest a deep learning strategy that bridges the gap between the two phases, resulting in a three-phase learning procedure. We propose to implement the scheme using a method to regularize deep belief networks with top-down information. The network is constructed from building blocks of restricted Boltzmann machines learned by combining bottom-up and top-down sampled signals. A global optimization procedure that merges samples from a forward bottom-up pass and a top-down pass is used. Experiments on the MNIST dataset show improvements over the existing algorithms for deep belief networks. Object recognition results on the Caltech-101 dataset also yield competitive results.
4 0.77084428 304 nips-2013-Sparse nonnegative deconvolution for compressive calcium imaging: algorithms and phase transitions
Author: Eftychios A. Pnevmatikakis, Liam Paninski
Abstract: We propose a compressed sensing (CS) calcium imaging framework for monitoring large neuronal populations, where we image randomized projections of the spatial calcium concentration at each timestep, instead of measuring the concentration at individual locations. We develop scalable nonnegative deconvolution methods for extracting the neuronal spike time series from such observations. We also address the problem of demixing the spatial locations of the neurons using rank-penalized matrix factorization methods. By exploiting the sparsity of neural spiking we demonstrate that the number of measurements needed per timestep is significantly smaller than the total number of neurons, a result that can potentially enable imaging of larger populations at considerably faster rates compared to traditional raster-scanning techniques. Unlike traditional CS setups, our problem involves a block-diagonal sensing matrix and a non-orthogonal sparse basis that spans multiple timesteps. We provide tight approximations to the number of measurements needed for perfect deconvolution for certain classes of spiking processes, and show that this number undergoes a “phase transition,” which we characterize using modern tools relating conic geometry to compressed sensing.
5 0.76768905 190 nips-2013-Mid-level Visual Element Discovery as Discriminative Mode Seeking
Author: Carl Doersch, Abhinav Gupta, Alexei A. Efros
Abstract: Recent work on mid-level visual representations aims to capture information at the level of complexity higher than typical “visual words”, but lower than full-blown semantic objects. Several approaches [5, 6, 12, 23] have been proposed to discover mid-level visual elements, that are both 1) representative, i.e., frequently occurring within a visual dataset, and 2) visually discriminative. However, the current approaches are rather ad hoc and difficult to analyze and evaluate. In this work, we pose visual element discovery as discriminative mode seeking, drawing connections to the well-known and well-studied mean-shift algorithm [2, 1, 4, 8]. Given a weakly-labeled image collection, our method discovers visually-coherent patch clusters that are maximally discriminative with respect to the labels. One advantage of our formulation is that it requires only a single pass through the data. We also propose the Purity-Coverage plot as a principled way of experimentally analyzing and evaluating different visual discovery approaches, and compare our method against prior work on the Paris Street View dataset of [5]. We also evaluate our method on the task of scene classification, demonstrating state-of-the-art performance on the MIT Scene-67 dataset.
6 0.76622087 22 nips-2013-Action is in the Eye of the Beholder: Eye-gaze Driven Model for Spatio-Temporal Action Localization
7 0.76580405 334 nips-2013-Training and Analysing Deep Recurrent Neural Networks
8 0.76538366 213 nips-2013-Nonparametric Multi-group Membership Model for Dynamic Networks
9 0.7650851 300 nips-2013-Solving the multi-way matching problem by permutation synchronization
10 0.76402515 349 nips-2013-Visual Concept Learning: Combining Machine Vision and Bayesian Generalization on Concept Hierarchies
11 0.76364785 275 nips-2013-Reservoir Boosting : Between Online and Offline Ensemble Learning
12 0.76357192 200 nips-2013-Multi-Prediction Deep Boltzmann Machines
13 0.76285279 64 nips-2013-Compete to Compute
14 0.76274306 301 nips-2013-Sparse Additive Text Models with Low Rank Background
15 0.76194471 276 nips-2013-Reshaping Visual Datasets for Domain Adaptation
16 0.76187617 285 nips-2013-Robust Transfer Principal Component Analysis with Rank Constraints
17 0.76158088 356 nips-2013-Zero-Shot Learning Through Cross-Modal Transfer
18 0.76072484 163 nips-2013-Learning a Deep Compact Image Representation for Visual Tracking
19 0.76056945 236 nips-2013-Optimal Neural Population Codes for High-dimensional Stimulus Variables
20 0.75924951 176 nips-2013-Linear decision rule as aspiration for simple decision heuristics