nips nips2012 nips2012-168 knowledge-graph by maker-knowledge-mining

168 nips-2012-Kernel Latent SVM for Visual Recognition


Source: pdf

Author: Weilong Yang, Yang Wang, Arash Vahdat, Greg Mori

Abstract: Latent SVMs (LSVMs) are a class of powerful tools that have been successfully applied to many applications in computer vision. However, a limitation of LSVMs is that they rely on linear models. For many computer vision tasks, linear models are suboptimal and nonlinear models learned with kernels typically perform much better. Therefore it is desirable to develop the kernel version of LSVM. In this paper, we propose kernel latent SVM (KLSVM) – a new learning framework that combines latent SVMs and kernel methods. We develop an iterative training algorithm to learn the model parameters. We demonstrate the effectiveness of KLSVM using three different applications in visual recognition. Our KLSVM formulation is very general and can be applied to solve a wide range of applications in computer vision and machine learning. 1

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 In this paper, we propose kernel latent SVM (KLSVM) – a new learning framework that combines latent SVMs and kernel methods. [sent-11, score-0.752]

2 The person detection algorithm in [2] is an example of the success of linear classifiers in computer vision. [sent-21, score-0.236]

3 The first one is to introduce latent variables into the linear model. [sent-29, score-0.292]

4 DPM captures shape and pose variations of an object class with a root template covering the whole object and several part templates. [sent-31, score-0.335]

5 Learning a DPM involves solving a latent SVM. (Footnote 1: Without loss of generality, we assume linear models without the bias term.) [sent-33, score-0.26]

6 SVM (LSVM) [5, 17] – an extension of regular linear SVM for handling latent variables. [sent-34, score-0.285]

7 For example, in object detection, the training data are weakly labeled because we are only given the bounding boxes of the objects without the detailed annotation for each part. [sent-36, score-0.38]

8 In addition to modeling part deformation, another popular application of LSVM is to use it as a mixture model where the mixture component is represented as a latent variable [5, 6, 16]. [sent-37, score-0.263]

9 A limitation of kernel methods is that the learning is more expensive than linear classifiers on large datasets, although efficient algorithms exist for certain types of kernels (e. [sent-40, score-0.245]

10 The latent variables in LSVM can often have some intuitive and semantic meanings. [sent-47, score-0.283]

11 Examples of latent variables in the literature include part locations in object detection [5], subcategories in video annotation [16], object localization in image classification [8], etc. [sent-49, score-0.726]

12 Since the number of support vectors can vary depending on the training data, kernel methods can adapt their model complexity to fit the data. [sent-54, score-0.211]

13 In this paper, we propose kernel latent SVM (KLSVM) – a new learning framework that combines latent SVMs and kernel methods. [sent-55, score-0.752]

14 On one hand, the latent variables in KLSVM can be something intuitive and semantically meaningful. [sent-57, score-0.283]

15 We demonstrate KLSVM on three applications in visual recognition: 1) object classification with latent localization; 2) object classification with latent subcategories; 3) recognition of object interactions. [sent-59, score-0.906]

16 Preliminaries: In this section, we introduce some background on latent SVM and on the dual form of SVMs used for deriving kernel SVMs. [sent-60, score-0.431]

17 Each instance is also associated with a latent variable h that captures some unobserved information about the data. [sent-64, score-0.312]

18 To simplify the notation, we also assume the latent variable h takes its value from a discrete set of labels h ∈ H. [sent-74, score-0.263]

19 In latent SVM, the scoring function of sample x is defined as f_w(x) = max_h w⊤φ(x, h), where φ(x, h) is the feature vector defined for the pair (x, h). [sent-81, score-0.332]
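
A minimal sketch of this scoring rule, assuming the latent variable h ranges over a small discrete set that can be enumerated; the names lsvm_score, phi and H are illustrative placeholders, not the authors' code:

```python
import numpy as np

def lsvm_score(w, phi, x, H):
    """Latent SVM score f_w(x) = max_h w^T phi(x, h).

    w   : (d,) weight vector (NumPy array)
    phi : callable mapping (x, h) to a (d,) joint feature vector
    H   : sequence of candidate latent values (e.g. object locations)
    """
    scores = np.array([w @ phi(x, h) for h in H])
    best = int(np.argmax(scores))
    return scores[best], H[best]  # best score and the maximizing latent value
```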

20 For example, in the “car model” example, φ(x, h) can be a feature vector extracted from the image patch at location h of the image x. [sent-82, score-0.252]

21 However, the learning problem becomes convex once the latent variable h is fixed for positive examples. [sent-85, score-0.29]

22 More formally, we are given M positive samples {(x_i, h_i)}_{i=1}^{M}, and N negative samples {x_j}_{j=M+1}^{M+N}. [sent-93, score-0.297]

23 Σ_i α_i + Σ_j Σ_h β_{j,h} − (1/2) ||Σ_i α_i φ(x_i, h_i) − Σ_j Σ_h β_{j,h} φ(x_j, h)||² (2a), s.t. 0 ≤ α_i ≤ C1, ∀i; 0 ≤ β_{j,h} ≤ C2, ∀j, ∀h ∈ H (2b). The optimal primal parameters w* for Eq. [sent-117, score-0.27]

24 2 are related as follows: w* = Σ_i α_i* φ(x_i, h_i) − Σ_j Σ_h β*_{j,h} φ(x_j, h) (3). Let us define λ to be the concatenation of {α_i : ∀i} and {β_{j,h} : ∀j, ∀h ∈ H}, so |λ| = M + N×|H|. [sent-119, score-0.27]

25 Ψ is obtained by stacking together {φ(xi , hi ) : ∀i} and {−φ(xj , h) : ∀j, ∀h ∈ H}. [sent-121, score-0.27]

26 The scoring function for a testing image x_new can be kernelized as follows: f(x_new) = max_{h_new} [ Σ_i α_i* k(φ(x_i, h_i), φ(x_new, h_new)) − Σ_j Σ_h β*_{j,h} k(φ(x_j, h), φ(x_new, h_new)) ]. [sent-128, score-0.54]
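
A sketch of this kernelized scoring rule under illustrative assumptions: the positive support vectors are given as (α_i*, x_i, h_i) triples, the negative expansion as (β*_{j,h}, x_j, h) triples, and k is the joint kernel k(x, h, x', h') = k(φ(x, h), φ(x', h')); all names are hypothetical:

```python
def klsvm_score(x_new, H_new, pos_sv, neg_sv, k):
    """Kernelized LSVM score of a test image: max over h_new of the dual expansion.

    pos_sv : list of (alpha_i, x_i, h_i) triples with alpha_i > 0
    neg_sv : list of (beta_jh, x_j, h) triples with beta_jh > 0
    k      : joint kernel, k(x, h, x2, h2)
    """
    def score(h_new):
        pos = sum(a * k(x_i, h_i, x_new, h_new) for a, x_i, h_i in pos_sv)
        neg = sum(b * k(x_j, h, x_new, h_new) for b, x_j, h in neg_sv)
        return pos - neg
    return max(score(h) for h in H_new)
```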

27 In the next section, we will exploit this fact to develop the kernel latent support vector machines. [sent-132, score-0.399]

28 In this section, we propose kernel latent SVM (KLSVM) – a new latent variable learning method that only requires a kernel function K(x, h, x′, h′) between a pair of (x, h) and (x′, h′). [sent-136, score-0.787]

29 Σ_i α_i + Σ_j Σ_h β_{j,h} − (1/2) ||Σ_i α_i φ(x_i, h_i) − Σ_j Σ_h β_{j,h} φ(x_j, h)||² (5b), s.t. 0 ≤ α_i ≤ C1, ∀i; 0 ≤ β_{j,h} ≤ C2, ∀j, ∀h ∈ H (5c). The most straightforward way of solving Eq. [sent-146, score-0.27]

30 When hi takes its value from a discrete set of K possible choices (i. [sent-148, score-0.27]

31 8(b) as follows: h_t* = arg max_{h_t} [ α_t² k(φ(x_t, h_t), φ(x_t, h_t)) + 2 Σ_{i≠t} α_i α_t k(φ(x_i, h_i), φ(x_t, h_t)) − 2 Σ_j Σ_h β_{j,h} α_t k(φ(x_j, h), φ(x_t, h_t)) ] (9). It is interesting to notice that if the t-th example is not a support vector (i. [sent-168, score-1.742]
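
The update in Eq. 9 only touches kernel values, so it can be sketched directly from the dual variables; the data structures below (index lists, triples for the negative expansion) are illustrative assumptions, not the authors' implementation:

```python
def update_latent_sv(t, H, pos_idx, neg_sv, k, x, h_cur, alpha):
    """Latent-variable update for positive support vector t (Eq. 9),
    expressed purely in terms of kernel evaluations.

    pos_idx : indices of the positive training examples
    neg_sv  : list of (beta_jh, x_j, h) triples
    k       : joint kernel on pairs (x, h), (x2, h2)
    """
    a_t = alpha[t]
    def objective(h_t):
        val = a_t * a_t * k(x[t], h_t, x[t], h_t)
        val += 2 * sum(alpha[i] * a_t * k(x[i], h_cur[i], x[t], h_t)
                       for i in pos_idx if i != t)
        val -= 2 * sum(b * a_t * k(x_j, h, x[t], h_t) for b, x_j, h in neg_sv)
        return val
    return max(H, key=objective)
```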

32 For other positive examples (non-support vectors), we can simply set their latent variables the same as in the previous iteration. [sent-174, score-0.313]

33 8 becomes: h_t* = arg max_{h_t} ||α_t φ(x_t, h_t)||² + 2 (w − α_t φ(x_t, h_t^old))⊤ α_t φ(x_t, h_t) (10a) ⇔ arg max_{h_t} α_t w⊤φ(x_t, h_t) + (1/2) α_t² ||φ(x_t, h_t)||² − α_t² φ(x_t, h_t^old)⊤ φ(x_t, h_t) (10b), where h_t^old is the value of the latent variable of the t-th example in the previous iteration. [sent-181, score-0.298]

34 In this case, α_t φ(x_t, h_t)⊤φ(x_t, h_t) is a constant, and we have φ(x_t, h_t^old)⊤φ(x_t, h_t^old) > φ(x_t, h_t^old)⊤φ(x_t, h_t) if h_t ≠ h_t^old. [sent-183, score-1.384]

35 10 is equivalent to: h_t* = arg max_{h_t} w⊤φ(x_t, h_t) − α_t φ(x_t, h_t^old)⊤φ(x_t, h_t) (11). Eq. [sent-185, score-0.946]

36 , h_t* = arg max_{h_t} w⊤φ(x_t, h_t), but with an extra term α_t φ(x_t, h_t^old)⊤φ(x_t, h_t) which penalizes the choice of h_t for being the same value as the previous iteration h_t^old. [sent-188, score-1.035]
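
A sketch of this simplified update (the constant-norm case, Eq. 11), assuming explicit NumPy feature vectors are available; here w denotes the current primal weights recovered via Eq. 3, and all names are illustrative:

```python
def update_latent_simplified(w, phi, x_t, H, alpha_t, h_old):
    """Simplified update of Eq. 11 (constant feature norm): prefer the h_t that
    scores high under w but is penalized for staying close to h_old.

    w and the outputs of phi are assumed to be NumPy arrays of matching shape.
    """
    phi_old = phi(x_t, h_old)
    def objective(h_t):
        f = phi(x_t, h_t)
        return w @ f - alpha_t * (phi_old @ f)  # penalty for keeping h_old
    return max(H, key=objective)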

37 If the t-th positive example is a support vector, the latent variable h_t^old from the previous iteration causes this example to lie very close to (or even on the wrong side of) the decision boundary, i. [sent-190, score-0.4]

38 The amount of penalty depends on the magnitudes of α_t and φ(x_t, h_t^old)⊤φ(x_t, h_t). [sent-195, score-0.346]

39 We can interpret α_t as how “bad” h_t^old is, and φ(x_t, h_t^old)⊤φ(x_t, h_t) as how close h_t is to h_t^old. [sent-196, score-0.755]

40 2 Composite Kernels So far we have assumed that the latent variable h takes its value from a discrete set of labels. [sent-200, score-0.263]

41 First of all, it allows us to exploit structural information in the latent variables. [sent-206, score-0.228]

42 In the following, we will show how to compose a new kernel for the “person riding a bike” classifier from those components. [sent-221, score-0.28]

43 We denote the latent variable using h to emphasize that now it is a vector instead of a single discrete value. [sent-222, score-0.263]

44 For the structured latent variable, it is assumed that there are certain dependencies between some pairs of (zu , zv ). [sent-227, score-0.28]

45 We can use an undirected graph G = (V, E) to capture the structure of the latent variable, where a vertex u ∈ V corresponds to the label zu , and an edge (u, v) ∈ E corresponds to the dependency between zu and zv . [sent-228, score-0.66]

46 The latent variable in this case has two components h = (zperson , zbike ) corresponding to the location of person and bike, respectively. [sent-230, score-0.509]

47 On the training data, we have access to the ground-truth bounding box of “person riding a bike” as a whole, but not the exact location of “person” or “bike” within the bounding box. [sent-231, score-0.463]

48 Suppose we already have kernel functions corresponding to the vertices and edges in the graph; we can then define the composite kernel as the summation of the kernels over all the vertices and edges. [sent-237, score-0.367]

49 Figure 1: Visualization of how the latent variable (i. [sent-238, score-0.263]

50 The red bounding box corresponds to the initial object location. [sent-241, score-0.264]

51 The blue bounding box corresponds to the object location after the learning. [sent-242, score-0.32]

52 [Table: classification accuracy, Acc (%), for BOF + linear SVM, BOF + kernel SVM, linear LSVM, and KLSVM.]

53 K(Φ(x, h), Φ(x′, h′)) = Σ_{u∈V} k_u(φ(x, z_u), φ(x′, z′_u)) + Σ_{(u,v)∈E} k_{uv}(ψ(x, z_u, z_v), ψ(x′, z′_u, z′_v)) (12). When the latent variable h forms a tree structure, there exist efficient inference algorithms for solving Eq. [sent-253, score-1.179]
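
A sketch of the composite kernel in Eq. 12, assuming the per-vertex kernels k_u, per-edge kernels k_uv, and the feature maps φ and ψ are supplied by the caller; the dictionary-based representation of the graph is an illustrative choice, not the paper's code:

```python
def composite_kernel(x, h, x2, h2, V, E, k_u, k_uv, phi, psi):
    """Composite kernel of Eq. 12: vertex kernels plus edge kernels over the
    latent-structure graph G = (V, E).

    h, h2 : dicts mapping vertex u to its latent label z_u
    k_u   : dict u -> kernel on vertex features
    k_uv  : dict (u, v) -> kernel on edge features
    phi   : (x, z_u) -> vertex feature;  psi : (x, z_u, z_v) -> edge feature
    """
    val = sum(k_u[u](phi(x, h[u]), phi(x2, h2[u])) for u in V)
    val += sum(k_uv[(u, v)](psi(x, h[u], h[v]), psi(x2, h2[u], h2[v]))
               for (u, v) in E)
    return val
```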

54 Each application has a different type of latent variables. [sent-258, score-0.228]

55 Our training data only have image-level labels indicating the presence/absence of each object category in an image. [sent-263, score-0.219]

56 The exact object location in the image is not provided and is considered as the latent variable h in our formulation. [sent-264, score-0.508]

57 We define the feature vector φ(x, h) as the HOG feature extracted from the image at location h. [sent-265, score-0.221]

58 We assume the object size is the same for the images of the same category, which is a reasonable assumption for this dataset. [sent-270, score-0.204]

59 To demonstrate the benefit of using latent variables, we also compare with two simple baselines using linear and kernel SVMs based on bag-of-features (BOF) extracted from the whole image (i. [sent-273, score-0.561]

60 We use the histogram intersection kernel (HIK) [10] since it has been proved to be successful for vision applications, and efficient learning/inference algorithms exist for this kernel. [sent-278, score-0.215]
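
For reference, the histogram intersection kernel itself is a one-liner on two (typically L1-normalized) histograms; a minimal sketch:

```python
import numpy as np

def hik(p, q):
    """Histogram intersection kernel: sum of elementwise minima of two histograms."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.minimum(p, q).sum())
```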

61 In each round, we randomly split the images from each category into training and testing sets. [sent-280, score-0.201]

62 For both linear LSVM and KLSVM, we initialize the latent variable at the center location of each image and we set C1 = C2 = 1. [sent-281, score-0.416]

63 So the BOF feature without latent variables cannot capture the subtle differences between categories. [sent-287, score-0.294]

64 Fig. 1 shows examples of how the latent variables change on some training images during the learning of the KLSVM. [sent-290, score-0.406]

65 For each training image, the location of the object (latent variable h) is initialized to the center of the image. [sent-291, score-0.255]

66 After the learning algorithm terminates, the latent variables accurately locate the objects. [sent-292, score-0.291]

67 [Table: classification accuracy, Acc (%), for non-latent linear SVM, linear LSVM, non-latent kernel SVM, and KLSVM.]

68 But here we consider a different type of latent variable. [sent-309, score-0.228]

69 Here we use the latent variable h to indicate the subcategory an image belongs to. [sent-317, score-0.439]

70 If a training image belongs to the class c, its subcategory label h takes value from a set Hc of subcategory labels corresponding to the c-th class. [sent-318, score-0.327]

71 Note that subcategories are latent on the training data, so they may or may not have semantic meanings. [sent-319, score-0.327]

72 Results: Again we compare with three baselines: linear LSVM, non-latent linear SVM, non-latent kernel SVM. [sent-328, score-0.212]

73 For non-latent approaches, we simply feed feature vector φ(x) to SVMs without using any latent variable. [sent-330, score-0.262]

74 01 for all the experiments and initialize the subcategory labels of training images by k-means clustering. [sent-334, score-0.231]
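
A sketch of this k-means initialization under illustrative assumptions (scikit-learn's KMeans is used as a stand-in; the paper does not specify the clustering implementation):

```python
import numpy as np
from sklearn.cluster import KMeans

def init_subcategory_labels(features, n_subcategories, seed=0):
    """Cluster the image features of one class; the cluster ids serve as the
    initial latent subcategory labels for that class."""
    X = np.asarray(features, dtype=float)
    km = KMeans(n_clusters=n_subcategories, n_init=10, random_state=seed)
    return km.fit_predict(X)  # one subcategory label per training image
```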

75 It is interesting to note that both linear LSVM and KLSVM outperform their non-latent counterparts, which demonstrates the effectiveness of using latent subcategories in object classification. [sent-337, score-0.443]

76 Recognition of Object Interaction. Problem and Dataset: Finally, we consider an application where the latent variable is more complex and requires the composite kernel introduced in Sec. [sent-343, score-0.228]

77 Each image is associated with only one of the four object interaction labels. [sent-351, score-0.225]

78 Our approach: We treat the locations of objects as latent variables. [sent-354, score-0.298]

79 For example, when learning the model for “person riding a bicycle”, we treat the locations of “person” and “bicycle” as latent variables. [sent-355, score-0.401]

80 In this example, each image is associated with latent variables h = (z1 , z2 ), where z1 denotes the location of the “person” and z2 denotes the location of the “bicycle”. [sent-356, score-0.437]

81 To reduce the search space of inference, we first apply off-the-shelf “person” and “bicycle” detectors [5]. [Table: Acc (%) for BOF + linear SVM, BOF + kernel SVM, linear LSVM, and KLSVM.] [sent-357, score-0.212]

82 For the approaches using latent variables, we show the mean/std of classification accuracies over five folds of experiments. [sent-364, score-0.259]

83 Figure 3: Visualization of how latent variables (i. [sent-365, score-0.26]

84 The left image is from the “person riding a bicycle” category, and the right image is from the “person next to a car” category. [sent-368, score-0.262]

85 Yellow bounding boxes correspond to the initial object locations. [sent-369, score-0.281]

86 The blue bounding boxes correspond to the object locations after the learning. [sent-370, score-0.322]

87 Then the kernel between two images can be defined as follows: K(Φ(x, h), Φ(x′, h′)) = Σ_{u∈{1,2}} k_u(φ(x, z_u), φ(x′, z′_u)) + k_p(ψ(z_1, z_2), ψ(z′_1, z′_2)) (13). We define φ(x, z_u) as the bag-of-features (BOF) extracted from the bounding box z_u in the image x. [sent-377, score-1.332]

88 The kernel kp (·) captures the spatial relationship between z1 and z2 such as above, below, overlapping, next-to, near, and far. [sent-383, score-0.227]
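
One possible sketch of a spatial-relation feature ψ(z_1, z_2) over the listed relations; the boxes are assumed to be (x_min, y_min, x_max, y_max) tuples, and the particular encoding and threshold are illustrative assumptions, not the paper's definition:

```python
import numpy as np

def spatial_relation(z1, z2, near_thresh=0.5):
    """Coarse indicator features for the relation of box z1 to box z2:
    above, below, overlapping, near, far (image y-axis points downward)."""
    cx1, cy1 = (z1[0] + z1[2]) / 2.0, (z1[1] + z1[3]) / 2.0
    cx2, cy2 = (z2[0] + z2[2]) / 2.0, (z2[1] + z2[3]) / 2.0
    overlap = not (z1[2] < z2[0] or z2[2] < z1[0] or
                   z1[3] < z2[1] or z2[3] < z1[1])
    diag = np.hypot(z2[2] - z2[0], z2[3] - z2[1])  # normalize by box-2 size
    dist = np.hypot(cx1 - cx2, cy1 - cy2) / max(diag, 1e-6)
    return np.array([cy1 < cy2,           # z1 above z2
                     cy1 > cy2,           # z1 below z2
                     overlap,             # boxes overlap
                     dist < near_thresh,  # near
                     dist >= near_thresh  # far
                     ], dtype=float)
```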

89 Note that this is a strong baseline since [3] has shown that a similar pyramid feature representation with kernel SVM achieves top performances on the task of person-object interaction recognition. [sent-391, score-0.218]

90 We run the experiments for five rounds for approaches using latent variables. [sent-396, score-0.256]

91 The kernel latent SVM that uses HIK for ku (·) achieves the best performance. [sent-399, score-0.428]

92 Fig. 3 shows examples of how the latent variables change on some training images during the learning of the KLSVM. [sent-401, score-0.406]

93 For each training image, both latent variables z1 and z2 are randomly initialized to one of five candidate bounding boxes. [sent-402, score-0.395]

94 As we can see, the initial bounding boxes can accurately locate the target objects, but their spatial relations differ from the ground-truth labels. [sent-403, score-0.217]

95 After the learning algorithm terminates, the latent variables not only locate the target objects but, more importantly, also capture the correct spatial relationship between objects. [sent-404, score-0.291]

96 Conclusion: We have proposed kernel latent SVM – a new learning framework that combines the benefits of LSVM and kernel methods. [sent-405, score-0.524]

97 The latent variables can be not only a single discrete value but also more complex values with interdependent structures. [sent-407, score-0.26]

98 Our experimental results on three different applications in visual recognition demonstrate that KLSVM outperforms using LSVM or using kernel methods alone. [sent-408, score-0.226]

99 Classification using intersection kernel support vector machines is efficient. [sent-476, score-0.194]

100 Discriminative tag learning on youtube videos with latent sub-tags. [sent-518, score-0.254]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('lsvm', 0.476), ('klsvm', 0.378), ('ht', 0.283), ('hi', 0.27), ('latent', 0.228), ('zu', 0.19), ('bof', 0.175), ('person', 0.154), ('kernel', 0.148), ('svm', 0.143), ('riding', 0.132), ('hik', 0.127), ('object', 0.124), ('xt', 0.121), ('subcategory', 0.111), ('bicycle', 0.11), ('bike', 0.102), ('bounding', 0.095), ('images', 0.08), ('car', 0.075), ('bird', 0.068), ('svms', 0.067), ('image', 0.065), ('xnew', 0.063), ('hold', 0.063), ('boxes', 0.062), ('classi', 0.062), ('subcategories', 0.059), ('hog', 0.058), ('location', 0.056), ('category', 0.055), ('kp', 0.055), ('dual', 0.055), ('fraser', 0.054), ('zv', 0.052), ('ku', 0.052), ('dpm', 0.048), ('xj', 0.046), ('ve', 0.045), ('box', 0.045), ('vision', 0.044), ('visual', 0.043), ('mammal', 0.041), ('kernels', 0.041), ('locations', 0.041), ('training', 0.04), ('fw', 0.038), ('hc', 0.038), ('template', 0.037), ('acc', 0.036), ('hnew', 0.036), ('lsvms', 0.036), ('zbike', 0.036), ('zperson', 0.036), ('interaction', 0.036), ('variable', 0.035), ('recognition', 0.035), ('arg', 0.034), ('feature', 0.034), ('extracted', 0.032), ('linear', 0.032), ('variables', 0.032), ('maxh', 0.032), ('mori', 0.032), ('locate', 0.031), ('ers', 0.031), ('accuracies', 0.031), ('composite', 0.03), ('weakly', 0.03), ('baselines', 0.03), ('objects', 0.029), ('visually', 0.029), ('kernelized', 0.029), ('simon', 0.028), ('rounds', 0.028), ('nonlinear', 0.028), ('labelings', 0.028), ('boat', 0.028), ('localization', 0.027), ('positive', 0.027), ('detector', 0.027), ('whole', 0.026), ('testing', 0.026), ('youtube', 0.026), ('examples', 0.026), ('detection', 0.026), ('penalizes', 0.026), ('unobserved', 0.025), ('margin', 0.025), ('handling', 0.025), ('decision', 0.024), ('computer', 0.024), ('enumerating', 0.024), ('xi', 0.024), ('limitation', 0.024), ('captures', 0.024), ('intersection', 0.023), ('support', 0.023), ('intuitive', 0.023), ('cation', 0.023)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000001 168 nips-2012-Kernel Latent SVM for Visual Recognition

Author: Weilong Yang, Yang Wang, Arash Vahdat, Greg Mori

Abstract: Latent SVMs (LSVMs) are a class of powerful tools that have been successfully applied to many applications in computer vision. However, a limitation of LSVMs is that they rely on linear models. For many computer vision tasks, linear models are suboptimal and nonlinear models learned with kernels typically perform much better. Therefore it is desirable to develop the kernel version of LSVM. In this paper, we propose kernel latent SVM (KLSVM) – a new learning framework that combines latent SVMs and kernel methods. We develop an iterative training algorithm to learn the model parameters. We demonstrate the effectiveness of KLSVM using three different applications in visual recognition. Our KLSVM formulation is very general and can be applied to solve a wide range of applications in computer vision and machine learning. 1

2 0.17182815 195 nips-2012-Learning visual motion in recurrent neural networks

Author: Marius Pachitariu, Maneesh Sahani

Abstract: We present a dynamic nonlinear generative model for visual motion based on a latent representation of binary-gated Gaussian variables. Trained on sequences of images, the model learns to represent different movement directions in different variables. We use an online approximate inference scheme that can be mapped to the dynamics of networks of neurons. Probed with drifting grating stimuli and moving bars of light, neurons in the model show patterns of responses analogous to those of direction-selective simple cells in primary visual cortex. Most model neurons also show speed tuning and respond equally well to a range of motion directions and speeds aligned to the constraint line of their respective preferred speed. We show how these computations are enabled by a specific pattern of recurrent connections learned by the model. 1

3 0.15132993 197 nips-2012-Learning with Recursive Perceptual Representations

Author: Oriol Vinyals, Yangqing Jia, Li Deng, Trevor Darrell

Abstract: Linear Support Vector Machines (SVMs) have become very popular in vision as part of state-of-the-art object recognition and other classification tasks but require high dimensional feature spaces for good performance. Deep learning methods can find more compact representations but current methods employ multilayer perceptrons that require solving a difficult, non-convex optimization problem. We propose a deep non-linear classifier whose layers are SVMs and which incorporates random projection as its core stacking element. Our method learns layers of linear SVMs recursively transforming the original data manifold through a random projection of the weak prediction computed from each layer. Our method scales as linear SVMs, does not rely on any kernel computations or nonconvex optimization, and exhibits better generalization ability than kernel-based SVMs. This is especially true when the number of training samples is smaller than the dimensionality of data, a common scenario in many real-world applications. The use of random projections is key to our method, as we show in the experiments section, in which we observe a consistent improvement over previous –often more complicated– methods on several vision and speech benchmarks. 1

4 0.14664666 344 nips-2012-Timely Object Recognition

Author: Sergey Karayev, Tobias Baumgartner, Mario Fritz, Trevor Darrell

Abstract: In a large visual multi-class detection framework, the timeliness of results can be crucial. Our method for timely multi-class detection aims to give the best possible performance at any single point after a start time; it is terminated at a deadline time. Toward this goal, we formulate a dynamic, closed-loop policy that infers the contents of the image in order to decide which detector to deploy next. In contrast to previous work, our method significantly diverges from the predominant greedy strategies, and is able to learn to take actions with deferred values. We evaluate our method with a novel timeliness measure, computed as the area under an Average Precision vs. Time curve. Experiments are conducted on the PASCAL VOC object detection dataset. If execution is stopped when only half the detectors have been run, our method obtains 66% better AP than a random ordering, and 14% better performance than an intelligent baseline. On the timeliness measure, our method obtains at least 11% better performance. Our method is easily extensible, as it treats detectors and classifiers as black boxes and learns from execution traces using reinforcement learning. 1

5 0.13024627 188 nips-2012-Learning from Distributions via Support Measure Machines

Author: Krikamol Muandet, Kenji Fukumizu, Francesco Dinuzzo, Bernhard Schölkopf

Abstract: This paper presents a kernel-based discriminative learning framework on probability measures. Rather than relying on large collections of vectorial training examples, our framework learns using a collection of probability distributions that have been constructed to meaningfully represent training data. By representing these probability distributions as mean embeddings in the reproducing kernel Hilbert space (RKHS), we are able to apply many standard kernel-based learning techniques in straightforward fashion. To accomplish this, we construct a generalization of the support vector machine (SVM) called a support measure machine (SMM). Our analyses of SMMs provides several insights into their relationship to traditional SVMs. Based on such insights, we propose a flexible SVM (FlexSVM) that places different kernel functions on each training example. Experimental results on both synthetic and real-world data demonstrate the effectiveness of our proposed framework. 1

6 0.12357195 360 nips-2012-Visual Recognition using Embedded Feature Selection for Curvature Self-Similarity

7 0.12336375 13 nips-2012-A Nonparametric Conjugate Prior Distribution for the Maximizing Argument of a Noisy Function

8 0.11085517 311 nips-2012-Shifting Weights: Adapting Object Detectors from Image to Video

9 0.10621398 357 nips-2012-Unsupervised Template Learning for Fine-Grained Object Recognition

10 0.10492373 289 nips-2012-Recognizing Activities by Attribute Dynamics

11 0.09788432 121 nips-2012-Expectation Propagation in Gaussian Process Dynamical Systems

12 0.095745601 324 nips-2012-Stochastic Gradient Descent with Only One Projection

13 0.089822128 106 nips-2012-Dynamical And-Or Graph Learning for Object Shape Modeling and Detection

14 0.088769548 228 nips-2012-Multilabel Classification using Bayesian Compressed Sensing

15 0.087238744 1 nips-2012-3D Object Detection and Viewpoint Estimation with a Deformable 3D Cuboid Model

16 0.086176179 209 nips-2012-Max-Margin Structured Output Regression for Spatio-Temporal Action Localization

17 0.08585266 172 nips-2012-Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs

18 0.085665718 242 nips-2012-Non-linear Metric Learning

19 0.085624762 264 nips-2012-Optimal kernel choice for large-scale two-sample tests

20 0.082854815 306 nips-2012-Semantic Kernel Forests from Multiple Taxonomies


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.212), (1, 0.033), (2, -0.122), (3, 0.05), (4, 0.13), (5, -0.119), (6, -0.009), (7, -0.025), (8, -0.035), (9, -0.101), (10, 0.002), (11, 0.064), (12, 0.177), (13, -0.032), (14, 0.08), (15, 0.045), (16, -0.028), (17, 0.068), (18, -0.009), (19, 0.071), (20, -0.008), (21, -0.071), (22, 0.036), (23, -0.164), (24, -0.064), (25, 0.025), (26, -0.031), (27, -0.036), (28, 0.001), (29, 0.026), (30, -0.138), (31, -0.048), (32, -0.01), (33, 0.01), (34, -0.003), (35, -0.016), (36, -0.07), (37, -0.04), (38, 0.112), (39, -0.06), (40, -0.048), (41, 0.056), (42, -0.002), (43, 0.071), (44, 0.025), (45, -0.01), (46, -0.061), (47, -0.057), (48, -0.023), (49, -0.021)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9515698 168 nips-2012-Kernel Latent SVM for Visual Recognition

Author: Weilong Yang, Yang Wang, Arash Vahdat, Greg Mori

Abstract: Latent SVMs (LSVMs) are a class of powerful tools that have been successfully applied to many applications in computer vision. However, a limitation of LSVMs is that they rely on linear models. For many computer vision tasks, linear models are suboptimal and nonlinear models learned with kernels typically perform much better. Therefore it is desirable to develop the kernel version of LSVM. In this paper, we propose kernel latent SVM (KLSVM) – a new learning framework that combines latent SVMs and kernel methods. We develop an iterative training algorithm to learn the model parameters. We demonstrate the effectiveness of KLSVM using three different applications in visual recognition. Our KLSVM formulation is very general and can be applied to solve a wide range of applications in computer vision and machine learning. 1

2 0.72882962 360 nips-2012-Visual Recognition using Embedded Feature Selection for Curvature Self-Similarity

Author: Angela Eigenstetter, Bjorn Ommer

Abstract: Category-level object detection has a crucial need for informative object representations. This demand has led to feature descriptors of ever increasing dimensionality like co-occurrence statistics and self-similarity. In this paper we propose a new object representation based on curvature self-similarity that goes beyond the currently popular approximation of objects using straight lines. However, like all descriptors using second order statistics, ours also exhibits a high dimensionality. Although improving discriminability, the high dimensionality becomes a critical issue due to lack of generalization ability and curse of dimensionality. Given only a limited amount of training data, even sophisticated learning algorithms such as the popular kernel methods are not able to suppress noisy or superfluous dimensions of such high-dimensional data. Consequently, there is a natural need for feature selection when using present-day informative features and, particularly, curvature self-similarity. We therefore suggest an embedded feature selection method for SVMs that reduces complexity and improves generalization capability of object models. By successfully integrating the proposed curvature self-similarity representation together with the embedded feature selection in a widely used state-of-the-art object detection framework we show the general pertinence of the approach. 1

3 0.63687754 188 nips-2012-Learning from Distributions via Support Measure Machines

Author: Krikamol Muandet, Kenji Fukumizu, Francesco Dinuzzo, Bernhard Schölkopf

Abstract: This paper presents a kernel-based discriminative learning framework on probability measures. Rather than relying on large collections of vectorial training examples, our framework learns using a collection of probability distributions that have been constructed to meaningfully represent training data. By representing these probability distributions as mean embeddings in the reproducing kernel Hilbert space (RKHS), we are able to apply many standard kernel-based learning techniques in straightforward fashion. To accomplish this, we construct a generalization of the support vector machine (SVM) called a support measure machine (SMM). Our analyses of SMMs provides several insights into their relationship to traditional SVMs. Based on such insights, we propose a flexible SVM (FlexSVM) that places different kernel functions on each training example. Experimental results on both synthetic and real-world data demonstrate the effectiveness of our proposed framework. 1

4 0.63125587 357 nips-2012-Unsupervised Template Learning for Fine-Grained Object Recognition

Author: Shulin Yang, Liefeng Bo, Jue Wang, Linda G. Shapiro

Abstract: Fine-grained recognition refers to a subordinate level of recognition, such as recognizing different species of animals and plants. It differs from recognition of basic categories, such as humans, tables, and computers, in that there are global similarities in shape and structure shared cross different categories, and the differences are in the details of object parts. We suggest that the key to identifying the fine-grained differences lies in finding the right alignment of image regions that contain the same object parts. We propose a template model for the purpose, which captures common shape patterns of object parts, as well as the cooccurrence relation of the shape patterns. Once the image regions are aligned, extracted features are used for classification. Learning of the template model is efficient, and the recognition results we achieve significantly outperform the stateof-the-art algorithms. 1

5 0.61240703 306 nips-2012-Semantic Kernel Forests from Multiple Taxonomies

Author: Sung J. Hwang, Kristen Grauman, Fei Sha

Abstract: When learning features for complex visual recognition problems, labeled image exemplars alone can be insufficient. While an object taxonomy specifying the categories’ semantic relationships could bolster the learning process, not all relationships are relevant to a given visual classification task, nor does a single taxonomy capture all ties that are relevant. In light of these issues, we propose a discriminative feature learning approach that leverages multiple hierarchical taxonomies representing different semantic views of the object categories (e.g., for animal classes, one taxonomy could reflect their phylogenic ties, while another could reflect their habitats). For each taxonomy, we first learn a tree of semantic kernels, where each node has a Mahalanobis kernel optimized to distinguish between the classes in its children nodes. Then, using the resulting semantic kernel forest, we learn class-specific kernel combinations to select only those relationships relevant to recognize each object class. To learn the weights, we introduce a novel hierarchical regularization term that further exploits the taxonomies’ structure. We demonstrate our method on challenging object recognition datasets, and show that interleaving multiple taxonomic views yields significant accuracy improvements.

6 0.61193436 106 nips-2012-Dynamical And-Or Graph Learning for Object Shape Modeling and Detection

7 0.61022758 48 nips-2012-Augmented-SVM: Automatic space partitioning for combining multiple non-linear dynamics

8 0.5649249 197 nips-2012-Learning with Recursive Perceptual Representations

9 0.55570185 284 nips-2012-Q-MKL: Matrix-induced Regularization in Multi-Kernel Learning with Applications to Neuroimaging

10 0.54762304 289 nips-2012-Recognizing Activities by Attribute Dynamics

11 0.54534334 176 nips-2012-Learning Image Descriptors with the Boosting-Trick

12 0.54247499 201 nips-2012-Localizing 3D cuboids in single-view images

13 0.53471041 40 nips-2012-Analyzing 3D Objects in Cluttered Images

14 0.5323177 101 nips-2012-Discriminatively Trained Sparse Code Gradients for Contour Detection

15 0.53157908 167 nips-2012-Kernel Hyperalignment

16 0.52088624 1 nips-2012-3D Object Detection and Viewpoint Estimation with a Deformable 3D Cuboid Model

17 0.50855869 344 nips-2012-Timely Object Recognition

18 0.50490898 146 nips-2012-Graphical Gaussian Vector for Image Categorization

19 0.47297207 305 nips-2012-Selective Labeling via Error Bound Minimization

20 0.47002652 303 nips-2012-Searching for objects driven by context


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.037), (8, 0.2), (17, 0.011), (21, 0.047), (38, 0.097), (39, 0.014), (42, 0.047), (54, 0.011), (55, 0.028), (74, 0.137), (76, 0.132), (80, 0.117), (92, 0.037)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.82687867 168 nips-2012-Kernel Latent SVM for Visual Recognition

Author: Weilong Yang, Yang Wang, Arash Vahdat, Greg Mori

Abstract: Latent SVMs (LSVMs) are a class of powerful tools that have been successfully applied to many applications in computer vision. However, a limitation of LSVMs is that they rely on linear models. For many computer vision tasks, linear models are suboptimal and nonlinear models learned with kernels typically perform much better. Therefore it is desirable to develop the kernel version of LSVM. In this paper, we propose kernel latent SVM (KLSVM) – a new learning framework that combines latent SVMs and kernel methods. We develop an iterative training algorithm to learn the model parameters. We demonstrate the effectiveness of KLSVM using three different applications in visual recognition. Our KLSVM formulation is very general and can be applied to solve a wide range of applications in computer vision and machine learning. 1

2 0.76822168 274 nips-2012-Priors for Diversity in Generative Latent Variable Models

Author: James T. Kwok, Ryan P. Adams

Abstract: Probabilistic latent variable models are one of the cornerstones of machine learning. They offer a convenient and coherent way to specify prior distributions over unobserved structure in data, so that these unknown properties can be inferred via posterior inference. Such models are useful for exploratory analysis and visualization, for building density models of data, and for providing features that can be used for later discriminative tasks. A significant limitation of these models, however, is that draws from the prior are often highly redundant due to i.i.d. assumptions on internal parameters. For example, there is no preference in the prior of a mixture model to make components non-overlapping, or in topic model to ensure that co-occurring words only appear in a small number of topics. In this work, we revisit these independence assumptions for probabilistic latent variable models, replacing the underlying i.i.d. prior with a determinantal point process (DPP). The DPP allows us to specify a preference for diversity in our latent variables using a positive definite kernel function. Using a kernel between probability distributions, we are able to define a DPP on probability measures. We show how to perform MAP inference with DPP priors in latent Dirichlet allocation and in mixture models, leading to better intuition for the latent variable representation and quantitatively improved unsupervised feature extraction, without compromising the generative aspects of the model. 1

3 0.75484031 3 nips-2012-A Bayesian Approach for Policy Learning from Trajectory Preference Queries

Author: Aaron Wilson, Alan Fern, Prasad Tadepalli

Abstract: We consider the problem of learning control policies via trajectory preference queries to an expert. In particular, the agent presents an expert with short runs of a pair of policies originating from the same state and the expert indicates which trajectory is preferred. The agent’s goal is to elicit a latent target policy from the expert with as few queries as possible. To tackle this problem we propose a novel Bayesian model of the querying process and introduce two methods that exploit this model to actively select expert queries. Experimental results on four benchmark problems indicate that our model can effectively learn policies from trajectory preference queries and that active query selection can be substantially more efficient than random selection. 1

4 0.74858987 339 nips-2012-The Time-Marginalized Coalescent Prior for Hierarchical Clustering

Author: Levi Boyles, Max Welling

Abstract: We introduce a new prior for use in Nonparametric Bayesian Hierarchical Clustering. The prior is constructed by marginalizing out the time information of Kingman’s coalescent, providing a prior over tree structures which we call the Time-Marginalized Coalescent (TMC). This allows for models which factorize the tree structure and times, providing two benefits: more flexible priors may be constructed and more efficient Gibbs type inference can be used. We demonstrate this on an example model for density estimation and show the TMC achieves competitive experimental results. 1

5 0.74574924 210 nips-2012-Memorability of Image Regions

Author: Aditya Khosla, Jianxiong Xiao, Antonio Torralba, Aude Oliva

Abstract: While long term human visual memory can store a remarkable amount of visual information, it tends to degrade over time. Recent works have shown that image memorability is an intrinsic property of an image that can be reliably estimated using state-of-the-art image features and machine learning algorithms. However, the class of features and image information that is forgotten has not been explored yet. In this work, we propose a probabilistic framework that models how and which local regions from an image may be forgotten using a data-driven approach that combines local and global images features. The model automatically discovers memorability maps of individual images without any human annotation. We incorporate multiple image region attributes in our algorithm, leading to improved memorability prediction of images as compared to previous works. 1

6 0.74283439 8 nips-2012-A Generative Model for Parts-based Object Segmentation

7 0.74282342 176 nips-2012-Learning Image Descriptors with the Boosting-Trick

8 0.7378574 201 nips-2012-Localizing 3D cuboids in single-view images

9 0.73319238 303 nips-2012-Searching for objects driven by context

10 0.73202211 357 nips-2012-Unsupervised Template Learning for Fine-Grained Object Recognition

11 0.72951257 193 nips-2012-Learning to Align from Scratch

12 0.72241449 337 nips-2012-The Lovász ϑ function, SVMs and finding large dense subgraphs

13 0.72061551 101 nips-2012-Discriminatively Trained Sparse Code Gradients for Contour Detection

14 0.72021145 172 nips-2012-Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs

15 0.72017342 40 nips-2012-Analyzing 3D Objects in Cluttered Images

16 0.72003615 197 nips-2012-Learning with Recursive Perceptual Representations

17 0.71872872 260 nips-2012-Online Sum-Product Computation Over Trees

18 0.71871179 360 nips-2012-Visual Recognition using Embedded Feature Selection for Curvature Self-Similarity

19 0.7178331 92 nips-2012-Deep Representations and Codes for Image Auto-Annotation

20 0.71762305 183 nips-2012-Learning Partially Observable Models Using Temporally Abstract Decision Trees