cvpr cvpr2013 cvpr2013-134 knowledge-graph by maker-knowledge-mining

134 cvpr-2013-Discriminative Sub-categorization


Source: pdf

Author: Minh Hoai, Andrew Zisserman

Abstract: The objective of this work is to learn sub-categories. Rather than casting this as a problem of unsupervised clustering, we investigate a weakly supervised approach using both positive and negative samples of the category. We make the following contributions: (i) we introduce a new model for discriminative sub-categorization which determines cluster membership for positive samples whilst simultaneously learning a max-margin classifier to separate each cluster from the negative samples; (ii) we show that this model does not suffer from the degenerate cluster problem that afflicts several competing methods (e.g., Latent SVM and Max-Margin Clustering); (iii) we show that the method is able to discover interpretable sub-categories in various datasets. The model is evaluated experimentally over various datasets, and its performance advantages over k-means and Latent SVM are demonstrated. We also stress test the model and show its resilience in discovering sub-categories as the parameters are varied.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Rather than casting this as a problem of unsupervised clustering, we investigate a weakly supervised approach using both positive and negative samples of the category. [sent-5, score-0.215]

2 (e.g., frontal or profile faces [22, 23]), but sub-categories can be determined automatically using unsupervised clustering [2, 6, 11, 14, 15, 17, 21, 29, 30]. [sent-15, score-0.138]

3 (e.g., Max-Margin Clustering (MMC) [29], DIFFRAC [2], Universum clustering [32]). [sent-22, score-0.185]

4 One particular problem of such methods is cluster degeneration, where clusters have few or no elements [13, 29]. [sent-23, score-0.472]

5 (d): our method maximizes the separation between clusters and negative data; it partitions positive examples (red plus signs) into clusters so that each cluster can be well separated from the negative examples (blue minus signs). [sent-69, score-1.088]

6 In essence, a sub-category is required to contain similar items and also be well separated from the negative examples. [sent-73, score-0.201]

7 Given a set of positive and negative examples of a category, the model simultaneously determines the cluster label of each positive example, whilst learning an SVM for each cluster, discriminating it from the negative examples, as illustrated in Fig. [sent-74, score-0.782]

8 The requirement for negative examples is usually not a problem since they are readily available. [sent-76, score-0.166]

9 Experiments on datasets of varying complexity, from digits and letters to images, show that the model often discovers highly interpretable sub-categories. [sent-82, score-0.137]

10 We use negative data to sub-categorize a single class (discovering sub-categories), while Universum [27] uses “non-examples” to learn a classifier that separates predefined classes. [sent-87, score-0.175]

11 Subcategory discovery. We pose sub-categorization as a joint clustering and classification problem. [sent-89, score-0.189]

12 Joint clustering and classification. For a particular category of interest, consider the task of discovering its sub-categories given a set of positive training examples $(\mathbf{x}_1^+, \cdots, \mathbf{x}_n^+ \in \mathbb{R}^d)$ [sent-93, score-0.281]

13 and a set of negative training examples $(\mathbf{x}_1^-, \cdots, \mathbf{x}_m^- \in \mathbb{R}^d)$. [sent-94, score-0.207]

14 We propose to find the sub-categories by grouping positive training examples into several clusters such that each cluster is well separated from the negative training examples. [sent-96, score-0.836]

15 Let $y_i \in \{1, \cdots, k\}$ be the (latent) cluster label associated with the positive training example $\mathbf{x}_i^+$; the separation between cluster $j$ and the negative examples can be measured using the SVM objective: $\min_{\mathbf{w}_j, b_j} \frac{1}{2}\|\mathbf{w}_j\|^2$ (1) s.t. $\mathbf{w}_j^T \mathbf{x}_i^+ + b_j \ge 1$ for all $i$ with $y_i = j$, and $\mathbf{w}_j^T \mathbf{x}_i^- + b_j \le -1$ for all $i$. [sent-97, score-0.987]

16 The above only involves the positive examples that belong to cluster $j$. [sent-100, score-0.446]

17 To measure the total separability between sub-categories and the negative examples, we use the weighted sum of the above SVM objectives: $\sum_{j=1}^{k} \frac{n_j}{n} \left( \frac{1}{2}\|\mathbf{w}_j\|^2 \right)$, where $n_j$ is the cardinality of cluster $j$. [sent-101, score-0.522]

18 We seek the cluster labels for positive examples and simultaneously train the SVMs that separate the resulting clusters from the negative examples: $\min_{\{\mathbf{w}_j, b_j, y_i, \xi_i\}} \frac{1}{2n} \sum_{j=1}^{k} n_j \|\mathbf{w}_j\|^2 + C \left( \sum_i \xi_i^+ + \sum_i \xi_i^- \right)$ [sent-105, score-0.765]

19 s.t. $\mathbf{w}_{y_i}^T \mathbf{x}_i^+ + b_{y_i} \ge 1 - \xi_i^+ \ \forall i$; $\mathbf{w}_j^T \mathbf{x}_i^- + b_j \le -1 + \xi_i^- \ \forall i, \forall j$; $\xi_i^+ \ge 0$, $\xi_i^- \ge 0$. [sent-114, score-0.199]
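
To make the formulation concrete, here is a minimal numpy sketch of evaluating this objective for a given clustering and set of SVM parameters. The function name, the slack handling, and the exact scaling constants are our assumptions, since the extracted text does not fully specify them:

```python
import numpy as np

def subcat_objective(W, b, y, Xpos, Xneg, C):
    """One reading of the joint objective: cardinality-weighted margin
    terms plus hinge losses for positives (against their own cluster's
    SVM) and negatives (against every cluster's SVM)."""
    n, k = len(Xpos), len(W)
    nj = np.bincount(y, minlength=k)                 # cluster cardinalities
    reg = np.sum((nj / n) * 0.5 * np.sum(W**2, axis=1))
    pos_scores = np.sum(W[y] * Xpos, axis=1) + b[y]  # score under own cluster
    pos_hinge = np.maximum(0.0, 1.0 - pos_scores).sum()
    neg_scores = Xneg @ W.T + b                      # (m, k): all clusters
    neg_hinge = np.maximum(0.0, 1.0 + neg_scores).sum()
    return reg + C * (pos_hinge + neg_hinge)
```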

20 Notably, in the above objective, each vector $\mathbf{w}_j$ is weighted by the cardinality of cluster $j$. [sent-116, score-0.568]

21 This is different from the objective of Multi-class SVMs [4] or Latent SVMs [1, 9, 31] in which each vector $\mathbf{w}_j$ is weighted equally. [sent-117, score-0.243]

22 Thus the cluster label $y_i$ of $\mathbf{x}_i^+$ is: $y_i = \arg\min_j \left\{ \frac{1}{2n} \|\mathbf{w}_j\|^2 + C \max\left(0,\ 1 - \mathbf{w}_j^T \mathbf{x}_i^+ - b_j\right) \right\}$ (4). [sent-119, score-0.381]

23 This is different from the class/cluster assignment in Multi-class SVMs and Latent SVMs, where $y_i$ is given by $y_i = \arg\max_j \{\mathbf{w}_j^T \mathbf{x}_i^+ + b_j\}$. [sent-122, score-0.136]
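
The two assignment rules are easy to contrast in code. A sketch, assuming the reconstruction of Eq. (4) above (its exact constants are our reading of the garbled source, guided by the paper's later remark that assignment combines hinge loss and SVM margin):

```python
import numpy as np

def assign_ours(W, b, Xpos, C, n):
    """Eq. (4) as reconstructed above: minimize margin cost plus hinge loss."""
    margin_cost = 0.5 * np.sum(W**2, axis=1) / n         # (k,)
    hinge = np.maximum(0.0, 1.0 - (Xpos @ W.T + b))      # (n_pos, k)
    return np.argmin(margin_cost[None, :] + C * hinge, axis=1)

def assign_latent_svm(W, b, Xpos):
    """Latent-SVM-style assignment: argmax of the raw classifier score."""
    return np.argmax(Xpos @ W.T + b, axis=1)
```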

24 Recall that k-means seeks a set of centroids $\{\mathbf{w}_1, \cdots, \mathbf{w}_k\}$ and cluster labels $\{y_1, \cdots, y_n\}$ to minimize $\frac{1}{2} \sum_{i=1}^{n} \|\mathbf{x}_i^+ - \mathbf{w}_{y_i}\|^2$. [sent-124, score-0.357]

25 The cluster label of a point $\mathbf{x}_i^+$ is given by $y_i = \arg\min_j \frac{1}{2}\|\mathbf{x}_i^+ - \mathbf{w}_j\|^2$, or equivalently, $y_i = \arg\min_j \left\{ \frac{1}{2}\|\mathbf{w}_j\|^2 - \mathbf{w}_j^T \mathbf{x}_i^+ \right\}$. [sent-126, score-0.69]
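
The equivalence of the two k-means assignment rules follows from expanding the squared distance; the term that drops out is independent of $j$:

```latex
\arg\min_j \tfrac{1}{2}\|\mathbf{x}_i^+ - \mathbf{w}_j\|^2
= \arg\min_j \left\{ \tfrac{1}{2}\|\mathbf{x}_i^+\|^2
  - \mathbf{w}_j^T \mathbf{x}_i^+ + \tfrac{1}{2}\|\mathbf{w}_j\|^2 \right\}
= \arg\min_j \left\{ \tfrac{1}{2}\|\mathbf{w}_j\|^2 - \mathbf{w}_j^T \mathbf{x}_i^+ \right\},
```

since $\frac{1}{2}\|\mathbf{x}_i^+\|^2$ is the same for every cluster $j$.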

26 Cluster degeneration. As noted in the introduction, several alternative formulations used for sub-category discovery suffer from the problem of cluster degeneration, i.e., [sent-131, score-0.334]

27 the situation where a few clusters dominate and claim all the points, leading to many empty clusters. [sent-133, score-0.219]

28 Cluster degeneration has been pointed out to be an inherent problem of discriminative clustering [13, 29]. [sent-135, score-0.357]

29 s.t. $\mathbf{w}_{y_i}^T \mathbf{x}_i^+ + b_{y_i} \ge 1 - \xi_i^+ \ \forall i$; $\mathbf{w}_j^T \mathbf{x}_i^- + b_j \le -1 + \xi_i^- \ \forall i, \forall j$; $\xi_i^+ \ge 0$, $\xi_i^- \ge 0$. [sent-142, score-0.199]

30 This formulation is a particular realization of Latent SVM [9], Multiple-Instance SVM [1], and Latent Structural SVM [31], in which the latent variables are the cluster labels. [sent-143, score-0.437]

31 Our formulation has a natural mechanism for eliminating empty clusters without increasing the cost. [sent-151, score-0.343]

32 First we show that if a cluster is empty, then it can be regenerated at no additional cost; then we show that the cost can be decreased. [sent-152, score-0.33]

33 Consider the following steps: (i) pick a non-empty cluster $l$ and split it into two arbitrary halves of size $n_l/2$; (ii) reassign one half to cluster $j$ and copy the weight vector and bias term of cluster $l$ to cluster $j$, i.e., $\mathbf{w}_j := \mathbf{w}_l$, $b_j := b_l$. [sent-154, score-1.398]

34 Moreover, if the cluster $l$ contains some points that are not support vectors (i.e., [sent-159, score-0.362]

35 points that lie beyond the right side of the margin, corresponding to non-tight constraints) and these are reassigned to cluster $j$, then the margin corresponding to cluster $j$ can be increased in subsequent optimization iterations. [sent-161, score-0.772]

36 In short, there exists a mechanism for eliminating empty clusters; this mechanism never increases the cost and it decreases the cost with high probability (unless every point is a support vector). [sent-163, score-0.213]
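
A sketch of this regeneration step, with hypothetical names; the choice of donor cluster and the random split are our assumptions, as the argument above only requires some non-empty donor:

```python
import numpy as np

def regenerate_empty_cluster(y, j, rng=None):
    """Repair an empty cluster j: split a non-empty donor cluster l in half
    and hand one half to j. The caller then copies (w_l, b_l) to (w_j, b_j),
    which leaves the cardinality-weighted objective unchanged."""
    if rng is None:
        rng = np.random.default_rng()
    counts = np.bincount(y)
    l = int(np.argmax(counts))                  # pick the largest donor cluster
    members = np.flatnonzero(y == l)
    half = rng.permutation(members)[: len(members) // 2]
    y = y.copy()
    y[half] = j                                 # reassign one half to cluster j
    return y, l                                 # caller sets W[j], b[j] := W[l], b[l]
```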

37 First, the mechanism described above for regenerating an empty cluster at no cost does not apply. [sent-169, score-0.456]

38 Eliminating an empty cluster $j$ by reassigning some points from a non-empty cluster $l$ and duplicating the weight vector ($\mathbf{w}_j := \mathbf{w}_l$) will increase the objective function, because $\frac{1}{2k}\|\mathbf{w}_l\|^2 + \frac{1}{2k}\|\mathbf{w}_j\|^2 < \frac{1}{2k}\|\mathbf{w}_l\|^2 + \frac{1}{2k}\|\mathbf{w}_l\|^2$ (recall that cluster $j$ is empty and $\|\mathbf{w}_j\| = 0$). [sent-170, score-1.216]

39 Each of these SVMs has the same number of negative constraints, but the number of positive constraints depends on the cluster size. [sent-173, score-0.518]

40 In general, the more positive constraints an SVM has, the smaller the margin will be (assuming C is fixed; C is the parameter controlling the tradeoff between a larger margin and less constraint violation). [sent-174, score-0.227]

41 Thus, if cluster $u$ is much larger than cluster $v$, the magnitude of the weight vector $\mathbf{w}_u$ will be much larger than that of $\mathbf{w}_v$, i.e., $\|\mathbf{w}_u\| \gg \|\mathbf{w}_v\|$. [sent-176, score-0.703]

42 Now since the clustering assignment of a data point is based on the dot product between the point and the weight vectors, cluster u will have an advantage over cluster v. [sent-180, score-0.848]

43 It is likely that some points from cluster v will be reassigned to cluster u. [sent-181, score-0.709]

44 Cluster u will grow larger while cluster v becomes smaller, increasing the size gap between them. [sent-182, score-0.33]

45 Interestingly, cluster degeneration has been empirically observed for other types of classifiers. [sent-183, score-0.551]

46 Cluster degeneration is also an inherent problem of MMC [29]. [sent-185, score-0.221]

47 MMC requires every pair of clusters to be well separated by a margin. [sent-186, score-0.189]

48 Thus every pair of clusters leads to a constraint on the maximum size of the margin. [sent-187, score-0.142]

49 In the extreme, MMC can create a single cluster [13, 29]. [sent-189, score-0.33]

50 Block coordinate descent alternates between the following two procedures: (A) fix the cluster labels $\{y_i\}$ and optimize the SVM parameters $\{\mathbf{w}_j, b_j\}$ and $\{\xi_i^+, \xi_i^-\}$; (B) fix the SVM parameters $\{\mathbf{w}_j, b_j\}$ and optimize the cluster labels $\{y_i\}$. [sent-194, score-0.885]

51 Procedure (A) corresponds to a convex quadratic program, and can be optimized using stochastic gradient descent [3], where the weight vectors and the bias terms are updated based on a single training example at each iteration. [sent-195, score-0.153]

52 Procedure (B) requires updating the cluster assignment for each positive training example. [sent-197, score-0.474]

53 This is equivalent to finding the cluster label with minimum assignment cost, given in Eq. (4). [sent-198, score-0.364]
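
Putting the two procedures together, a minimal sketch of the alternation. We substitute scikit-learn's LinearSVC for the paper's SGD solver, and the per-cluster fitting ignores the cardinality weighting for brevity, so this is an approximation rather than the exact algorithm:

```python
import numpy as np
from sklearn.svm import LinearSVC   # stand-in for the SGD solver of [3]

def block_coordinate_descent(Xpos, Xneg, y, k, C, n_iters=20):
    """Alternate (A) refit one SVM per cluster against all negatives and
    (B) reassign positives by the minimum assignment cost of Eq. (4)."""
    n, d = Xpos.shape
    W, b = np.zeros((k, d)), np.zeros(k)
    for _ in range(n_iters):
        # (A) fix labels {y_i}; fit each cluster's SVM vs. the negatives.
        for j in range(k):
            members = Xpos[y == j]
            if len(members) == 0:
                continue                        # handled by the regeneration step
            X = np.vstack([members, Xneg])
            t = np.r_[np.ones(len(members)), -np.ones(len(Xneg))]
            clf = LinearSVC(C=C).fit(X, t)
            W[j], b[j] = clf.coef_[0], clf.intercept_[0]
        # (B) fix SVM parameters; reassign labels (mirrors assign_ours above).
        margin_cost = 0.5 * np.sum(W**2, axis=1) / n
        hinge = np.maximum(0.0, 1.0 - (Xpos @ W.T + b))
        y_new = np.argmin(margin_cost[None, :] + C * hinge, axis=1)
        if np.array_equal(y_new, y):
            break                               # labels stable: converged
        y = y_new
    return W, b, y
```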

54 We propose to initialize the algorithm as follows: (i) Train a linear SVM to separate the positive and negative classes, and obtain the weight vector w. [sent-202, score-0.262]

55 (ii) Project the positive examples onto the weight vector $\mathbf{w}$ and compute the residual vectors $\bar{\mathbf{x}}_i^+ := \mathbf{x}_i^+ - \frac{1}{\|\mathbf{w}\|^2} (\mathbf{w}^T \mathbf{x}_i^+) \mathbf{w}$. [sent-203, score-0.228]

56 (iii) Perform k-means on the residual vectors $\{\bar{\mathbf{x}}_i^+\}$ to get the initial cluster labels. [sent-205, score-0.399]
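
A sketch of this three-step initialization; the function names are ours, and we use scikit-learn for the SVM and k-means:

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.cluster import KMeans

def initialize_labels(Xpos, Xneg, k, C=1.0):
    """(i) train one linear SVM separating positives from negatives;
    (ii) project out the w direction from each positive, keeping the residual;
    (iii) run k-means on the residuals to get initial cluster labels."""
    X = np.vstack([Xpos, Xneg])
    t = np.r_[np.ones(len(Xpos)), -np.ones(len(Xneg))]
    w = LinearSVC(C=C).fit(X, t).coef_[0]
    resid = Xpos - np.outer(Xpos @ w, w) / (w @ w)   # orthogonal residuals
    return KMeans(n_clusters=k, n_init=10).fit_predict(resid)
```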

57 Both quantitative and qualitative evaluations are provided, using the purity measure [20, 26] and visual interpretability. [sent-208, score-0.179]

58 Clustering performance. We validated the clustering performance of our method on several publicly available datasets from the UCI repository and the MNIST dataset [18]. [sent-211, score-0.156]

59 The training subsets (one positive and one negative) are used to learn the cluster models as in Eq. [sent-227, score-0.44]

60 We set the number of clusters to the true number of classes of the positive group. [sent-230, score-0.24]

61 To measure clustering performance, we followed the strategy used by [13, 29], where we first took a set of labeled data, removed the labels, and ran the clustering algorithms. [sent-239, score-0.281]

62 We then found the best one-to-one association between the resulting clusters and the ground truth classes (Hungarian algorithm). [sent-240, score-0.171]

63 This is referred to as purity in information-theoretic measures [20, 26]. [sent-242, score-0.179]

64 Notably, the purity measure requires no separate test set. [sent-243, score-0.21]
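
A sketch of this purity computation, using the Hungarian algorithm from scipy; the function and variable names are ours:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def purity(y_pred, y_true, k):
    """Best one-to-one matching between clusters and ground-truth classes,
    then the fraction of points falling in their matched class."""
    conf = np.zeros((k, k), dtype=int)         # rows: clusters, cols: classes
    for c, t in zip(y_pred, y_true):
        conf[c, t] += 1
    rows, cols = linear_sum_assignment(-conf)  # negate to maximize matches
    return conf[rows, cols].sum() / len(y_true)
```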

65 It is very likely that any subset of the positive class can be linearly separated from the negative class. [sent-252, score-0.235]

66 Discovering Head Orientations. This section describes experiments on discovering head orientations, i.e., the direction in which a person is looking. [sent-258, score-0.174]

67 Clustering purity measures (%) of k-means, LSVM, and our method on UCI datasets and MNIST. [sent-262, score-0.224]

68 framing the upper bodies of the people present and their discrete head orientations. [sent-271, score-0.134]

69 The label set for head orientation is Profile-Left, Frontal-Left, Frontal-Right, Profile-Right, and Backward. [sent-272, score-0.165]

70 Because the head areas of the same person in consecutive frames are often similar, it is unnecessary to consider all frames, so we subsample them to obtain positive examples for this experiment. [sent-277, score-0.25]

71 Data for training and validation are sampled from separate video subsets, based on the train/test split specified by the authors of the TVHI dataset [22]. [sent-278, score-0.137]

72 This process yields 4040 and 4760 positive examples for training and validation, respectively. [sent-279, score-0.157]

73 The negative examples are obtained from the negative images of the INRIA Person dataset by applying the upper-body detector [8] on each image and retaining the top five detections. [sent-280, score-0.319]

74 The numbers of negative examples for training and validation are 4872 and 4530 respectively. [sent-282, score-0.303]

75 Finally, all feature vectors are normalized to have L2 norms of approximately 1 (dividing them by the median of the L2 norms of the positive training examples). [sent-283, score-0.2]
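
This normalization is a one-liner in numpy; a sketch, assuming the scaling uses only the positive training set as stated:

```python
import numpy as np

def normalize_features(Xpos, Xneg):
    """Divide all features by the median L2 norm of the positive training
    examples, so norms are approximately 1."""
    s = np.median(np.linalg.norm(Xpos, axis=1))
    return Xpos / s, Xneg / s
```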

76 The training data is used to learn the cluster models as in Eq. [sent-284, score-0.371]

77 The performance measure is cluster purity [20, 26], which requires no separate test set. [sent-286, score-0.361]

78 To test the ability to discover sub-categories, we set the number of clusters to five, the predefined number of head orientations. [sent-287, score-0.301]

79 1, we used the purity measure to benchmark the clustering performance. [sent-295, score-0.29]

80 05%, which is the state-of-the-art accuracy of five-way head classification using linear SVMs with HOG descriptors [22] (this is a comparison between the purity measure of an unsupervised method and the classification accuracy of a supervised method). [sent-306, score-0.424]

81 We applied the same multi-stage optimization procedure to LSVM, but the performance degraded due to cluster degeneration at an early stage. [sent-307, score-0.551]

82 (a): classification accuracy on validation data; (b): imbalance index—the standard deviation of cluster sizes; (c): purity measure—the agreement between clusters and ground truth classes. [sent-312, score-0.807]

83 Due to the problem of cluster degeneration, the clusters produced by LSVM can be highly unbalanced, even for the values of C that yield relatively high classification accuracy. [sent-313, score-0.542]

84 Our method does not suffer from this problem, yielding a low imbalance index and a high purity measure for all values of C. [sent-314, score-0.28]

85 For high values of C, LSVM does not suffer from the cluster degeneration problem, but the clustering performance is similar to the performance of the initialization. [sent-315, score-0.714]

86 We also study the classification accuracy (on validation data) and the clustering purity as the amount of negative data varies. [sent-319, score-0.516]

87 Theoretically, C controls the tradeoff between a large margin and low training loss. [sent-324, score-0.136]

88 But for small values of C, the clusters of LSVM degenerate (as shown in Fig. [sent-326, score-0.19]

89 In contrast, our method achieves good results with as few as 300 negative training examples. [sent-329, score-0.16]

90 The learned weight vectors and the highest-ranked images somewhat correspond to the five discrete ground truth head orientations. [sent-334, score-0.243]

91 Classification accuracy and clustering purity as a function of m, the number of negative training examples. [sent-345, score-0.45]

92 Though LSVM performs relatively well on the classification task, its clustering performance is much worse than ours. [sent-346, score-0.153]

93 Our method obtains excellent clustering results, with as few as 300 negative examples. [sent-347, score-0.23]

94 The accuracy of five linear SVMs trained with ground truth head orientations is 94. [sent-349, score-0.199]

95 Low-ranked images are due to: (i) the regression procedure failing to localize the head region; (ii) the subject exhibiting a rare head pose; (iii) the head being occluded; or (iv) the image patch having low resolution, low contrast, or motion blur. [sent-390, score-0.402]

96 The learned clusters when the desired number of sub-categories is three. [sent-392, score-0.255]

97 All clusters produced by our method are meaningful while the last cluster of LSVM is uninterpretable. [sent-394, score-0.5]

98 All clusters produced by our method are meaningful while two clusters of LSVM are uninterpretable. [sent-398, score-0.312]

99 On this dataset (8912 training and 9290 testing examples of 1984 dimensions), our method (naive implementation without any speed optimization) took 260s for training and 0. [sent-402, score-0.161]

100 Furthermore, we show that assigning examples to clusters by a combination of hinge loss and SVM margin avoids the degenerate configurations suffered by several popular methods that assign according to classifier score alone. [sent-410, score-0.253]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('lsvm', 0.566), ('cluster', 0.33), ('wjtxi', 0.245), ('degeneration', 0.221), ('wj', 0.214), ('purity', 0.179), ('svms', 0.165), ('clusters', 0.142), ('head', 0.134), ('bj', 0.134), ('mmc', 0.123), ('negative', 0.119), ('clustering', 0.111), ('wyi', 0.081), ('empty', 0.077), ('uci', 0.075), ('universum', 0.074), ('wytixi', 0.074), ('svm', 0.071), ('latent', 0.07), ('positive', 0.069), ('byi', 0.065), ('validation', 0.065), ('wl', 0.064), ('margin', 0.063), ('digits', 0.058), ('suffer', 0.052), ('yi', 0.051), ('diffrac', 0.049), ('earned', 0.049), ('landsat', 0.049), ('ofnegative', 0.049), ('reassigned', 0.049), ('imbalance', 0.049), ('subcategories', 0.049), ('mechanism', 0.049), ('degenerate', 0.048), ('examples', 0.047), ('bottou', 0.047), ('violation', 0.047), ('separated', 0.047), ('notably', 0.046), ('mnist', 0.046), ('datasets', 0.045), ('tvhi', 0.044), ('argminj', 0.044), ('semeion', 0.044), ('weight', 0.043), ('classification', 0.042), ('training', 0.041), ('discovering', 0.04), ('plates', 0.04), ('mci', 0.04), ('hoai', 0.04), ('nci', 0.04), ('gas', 0.038), ('steel', 0.038), ('eliminating', 0.038), ('residual', 0.037), ('descent', 0.037), ('formulation', 0.037), ('emphasis', 0.037), ('discovery', 0.036), ('amazon', 0.036), ('items', 0.035), ('halves', 0.035), ('reviews', 0.035), ('assignment', 0.034), ('five', 0.034), ('repository', 0.034), ('interpretable', 0.034), ('wv', 0.034), ('desired', 0.033), ('vectors', 0.032), ('took', 0.032), ('tradeoff', 0.032), ('separate', 0.031), ('numbers', 0.031), ('orientations', 0.031), ('letter', 0.031), ('unbalanced', 0.029), ('classes', 0.029), ('objective', 0.029), ('whilst', 0.029), ('norms', 0.029), ('produced', 0.028), ('handwritten', 0.028), ('labels', 0.027), ('icml', 0.027), ('proceedings', 0.027), ('tv', 0.027), ('unsupervised', 0.027), ('jordan', 0.026), ('inversely', 0.026), ('separation', 0.026), ('noted', 0.025), ('predefined', 0.025), ('discriminative', 0.025), ('xi', 0.025), ('cardinality', 0.024)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0 134 cvpr-2013-Discriminative Sub-categorization

Author: Minh Hoai, Andrew Zisserman

Abstract: The objective of this work is to learn sub-categories. Rather than casting this as a problem of unsupervised clustering, we investigate a weakly supervised approach using both positive and negative samples of the category. We make the following contributions: (i) we introduce a new model for discriminative sub-categorization which determines cluster membership for positive samples whilst simultaneously learning a max-margin classifier to separate each cluster from the negative samples; (ii) we show that this model does not suffer from the degenerate cluster problem that afflicts several competing methods (e.g., Latent SVM and Max-Margin Clustering); (iii) we show that the method is able to discover interpretable sub-categories in various datasets. The model is evaluated experimentally over various datasets, and its performance advantages over k-means and Latent SVM are demonstrated. We also stress test the model and show its resilience in discovering sub-categories as the parameters are varied.

2 0.11614924 260 cvpr-2013-Learning and Calibrating Per-Location Classifiers for Visual Place Recognition

Author: Petr Gronát, Guillaume Obozinski, Josef Sivic, Tomáš Pajdla

Abstract: The aim of this work is to localize a query photograph by finding other images depicting the same place in a large geotagged image database. This is a challenging task due to changes in viewpoint, imaging conditions and the large size of the image database. The contribution of this work is two-fold. First, we cast the place recognition problem as a classification task and use the available geotags to train a classifier for each location in the database in a similar manner to per-exemplar SVMs in object recognition. Second, as onlyfewpositive training examples are availablefor each location, we propose a new approach to calibrate all the per-location SVM classifiers using only the negative examples. The calibration we propose relies on a significance measure essentially equivalent to the p-values classically used in statistical hypothesis testing. Experiments are performed on a database of 25,000 geotagged street view images of Pittsburgh and demonstrate improved place recognition accuracy of the proposed approach over the previous work. 2Center for Machine Perception, Faculty of Electrical Engineering 3WILLOW project, Laboratoire d’Informatique de l’E´cole Normale Sup e´rieure, ENS/INRIA/CNRS UMR 8548. 4Universit Paris-Est, LIGM (UMR CNRS 8049), Center for Visual Computing, Ecole des Ponts - ParisTech, 77455 Marne-la-Valle, France

3 0.10586186 130 cvpr-2013-Discriminative Color Descriptors

Author: Rahat Khan, Joost van_de_Weijer, Fahad Shahbaz Khan, Damien Muselet, Christophe Ducottet, Cecile Barat

Abstract: Color description is a challenging task because of large variations in RGB values which occur due to scene accidental events, such as shadows, shading, specularities, illuminant color changes, and changes in viewing geometry. Traditionally, this challenge has been addressed by capturing the variations in physics-basedmodels, and deriving invariants for the undesired variations. The drawback of this approach is that sets of distinguishable colors in the original color space are mapped to the same value in the photometric invariant space. This results in a drop of discriminative power of the color description. In this paper we take an information theoretic approach to color description. We cluster color values together based on their discriminative power in a classification problem. The clustering has the explicit objective to minimize the drop of mutual information of the final representation. We show that such a color description automatically learns a certain degree of photometric invariance. We also show that a universal color representation, which is based on other data sets than the one at hand, can obtain competing performance. Experiments show that the proposed descriptor outperforms existing photometric invariants. Furthermore, we show that combined with shape description these color descriptors obtain excellent results on four challenging datasets, namely, PASCAL VOC 2007, Flowers-102, Stanford dogs-120 and Birds-200.

4 0.10108635 143 cvpr-2013-Efficient Large-Scale Structured Learning

Author: Steve Branson, Oscar Beijbom, Serge Belongie

Abstract: unknown-abstract

5 0.097484685 8 cvpr-2013-A Fast Approximate AIB Algorithm for Distributional Word Clustering

Author: Lei Wang, Jianjia Zhang, Luping Zhou, Wanqing Li

Abstract: Distributional word clustering merges the words having similar probability distributions to attain reliable parameter estimation, compact classification models and even better classification performance. Agglomerative Information Bottleneck (AIB) is one of the typical word clustering algorithms and has been applied to both traditional text classification and recent image recognition. Although enjoying theoretical elegance, AIB has one main issue on its computational efficiency, especially when clustering a large number of words. Different from existing solutions to this issue, we analyze the characteristics of its objective function the loss of mutual information, and show that by merely using the ratio of word-class joint probabilities of each word, good candidate word pairs for merging can be easily identified. Based on this finding, we propose a fast approximate AIB algorithm and show that it can significantly improve the computational efficiency of AIB while well maintaining or even slightly increasing its classification performance. Experimental study on both text and image classification benchmark data sets shows that our algorithm can achieve more than 100 times speedup on large real data sets over the state-of-the-art method.

6 0.08678405 93 cvpr-2013-Constraints as Features

7 0.086449794 379 cvpr-2013-Scalable Sparse Subspace Clustering

8 0.08434689 82 cvpr-2013-Class Generative Models Based on Feature Regression for Pose Estimation of Object Categories

9 0.082600035 355 cvpr-2013-Representing Videos Using Mid-level Discriminative Patches

10 0.08022029 92 cvpr-2013-Constrained Clustering and Its Application to Face Clustering in Videos

11 0.076529458 248 cvpr-2013-Learning Collections of Part Models for Object Recognition

12 0.076449558 386 cvpr-2013-Self-Paced Learning for Long-Term Tracking

13 0.075793073 173 cvpr-2013-Finding Things: Image Parsing with Regions and Per-Exemplar Detectors

14 0.075623073 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels

15 0.071423851 221 cvpr-2013-Incorporating Structural Alternatives and Sharing into Hierarchy for Multiclass Object Recognition and Detection

16 0.070884801 67 cvpr-2013-Blocks That Shout: Distinctive Parts for Scene Classification

17 0.06993901 36 cvpr-2013-Adding Unlabeled Samples to Categories by Learned Attributes

18 0.068889081 430 cvpr-2013-The SVM-Minus Similarity Score for Video Face Recognition

19 0.067077518 30 cvpr-2013-Accurate Localization of 3D Objects from RGB-D Data Using Segmentation Hypotheses

20 0.065236904 153 cvpr-2013-Expanded Parts Model for Human Attribute and Action Recognition in Still Images


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.171), (1, -0.059), (2, -0.024), (3, -0.009), (4, 0.052), (5, 0.024), (6, -0.006), (7, -0.011), (8, -0.027), (9, -0.034), (10, -0.023), (11, -0.018), (12, -0.018), (13, -0.047), (14, -0.043), (15, -0.014), (16, 0.008), (17, -0.056), (18, -0.001), (19, -0.069), (20, 0.035), (21, -0.007), (22, 0.007), (23, -0.02), (24, -0.005), (25, 0.04), (26, -0.029), (27, 0.008), (28, 0.005), (29, -0.025), (30, 0.072), (31, 0.091), (32, -0.002), (33, -0.006), (34, 0.031), (35, 0.007), (36, -0.038), (37, -0.048), (38, -0.014), (39, -0.062), (40, -0.02), (41, -0.109), (42, -0.085), (43, 0.056), (44, 0.054), (45, 0.008), (46, 0.023), (47, 0.005), (48, 0.081), (49, 0.019)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95014876 134 cvpr-2013-Discriminative Sub-categorization

Author: Minh Hoai, Andrew Zisserman

Abstract: The objective of this work is to learn sub-categories. Rather than casting this as a problem of unsupervised clustering, we investigate a weakly supervised approach using both positive and negative samples of the category. We make the following contributions: (i) we introduce a new model for discriminative sub-categorization which determines cluster membership for positive samples whilst simultaneously learning a max-margin classifier to separate each cluster from the negative samples; (ii) we show that this model does not suffer from the degenerate cluster problem that afflicts several competing methods (e.g., Latent SVM and Max-Margin Clustering); (iii) we show that the method is able to discover interpretable sub-categories in various datasets. The model is evaluated experimentally over various datasets, and itsperformance advantages over k-means and Latent SVM are demonstrated. We also stress test the model and show its resilience in discovering sub-categories as the parameters are varied.

2 0.77335322 320 cvpr-2013-Optimizing 1-Nearest Prototype Classifiers

Author: Paul Wohlhart, Martin Köstinger, Michael Donoser, Peter M. Roth, Horst Bischof

Abstract: The development of complex, powerful classifiers and their constant improvement have contributed much to the progress in many fields of computer vision. However, the trend towards large scale datasets revived the interest in simpler classifiers to reduce runtime. Simple nearest neighbor classifiers have several beneficial properties, such as low complexity and inherent multi-class handling, however, they have a runtime linear in the size of the database. Recent related work represents data samples by assigning them to a set of prototypes that partition the input feature space and afterwards applies linear classifiers on top of this representation to approximate decision boundaries locally linear. In this paper, we go a step beyond these approaches and purely focus on 1-nearest prototype classification, where we propose a novel algorithm for deriving optimal prototypes in a discriminative manner from the training samples. Our method is implicitly multi-class capable, parameter free, avoids noise overfitting and, since during testing only comparisons to the derived prototypes are required, highly efficient. Experiments demonstrate that we are able to outperform related locally linear methods, while even getting close to the results of more complex classifiers.

3 0.75253111 8 cvpr-2013-A Fast Approximate AIB Algorithm for Distributional Word Clustering

Author: Lei Wang, Jianjia Zhang, Luping Zhou, Wanqing Li

Abstract: Distributional word clustering merges the words having similar probability distributions to attain reliable parameter estimation, compact classification models and even better classification performance. Agglomerative Information Bottleneck (AIB) is one of the typical word clustering algorithms and has been applied to both traditional text classification and recent image recognition. Although enjoying theoretical elegance, AIB has one main issue on its computational efficiency, especially when clustering a large number of words. Different from existing solutions to this issue, we analyze the characteristics of its objective function the loss of mutual information, and show that by merely using the ratio of word-class joint probabilities of each word, good candidate word pairs for merging can be easily identified. Based on this finding, we propose a fast approximate AIB algorithm and show that it can significantly improve the computational efficiency of AIB while well maintaining or even slightly increasing its classification performance. Experimental study on both text and image classification benchmark data sets shows that our algorithm can achieve more than 100 times speedup on large real data sets over the state-of-the-art method.

4 0.66191787 143 cvpr-2013-Efficient Large-Scale Structured Learning

Author: Steve Branson, Oscar Beijbom, Serge Belongie

Abstract: unknown-abstract

5 0.63500524 15 cvpr-2013-A Lazy Man's Approach to Benchmarking: Semisupervised Classifier Evaluation and Recalibration

Author: Peter Welinder, Max Welling, Pietro Perona

Abstract: How many labeled examples are needed to estimate a classifier’s performance on a new dataset? We study the case where data is plentiful, but labels are expensive. We show that by making a few reasonable assumptions on the structure of the data, it is possible to estimate performance curves, with confidence bounds, using a small number of ground truth labels. Our approach, which we call Semisupervised Performance Evaluation (SPE), is based on a generative model for the classifier’s confidence scores. In addition to estimating the performance of classifiers on new datasets, SPE can be used to recalibrate a classifier by reestimating the class-conditional confidence distributions.

6 0.62384772 239 cvpr-2013-Kernel Null Space Methods for Novelty Detection

7 0.61543643 201 cvpr-2013-Heterogeneous Visual Features Fusion via Sparse Multimodal Machine

8 0.61461633 403 cvpr-2013-Sparse Output Coding for Large-Scale Visual Recognition

9 0.61394751 417 cvpr-2013-Subcategory-Aware Object Classification

10 0.61247236 221 cvpr-2013-Incorporating Structural Alternatives and Sharing into Hierarchy for Multiclass Object Recognition and Detection

11 0.61211818 377 cvpr-2013-Sample-Specific Late Fusion for Visual Category Recognition

12 0.60960883 168 cvpr-2013-Fast Object Detection with Entropy-Driven Evaluation

13 0.60819221 93 cvpr-2013-Constraints as Features

14 0.60121971 129 cvpr-2013-Discriminative Brain Effective Connectivity Analysis for Alzheimer's Disease: A Kernel Learning Approach upon Sparse Gaussian Bayesian Network

15 0.59501767 379 cvpr-2013-Scalable Sparse Subspace Clustering

16 0.59015459 261 cvpr-2013-Learning by Associating Ambiguously Labeled Images

17 0.586896 157 cvpr-2013-Exploring Implicit Image Statistics for Visual Representativeness Modeling

18 0.57782197 260 cvpr-2013-Learning and Calibrating Per-Location Classifiers for Visual Place Recognition

19 0.57669812 382 cvpr-2013-Scene Text Recognition Using Part-Based Tree-Structured Character Detection

20 0.57499778 67 cvpr-2013-Blocks That Shout: Distinctive Parts for Scene Classification


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(10, 0.105), (16, 0.03), (26, 0.038), (28, 0.284), (33, 0.299), (67, 0.05), (69, 0.051), (87, 0.062)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.88788831 192 cvpr-2013-Graph Matching with Anchor Nodes: A Learning Approach

Author: Nan Hu, Raif M. Rustamov, Leonidas Guibas

Abstract: In this paper, we consider the weighted graph matching problem with partially disclosed correspondences between a number of anchor nodes. Our construction exploits recently introduced node signatures based on graph Laplacians, namely the Laplacian family signature (LFS) on the nodes, and the pairwise heat kernel map on the edges. In this paper, without assuming an explicit form of parametric dependence nor a distance metric between node signatures, we formulate an optimization problem which incorporates the knowledge of anchor nodes. Solving this problem gives us an optimized proximity measure specific to the graphs under consideration. Using this as a first order compatibility term, we then set up an integer quadratic program (IQP) to solve for a near optimal graph matching. Our experiments demonstrate the superior performance of our approach on randomly generated graphs and on two widelyused image sequences, when compared with other existing signature and adjacency matrix based graph matching methods.

2 0.870915 428 cvpr-2013-The Episolar Constraint: Monocular Shape from Shadow Correspondence

Author: Austin Abrams, Kylia Miskell, Robert Pless

Abstract: Shadows encode a powerful geometric cue: if one pixel casts a shadow onto another, then the two pixels are colinear with the lighting direction. Given many images over many lighting directions, this constraint can be leveraged to recover the depth of a scene from a single viewpoint. For outdoor scenes with solar illumination, we term this the episolar constraint, which provides a convex optimization to solve for the sparse depth of a scene from shadow correspondences, a method to reduce the search space when finding shadow correspondences, and a method to geometrically calibrate a camera using shadow constraints. Our method constructs a dense network of nonlocal constraints which complements recent work on outdoor photometric stereo and cloud based cues for 3D. We demonstrate results across a variety of time-lapse sequences from webcams “in . wu st l. edu (b)(c) the wild.”

3 0.86787724 261 cvpr-2013-Learning by Associating Ambiguously Labeled Images

Author: Zinan Zeng, Shijie Xiao, Kui Jia, Tsung-Han Chan, Shenghua Gao, Dong Xu, Yi Ma

Abstract: We study in this paper the problem of learning classifiers from ambiguously labeled images. For instance, in the collection of new images, each image contains some samples of interest (e.g., human faces), and its associated caption has labels with the true ones included, while the samplelabel association is unknown. The task is to learn classifiers from these ambiguously labeled images and generalize to new images. An essential consideration here is how to make use of the information embedded in the relations between samples and labels, both within each image and across the image set. To this end, we propose a novel framework to address this problem. Our framework is motivated by the observation that samples from the same class repetitively appear in the collection of ambiguously labeled training images, while they are just ambiguously labeled in each image. If we can identify samples of the same class from each image and associate them across the image set, the matrix formed by the samples from the same class would be ideally low-rank. By leveraging such a low-rank assump- tion, we can simultaneously optimize a partial permutation matrix (PPM) for each image, which is formulated in order to exploit all information between samples and labels in a principled way. The obtained PPMs can be readily used to assign labels to samples in training images, and then a standard SVM classifier can be trained and used for unseen data. Experiments on benchmark datasets show the effectiveness of our proposed method.

same-paper 4 0.85965753 134 cvpr-2013-Discriminative Sub-categorization

Author: Minh Hoai, Andrew Zisserman

Abstract: The objective of this work is to learn sub-categories. Rather than casting this as a problem of unsupervised clustering, we investigate a weakly supervised approach using both positive and negative samples of the category. We make the following contributions: (i) we introduce a new model for discriminative sub-categorization which determines cluster membership for positive samples whilst simultaneously learning a max-margin classifier to separate each cluster from the negative samples; (ii) we show that this model does not suffer from the degenerate cluster problem that afflicts several competing methods (e.g., Latent SVM and Max-Margin Clustering); (iii) we show that the method is able to discover interpretable sub-categories in various datasets. The model is evaluated experimentally over various datasets, and itsperformance advantages over k-means and Latent SVM are demonstrated. We also stress test the model and show its resilience in discovering sub-categories as the parameters are varied.

5 0.85841775 4 cvpr-2013-3D Visual Proxemics: Recognizing Human Interactions in 3D from a Single Image

Author: Ishani Chakraborty, Hui Cheng, Omar Javed

Abstract: We present a unified framework for detecting and classifying people interactions in unconstrained user generated images. 1 Unlike previous approaches that directly map people/face locations in 2D image space into features for classification, we first estimate camera viewpoint and people positions in 3D space and then extract spatial configuration features from explicit 3D people positions. This approach has several advantages. First, it can accurately estimate relative distances and orientations between people in 3D. Second, it encodes spatial arrangements of people into a richer set of shape descriptors than afforded in 2D. Our 3D shape descriptors are invariant to camera pose variations often seen in web images and videos. The proposed approach also estimates camera pose and uses it to capture the intent of the photo. To achieve accurate 3D people layout estimation, we develop an algorithm that robustly fuses semantic constraints about human interpositions into a linear camera model. This enables our model to handle large variations in people size, heights (e.g. age) and poses. An accurate 3D layout also allows us to construct features informed by Proxemics that improves our semantic classification. To characterize the human interaction space, we introduce visual proxemes; a set of prototypical patterns that represent commonly occurring social interactions in events. We train a discriminative classifier that classifies 3D arrangements of people into visual proxemes and quantitatively evaluate the performance on a large, challenging dataset.

6 0.84606397 328 cvpr-2013-Pedestrian Detection with Unsupervised Multi-stage Feature Learning

7 0.82635766 232 cvpr-2013-Joint Geodesic Upsampling of Depth Images

8 0.81723768 234 cvpr-2013-Joint Spectral Correspondence for Disparate Image Matching

9 0.78502643 206 cvpr-2013-Human Pose Estimation Using Body Parts Dependent Joint Regressors

10 0.77814275 284 cvpr-2013-Mesh Based Semantic Modelling for Indoor and Outdoor Scenes

11 0.77706212 104 cvpr-2013-Deep Convolutional Network Cascade for Facial Point Detection

12 0.77208292 432 cvpr-2013-Three-Dimensional Bilateral Symmetry Plane Estimation in the Phase Domain

13 0.77178419 221 cvpr-2013-Incorporating Structural Alternatives and Sharing into Hierarchy for Multiclass Object Recognition and Detection

14 0.77167338 395 cvpr-2013-Shape from Silhouette Probability Maps: Reconstruction of Thin Objects in the Presence of Silhouette Extraction and Calibration Error

15 0.77094698 19 cvpr-2013-A Minimum Error Vanishing Point Detection Approach for Uncalibrated Monocular Images of Man-Made Environments

16 0.76983953 225 cvpr-2013-Integrating Grammar and Segmentation for Human Pose Estimation

17 0.76958042 394 cvpr-2013-Shading-Based Shape Refinement of RGB-D Images

18 0.76944309 299 cvpr-2013-Multi-source Multi-scale Counting in Extremely Dense Crowd Images

19 0.76911128 54 cvpr-2013-BRDF Slices: Accurate Adaptive Anisotropic Appearance Acquisition

20 0.76827633 80 cvpr-2013-Category Modeling from Just a Single Labeling: Use Depth Information to Guide the Learning of 2D Models