nips nips2011 nips2011-233 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Iasonas Kokkinos
Abstract: In this work we use Branch-and-Bound (BB) to efficiently detect objects with deformable part models. Instead of evaluating the classifier score exhaustively over image locations and scales, we use BB to focus on promising image locations. The core problem is to compute bounds that accommodate part deformations; for this we adapt the Dual Trees data structure [7] to our problem. We evaluate our approach using Mixture-of-Deformable Part Models [4]. We obtain exactly the same results but are 10-20 times faster on average. We also develop a multiple-object detection variation of the system, where hypotheses for 20 categories are inserted in a common priority queue. For the problem of finding the strongest category in an image this results in a 100-fold speedup.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract In this work we use Branch-and-Bound (BB) to efficiently detect objects with deformable part models. [sent-3, score-0.225]
2 Instead of evaluating the classifier score exhaustively over image locations and scales, we use BB to focus on promising image locations. [sent-4, score-0.347]
3 The core problem is to compute bounds that accommodate part deformations; for this we adapt the Dual Trees data structure [7] to our problem. [sent-5, score-0.15]
4 We also develop a multiple-object detection variation of the system, where hypotheses for 20 categories are inserted in a common priority queue. [sent-8, score-0.433]
5 1 Introduction Deformable Part Models (DPMs) deliver state-of-the-art object detection results [4] on challenging benchmarks when trained discriminatively, and have become a standard in object recognition research. [sent-10, score-0.435]
6 At the heart of these models lies the optimization of a merit function - the classifier score - with respect to the part displacements and the global object pose. [sent-11, score-0.293]
7 The most common detection algorithm used in conjunction with DPMs relies on Generalized Distance Transforms (GDTs) [5], whose complexity is linear in the image size. [sent-13, score-0.222]
8 We also develop a multiple-object detection variation of the system, where all object hypotheses are inserted in the same priority queue. [sent-21, score-0.577]
9 If our task is to find the best (or k-best) object hypotheses in an image this can result in a 100-fold speedup. [sent-22, score-0.253]
10 2 Previous Work on Efficient Detection Cascaded object detection [20] has led to a proliferation of vision applications, but far less work exists for part-based models. [sent-23, score-0.291]
11 The combinatorics of matching have been extensively studied for rigid objects [8], while [17] used A∗ for detecting object instances. [sent-24, score-0.193]
12 For categories, recent works [1, 10, 11, 19, 6, 18, 15] have focused on reducing the high-dimensional pose search space during detection by initially simplifying the cost function being optimized, mostly using ideas similar to A∗ and coarse-to-fine processing. [sent-25, score-0.181]
13 Branch-and-bound (BB) prioritizes the search of promising image areas, as indicated by an upper bound on the classifier’s score. [sent-27, score-0.351]
14 One of the most influential papers has been the Efficient Subwindow Search (ESS) technique of [12], where an upper bound of a bag-of-words classifier score delivers the bounds required by BB. [sent-28, score-0.399]
15 Later [16] combined Graph-Cuts with BB for object segmentation, while in [13] a general cascade system was devised for efficient detection with a nonlinear classifier. [sent-29, score-0.332]
16 Each node p has a unary observation potential Up(x), indicating the fidelity of the image at x to the node. [sent-45, score-0.274]
17 Moreover we consider η0 = 0, µ0 = 0 and H0 , V0 large enough so that B0 (xp , x0 ) will be zero for xp = x0 and practically infinite elsewhere. [sent-50, score-0.227]
18 If the root is at x0 the merit for part p being at xp is given by $m_p(x_p, x_0) = U_p(x_p) + B_p(x_p, x_0)$; summing over p gives the score $\sum_p m_p(x_p, x_0)$ of a root-and-parts configuration $X = (x_0, \ldots, x_P)$. [sent-51, score-0.827]
19 $S(x) = \max_{x_p} \sum_{p=0}^{P} m_p(x_p, x) = \sum_{p=0}^{P} \max_{x_p} m_p(x_p, x)$ (1). A GDT can be used to maximize each summand in Eq. 1. [sent-57, score-0.675]
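As a rough illustration of Eq. 1, here is a minimal brute-force sketch (not the paper's GDT-based implementation); the array layout and the names U, H, V, eta, nu are hypothetical stand-ins for the trained model of [4]:

```python
import numpy as np

def dpm_score(U, H, V, eta, nu, x0):
    """Brute-force Eq. 1: S(x0) = sum_p max_{x_p} m_p(x_p, x0), with
    m_p(x_p, x0) = U_p(x_p) - H_p(h_p-h0-eta_p)^2 - V_p(v_p-v0-nu_p)^2."""
    h0, v0 = x0
    score = 0.0
    for p, Up in enumerate(U):                      # parts p = 0..P
        hs, vs = np.meshgrid(np.arange(Up.shape[0]),
                             np.arange(Up.shape[1]), indexing='ij')
        # merit of every candidate placement of part p
        m = Up - H[p] * (hs - h0 - eta[p]) ** 2 \
               - V[p] * (vs - v0 - nu[p]) ** 2
        score += m.max()                            # best placement of part p
    return score
```

A GDT replaces the inner maximization with a linear-time lower-envelope computation; the point of BB is to avoid evaluating the score densely at every root location in the first place.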
20 For a typical detection threshold, e.g. −1 for a classifier trained with SVMs, the GDT-based approach turns out to be wasteful as it treats equally all image locations, even those where we can quickly realize that the classifier score cannot exceed this threshold. [sent-64, score-0.149]
21 This is shown in Fig. 1: in (a) we show the part-root configuration that gives the maximum score, and in (b) the score of a bicycle model from [4] over the whole image domain. [sent-66, score-0.149]
22 Figure 1: (a) Input & detection result; (b) detector score S(x); (c) BB for $\arg\max_x S(x)$; (d) BB for S(x) ≥ −1. [sent-67, score-0.178]
23 Typically only a tiny portion of the image domain should be positive; in (b) we draw a black contour around {x : S(x) > −1} for an SVM-based classifier. [sent-69, score-0.223]
24 BB ignores large intervals with low S(x) by upper bounding their values, and postponing their ‘exploration’ in favor of more promising ones. [sent-70, score-0.327]
25 In (c) we show as heat maps the upper bounds of the intervals visited by BB until the strongest location was explored, and in (d) of the intervals visited until all locations x with S(x) > −1 were explored. [sent-71, score-0.52]
Our approach speeds up detection by upper bounding the score of the detector within intervals of x while using low-cost operations. [sent-72, score-0.463]
27 This allows us to use a prioritized search strategy that can refine these bounds on promising intervals, while postponing the exploration of less promising intervals. [sent-73, score-0.353]
This is shown in Fig. 1(c,d), where we show as heat maps the upper bounds of the intervals visited by BB: parts of the image where the heat maps are more fine-grained correspond to image locations that seemed promising. [sent-75, score-0.63]
When finding all x : S(x) > −1 (d), a large part of the image domain is effectively ignored and the algorithm obtains refined bounds only around ‘interesting’ image locations. [sent-78, score-0.448]
30 2 Dual Trees: Data Structures for Set-Set interactions The main technical challenge is to efficiently compute upper bounds for a model involving deformable parts; our main contribution consists in realizing that this can be accomplished with the Dual Tree data structure of [7]. [sent-80, score-0.291]
We refer to XS as ‘source’ terms, and to XD as ‘domain’ terms, the idea being that the source points XS generate a ‘field’ P, which we want to evaluate at the domain locations XD. [sent-91, score-0.384]
32 The Dual Tree algorithm efficiently computes this sum by using two KD-trees, one (S) for the source locations XS and another (D) for the domain locations XD . [sent-94, score-0.401]
This is illustrated in Fig. 2: if a ‘chunk’ of source points cannot affect a ‘chunk’ of domain points, we skip computing their domain-source point interactions. [sent-97, score-0.305]
34 BB searches for the interval containing the function’s maximum using a prioritized search strategy; the priority of an interval is determined by the function’s upper bound within it. [sent-99, score-0.598]
35 Starting from an interval containing the whole function domain, BB increasingly narrows down to the solution: at each step an interval of solutions is popped from a priority queue, split into sub-intervals (Branch), and a new upper bound for those intervals is computed (Bound). [sent-100, score-0.688]
36 These intervals are then inserted in the priority queue and the process repeats until a singleton interval is popped. [sent-101, score-0.505]
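The BB loop just described can be sketched generically as follows (a minimal sketch; split, upper_bound, and is_singleton are hypothetical callbacks, and priorities are negated because Python's heapq is a min-heap):

```python
import heapq
from itertools import count

def branch_and_bound(root, upper_bound, split, is_singleton):
    """Generic BB maximization: pop the interval with the largest upper
    bound, Branch it, Bound the sub-intervals, and stop when the first
    singleton interval is popped (its bound is then tight)."""
    tie = count()   # tiebreaker so heapq never compares intervals directly
    queue = [(-upper_bound(root), next(tie), root)]
    while queue:
        neg_ub, _, interval = heapq.heappop(queue)
        if is_singleton(interval):
            return interval, -neg_ub                # the maximizer
        for child in split(interval):               # Branch
            heapq.heappush(queue,
                           (-upper_bound(child), next(tie), child))  # Bound
```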
The score in Eq. 1 is a sum of scores of the form: $s_p(x_0) = \max_{x_p} m_p(x_p, x_0) = \max_{(h_p, v_p)} U_p(h_p, v_p) - (h_p - h_0 - \eta_p)^2 H_p - (v_p - v_0 - \nu_p)^2 V_p$ (3). [sent-109, score-0.498]
Using Dual Tree terminology, the ‘source points’ correspond to part locations $x_p$, i.e. [sent-110, score-0.373]
$X_{S_p} = \{x_p\}$, and the ‘domain points’ to object locations $x_0$, i.e. $X_D = \{x_0\}$. [sent-112, score-0.227]
40 Dual Trees allow us to efficiently derive bounds for sp (x0 ), x0 ∈ XD , the scores that a set of object locations can have due to a set of part p locations. [sent-115, score-0.455]
Once these are formed, we add over parts to bound the score $S(x_0) = \sum_p s_p(x_0)$, $x_0 \in X_D$. [sent-116, score-0.297]
We denote by $w_i^p = U_p(x_i)$ the unary potential of part p at location $x_i$. [sent-122, score-0.299]
43 We shift the unary scores by the nominal offsets µ, which gives new source locations: xi → xi − µp , (hi , vi ) → (hi − ηp , vi − νp ). [sent-123, score-0.395]
44 Finally, we drop p from mp , Hp and Vp unless necessary. [sent-124, score-0.168]
To maximize Eq. 4 at $(h_0, v_0)$ we use prioritized search over intervals of $i \in S_p$, starting from $S_p$ and gradually narrowing down to the best i. [sent-128, score-0.23]
To prioritize intervals we use a KD-tree for the source points $x_i \in X_{S_p}$ to quickly compute bounds of Eq. 4. [sent-129, score-0.329]
Specifically, if $S_n$ is the set of children of the n-th node of the KD-tree for $S_p$, consider the subproblem: $m_n(h_0, v_0) = \max_{i \in S_n} w_i - H(h_i - h_0)^2 - V(v_i - v_0)^2 = \max_{i \in S_n} w_i + G_i$ (5). [sent-131, score-0.433]
We know that for all points $(h_i, v_i)$ within $S_n$ we have $h_i \in [l_n, r_n]$ and $v_i \in [b_n, t_n]$, where l, r, b, t are the left, right, bottom, and top sides defining n’s bounding box, $B_n$. [sent-134, score-0.414]
49 The upper bound is zero inside Bn and uses the boundaries of Bn that lie closest to (h0 , v0 ), when (h0 , v0 ) is outside Bn . [sent-136, score-0.198]
For the first term in Eq. 5, for both bounds we can use the value $w_j$, $j = \arg\max_{i \in S_n} w_i$. [sent-139, score-0.251]
For the lower bound, since $G_i \geq G_n\ \forall i \in S_n$, we have $\max_{i \in S_n} w_i + G_i \geq w_j + G_j \geq w_j + G_n$. [sent-141, score-0.171]
So $w_j + G_n$ provides a proper lower bound for $\max_{i \in S_n} w_i + G_i$. [sent-142, score-0.224]
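A minimal sketch of these point-to-box bounds, assuming a node's bounding box is given as (l, r, b, t) as in the text (the function name is ours):

```python
def g_bounds(h0, v0, box, H, V):
    """Bounds on G_i = -H(h_i - h0)^2 - V(v_i - v0)^2 over all source
    points inside box = (l, r, b, t)."""
    l, r, b, t = box
    # Upper bound: scaled squared distance to the closest point of the box
    # (zero when (h0, v0) lies inside the box).
    dh = max(l - h0, 0.0, h0 - r)
    dv = max(b - v0, 0.0, v0 - t)
    # Lower bound: scaled squared distance to the furthest corner of the box.
    dh_far = max(h0 - l, r - h0)
    dv_far = max(v0 - b, t - v0)
    return -H * dh ** 2 - V * dv ** 2, -H * dh_far ** 2 - V * dv_far ** 2
```

Combined with $w_j = \max_{i \in S_n} w_i$, the two returned values bound $m_n(h_0, v_0)$ from above and below.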
Figure 3: Supporter pruning: source nodes {m, n, o} are among the possible supporters of domain-node l. [sent-145, score-0.278]
54 Their upper and lower bounds (shown as numbers to the right of each node) are used to prune them. [sent-146, score-0.182]
Here, the upper bound for n (3) is smaller than the maximal lower bound among supporters (4, from o): this implies the upper bound of the contributions of n’s children to l’s children (shown here for l1) will not surpass the lower bound of o’s children. [sent-147, score-0.949]
56 We can use the upper bound in a prioritized search for the maximum of m(h0 , v0 ), as described in Table 1. [sent-149, score-0.339]
Starting with the root of the KD-tree, we expand its child nodes, estimate their priorities (upper bounds), and insert them in a priority queue. [sent-150, score-0.308]
The search stops when the first leaf node is popped; this provides the maximizer, as its upper and lower bounds coincide and all other elements waiting in the queue have smaller upper bounds. [sent-151, score-0.523]
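Putting the two together, the single-domain-point search might look as follows (a sketch, not the paper's code: the node fields children, box, w_max, and point are hypothetical, with w_max caching the largest unary score in the subtree; g_bounds is the sketch above):

```python
import heapq
from itertools import count

def max_single_point(root, h0, v0, H, V):
    """Best-first search of the Source tree for one domain point: stop at
    the first popped leaf, whose bound coincides with its exact value."""
    tie = count()
    queue = [(-(root.w_max + g_bounds(h0, v0, root.box, H, V)[0]),
              next(tie), root)]
    while queue:
        neg_ub, _, node = heapq.heappop(queue)
        if not node.children:           # leaf box = the point: bound is exact
            return node.point, -neg_ub
        for child in node.children:
            ub = child.w_max + g_bounds(h0, v0, child.box, H, V)[0]
            heapq.heappush(queue, (-ub, next(tie), child))
```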
2 Maximization for All Domain Points Having described how KD-trees provide bounds in the single-domain-point case, we now describe how Dual Trees can speed up this operation when treating multiple domain points simultaneously. [sent-156, score-0.572]
Specifically, we consider the following maximization problem: $x^* = \arg\max_{x \in X_D} m(x) = \arg\max_{j \in D} \max_{i \in S} w_i - H(h_i - h_j)^2 - V(v_i - v_j)^2$ (8), where $X_D$/D is the set of domain points/indices and S are the source indices. [sent-157, score-0.649]
61 For this, as in the original Dual Tree work, we build a second KD-tree for the domain points (‘Domain tree’, as opposed to ‘Source tree’). [sent-160, score-0.21]
62 The nodes in the Domain tree (‘domain-nodes’) correspond to intervals of domain points that are processed jointly. [sent-161, score-0.37]
63 This saves repetitions of similar bounding operations, and quickly discards large domain areas with poor bounds. [sent-162, score-0.202]
As before, we consider the effect of source points contained in a node $S_n$ of the Source tree. [sent-165, score-0.24]
65 The difference is that now we bound the maximum of this quantity over domain points contained in a domain-node Dl . [sent-166, score-0.309]
Specifically, we consider the quantity: $m_{l,n} = \max_{j \in D_l} \max_{i \in S_n} w_i - H(h_i - h_j)^2 - V(v_i - v_j)^2$ (9). Bounding $G_{i,j} = -H(h_i - h_j)^2 - V(v_i - v_j)^2$ involves two 2D intervals, one for the domain-node l and one for the source-node n. [sent-167, score-0.255]
67 The upper bound is zero if the boxes overlap, or else equals the (scaled) distance of their closest points. [sent-170, score-0.198]
68 The lower bound uses the furthest points of the two boxes. [sent-171, score-0.161]
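A sketch of the corresponding box-box bounds, with both boxes in the same (l, r, b, t) layout as before (again, the function name is ours):

```python
def g_bounds_boxes(src_box, dom_box, H, V):
    """Bounds on G_ij = -H(h_i - h_j)^2 - V(v_i - v_j)^2 over a source
    box and a domain box, each given as (l, r, b, t)."""
    sl, sr, sb, st = src_box
    dl, dr, db, dt = dom_box
    # Upper bound: scaled squared distance between the closest points of
    # the two boxes (zero when they overlap).
    dh = max(sl - dr, 0.0, dl - sr)
    dv = max(sb - dt, 0.0, db - st)
    # Lower bound: scaled squared distance between the furthest points.
    dh_far = max(sr - dl, dr - sl)
    dv_far = max(st - db, dt - sb)
    return -H * dh ** 2 - V * dv ** 2, -H * dh_far ** 2 - V * dv_far ** 2
```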
We start by associating the root node of the Domain tree with the root node of the Source tree, which means that all domain-source point interactions are initially considered. [sent-180, score-0.319]
More specifically, to determine the supporters for a child m of domain-node l we consider only the children of the source-nodes in $S_l$; formally, denoting by pa and ch the parent and child operations respectively, we have $S_m \subset \bigcup_{n \in S_{pa(m)}} \{ch(n)\}$. [sent-182, score-0.446]
71 This is achieved by pruning based on both the lower and upper bounds derived above. [sent-184, score-0.182]
Descending to finer nodes makes the upper bounds less optimistic and the lower bounds more optimistic. [sent-187, score-0.265]
Denoting the maximal lower bound for contributions to parent node l by $G_l = \max_{n \in S_l} G_{l,n}$, this means that $G_k \geq G_l$ if $pa(k) = l$. [sent-188, score-0.182]
This means that if for source-node n at the parent level $G_{l,n} < G_l$, at the children level the children of n will contribute something worse than $G_m$, the lower bound on l’s child score. [sent-190, score-0.342]
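A sketch of this pruning step, reusing g_bounds_boxes and the hypothetical node fields from the sketches above:

```python
def prune_supporters(domain_child, parent_supporters, H, V):
    """Refine and prune supporters (Fig. 3): expand the children of the
    parent's supporters, then drop any candidate whose upper bound falls
    below the best lower bound among the candidates."""
    cands = []
    for n in parent_supporters:
        for c in (n.children or [n]):       # a leaf supports itself
            ub, lb = g_bounds_boxes(c.box, domain_child.box, H, V)
            cands.append((c, c.w_max + ub, c.w_max + lb))
    best_lb = max(lb for _, _, lb in cands)
    return [c for c, ub, _ in cands if ub >= best_lb], best_lb
```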
The algorithm uses a priority queue for Domain tree nodes, initialized with the root of the Domain tree (i.e. all domain points). [sent-196, score-0.489]
At each iteration we pop a Domain tree node from the queue and compute upper bounds and supporters for its children, which are then pushed into the priority queue. [sent-199, score-0.789]
The first leaf node that is popped contains the best domain location: its upper bound equals its lower bound, and all other nodes in the priority queue have smaller upper bounds and therefore cannot result in a better solution. [sent-200, score-0.976]
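The resulting single-part Dual-Tree BB loop, sketched under the assumption that the Source and Domain trees have equal depth (so a popped domain leaf has point supporters and a tight bound); g_bounds_boxes and prune_supporters are the sketches above:

```python
import heapq
from itertools import count

def dual_tree_bb(domain_root, source_root, H, V):
    """Single-part Dual-Tree BB: pop the domain node with the best upper
    bound, refine its children's supporters, push, and stop at the first
    popped leaf."""
    tie = count()
    ub0 = source_root.w_max + g_bounds_boxes(source_root.box,
                                             domain_root.box, H, V)[0]
    queue = [(-ub0, next(tie), domain_root, [source_root])]
    while queue:
        neg_ub, _, node, supporters = heapq.heappop(queue)
        if not node.children:
            return node.point, -neg_ub      # best domain location and score
        for child in node.children:
            kept, _ = prune_supporters(child, supporters, H, V)
            ub = max(n.w_max + g_bounds_boxes(n.box, child.box, H, V)[0]
                     for n in kept)
            heapq.heappush(queue, (-ub, next(tie), child, kept))
```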
78 To see this, recall that at each step BB pops a domain of the function being maximized from the priority queue, splits it into subdomains (Branch), and computes a new upper bound for the subdomains (Bound). [sent-203, score-0.609]
In our case Branching amounts to considering the two descendants of the domain node being popped, while Bounding amounts to taking the maximum of the upper bounds of the domain node’s supporters. [sent-204, score-0.644]
Namely, we only consider using points in S for object parts, and subtract $\mu_p$ from $h_i, v_i$ to yield simple quadratic forms; since $\mu_p$ is part-dependent, we now have a p superscript for $h_i, v_i$. [sent-209, score-0.954]
81 Finally, wp,i depends on p, since the same image point will give different unary potentials for different object parts. [sent-211, score-0.372]
From this form we realize that computing the upper bound of m(x) within a range of values of x, as required by Branch-and-Bound, is as easy as it was for the single terms in the previous section. [sent-212, score-0.198]
Specifically, we have $m(x) = \sum_{p=0}^{P} m_p(x)$, where $m_p$ are the individual part contributions; since $\max_x \sum_{p=0}^{P} m_p(x) \leq \sum_{p=0}^{P} \max_x m_p(x)$, [sent-213, score-0.861]
84 we can separately upper bound the individual part contributions, and sum them up to get an overall upper bound. [sent-214, score-0.364]
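A sketch of this per-part summation, under the same assumptions as above (supporters_per_part[p] holds the source nodes part p lends to the domain node):

```python
def sum_part_bounds(domain_node, supporters_per_part, H, V):
    """Overall upper bound for a domain node: sum over parts of the best
    bound lent by that part's supporters (max of sum <= sum of maxes)."""
    return sum(max(n.w_max +
                   g_bounds_boxes(n.box, domain_node.box, H[p], V[p])[0]
                   for n in supporters)
               for p, supporters in enumerate(supporters_per_part))
```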
85 Each part’s contribution to the score is computed using the supporters it lends to the node; the total bound is obtained by summing the individual part bounds. [sent-217, score-0.423]
[Table 1, excerpt] ... LB = BoundL(x, child); Push(S, child); end for; end while. — Multiple Domain Points, Multiple Parts: IN: ST[P], DT {P Source Trees / Domain Tree}; OUT: $\arg\max_{x \in DT} \sum_p \max_{i \in ST[p]} m(x, x_p, i)$; Seed = DT.
[Table 1, excerpt] ... supc = supch; Push(S, child); end for; end while. — Bounding Routine: IN: child, supporters, DT, ST; OUT: supch, LB {chosen supporters, max LB}; UB = −∞; LB = ∞; for n ∈ supporters do UB[n] = BoundU(DT.
88 We directly extend the Branch-and-Bound technique that we developed for a single DPM to deal with multiple scales and mixtures (‘ORs’) of DPMs [4, 21], by inserting all object hypotheses into the same queue. [sent-248, score-0.178]
89 To detect multiple instances of objects at multiple scales we continue BB after getting the best scoring object hypothesis. [sent-249, score-0.193]
As a termination criterion, we stop when we pop an interval whose upper bound is below a fixed threshold. [sent-250, score-0.326]
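A sketch of this multi-hypothesis loop (hypothetical callbacks as before; one seed per scale/mixture, all sharing a single queue):

```python
import heapq
from itertools import count

def detect_all(seeds, threshold, upper_bound, split, is_singleton):
    """Multi-scale / multi-mixture detection: BB continues past the
    strongest hypothesis, emitting every popped singleton, and stops once
    a popped upper bound falls below the threshold."""
    tie = count()
    queue = [(-upper_bound(s), next(tie), s) for s in seeds]
    heapq.heapify(queue)
    detections = []
    while queue:
        neg_ub, _, hyp = heapq.heappop(queue)
        if -neg_ub < threshold:             # every remaining bound is lower
            break
        if is_singleton(hyp):
            detections.append((hyp, -neg_ub))   # exact score at a singleton
        else:
            for child in split(hyp):
                heapq.heappush(queue,
                               (-upper_bound(child), next(tie), child))
    return detections
```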
91 We consider the standard detection scenario where we want to detect all objects in an image having score above a certain threshold. [sent-257, score-0.345]
The threshold affects the speedup we obtain; for a conservative threshold the speedup is typically tenfold, but as we become more aggressive it doubles. [sent-265, score-0.262]
93 Our claim is that Branch-and-Bound allows us to pursue a different approach, where in fact having more object categories can increase the speed of detection, if we leave the unary potential computation aside. [sent-270, score-0.26]
Specifically, our approach can be directly extended to the multiple-object detection setting; as long as the scores computed by different object categories are commensurate, they can all be inserted in the same priority queue. [sent-271, score-0.543]
The reason is that including in our object repertoire a model giving a large score helps BB stop; otherwise BB keeps searching for another object. [sent-273, score-0.218]
We compare the time that would be required by GDT to perform detection of all object categories considered in Pascal to that of our system simultaneously exploring all models. [sent-276, score-0.196]
Of course if we use a fixed threshold the speedup would not change, when compared to plot (a), since essentially the objects do not ‘interact’ in any way (we do not use non-maximum suppression). [sent-279, score-0.18]
98 We note that the timings refer to the ‘message passing’ part implemented with GDT and not the computation of unary potentials, which is common for both models, and is currently the bottleneck. [sent-281, score-0.183]
99 Even though it is tangential to our contribution in this paper, we mention that as shown in plot (d) we compute unary potentials approximately five times faster than the single-threaded convolution provided by [3] by exploiting Matlab’s optimized matrix multiplication routines. [sent-282, score-0.153]
100 We have used Dual Trees to compute upper bounds on the cost function of a part-based model and thereby derived a Branch-and-Bound algorithm for detection. [sent-284, score-0.182]
wordName wordTfidf (topN-words)
[('bb', 0.349), ('xp', 0.227), ('hp', 0.205), ('supporters', 0.183), ('priority', 0.181), ('mp', 0.168), ('dpms', 0.163), ('domain', 0.148), ('detection', 0.147), ('object', 0.144), ('vp', 0.142), ('dpm', 0.142), ('popped', 0.142), ('speedup', 0.131), ('ub', 0.131), ('gn', 0.126), ('dual', 0.125), ('queue', 0.125), ('sn', 0.118), ('unary', 0.116), ('hi', 0.114), ('xd', 0.109), ('deformable', 0.109), ('prioritized', 0.107), ('bound', 0.099), ('upper', 0.099), ('source', 0.095), ('vi', 0.092), ('pop', 0.089), ('intervals', 0.089), ('children', 0.086), ('bounds', 0.083), ('node', 0.083), ('merit', 0.082), ('lb', 0.082), ('sp', 0.082), ('gdt', 0.081), ('locations', 0.079), ('wi', 0.079), ('push', 0.078), ('image', 0.075), ('score', 0.074), ('tree', 0.071), ('inserted', 0.071), ('child', 0.071), ('bn', 0.071), ('part', 0.067), ('gi', 0.064), ('dv', 0.063), ('points', 0.062), ('trees', 0.062), ('dts', 0.061), ('kokkinos', 0.061), ('maxlb', 0.061), ('supch', 0.061), ('maxx', 0.061), ('branch', 0.059), ('hn', 0.059), ('vn', 0.058), ('maxi', 0.057), ('dh', 0.056), ('hl', 0.056), ('bounding', 0.054), ('sl', 0.054), ('vl', 0.054), ('max', 0.053), ('xs', 0.05), ('objects', 0.049), ('cvpr', 0.047), ('maximization', 0.047), ('wj', 0.046), ('promising', 0.044), ('delivers', 0.044), ('heat', 0.044), ('arg', 0.043), ('parts', 0.042), ('st', 0.042), ('cascade', 0.041), ('felzenszwalb', 0.041), ('boundl', 0.041), ('boundu', 0.041), ('gdts', 0.041), ('postponing', 0.041), ('subdomains', 0.041), ('supporter', 0.041), ('xsp', 0.041), ('root', 0.041), ('interval', 0.039), ('pascal', 0.039), ('pseudocode', 0.038), ('gl', 0.038), ('location', 0.037), ('potentials', 0.037), ('chunk', 0.036), ('hj', 0.035), ('pa', 0.035), ('end', 0.034), ('er', 0.034), ('search', 0.034), ('hypotheses', 0.034)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 233 nips-2011-Rapid Deformable Object Detection using Dual-Tree Branch-and-Bound
Author: Iasonas Kokkinos
Abstract: In this work we use Branch-and-Bound (BB) to efficiently detect objects with deformable part models. Instead of evaluating the classifier score exhaustively over image locations and scales, we use BB to focus on promising image locations. The core problem is to compute bounds that accommodate part deformations; for this we adapt the Dual Trees data structure [7] to our problem. We evaluate our approach using Mixture-of-Deformable Part Models [4]. We obtain exactly the same results but are 10-20 times faster on average. We also develop a multiple-object detection variation of the system, where hypotheses for 20 categories are inserted in a common priority queue. For the problem of finding the strongest category in an image this results in a 100-fold speedup.
2 0.14323732 12 nips-2011-A Two-Stage Weighting Framework for Multi-Source Domain Adaptation
Author: Qian Sun, Rita Chattopadhyay, Sethuraman Panchanathan, Jieping Ye
Abstract: Discriminative learning when training and test data belong to different distributions is a challenging and complex task. Often times we have very few or no labeled data from the test or target distribution but may have plenty of labeled data from multiple related sources with different distributions. The difference in distributions may be both in marginal and conditional probabilities. Most of the existing domain adaptation work focuses on the marginal probability distribution difference between the domains, assuming that the conditional probabilities are similar. However in many real world applications, conditional probability distribution differences are as commonplace as marginal probability differences. In this paper we propose a two-stage domain adaptation methodology which combines weighted data from multiple sources based on marginal probability differences (first stage) as well as conditional probability differences (second stage), with the target domain data. The weights for minimizing the marginal probability differences are estimated independently, while the weights for minimizing conditional probability differences are computed simultaneously by exploiting the potential interaction among multiple sources. We also provide a theoretical analysis on the generalization performance of the proposed multi-source domain adaptation formulation using the weighted Rademacher complexity measure. Empirical comparisons with existing state-of-the-art domain adaptation methods using three real-world datasets demonstrate the effectiveness of the proposed approach. 1
3 0.13952786 1 nips-2011-$\theta$-MRF: Capturing Spatial and Semantic Structure in the Parameters for Scene Understanding
Author: Congcong Li, Ashutosh Saxena, Tsuhan Chen
Abstract: For most scene understanding tasks (such as object detection or depth estimation), the classifiers need to consider contextual information in addition to the local features. We can capture such contextual information by taking as input the features/attributes from all the regions in the image. However, this contextual dependence also varies with the spatial location of the region of interest, and we therefore need a different set of parameters for each spatial location. This results in a very large number of parameters. In this work, we model the independence properties between the parameters for each location and for each task, by defining a Markov Random Field (MRF) over the parameters. In particular, two sets of parameters are encouraged to have similar values if they are spatially close or semantically close. Our method is, in principle, complementary to other ways of capturing context such as the ones that use a graphical model over the labels instead. In extensive evaluation over two different settings, of multi-class object detection and of multiple scene understanding tasks (scene categorization, depth estimation, geometric labeling), our method beats the state-of-the-art methods in all the four tasks. 1
4 0.13571984 166 nips-2011-Maximal Cliques that Satisfy Hard Constraints with Application to Deformable Object Model Learning
Author: Xinggang Wang, Xiang Bai, Xingwei Yang, Wenyu Liu, Longin J. Latecki
Abstract: We propose a novel inference framework for finding maximal cliques in a weighted graph that satisfy hard constraints. The constraints specify the graph nodes that must belong to the solution as well as mutual exclusions of graph nodes, i.e., sets of nodes that cannot belong to the same solution. The proposed inference is based on a novel particle filter algorithm with state permeations. We apply the inference framework to a challenging problem of learning part-based, deformable object models. Two core problems in the learning framework, matching of image patches and finding salient parts, are formulated as two instances of the problem of finding maximal cliques with hard constraints. Our learning framework yields discriminative part based object models that achieve very good detection rate, and outperform other methods on object classes with large deformation. 1
5 0.1260107 96 nips-2011-Fast and Balanced: Efficient Label Tree Learning for Large Scale Object Recognition
Author: Jia Deng, Sanjeev Satheesh, Alexander C. Berg, Fei Li
Abstract: We present a novel approach to efficiently learn a label tree for large scale classification with many classes. The key contribution of the approach is a technique to simultaneously determine the structure of the tree and learn the classifiers for each node in the tree. This approach also allows fine grained control over the efficiency vs accuracy trade-off in designing a label tree, leading to more balanced trees. Experiments are performed on large scale image classification with 10184 classes and 9 million images. We demonstrate significant improvements in test accuracy and efficiency with less training time and more balanced trees compared to the previous state of the art by Bengio et al. 1
6 0.11891804 226 nips-2011-Projection onto A Nonnegative Max-Heap
7 0.11776754 286 nips-2011-The Local Rademacher Complexity of Lp-Norm Multiple Kernel Learning
8 0.11179025 193 nips-2011-Object Detection with Grammar Models
9 0.10794964 154 nips-2011-Learning person-object interactions for action recognition in still images
10 0.098483019 272 nips-2011-Stochastic convex optimization with bandit feedback
11 0.086637676 231 nips-2011-Randomized Algorithms for Comparison-based Search
12 0.085643411 91 nips-2011-Exploiting spatial overlap to efficiently compute appearance distances between image windows
13 0.084601507 180 nips-2011-Multiple Instance Filtering
14 0.083159126 140 nips-2011-Kernel Embeddings of Latent Tree Graphical Models
15 0.082942165 138 nips-2011-Joint 3D Estimation of Objects and Scene Layout
16 0.080784746 141 nips-2011-Large-Scale Category Structure Aware Image Categorization
17 0.075316571 170 nips-2011-Message-Passing for Approximate MAP Inference with Latent Variables
18 0.075103186 151 nips-2011-Learning a Tree of Metrics with Disjoint Visual Features
19 0.074109666 46 nips-2011-Better Mini-Batch Algorithms via Accelerated Gradient Methods
20 0.073294789 157 nips-2011-Learning to Search Efficiently in High Dimensions
topicId topicWeight
[(0, 0.222), (1, 0.053), (2, -0.128), (3, 0.083), (4, 0.056), (5, 0.026), (6, -0.068), (7, -0.104), (8, -0.009), (9, -0.027), (10, -0.018), (11, 0.023), (12, -0.15), (13, 0.144), (14, -0.019), (15, 0.017), (16, 0.018), (17, -0.055), (18, -0.011), (19, 0.021), (20, -0.069), (21, -0.018), (22, -0.061), (23, -0.052), (24, 0.027), (25, -0.029), (26, -0.031), (27, 0.155), (28, 0.014), (29, -0.045), (30, -0.033), (31, -0.048), (32, -0.023), (33, -0.092), (34, -0.053), (35, 0.07), (36, -0.003), (37, -0.003), (38, -0.049), (39, 0.111), (40, -0.072), (41, -0.049), (42, 0.063), (43, -0.036), (44, 0.06), (45, -0.0), (46, -0.005), (47, 0.145), (48, 0.051), (49, -0.092)]
simIndex simValue paperId paperTitle
same-paper 1 0.95960391 233 nips-2011-Rapid Deformable Object Detection using Dual-Tree Branch-and-Bound
Author: Iasonas Kokkinos
Abstract: In this work we use Branch-and-Bound (BB) to efficiently detect objects with deformable part models. Instead of evaluating the classifier score exhaustively over image locations and scales, we use BB to focus on promising image locations. The core problem is to compute bounds that accommodate part deformations; for this we adapt the Dual Trees data structure [7] to our problem. We evaluate our approach using Mixture-of-Deformable Part Models [4]. We obtain exactly the same results but are 10-20 times faster on average. We also develop a multiple-object detection variation of the system, where hypotheses for 20 categories are inserted in a common priority queue. For the problem of finding the strongest category in an image this results in a 100-fold speedup.
2 0.55919808 166 nips-2011-Maximal Cliques that Satisfy Hard Constraints with Application to Deformable Object Model Learning
Author: Xinggang Wang, Xiang Bai, Xingwei Yang, Wenyu Liu, Longin J. Latecki
Abstract: We propose a novel inference framework for finding maximal cliques in a weighted graph that satisfy hard constraints. The constraints specify the graph nodes that must belong to the solution as well as mutual exclusions of graph nodes, i.e., sets of nodes that cannot belong to the same solution. The proposed inference is based on a novel particle filter algorithm with state permeations. We apply the inference framework to a challenging problem of learning part-based, deformable object models. Two core problems in the learning framework, matching of image patches and finding salient parts, are formulated as two instances of the problem of finding maximal cliques with hard constraints. Our learning framework yields discriminative part based object models that achieve very good detection rate, and outperform other methods on object classes with large deformation. 1
3 0.5479089 226 nips-2011-Projection onto A Nonnegative Max-Heap
Author: Jun Liu, Liang Sun, Jieping Ye
Abstract: We consider the problem of computing the Euclidean projection of a vector of length p onto a non-negative max-heap—an ordered tree where the values of the nodes are all nonnegative and the value of any parent node is no less than the value(s) of its child node(s). This Euclidean projection plays a building block role in the optimization problem with a non-negative maxheap constraint. Such a constraint is desirable when the features follow an ordered tree structure, that is, a given feature is selected for the given regression/classification task only if its parent node is selected. In this paper, we show that such Euclidean projection problem admits an analytical solution and we develop a top-down algorithm where the key operation is to find the so-called maximal root-tree of the subtree rooted at each node. A naive approach for finding the maximal root-tree is to enumerate all the possible root-trees, which, however, does not scale well. We reveal several important properties of the maximal root-tree, based on which we design a bottom-up algorithm with merge for efficiently finding the maximal roottree. The proposed algorithm has a (worst-case) linear time complexity for a sequential list, and O(p2 ) for a general tree. We report simulation results showing the effectiveness of the max-heap for regression with an ordered tree structure. Empirical results show that the proposed algorithm has an expected linear time complexity for many special cases including a sequential list, a full binary tree, and a tree with depth 1. 1
4 0.53534174 193 nips-2011-Object Detection with Grammar Models
Author: Ross B. Girshick, Pedro F. Felzenszwalb, David A. McAllester
Abstract: Compositional models provide an elegant formalism for representing the visual appearance of highly variable objects. While such models are appealing from a theoretical point of view, it has been difficult to demonstrate that they lead to performance advantages on challenging datasets. Here we develop a grammar model for person detection and show that it outperforms previous high-performance systems on the PASCAL benchmark. Our model represents people using a hierarchy of deformable parts, variable structure and an explicit model of occlusion for partially visible objects. To train the model, we introduce a new discriminative framework for learning structured prediction models from weakly-labeled data. 1
5 0.52517271 290 nips-2011-Transfer Learning by Borrowing Examples for Multiclass Object Detection
Author: Joseph J. Lim, Antonio Torralba, Ruslan Salakhutdinov
Abstract: Despite the recent trend of increasingly large datasets for object detection, there still exist many classes with few training examples. To overcome this lack of training data for certain classes, we propose a novel way of augmenting the training data for each class by borrowing and transforming examples from other classes. Our model learns which training instances from other classes to borrow and how to transform the borrowed examples so that they become more similar to instances from the target class. Our experimental results demonstrate that our new object detector, with borrowed and transformed examples, improves upon the current state-of-the-art detector on the challenging SUN09 object detection dataset. 1
6 0.50429535 154 nips-2011-Learning person-object interactions for action recognition in still images
7 0.50061917 170 nips-2011-Message-Passing for Approximate MAP Inference with Latent Variables
8 0.49435738 96 nips-2011-Fast and Balanced: Efficient Label Tree Learning for Large Scale Object Recognition
9 0.49078372 234 nips-2011-Reconstructing Patterns of Information Diffusion from Incomplete Observations
10 0.48598623 242 nips-2011-See the Tree Through the Lines: The Shazoo Algorithm
11 0.47633684 180 nips-2011-Multiple Instance Filtering
12 0.45741117 138 nips-2011-Joint 3D Estimation of Objects and Scene Layout
13 0.45622835 91 nips-2011-Exploiting spatial overlap to efficiently compute appearance distances between image windows
14 0.45237029 1 nips-2011-$\theta$-MRF: Capturing Spatial and Semantic Structure in the Parameters for Scene Understanding
15 0.44378674 12 nips-2011-A Two-Stage Weighting Framework for Multi-Source Domain Adaptation
16 0.44227442 127 nips-2011-Image Parsing with Stochastic Scene Grammar
17 0.43136531 284 nips-2011-The Impact of Unlabeled Patterns in Rademacher Complexity Theory for Kernel Classifiers
18 0.42902845 231 nips-2011-Randomized Algorithms for Comparison-based Search
19 0.42844513 274 nips-2011-Structure Learning for Optimization
20 0.41483849 157 nips-2011-Learning to Search Efficiently in High Dimensions
topicId topicWeight
[(0, 0.045), (4, 0.09), (10, 0.018), (20, 0.034), (26, 0.034), (31, 0.05), (33, 0.036), (43, 0.073), (45, 0.098), (48, 0.011), (57, 0.073), (65, 0.012), (74, 0.033), (75, 0.224), (83, 0.032), (99, 0.059)]
simIndex simValue paperId paperTitle
same-paper 1 0.81138808 233 nips-2011-Rapid Deformable Object Detection using Dual-Tree Branch-and-Bound
Author: Iasonas Kokkinos
Abstract: In this work we use Branch-and-Bound (BB) to efficiently detect objects with deformable part models. Instead of evaluating the classifier score exhaustively over image locations and scales, we use BB to focus on promising image locations. The core problem is to compute bounds that accommodate part deformations; for this we adapt the Dual Trees data structure [7] to our problem. We evaluate our approach using Mixture-of-Deformable Part Models [4]. We obtain exactly the same results but are 10-20 times faster on average. We also develop a multiple-object detection variation of the system, where hypotheses for 20 categories are inserted in a common priority queue. For the problem of finding the strongest category in an image this results in a 100-fold speedup.
2 0.62509245 127 nips-2011-Image Parsing with Stochastic Scene Grammar
Author: Yibiao Zhao, Song-chun Zhu
Abstract: This paper proposes a parsing algorithm for scene understanding which includes four aspects: computing 3D scene layout, detecting 3D objects (e.g. furniture), detecting 2D faces (windows, doors etc.), and segmenting background. In contrast to previous scene labeling work that applied discriminative classifiers to pixels (or super-pixels), we use a generative Stochastic Scene Grammar (SSG). This grammar represents the compositional structures of visual entities from scene categories, 3D foreground/background, 2D faces, to 1D lines. The grammar includes three types of production rules and two types of contextual relations. Production rules: (i) AND rules represent the decomposition of an entity into sub-parts; (ii) OR rules represent the switching among sub-types of an entity; (iii) SET rules represent an ensemble of visual entities. Contextual relations: (i) Cooperative “+” relations represent positive links between binding entities, such as hinged faces of a object or aligned boxes; (ii) Competitive “-” relations represents negative links between competing entities, such as mutually exclusive boxes. We design an efficient MCMC inference algorithm, namely Hierarchical cluster sampling, to search in the large solution space of scene configurations. The algorithm has two stages: (i) Clustering: It forms all possible higher-level structures (clusters) from lower-level entities by production rules and contextual relations. (ii) Sampling: It jumps between alternative structures (clusters) in each layer of the hierarchy to find the most probable configuration (represented by a parse tree). In our experiment, we demonstrate the superiority of our algorithm over existing methods on public dataset. In addition, our approach achieves richer structures in the parse tree. 1
3 0.62161928 242 nips-2011-See the Tree Through the Lines: The Shazoo Algorithm
Author: Fabio Vitale, Nicolò Cesa-bianchi, Claudio Gentile, Giovanni Zappella
Abstract: Predicting the nodes of a given graph is a fascinating theoretical problem with applications in several domains. Since graph sparsification via spanning trees retains enough information while making the task much easier, trees are an important special case of this problem. Although it is known how to predict the nodes of an unweighted tree in a nearly optimal way, in the weighted case a fully satisfactory algorithm is not available yet. We fill this hole and introduce an efficient node predictor, S HAZOO, which is nearly optimal on any weighted tree. Moreover, we show that S HAZOO can be viewed as a common nontrivial generalization of both previous approaches for unweighted trees and weighted lines. Experiments on real-world datasets confirm that S HAZOO performs well in that it fully exploits the structure of the input tree, and gets very close to (and sometimes better than) less scalable energy minimization methods. 1
4 0.61706579 171 nips-2011-Metric Learning with Multiple Kernels
Author: Jun Wang, Huyen T. Do, Adam Woznica, Alexandros Kalousis
Abstract: Metric learning has become a very active research field. The most popular representative–Mahalanobis metric learning–can be seen as learning a linear transformation and then computing the Euclidean metric in the transformed space. Since a linear transformation might not always be appropriate for a given learning problem, kernelized versions of various metric learning algorithms exist. However, the problem then becomes finding the appropriate kernel function. Multiple kernel learning addresses this limitation by learning a linear combination of a number of predefined kernels; this approach can be also readily used in the context of multiple-source learning to fuse different data sources. Surprisingly, and despite the extensive work on multiple kernel learning for SVMs, there has been no work in the area of metric learning with multiple kernel learning. In this paper we fill this gap and present a general approach for metric learning with multiple kernel learning. Our approach can be instantiated with different metric learning algorithms provided that they satisfy some constraints. Experimental evidence suggests that our approach outperforms metric learning with an unweighted kernel combination and metric learning with cross-validation based kernel selection. 1
5 0.61553985 139 nips-2011-Kernel Bayes' Rule
Author: Kenji Fukumizu, Le Song, Arthur Gretton
Abstract: A nonparametric kernel-based method for realizing Bayes’ rule is proposed, based on kernel representations of probabilities in reproducing kernel Hilbert spaces. The prior and conditional probabilities are expressed as empirical kernel mean and covariance operators, respectively, and the kernel mean of the posterior distribution is computed in the form of a weighted sample. The kernel Bayes’ rule can be applied to a wide variety of Bayesian inference problems: we demonstrate Bayesian computation without likelihood, and filtering with a nonparametric statespace model. A consistency rate for the posterior estimate is established. 1
6 0.6145795 64 nips-2011-Convergent Bounds on the Euclidean Distance
7 0.61221206 204 nips-2011-Online Learning: Stochastic, Constrained, and Smoothed Adversaries
8 0.61100161 213 nips-2011-Phase transition in the family of p-resistances
9 0.61079299 29 nips-2011-Algorithms and hardness results for parallel large margin learning
10 0.60551405 154 nips-2011-Learning person-object interactions for action recognition in still images
11 0.60464507 231 nips-2011-Randomized Algorithms for Comparison-based Search
12 0.60441339 166 nips-2011-Maximal Cliques that Satisfy Hard Constraints with Application to Deformable Object Model Learning
13 0.60243827 186 nips-2011-Noise Thresholds for Spectral Clustering
14 0.6012516 150 nips-2011-Learning a Distance Metric from a Network
15 0.60079736 263 nips-2011-Sparse Manifold Clustering and Embedding
16 0.59960926 236 nips-2011-Regularized Laplacian Estimation and Fast Eigenvector Approximation
17 0.59954154 159 nips-2011-Learning with the weighted trace-norm under arbitrary sampling distributions
18 0.59946442 22 nips-2011-Active Ranking using Pairwise Comparisons
19 0.59836948 227 nips-2011-Pylon Model for Semantic Segmentation
20 0.59819245 226 nips-2011-Projection onto A Nonnegative Max-Heap