nips nips2012 nips2012-106 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Xiaolong Wang, Liang Lin
Abstract: This paper studies a novel discriminative part-based model to represent and recognize object shapes with an “And-Or graph”. We define this model as consisting of three layers: the leaf-nodes with collaborative edges for localizing local parts, the or-nodes specifying the switch of leaf-nodes, and the root-node encoding the global verification. A discriminative learning algorithm, extended from the CCCP [23], is proposed to train the model in a dynamical manner: the model structure (e.g., the configuration of the leaf-nodes associated with the or-nodes) is automatically determined while optimizing the multi-layer parameters during the iterations. The advantages of our method are two-fold. (i) The And-Or graph model enables us to handle large intra-class variance and background clutter well for object shape detection in images. (ii) The proposed learning algorithm is able to obtain the And-Or graph representation without requiring elaborate supervision and initialization. We validate the proposed method on several challenging databases (e.g., INRIA-Horse, ETHZ-Shape, and UIUC-People), and it outperforms the state-of-the-art approaches. 1
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract This paper studies a novel discriminative part-based model to represent and recognize object shapes with an “And-Or graph”. [sent-7, score-0.266]
2 We define this model consisting of three layers: the leaf-nodes with collaborative edges for localizing local parts, the or-nodes specifying the switch of leaf-nodes, and the root-node encoding the global verification. [sent-8, score-0.161]
3 (i) The And-Or graph model enables us to handle large intra-class variance and background clutter well for object shape detection in images. [sent-13, score-0.65]
4 1 Introduction Part-based and hierarchical representations have been widely studied in computer vision, leading to some elegant frameworks for complex object detection and recognition. [sent-18, score-0.366]
5 structural switch) in the hierarchy, which is the key to handling the large intra-class variance in object detection. [sent-21, score-0.277]
6 And-Or graph models have recently been explored in [26, 27] to hierarchically model object categories via “and-nodes” and “or-nodes” that represent, respectively, compositions of parts and structural variation of parts. [sent-23, score-0.436]
7 The leaf-nodes in the bottom layer represent a batch of local classifiers of contour fragments. [sent-29, score-0.261]
8 the problem that the true contours of objects are often connected to background clutter due to unreliable edge extraction. [sent-36, score-0.399]
9 Each or-node is used to select one contour from the candidates detected via the associated leaf-nodes in the bottom layer. [sent-39, score-0.406]
10 The contours selected via the or-nodes are further verified as a whole, in order to make the detection robust against background clutter. [sent-44, score-0.466]
11 Concretely, our model allows nearby contours to interact with each other. [sent-46, score-0.274]
12 The model structures (e.g., the layout of or-nodes and the activation of leaf-nodes) are implicitly inferred with the latent variables. [sent-52, score-0.161]
13 2 Related Work Remarkable progress has been made in shape-based object detection [6, 10, 9, 11, 19]. [sent-53, score-0.366]
14 By employing some shape descriptors and matching schemes, many works represent and recognize object shapes as a loose collection of local contours. [sent-54, score-0.331]
15 [6] used a codebook of PAS (pairwise adjacent segments) to localize objects of interest; Maji et al. [sent-56, score-0.174]
16 Recently, tree-structured latent models [25, 5] have provided significant improvements in object detection. [sent-58, score-0.223]
17 The And-Or graph representation, using configurable graph structures with And/Or nodes, has been applied in object and scene parsing [26, 18, 24] and action classification [20]. [sent-67, score-0.346]
18 As Fig. 3(a) illustrates, the square on the top is the root-node representing the complete object instances. [sent-70, score-0.174]
19 The dashed circles derived from the root are z or-nodes arranged in a layout of b1 × b2 blocks, representing the object parts. [sent-71, score-0.246]
20 The collaborative edges are defined between the leaf-nodes that are associated with different or-nodes, in order to encode the compatibility of object parts. [sent-85, score-0.25]
21 Suppose a contour fragment c on the edge map X is captured by the block located at p_i = (p_x, p_y), serving as the input to the classifier. [sent-91, score-0.55]
22 The response of the classifier L_j at location p_i of the edge map X is defined as: R_{L_j}(X, p_i) = max_{c∈X} ω_j^l · ϕ^l(p_i, c),  (1)  where ω_j^l is a parameter vector, which is set to zero if the corresponding leaf-node L_j is nonexistent. [sent-94, score-0.479]
23 Then we can detect the contour from the edge map X via the classifier, c_j = argmax_{c∈X} ω_j^l · ϕ^l(p_i, c). [sent-95, score-0.381]
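To make the leaf-node scoring concrete, the following is a minimal sketch (not the authors' code) of Eq. (1): each candidate contour c in the block is assumed to be pre-encoded as a feature vector ϕ^l(p_i, c), and the leaf-node simply takes the best linear score. All names and the feature dimension are hypothetical.

```python
import numpy as np

def leaf_response(omega_l, candidate_feats):
    """Eq. (1): response of leaf-node classifier L_j at one block location.

    omega_l         : (d,) parameter vector of the leaf-node (all zeros if nonexistent)
    candidate_feats : (n, d) features phi^l(p_i, c) of the candidate contours c in X
    Returns the maximal linear score and the index of the selected contour c_j.
    """
    scores = candidate_feats @ omega_l      # omega_l . phi^l(p_i, c) for every candidate
    best = int(np.argmax(scores))           # c_j = argmax_c omega_l . phi^l(p_i, c)
    return float(scores[best]), best

# toy usage with random features (purely illustrative)
rng = np.random.default_rng(0)
omega = rng.normal(size=240)                # hypothetical 240-d contour descriptor
cands = rng.normal(size=(15, 240))          # 15 candidate contours inside the block
print(leaf_response(omega, cands))
```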
24 Each or-node U_i, i = 1, ..., z, is proposed to specify a proper contour from a set of candidates detected via its children leaf-nodes. [sent-99, score-0.406]
25 For each or-node U_i, we define the deformation feature as ϕ^s(p_0, p_i) = (dx, dy, dx^2, dy^2), where (dx, dy) is the displacement of the or-node position p_i from the expected position p_0 determined by the root-node. [sent-102, score-0.475]
26 Then the cost of locating U_i at p_i is: Cost_i(p_0, p_i) = −ω_i^s · ϕ^s(p_0, p_i),  (2)  where ω_i^s is a 4-dimensional parameter vector corresponding to ϕ^s(p_0, p_i). [sent-103, score-0.704]
27 For each leaf-node Lj associated with Ui , we introduce an indicator variable vj ∈ {0, 1} representing whether it is activated or not. [sent-105, score-0.186]
28 Thus, the response of the or-node U_i is defined as, R_{U_i}(X, p_0, p_i, v_i) = ∑_{j∈ch(i)} R_{L_j}(X, p_i) · v_j + Cost_i(p_0, p_i).  (3) [sent-110, score-0.728]
29 Collaborative Edge: For any pair of leaf-nodes (L_j, L_j′) respectively associated with two different or-nodes, we define the collaborative edge between them according to their contextual co-occurrence. [sent-111, score-0.155]
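A small illustrative sketch of Eqs. (2)-(3), combining the activated leaf-node responses with the quadratic deformation cost; the variable names are ours, and the collaborative-edge term is omitted here.

```python
import numpy as np

def deformation_cost(omega_s, p0, pi):
    """Eq. (2): Cost_i(p0, pi) = -omega_i^s . (dx, dy, dx^2, dy^2)."""
    dx, dy = pi[0] - p0[0], pi[1] - p0[1]
    return -float(omega_s @ np.array([dx, dy, dx * dx, dy * dy]))

def or_node_response(leaf_scores, v, omega_s, p0, pi):
    """Eq. (3): sum of activated leaf-node responses plus the deformation cost.

    leaf_scores : (m,) responses R_{L_j}(X, p_i) of the child leaf-nodes
    v           : (m,) binary activation indicators v_j in {0, 1}
    """
    return float(leaf_scores @ v) + deformation_cost(omega_s, p0, pi)

# toy usage: one of three child leaf-nodes is activated
scores = np.array([1.2, -0.3, 0.8])
v = np.array([1, 0, 0])
print(or_node_response(scores, v, np.array([0.1, 0.1, 0.01, 0.01]), (50, 50), (53, 47)))
```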
30 That is, how likely it is that the object contains contours detected via the two leaf-nodes. [sent-112, score-0.549]
31 Root-node: The root-node represents a global classifier to verify the ensemble of contour fragments C^r = {c_1, ..., c_z}. [sent-121, score-0.392]
32 For better understanding, we refer to H = (P, V) as the latent variables during inference, where P implies the deformation of parts represented by the or-nodes and V implies the discrete distribution of leaf-nodes (i. [sent-130, score-0.149]
33 4 Inference The inference task is to localize the optimal contour fragments within the detection window, which is slid over all scales and positions of the edge map X. [sent-156, score-0.663]
34 Assuming the root-node is located at p_0, the object shape is localized by maximizing R_G(X, H) defined in (6): S(p_0, X) = max_H R_G(X, H). [sent-157, score-0.297]
35 The leaf-nodes (i.e., the local classifiers) are utilized to detect contour fragments within the edge map X. [sent-160, score-0.471]
36 Assume that leaf-node Lj , j ∈ ch(i) associated with Ui is activated, vj = 1, and the optimal contour fragment cj is localized by maximizing the response in Eq. [sent-161, score-0.477]
37 Then we generate a set of candidates for each or-node, {c_{i,j}, p*_{i,j}}, each of which is one contour fragment detected via the leaf-nodes. [sent-163, score-0.537]
38 These sets of candidates will be passed to the top-down step where the leaf-node activation vi for Ui can be further validated. [sent-164, score-0.143]
39 We calculate the response for the bottom-up step as, R_bot(V) = ∑_{i=1}^{z} R_{U_i}(X, p_0, p_i^*, v_i),  (11)  where V = {v_i} denotes a hypothesis of leaf-node activation for all or-nodes. [sent-165, score-0.147]
40 In practice, we can further prune the candidate contours by setting a threshold on Rbot (V ). [sent-166, score-0.274]
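The bottom-up score of Eq. (11) and the threshold-based pruning can be sketched as follows (the threshold value and the way hypotheses V are enumerated are left unspecified, as in the text):

```python
import numpy as np

def bottom_up_response(or_responses, threshold=None):
    """Eq. (11): R_bot(V) = sum_i R_{U_i}(X, p0, p_i*, v_i) for one activation hypothesis V.

    or_responses : (z,) responses of the z or-nodes under the hypothesis V
    threshold    : optional pruning threshold on R_bot(V); None disables pruning
    """
    r_bot = float(np.sum(or_responses))
    if threshold is not None and r_bot < threshold:
        return None                         # hypothesis pruned before top-down verification
    return r_bot

print(bottom_up_response(np.array([0.9, 1.4, -0.2, 0.7]), threshold=0.0))   # 2.8
```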
41 Thus, given V = {v_i}, we can select an ensemble of contours C^r = {c_1, ..., c_z}, [sent-167, score-0.274]
42 each of which is detected by an activated leaf-node L_j with v_j = 1. [sent-170, score-0.337]
43 Top-down verification: Given the ensemble of contours C r , we then apply the global classifier at the root-node to verify C r by Eq. [sent-171, score-0.274]
44 By incorporating the bottom-up and top-down steps, we obtain the response of the And-Or graph model by Eq. [sent-174, score-0.163]
45 The final detection is acquired by selecting the maximum score in Eq. [sent-176, score-0.192]
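Putting the pieces together, inference slides the root-node over positions and scales, runs the bottom-up step to propose a contour ensemble C^r, adds the top-down (root) verification score, and keeps the maximum. The sketch below is schematic: `bottom_up` and `top_down_verify` are placeholders for the scoring functions above and for the root classifier.

```python
import numpy as np

def detect(positions, scales, bottom_up, top_down_verify):
    """Schematic sliding-window inference: S(p0, X) = max_H R_G(X, H).

    bottom_up(p0, scale)      -> (r_bot, ensemble) or None if the hypothesis is pruned
    top_down_verify(ensemble) -> verification score of the contour ensemble C^r at the root
    """
    best_score, best_det = -np.inf, None
    for scale in scales:
        for p0 in positions:
            proposal = bottom_up(p0, scale)
            if proposal is None:            # pruned by the threshold on R_bot(V)
                continue
            r_bot, ensemble = proposal
            score = r_bot + top_down_verify(ensemble)
            if score > best_score:
                best_score, best_det = score, (p0, scale, ensemble)
    return best_score, best_det

# toy usage with dummy scoring functions
positions = [(x, y) for x in range(0, 100, 20) for y in range(0, 100, 20)]
dummy_bu = lambda p0, s: (-(abs(p0[0] - 40) + abs(p0[1] - 60)) / 10.0, [p0])
dummy_td = lambda ens: 0.5
print(detect(positions, scales=[1.0, 0.7], bottom_up=dummy_bu, top_down_verify=dummy_td))
```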
46 This algorithm iterates to determine the And-Or graph structure in a dynamical manner: given the inferred latent variables H = (P, V ) in each step, the leaf-nodes can be automatically created or removed to generate a new structural configuration. [sent-179, score-0.415]
47 To be specific, a new leaf-node is encouraged to be created as the local detector for contours that cannot be handled by the current model (Fig. [sent-180, score-0.359]
48 The optimization of the function in (13) can be solved by using a structural SVM with latent variables, min_ω (1/2)∥ω∥^2 + D ∑_{k=1}^{N} [max_{y,H}(ω · ϕ(X_k, y, H) + L(y_k, y, H)) − max_H(ω · ϕ(X_k, y_k, H))],  (14)  where D is a penalty parameter (set as 0. [sent-192, score-0.4]
49 We define L(y_k, y, H) = 0 if y_k = y, and 1 if y_k ≠ y, in our method. [sent-194, score-0.496]
50 We rewrite Eq. (14) into a convex and concave form as, min_ω [(1/2)∥ω∥^2 + D ∑_{k=1}^{N} max_{y,H}(ω · ϕ(X_k, y, H) + L(y_k, y, H))] − [D ∑_{k=1}^{N} max_H(ω · ϕ(X_k, y_k, H))]  (15)  = min_ω [f(ω) − g(ω)],  (16)  where f(ω) represents the first two terms, and g(ω) represents the last term in (15). [sent-203, score-0.248]
51 The original CCCP includes two iterative steps: (I) fixing the model parameters, estimate the latent variables H^* for each positive sample; (II) compute the model parameters by the traditional structural SVM method. [sent-204, score-0.152]
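The alternation can be summarized with the following skeleton; `infer_latent` and `train_structural_svm` stand in for the inference procedure of Section 4 and for a structural-SVM solver, neither of which is spelled out here.

```python
def cccp(train_pos, train_neg, init_params, infer_latent, train_structural_svm, n_iters=10):
    """Skeleton of the two-step CCCP iteration (extended dynamically in the paper).

    infer_latent(params, x)   -> H* = argmax_H params . phi(x, +1, H) for a positive sample x
    train_structural_svm(...) -> updated parameters given the fixed latent variables
    """
    params = init_params
    for _ in range(n_iters):
        # (I) fix the parameters and estimate the latent variables H* of the positive samples
        latents = [infer_latent(params, x) for x in train_pos]
        # (II) fix the latent variables and re-estimate the parameters (structural SVM step)
        params = train_structural_svm(train_pos, train_neg, latents, params)
    return params
```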
52 (I) For optimization, we first find a hyperplane qt to upper bound the concave part −g(ω) in Eq. [sent-210, score-0.187]
53 We construct q_t by calculating the optimal latent variables H_k^* = argmax_H(ω_t · ϕ(X_k, y_k, H)). [sent-213, score-0.417]
54 Since ϕ(Xk , yk , H) = 0 when yk = −1, we only take the positive training samples into account during computation. [sent-214, score-0.496]
55 Then the hyperplane is constructed as q_t = −D ∑_{k=1}^{N} ϕ(X_k, y_k, H_k^*). [sent-215, score-0.435]
56 Accordingly, the hyperplane qt would change with ϕ(X, y, H ∗ ), and would lead to non-convergence of learning. [sent-219, score-0.187]
57 Given ϕ(X_k, y_k, H_k^*) of all positive samples, we apply PCA on them, ϕ(X_k, y_k, H_k^*) ≈ u + ∑_{i=1}^{K} β_{k,i} e_i,  (18)  where K is the number of eigenvectors and e_i is the eigenvector with its coefficient β_{k,i}. [sent-225, score-0.578]
58 We set K to a large number so that ∥ϕ(X_k, y_k, H_k^*) − (u + ∑_{i=1}^{K} β_{k,i} e_i)∥^2 < σ, ∀k. [sent-226, score-0.289]
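A numpy sketch of the PCA step in Eq. (18), choosing K just large enough that every positive-sample feature vector is reconstructed within σ; the data here are random and only illustrate the mechanics.

```python
import numpy as np

def pca_approximation(Phi, sigma):
    """Eq. (18): phi_k ~= u + sum_{i<=K} beta_{k,i} e_i, with K chosen so that
    every reconstruction error ||phi_k - (u + sum_i beta_{k,i} e_i)||^2 < sigma."""
    u = Phi.mean(axis=0)
    Xc = Phi - u
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)   # rows of Vt are the eigenvectors e_i
    for K in range(1, Vt.shape[0] + 1):
        E = Vt[:K]                                      # (K, d) principal directions
        beta = Xc @ E.T                                 # coefficients beta_{k,i}
        errs = np.sum((Xc - beta @ E) ** 2, axis=1)     # per-sample reconstruction error
        if np.all(errs < sigma):
            return u, E, beta, K
    return u, Vt, Xc @ Vt.T, Vt.shape[0]

rng = np.random.default_rng(1)
Phi = rng.normal(size=(40, 60))     # hypothetical feature vectors of 40 positive samples
print(pca_approximation(Phi, sigma=5.0)[-1])            # number of eigenvectors kept
```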
59 For the jth bin of the feature vector. Figure 2: A toy example for structural clustering. [sent-227, score-0.206]
60 (a) shows the feature vectors ϕ of the samples associated with Ui , and the intensity of the feature bin indicates the feature value. [sent-232, score-0.271]
61 The red and green bounding boxes on the vectors indicate the non-principal features representing the detected contour fragments via two different leaf-nodes. [sent-233, score-0.527]
62 For each or-node Ui , a set of detected contour fragments, {c1 , c2 , . [sent-242, score-0.362]
63 The feature vectors for these contours that are generated by the leaf-nodes, {ϕl (p1 , c1 ), . [sent-246, score-0.375]
64 More specifically, once we select the jth bin for the l all feature vectors ϕ , it can be either principal or not in different vectors ϕ. [sent-253, score-0.171]
65 We thus refactor the feature vectors of these contours as {ϕ′ (p1 , c1 ), . [sent-255, score-0.375]
66 To trigger the structural reconfiguration, for each or-node U_i, we perform clustering on the detected contour fragments represented by the newly formed feature vectors. [sent-260, score-0.709]
67 We first group the contours detected by the same leaf-node into the same cluster as a temporary partition. [sent-261, score-0.375]
68 Close contours are then grouped into the same cluster. [sent-263, score-0.315]
69 represent the similar contour with the same bins in the complete feature vector ϕ. [sent-266, score-0.328]
70 Please recall that the vector of one contour is part of ϕ. [sent-267, score-0.261]
71 Their parameters can be learned based on the feature vectors of contours within the clusters. [sent-274, score-0.375]
72 • One leaf-node is removed when the feature bins related to it are zero, which implies that the contours detected by the leaf-node have been grouped into another cluster (see the sketch below). [sent-275, score-0.528]
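A simplified sketch of the clustering-driven reconfiguration for a single or-node, assuming each detected contour is already represented by its refactored feature vector ϕ′: start from one temporary cluster per existing leaf-node and merge clusters whose mean vectors are close; a leaf-node whose cluster is absorbed is removed. The merge rule, distance, and threshold are our own simplifications, not the paper's exact procedure (which also covers leaf-node creation).

```python
import numpy as np

def reconfigure_or_node(feats, leaf_assignment, merge_dist):
    """Simplified structural reconfiguration for one or-node.

    feats           : (n, d) refactored feature vectors phi' of the detected contours
    leaf_assignment : (n,) id of the leaf-node that currently detects each contour
    merge_dist      : distance below which two temporary clusters are merged
    Returns a mapping old_leaf_id -> surviving cluster id and the set of removed leaf ids.
    """
    leaf_ids = sorted(set(leaf_assignment.tolist()))
    # temporary partition: one cluster per existing leaf-node, represented by its mean vector
    means = {j: feats[leaf_assignment == j].mean(axis=0) for j in leaf_ids}
    cluster_of = {j: j for j in leaf_ids}
    for a in leaf_ids:                       # greedily merge close clusters
        for b in leaf_ids:
            if b > a and np.linalg.norm(means[a] - means[b]) < merge_dist:
                cluster_of[b] = cluster_of[a]
    removed = {j for j in leaf_ids if cluster_of[j] != j}
    return cluster_of, removed

# toy usage: leaf-nodes 0 and 1 detect nearly identical contours, so one of them is removed
rng = np.random.default_rng(2)
f = np.vstack([rng.normal(0, 0.1, (5, 8)), rng.normal(0, 0.1, (5, 8)), rng.normal(3, 0.1, (5, 8))])
print(reconfigure_or_node(f, np.repeat([0, 1, 2], 5), merge_dist=1.0))
```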
73 After the structural reconfiguration, all the feature vectors ϕ(X_k, y_k, H_k^*) are adjusted to ϕ_d(X_k, y_k, H_k^*). [sent-279, score-0.7]
74 Then the new hyperplane is generated as q_t^d = −D ∑_{k=1}^{N} ϕ_d(X_k, y_k, H_k^*). [sent-280, score-0.435]
75 (III) Given the newly generated model structures represented by the feature vectors ϕ_d(X_k, y_k, H_k^*), we can learn the model parameters by solving ω_{t+1} = argmin_ω [f(ω) + ω · q_t^d]. [sent-281, score-0.469]
76 By substituting −g(ω) with the upper-bound hyperplane q_t^d, the optimization task in Eq. [sent-282, score-0.187]
77 (15) can be rewritten as, min_ω (1/2)∥ω∥^2 + D ∑_{k=1}^{N} [max_{y,H}(ω · ϕ(X_k, y, H) + L(y_k, y, H)) − ω · ϕ_d(X_k, y_k, H_k^*)]. [sent-283, score-0.248]
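Once the concave part has been replaced by this linear upper bound, the remaining objective is a standard (latent-free) structural-SVM problem. The paper solves it in the dual (Eq. (20) below); as a generic alternative, a single primal subgradient step would look like the following sketch, where the loss-augmented features would be recomputed by inference after every update.

```python
import numpy as np

def subgradient_step(omega, phi_aug, phi_star, D, lr):
    """One subgradient step on the convexified objective
    0.5*||w||^2 + D * sum_k [ (w . phi(X_k, y_hat, H_hat) + L) - w . phi_d(X_k, y_k, H_k*) ].

    phi_aug  : (N, d) features of the current loss-augmented argmax for every sample
    phi_star : (N, d) features phi_d(X_k, y_k, H_k*) of the ground-truth configurations
    """
    grad = omega + D * np.sum(phi_aug - phi_star, axis=0)
    return omega - lr * grad
```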
78 , U_8; a practical detection with the activated leaf-nodes highlighted in red. [sent-296, score-0.285]
79 ω^* = D ∑_{k,y,H} α^*_{k,y,H} ∆ϕ(X_k, y, H),  (20)  where ∆ϕ(X_k, y, H) = ϕ_d(X_k, y_k, H_k^*) − ϕ(X_k, y, H). [sent-298, score-0.248]
80 For each training sample (whose contours have been extracted), we partition it into a regular layout of several blocks, each of which corresponds to one or-node. [sent-305, score-0.346]
81 The contours falling into a block are treated as the input for learning. [sent-306, score-0.32]
82 If there are more than two contours in one block, we select the one with the largest length. [sent-307, score-0.274]
83 Then the leaf-nodes are generated by clustering the selected contours without any constraints, and we can thus obtain the initial feature vector ϕd for each sample. [sent-308, score-0.341]
84 6 Experiments We evaluate our method for object shape detection using three benchmark datasets: the UIUC-People [17], the ETHZ-Shape [7], and the INRIA-Horse [7]. [sent-309, score-0.297]
85 For positive samples, we extract their clutter-free object contours; for negative samples, we compute their edge maps by using the Pb edge detector [12] with an edge link method. [sent-316, score-0.445]
86 During detection, the edge maps of test images are extracted as for negative training samples, within which the object is searched at 6 different scales, 2 per octave. [sent-318, score-0.307]
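With 2 scales per octave, the 6 search scales are successive powers of 2^(-1/2) (whether the image or the detection window is rescaled is not specified here):

```python
# 6 detection scales, 2 per octave: each scale is 1/sqrt(2) times the previous one
scales = [2 ** (-k / 2.0) for k in range(6)]
print([round(s, 3) for s in scales])   # [1.0, 0.707, 0.5, 0.354, 0.25, 0.177]
```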
87 For each contour as the input to the leaf-node, we sample 20 points and compute the Shape Context descriptor for each point; the descriptor is quantized with 6 polar angles and 2 radial bins. [sent-319, score-0.329]
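A minimal Shape-Context-style sketch with the stated quantization (20 sampled points, 6 polar angle bins, 2 radial bins), giving a 20 x 6 x 2 = 240-dimensional contour feature; details such as the radial bin boundaries are our own choices rather than the paper's.

```python
import numpy as np

def shape_context(points, n_angle=6, n_radius=2, n_sample=20):
    """Concatenated log-polar-style histograms of relative point positions along a contour."""
    pts = np.asarray(points, dtype=float)
    pts = pts[np.linspace(0, len(pts) - 1, n_sample).astype(int)]   # sample 20 points
    descs = []
    for k, p in enumerate(pts):
        rel = np.delete(pts, k, axis=0) - p                         # offsets to the other points
        r = np.linalg.norm(rel, axis=1)
        theta = np.arctan2(rel[:, 1], rel[:, 0])
        a_bin = ((theta + np.pi) / (2 * np.pi) * n_angle).astype(int) % n_angle
        r_edges = np.array([0.0, np.median(r), r.max() + 1e-9])     # 2 radial bins (assumed split)
        r_bin = np.clip(np.searchsorted(r_edges, r, side="right") - 1, 0, n_radius - 1)
        hist = np.zeros((n_angle, n_radius))
        np.add.at(hist, (a_bin, r_bin), 1)
        descs.append(hist.ravel() / hist.sum())
    return np.concatenate(descs)

contour = [(30 * np.cos(t), 30 * np.sin(t)) for t in np.linspace(0, np.pi, 50)]
print(shape_context(contour).shape)     # (240,)
```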
88 We adopt the testing criterion defined in the PASCAL VOC challenge: a detection is counted as correct if the intersection over union with the ground truth is at least 50%. [sent-320, score-0.192]
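The PASCAL VOC criterion in plain code for axis-aligned boxes (x1, y1, x2, y2); this is a standard computation, not code from the paper.

```python
def is_correct_detection(det, gt, thresh=0.5):
    """Count a detection as correct if IoU with the ground-truth box is at least 50%."""
    ix1, iy1 = max(det[0], gt[0]), max(det[1], gt[1])
    ix2, iy2 = min(det[2], gt[2]), min(det[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((det[2] - det[0]) * (det[3] - det[1])
             + (gt[2] - gt[0]) * (gt[3] - gt[1]) - inter)
    return inter / union >= thresh

print(is_correct_detection((10, 10, 60, 60), (20, 20, 70, 70)))   # IoU ~ 0.47 -> False
```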
89 To evaluate the benefit from the collaborative edges, we degenerate our model to the And-Or Tree (AOT) by removing the collaborative edges. [sent-326, score-0.152]
90 As Fig. 3(c) illustrates, the average precisions (AP) of detection by applying AOG and AOT are 56. [sent-328, score-0.192]
91 Table 1: (a) Comparisons of detection accuracies on the UIUC-People dataset. [sent-383, score-0.192]
92 Using the metric mentioned in [18] to calculate the detection accuracy, we only consider the detection with the highest score on an image for all the methods. [sent-385, score-0.384]
93 (b), (c) and (d) show a few object shape detections by applying our method on the three datasets, and the false positives are annotated by blue frames. [sent-402, score-0.297]
94 It is shown that our system substantially outperforms the recent methods: the AOG and AOT models achieve detection rates of 89. [sent-409, score-0.192]
95 We test our method with more object categories on the ETHZ-Shape dataset: Applelogos, Bottles, Giraffes, Mugs and Swans. [sent-418, score-0.174]
96 7 Conclusion This paper proposes a discriminative contour-based object model with the And-Or graph representation. [sent-425, score-0.347]
97 Our method achieves state-of-the-art performance in object shape detection on challenging datasets. [sent-427, score-0.489]
98 Schiele, Pictorial structures revisited: People detection and articulated pose estimation, In CVPR, 2009. [sent-435, score-0.192]
99 Schiele, Discriminative structure learning of hierarchical representations for object detection, In CVPR, 2009. [sent-493, score-0.174]
100 Joachims, Learning structural svms with latent variables, In ICML, 2009. [sent-527, score-0.152]
wordName wordTfidf (topN-words)
[('recon', 0.3), ('contours', 0.274), ('contour', 0.261), ('yk', 0.248), ('cccp', 0.206), ('detection', 0.192), ('pi', 0.176), ('object', 0.174), ('xk', 0.167), ('aog', 0.161), ('aot', 0.161), ('lj', 0.159), ('hk', 0.138), ('fragments', 0.131), ('shape', 0.123), ('qt', 0.12), ('graph', 0.115), ('ui', 0.111), ('structural', 0.103), ('detected', 0.101), ('guration', 0.097), ('vz', 0.094), ('vj', 0.093), ('activated', 0.093), ('edge', 0.079), ('collaborative', 0.076), ('latecki', 0.075), ('layout', 0.072), ('fppi', 0.069), ('iksvm', 0.069), ('neigh', 0.069), ('feature', 0.067), ('malik', 0.067), ('hyperplane', 0.067), ('maji', 0.062), ('rg', 0.062), ('cvpr', 0.06), ('vi', 0.059), ('discriminative', 0.058), ('ch', 0.058), ('parsing', 0.057), ('deformation', 0.056), ('zhu', 0.054), ('images', 0.054), ('pz', 0.053), ('rui', 0.053), ('dynamical', 0.052), ('created', 0.051), ('yuille', 0.05), ('cz', 0.05), ('veri', 0.05), ('switch', 0.05), ('latent', 0.049), ('vn', 0.049), ('response', 0.048), ('srinivasan', 0.046), ('applelogos', 0.046), ('clutters', 0.046), ('costi', 0.046), ('dcccp', 0.046), ('fallen', 0.046), ('felz', 0.046), ('giraffes', 0.046), ('mugs', 0.046), ('ornode', 0.046), ('rbot', 0.046), ('rlj', 0.046), ('uiucpeople', 0.046), ('removed', 0.045), ('ferrari', 0.045), ('tpami', 0.045), ('guided', 0.045), ('ap', 0.045), ('parts', 0.044), ('candidates', 0.044), ('ei', 0.041), ('grouped', 0.041), ('cj', 0.041), ('guangzhou', 0.041), ('andriluka', 0.041), ('bottles', 0.041), ('mumford', 0.041), ('schnitzspan', 0.041), ('tran', 0.041), ('activation', 0.04), ('eccv', 0.039), ('dynamically', 0.037), ('particle', 0.037), ('er', 0.036), ('bin', 0.036), ('schiele', 0.035), ('bourdev', 0.035), ('edges', 0.035), ('voting', 0.035), ('china', 0.035), ('vectors', 0.034), ('detector', 0.034), ('recognize', 0.034), ('descriptor', 0.034), ('fragment', 0.034)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999923 106 nips-2012-Dynamical And-Or Graph Learning for Object Shape Modeling and Detection
Author: Xiaolong Wang, Liang Lin
Abstract: This paper studies a novel discriminative part-based model to represent and recognize object shapes with an “And-Or graph”. We define this model consisting of three layers: the leaf-nodes with collaborative edges for localizing local parts, the or-nodes specifying the switch of leaf-nodes, and the root-node encoding the global verification. A discriminative learning algorithm, extended from the CCCP [23], is proposed to train the model in a dynamical manner: the model structure (e.g., the configuration of the leaf-nodes associated with the or-nodes) is automatically determined with optimizing the multi-layer parameters during the iteration. The advantages of our method are two-fold. (i) The And-Or graph model enables us to handle well large intra-class variance and background clutters for object shape detection from images. (ii) The proposed learning algorithm is able to obtain the And-Or graph representation without requiring elaborate supervision and initialization. We validate the proposed method on several challenging databases (e.g., INRIA-Horse, ETHZ-Shape, and UIUC-People), and it outperforms the state-of-the-arts approaches. 1
2 0.20202087 101 nips-2012-Discriminatively Trained Sparse Code Gradients for Contour Detection
Author: Ren Xiaofeng, Liefeng Bo
Abstract: Finding contours in natural images is a fundamental problem that serves as the basis of many tasks such as image segmentation and object recognition. At the core of contour detection technologies are a set of hand-designed gradient features, used by most approaches including the state-of-the-art Global Pb (gPb) operator. In this work, we show that contour detection accuracy can be significantly improved by computing Sparse Code Gradients (SCG), which measure contrast using patch representations automatically learned through sparse coding. We use K-SVD for dictionary learning and Orthogonal Matching Pursuit for computing sparse codes on oriented local neighborhoods, and apply multi-scale pooling and power transforms before classifying them with linear SVMs. By extracting rich representations from pixels and avoiding collapsing them prematurely, Sparse Code Gradients effectively learn how to measure local contrasts and find contours. We improve the F-measure metric on the BSDS500 benchmark to 0.74 (up from 0.71 of gPb contours). Moreover, our learning approach can easily adapt to novel sensor data such as Kinect-style RGB-D cameras: Sparse Code Gradients on depth maps and surface normals lead to promising contour detection using depth and depth+color, as verified on the NYU Depth Dataset. 1
3 0.17466502 344 nips-2012-Timely Object Recognition
Author: Sergey Karayev, Tobias Baumgartner, Mario Fritz, Trevor Darrell
Abstract: In a large visual multi-class detection framework, the timeliness of results can be crucial. Our method for timely multi-class detection aims to give the best possible performance at any single point after a start time; it is terminated at a deadline time. Toward this goal, we formulate a dynamic, closed-loop policy that infers the contents of the image in order to decide which detector to deploy next. In contrast to previous work, our method significantly diverges from the predominant greedy strategies, and is able to learn to take actions with deferred values. We evaluate our method with a novel timeliness measure, computed as the area under an Average Precision vs. Time curve. Experiments are conducted on the PASCAL VOC object detection dataset. If execution is stopped when only half the detectors have been run, our method obtains 66% better AP than a random ordering, and 14% better performance than an intelligent baseline. On the timeliness measure, our method obtains at least 11% better performance. Our method is easily extensible, as it treats detectors and classifiers as black boxes and learns from execution traces using reinforcement learning. 1
4 0.16262116 360 nips-2012-Visual Recognition using Embedded Feature Selection for Curvature Self-Similarity
Author: Angela Eigenstetter, Bjorn Ommer
Abstract: Category-level object detection has a crucial need for informative object representations. This demand has led to feature descriptors of ever increasing dimensionality like co-occurrence statistics and self-similarity. In this paper we propose a new object representation based on curvature self-similarity that goes beyond the currently popular approximation of objects using straight lines. However, like all descriptors using second order statistics, ours also exhibits a high dimensionality. Although improving discriminability, the high dimensionality becomes a critical issue due to lack of generalization ability and curse of dimensionality. Given only a limited amount of training data, even sophisticated learning algorithms such as the popular kernel methods are not able to suppress noisy or superfluous dimensions of such high-dimensional data. Consequently, there is a natural need for feature selection when using present-day informative features and, particularly, curvature self-similarity. We therefore suggest an embedded feature selection method for SVMs that reduces complexity and improves generalization capability of object models. By successfully integrating the proposed curvature self-similarity representation together with the embedded feature selection in a widely used state-of-the-art object detection framework we show the general pertinence of the approach. 1
5 0.15282668 27 nips-2012-A quasi-Newton proximal splitting method
Author: Stephen Becker, Jalal Fadili
Abstract: A new result in convex analysis on the calculation of proximity operators in certain scaled norms is derived. We describe efficient implementations of the proximity calculation for a useful class of functions; the implementations exploit the piece-wise linear nature of the dual problem. The second part of the paper applies the previous result to acceleration of convex minimization problems, and leads to an elegant quasi-Newton method. The optimization method compares favorably against state-of-the-art alternatives. The algorithm has extensive applications including signal processing, sparse recovery and machine learning and classification. 1
6 0.14533134 40 nips-2012-Analyzing 3D Objects in Cluttered Images
7 0.14384542 300 nips-2012-Scalable nonconvex inexact proximal splitting
9 0.13252604 201 nips-2012-Localizing 3D cuboids in single-view images
10 0.13158651 1 nips-2012-3D Object Detection and Viewpoint Estimation with a Deformable 3D Cuboid Model
11 0.12494597 282 nips-2012-Proximal Newton-type methods for convex optimization
12 0.11267956 303 nips-2012-Searching for objects driven by context
13 0.10900647 209 nips-2012-Max-Margin Structured Output Regression for Spatio-Temporal Action Localization
14 0.10629623 81 nips-2012-Context-Sensitive Decision Forests for Object Detection
15 0.10493867 8 nips-2012-A Generative Model for Parts-based Object Segmentation
16 0.10493435 357 nips-2012-Unsupervised Template Learning for Fine-Grained Object Recognition
17 0.10138432 311 nips-2012-Shifting Weights: Adapting Object Detectors from Image to Video
18 0.089822128 168 nips-2012-Kernel Latent SVM for Visual Recognition
19 0.088017076 13 nips-2012-A Nonparametric Conjugate Prior Distribution for the Maximizing Argument of a Noisy Function
20 0.083249398 197 nips-2012-Learning with Recursive Perceptual Representations
topicId topicWeight
[(0, 0.226), (1, 0.048), (2, -0.122), (3, -0.078), (4, 0.13), (5, -0.096), (6, 0.018), (7, -0.154), (8, 0.096), (9, 0.072), (10, -0.125), (11, 0.1), (12, 0.166), (13, -0.153), (14, 0.001), (15, 0.091), (16, -0.0), (17, -0.014), (18, -0.088), (19, -0.009), (20, -0.084), (21, -0.044), (22, 0.016), (23, -0.036), (24, -0.051), (25, 0.026), (26, -0.02), (27, -0.048), (28, 0.021), (29, -0.063), (30, -0.003), (31, 0.018), (32, -0.033), (33, 0.085), (34, 0.032), (35, -0.074), (36, -0.012), (37, 0.042), (38, -0.014), (39, -0.045), (40, 0.063), (41, -0.015), (42, -0.018), (43, 0.008), (44, 0.003), (45, -0.006), (46, -0.106), (47, -0.05), (48, -0.078), (49, 0.005)]
simIndex simValue paperId paperTitle
same-paper 1 0.9585585 106 nips-2012-Dynamical And-Or Graph Learning for Object Shape Modeling and Detection
Author: Xiaolong Wang, Liang Lin
Abstract: This paper studies a novel discriminative part-based model to represent and recognize object shapes with an “And-Or graph”. We define this model consisting of three layers: the leaf-nodes with collaborative edges for localizing local parts, the or-nodes specifying the switch of leaf-nodes, and the root-node encoding the global verification. A discriminative learning algorithm, extended from the CCCP [23], is proposed to train the model in a dynamical manner: the model structure (e.g., the configuration of the leaf-nodes associated with the or-nodes) is automatically determined with optimizing the multi-layer parameters during the iteration. The advantages of our method are two-fold. (i) The And-Or graph model enables us to handle well large intra-class variance and background clutters for object shape detection from images. (ii) The proposed learning algorithm is able to obtain the And-Or graph representation without requiring elaborate supervision and initialization. We validate the proposed method on several challenging databases (e.g., INRIA-Horse, ETHZ-Shape, and UIUC-People), and it outperforms the state-of-the-arts approaches. 1
2 0.74843019 201 nips-2012-Localizing 3D cuboids in single-view images
Author: Jianxiong Xiao, Bryan Russell, Antonio Torralba
Abstract: In this paper we seek to detect rectangular cuboids and localize their corners in uncalibrated single-view images depicting everyday scenes. In contrast to recent approaches that rely on detecting vanishing points of the scene and grouping line segments to form cuboids, we build a discriminative parts-based detector that models the appearance of the cuboid corners and internal edges while enforcing consistency to a 3D cuboid model. Our model copes with different 3D viewpoints and aspect ratios and is able to detect cuboids across many different object categories. We introduce a database of images with cuboid annotations that spans a variety of indoor and outdoor scenes and show qualitative and quantitative results on our collected database. Our model out-performs baseline detectors that use 2D constraints alone on the task of localizing cuboid corners. 1
3 0.72603059 1 nips-2012-3D Object Detection and Viewpoint Estimation with a Deformable 3D Cuboid Model
Author: Sanja Fidler, Sven Dickinson, Raquel Urtasun
Abstract: This paper addresses the problem of category-level 3D object detection. Given a monocular image, our aim is to localize the objects in 3D by enclosing them with tight oriented 3D bounding boxes. We propose a novel approach that extends the well-acclaimed deformable part-based model [1] to reason in 3D. Our model represents an object class as a deformable 3D cuboid composed of faces and parts, which are both allowed to deform with respect to their anchors on the 3D box. We model the appearance of each face in fronto-parallel coordinates, thus effectively factoring out the appearance variation induced by viewpoint. Our model reasons about face visibility patters called aspects. We train the cuboid model jointly and discriminatively and share weights across all aspects to attain efficiency. Inference then entails sliding and rotating the box in 3D and scoring object hypotheses. While for inference we discretize the search space, the variables are continuous in our model. We demonstrate the effectiveness of our approach in indoor and outdoor scenarios, and show that our approach significantly outperforms the stateof-the-art in both 2D [1] and 3D object detection [2]. 1
4 0.68804157 360 nips-2012-Visual Recognition using Embedded Feature Selection for Curvature Self-Similarity
Author: Angela Eigenstetter, Bjorn Ommer
Abstract: Category-level object detection has a crucial need for informative object representations. This demand has led to feature descriptors of ever increasing dimensionality like co-occurrence statistics and self-similarity. In this paper we propose a new object representation based on curvature self-similarity that goes beyond the currently popular approximation of objects using straight lines. However, like all descriptors using second order statistics, ours also exhibits a high dimensionality. Although improving discriminability, the high dimensionality becomes a critical issue due to lack of generalization ability and curse of dimensionality. Given only a limited amount of training data, even sophisticated learning algorithms such as the popular kernel methods are not able to suppress noisy or superfluous dimensions of such high-dimensional data. Consequently, there is a natural need for feature selection when using present-day informative features and, particularly, curvature self-similarity. We therefore suggest an embedded feature selection method for SVMs that reduces complexity and improves generalization capability of object models. By successfully integrating the proposed curvature self-similarity representation together with the embedded feature selection in a widely used state-of-the-art object detection framework we show the general pertinence of the approach. 1
5 0.65166718 101 nips-2012-Discriminatively Trained Sparse Code Gradients for Contour Detection
Author: Ren Xiaofeng, Liefeng Bo
Abstract: Finding contours in natural images is a fundamental problem that serves as the basis of many tasks such as image segmentation and object recognition. At the core of contour detection technologies are a set of hand-designed gradient features, used by most approaches including the state-of-the-art Global Pb (gPb) operator. In this work, we show that contour detection accuracy can be significantly improved by computing Sparse Code Gradients (SCG), which measure contrast using patch representations automatically learned through sparse coding. We use K-SVD for dictionary learning and Orthogonal Matching Pursuit for computing sparse codes on oriented local neighborhoods, and apply multi-scale pooling and power transforms before classifying them with linear SVMs. By extracting rich representations from pixels and avoiding collapsing them prematurely, Sparse Code Gradients effectively learn how to measure local contrasts and find contours. We improve the F-measure metric on the BSDS500 benchmark to 0.74 (up from 0.71 of gPb contours). Moreover, our learning approach can easily adapt to novel sensor data such as Kinect-style RGB-D cameras: Sparse Code Gradients on depth maps and surface normals lead to promising contour detection using depth and depth+color, as verified on the NYU Depth Dataset. 1
6 0.64983392 357 nips-2012-Unsupervised Template Learning for Fine-Grained Object Recognition
7 0.62156922 40 nips-2012-Analyzing 3D Objects in Cluttered Images
8 0.6143682 303 nips-2012-Searching for objects driven by context
9 0.59859586 344 nips-2012-Timely Object Recognition
10 0.59438998 209 nips-2012-Max-Margin Structured Output Regression for Spatio-Temporal Action Localization
11 0.54265821 8 nips-2012-A Generative Model for Parts-based Object Segmentation
12 0.53713715 168 nips-2012-Kernel Latent SVM for Visual Recognition
13 0.51643682 223 nips-2012-Multi-criteria Anomaly Detection using Pareto Depth Analysis
14 0.51194948 300 nips-2012-Scalable nonconvex inexact proximal splitting
15 0.49244818 311 nips-2012-Shifting Weights: Adapting Object Detectors from Image to Video
16 0.49087921 81 nips-2012-Context-Sensitive Decision Forests for Object Detection
17 0.48543665 282 nips-2012-Proximal Newton-type methods for convex optimization
18 0.4746587 27 nips-2012-A quasi-Newton proximal splitting method
19 0.46235791 87 nips-2012-Convolutional-Recursive Deep Learning for 3D Object Classification
20 0.45292589 83 nips-2012-Controlled Recognition Bounds for Visual Learning and Exploration
topicId topicWeight
[(0, 0.054), (21, 0.038), (38, 0.067), (39, 0.341), (42, 0.023), (54, 0.027), (55, 0.028), (74, 0.13), (76, 0.125), (80, 0.051), (92, 0.044)]
simIndex simValue paperId paperTitle
same-paper 1 0.82009411 106 nips-2012-Dynamical And-Or Graph Learning for Object Shape Modeling and Detection
Author: Xiaolong Wang, Liang Lin
Abstract: This paper studies a novel discriminative part-based model to represent and recognize object shapes with an “And-Or graph”. We define this model consisting of three layers: the leaf-nodes with collaborative edges for localizing local parts, the or-nodes specifying the switch of leaf-nodes, and the root-node encoding the global verification. A discriminative learning algorithm, extended from the CCCP [23], is proposed to train the model in a dynamical manner: the model structure (e.g., the configuration of the leaf-nodes associated with the or-nodes) is automatically determined with optimizing the multi-layer parameters during the iteration. The advantages of our method are two-fold. (i) The And-Or graph model enables us to handle well large intra-class variance and background clutters for object shape detection from images. (ii) The proposed learning algorithm is able to obtain the And-Or graph representation without requiring elaborate supervision and initialization. We validate the proposed method on several challenging databases (e.g., INRIA-Horse, ETHZ-Shape, and UIUC-People), and it outperforms the state-of-the-arts approaches. 1
2 0.76878071 351 nips-2012-Transelliptical Component Analysis
Author: Fang Han, Han Liu
Abstract: We propose a high dimensional semiparametric scale-invariant principle component analysis, named TCA, by utilize the natural connection between the elliptical distribution family and the principal component analysis. Elliptical distribution family includes many well-known multivariate distributions like multivariate Gaussian, t and logistic and it is extended to the meta-elliptical by Fang et.al (2002) using the copula techniques. In this paper we extend the meta-elliptical distribution family to a even larger family, called transelliptical. We prove that TCA can obtain a near-optimal s log d/n estimation consistency rate in recovering the leading eigenvector of the latent generalized correlation matrix under the transelliptical distribution family, even if the distributions are very heavy-tailed, have infinite second moments, do not have densities and possess arbitrarily continuous marginal distributions. A feature selection result with explicit rate is also provided. TCA is further implemented in both numerical simulations and largescale stock data to illustrate its empirical usefulness. Both theories and experiments confirm that TCA can achieve model flexibility, estimation accuracy and robustness at almost no cost. 1
3 0.751113 248 nips-2012-Nonparanormal Belief Propagation (NPNBP)
Author: Gal Elidan, Cobi Cario
Abstract: The empirical success of the belief propagation approximate inference algorithm has inspired numerous theoretical and algorithmic advances. Yet, for continuous non-Gaussian domains performing belief propagation remains a challenging task: recent innovations such as nonparametric or kernel belief propagation, while useful, come with a substantial computational cost and offer little theoretical guarantees, even for tree structured models. In this work we present Nonparanormal BP for performing efficient inference on distributions parameterized by a Gaussian copulas network and any univariate marginals. For tree structured networks, our approach is guaranteed to be exact for this powerful class of non-Gaussian models. Importantly, the method is as efficient as standard Gaussian BP, and its convergence properties do not depend on the complexity of the univariate marginals, even when a nonparametric representation is used. 1
4 0.72735333 166 nips-2012-Joint Modeling of a Matrix with Associated Text via Latent Binary Features
Author: Xianxing Zhang, Lawrence Carin
Abstract: A new methodology is developed for joint analysis of a matrix and accompanying documents, with the documents associated with the matrix rows/columns. The documents are modeled with a focused topic model, inferring interpretable latent binary features for each document. A new matrix decomposition is developed, with latent binary features associated with the rows/columns, and with imposition of a low-rank constraint. The matrix decomposition and topic model are coupled by sharing the latent binary feature vectors associated with each. The model is applied to roll-call data, with the associated documents defined by the legislation. Advantages of the proposed model are demonstrated for prediction of votes on a new piece of legislation, based only on the observed text of legislation. The coupling of the text and legislation is also shown to yield insight into the properties of the matrix decomposition for roll-call data. 1
5 0.69530964 249 nips-2012-Nyström Method vs Random Fourier Features: A Theoretical and Empirical Comparison
Author: Tianbao Yang, Yu-feng Li, Mehrdad Mahdavi, Rong Jin, Zhi-Hua Zhou
Abstract: Both random Fourier features and the Nystr¨ m method have been successfully o applied to efficient kernel learning. In this work, we investigate the fundamental difference between these two approaches, and how the difference could affect their generalization performances. Unlike approaches based on random Fourier features where the basis functions (i.e., cosine and sine functions) are sampled from a distribution independent from the training data, basis functions used by the Nystr¨ m method are randomly sampled from the training examples and are o therefore data dependent. By exploring this difference, we show that when there is a large gap in the eigen-spectrum of the kernel matrix, approaches based on the Nystr¨ m method can yield impressively better generalization error bound than o random Fourier features based approach. We empirically verify our theoretical findings on a wide range of large data sets. 1
6 0.6719889 323 nips-2012-Statistical Consistency of Ranking Methods in A Rank-Differentiable Probability Space
7 0.64131641 352 nips-2012-Transelliptical Graphical Models
8 0.54485869 310 nips-2012-Semiparametric Principal Component Analysis
9 0.53707594 210 nips-2012-Memorability of Image Regions
10 0.53528661 274 nips-2012-Priors for Diversity in Generative Latent Variable Models
11 0.5345996 201 nips-2012-Localizing 3D cuboids in single-view images
12 0.53403455 185 nips-2012-Learning about Canonical Views from Internet Image Collections
13 0.5330689 357 nips-2012-Unsupervised Template Learning for Fine-Grained Object Recognition
14 0.52680266 101 nips-2012-Discriminatively Trained Sparse Code Gradients for Contour Detection
15 0.52661711 360 nips-2012-Visual Recognition using Embedded Feature Selection for Curvature Self-Similarity
16 0.52031344 47 nips-2012-Augment-and-Conquer Negative Binomial Processes
17 0.5156672 363 nips-2012-Wavelet based multi-scale shape features on arbitrary surfaces for cortical thickness discrimination
18 0.51566058 193 nips-2012-Learning to Align from Scratch
19 0.51475334 235 nips-2012-Natural Images, Gaussian Mixtures and Dead Leaves
20 0.51428038 3 nips-2012-A Bayesian Approach for Policy Learning from Trajectory Preference Queries