nips nips2008 nips2008-201 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Alex J. Smola, Julian J. Mcauley, Tibério S. Caetano
Abstract: Models for near-rigid shape matching are typically based on distance-related features, in order to infer matches that are consistent with the isometric assumption. However, real shapes from image datasets, even when expected to be related by “almost isometric” transformations, are actually subject not only to noise but also, to some limited degree, to variations in appearance and scale. In this paper, we introduce a graphical model that parameterises appearance, distance, and angle features and we learn all of the involved parameters via structured prediction. The outcome is a model for near-rigid shape matching which is robust in the sense that it is able to capture the possibly limited but still important scale and appearance variations. Our experimental results reveal substantial improvements upon recent successful models, while maintaining similar running times.
Reference: text
sentIndex sentText sentNum sentScore
1 Models for near-rigid shape matching are typically based on distance-related features, in order to infer matches that are consistent with the isometric assumption. [sent-12, score-0.78]
2 However, real shapes from image datasets, even when expected to be related by “almost isometric” transformations, are actually subject not only to noise but also, to some limited degree, to variations in appearance and scale. [sent-13, score-0.289]
3 In this paper, we introduce a graphical model that parameterises appearance, distance, and angle features and we learn all of the involved parameters via structured prediction. [sent-14, score-0.374]
4 The outcome is a model for near-rigid shape matching which is robust in the sense that it is able to capture the possibly limited but still important scale and appearance variations. [sent-15, score-0.614]
5 Our experimental results reveal substantial improvements upon recent successful models, while maintaining similar running times. [sent-16, score-0.094]
6 1 Introduction Matching shapes in images has many applications, including image retrieval, alignment, and registration [1, 2, 3, 4]. [sent-17, score-0.184]
7 Typically, matching is approached by selecting features for a set of landmark points in both images; a correspondence between the two is then chosen such that some distance measure between these features is minimised. [sent-18, score-0.718]
8 A great deal of attention has been devoted to defining complex features which are robust to changes in rotation, scale etc. [sent-19, score-0.236]
9 An important class of matching problems is that of near-isometric shape matching. [sent-21, score-0.409]
10 In this setting, it is assumed that shapes are defined up to an isometric transformation (allowing for some noise), and therefore distance features are typically used to encode the shape. [sent-22, score-0.521]
11 Recent work has shown how the isometric constraint can be exploited by a particular type of graphical model whose topology encodes the necessary properties for obtaining optimal matches in polynomial time [11]. [sent-23, score-0.551]
12 Another line of work has focused on structured learning to optimize graph matching scores; however, no explicit exploitation of the geometrical constraints involved in shape modeling is made [12]. [sent-24, score-0.579]
13 We produce an exact, efficient model to solve near-isometric shape matching problems using not only isometry-invariant features, but also appearance and scale-invariant features. [sent-26, score-0.53]
14 By doing so we can learn the relative importances of variations in appearance and scale with regard to variations in shape per se. [sent-27, score-0.573]
15 Therefore, even knowing that we are in a near-isometric setting, we will capture the eventual variations in appearance and scale into our matching criterion in order to produce a robust near-isometric matcher. [sent-28, score-0.447]
16 In terms of learning, we introduce a two-stage structured learning approach to address the speed and memory efficiency of this model. [sent-29, score-0.137]
17 Figure 1: The graphical model introduced in [11]. [sent-36, score-0.119]
18 Here we study the case of identifying an instance of a template shape (S ⊆ T ) in a target scene (U) [1]. [sent-39, score-0.46]
19 the points in the template that we want to query in the scene. [sent-42, score-0.227]
20 For each point t ∈ T and u ∈ U, a certain set of unary features are extracted (here denoted by φ(t), φ(u)), which contain local information about the image at that point [5, 6]. [sent-44, score-0.472]
21 (1) is a linear assignment problem, efficiently solvable in cubic time. [sent-50, score-0.122]
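To make the linear assignment problem concrete, here is a minimal sketch, assuming unary features are plain lists of floats and that the cost of a match is the squared Euclidean distance between them. The names `unary_cost` and `best_assignment` are hypothetical. The exhaustive search is only meant for tiny examples; it is not the cubic-time solver the text refers to.

```python
from itertools import permutations

def unary_cost(phi_t, phi_u):
    """Squared Euclidean distance between two unary feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(phi_t, phi_u))

def best_assignment(template_feats, target_feats):
    """Try every injective map template -> target and keep the cheapest.

    Brute force (factorial time), used here only for illustration.
    """
    n = len(template_feats)
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(target_feats)), n):
        cost = sum(unary_cost(template_feats[i], target_feats[j])
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return list(best), best_cost

# Three template points, four target points (one is clutter).
template = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
target = [[0.1, 0.9], [0.0, 0.1], [0.9, 0.0], [5.0, 5.0]]
assignment, cost = best_assignment(template, target)
print(assignment, cost)
```

Here `assignment[i]` is the index of the target point matched to template point `i`; the clutter point is left unmatched because assigning to it is expensive.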
22 In addition to unary or first-order features, pairwise or second-order features can be induced from the locations of the unary features. [sent-51, score-0.741]
23 (1) would be generalised to minimise an aggregate distance between pairwise features. [sent-53, score-0.177]
24 Discriminative structured learning has recently been applied to models of both linear and quadratic assignment in [12]. [sent-55, score-0.365]
25 2 Graphical Models In isometric matching settings, one may suspect that it may not be necessary to include all pairwise relations in quadratic assignment. [sent-57, score-0.559]
26 In fact a recent paper [11] has shown that if only the distances as encoded by the graphical model depicted in figure 1 are taken into account (nodes represent points in S and states represent points in U), exact probabilistic inference in such a model can solve the isometric problem optimally. [sent-58, score-0.613]
27 (2) i=1 In [11], it is shown that loopy belief propagation using this model converges to the optimal assignment, and that the number of iterations required before convergence is small in practice. [sent-60, score-0.095]
28 We will extend this model by adding a unary term, c1 (si , y(si )) (as in (eq. [sent-61, score-0.305]
29 2 Here T is the set of all points in the template scene, whereas S corresponds to those points in which we are interested. [sent-64, score-0.311]
30 3 Discriminative Structured Learning In practice, feature vectors may be very high-dimensional, and which components are ‘important’ will depend on the specific properties of the shapes being matched. [sent-70, score-0.104]
31 Therefore, we introduce a parameter, θ, which controls the relative importances of the various feature components. [sent-71, score-0.094]
32 Note that θ is parameterising the matching criterion itself. [sent-72, score-0.194]
33 In order to measure the performance of a particular weight vector, we use a loss function, $\Delta(\hat{y}, y^i)$, which represents the cost incurred by choosing the assignment $\hat{y}$ when the correct assignment is $y^i$ (our specific choice of loss function is described in section 4). [sent-75, score-0.435]
34 Learning in this setting now becomes a matter of choosing θ such that the empirical risk (average loss on all training instances) is minimised, but which is also sufficiently ‘smooth’ (to prevent overfitting). [sent-78, score-0.153]
35 $y^N$, then we wish to minimise $\underbrace{\frac{1}{N}\sum_{i=1}^{N} \Delta(f(S^i, U^i; \theta), y^i)}_{\text{empirical risk}} + \underbrace{\frac{\lambda}{2}\|\theta\|_2^2}_{\text{regulariser}}$. (5) [sent-88, score-0.106]
36 Here λ (the regularisation constant) controls the relative importance of minimising the empirical risk against the regulariser. [sent-89, score-0.167]
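The objective in (5) can be evaluated directly once the per-instance losses are known. The sketch below assumes the losses have already been computed for each training pair; `regularised_risk` is a hypothetical helper name, not something defined in the paper.

```python
def regularised_risk(losses, theta, lam):
    """Average loss over training instances plus (lam / 2) * ||theta||_2^2."""
    empirical_risk = sum(losses) / len(losses)
    regulariser = 0.5 * lam * sum(w * w for w in theta)
    return empirical_risk + regulariser

# Three training instances with losses in [0, 1], a 2-d weight vector.
objective = regularised_risk([0.2, 0.0, 0.4], [1.0, -2.0], lam=0.1)
print(objective)
```

Increasing `lam` pushes the minimiser towards smaller weights at the expense of a higher average loss on the training set.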
37 Here we capitalise on recent advances in large-margin structured estimation [15], which consist of obtaining convex relaxations of this problem. [sent-93, score-0.103]
38 This means that we end up minimising an upper bound on the loss, instead of the loss itself. [sent-96, score-0.123]
39 Figure 2: Left: the (ordered) set of points in our template shape (S). [sent-107, score-0.442]
40 3 Our Model Although the model of [11] solves isometric matching problems optimally, it provides no guarantees for near-isometric problems, as it only considers those compatibilities which form cliques in our graphical model. [sent-110, score-0.59]
41 With this in mind, we introduce three new features (for brevity we use the shorthand $y_i = y(s_i)$): $\Phi_1(s_1, s_2, y_1, y_2) = (d_1(s_1, s_2) - d_1(y_1, y_2))^2$, where $d_1(a, b)$ is the Euclidean distance between a and b, scaled according to the width of the target scene. [sent-112, score-0.292]
42 $\Phi_2(s_1, s_2, s_3, y_1, y_2, y_3) = (d_2(s_1, s_2, s_3) - d_2(y_1, y_2, y_3))^2$, where $d_2(a, b, c)$ is the Euclidean distance between a and b scaled by the average of the distances between a, b, and c. [sent-113, score-0.147]
43 We also include the unary features $\Phi_0(s_1, y_1) = (\phi(s_1) - \phi(y_1))^2$ (i. [sent-118, score-0.427]
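A minimal sketch of the two distance features, assuming 2-d points given as `(x, y)` tuples, that both distances in the first feature are divided by the same scene width, and that "the average of the distances between a, b, and c" means the mean of the three pairwise distances. The function names are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two 2-d points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def d1(a, b, width):
    """Distance between a and b, scaled by the scene width (assumed shared)."""
    return dist(a, b) / width

def d2(a, b, c):
    """Distance between a and b, scaled by the mean pairwise distance of a, b, c."""
    mean = (dist(a, b) + dist(b, c) + dist(a, c)) / 3.0
    return dist(a, b) / mean

def phi1(s1, s2, y1, y2, width):
    """Squared difference of scaled pairwise distances."""
    return (d1(s1, s2, width) - d1(y1, y2, width)) ** 2

def phi2(s1, s2, s3, y1, y2, y3):
    """Squared difference of triangle-normalised distances."""
    return (d2(s1, s2, s3) - d2(y1, y2, y3)) ** 2

# A template triangle and a copy scaled by a factor of two.
s = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
y = [(0.0, 0.0), (6.0, 0.0), (0.0, 8.0)]
f1 = phi1(s[0], s[1], y[0], y[1], width=10.0)
f2 = phi2(s[0], s[1], s[2], y[0], y[1], y[2])
print(f1, f2)
```

In this example the first feature penalises the change of scale, while the triangle-normalised feature does not, which is the behaviour the text attributes to the scale-invariant terms.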
44 Φ1 is exactly the feature used in [11], and is invariant to isometric transformations (rotation, reflection, and translation); Φ2 and Φ3 capture triangle similarity, and are thus also invariant to scale. [sent-121, score-0.409]
45 In practice, landmark detectors often identify several hundred points [6, 17], which is clearly impractical for an $O(|S||U|^3)$ method (|U| is the number of landmarks in the target scene). [sent-124, score-0.452]
46 To address this, we adopt a two stage learning approach: in the first stage, we learn only unary compatibilities, exactly as is done in [12]. [sent-125, score-0.523]
47 During the second stage of learning, we collapse the first-order feature vector into a single term, namely $\Phi_0(s_1, y_1) = \langle \theta_0, \Phi_0(s_1, y_1) \rangle$ (9), where $\theta_0$ is the weight vector learned during the first stage. [sent-126, score-0.286]
48 We now perform learning for the third-order model, but consider only the p ‘most likely’ matches for each node, where the likelihood is simply determined using Φ0 (s1 , y1 ). [sent-127, score-0.155]
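The candidate pruning step can be sketched as follows, under the assumption that a larger collapsed score $\langle \theta_0, \Phi_0 \rangle$ means a more likely match (so weights on squared-difference features would typically be negative). The names `collapsed_unary_score` and `top_p_candidates` are hypothetical.

```python
import heapq

def collapsed_unary_score(theta0, phi0):
    """Inner product of the first-stage weights with a unary feature vector."""
    return sum(w * f for w, f in zip(theta0, phi0))

def top_p_candidates(theta0, feats_by_candidate, p):
    """Return the p candidate indices with the highest collapsed score.

    feats_by_candidate maps a target-point index to its unary feature
    vector with respect to one fixed template node.
    """
    return heapq.nlargest(
        p,
        feats_by_candidate,
        key=lambda u: collapsed_unary_score(theta0, feats_by_candidate[u]),
    )

theta0 = [-1.0, -0.5]  # assumed sign convention: large differences are penalised
feats = {0: [0.1, 0.2], 1: [2.0, 1.0], 2: [0.0, 0.1], 3: [0.5, 0.5]}
shortlist = top_p_candidates(theta0, feats, p=2)
print(shortlist)
```

Restricting each node to such a shortlist replaces |U| by p in the cost of the higher-order model, which is what makes the second stage tractable for scenes with hundreds of landmarks.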
49 A consequence of using this approach is that we must now tune two regularisation constants; this is not an issue in practice, as learning can be performed quickly using this approach. [sent-129, score-0.103]
50 Using features of such different scales can be an issue for regularisation – in practice we adjusted these features to have roughly the same scale. [sent-130, score-0.373]
51 In fact, even in those cases where a single stage approach was tractable (such as the experiment in section 4. [sent-132, score-0.214]
52 1), we found that the two stage approach worked better. [sent-133, score-0.251]
53 Typically, we required much less regularity during the second stage, possibly because the higher order features are heterogeneous. [sent-134, score-0.152]
54 Figure 3: Left: The adjacency structure of the graph (top); the boundary of our ‘shape’ (centre); the topology of our graphical model (bottom). [sent-135, score-0.257]
55 Right: Example matches using linear assignment (top, 6/30 mismatches), quadratic assignment (centre, 4/30 mismatches), and the proposed model (bottom, no mismatches). [sent-136, score-0.501]
56 The images shown are the 12th and 102nd frames in our sequence. [sent-137, score-0.118]
57 Correct matches are shown in green, incorrect matches in red. [sent-138, score-0.242]
58 Both papers report the performance of their methods on the CMU ‘house’ sequence – a sequence of 111 frames of a toy house, with 30 landmarks identified in each frame. [sent-142, score-0.225]
59 As in [12], we compute the Shape Context features for each of the 30 points [5]. [sent-143, score-0.236]
60 In addition to the unary model of [12], a model based on quadratic assignment is also presented, in which pairwise features are determined using the adjacency structure of the graphs. [sent-144, score-0.811]
61 Specifically, if a pair of points (p1 , p2 ) in the template scene is to be matched to (q1 , q2 ) in the target, there is a feature which is 1 if there is an edge between p1 and p2 in the template, and an edge between q1 and q2 in the target (and 0 otherwise). [sent-145, score-0.426]
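The adjacency feature described here is a simple indicator. A minimal sketch, assuming edges are undirected and stored as sets of `frozenset` pairs; the representation and the name `edge_agreement` are illustrative choices.

```python
def edge_agreement(p1, p2, q1, q2, template_edges, target_edges):
    """Return 1 if (p1, p2) is a template edge and (q1, q2) is a target edge, else 0."""
    in_template = frozenset((p1, p2)) in template_edges
    in_target = frozenset((q1, q2)) in target_edges
    return 1 if (in_template and in_target) else 0

template_edges = {frozenset(e) for e in [(0, 1), (1, 2)]}
target_edges = {frozenset(e) for e in [(5, 6), (6, 7)]}

both = edge_agreement(0, 1, 6, 5, template_edges, target_edges)
only_template = edge_agreement(0, 1, 5, 7, template_edges, target_edges)
print(both, only_template)
```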
62 We also use such a feature for this experiment, however our model only considers matchings for which (p1 , p2 ) forms an edge in our graphical model (see figure 3, bottom left). [sent-146, score-0.245]
63 As in [11], we compare pairs of images with a fixed baseline (separation between frames). [sent-148, score-0.14]
64 For our loss function, $\Delta(\hat{y}, y^i)$, we used the normalised Hamming loss, i. [sent-149, score-0.205]
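The normalised Hamming loss is the fraction of template points that are assigned to the wrong target point. A short sketch, assuming assignments are lists of target indices, one per template point:

```python
def normalised_hamming_loss(predicted, correct):
    """Fraction of positions at which the predicted assignment is wrong."""
    if len(predicted) != len(correct):
        raise ValueError("assignments must have the same length")
    mismatches = sum(p != c for p, c in zip(predicted, correct))
    return mismatches / len(correct)

# 30 landmarks, of which the first six are matched to the wrong point.
correct = list(range(30))
predicted = [(i + 1) % 6 if i < 6 else i for i in range(30)]
loss = normalised_hamming_loss(predicted, correct)
print(loss)
```

With 6 mismatches out of 30 landmarks the loss is 6/30, the same proportion quoted for the linear assignment example in Figure 3.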
65 In figure 5, we see that the running time of our method is similar to the quadratic assignment method of [12]. [sent-155, score-0.35]
66 for each point in the template scene, we only consider the 10 ‘most likely’ matches, using the weights from the first stage of learning. [sent-158, score-0.41]
67 Interestingly, the quadratic method of [12] performs worse than their unary method; this is likely because the relative scale of the unary and quadratic features is badly tuned before learning, and is indeed similar to what the authors report. [sent-163, score-0.99]
68 Furthermore, the results we present for the method of [12] after learning are much better than what the authors report: in that paper, the unary features are scaled using a pointwise exponent ($-\exp(-|\phi_a - \phi_b|^2)$), whereas we found that scaling the features linearly ($|\phi_a - \phi_b|^2$) worked better. [sent-164, score-0.776]
69 [Figure: panels ‘House data, no learning’ and ‘House data, learning’; y-axis: normalised Hamming loss on test set; legend without learning: point matching, linear, quadratic, higher order; legend with learning: linear (learning), quadratic (learning), higher order (learning, 10 points), higher order (learning).] [sent-165, score-0.516]
72 Figure 5: The running time and performance of our method, compared to those of [12] (note that the method of [11] has running time identical to our method); axis: average running time (seconds, logarithmic scale). [sent-189, score-0.216]
73 magnitude, bringing it closer to that of linear assignment; even this model achieves approximately zero error up to a baseline of 50. [sent-191, score-0.106]
74 Finally, figure 6 (left) shows the weight vector of our model, for a baseline of 60. [sent-192, score-0.119]
75 It is much more difficult to reason about the second stage of learning, as the features have different scales, and cannot be compared directly – however, it appears that all of the higher-order features are important to our model. [sent-196, score-0.518]
76 2 Bikes Data For our second experiment, we used images of bicycles from the Caltech 256 Dataset [18]. [sent-198, score-0.162]
77 Bicycles are reasonably rigid objects, meaning that matching based on their shape is logical. [sent-199, score-0.445]
78 For each image in the dataset, we detected landmarks automatically, and six points on the frame were hand-labelled (see figure 7). [sent-201, score-0.3]
79 Only shapes in which these interest points were not occluded were used, and we only included images that had a background; in total, we labelled 44 [Figure 6 panel title: House data, first/higher order weight vector (baseline = 60)] [sent-202, score-0.301]
80 The first 60 weights are for the Shape Context features from the first stage of learning; the final 5 weights are for the second stage of learning. [sent-211, score-0.686]
81 Left: The template image (with the shape outlined in green, and landmark points marked in blue). [sent-215, score-0.579]
82 Centre: The target image, and the match (in red) using unary features with the affine invariant/SIFT model of [17] after learning (endpoint error = 0. [sent-216, score-0.589]
83 Right: the match using our model after learning (endpoint error = 0. [sent-218, score-0.123]
84 Thus we are learning to match bicycles similar to the chosen template. [sent-222, score-0.191]
85 Initially, we used the SIFT landmarks and features as described in [6]. [sent-223, score-0.323]
86 Since we cannot hope to get exact matches, we use the endpoint error instead of the normalised Hamming loss, i. [sent-227, score-0.262]
87 This may be explained by the fact that although the SIFT features are invariant to scale and rotation, they are not invariant to reflection. [sent-231, score-0.326]
88 In [17], the authors report that the SIFT features can provide good matches in such cases, as long as landmarks are chosen which are locally invariant to affine transformations. [sent-232, score-0.508]
89 They give a method for identifying affine-invariant feature points, whose SIFT features are then computed. [sent-233, score-0.211]
90 Figure 7 shows an example match using both the unary and higher-order techniques. [sent-235, score-0.334]
91 Interestingly, the first-order term during the second stage of learning has almost zero weight. [sent-237, score-0.248]
92 This must not be misinterpreted: during the second stage, the response of each of the 20 candidate points is so similar that the first-order features are simply unable to convey any new information – yet they are still very useful in determining the 20 candidate points. [sent-238, score-0.236]
93 Here the endpoint error is just the average Euclidean distance from the correct label, scaled according to the width of the image. [sent-239, score-0.232]
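The endpoint error as described can be sketched as below, assuming predicted and correct landmark positions are lists of `(x, y)` pixel coordinates in the same order; `endpoint_error` is a hypothetical helper name.

```python
import math

def endpoint_error(predicted_pts, correct_pts, image_width):
    """Mean Euclidean distance to the correct label, divided by the image width."""
    if len(predicted_pts) != len(correct_pts):
        raise ValueError("point lists must have the same length")
    total = sum(
        math.hypot(px - cx, py - cy)
        for (px, py), (cx, cy) in zip(predicted_pts, correct_pts)
    )
    return total / (len(correct_pts) * image_width)

# One exact match and one match that is 5 pixels off, in a 100-pixel-wide image.
error = endpoint_error([(10, 10), (53, 44)], [(10, 10), (50, 40)], image_width=100)
print(error)
```

Unlike the normalised Hamming loss, this measure gives partial credit to matches that land near the correct landmark, which suits data where exact matches cannot be expected.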
94 The endpoint error is reported, with standard errors in parentheses (note that the second-last column, ‘higher-order’ uses the weights from the first stage of learning, but not the second). [sent-242, score-0.398]
95 039) Affine invariant/SIFT [17] 5 unary Training: 0. [sent-261, score-0.275]
96 Conclusion We have presented a model for near-isometric shape matching which is robust to typical additional variations of the shape. [sent-291, score-0.555]
97 This is achieved by performing structured learning in a graphical model that encodes features with several different types of invariances, so that we can directly learn a “compound invariance” instead of taking for granted the exclusive assumption of isometric invariance. [sent-292, score-0.671]
98 Our experiments revealed that structured learning with a principled graphical model that encodes both the rigid shape as well as non-isometric variations gives substantial improvements, while still maintaining competitive performance in terms of running time. [sent-293, score-0.722]
99 : Learning methods for generic object recognition with invariance to pose and lighting. [sent-339, score-0.119]
100 : Support vector machine learning for interdependent and structured output spaces. [sent-366, score-0.137]
wordName wordTfidf (topN-words)
[('si', 0.354), ('unary', 0.275), ('house', 0.228), ('isometric', 0.22), ('shape', 0.215), ('stage', 0.214), ('matching', 0.194), ('landmarks', 0.171), ('features', 0.152), ('template', 0.143), ('normalised', 0.131), ('endpoint', 0.131), ('assignment', 0.122), ('matches', 0.121), ('bikes', 0.114), ('quadratic', 0.106), ('malik', 0.106), ('structured', 0.103), ('centre', 0.102), ('sift', 0.098), ('bicycles', 0.098), ('caetano', 0.098), ('landmark', 0.092), ('appearance', 0.091), ('graphical', 0.089), ('hamming', 0.087), ('mismatches', 0.086), ('points', 0.084), ('variations', 0.078), ('baseline', 0.076), ('shapes', 0.075), ('loss', 0.074), ('regularisation', 0.069), ('belongie', 0.069), ('barbosa', 0.065), ('importances', 0.065), ('mcauley', 0.065), ('bins', 0.065), ('invariant', 0.064), ('images', 0.064), ('af', 0.064), ('gure', 0.063), ('scene', 0.063), ('running', 0.062), ('match', 0.059), ('smola', 0.059), ('scaled', 0.057), ('compatibilities', 0.057), ('minimise', 0.057), ('adjacency', 0.057), ('pami', 0.057), ('object', 0.056), ('radial', 0.054), ('frames', 0.054), ('weights', 0.053), ('mori', 0.052), ('minimised', 0.052), ('risk', 0.049), ('minimising', 0.049), ('topology', 0.048), ('nicta', 0.046), ('distances', 0.046), ('scale', 0.046), ('image', 0.045), ('distance', 0.044), ('felzenszwalb', 0.044), ('weight', 0.043), ('encodes', 0.043), ('rotation', 0.04), ('pointwise', 0.039), ('australian', 0.039), ('pairwise', 0.039), ('target', 0.039), ('robust', 0.038), ('aggregate', 0.037), ('worked', 0.037), ('hundred', 0.036), ('rigid', 0.036), ('alexander', 0.036), ('propagation', 0.035), ('labelled', 0.035), ('euclidean', 0.035), ('learning', 0.034), ('edge', 0.034), ('graph', 0.033), ('invariance', 0.033), ('bottom', 0.033), ('transformations', 0.032), ('substantial', 0.032), ('argmax', 0.031), ('context', 0.03), ('validation', 0.03), ('typically', 0.03), ('model', 0.03), ('matter', 0.03), ('depicted', 0.03), ('recognition', 0.03), ('method', 0.03), ('belief', 
0.03), ('feature', 0.029)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000004 201 nips-2008-Robust Near-Isometric Matching via Structured Learning of Graphical Models
Author: Alex J. Smola, Julian J. Mcauley, Tibério S. Caetano
Abstract: Models for near-rigid shape matching are typically based on distance-related features, in order to infer matches that are consistent with the isometric assumption. However, real shapes from image datasets, even when expected to be related by “almost isometric” transformations, are actually subject not only to noise but also, to some limited degree, to variations in appearance and scale. In this paper, we introduce a graphical model that parameterises appearance, distance, and angle features and we learn all of the involved parameters via structured prediction. The outcome is a model for near-rigid shape matching which is robust in the sense that it is able to capture the possibly limited but still important scale and appearance variations. Our experimental results reveal substantial improvements upon recent successful models, while maintaining similar running times. 1
2 0.19421262 207 nips-2008-Shape-Based Object Localization for Descriptive Classification
Author: Geremy Heitz, Gal Elidan, Benjamin Packer, Daphne Koller
Abstract: Discriminative tasks, including object categorization and detection, are central components of high-level computer vision. Sometimes, however, we are interested in more refined aspects of the object in an image, such as pose or particular regions. In this paper we develop a method (LOOPS) for learning a shape and image feature model that can be trained on a particular object class, and used to outline instances of the class in novel images. Furthermore, while the training data consists of uncorresponded outlines, the resulting LOOPS model contains a set of landmark points that appear consistently across instances, and can be accurately localized in an image. Our model achieves state-of-the-art results in precisely outlining objects that exhibit large deformations and articulations in cluttered natural images. These localizations can then be used to address a range of tasks, including descriptive classification, search, and clustering. 1
Author: Christoph Kolodziejski, Bernd Porr, Minija Tamosiunaite, Florentin Wörgötter
Abstract: In this theoretical contribution we provide mathematical proof that two of the most important classes of network learning - correlation-based differential Hebbian learning and reward-based temporal difference learning - are asymptotically equivalent when timing the learning with a local modulatory signal. This opens the opportunity to consistently reformulate most of the abstract reinforcement learning framework from a correlation based perspective that is more closely related to the biophysics of neurons. 1
4 0.15586901 95 nips-2008-Grouping Contours Via a Related Image
Author: Praveen Srinivasan, Liming Wang, Jianbo Shi
Abstract: Contours have been established in the biological and computer vision literature as a compact yet descriptive representation of object shape. While individual contours provide structure, they lack the large spatial support of region segments (which lack internal structure). We present a method for further grouping of contours in an image using their relationship to the contours of a second, related image. Stereo, motion, and similarity all provide cues that can aid this task; contours that have similar transformations relating them to their matching contours in the second image likely belong to a single group. To find matches for contours, we rely only on shape, which applies directly to all three modalities without modification, in contrast to the specialized approaches developed for each independently. Visually salient contours are extracted in each image, along with a set of candidate transformations for aligning subsets of them. For each transformation, groups of contours with matching shape across the two images are identified to provide a context for evaluating matches of individual contour points across the images. The resulting contexts of contours are used to perform a final grouping on contours in the original image while simultaneously finding matches in the related image, again by shape matching. We demonstrate grouping results on image pairs consisting of stereo, motion, and similar images. Our method also produces qualitatively better results against a baseline method that does not use the inferred contexts. 1
5 0.12967254 6 nips-2008-A ``Shape Aware'' Model for semi-supervised Learning of Objects and its Context
Author: Abhinav Gupta, Jianbo Shi, Larry S. Davis
Abstract: We present an approach that combines bag-of-words and spatial models to perform semantic and syntactic analysis for recognition of an object based on its internal appearance and its context. We argue that while object recognition requires modeling relative spatial locations of image features within the object, a bag-of-word is sufficient for representing context. Learning such a model from weakly labeled data involves labeling of features into two classes: foreground(object) or “informative” background(context). We present a “shape-aware” model which utilizes contour information for efficient and accurate labeling of features in the image. Our approach iterates between an MCMC-based labeling and contour based labeling of features to integrate co-occurrence of features and shape similarity. 1
6 0.12410012 113 nips-2008-Kernelized Sorting
7 0.11235752 215 nips-2008-Sparse Signal Recovery Using Markov Random Fields
8 0.093238421 116 nips-2008-Learning Hybrid Models for Image Annotation with Partially Labeled Data
9 0.091645613 87 nips-2008-Fitted Q-iteration by Advantage Weighted Regression
10 0.090835035 239 nips-2008-Tighter Bounds for Structured Estimation
11 0.090047367 119 nips-2008-Learning a discriminative hidden part model for human action recognition
12 0.089981109 42 nips-2008-Cascaded Classification Models: Combining Models for Holistic Scene Understanding
13 0.087590076 147 nips-2008-Multiscale Random Fields with Application to Contour Grouping
14 0.084723823 191 nips-2008-Recursive Segmentation and Recognition Templates for 2D Parsing
15 0.081172526 51 nips-2008-Convergence and Rate of Convergence of a Manifold-Based Dimension Reduction Algorithm
16 0.080997713 157 nips-2008-Nonrigid Structure from Motion in Trajectory Space
17 0.079458505 194 nips-2008-Regularized Learning with Networks of Features
18 0.077884383 241 nips-2008-Transfer Learning by Distribution Matching for Targeted Advertising
19 0.07540863 26 nips-2008-Analyzing human feature learning as nonparametric Bayesian inference
20 0.075278036 145 nips-2008-Multi-stage Convex Relaxation for Learning with Sparse Regularization
topicId topicWeight
[(0, -0.238), (1, -0.065), (2, 0.07), (3, -0.104), (4, 0.043), (5, -0.012), (6, -0.068), (7, -0.194), (8, 0.05), (9, -0.044), (10, -0.065), (11, -0.008), (12, -0.007), (13, -0.039), (14, 0.095), (15, 0.01), (16, 0.156), (17, 0.135), (18, 0.041), (19, 0.102), (20, -0.003), (21, -0.005), (22, -0.097), (23, 0.042), (24, 0.05), (25, 0.178), (26, -0.048), (27, -0.082), (28, -0.036), (29, 0.065), (30, -0.073), (31, -0.064), (32, 0.061), (33, -0.076), (34, -0.002), (35, 0.091), (36, 0.081), (37, -0.039), (38, -0.126), (39, 0.031), (40, 0.035), (41, -0.03), (42, -0.08), (43, -0.004), (44, -0.037), (45, 0.037), (46, -0.048), (47, -0.017), (48, -0.019), (49, 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 0.9549402 201 nips-2008-Robust Near-Isometric Matching via Structured Learning of Graphical Models
Author: Alex J. Smola, Julian J. Mcauley, Tibério S. Caetano
Abstract: Models for near-rigid shape matching are typically based on distance-related features, in order to infer matches that are consistent with the isometric assumption. However, real shapes from image datasets, even when expected to be related by “almost isometric” transformations, are actually subject not only to noise but also, to some limited degree, to variations in appearance and scale. In this paper, we introduce a graphical model that parameterises appearance, distance, and angle features and we learn all of the involved parameters via structured prediction. The outcome is a model for near-rigid shape matching which is robust in the sense that it is able to capture the possibly limited but still important scale and appearance variations. Our experimental results reveal substantial improvements upon recent successful models, while maintaining similar running times. 1
2 0.73573065 207 nips-2008-Shape-Based Object Localization for Descriptive Classification
Author: Geremy Heitz, Gal Elidan, Benjamin Packer, Daphne Koller
Abstract: Discriminative tasks, including object categorization and detection, are central components of high-level computer vision. Sometimes, however, we are interested in more refined aspects of the object in an image, such as pose or particular regions. In this paper we develop a method (LOOPS) for learning a shape and image feature model that can be trained on a particular object class, and used to outline instances of the class in novel images. Furthermore, while the training data consists of uncorresponded outlines, the resulting LOOPS model contains a set of landmark points that appear consistently across instances, and can be accurately localized in an image. Our model achieves state-of-the-art results in precisely outlining objects that exhibit large deformations and articulations in cluttered natural images. These localizations can then be used to address a range of tasks, including descriptive classification, search, and clustering. 1
3 0.67977023 95 nips-2008-Grouping Contours Via a Related Image
Author: Praveen Srinivasan, Liming Wang, Jianbo Shi
Abstract: Contours have been established in the biological and computer vision literature as a compact yet descriptive representation of object shape. While individual contours provide structure, they lack the large spatial support of region segments (which lack internal structure). We present a method for further grouping of contours in an image using their relationship to the contours of a second, related image. Stereo, motion, and similarity all provide cues that can aid this task; contours that have similar transformations relating them to their matching contours in the second image likely belong to a single group. To find matches for contours, we rely only on shape, which applies directly to all three modalities without modification, in contrast to the specialized approaches developed for each independently. Visually salient contours are extracted in each image, along with a set of candidate transformations for aligning subsets of them. For each transformation, groups of contours with matching shape across the two images are identified to provide a context for evaluating matches of individual contour points across the images. The resulting contexts of contours are used to perform a final grouping on contours in the original image while simultaneously finding matches in the related image, again by shape matching. We demonstrate grouping results on image pairs consisting of stereo, motion, and similar images. Our method also produces qualitatively better results against a baseline method that does not use the inferred contexts. 1
4 0.62482417 215 nips-2008-Sparse Signal Recovery Using Markov Random Fields
Author: Volkan Cevher, Marco F. Duarte, Chinmay Hegde, Richard Baraniuk
Abstract: Compressive Sensing (CS) combines sampling and compression into a single sub-Nyquist linear measurement process for sparse and compressible signals. In this paper, we extend the theory of CS to include signals that are concisely represented in terms of a graphical model. In particular, we use Markov Random Fields (MRFs) to represent sparse signals whose nonzero coefficients are clustered. Our new model-based recovery algorithm, dubbed Lattice Matching Pursuit (LaMP), stably recovers MRF-modeled signals using many fewer measurements and computations than the current state-of-the-art algorithms.
5 0.62254596 6 nips-2008-A ``Shape Aware'' Model for semi-supervised Learning of Objects and its Context
Author: Abhinav Gupta, Jianbo Shi, Larry S. Davis
Abstract: We present an approach that combines bag-of-words and spatial models to perform semantic and syntactic analysis for recognition of an object based on its internal appearance and its context. We argue that while object recognition requires modeling relative spatial locations of image features within the object, a bag-of-words model is sufficient for representing context. Learning such a model from weakly labeled data involves labeling of features into two classes: foreground (object) or “informative” background (context). We present a “shape-aware” model which utilizes contour information for efficient and accurate labeling of features in the image. Our approach iterates between an MCMC-based labeling and contour-based labeling of features to integrate co-occurrence of features and shape similarity.
6 0.60650319 147 nips-2008-Multiscale Random Fields with Application to Contour Grouping
8 0.44211811 157 nips-2008-Nonrigid Structure from Motion in Trajectory Space
9 0.43948734 113 nips-2008-Kernelized Sorting
10 0.43722802 23 nips-2008-An ideal observer model of infant object perception
11 0.41896236 42 nips-2008-Cascaded Classification Models: Combining Models for Holistic Scene Understanding
12 0.38667795 87 nips-2008-Fitted Q-iteration by Advantage Weighted Regression
13 0.38439178 183 nips-2008-Predicting the Geometry of Metal Binding Sites from Protein Sequence
14 0.36907741 191 nips-2008-Recursive Segmentation and Recognition Templates for 2D Parsing
15 0.36656025 30 nips-2008-Bayesian Experimental Design of Magnetic Resonance Imaging Sequences
16 0.36470175 246 nips-2008-Unsupervised Learning of Visual Sense Models for Polysemous Words
17 0.36254823 26 nips-2008-Analyzing human feature learning as nonparametric Bayesian inference
18 0.34930497 110 nips-2008-Kernel-ARMA for Hand Tracking and Brain-Machine interfacing During 3D Motor Control
19 0.34912306 74 nips-2008-Estimating the Location and Orientation of Complex, Correlated Neural Activity using MEG
20 0.34711757 239 nips-2008-Tighter Bounds for Structured Estimation
topicId topicWeight
[(6, 0.06), (7, 0.062), (12, 0.088), (28, 0.13), (57, 0.071), (59, 0.044), (63, 0.021), (71, 0.028), (77, 0.07), (81, 0.257), (83, 0.095)]
simIndex simValue paperId paperTitle
1 0.84159958 180 nips-2008-Playing Pinball with non-invasive BCI
Author: Matthias Krauledat, Konrad Grzeska, Max Sagebaum, Benjamin Blankertz, Carmen Vidaurre, Klaus-Robert Müller, Michael Schröder
Abstract: Compared to invasive Brain-Computer Interfaces (BCI), non-invasive BCI systems based on Electroencephalogram (EEG) signals have not been applied successfully for precisely timed control tasks. In the present study, however, we demonstrate and report on the interaction of subjects with a real device: a pinball machine. Results of this study clearly show that fast and well-timed control well beyond chance level is possible, even though the environment is extremely rich and requires precisely timed and complex predictive behavior. Using machine learning methods for mental state decoding, BCI-based pinball control is possible within the first session without the necessity to employ lengthy subject training. The current study shows clearly that very compelling control with excellent timing and dynamics is possible for a non-invasive BCI.
same-paper 2 0.75039941 201 nips-2008-Robust Near-Isometric Matching via Structured Learning of Graphical Models
Author: Alex J. Smola, Julian J. Mcauley, Tibério S. Caetano
Abstract: Models for near-rigid shape matching are typically based on distance-related features, in order to infer matches that are consistent with the isometric assumption. However, real shapes from image datasets, even when expected to be related by “almost isometric” transformations, are actually subject not only to noise but also, to some limited degree, to variations in appearance and scale. In this paper, we introduce a graphical model that parameterises appearance, distance, and angle features and we learn all of the involved parameters via structured prediction. The outcome is a model for near-rigid shape matching which is robust in the sense that it is able to capture the possibly limited but still important scale and appearance variations. Our experimental results reveal substantial improvements upon recent successful models, while maintaining similar running times.
3 0.71316177 226 nips-2008-Supervised Dictionary Learning
Author: Julien Mairal, Jean Ponce, Guillermo Sapiro, Andrew Zisserman, Francis R. Bach
Abstract: It is now well established that sparse signal models are well suited for restoration tasks and can be effectively learned from audio, image, and video data. Recent research has been aimed at learning discriminative sparse models instead of purely reconstructive ones. This paper proposes a new step in that direction, with a novel sparse representation for signals belonging to different classes in terms of a shared dictionary and discriminative class models. The linear version of the proposed model admits a simple probabilistic interpretation, while its most general variant admits an interpretation in terms of kernels. An optimization framework for learning all the components of the proposed model is presented, along with experimental results on standard handwritten digit and texture classification tasks.
4 0.64558095 95 nips-2008-Grouping Contours Via a Related Image
Author: Praveen Srinivasan, Liming Wang, Jianbo Shi
Abstract: Contours have been established in the biological and computer vision literature as a compact yet descriptive representation of object shape. While individual contours provide structure, they lack the large spatial support of region segments (which lack internal structure). We present a method for further grouping of contours in an image using their relationship to the contours of a second, related image. Stereo, motion, and similarity all provide cues that can aid this task; contours that have similar transformations relating them to their matching contours in the second image likely belong to a single group. To find matches for contours, we rely only on shape, which applies directly to all three modalities without modification, in contrast to the specialized approaches developed for each independently. Visually salient contours are extracted in each image, along with a set of candidate transformations for aligning subsets of them. For each transformation, groups of contours with matching shape across the two images are identified to provide a context for evaluating matches of individual contour points across the images. The resulting contexts of contours are used to perform a final grouping on contours in the original image while simultaneously finding matches in the related image, again by shape matching. We demonstrate grouping results on image pairs consisting of stereo, motion, and similar images. Our method also produces qualitatively better results than a baseline method that does not use the inferred contexts.
5 0.63276803 246 nips-2008-Unsupervised Learning of Visual Sense Models for Polysemous Words
Author: Kate Saenko, Trevor Darrell
Abstract: Polysemy is a problem for methods that exploit image search engines to build object category models. Existing unsupervised approaches do not take word sense into consideration. We propose a new method that uses a dictionary to learn models of visual word sense from a large collection of unlabeled web data. The use of LDA to discover a latent sense space makes the model robust despite the very limited nature of dictionary definitions. The definitions are used to learn a distribution in the latent space that best represents a sense. The algorithm then uses the text surrounding image links to retrieve images with high probability of a particular dictionary sense. An object classifier is trained on the resulting sense-specific images. We evaluate our method on a dataset obtained by searching the web for polysemous words. Category classification experiments show that our dictionary-based approach outperforms baseline methods.
6 0.63071054 75 nips-2008-Estimating vector fields using sparse basis field expansions
7 0.61615801 207 nips-2008-Shape-Based Object Localization for Descriptive Classification
8 0.61314219 205 nips-2008-Semi-supervised Learning with Weakly-Related Unlabeled Data : Towards Better Text Categorization
9 0.61189425 116 nips-2008-Learning Hybrid Models for Image Annotation with Partially Labeled Data
10 0.61108601 245 nips-2008-Unlabeled data: Now it helps, now it doesn't
11 0.60936159 194 nips-2008-Regularized Learning with Networks of Features
12 0.60316753 79 nips-2008-Exploring Large Feature Spaces with Hierarchical Multiple Kernel Learning
13 0.60117638 193 nips-2008-Regularized Co-Clustering with Dual Supervision
14 0.59952104 42 nips-2008-Cascaded Classification Models: Combining Models for Holistic Scene Understanding
15 0.59835988 16 nips-2008-Adaptive Template Matching with Shift-Invariant Semi-NMF
16 0.59672755 120 nips-2008-Learning the Semantic Correlation: An Alternative Way to Gain from Unlabeled Text
17 0.59484458 63 nips-2008-Dimensionality Reduction for Data in Multiple Feature Representations
18 0.59353024 179 nips-2008-Phase transitions for high-dimensional joint support recovery
19 0.59086919 202 nips-2008-Robust Regression and Lasso
20 0.59084219 99 nips-2008-High-dimensional support union recovery in multivariate regression