nips nips2009 nips2009-58 knowledge-graph by maker-knowledge-mining

58 nips-2009-Constructing Topological Maps using Markov Random Fields and Loop-Closure Detection


Source: pdf

Author: Roy Anati, Kostas Daniilidis

Abstract: We present a system which constructs a topological map of an environment given a sequence of images. This system includes a novel image similarity score which uses dynamic programming to match images using both the appearance and relative positions of local features simultaneously. Additionally, an MRF is constructed to model the probability of loop-closures. A locally optimal labeling is found using Loopy-BP. Finally we outline a method to generate a topological map from loop closure data. Results, presented on four urban sequences and one indoor sequence, outperform the state of the art. 1

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract: We present a system which constructs a topological map of an environment given a sequence of images. [sent-3, score-0.481]

2 This system includes a novel image similarity score which uses dynamic programming to match images using both the appearance and relative positions of local features simultaneously. [sent-4, score-0.844]

3 Finally we outline a method to generate a topological map from loop closure data. [sent-7, score-0.627]

4 Results, presented on four urban sequences and one indoor sequence, outperform the state of the art. [sent-8, score-0.222]

5 1 Introduction The task of generating a topological map from video data has gained prominence in recent years. [sent-9, score-0.497]

6 Topological maps can correct for the drift in visual odometry systems and can be part of hybrid representations in which the environment is represented metrically at the local level but topologically at the global level. [sent-12, score-0.216]

7 We identify two challenges in constructing a topological map from video: how can we decide whether two images were taken from the same place, and how can we reduce the original set of thousands of video frames to a representative set of keyframes for path planning. [sent-13, score-1.03]

8 Video guarantees that keyframes will be reachable from each other, but it also provides temporal ordering constraints for deciding on loop closures. [sent-15, score-0.471]

9 The paper has three innovations: We define a novel image similarity score which uses dynamic programming to match images using both the appearance and the layout of the features in the environment. [sent-16, score-0.8]

10 Finally, we show how the temporal assumption can be used to generate compact topological maps using minimum dominating sets. [sent-18, score-0.563]

11 We formally define a topological map T as a graph T = (K, ET), where K is a set of keyframes and ET is a set of edges describing connectivity between keyframes. [sent-19, score-0.907]

12 We will see later that keyframes are representatives of locations. [sent-20, score-0.33]

13 We desire the following properties of T : Loop closure For any two locations i, j ∈ K, ET contains the edge (i, j) if and only if it is possible to reach location j from location i without passing through any other location k ∈ K. [sent-21, score-0.275]

14 Spatial distinctiveness Two images from “different locations” cannot be represented by the same keyframe. [sent-23, score-0.2]

15 Note that spatial distinctiveness requires that we distinguish between separate locations, whereas compactness encourages agglomeration of geographically similar images. [sent-24, score-0.195]

16 This distinction is important, as lack of compactness does not lead to errors in either path planning or visual odometry while breaking spatial distinctiveness does. [sent-25, score-0.379]

17 Our approach to building topological maps is divided into three modules: calculating image similarity, detecting loop closures, and map construction. [sent-26, score-0.83]

18 Starting with I, a sequence of n images, the result of calculating image similarity scores is a matrix Mn×n where Mij represents a relative similarity between images i and j. [sent-29, score-0.676]

19 In section 2 we describe how we use local image features to compute the matrix M . [sent-30, score-0.181]

20 To detect loop-closures we have to discretize M into a binary decision matrix Dn×n where Dij = 1 indicates that images i and j are geographically equivalent and form a loop closure. [sent-31, score-0.365]

21 In the final step, the topological map T is generated from D. [sent-33, score-0.44]

22 We calculate the set of keyframes K and their associated connectivity ET using the minimum dominating set of the graph represented by D (Section 4). [sent-34, score-0.58]

23 Related Work The state of the art in topological mapping of images is the FAB-MAP [8] algorithm. [sent-35, score-0.468]

24 Bayesian inference is also used in [1] where bags of words on local image descriptors model locations whose consistency is validated with epipolar geometry. [sent-38, score-0.317]

25 [14] incorporate both odometry and appearance and maintain several hypotheses of topological maps. [sent-40, score-0.498]

26 [17] define maps on two levels, creating global (topological) maps by matching independent local (metric) data and combining loop-closure detection with visual SLAM (Simultaneous Localization and Mapping). [sent-42, score-0.334]

27 Tomatis et al. [17] detect loop closures by examining the modality of the probability density function (PDF) of the robot's position. [sent-44, score-0.377]

28 Approaches like [3], [19], [18] and [9] represent the environment using only an image similarity matrix. [sent-46, score-0.296]

29 Booij et al. [3] use the similarity matrix to define a weighted graph for robot navigation. [sent-47, score-0.318]

30 Navigation is conducted on a node by node basis, using new observations and epipolar geometry to estimate the direction of the next node. [sent-48, score-0.211]

31 Valgren et al. [19] avoid exhaustively computing the similarity matrix by searching for and sampling cells that are more likely to describe existing loop-closures. [sent-49, score-0.208]

32 Fraundorfer et al. [9] use hierarchical vocabulary trees [13] to quickly compute image similarity scores. [sent-51, score-0.34]

33 Our approach advances the state of the art by using a powerful image alignment score without employing full epipolar geometry, and by more robust loop-closure detection obtained by applying MRF inference to the similarity matrix. [sent-58, score-0.8]

34 It is, together with [4], the only video-based approach that provides a greatly reduced set of nodes for the final topological representation, thus making path planning tractable. [sent-59, score-0.473]

35 2 Image similarity score For any two images i and j, we calculate the similarity score Mij in three steps: generate image features, sort image features into sequences, and calculate the optimal alignment between both sequences. [sent-60, score-1.188]

36 To detect and generate image features we use Scale Invariant Feature Transform (SIFT) [12]. [sent-61, score-0.235]

37 However, to mitigate perceptual aliasing, we take advantage of the fact that features represent real world structures with fixed spatial arrangements and therefore the similarity score should take their relative positions into account. [sent-64, score-0.433]

38 Instead, we make the assumption that the gravity vector is known so that we can split image position into bearing and elevation and we take into account only the bearing of each feature. [sent-67, score-0.3]

39 We then search for an optimal alignment between pairs of sequences, incorporating both the value and ordering of SIFT features into our similarity score. [sent-69, score-0.355]
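The sorting step described above can be sketched in Python. This is an illustrative sketch under an assumed feature representation: each feature is a hypothetical tuple (x, y, z, descriptor) in a gravity-aligned camera frame, so the bearing reduces to the azimuth of the feature direction (the paper itself obtains bearing from the Ladybug's spherical coordinates).

```python
import math

def sort_by_bearing(features):
    """Sort features cyclically by bearing (azimuth), assuming the
    gravity vector is known so that image position splits into
    bearing and elevation. Each feature is a hypothetical tuple
    (x, y, z, descriptor) in a gravity-aligned frame."""
    def bearing(f):
        x, y, _, _ = f
        return math.atan2(y, x) % (2 * math.pi)  # azimuth in [0, 2*pi)
    return sorted(features, key=bearing)
```

The resulting cyclic order is what the alignment step below consumes; note that the absolute starting azimuth is arbitrary, which is exactly why the alignment must be cyclic.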

40 Sequence alignment To solve for the optimal alignment between two ordered sequences of features we employ dynamic programming. [sent-70, score-0.508]

41 Since bearing is not given with respect to an absolute orientation, ordering is meant only cyclically, which can be handled easily in dynamic programming by replicating one of the input sequences. [sent-74, score-0.207]

42 Modifying the first and last rows of the score matrix to allow for arbitrary start and end locations yields the optimal cyclical alignment in most cases. [sent-75, score-0.4]

43 The score of the optimal alignment between both sequences of features provides the basis for the similarity score between two images and the entries of the matrix M . [sent-77, score-0.867]
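The alignment just described can be sketched with a standard edit-distance-style dynamic program. The scores and the equality-based match test here are illustrative assumptions (the paper matches SIFT descriptors, and its exact score values are parameters); the cyclic ordering is handled, as in the text, by replicating one sequence and allowing free start and end positions in the replicated copy.

```python
def align_score(a, b, s_match=1.0, s_gap=-0.4, s_miss=-0.6, match=None):
    """Optimal alignment score between feature sequences a and b.
    Cyclic ordering is handled by replicating b and allowing a free
    start (first row stays 0) and free end (max over the last row).
    `match(x, y)` decides whether two features match; equality is a
    stand-in for a SIFT descriptor comparison."""
    if match is None:
        match = lambda x, y: x == y
    b2 = b + b  # replicate one input sequence for cyclic alignment
    n, m = len(a), len(b2)
    # dp[i][j]: best score aligning a[:i] against a chunk of b2[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + s_gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = s_match if match(a[i - 1], b2[j - 1]) else s_miss
            dp[i][j] = max(dp[i - 1][j - 1] + sub,   # match / substitution
                           dp[i - 1][j] + s_gap,     # skip a feature of a
                           dp[i][j - 1] + s_gap)     # skip a feature of b
    return max(dp[n])  # free end position in the replicated sequence
```

For example, `align_score("abc", "cab")` recovers the full match score of aligning `"abc"` against itself, since `"cab"` is a cyclic rotation.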

44 3 Loop-closure detection using MRF Given the image similarity matrix M, we use Markov Random Fields to detect loop-closures. [sent-80, score-0.35]

45 H is defined as an n × n lattice of binary nodes, where node vi,j represents the probability of images i and j forming a loop-closure. [sent-81, score-0.401]

46 Loop closures in the score matrix M appear as one of three possible shapes. [sent-84, score-0.221]

47 In an intersection the score matrix contains an ellipse. [sent-85, score-0.188]

48 Therefore we define the lattice H with eight-way connectivity, as it better captures the structure of possible loop closures. [sent-89, score-0.218]

49 As adjacent nodes in H represent sequential images in the sequence, we expect significant overlap in their content. [sent-90, score-0.183]

50 Sudden changes occur either when a loop has just closed (a sudden increase), when a loop closure is complete (a sudden decrease), or as noise caused by a sudden occlusion in one of the scenes. [sent-92, score-0.404]

51 By imposing smoothness on the labeling we capture loop closures while discarding noise. [sent-93, score-0.274]
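As a toy illustration of how a smoothness prior discards isolated noise while keeping diagonal loop-closure stripes, here is a sketch that substitutes ICM with simple Potts potentials for the paper's Loopy-BP; the potentials, weights, and threshold are hand-picked assumptions, not the authors' model.

```python
import numpy as np

def smooth_labels(M, thresh=0.5, unary_w=2.0, beta=0.25, iters=3):
    """Toy stand-in for the paper's Loopy-BP step: start from a
    thresholded labeling and run ICM on an 8-connected lattice.
    The unary term ties each label to the similarity evidence
    M[i, j]; the pairwise Potts term (weight beta, an assumed
    value) rewards agreement with the 8 neighbors, so isolated
    noisy cells are flipped off while stripes survive."""
    D = (M > thresh).astype(int)
    n = M.shape[0]
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
            (0, 1), (1, -1), (1, 0), (1, 1)]
    for _ in range(iters):
        for i in range(n):
            for j in range(n):
                best_lab, best_e = 0, None
                for lab in (0, 1):
                    e = unary_w * abs(lab - M[i, j])  # data term
                    for di, dj in offs:               # smoothness term
                        a, b = i + di, j + dj
                        if 0 <= a < n and 0 <= b < n and lab != D[a, b]:
                            e += beta
                    if best_e is None or e < best_e:
                        best_lab, best_e = lab, e
                D[i, j] = best_lab
    return D
```

With these weights, an isolated high-similarity cell in the interior is removed as noise, while a diagonal stripe of high-similarity cells (the signature of a loop closure) is kept.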

52 We model every node and every edge in H as a node in the cluster graph C. [sent-100, score-0.228]

53 We initialize the cluster graph directly from the lattice H with ψi,j = φi,j for nodes and ψi,j,k,l = φi,j,k,l for edges. [sent-104, score-0.186]

54 The MAP labeling found here defines our matrix D determining whether two images i and j close a loop. [sent-105, score-0.179]

55 4 Constructing the topological map Finally the decision matrix D is used to define keyframes K and determine the map connectivity ET . [sent-109, score-0.913]

56 Since there is no guarantee that the D found through belief propagation is symmetric, we initially treat D as an adjacency matrix for a directed graph, and then remove the direction from all edges, resulting in a symmetric graph D′ = D ∨ Dᵀ. [sent-111, score-0.219]
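In NumPy, treating D as a boolean adjacency matrix, this symmetrization is a one-liner (illustrative 3-node example):

```python
import numpy as np

# D may be asymmetric after inference; treat it as a directed
# adjacency matrix and drop edge directions via D OR D-transpose.
D = np.array([[0, 1, 0],
              [0, 0, 0],
              [1, 0, 0]], dtype=bool)
D_sym = D | D.T
assert (D_sym == D_sym.T).all()  # undirected: symmetric by construction
```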

57 It is possible to use the graph defined by D′ as a topological map. [sent-112, score-0.394]

58 We find the keyframes K by finding the minimum dominating set of D′. [sent-117, score-0.483]

59 The dominating set itself serves as our keyframes K. [sent-120, score-0.483]

60 Each dominating node k ∈ K is also associated with the set of nodes it dominates Nk . [sent-121, score-0.271]

61 Algorithm 1: Approximate Minimum Dominating Set
   Input: adjacency matrix D′
   Output: K, {Nk : k ∈ K}
   K ← ∅
   while D′ is not empty do
       k ← node with largest degree
       K ← K ∪ {k}
       Nk ← {k} ∪ Nb(k)
       remove all nodes in Nk from D′
   end
   5 Experiments The system was applied to five image sequences. [sent-128, score-0.25]
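The greedy procedure of Algorithm 1 can be sketched as runnable Python; the dict-of-sets adjacency representation is an assumption, and ties in degree are broken arbitrarily.

```python
def min_dominating_set(adj):
    """Greedy approximation of the minimum dominating set
    (Algorithm 1). `adj` maps each node to the set of its
    neighbors in the symmetric graph. Repeatedly pick the
    highest-degree remaining node as a keyframe, then remove
    it together with all the nodes it dominates."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    K, dominated = [], {}
    while adj:
        k = max(adj, key=lambda v: len(adj[v]))  # node with largest degree
        N_k = {k} | adj[k]                       # k and its neighbors
        K.append(k)
        dominated[k] = N_k
        for v in N_k:                            # drop dominated nodes
            adj.pop(v, None)
        for u in adj:                            # prune dangling edges
            adj[u] -= N_k
    return K, dominated
```

Each keyframe k ∈ K then represents its dominated set Nk of geographically equivalent images, matching the construction of the topological map T.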

62 Results are shown for the system as described, as well as for FAB-MAP ([8]) and for different methods of calculating image similarity scores. [sent-129, score-0.342]

63 The Ladybug is composed of five wide-angle lens cameras arranged in a circle around the base and one camera on top facing upwards. [sent-131, score-0.242]

64 The resulting output is a sequence of frames each containing a set of images captured by the six cameras. [sent-132, score-0.247]

65 For the outdoor sequences the camera was mounted on top of a vehicle which was driven around an urban setting, in this case the cities of Philadelphia and Pittsburgh. [sent-133, score-0.491]

66 In the indoor sequence, the camera was mounted on a tripod set on a cart and moved inside the building covering the ground and 1st floors. [sent-134, score-0.286]

67 Ladybug images were processed independently for each camera using the SIFT detector and extractor provided in the VLFeat toolbox [20]. [sent-135, score-0.25]

68 The resulting features for every camera were merged into a single set and sorted by their spherical coordinates. [sent-136, score-0.239]

69 The two remaining sequences, City Centre and New College were captured in an outdoor setting by Cummins [7] from a limited field of view camera mounted on a mobile robot. [sent-137, score-0.285]

70 All the outdoor sequences were provided with GPS location of the vehicle / robot. [sent-139, score-0.288]

71 Table 1: Summary of image sequences processed. [sent-144, score-0.87]
   No. of frames: 852 / 1,266 / 1,256 / 1,237 / 1,073
   Camera type: spherical / spherical / spherical / limited field of view / limited field of view
   Format: raw Ladybug stream file / raw Ladybug stream file / rectified images / standard images / standard images

72 For the indoor sequence the position of the camera was manually determined using building schematics at an arbitrary scale. [sent-147, score-0.262]

73 Parameters Both the image similarity scores and the MRF contain a number of parameters that need to be set. [sent-150, score-0.296]

74 When calculating the image similarity score, there are five parameters. [sent-151, score-0.342]

75 In addition, dynamic programming requires three parameters to define the score of an optimal alignment: smatch, sgap, smiss. [sent-153, score-0.332]

76 smatch is the value by which the score of an alignment is improved by including correctly matched pairs of features. [sent-154, score-0.351]

77 sgap is the cost of ignoring a feature in the optimal alignment (insertion and deletion), and smiss is the cost of including incorrectly matched pairs (substitution). [sent-155, score-0.236]

78 Finally we use w = 30 as our window size, to avoid calculating similarity scores for images taken within a very short time of each other. (Footnote 1: The Pittsburgh dataset has been provided by Google for research purposes.) [sent-158, score-0.339]

79 Results In addition to the image similarity score defined above, we also processed the image sequences using alternative similarity measures. [sent-177, score-0.837]

80 We show results for Mij^SIFT = number of SIFT matches and Mij^REC = number of reciprocal SIFT matches (the intersection of matches from image i to image j and from j to i). [sent-178, score-0.314]

81 To process spherical images using FAB-MAP we limited ourselves to using images captured by camera 0 (Directly forwards / backwards). [sent-180, score-0.488]

82 Figure 2 shows precision-recall curves for all sequences and similarity measures. [sent-181, score-0.271]

83 Table 2 shows the results of performing inference on the image similarity matrices. [sent-185, score-0.296]

84 Finally figure 3 shows the topological map resulting from running dominating sets on the decision matrices D. [sent-186, score-0.593]

85 The blue dots represent the locations of the keyframes K with the edges ET drawn in blue. [sent-188, score-0.496]

86 6 Outlook We presented a system that constructs purely topological maps from video sequences captured from moving vehicles. [sent-191, score-0.614]

87 A highly accurate image similarity score is found by a cyclical alignment of sorted feature sequences. [sent-193, score-0.638]

88 This score matrix is then refined via loopy belief propagation to detect loop-closures. [sent-194, score-0.235]

89 Finally we constructed a topological map for the sequence in question. [sent-195, score-0.481]

90 This map can be used for either path planning or for bundle adjustment in visual SLAM systems. [sent-196, score-0.232]

91 The bottleneck of the system is computing the image similarity score. [sent-197, score-0.296]

92 In addition to implementing score calculation with a parallel algorithm (either on a multicore machine or using graphics hardware), we plan to construct approximations to our image similarity score. [sent-199, score-0.434]

93 These include using visual bags of words in a hierarchical fashion [13] and building the score matrix M incrementally [19, 18]. [sent-200, score-0.233]

94 Figure 2: Precision-recall curves for different thresholds on image similarity scores; panels include (d) City Centre and (e) New College. [sent-272, score-0.296]

95 Blue dots represent positions of keyframes K with edges ET drawn in blue. [sent-274, score-0.482]

96 Blue dots represent positions of keyframes K with edges ET drawn in blue. [sent-277, score-0.482]

97 Pruning the image set for appearance based robot localization. [sent-303, score-0.252]

98 Monocular visual odometry in urban environments using an omnidirectional camera. [sent-381, score-0.301]

99 Hybrid simultaneous localization and map building: a natural integration of topological and metric. [sent-388, score-0.44]

100 Incremental spectral clustering and its application to topological mapping. [sent-395, score-0.339]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('topological', 0.339), ('keyframes', 0.33), ('sift', 0.248), ('indoors', 0.165), ('similarity', 0.164), ('dominating', 0.153), ('alignment', 0.142), ('loop', 0.141), ('score', 0.138), ('philadelphia', 0.136), ('mij', 0.132), ('image', 0.132), ('images', 0.129), ('robotics', 0.124), ('camera', 0.121), ('fab', 0.118), ('ladybug', 0.118), ('sequences', 0.107), ('city', 0.102), ('pittsburgh', 0.102), ('map', 0.101), ('booij', 0.094), ('odometry', 0.094), ('omnidirectional', 0.094), ('college', 0.093), ('centre', 0.087), ('nk', 0.085), ('bearing', 0.084), ('mrf', 0.083), ('compactness', 0.083), ('closures', 0.083), ('epipolar', 0.083), ('precision', 0.081), ('lattice', 0.077), ('vehicle', 0.077), ('sudden', 0.076), ('maps', 0.071), ('traversal', 0.071), ('cummins', 0.071), ('distinctiveness', 0.071), ('smatch', 0.071), ('tmatch', 0.071), ('tomatis', 0.071), ('valgren', 0.071), ('spherical', 0.069), ('dynamic', 0.068), ('dots', 0.068), ('appearance', 0.065), ('node', 0.064), ('cyclical', 0.062), ('mounted', 0.062), ('outdoor', 0.062), ('urban', 0.062), ('recall', 0.062), ('locations', 0.058), ('automation', 0.057), ('video', 0.057), ('robot', 0.055), ('programming', 0.055), ('graph', 0.055), ('nodes', 0.054), ('detect', 0.054), ('indoor', 0.053), ('visual', 0.051), ('atlas', 0.05), ('ground', 0.05), ('labeling', 0.05), ('intersection', 0.05), ('features', 0.049), ('ranganathan', 0.047), ('schematics', 0.047), ('sgap', 0.047), ('slam', 0.047), ('smiss', 0.047), ('vlfeat', 0.047), ('zivkovic', 0.047), ('closure', 0.046), ('calculating', 0.046), ('truth', 0.045), ('edge', 0.045), ('bags', 0.044), ('positions', 0.044), ('et', 0.044), ('propagation', 0.043), ('planning', 0.043), ('location', 0.042), ('connectivity', 0.042), ('dmax', 0.041), ('fb', 0.041), ('geographically', 0.041), ('icra', 0.041), ('sequence', 0.041), ('symmetric', 0.041), ('adjacency', 0.04), ('captured', 0.04), ('edges', 0.04), ('aliasing', 0.038), ('mitigate', 0.038), ('frames', 0.037), ('path', 0.037)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999946 58 nips-2009-Constructing Topological Maps using Markov Random Fields and Loop-Closure Detection

Author: Roy Anati, Kostas Daniilidis

Abstract: We present a system which constructs a topological map of an environment given a sequence of images. This system includes a novel image similarity score which uses dynamic programming to match images using both the appearance and relative positions of local features simultaneously. Additionally, an MRF is constructed to model the probability of loop-closures. A locally optimal labeling is found using Loopy-BP. Finally we outline a method to generate a topological map from loop closure data. Results, presented on four urban sequences and one indoor sequence, outperform the state of the art. 1

2 0.14234369 211 nips-2009-Segmenting Scenes by Matching Image Composites

Author: Bryan Russell, Alyosha Efros, Josef Sivic, Bill Freeman, Andrew Zisserman

Abstract: In this paper, we investigate how, given an image, similar images sharing the same global description can help with unsupervised scene segmentation. In contrast to recent work in semantic alignment of scenes, we allow an input image to be explained by partial matches of similar scenes. This allows for a better explanation of the input scenes. We perform MRF-based segmentation that optimizes over matches, while respecting boundary information. The recovered segments are then used to re-query a large database of images to retrieve better matches for the target regions. We show improved performance in detecting the principal occluding and contact boundaries for the scene over previous methods on data gathered from the LabelMe database.

3 0.12868318 133 nips-2009-Learning models of object structure

Author: Joseph Schlecht, Kobus Barnard

Abstract: We present an approach for learning stochastic geometric models of object categories from single view images. We focus here on models expressible as a spatially contiguous assemblage of blocks. Model topologies are learned across groups of images, and one or more such topologies is linked to an object category (e.g. chairs). Fitting learned topologies to an image can be used to identify the object class, as well as detail its geometry. The latter goes beyond labeling objects, as it provides the geometric structure of particular instances. We learn the models using joint statistical inference over category parameters, camera parameters, and instance parameters. These produce an image likelihood through a statistical imaging model. We use trans-dimensional sampling to explore topology hypotheses, and alternate between Metropolis-Hastings and stochastic dynamics to explore instance parameters. Experiments on images of furniture objects such as tables and chairs suggest that this is an effective approach for learning models that encode simple representations of category geometry and the statistics thereof, and support inferring both category and geometry on held out single view images. 1

4 0.097133353 141 nips-2009-Local Rules for Global MAP: When Do They Work ?

Author: Kyomin Jung, Pushmeet Kohli, Devavrat Shah

Abstract: We consider the question of computing Maximum A Posteriori (MAP) assignment in an arbitrary pair-wise Markov Random Field (MRF). We present a randomized iterative algorithm based on simple local updates. The algorithm, starting with an arbitrary initial assignment, updates it in each iteration by first, picking a random node, then selecting an (appropriately chosen) random local neighborhood and optimizing over this local neighborhood. Somewhat surprisingly, we show that this algorithm finds a near optimal assignment within n log 2 n iterations with high probability for any n node pair-wise MRF with geometry (i.e. MRF graph with polynomial growth) with the approximation error depending on (in a reasonable manner) the geometric growth rate of the graph and the average radius of the local neighborhood – this allows for a graceful tradeoff between the complexity of the algorithm and the approximation error. Through extensive simulations, we show that our algorithm finds extremely good approximate solutions for various kinds of MRFs with geometry.

5 0.088255435 137 nips-2009-Learning transport operators for image manifolds

Author: Benjamin Culpepper, Bruno A. Olshausen

Abstract: We describe an unsupervised manifold learning algorithm that represents a surface through a compact description of operators that traverse it. The operators are based on matrix exponentials, which are the solution to a system of first-order linear differential equations. The matrix exponents are represented by a basis that is adapted to the statistics of the data so that the infinitesimal generator for a trajectory along the underlying manifold can be produced by linearly composing a few elements. The method is applied to recover topological structure from low dimensional synthetic data, and to model local structure in how natural images change over time and scale. 1

6 0.08467181 43 nips-2009-Bayesian estimation of orientation preference maps

7 0.082607225 251 nips-2009-Unsupervised Detection of Regions of Interest Using Iterative Link Analysis

8 0.082414202 201 nips-2009-Region-based Segmentation and Object Detection

9 0.081243753 6 nips-2009-A Biologically Plausible Model for Rapid Natural Scene Identification

10 0.079046808 30 nips-2009-An Integer Projected Fixed Point Method for Graph Matching and MAP Inference

11 0.079018325 96 nips-2009-Filtering Abstract Senses From Image Search Results

12 0.078270711 77 nips-2009-Efficient Match Kernel between Sets of Features for Visual Recognition

13 0.077810183 131 nips-2009-Learning from Neighboring Strokes: Combining Appearance and Context for Multi-Domain Sketch Recognition

14 0.077194102 124 nips-2009-Lattice Regression

15 0.077118866 250 nips-2009-Training Factor Graphs with Reinforcement Learning for Efficient MAP Inference

16 0.076734774 70 nips-2009-Discriminative Network Models of Schizophrenia

17 0.07391572 5 nips-2009-A Bayesian Model for Simultaneous Image Clustering, Annotation and Object Segmentation

18 0.072496012 228 nips-2009-Speeding up Magnetic Resonance Image Acquisition by Bayesian Multi-Slice Adaptive Compressed Sensing

19 0.071902968 224 nips-2009-Sparse and Locally Constant Gaussian Graphical Models

20 0.07149417 97 nips-2009-Free energy score space


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.217), (1, -0.106), (2, -0.114), (3, -0.032), (4, -0.058), (5, 0.114), (6, 0.03), (7, 0.038), (8, 0.05), (9, -0.09), (10, -0.04), (11, 0.046), (12, 0.062), (13, 0.064), (14, -0.052), (15, -0.024), (16, -0.022), (17, -0.023), (18, -0.0), (19, -0.096), (20, 0.021), (21, 0.014), (22, 0.051), (23, -0.015), (24, 0.034), (25, 0.061), (26, 0.047), (27, -0.018), (28, -0.026), (29, 0.019), (30, 0.029), (31, 0.139), (32, -0.031), (33, 0.016), (34, -0.05), (35, 0.039), (36, 0.108), (37, -0.03), (38, -0.019), (39, 0.06), (40, -0.052), (41, 0.043), (42, -0.056), (43, 0.041), (44, -0.109), (45, 0.113), (46, -0.066), (47, -0.09), (48, 0.066), (49, -0.084)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96211898 58 nips-2009-Constructing Topological Maps using Markov Random Fields and Loop-Closure Detection

Author: Roy Anati, Kostas Daniilidis

Abstract: We present a system which constructs a topological map of an environment given a sequence of images. This system includes a novel image similarity score which uses dynamic programming to match images using both the appearance and relative positions of local features simultaneously. Additionally, an MRF is constructed to model the probability of loop-closures. A locally optimal labeling is found using Loopy-BP. Finally we outline a method to generate a topological map from loop closure data. Results, presented on four urban sequences and one indoor sequence, outperform the state of the art. 1

2 0.66129839 211 nips-2009-Segmenting Scenes by Matching Image Composites

Author: Bryan Russell, Alyosha Efros, Josef Sivic, Bill Freeman, Andrew Zisserman

Abstract: In this paper, we investigate how, given an image, similar images sharing the same global description can help with unsupervised scene segmentation. In contrast to recent work in semantic alignment of scenes, we allow an input image to be explained by partial matches of similar scenes. This allows for a better explanation of the input scenes. We perform MRF-based segmentation that optimizes over matches, while respecting boundary information. The recovered segments are then used to re-query a large database of images to retrieve better matches for the target regions. We show improved performance in detecting the principal occluding and contact boundaries for the scene over previous methods on data gathered from the LabelMe database.

3 0.61013258 172 nips-2009-Nonparametric Bayesian Texture Learning and Synthesis

Author: Long Zhu, Yuanahao Chen, Bill Freeman, Antonio Torralba

Abstract: We present a nonparametric Bayesian method for texture learning and synthesis. A texture image is represented by a 2D Hidden Markov Model (2DHMM) where the hidden states correspond to the cluster labeling of textons and the transition matrix encodes their spatial layout (the compatibility between adjacent textons). The 2DHMM is coupled with the Hierarchical Dirichlet process (HDP) which allows the number of textons and the complexity of transition matrix grow as the input texture becomes irregular. The HDP makes use of Dirichlet process prior which favors regular textures by penalizing the model complexity. This framework (HDP-2DHMM) learns the texton vocabulary and their spatial layout jointly and automatically. The HDP-2DHMM results in a compact representation of textures which allows fast texture synthesis with comparable rendering quality over the state-of-the-art patch-based rendering methods. We also show that the HDP2DHMM can be applied to perform image segmentation and synthesis. The preliminary results suggest that HDP-2DHMM is generally useful for further applications in low-level vision problems. 1

4 0.60365558 30 nips-2009-An Integer Projected Fixed Point Method for Graph Matching and MAP Inference

Author: Marius Leordeanu, Martial Hebert, Rahul Sukthankar

Abstract: Graph matching and MAP inference are essential problems in computer vision and machine learning. We introduce a novel algorithm that can accommodate both problems and solve them efficiently. Recent graph matching algorithms are based on a general quadratic programming formulation, which takes in consideration both unary and second-order terms reflecting the similarities in local appearance as well as in the pairwise geometric relationships between the matched features. This problem is NP-hard, therefore most algorithms find approximate solutions by relaxing the original problem. They find the optimal continuous solution of the modified problem, ignoring during optimization the original discrete constraints. Then the continuous solution is quickly binarized at the end, but very little attention is put into this final discretization step. In this paper we argue that the stage in which a discrete solution is found is crucial for good performance. We propose an efficient algorithm, with climbing and convergence properties, that optimizes in the discrete domain the quadratic score, and it gives excellent results either by itself or by starting from the solution returned by any graph matching algorithm. In practice it outperforms state-or-the art graph matching algorithms and it also significantly improves their performance if used in combination. When applied to MAP inference, the algorithm is a parallel extension of Iterated Conditional Modes (ICM) with climbing and convergence properties that make it a compelling alternative to the sequential ICM. In our experiments on MAP inference our algorithm proved its effectiveness by significantly outperforming [13], ICM and Max-Product Belief Propagation. 1

5 0.59688807 93 nips-2009-Fast Image Deconvolution using Hyper-Laplacian Priors

Author: Dilip Krishnan, Rob Fergus

Abstract: The heavy-tailed distribution of gradients in natural scenes have proven effective priors for a range of problems such as denoising, deblurring and super-resolution. α These distributions are well modeled by a hyper-Laplacian p(x) ∝ e−k|x| , typically with 0.5 ≤ α ≤ 0.8. However, the use of sparse distributions makes the problem non-convex and impractically slow to solve for multi-megapixel images. In this paper we describe a deconvolution approach that is several orders of magnitude faster than existing techniques that use hyper-Laplacian priors. We adopt an alternating minimization scheme where one of the two phases is a non-convex problem that is separable over pixels. This per-pixel sub-problem may be solved with a lookup table (LUT). Alternatively, for two specific values of α, 1/2 and 2/3 an analytic solution can be found, by finding the roots of a cubic and quartic polynomial, respectively. Our approach (using either LUTs or analytic formulae) is able to deconvolve a 1 megapixel image in less than ∼3 seconds, achieving comparable quality to existing methods such as iteratively reweighted least squares (IRLS) that take ∼20 minutes. Furthermore, our method is quite general and can easily be extended to related image processing problems, beyond the deconvolution application demonstrated. 1

6 0.56516689 6 nips-2009-A Biologically Plausible Model for Rapid Natural Scene Identification

7 0.56430829 131 nips-2009-Learning from Neighboring Strokes: Combining Appearance and Context for Multi-Domain Sketch Recognition

8 0.55898309 32 nips-2009-An Online Algorithm for Large Scale Image Similarity Learning

9 0.55724502 137 nips-2009-Learning transport operators for image manifolds

10 0.55389047 258 nips-2009-Whose Vote Should Count More: Optimal Integration of Labels from Labelers of Unknown Expertise

11 0.53567034 149 nips-2009-Maximin affinity learning of image segmentation

12 0.51271617 124 nips-2009-Lattice Regression

13 0.50113672 5 nips-2009-A Bayesian Model for Simultaneous Image Clustering, Annotation and Object Segmentation

14 0.50053388 133 nips-2009-Learning models of object structure

15 0.48969901 83 nips-2009-Estimating image bases for visual image reconstruction from human brain activity

16 0.4894869 141 nips-2009-Local Rules for Global MAP: When Do They Work ?

17 0.48800448 188 nips-2009-Perceptual Multistability as Markov Chain Monte Carlo Inference

18 0.46620935 228 nips-2009-Speeding up Magnetic Resonance Image Acquisition by Bayesian Multi-Slice Adaptive Compressed Sensing

19 0.4660868 10 nips-2009-A Gaussian Tree Approximation for Integer Least-Squares

20 0.46001551 28 nips-2009-An Additive Latent Feature Model for Transparent Object Recognition


similar papers computed by LDA model

LDA topic weights for this paper:

topicId topicWeight

[(21, 0.015), (24, 0.019), (25, 0.075), (35, 0.087), (36, 0.106), (39, 0.088), (58, 0.069), (61, 0.011), (71, 0.05), (81, 0.021), (82, 0.289), (86, 0.083), (91, 0.011)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.82912368 58 nips-2009-Constructing Topological Maps using Markov Random Fields and Loop-Closure Detection

Author: Roy Anati, Kostas Daniilidis

Abstract: We present a system which constructs a topological map of an environment given a sequence of images. This system includes a novel image similarity score which uses dynamic programming to match images using both the appearance and relative positions of local features simultaneously. Additionally, an MRF is constructed to model the probability of loop-closures. A locally optimal labeling is found using Loopy-BP. Finally we outline a method to generate a topological map from loop-closure data. Results, presented on four urban sequences and one indoor sequence, outperform the state of the art.

2 0.69526672 150 nips-2009-Maximum likelihood trajectories for continuous-time Markov chains

Author: Theodore J. Perkins

Abstract: Continuous-time Markov chains are used to model systems in which transitions between states as well as the time the system spends in each state are random. Many computational problems related to such chains have been solved, including determining state distributions as a function of time, parameter estimation, and control. However, the problem of inferring most likely trajectories, where a trajectory is a sequence of states as well as the amount of time spent in each state, appears unsolved. We study three versions of this problem: (i) an initial value problem, in which an initial state is given and we seek the most likely trajectory until a given final time, (ii) a boundary value problem, in which initial and final states and times are given, and we seek the most likely trajectory connecting them, and (iii) trajectory inference under partial observability, analogous to finding maximum likelihood trajectories for hidden Markov models. We show that maximum likelihood trajectories are not always well-defined, and describe a polynomial time test for well-definedness. When well-definedness holds, we show that each of the three problems can be solved in polynomial time, and we develop efficient dynamic programming algorithms for doing so.
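The objective maximized by the dynamic programs mentioned above is the likelihood of a full trajectory (states plus holding times). A minimal sketch of that quantity, assuming a standard rate-matrix parameterization (our own notation, not the paper's code):

```python
import numpy as np

def trajectory_loglik(Q, states, durations):
    # Log-density of a continuous-time Markov chain trajectory:
    # stay in states[k] for durations[k], then jump to states[k+1].
    # Q[i, j] (i != j) is the i -> j rate; the exit rate of i is -Q[i, i].
    ll = 0.0
    for k, (s, t) in enumerate(zip(states, durations)):
        exit_rate = -Q[s, s]
        ll += -exit_rate * t                   # survive in s for time t
        if k + 1 < len(states):
            ll += np.log(Q[s, states[k + 1]])  # density of the jump
    return ll
```

The final state contributes only a survival term (no jump), matching the initial value problem in which the trajectory is observed up to a fixed final time.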

3 0.59235156 250 nips-2009-Training Factor Graphs with Reinforcement Learning for Efficient MAP Inference

Author: Khashayar Rohanimanesh, Sameer Singh, Andrew McCallum, Michael J. Black

Abstract: Large, relational factor graphs with structure defined by first-order logic or other languages give rise to notoriously difficult inference problems. Because unrolling the structure necessary to represent distributions over all hypotheses has exponential blow-up, solutions are often derived from MCMC. However, because of limitations in the design and parameterization of the jump function, these sampling-based methods suffer from local minima—the system must transition through lower-scoring configurations before arriving at a better MAP solution. This paper presents a new method of explicitly selecting fruitful downward jumps by leveraging reinforcement learning (RL). Rather than setting parameters to maximize the likelihood of the training data, parameters of the factor graph are treated as a log-linear function approximator and learned with methods of temporal difference (TD); MAP inference is performed by executing the resulting policy on held out test data. Our method allows efficient gradient updates since only factors in the neighborhood of variables affected by an action need to be computed—we bypass the need to compute marginals entirely. Our method yields dramatic empirical success, producing new state-of-the-art results on a complex joint model of ontology alignment, with a 48% reduction in error over state-of-the-art in that domain.

4 0.57332981 155 nips-2009-Modelling Relational Data using Bayesian Clustered Tensor Factorization

Author: Ilya Sutskever, Joshua B. Tenenbaum, Ruslan Salakhutdinov

Abstract: We consider the problem of learning probabilistic models for complex relational structures between various types of objects. A model can help us “understand” a dataset of relational facts in at least two ways, by finding interpretable structure in the data, and by supporting predictions, or inferences about whether particular unobserved relations are likely to be true. Often there is a tradeoff between these two aims: cluster-based models yield more easily interpretable representations, while factorization-based approaches have given better predictive performance on large data sets. We introduce the Bayesian Clustered Tensor Factorization (BCTF) model, which embeds a factorized representation of relations in a nonparametric Bayesian clustering framework. Inference is fully Bayesian but scales well to large data sets. The model simultaneously discovers interpretable clusters and yields predictive performance that matches or beats previous probabilistic models for relational data.

5 0.56848097 162 nips-2009-Neural Implementation of Hierarchical Bayesian Inference by Importance Sampling

Author: Lei Shi, Thomas L. Griffiths

Abstract: The goal of perception is to infer the hidden states in the hierarchical process by which sensory data are generated. Human behavior is consistent with the optimal statistical solution to this problem in many tasks, including cue combination and orientation detection. Understanding the neural mechanisms underlying this behavior is of particular importance, since probabilistic computations are notoriously challenging. Here we propose a simple mechanism for Bayesian inference which involves averaging over a few feature detection neurons which fire at a rate determined by their similarity to a sensory stimulus. This mechanism is based on a Monte Carlo method known as importance sampling, commonly used in computer science and statistics. Moreover, a simple extension to recursive importance sampling can be used to perform hierarchical Bayesian inference. We identify a scheme for implementing importance sampling with spiking neurons, and show that this scheme can account for human behavior in cue combination and the oblique effect.
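The averaging scheme this abstract attributes to neural populations rests on the self-normalized importance-sampling identity: with the prior as proposal, E[h | x] ≈ Σ_i w_i h_i where w_i ∝ p(x | h_i). A minimal sketch of that identity (our own names; not the paper's neural implementation):

```python
import numpy as np

def importance_posterior_mean(likelihood, prior_samples):
    # Self-normalized importance sampling with the prior as proposal.
    # Each sample ("feature detector") h_i contributes in proportion
    # to its likelihood w_i = p(x | h_i), then weights are normalized.
    h = np.asarray(prior_samples)
    w = likelihood(h)           # unnormalized weights (vectorized)
    w = w / w.sum()
    return float(np.sum(w * h))
```

For a standard Gaussian prior and a unit-variance Gaussian likelihood centered at an observation x, the estimate converges to the exact posterior mean x/2, which makes the sketch easy to sanity-check.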

6 0.56834513 158 nips-2009-Multi-Label Prediction via Sparse Infinite CCA

7 0.56779075 188 nips-2009-Perceptual Multistability as Markov Chain Monte Carlo Inference

8 0.56731933 131 nips-2009-Learning from Neighboring Strokes: Combining Appearance and Context for Multi-Domain Sketch Recognition

9 0.56725711 174 nips-2009-Nonparametric Latent Feature Models for Link Prediction

10 0.56569254 28 nips-2009-An Additive Latent Feature Model for Transparent Object Recognition

11 0.56402016 70 nips-2009-Discriminative Network Models of Schizophrenia

12 0.56094187 133 nips-2009-Learning models of object structure

13 0.56012583 145 nips-2009-Manifold Embeddings for Model-Based Reinforcement Learning under Partial Observability

14 0.56001008 167 nips-2009-Non-Parametric Bayesian Dictionary Learning for Sparse Image Representations

15 0.55962723 112 nips-2009-Human Rademacher Complexity

16 0.55868089 113 nips-2009-Improving Existing Fault Recovery Policies

17 0.55857396 137 nips-2009-Learning transport operators for image manifolds

18 0.55751759 168 nips-2009-Non-stationary continuous dynamic Bayesian networks

19 0.55659747 23 nips-2009-Accelerating Bayesian Structural Inference for Non-Decomposable Gaussian Graphical Models

20 0.55651468 132 nips-2009-Learning in Markov Random Fields using Tempered Transitions