iccv iccv2013 iccv2013-56 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Bernhard Zeisl, Kevin Köser, Marc Pollefeys
Abstract: We address the problem of wide-baseline registration of RGB-D data, such as photo-textured laser scans, without any artificial targets or prior prediction of the relative motion. Our approach allows us to fully automatically register scans taken in GPS-denied environments such as urban canyons, industrial facilities or even indoors. We build upon image features, which are plentiful, well localized and much more discriminative than geometry features; however, they suffer from viewpoint distortions and require normalization. We utilize the principle of salient directions present in the geometry and propose to extract (several) directions from the distribution of surface normals or other cues such as observable symmetries. Compared to previous work we pose no requirements on the scanned scene (such as containing large textured planes) and can handle arbitrary surface shapes. Rendering the whole scene from these repeatable directions using an orthographic camera generates textures which are identical up to 2D similarity transformations. This ambiguity is naturally handled by 2D features and allows us to find stable correspondences among scans. For geometric pose estimation from tentative matches we propose a fast and robust 2-point sample consensus scheme integrating an early rejection phase. We evaluate our approach on different challenging real-world scenes.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We address the problem of wide-baseline registration of RGB-D data, such as photo-textured laser scans, without any artificial targets or prior prediction of the relative motion. [sent-6, score-0.904]
2 Our approach allows us to fully automatically register scans taken in GPS-denied environments such as urban canyons, industrial facilities or even indoors. [sent-7, score-0.552]
3 We utilize the principle of salient directions present in the geometry and propose to extract (several) directions from the distribution of surface normals or other cues such as observable symmetries. [sent-9, score-0.926]
4 Compared to previous work we pose no requirements on the scanned scene (such as containing large textured planes) and can handle arbitrary surface shapes. [sent-10, score-0.22]
5 Rendering the whole scene from these repeatable directions using an orthographic camera generates textures which are identical up to 2D similarity transformations. [sent-11, score-0.388]
6 For geometric pose estimation from tentative matches we propose a fast and robust 2-point sample consensus scheme integrating an early rejection phase. [sent-13, score-0.316]
7 Introduction: When surveying construction sites, historical buildings or industrial facilities, laser scanning is the state-of-the-art technique to obtain accurate three-dimensional models. [sent-16, score-0.359]
8 Usually a scanner is positioned at different places, in- or outdoors, in order to minimize scan shadows and to obtain a model as complete as possible. [sent-21, score-0.344]
9 Since scanning is a time-consuming and therefore expensive task the number of scans is usually kept as small as possible, leading to a wide-baseline setting between scan positions. [sent-22, score-0.482]
10 ∗This work was done while K. was employed at the Institute for … Figure 1: Registration result of our algorithm from 5 individual scans (CHURCH dataset). [sent-24, score-0.392]
11 We achieve entirely automatic registration of arbitrary geometry from largely different viewpoints by exploiting depth and image data jointly. [sent-25, score-0.393]
12 Not only scanning, but also the registration of individual scans takes a lot of time – either afterwards by manually aligning models, or on site by carefully positioning targets (artificial markers) in the scene, which are spotted and automatically detected from several scan positions. [sent-27, score-0.87]
13 If one desires to rescan the facility at another point in time and align current data with an older model, exploiting artificial markers for registration is impossible. [sent-28, score-0.309]
14 As a result there is a quest for automatic registration methods which do not rely on any artificial landmarks, but can generate accurate registration results by exploiting the scan data itself. [sent-29, score-0.712]
15 GPS and magnetic compass can simplify the registration problem, but they fail under bridges, inside buildings, in urban canyons or close to metallic or electric installations, respectively. [sent-31, score-0.371]
16 In this work we propose to become independent of the original sensor viewpoint by exploiting characteristic salient directions of the scene, which are repeatable among different scans. [sent-35, score-0.616]
17 Examples include peaks in the distribution of the surface normals, vanishing points, symmetry, gravity or other directions that can be reliably obtained from the sensor or the scene. [sent-36, score-0.466]
18 Each salient direction is then exploited to render an orthographic view, thereby removing the perspective effects that had been introduced by the particular scanner position. [sent-37, score-0.738]
19 Importantly, for corresponding salient directions between scans the generated images are identical (for jointly seen Lambertian scene parts) up to a 2D similarity transformation! [sent-38, score-0.883]
20 Compared to earlier approaches proposed for consumer depth cameras [25] or stereo systems [22, 4] our approach does not pose any requirements on the presence of particular geometric shapes. [sent-40, score-0.231]
21 This is an important aspect if the visible overlap between scans is small. [sent-42, score-0.392]
22 Finally, we propose a novel 2-point solution for the restricted 4 DoF registration problem, allowing for a greedy rejection of outlier-contaminated hypotheses in a sample consensus framework. [sent-47, score-0.334]
23 The remainder of the paper is structured as follows: after a discussion of existing registration techniques in the next section, we show how to obtain viewpoint invariance from salient directions in Sec. 3. [sent-48, score-0.819]
24 Sections 4 and 5 cover details of our approach for salient direction detection and pose estimation. [sent-52, score-0.436]
25 This registration can be performed by targets or calibration patterns [21] or by maximizing mutual information between reflectance and color [17]. [sent-57, score-0.3]
26 However, this repeatability is likely to decrease with increasing surface complexity because of self-occlusions. [sent-76, score-0.251]
27 For planar scenes like facades with clearly visible straight lines, vanishing points can be used, even if no depth information is available [18, 3, 1]. [sent-82, score-0.247]
28 For more general scenes, it was shown that the sole usage of affine features can be improved, if they are normalized with respect to the local surface normal rather than to the affine shape [11]. [sent-83, score-0.295]
29 Recently, this local approach has been generalized from planes to parametric developable surfaces, also allowing the use of cylinders and cones [25]. [sent-86, score-0.239]
30 Feature matching and thus registration from these images fails in most cases. [sent-90, score-0.269]
31 (middle, right:) Generated salient direction rectified (SDR) renderings along corresponding salient directions. [sent-91, score-0.733]
32 Viewpoint Invariance via Salient Directions: Our novel approach to register widely separated scans builds upon image features rather than 3D geometry features, because image features are plentiful, well localized and discriminative. [sent-96, score-0.47]
33 We eliminate effects of viewpoint to allow for wide-baseline registration of scans without a prior prediction of the relative pose. [sent-97, score-0.794]
34 Instead we exploit the entire scene information via the concept of salient directions. [sent-101, score-0.321]
35 Let us now define what we mean by a salient direction. [sent-102, score-0.284]
36 The pose of a laser scanner in the world coordinate system is specified by the mapping of a point X from world to scanner coordinates via Xi = si Ri X + ti = si Ri X − si Ri Ci. [sent-103, score-0.645]
37 Here, Ci represents the origin of the scanner in world coordinates, while Ri represents its orientation and si is the scaling. [sent-104, score-0.246]
38 A salient direction is a real-world direction in global coordinates dsal that can be observed locally as disal, djsal in independent scans i and j: dsal = Ri^T disal = Rj^T djsal (1). [sent-107, score-1.182]
39 Intuitively, imagine dsal is the north direction. (Figure 3: Orthographic renderings along a salient direction. [sent-108, score-0.437]
40 The scene overlap of a planar (red) and a free-form (blue) surface will be rendered identically along dsal for each scanner.) [sent-109, score-0.388]
41 It is represented in scans i and j as disal and djsal, respectively. [sent-110, score-0.681]
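As a quick numeric illustration of Eq. (1), the following minimal Python sketch (our own construction, not code from the paper) maps one world direction into two synthetic scan frames and checks that both local observations map back to the same global direction:

```python
import numpy as np
from scipy.spatial.transform import Rotation

d_sal = np.array([0.0, 1.0, 0.0])   # e.g. "north" in world coordinates
R_i = Rotation.from_euler("xyz", [10, 40, 0], degrees=True).as_matrix()
R_j = Rotation.from_euler("xyz", [-5, 170, 2], degrees=True).as_matrix()

d_i = R_i @ d_sal                   # salient direction as observed in scan i
d_j = R_j @ d_sal                   # salient direction as observed in scan j

# Eq. (1): dsal = Ri^T disal = Rj^T djsal
assert np.allclose(R_i.T @ d_i, R_j.T @ d_j)
```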
42 We assume 2.5D depth and image data, either from a laser scanner or from a consumer depth device or stereo system. [sent-112, score-0.571]
43 Then, for the depth data, local normals are estimated and we will call the set of range data, color data and normals taken from one position a scan. [sent-114, score-0.351]
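Since the pipeline relies on per-point normals, a minimal sketch of how they could be estimated from raw scan points is given below; the k-nearest-neighbour PCA approach and all names here are our assumptions, not the paper's implementation:

```python
import numpy as np
from scipy.spatial import cKDTree

def estimate_normals(points, k=16):
    """points: (N, 3) scan points in scanner coordinates; returns (N, 3) unit normals."""
    tree = cKDTree(points)
    normals = np.empty_like(points)
    _, idx = tree.query(points, k=k)
    for i, nb in enumerate(idx):
        nbrs = points[nb] - points[nb].mean(axis=0)
        # The normal is the direction of least variance of the neighbourhood.
        _, _, vt = np.linalg.svd(nbrs, full_matrices=False)
        n = vt[-1]
        # Orient towards the scanner at the origin for viewpoint consistency.
        if np.dot(n, -points[i]) < 0:
            n = -n
        normals[i] = n
    return normals
```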
44 A salient direction rectified (SDR) image is an image which is obtained by rendering the scene along a salient direction disal with orthographic projection matrix Pi = [r̃i,1 r̃i,2]^T. [sent-120, score-0.38]
45 Given a salient direction dsal with corresponding local directions disal, djsal in scans i and j, corresponding points in the two SDR images relate to each other via a 2D similarity transformation. [sent-130, score-1.046]
46 Without loss of generality we set the ith scanner pose to [I, 0] and denote [sjRj, tj] = [sR, t]. [sent-138, score-0.266]
47 If absolute scale is known – as for laser scans – the freedom reduces to a 2D Euclidean transformation. [sent-157, score-0.526]
48 Given that a global direction g is commonly known among scans in local coordinates as gi and that r̃i,1 is chosen as r̃i,1 = (gi × disal)/|gi × disal|, then the generated images differ only in translation. [sent-164, score-0.524]
49 Defining r̃i,1 as above and setting r̃i,2 orthogonal to it via r̃i,2 = (disal × r̃i,1)/|disal × r̃i,1| ensures that g appears upright in the SDR images. [sent-165, score-0.276]
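Putting the definitions of r̃i,1 and r̃i,2 together, the following Python sketch assembles the orthographic SDR projection and splats colored scan points into an image. The pixel size, the z-buffer splatting, and all function names are illustrative assumptions of ours, not the paper's implementation:

```python
import numpy as np

def sdr_projection(d_sal_local, g_local):
    """Rows of the SDR orthographic projection for one scan.

    d_sal_local: unit salient direction in scan coordinates (disal).
    g_local:     unit gravity/global direction in scan coordinates (gi).
    """
    r1 = np.cross(g_local, d_sal_local)
    r1 /= np.linalg.norm(r1)          # r~_{i,1} = (gi x disal)/|gi x disal|
    r2 = np.cross(d_sal_local, r1)
    r2 /= np.linalg.norm(r2)          # r~_{i,2}, makes g appear upright
    return np.stack([r1, r2])         # 2x3 orthographic projection P_i

def render_sdr(points, colors, d_sal_local, g_local, pixel_size=0.01):
    """Splat colored scan points into an SDR image (nearest-point z-buffer)."""
    P = sdr_projection(d_sal_local, g_local)
    uv = points @ P.T                 # orthographic 2D image coordinates
    depth = points @ d_sal_local      # depth along the salient direction
    px = np.round((uv - uv.min(axis=0)) / pixel_size).astype(int)
    h, w = px.max(axis=0) + 1
    image = np.zeros((h, w, 3))
    zbuf = np.full((h, w), np.inf)
    for (u, v), z, c in zip(px, depth, colors):
        if z < zbuf[u, v]:            # keep the surface nearest along disal
            zbuf[u, v], image[u, v] = z, c
    return image
```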
50 Normalization of image data with respect to salient directions (per direction, per scan). [sent-176, score-0.55]
51 Geometric verification and concurrent pose estimation (for a scan pair). [sent-178, score-0.224]
52 Salient Direction Detection and Image Normalization: Given a salient world direction that can be identified in two different scans, we have shown that we can transform the image content such that it becomes virtually invariant with respect to the unknown pose. [sent-179, score-0.38]
53 Depending on the scene type, several possibilities exist for identifying salient directions, including vanishing points [1] in modern architecture, directions of repetitions or symmetries [12] in historical buildings, or the north direction from the sky or the time and the sun [14] in outdoor scenes. [sent-180, score-0.723]
54 However, in this contribution we demonstrate the idea using salient directions derived from characteristics of geometric structures, that is peaks in the distribution of surface normals (cf. [sent-181, score-0.813]
55 For successful registration only a single peak needs to be consistent, while the remaining modes can be different. [sent-185, score-0.302]
56 Dominant normal directions: Potentially disjoint, locally planar surfaces give rise to dominant surface normals. [sent-186, score-0.48]
57 The algorithm now performs gradient ascent on the density estimate f̂(nk), and sample trajectories reach stable points at the peaks of the density function. [sent-193, score-0.171]
58 As a distance measure between normals we use their orientation agreement. [sent-194, score-0.17]
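A compact sketch of such a mean-shift mode search over unit normals is shown below; the flat kernel, the bandwidth value, and the per-point area weights (motivated in the following paragraphs) are our assumptions for illustration:

```python
import numpy as np

def mean_shift_normals(normals, weights, bandwidth_deg=10.0, iters=30):
    """normals: (N, 3) unit vectors; weights: per-point area weights a(x)."""
    cos_bw = np.cos(np.radians(bandwidth_deg))
    modes = normals.copy()
    for _ in range(iters):
        # Orientation agreement as similarity measure (cf. the text above).
        sim = modes @ normals.T                      # (N, N)
        k = np.where(sim > cos_bw, weights, 0.0)     # flat kernel, area-weighted
        shifted = k @ normals
        modes = shifted / np.linalg.norm(shifted, axis=1, keepdims=True)
    # Nearly identical converged rows cluster into the dominant directions.
    return modes
```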
59 The sampling density of points on a surface depends strongly on the distance of the surface from the scanner, as well as on the slant of the surface w.r.t. [sent-203, score-0.449]
60 Thus, if we used raw 3D points x (and their normals n) as generated from the scanner, much higher emphasis would be given to surfaces close to the scanner and facing the scanning direction. [sent-205, score-0.73]
61 Renderings would then be generated from salient directions highly supported by structures near the scanner, and the repeatability of salient directions between scans would be degraded. [sent-220, score-1.424]
62 Here a(x) denotes the surface area orthogonal to the scanning direction rx (Eq. 7). [sent-227, score-0.313]
63 For a depth map it is the projected pixel footprint at depth [x]z, while for a laser scan it relates to the projected 2D scan interval (given by the angular scan resolution) at distance ‖x‖. [sent-228, score-0.434]
64 As a result we are able to determine salient directions bias-free. [sent-235, score-0.17]
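For a pinhole depth map, the footprint weight a(x) could be computed as in the following sketch; the focal-length model is our assumption, and for a laser scan the pixel span would instead be derived from range times angular resolution:

```python
import numpy as np

def area_weights(points, f):
    """points: (N, 3) in camera coordinates; f: focal length in pixels.

    Returns the projected pixel footprint a(x) orthogonal to the viewing ray.
    """
    z = points[:, 2]
    # One pixel spans z / f metres at depth [x]_z, so the footprint area
    # orthogonal to the ray grows quadratically with depth.
    return (z / f) ** 2
    # For a laser scan one would use (range * angular_resolution)^2 instead.
```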
65 Within the first category are holes caused by occluders that, in the original scanner viewpoint, are placed in front of the surface to render, e. [sent-243, score-0.517]
66 Second are holes caused by missing data in the scanning process (e. [sent-248, score-0.174]
67 For the relative registration of two scans we augment each feature by its 3D position and normal in the local coordinate system and denote points as ps and pt (in the following indices s and t indicate source and target scan, respectively). [sent-266, score-0.94]
68 For a laser scanner the gravity direction is usually known (assumed to be aligned with the z-axis in the following), so we need to estimate only 4 parameters; however, for a hand-held RGB-D sensor 6 DoF need to be estimated. [sent-268, score-0.504]
69 For upright features the latter is fixed by the gravity direction and the local coordinate system is defined as [n, n × ez, (n × ez) × n]. [sent-275, score-0.48]
70 The rotation angle θ around the gravity direction ez is computed between the normals ns, nt projected into the x-y plane, n̄s = ns − (ns^T ez) ez and analogously n̄t. [sent-278, score-0.571]
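A minimal sketch of this angle computation follows; using atan2 to obtain a signed angle on the plane-projected normals is our implementation choice, not necessarily the paper's:

```python
import numpy as np

def rotation_about_gravity(n_s, n_t):
    """Signed rotation angle about e_z between a matched normal pair."""
    ns_bar = n_s.copy(); ns_bar[2] = 0.0   # project n_s into the x-y plane
    nt_bar = n_t.copy(); nt_bar[2] = 0.0
    ns_bar /= np.linalg.norm(ns_bar)
    nt_bar /= np.linalg.norm(nt_bar)
    return np.arctan2(np.cross(ns_bar, nt_bar)[2], np.dot(ns_bar, nt_bar))
```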
71 Algorithm 1: 2-point geometric pose verification. Require: set of matches m = [m1, . . . , mK], each linking points ps(i) and pt(i). [sent-294, score-0.242]
72 for k = 1 . . . K do: uniformly sample 2 matches mi, mj from m; vs ← ps(j) − ps(i); vt ← pt(j) − pt(i); if |‖vs‖ − ‖vt‖| exceeds a threshold, reject the sample early. [sent-300, score-0.174]
73 The orientation of these vectors can be compared more precisely than that of normals due to their much larger spatial extent. [sent-325, score-0.17]
74 3D points in the source and in the target scene form vectors vs and vt respectively, connecting the 2 points ps(i), ps(j) and pt(i), pt(j) in the local scans. [sent-332, score-0.218]
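The following hedged Python sketch summarizes the 2-point sample consensus with early rejection for the 4-DoF case (gravity and scale known); the thresholds, helper names, and exact rejection tests are our assumptions for illustration:

```python
import numpy as np

def rotation_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def estimate_4dof(ps2, pt2):
    """Rotation about z and translation from 2 point correspondences."""
    vs, vt = ps2[1] - ps2[0], pt2[1] - pt2[0]
    theta = np.arctan2(vt[1], vt[0]) - np.arctan2(vs[1], vs[0])
    t = pt2.mean(axis=0) - ps2.mean(axis=0) @ rotation_z(theta).T
    return theta, t

def two_point_ransac(ps, pt, K=1000, len_tol=0.05, inlier_tol=0.1):
    """ps, pt: (N, 3) matched 3D points in source/target scan coordinates."""
    rng = np.random.default_rng(0)
    best_inliers, best_pose = 0, None
    for _ in range(K):
        i, j = rng.choice(len(ps), size=2, replace=False)
        vs, vt = ps[j] - ps[i], pt[j] - pt[i]
        # Early rejection: with absolute scale known, the connecting vectors
        # must have (nearly) equal length and equal z-component (4 DoF only).
        if abs(np.linalg.norm(vs) - np.linalg.norm(vt)) > len_tol:
            continue
        if abs(vs[2] - vt[2]) > len_tol:
            continue
        theta, t = estimate_4dof(ps[[i, j]], pt[[i, j]])
        R = rotation_z(theta)
        residuals = np.linalg.norm((ps @ R.T + t) - pt, axis=1)
        inliers = int((residuals < inlier_tol).sum())
        if inliers > best_inliers:
            best_inliers, best_pose = inliers, (R, t)
    return best_pose, best_inliers
```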
75 Full 6 DoF transformation: To estimate all 6 DoF of a 3D rigid-body transformation, at minimum 3 corresponding points are required (if normals and feature orientations are to be avoided). [sent-339, score-0.25]
76 Experimental Evaluation For evaluation we recorded 3 different datasets with different scene characteristics which are typical for laser scanning scenarios. [sent-343, score-0.261]
77 CHURCH is an indoor dataset of an old church consisting of 5 scans and exhibiting many vaults. [sent-344, score-0.511]
78 Note that there exists a sign ambiguity for the symmetry-plane normal; thus we use both possible normal directions as salient directions. [sent-346, score-0.556]
79 For CITY we captured 3 scans in an urban area showing a high number of structured facades (e. [sent-347, score-0.446]
80 Repeatability of Salient Directions It is essential for successful registration that we extract at least one salient direction (up to small variation) in both viewpoints. [sent-356, score-0.649]
81 For evaluation we have taken scans with known relative pose and rendered the source scene into the viewpoint of the target scene. [sent-358, score-0.623]
82 Thus, in these regions corresponding salient directions (defined as directions differing by 10◦ at maximum) can and should get support. [sent-363, score-0.624]
83 We now determine repeatability scores by comparing the number of corresponding salient directions to the total number of detected salient directions. [sent-364, score-0.9]
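A trivial sketch of this repeatability score, assuming both sets of salient directions have already been expressed in a common frame as described above:

```python
import numpy as np

def repeatability(dirs_a, dirs_b, max_angle_deg=10.0):
    """dirs_a: (N, 3), dirs_b: (M, 3) unit salient directions, common frame."""
    cos_t = np.cos(np.radians(max_angle_deg))
    # A direction counts as corresponding if some counterpart is within 10 deg.
    matched = (dirs_a @ dirs_b.T >= cos_t).any(axis=1)
    return matched.mean()   # corresponding / detected
```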
84 Registration performance To demonstrate the registration performance of our approach we compare it against state-of-the-art planar RGB-D rectification [22, 4]. [sent-367, score-0.399]
85 cube face images), but registration fails in more than half of the cases. [sent-370, score-0.309]
86 salient directions can be established from an untextured white wall, while the features for matching originate from some other textured free-form surface. [sent-388, score-0.454]
87 In addition, Figs. 1 and 6 illustrate the global registration results for CHURCH and CITY, respectively. [sent-389, score-0.269]
88 Previously pair-wise estimated relative poses form a graph connecting the scans with successful registration. [sent-390, score-0.429]
89 An initial solution for the absolute pose of each scan is obtained by construction of a minimum spanning tree (MST) in the graph and concatenating relative transformations accordingly. [sent-391, score-0.485]
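A sketch of this MST-based chaining is given below; the use of networkx and residual-based edge weights are our assumptions, not details from the paper:

```python
import numpy as np
import networkx as nx

def absolute_poses(pairwise, root=0):
    """pairwise: dict {(i, j): (R, t, cost)} with X_j = R @ X_i + t."""
    g = nx.Graph()
    for (i, j), (R, t, cost) in pairwise.items():
        g.add_edge(i, j, R=R, t=t, weight=cost)
    mst = nx.minimum_spanning_tree(g)
    poses = {root: (np.eye(3), np.zeros(3))}
    for i, j in nx.bfs_edges(mst, root):           # walk the tree outwards
        R_ij, t_ij = mst[i][j]["R"], mst[i][j]["t"]
        if (i, j) not in pairwise:                 # edge was stored as (j, i)
            R_ij, t_ij = R_ij.T, -R_ij.T @ t_ij    # invert the relative pose
        R_i, t_i = poses[i]
        poses[j] = (R_ij @ R_i, R_ij @ t_i + t_ij) # concatenate transforms
    return poses
```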
90 Conclusion In this work we have presented the novel concept of obtaining viewpoint invariance by means of an orthographic projection along detected salient directions in range data. [sent-398, score-0.703]
91 We have proven that the resulting salient direction rectified (SDR) images for corresponding salient directions in different scans are identical up to a 2D similarity transformation in the general case, or up to an even more restricted transformation in special but common cases. [sent-399, score-1.347]
92 This allows us to exploit texture and features not only on parametric objects like planes, cones or cylinders, but on any free-form surface in the scene. [sent-400, score-0.205]
93 We have proposed to utilize modes in the distribution of surface normals for salient direction detection. [sent-401, score-0.674]
94 Compared to model fitting approaches for the parametric surfaces, estimating modes via mean-shift is robust, which is reflected by the high repeatability scores we achieve. [sent-402, score-0.19]
95 We have evaluated the algorithm on challenging scenes with wide baseline and little overlap and demonstrated superior registration performance. [sent-403, score-0.302]
96 Future work will explore fully automatic registration of scans taken at different points in time or in different lighting, seasons or weather conditions. [sent-404, score-0.693]
97 Registration of multiple range scans as a location recognition problem: hypothesis generation, refinement and verification. [sent-474, score-0.392]
98 (Lower left parts) Repeatability scores for salient directions, i. [sent-530, score-0.284]
99 the ratio of found and present salient directions in the scan overlap. [sent-532, score-0.588]
100 Surface Signatures: An orientation independent free-form surface representation scheme for the purpose of objects registration and matching. [sent-590, score-0.432]
wordName wordTfidf (topN-words)
[('scans', 0.392), ('salient', 0.284), ('registration', 0.269), ('ez', 0.226), ('disal', 0.217), ('scanner', 0.21), ('directions', 0.17), ('laser', 0.134), ('normals', 0.134), ('scan', 0.134), ('surface', 0.127), ('repeatability', 0.124), ('dsal', 0.121), ('church', 0.119), ('rz', 0.119), ('orthographic', 0.115), ('tentative', 0.107), ('direction', 0.096), ('viewpoint', 0.096), ('scanning', 0.09), ('holes', 0.084), ('transformation', 0.084), ('depth', 0.083), ('djsal', 0.072), ('sdr', 0.072), ('claim', 0.071), ('planes', 0.07), ('rectification', 0.069), ('normal', 0.068), ('peaks', 0.067), ('repeatable', 0.066), ('vt', 0.066), ('rejection', 0.065), ('oser', 0.064), ('gravity', 0.064), ('dof', 0.063), ('planar', 0.061), ('ps', 0.061), ('consumer', 0.061), ('upright', 0.059), ('matches', 0.057), ('pose', 0.056), ('surfaces', 0.054), ('urban', 0.054), ('castle', 0.053), ('cylinders', 0.051), ('vs', 0.051), ('rotation', 0.051), ('affine', 0.05), ('canyon', 0.048), ('ntni', 0.048), ('sirix', 0.048), ('zeisl', 0.048), ('correspondences', 0.046), ('pt', 0.046), ('cones', 0.045), ('ser', 0.044), ('zurich', 0.044), ('inpainting', 0.043), ('rendered', 0.042), ('geometry', 0.041), ('cube', 0.04), ('artificial', 0.04), ('developable', 0.04), ('geomar', 0.04), ('translation', 0.038), ('vanishing', 0.038), ('detected', 0.038), ('facilities', 0.037), ('spotted', 0.037), ('scene', 0.037), ('relative', 0.037), ('register', 0.037), ('rectified', 0.037), ('density', 0.036), ('orientation', 0.036), ('gi', 0.036), ('plenty', 0.036), ('statue', 0.036), ('city', 0.035), ('handled', 0.035), ('coordinate', 0.035), ('symmetry', 0.034), ('verification', 0.034), ('scenes', 0.033), ('mst', 0.033), ('historical', 0.033), ('panoramic', 0.033), ('roof', 0.033), ('parametric', 0.033), ('render', 0.033), ('buildings', 0.033), ('modes', 0.033), ('points', 0.032), ('renderings', 0.032), ('arccos', 0.032), ('industrial', 0.032), ('ch', 0.032), ('calibration', 0.031), ('geometric', 0.031)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999899 56 iccv-2013-Automatic Registration of RGB-D Scans via Salient Directions
Author: Bernhard Zeisl, Kevin Köser, Marc Pollefeys
Abstract: We address the problem of wide-baseline registration of RGB-D data, such as photo-textured laser scans, without any artificial targets or prior prediction of the relative motion. Our approach allows us to fully automatically register scans taken in GPS-denied environments such as urban canyons, industrial facilities or even indoors. We build upon image features, which are plentiful, well localized and much more discriminative than geometry features; however, they suffer from viewpoint distortions and require normalization. We utilize the principle of salient directions present in the geometry and propose to extract (several) directions from the distribution of surface normals or other cues such as observable symmetries. Compared to previous work we pose no requirements on the scanned scene (such as containing large textured planes) and can handle arbitrary surface shapes. Rendering the whole scene from these repeatable directions using an orthographic camera generates textures which are identical up to 2D similarity transformations. This ambiguity is naturally handled by 2D features and allows us to find stable correspondences among scans. For geometric pose estimation from tentative matches we propose a fast and robust 2-point sample consensus scheme integrating an early rejection phase. We evaluate our approach on different challenging real world scenes.
2 0.16796446 343 iccv-2013-Real-World Normal Map Capture for Nearly Flat Reflective Surfaces
Author: Bastien Jacquet, Christian Häne, Kevin Köser, Marc Pollefeys
Abstract: Although specular objects have gained interest in recent years, virtually no approaches exist for markerless reconstruction of reflective scenes in the wild. In this work, we present a practical approach to capturing normal maps in real-world scenes using video only. We focus on nearly planar surfaces such as windows, facades from glass or metal, or frames, screens and other indoor objects and show how normal maps of these can be obtained without the use of an artificial calibration object. Rather, we track the reflections of real-world straight lines, while moving with a hand-held or vehicle-mounted camera in front of the object. In contrast to error-prone local edge tracking, we obtain the reflections by a robust, global segmentation technique of an ortho-rectified 3D video cube that also naturally allows efficient user interaction. Then, at each point of the reflective surface, the resulting 2D-curve to 3D-line correspondence provides a novel quadratic constraint on the local surface normal. This allows to globally solve for the shape by integrability and smoothness constraints and easily supports the usage of multiple lines. We demonstrate the technique on several objects and facades.
Author: Sarah Parisot, William Wells_III, Stéphane Chemouny, Hugues Duffau, Nikos Paragios
Abstract: Graph-based methods have become popular in recent years and have successfully addressed tasks like segmentation and deformable registration. Their main strength is optimality of the obtained solution while their main limitation is the lack of precision due to the grid-like representations and the discrete nature of the quantized search space. In this paper we introduce a novel approach for combined segmentation/registration of brain tumors that adapts graph and sampling resolution according to the image content. To this end we estimate the segmentation and registration marginals towards adaptive graph resolution and intelligent definition of the search space. This information is considered in a hierarchical framework where uncertainties are propagated in a natural manner. State of the art results in the joint segmentation/registration of brain images with low-grade gliomas demonstrate the potential of our approach.
4 0.14835124 370 iccv-2013-Saliency Detection in Large Point Sets
Author: Elizabeth Shtrom, George Leifman, Ayellet Tal
Abstract: While saliency in images has been extensively studied in recent years, there is very little work on saliency of point sets. This is despite the fact that point sets and range data are becoming ever more widespread and have myriad applications. In this paper we present an algorithm for detecting the salient points in unorganized 3D point sets. Our algorithm is designed to cope with extremely large sets, which may contain tens of millions of points. Such data is typical of urban scenes, which have recently become commonly available on the web. No previous work has handled such data. For general data sets, we show that our results are competitive with those of saliency detection of surfaces, although we do not have any connectivity information. We demonstrate the utility of our algorithm in two applications: producing a set of the most informative viewpoints and suggesting an informative city tour given a city scan.
5 0.14384393 1 iccv-2013-3DNN: Viewpoint Invariant 3D Geometry Matching for Scene Understanding
Author: Scott Satkin, Martial Hebert
Abstract: We present a new algorithm 3DNN (3D Nearest-Neighbor), which is capable of matching an image with 3D data, independently of the viewpoint from which the image was captured. By leveraging rich annotations associated with each image, our algorithm can automatically produce precise and detailed 3D models of a scene from a single image. Moreover, we can transfer information across images to accurately label and segment objects in a scene. The true benefit of 3DNN compared to a traditional 2D nearest-neighbor approach is that by generalizing across viewpoints, we free ourselves from the need to have training examples captured from all possible viewpoints. Thus, we are able to achieve comparable results using orders of magnitude less data, and recognize objects from never-before-seen viewpoints. In this work, we describe the 3DNN algorithm and rigorously evaluate its performance for the tasks of geometry estimation and object detection/segmentation. By decoupling the viewpoint and the geometry of an image, we develop a scene matching approach which is truly 100% viewpoint invariant, yielding state-of-the-art performance on challenging data.
6 0.14257294 281 iccv-2013-Multi-view Normal Field Integration for 3D Reconstruction of Mirroring Objects
7 0.14094186 139 iccv-2013-Elastic Fragments for Dense Scene Reconstruction
8 0.13623431 319 iccv-2013-Point-Based 3D Reconstruction of Thin Objects
9 0.13548332 283 iccv-2013-Multiple Non-rigid Surface Detection and Registration
10 0.12426072 36 iccv-2013-Accurate and Robust 3D Facial Capture Using a Single RGBD Camera
11 0.12385441 185 iccv-2013-Go-ICP: Solving 3D Registration Efficiently and Globally Optimally
12 0.12044571 183 iccv-2013-Geometric Registration Based on Distortion Estimation
13 0.11966816 199 iccv-2013-High Quality Shape from a Single RGB-D Image under Uncalibrated Natural Illumination
14 0.1162549 16 iccv-2013-A Generic Deformation Model for Dense Non-rigid Surface Registration: A Higher-Order MRF-Based Approach
15 0.11498395 367 iccv-2013-SUN3D: A Database of Big Spaces Reconstructed Using SfM and Object Labels
16 0.10830726 79 iccv-2013-Coherent Object Detection with 3D Geometric Context from a Single Image
17 0.10758572 9 iccv-2013-A Flexible Scene Representation for 3D Reconstruction Using an RGB-D Camera
18 0.10680857 137 iccv-2013-Efficient Salient Region Detection with Soft Image Abstraction
19 0.10521436 12 iccv-2013-A General Dense Image Matching Framework Combining Direct and Feature-Based Costs
20 0.10286738 410 iccv-2013-Support Surface Prediction in Indoor Scenes
topicId topicWeight
[(0, 0.226), (1, -0.196), (2, 0.004), (3, -0.033), (4, -0.008), (5, -0.01), (6, 0.062), (7, -0.149), (8, -0.009), (9, -0.026), (10, 0.002), (11, 0.035), (12, -0.07), (13, 0.038), (14, 0.066), (15, -0.033), (16, 0.074), (17, 0.049), (18, -0.006), (19, -0.034), (20, 0.011), (21, 0.035), (22, 0.076), (23, 0.025), (24, -0.002), (25, -0.081), (26, -0.027), (27, 0.091), (28, -0.004), (29, -0.037), (30, 0.058), (31, -0.045), (32, 0.056), (33, -0.029), (34, 0.011), (35, -0.049), (36, 0.031), (37, 0.064), (38, 0.031), (39, -0.0), (40, 0.07), (41, 0.096), (42, -0.051), (43, -0.045), (44, 0.006), (45, -0.027), (46, -0.007), (47, 0.038), (48, -0.048), (49, 0.036)]
simIndex simValue paperId paperTitle
same-paper 1 0.94438475 56 iccv-2013-Automatic Registration of RGB-D Scans via Salient Directions
Author: Bernhard Zeisl, Kevin Köser, Marc Pollefeys
Abstract: We address the problem of wide-baseline registration of RGB-D data, such as photo-textured laser scans, without any artificial targets or prior prediction of the relative motion. Our approach allows us to fully automatically register scans taken in GPS-denied environments such as urban canyons, industrial facilities or even indoors. We build upon image features, which are plentiful, well localized and much more discriminative than geometry features; however, they suffer from viewpoint distortions and require normalization. We utilize the principle of salient directions present in the geometry and propose to extract (several) directions from the distribution of surface normals or other cues such as observable symmetries. Compared to previous work we pose no requirements on the scanned scene (such as containing large textured planes) and can handle arbitrary surface shapes. Rendering the whole scene from these repeatable directions using an orthographic camera generates textures which are identical up to 2D similarity transformations. This ambiguity is naturally handled by 2D features and allows us to find stable correspondences among scans. For geometric pose estimation from tentative matches we propose a fast and robust 2-point sample consensus scheme integrating an early rejection phase. We evaluate our approach on different challenging real world scenes.
2 0.82175392 139 iccv-2013-Elastic Fragments for Dense Scene Reconstruction
Author: Qian-Yi Zhou, Stephen Miller, Vladlen Koltun
Abstract: We present an approach to reconstruction of detailed scene geometry from range video. Range data produced by commodity handheld cameras suffers from high-frequency errors and low-frequency distortion. Our approach deals with both sources of error by reconstructing locally smooth scene fragments and letting these fragments deform in order to align to each other. We develop a volumetric registration formulation that leverages the smoothness of the deformation to make optimization practical for large scenes. Experimental results demonstrate that our approach substantially increases the fidelity of complex scene geometry reconstructed with commodity handheld cameras.
3 0.76902092 183 iccv-2013-Geometric Registration Based on Distortion Estimation
Author: Wei Zeng, Mayank Goswami, Feng Luo, Xianfeng Gu
Abstract: Surface registration plays a fundamental role in many applications in computer vision and aims at finding a one-to-one correspondence between surfaces. Conformal mapping based surface registration methods conformally map 2D/3D surfaces onto 2D canonical domains and perform the matching on the 2D plane. This registration framework reduces dimensionality, and the result is intrinsic to Riemannian metric and invariant under isometric deformation. However, conformal mapping will be affected by inconsistent boundaries and non-isometric deformations of surfaces. In this work, we quantify the effects of boundary variation and non-isometric deformation to conformal mappings, and give the theoretical upper bounds for the distortions of conformal mappings under these two factors. Besides giving the thorough theoretical proofs of the theorems, we verified them by concrete experiments using 3D human facial scans with dynamic expressions and varying boundaries. Furthermore, we used the distortion estimates for reducing search range in feature matching of surface registration applications. The experimental results are consistent with the theoretical predictions and also demonstrate the performance improvements in feature tracking.
4 0.76891398 16 iccv-2013-A Generic Deformation Model for Dense Non-rigid Surface Registration: A Higher-Order MRF-Based Approach
Author: Yun Zeng, Chaohui Wang, Xianfeng Gu, Dimitris Samaras, Nikos Paragios
Abstract: We propose a novel approach for dense non-rigid 3D surface registration, which brings together Riemannian geometry and graphical models. To this end, we first introduce a generic deformation model, called Canonical Distortion Coefficients (CDCs), by characterizing the deformation of every point on a surface using the distortions along its two principle directions. This model subsumes the deformation groups commonly used in surface registration such as isometry and conformality, and is able to handle more complex deformations. We also derive its discrete counterpart which can be computed very efficiently in a closed form. Based on these, we introduce a higher-order Markov Random Field (MRF) model which seamlessly integrates our deformation model and a geometry/texture similarity metric. Then we jointly establish the optimal correspondences for all the points via maximum a posteriori (MAP) inference. Moreover, we develop a parallel optimization algorithm to efficiently perform the inference for the proposed higher-order MRF model. The resulting registration algorithm outperforms state-of-the-art methods in both dense non-rigid 3D surface registration and tracking.
5 0.73570162 185 iccv-2013-Go-ICP: Solving 3D Registration Efficiently and Globally Optimally
Author: Jiaolong Yang, Hongdong Li, Yunde Jia
Abstract: Registration is a fundamental task in computer vision. The Iterative Closest Point (ICP) algorithm is one of the widely-used methods for solving the registration problem. Based on local iteration, ICP is however well-known to suffer from local minima. Its performance critically relies on the quality of initialization, and only local optimality is guaranteed. This paper provides the very first globally optimal solution to Euclidean registration of two 3D pointsets or two 3D surfaces under the L2 error. Our method is built upon ICP, but combines it with a branch-and-bound (BnB) scheme which searches the 3D motion space SE(3) efficiently. By exploiting the special structure of the underlying geometry, we derive novel upper and lower bounds for the ICP error function. The integration of local ICP and global BnB enables the new method to run efficiently in practice, and its optimality is exactly guaranteed. We also discuss extensions, addressing the issue of outlier robustness.
6 0.73160803 283 iccv-2013-Multiple Non-rigid Surface Detection and Registration
7 0.73044956 319 iccv-2013-Point-Based 3D Reconstruction of Thin Objects
8 0.72651136 284 iccv-2013-Multiview Photometric Stereo Using Planar Mesh Parameterization
9 0.7068423 228 iccv-2013-Large-Scale Multi-resolution Surface Reconstruction from RGB-D Sequences
10 0.67969739 281 iccv-2013-Multi-view Normal Field Integration for 3D Reconstruction of Mirroring Objects
11 0.66390711 9 iccv-2013-A Flexible Scene Representation for 3D Reconstruction Using an RGB-D Camera
12 0.6423825 410 iccv-2013-Support Surface Prediction in Indoor Scenes
13 0.63438386 102 iccv-2013-Data-Driven 3D Primitives for Single Image Understanding
14 0.59970862 1 iccv-2013-3DNN: Viewpoint Invariant 3D Geometry Matching for Scene Understanding
15 0.59967864 407 iccv-2013-Subpixel Scanning Invariant to Indirect Lighting Using Quadratic Code Length
16 0.59045815 90 iccv-2013-Content-Aware Rotation
17 0.58477432 370 iccv-2013-Saliency Detection in Large Point Sets
18 0.57909405 343 iccv-2013-Real-World Normal Map Capture for Nearly Flat Reflective Surfaces
19 0.56541342 196 iccv-2013-Hierarchical Data-Driven Descent for Efficient Optimal Deformation Estimation
20 0.55698788 254 iccv-2013-Live Metric 3D Reconstruction on Mobile Phones
topicId topicWeight
[(2, 0.035), (26, 0.048), (31, 0.03), (35, 0.012), (40, 0.012), (42, 0.088), (64, 0.031), (73, 0.028), (89, 0.631)]
simIndex simValue paperId paperTitle
1 0.99954683 139 iccv-2013-Elastic Fragments for Dense Scene Reconstruction
Author: Qian-Yi Zhou, Stephen Miller, Vladlen Koltun
Abstract: We present an approach to reconstruction of detailed scene geometry from range video. Range data produced by commodity handheld cameras suffers from high-frequency errors and low-frequency distortion. Our approach deals with both sources of error by reconstructing locally smooth scene fragments and letting these fragments deform in order to align to each other. We develop a volumetric registration formulation that leverages the smoothness of the deformation to make optimization practical for large scenes. Experimental results demonstrate that our approach substantially increases the fidelity of complex scene geometry reconstructed with commodity handheld cameras.
2 0.99817562 81 iccv-2013-Combining the Right Features for Complex Event Recognition
Author: Kevin Tang, Bangpeng Yao, Li Fei-Fei, Daphne Koller
Abstract: In this paper, we tackle the problem of combining features extracted from video for complex event recognition. Feature combination is an especially relevant task in video data, as there are many features we can extract, ranging from image features computed from individual frames to video features that take temporal information into account. To combine features effectively, we propose a method that is able to be selective of different subsets of features, as some features or feature combinations may be uninformative for certain classes. We introduce a hierarchical method for combining features based on the AND/OR graph structure, where nodes in the graph represent combinations of different sets of features. Our method automatically learns the structure of the AND/OR graph using score-based structure learning, and we introduce an inference procedure that is able to efficiently compute structure scores. We present promising results and analysis on the difficult and large-scale 2011 TRECVID Multimedia Event Detection dataset [17].
3 0.99783742 216 iccv-2013-Inferring "Dark Matter" and "Dark Energy" from Videos
Author: Dan Xie, Sinisa Todorovic, Song-Chun Zhu
Abstract: This paper presents an approach to localizing functional objects in surveillance videos without domain knowledge about semantic object classes that may appear in the scene. Functional objects do not have discriminative appearance and shape, but they affect behavior of people in the scene. For example, they “attract” people to approach them for satisfying certain needs (e.g., vending machines could quench thirst), or “repel” people to avoid them (e.g., grass lawns). Therefore, functional objects can be viewed as “dark matter”, emanating “dark energy” that affects people's trajectories in the video. To detect “dark matter” and infer their “dark energy” field, we extend the Lagrangian mechanics. People are treated as particle-agents with latent intents to approach “dark matter” and thus satisfy their needs, where their motions are subject to a composite “dark energy” field of all functional objects in the scene. We make the assumption that people take globally optimal paths toward the intended “dark matter” while avoiding latent obstacles. A Bayesian framework is used to probabilistically model: people's trajectories and intents, constraint map of the scene, and locations of functional objects. A data-driven Markov Chain Monte Carlo (MCMC) process is used for inference. Our evaluation on videos of public squares and courtyards demonstrates our effectiveness in localizing functional objects and predicting people's trajectories in unobserved parts of the video footage.
4 0.99612582 103 iccv-2013-Deblurring by Example Using Dense Correspondence
Author: Yoav Hacohen, Eli Shechtman, Dani Lischinski
Abstract: This paper presents a new method for deblurring photos using a sharp reference example that contains some shared content with the blurry photo. Most previous deblurring methods that exploit information from other photos require an accurately registered photo of the same static scene. In contrast, our method aims to exploit reference images where the shared content may have undergone substantial photometric and non-rigid geometric transformations, as these are the kind of reference images most likely to be found in personal photo albums. Our approach builds upon a recent method for example-based deblurring using non-rigid dense correspondence (NRDC) [11] and extends it in two ways. First, we suggest exploiting information from the reference image not only for blur kernel estimation, but also as a powerful local prior for the non-blind deconvolution step. Second, we introduce a simple yet robust technique for spatially varying blur estimation, rather than assuming spatially uniform blur. Unlike the above previous method, which has proven successful only with simple deblurring scenarios, we demonstrate that our method succeeds on a variety of real-world examples. We provide quantitative and qualitative evaluation of our method and show that it outperforms the state-of-the-art.
5 0.99542743 39 iccv-2013-Action Recognition with Improved Trajectories
Author: Heng Wang, Cordelia Schmid
Abstract: Recently dense trajectories were shown to be an efficient video representation for action recognition and achieved state-of-the-art results on a variety of datasets. This paper improves their performance by taking into account camera motion to correct them. To estimate camera motion, we match feature points between frames using SURF descriptors and dense optical flow, which are shown to be complementary. These matches are, then, used to robustly estimate a homography with RANSAC. Human motion is in general different from camera motion and generates inconsistent matches. To improve the estimation, a human detector is employed to remove these matches. Given the estimated camera motion, we remove trajectories consistent with it. We also use this estimation to cancel out camera motion from the optical flow. This significantly improves motion-based descriptors, such as HOF and MBH. Experimental results on four challenging action datasets (i.e., Hollywood2, HMDB51, Olympic Sports and UCF50) significantly outperform the current state of the art.
6 0.9947443 302 iccv-2013-Optimization Problems for Fast AAM Fitting in-the-Wild
same-paper 7 0.99426681 56 iccv-2013-Automatic Registration of RGB-D Scans via Salient Directions
8 0.99260086 337 iccv-2013-Random Grids: Fast Approximate Nearest Neighbors and Range Searching for Image Search
9 0.98677588 2 iccv-2013-3D Scene Understanding by Voxel-CRF
10 0.98320645 390 iccv-2013-Shufflets: Shared Mid-level Parts for Fast Object Detection
11 0.97769624 319 iccv-2013-Point-Based 3D Reconstruction of Thin Objects
12 0.96589702 129 iccv-2013-Dynamic Scene Deblurring
13 0.96496314 228 iccv-2013-Large-Scale Multi-resolution Surface Reconstruction from RGB-D Sequences
14 0.96197075 343 iccv-2013-Real-World Normal Map Capture for Nearly Flat Reflective Surfaces
15 0.96063471 9 iccv-2013-A Flexible Scene Representation for 3D Reconstruction Using an RGB-D Camera
16 0.95918745 40 iccv-2013-Action and Event Recognition with Fisher Vectors on a Compact Feature Set
17 0.95789695 317 iccv-2013-Piecewise Rigid Scene Flow
18 0.95605528 42 iccv-2013-Active MAP Inference in CRFs for Efficient Semantic Segmentation
19 0.95574826 370 iccv-2013-Saliency Detection in Large Point Sets
20 0.95553541 226 iccv-2013-Joint Subspace Stabilization for Stereoscopic Video