iccv iccv2013 iccv2013-319 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Benjamin Ummenhofer, Thomas Brox
Abstract: 3D reconstruction deals with the problem of finding the shape of an object from a set of images. Thin objects that have virtually no volume pose a special challenge for reconstruction with respect to shape representation and fusion of depth information. In this paper we present a dense point-based reconstruction method that can deal with this special class of objects. We seek to jointly optimize a set of depth maps by treating each pixel as a point in space. Points are pulled towards a common surface by pairwise forces in an iterative scheme. The method also handles the problem of opposed surfaces by means of penalty forces. Efficient optimization is achieved by grouping points to superpixels and a spatial hashing approach for fast neighborhood queries. We show that the approach is on a par with state-of-the-art methods for standard multi view stereo settings and gives superior results for thin objects.
Reference: text
sentIndex sentText sentNum sentScore
1 Thin objects that have virtually no volume pose a special challenge for reconstruction with respect to shape representation and fusion of depth information. [sent-4, score-0.317]
2 We seek to jointly optimize a set of depth maps by treating each pixel as a point in space. [sent-6, score-0.452]
3 Points are pulled towards a common surface by pairwise forces in an iterative scheme. [sent-7, score-0.333]
4 The method also handles the problem of opposed surfaces by means of penalty forces. [sent-8, score-0.271]
5 Efficient optimization is achieved by grouping points to superpixels and a spatial hashing approach for fast neighborhood queries. [sent-9, score-0.343]
6 We show that the approach is on a par with state-of-the-art methods for standard multi view stereo settings and gives superior results for thin objects. [sent-10, score-0.304]
7 Introduction. Image-based 3D reconstruction is the problem of inferring the surface of real-world objects solely from visual clues. [sent-12, score-0.293]
8 To the best of our knowledge, none of them addresses the reconstruction of very thin objects, such as the street sign in Fig. [sent-14, score-0.466]
9 Such thin objects are very problematic for contemporary reconstruction methods. [sent-17, score-0.415]
10 However, grids cannot properly represent objects thinner than the voxel size, and the fixed grid comes with high memory requirements, which severely limits the resolution. [sent-20, score-0.205]
11 In the case of an arbitrary thin object, the resolution required to represent the object leads to extreme memory requirements. [sent-21, score-0.304]
12 Top: Two renderings of the reconstruction when ignoring opposed surfaces (left and center) and a photo of the scene (right). [sent-24, score-0.321]
13 Our approach resolves collisions between points that represent different sides of thin objects. [sent-28, score-0.555]
14 Almost all points from the correct side pass the depth test (left and center) and therefore lie on the correct side. [sent-29, score-0.353]
15 The approach preserves the thin structure of the objects, as seen in the view from the top (right). [sent-30, score-0.304]
16 Another popular surface representation is triangle meshes [8, 9, 5]. [sent-32, score-0.228]
17 In contrast to voxel grids, they only model the surface rather than the whole scene volume, and different parts of the scene can be represented by triangles of different size. [sent-33, score-0.299]
18 This makes mesh-based algorithms suitable for large-scale reconstructions [9], and potentially also allows handling thin objects. [sent-34, score-0.361]
19 However, mesh representations typically have problems with change in topology during surface evolution. [sent-35, score-0.298]
20 [9] create a Delaunay tetrahedral mesh where tetrahedra are labeled as inside or outside. [sent-37, score-0.206]
21 The initial surface triangles are the tetrahedron faces that connect two tetrahedra with opposite labels. [sent-38, score-0.456]
22 In case of a thin sheet, none of the tetrahedra would be labeled as inside the object and the triangulated surface would miss the object. [sent-39, score-0.598]
23 We argue that the best representation for thin objects is a point cloud representation with reference to a set of registered depth maps. [sent-40, score-0.879]
24 Similar to Szeliski and Tonnesen [23] and Fua [4] we use forces to manipulate the orientation and position of points. [sent-44, score-0.249]
25 We can avoid the latter, because we keep a reference of the points to the depth maps from which they originated and allow motion of the points only along their projection rays. [sent-46, score-0.59]
26 Finally, they generate a subset of high quality depth maps by fusing the information from multiple neighboring depth maps. [sent-49, score-0.589]
27 A limitation of depth maps is the affinity to a specific camera. [sent-50, score-0.296]
28 In contrast, our approach treats the values of all depth maps as a point cloud and jointly optimizes all points, improving all depth maps at the same time. [sent-52, score-0.961]
29 The PMVS approach of Furukawa and Ponce [6] uses a patch representation similar to a point cloud and potentially can deal with thin objects. [sent-53, score-0.673]
30 As in our approach, the depth and the normal of the patches are optimized. [sent-55, score-0.293]
31 A common challenge of point cloud representations is computational efficiency because, in contrast to voxel grids or meshes, the neighborhood structure is not explicit and may change. [sent-57, score-0.598]
32 We use efficient data structures and a coarse-to-fine optimization based on superpixels to handle large point clouds with millions of points. [sent-58, score-0.298]
33 Moreover, we explicitly deal with a problem that is specific to thin objects: if the object is regarded from opposite viewpoints, points from different surfaces basically share the same position but have opposite normals. [sent-59, score-0.838]
34 Noise in the measurements will lead to contradictive results, where invisible points occlude visible ones. [sent-60, score-0.261]
35 We call this the problem of opposed surfaces and introduce a coupling term in our energy model that deals with this problem. [sent-61, score-0.371]
36 An initial point cloud is computed via incremental bundle adjustment and a variant of semi-global matching [10]. [sent-63, score-0.449]
37 The heart of the approach is an energy model that regularizes the point cloud and pulls the points to common surfaces with normals that are consistent with the viewing direction. [sent-64, score-0.941]
38 Initial depth maps and camera parameters. The initialization of our algorithm consists of a set of depth maps and the corresponding camera projection matrices. [sent-67, score-0.712]
39 For each we compute a depth map with semi-global matching [10]. [sent-73, score-0.206]
40 We accumulate a simple sum-of-absolute-differences photometric error over 14 neighboring images for 128 depth labels. [sent-76, score-0.33]
41 Our SGM implementation uses 32 directions and an adaptive penalty for large depth label changes steered by the gradient magnitude of the image. [sent-77, score-0.392]
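The adaptive penalty is only described qualitatively here; a common way to steer the large-jump penalty of SGM by the image gradient is to scale it down at strong edges. The following Python sketch is a hypothetical illustration only: the function name, constants, and the exact scaling rule are assumptions, not taken from the paper.

    import numpy as np

    def adaptive_p2(gradient_magnitude, p2_base=32.0, p2_min=4.0):
        # Hypothetical SGM penalty for large depth-label changes: the penalty is
        # reduced where the image gradient is strong, so depth discontinuities
        # can coincide with image edges. The scaling rule is an assumption.
        p2 = p2_base / (1.0 + np.asarray(gradient_magnitude, dtype=float))
        return np.maximum(p2, p2_min)  # keep a minimal smoothness penalty everywhere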
42 Camera parameters and depth values yield a coarse estimate of the scene. [sent-78, score-0.206]
43 The depth maps contain a large amount of outliers and noise. [sent-79, score-0.342]
44 Energy model. We represent the surfaces of a scene by a set of oriented points P. [sent-81, score-0.282]
45 The points are initially given by the dense depth maps. [sent-82, score-0.389]
46 Each point Pi ∈ P corresponds to one pixel; it contains the surface position pi ∈ R3 and its normal vector ni at this point. [sent-86, score-0.562]
47 Surfaces covered by many pixels in the image are automatically represented at a higher resolution in the reconstruction. Points generated from different depth maps are unlikely to agree on the same surface due to noise, wrong measurements and inaccurate camera poses. [sent-88, score-0.649]
48 We treat the point cloud as a particle simulation and define an energy that pulls close points towards a common surface: E = Esmooth + αEdata + βEcollision (1). [sent-89, score-0.734]
49 Edata keeps the points close to their measured position p0; Esmooth and Ecollision define pairwise forces that pull the points to a common surface and push points apart to resolve self-intersections, respectively. [sent-90, score-0.928]
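A minimal sketch of how the total energy of Eq. (1) could be assembled from per-point and per-pair terms; the callback names and their signatures are placeholders (assumptions), and the actual terms are defined in the following equations.

    def total_energy(positions, normals, displacements, neighbors,
                     data_point, smooth_pair, collision_pair, alpha, beta):
        # Hedged sketch of Eq. (1): E = E_smooth + alpha * E_data + beta * E_collision.
        # neighbors[i] lists the indices within the smoothing radius of point i;
        # data_point, smooth_pair and collision_pair are placeholder callbacks.
        e_smooth = e_data = e_collision = 0.0
        for i in range(len(positions)):
            e_data += data_point(displacements[i])
            for j in neighbors[i]:
                e_smooth += smooth_pair(positions[i], normals[i], positions[j], normals[j])
                e_collision += collision_pair(positions[i], normals[i], positions[j], normals[j])
        return e_smooth + alpha * e_data + beta * e_collision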
50 Each point in the point cloud corresponds to a depth map. [sent-93, score-0.685]
51 We denote the distance that the point P has been moved away from its original position p0 by u and optimize this quantity together with the surface normal n associated with this point. [sent-97, score-0.523]
52 The data term penalizes points that diverge from their initial position: Edata = Σ_{Pi ∈ P} … [sent-98, score-0.186]
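Since every point may move only along its projection ray, its position is parameterized by a single scalar displacement u. The sketch below assumes a simple quadratic penalty on u for the data term; the quadratic form and the function names are assumptions, the paper gives the exact expression.

    import numpy as np

    def point_position(p0, ray_dir, u):
        # Position of a point moved by u along its (unit-length) projection ray.
        return np.asarray(p0) + u * np.asarray(ray_dir)

    def data_term(u):
        # Hedged sketch of E_data: penalize the displacement from the measured
        # position p0. A quadratic penalty is assumed here for illustration only.
        u = np.asarray(u, dtype=float)
        return float(np.sum(u ** 2))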
53 The energy Esmooth defines pairwise interactions of points and reads Esmooth = Σ_{Pi ∈ P} … [sent-106, score-0.305]
54 The energy measures the distance of the points to the surfaces defined by the neighboring points Pj. [sent-114, score-0.234]
55 The energy for two points Pi and Pj is minimal if the points lie on the respective planes defined by their position and normal. [sent-115, score-0.515]
56 The second term in (4) weights the angle between the normal ni and the direction to the neighboring point's position pj. [sent-126, score-0.674]
57 Points directly behind or in front of a point Pi should have a high influence as they promote a different position for the surface described by pi and ni, while a point near the tangent plane describes a similar surface at a different position. [sent-127, score-0.975]
58 Fig. 2 shows the value of wij for varying positions of pj. [sent-129, score-0.403]
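The exact weight of Eqs. (4) and (5) is not reproduced in this dump, but its qualitative behaviour can be sketched: a compactly supported radial kernel of radius r, multiplied by an angular factor that is large for points in front of or behind pi (along ni) and small for points near the tangent plane. Both factors below are assumed forms, not the paper's formulas.

    import numpy as np

    def weight_wij(p_i, n_i, p_j, r):
        # Hedged sketch of the pairwise weight w_ij.
        d = np.asarray(p_j) - np.asarray(p_i)
        dist = np.linalg.norm(d)
        if dist >= r or dist == 0.0:
            return 0.0
        radial = (1.0 - (dist / r) ** 2) ** 2      # assumed compact smoothing kernel
        angular = (np.dot(n_i, d) / dist) ** 2     # cos^2 of the angle to the normal
        return radial * angular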
59 The choice of the smoothing radius r defines the size of the neighborhood and therefore directly influences the runtime as well as the topology of the reconstruction. [sent-130, score-0.407]
60 The radius r also relates to the depth uncertainty of the initial depth maps and should be chosen accordingly. [sent-134, score-0.651]
61 The function η restricts the computation of the smoothness force to points that belong to the same surface. [sent-135, score-0.203]
62 Points with normals pointing in different directions shall not influence each other; hence we define η(ni, nj) = 1 if ni · nj > 0 and 0 otherwise. (6) [sent-136, score-0.215]
63 We use the density ρi to normalize the energy and make it independent of the point density. [sent-142, score-0.275]
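The indicator η follows Eq. (6) directly; the density below is a hedged guess (a sum of kernel weights over the neighborhood), not the paper's exact definition.

    import numpy as np

    def eta(n_i, n_j):
        # Eq. (6): points interact only if their normals point to the same side.
        return 1.0 if np.dot(n_i, n_j) > 0 else 0.0

    def density(p_i, neighbor_positions, r):
        # Hedged sketch of rho_i: accumulate kernel weights over the neighborhood
        # so that the energy becomes independent of how densely points are sampled.
        rho = 0.0
        for p_j in neighbor_positions:
            dist = np.linalg.norm(np.asarray(p_j) - np.asarray(p_i))
            if dist < r:
                rho += (1.0 - (dist / r) ** 2) ** 2
        return rho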
64 ∈P A special problem that arises for the reconstruction of thin objects are inconsistencies between the front-face and the back-face of a thin object. [sent-149, score-0.719]
65 Due to noise, points with normals pointing in different directions may occlude each other. [sent-150, score-0.439]
66 To resolve this opposed surface problem, we introduce a penalty force: Ecollision = Σ_{Pi ∈ P} … [sent-151, score-0.374]
67 The energy measures the truncated signed distance of points Pi to the surfaces defined by the neighboring points Pj. [sent-155, score-0.679]
68 The energy becomes non-zero if the distance of the points is positive and the normals have different directions (the dot product of the normals is negative). [sent-156, score-0.529]
69 Point pairs Pi, Pj with this configuration are in conflict because they occlude each other but belong to opposite surfaces of the object. [sent-157, score-0.289]
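A per-pair sketch of this collision penalty, following the description above: it is active only when the two normals point to opposite sides and the point lies on the occluding side of the neighbor's tangent plane. The truncation and the exact functional form are assumptions.

    import numpy as np

    def collision_penalty(p_i, n_i, p_j, n_j, truncation):
        # Hedged sketch of the pairwise collision term.
        if np.dot(n_i, n_j) >= 0:
            return 0.0                         # normals agree: same surface, no conflict
        signed_dist = np.dot(n_j, np.asarray(p_i) - np.asarray(p_j))
        if signed_dist <= 0:
            return 0.0                         # point already on the non-occluding side
        return min(signed_dist, truncation)    # truncated signed distance to the plane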
70 Point cloud optimization. The gradient of the global energy (1) defines the forces that are used in an iterative scheme to optimize the positions and normals of the points. [sent-159, score-0.899]
71 The energy is non-convex due to the non-convex dependency of the weights w on the variables ui and ni. [sent-160, score-0.202]
72 We assume that a sufficient number of points is close enough to the actual surface to find a good local minimum. [sent-161, score-0.329]
73 We use gradient descent for fixed values of w and ρ to optimize the points and update w and ρ after each iteration, which yields a fixed-point iteration scheme for w and ρ. [sent-162, score-0.518]
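The outer structure of this scheme can be sketched as follows. The gradient callbacks and the weight/density updates are placeholders (assumptions), and the normals are simply re-normalized after each step instead of using the spherical parameterization mentioned below.

    import numpy as np

    def optimize_point_cloud(u, normals, step, n_outer, n_inner,
                             grad_u, grad_n, recompute_weights, recompute_density):
        # Hedged sketch: gradient descent on the displacements u and the normals n
        # for fixed w and rho, then w and rho are recomputed (fixed-point iteration).
        for _ in range(n_outer):
            w = recompute_weights(u, normals)
            rho = recompute_density(u)
            for _ in range(n_inner):
                u = u - step * grad_u(u, normals, w, rho)
                normals = normals - step * grad_n(u, normals, w, rho)
                normals = normals / np.linalg.norm(normals, axis=1, keepdims=True)
        return u, normals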
74 Weight wij with radius r = 1 for varying positions of pj relative to pi. [sent-170, score-0.513]
75 The weight is low (black) when pj is far away and when the point is ’beside’ pi describing a different part of the surface. [sent-172, score-0.678]
76 The update scheme is u_i^{t+1} = u_i^t − τ ∂E/∂u_i^t and n_i^{t+1} = n_i^t − τ ∂E/∂n_i^t (9), where ni and ∂E/∂ni are parameterized with spherical coordinates in an appropriate local coordinate frame. [sent-173, score-0.209]
77 The gradient descent scheme is very slow since the time step size τ must be chosen small to achieve convergence. [sent-174, score-0.215]
78 We found that a mixture of coordinate descent and gradient descent significantly speeds up convergence. [sent-175, score-0.391]
79 The sign ambiguity is resolved by the fact that the surface must point towards the camera that observes it. [sent-180, score-0.403]
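The sign disambiguation of the normal can be stated explicitly; a small sketch in which the viewing-direction test follows directly from the sentence above, while the function itself is illustrative.

    import numpy as np

    def orient_normal(normal, point, camera_center):
        # The surface must face the camera that observes the point,
        # so flip the normal if it points away from the camera.
        view_dir = np.asarray(camera_center) - np.asarray(point)
        if np.dot(normal, view_dir) < 0:
            return -np.asarray(normal)
        return np.asarray(normal)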
80 The energy (1) with fixed density ρ and weight w is a sum of weighted and possibly truncated terms in the single variable ui. [sent-182, score-0.205]
81 Sorting these intervals with respect to the coordinate ui allows us to quickly compute the minimum. [sent-186, score-0.195]
82 The sorting can be aborted as soon as the sign of the derivative changes and the minimum is found. [sent-187, score-0.201]
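A hedged sketch of this one-dimensional search: the derivative of the restricted energy changes only at the sorted interval ends, so walking through the breakpoints and stopping at the first sign change brackets the minimum. The callback for dE/du and the bracketing strategy are assumptions.

    def locate_minimum_interval(breakpoints, derivative):
        # breakpoints: interval ends of the truncated terms along the ray (>= 1 value);
        # derivative: placeholder callback returning dE/du at a given u.
        pts = sorted(breakpoints)
        for left, right in zip(pts[:-1], pts[1:]):
            if derivative(left) <= 0.0 <= derivative(right):
                return left, right               # the minimum lies in this interval
        # the derivative never changed sign: the minimum sits at one of the ends
        return (pts[0], pts[0]) if derivative(pts[0]) > 0 else (pts[-1], pts[-1])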
83 Let ûit be the position on the ray where the energy for the point is minimal. [sent-188, score-0.406]
84 We track this for all points and decrease ω by the factor 1/2 when the minimum and maximum are not altered for 80% of the points in the last iteration. [sent-192, score-0.294]
85 To resolve remaining collisions we add a last iteration using the coordinate descent scheme for the variables u with ω = 1. [sent-195, score-0.397]
86 The line search of the coordinate descent scheme makes it possible to find a state free of collisions for points where the penalty forces act too late. [sent-198, score-0.737]
87 Runtime optimization. Processing point clouds with millions of points is computationally expensive. [sent-201, score-0.349]
88 The time complexity for updating a point cloud with N fully connected points is in O(N2). [sent-202, score-0.516]
89 Fortunately, due to the limited support of the smoothing kernel (5), the complexity can be reduced to O(N) since only neighboring points within a radius r need to be considered. [sent-203, score-0.42]
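The spatial hashing mentioned in the abstract is a natural fit for these fixed-radius queries. A minimal sketch, assuming cubic cells of edge length r so that a query only needs to inspect the 3x3x3 block of cells around the query point; the class and method names are illustrative, not the paper's code.

    import numpy as np
    from collections import defaultdict

    class SpatialHash:
        # Hedged sketch of a hash grid for fast fixed-radius neighborhood queries.
        def __init__(self, points, r):
            self.r = r
            self.points = np.asarray(points, dtype=float)
            self.cells = defaultdict(list)
            for idx, p in enumerate(self.points):
                self.cells[self._key(p)].append(idx)

        def _key(self, p):
            return tuple(np.floor(np.asarray(p) / self.r).astype(int))

        def query(self, p, radius):
            # Return indices of all points within `radius` (assumed <= r) of p.
            cx, cy, cz = self._key(p)
            hits = []
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    for dz in (-1, 0, 1):
                        for idx in self.cells.get((cx + dx, cy + dy, cz + dz), []):
                            if np.linalg.norm(self.points[idx] - np.asarray(p)) <= radius:
                                hits.append(idx)
            return hits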
90 We optimize the superpixel point cloud until convergence and transfer the positions and normals to the original point cloud. [sent-216, score-0.634]
91 The optimization result of the superpixel point cloud yields a good approximation of the solution and greatly reduces the number of iterations spent on the original problem with N points. [sent-217, score-0.369]
92 Outlier removal. Due to erroneous depth maps, the initial point cloud may contain a large number of outliers, i.e., [sent-220, score-0.614]
93 points that do not describe an actual surface of the scene. [sent-222, score-0.329]
94 To detect these outliers, we compute for each point Pi how many points from other images support the corresponding surface. [sent-224, score-0.293]
95 A point Pj supports a point Pi if the position pi is close to the tangent plane defined by pj and nj . [sent-225, score-0.926]
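This support test can be written down directly from the tangent-plane criterion; the distance threshold and the minimum number of supporting points below are assumptions.

    import numpy as np

    def supports(p_j, n_j, p_i, tol):
        # P_j supports P_i if p_i lies close to the tangent plane defined by (p_j, n_j).
        return abs(np.dot(n_j, np.asarray(p_i) - np.asarray(p_j))) < tol

    def is_outlier(i, positions, normals, image_ids, neighbor_ids, tol, min_support):
        # Hedged outlier test: count supporting neighbors that come from other images
        # and flag the point if its surface is not confirmed often enough.
        count = 0
        for j in neighbor_ids:
            if image_ids[j] == image_ids[i]:
                continue                     # only count support from other depth maps
            if supports(positions[j], normals[j], positions[i], tol):
                count += 1
        return count < min_support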
96 The use of superpixels reduces the size of the point cloud and significantly speeds up the optimization. [sent-227, score-0.506]
97 The sampling density adapts to the scene depth to create superpixels with approximately equal size in space. [sent-230, score-0.344]
98 sj is the disk radius that we also use for rendering the point. [sent-236, score-0.226]
99 The disk radius is simply computed as sj = …, where ξ is the depth of the point and f is the focal length. [sent-237, score-0.542]
100 The neighborhood N contains only points within a radius. [sent-239, score-0.147]
wordName wordTfidf (topN-words)
[('pj', 0.315), ('thin', 0.304), ('cloud', 0.259), ('pi', 0.253), ('depth', 0.206), ('surface', 0.182), ('forces', 0.151), ('points', 0.147), ('esmooth', 0.139), ('surfaces', 0.135), ('energy', 0.123), ('descent', 0.115), ('tetrahedra', 0.112), ('reconstruction', 0.111), ('point', 0.11), ('radius', 0.11), ('normals', 0.109), ('collisions', 0.104), ('position', 0.098), ('grids', 0.097), ('superpixels', 0.096), ('uit', 0.093), ('maps', 0.09), ('wij', 0.088), ('neighboring', 0.087), ('ni', 0.087), ('normal', 0.087), ('edata', 0.086), ('ecollision', 0.085), ('ui', 0.079), ('occlude', 0.077), ('opposite', 0.077), ('opposed', 0.075), ('ray', 0.075), ('coordinate', 0.071), ('voxel', 0.071), ('pointing', 0.065), ('pmvs', 0.062), ('neighborhood', 0.061), ('penalty', 0.061), ('camera', 0.06), ('derivative', 0.06), ('disk', 0.06), ('topology', 0.059), ('originate', 0.058), ('pulls', 0.058), ('clouds', 0.057), ('mesh', 0.057), ('rendered', 0.056), ('sj', 0.056), ('resolve', 0.056), ('force', 0.056), ('runtime', 0.055), ('sign', 0.051), ('scheme', 0.051), ('gradient', 0.049), ('influences', 0.047), ('outliers', 0.046), ('meshes', 0.046), ('triangles', 0.046), ('hash', 0.046), ('optimize', 0.046), ('soon', 0.045), ('intervals', 0.045), ('sorting', 0.045), ('density', 0.042), ('directions', 0.041), ('bundle', 0.041), ('speeds', 0.041), ('truncated', 0.04), ('tangent', 0.04), ('smoothing', 0.04), ('hashing', 0.039), ('initial', 0.039), ('deals', 0.038), ('tetrahedral', 0.037), ('freiburg', 0.037), ('contradictive', 0.037), ('persist', 0.037), ('pointbased', 0.037), ('iburg', 0.037), ('pns', 0.037), ('piti', 0.037), ('miso', 0.037), ('lute', 0.037), ('ainond', 0.037), ('thieni', 0.037), ('thinner', 0.037), ('tohrer', 0.037), ('simulation', 0.037), ('act', 0.037), ('support', 0.036), ('dense', 0.036), ('defines', 0.035), ('millions', 0.035), ('brox', 0.035), ('benjamin', 0.035), ('steered', 0.035), ('octrees', 0.035), ('merrell', 0.035)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999994 319 iccv-2013-Point-Based 3D Reconstruction of Thin Objects
Author: Benjamin Ummenhofer, Thomas Brox
Abstract: 3D reconstruction deals with the problem of finding the shape of an object from a set of images. Thin objects that have virtually no volume pose a special challenge for reconstruction with respect to shape representation and fusion of depth information. In this paper we present a dense point-based reconstruction method that can deal with this special class of objects. We seek to jointly optimize a set of depth maps by treating each pixel as a point in space. Points are pulled towards a common surface by pairwise forces in an iterative scheme. The method also handles the problem of opposed surfaces by means of penalty forces. Efficient optimization is achieved by grouping points to superpixels and a spatial hashing approach for fast neighborhood queries. We show that the approach is on a par with state-of-the-art methods for standard multi view stereo settings and gives superior results for thin objects.
2 0.19167015 281 iccv-2013-Multi-view Normal Field Integration for 3D Reconstruction of Mirroring Objects
Author: Michael Weinmann, Aljosa Osep, Roland Ruiters, Reinhard Klein
Abstract: In this paper, we present a novel, robust multi-view normal field integration technique for reconstructing the full 3D shape of mirroring objects. We employ a turntablebased setup with several cameras and displays. These are used to display illumination patterns which are reflected by the object surface. The pattern information observed in the cameras enables the calculation of individual volumetric normal fields for each combination of camera, display and turntable angle. As the pattern information might be blurred depending on the surface curvature or due to nonperfect mirroring surface characteristics, we locally adapt the decoding to the finest still resolvable pattern resolution. In complex real-world scenarios, the normal fields contain regions without observations due to occlusions and outliers due to interreflections and noise. Therefore, a robust reconstruction using only normal information is challenging. Via a non-parametric clustering of normal hypotheses derived for each point in the scene, we obtain both the most likely local surface normal and a local surface consistency estimate. This information is utilized in an iterative mincut based variational approach to reconstruct the surface geometry.
3 0.17648527 387 iccv-2013-Shape Anchors for Data-Driven Multi-view Reconstruction
Author: Andrew Owens, Jianxiong Xiao, Antonio Torralba, William Freeman
Abstract: We present a data-driven method for building dense 3D reconstructions using a combination of recognition and multi-view cues. Our approach is based on the idea that there are image patches that are so distinctive that we can accurately estimate their latent 3D shapes solely using recognition. We call these patches shape anchors, and we use them as the basis of a multi-view reconstruction system that transfers dense, complex geometry between scenes. We “anchor” our 3D interpretation from these patches, using them to predict geometry for parts of the scene that are relatively ambiguous. The resulting algorithm produces dense reconstructions from stereo point clouds that are sparse and noisy, and we demonstrate it on a challenging dataset of real-world, indoor scenes.
4 0.17452133 228 iccv-2013-Large-Scale Multi-resolution Surface Reconstruction from RGB-D Sequences
Author: Frank Steinbrücker, Christian Kerl, Daniel Cremers
Abstract: We propose a method to generate highly detailed, textured 3D models of large environments from RGB-D sequences. Our system runs in real-time on a standard desktop PC with a state-of-the-art graphics card. To reduce the memory consumption, we fuse the acquired depth maps and colors in a multi-scale octree representation of a signed distance function. To estimate the camera poses, we construct a pose graph and use dense image alignment to determine the relative pose between pairs of frames. We add edges between nodes when we detect loop-closures and optimize the pose graph to correct for long-term drift. Our implementation is highly parallelized on graphics hardware to achieve real-time performance. More specifically, we can reconstruct, store, and continuously update a colored 3D model of an entire corridor of nine rooms at high levels of detail in real-time on a single GPU with 2.5GB.
5 0.16844721 343 iccv-2013-Real-World Normal Map Capture for Nearly Flat Reflective Surfaces
Author: Bastien Jacquet, Christian Häne, Kevin Köser, Marc Pollefeys
Abstract: Although specular objects have gained interest in recent years, virtually no approaches exist for markerless reconstruction of reflective scenes in the wild. In this work, we present a practical approach to capturing normal maps in real-world scenes using video only. We focus on nearly planar surfaces such as windows, facades from glass or metal, or frames, screens and other indoor objects and show how normal maps of these can be obtained without the use of an artificial calibration object. Rather, we track the reflections of real-world straight lines, while moving with a hand-held or vehicle-mounted camera in front of the object. In contrast to error-prone local edge tracking, we obtain the reflections by a robust, global segmentation technique of an ortho-rectified 3D video cube that also naturally allows efficient user interaction. Then, at each point of the reflective surface, the resulting 2D-curve to 3D-line correspondence provides a novel quadratic constraint on the local surface normal. This allows to globally solve for the shape by integrability and smoothness constraints and easily supports the usage of multiple lines. We demonstrate the technique on several objects and facades.
6 0.16792007 366 iccv-2013-STAR3D: Simultaneous Tracking and Reconstruction of 3D Objects Using RGB-D Data
7 0.16593283 370 iccv-2013-Saliency Detection in Large Point Sets
8 0.15763541 410 iccv-2013-Support Surface Prediction in Indoor Scenes
9 0.15438306 332 iccv-2013-Quadruplet-Wise Image Similarity Learning
10 0.15350308 382 iccv-2013-Semi-dense Visual Odometry for a Monocular Camera
11 0.15342808 284 iccv-2013-Multiview Photometric Stereo Using Planar Mesh Parameterization
12 0.15034516 444 iccv-2013-Viewing Real-World Faces in 3D
13 0.14923966 209 iccv-2013-Image Guided Depth Upsampling Using Anisotropic Total Generalized Variation
14 0.14892003 199 iccv-2013-High Quality Shape from a Single RGB-D Image under Uncalibrated Natural Illumination
15 0.1457454 139 iccv-2013-Elastic Fragments for Dense Scene Reconstruction
16 0.14507847 367 iccv-2013-SUN3D: A Database of Big Spaces Reconstructed Using SfM and Object Labels
17 0.13623431 56 iccv-2013-Automatic Registration of RGB-D Scans via Salient Directions
18 0.1351919 9 iccv-2013-A Flexible Scene Representation for 3D Reconstruction Using an RGB-D Camera
19 0.12897092 304 iccv-2013-PM-Huber: PatchMatch with Huber Regularization for Stereo Matching
20 0.12866554 256 iccv-2013-Locally Affine Sparse-to-Dense Matching for Motion and Occlusion Estimation
topicId topicWeight
[(0, 0.243), (1, -0.245), (2, -0.044), (3, 0.02), (4, -0.028), (5, 0.054), (6, 0.006), (7, -0.173), (8, -0.074), (9, 0.002), (10, -0.049), (11, 0.082), (12, -0.05), (13, 0.059), (14, 0.04), (15, -0.101), (16, -0.024), (17, 0.005), (18, -0.057), (19, -0.04), (20, -0.023), (21, 0.07), (22, 0.044), (23, -0.023), (24, -0.038), (25, 0.011), (26, -0.028), (27, 0.089), (28, 0.013), (29, 0.059), (30, 0.018), (31, -0.021), (32, 0.017), (33, -0.048), (34, -0.007), (35, -0.07), (36, 0.076), (37, 0.065), (38, 0.061), (39, -0.079), (40, 0.018), (41, -0.098), (42, -0.039), (43, 0.03), (44, -0.049), (45, -0.008), (46, 0.008), (47, -0.048), (48, -0.001), (49, 0.011)]
simIndex simValue paperId paperTitle
same-paper 1 0.97354621 319 iccv-2013-Point-Based 3D Reconstruction of Thin Objects
Author: Benjamin Ummenhofer, Thomas Brox
Abstract: 3D reconstruction deals with the problem of finding the shape of an object from a set of images. Thin objects that have virtually no volume pose a special challenge for reconstruction with respect to shape representation and fusion of depth information. In this paper we present a dense point-based reconstruction method that can deal with this special class of objects. We seek to jointly optimize a set of depth maps by treating each pixel as a point in space. Points are pulled towards a common surface by pairwise forces in an iterative scheme. The method also handles the problem of opposed surfaces by means of penalty forces. Efficient optimization is achieved by grouping points to superpixels and a spatial hashing approach for fast neighborhood queries. We show that the approach is on a par with state-of-the-art methods for standard multi view stereo settings and gives superior results for thin objects.
2 0.83981669 284 iccv-2013-Multiview Photometric Stereo Using Planar Mesh Parameterization
Author: Jaesik Park, Sudipta N. Sinha, Yasuyuki Matsushita, Yu-Wing Tai, In So Kweon
Abstract: We propose a method for accurate 3D shape reconstruction using uncalibrated multiview photometric stereo. A coarse mesh reconstructed using multiview stereo is first parameterized using a planar mesh parameterization technique. Subsequently, multiview photometric stereo is performed in the 2D parameter domain of the mesh, where all geometric and photometric cues from multiple images can be treated uniformly. Unlike traditional methods, there is no need for merging view-dependent surface normal maps. Our key contribution is a new photometric stereo based mesh refinement technique that can efficiently reconstruct meshes with extremely fine geometric details by directly estimating a displacement texture map in the 2D parameter domain. We demonstrate that intricate surface geometry can be reconstructed using several challenging datasets containing surfaces with specular reflections, multiple albedos and complex topologies.
3 0.82460332 228 iccv-2013-Large-Scale Multi-resolution Surface Reconstruction from RGB-D Sequences
Author: Frank Steinbrücker, Christian Kerl, Daniel Cremers
Abstract: We propose a method to generate highly detailed, textured 3D models of large environments from RGB-D sequences. Our system runs in real-time on a standard desktop PC with a state-of-the-art graphics card. To reduce the memory consumption, we fuse the acquired depth maps and colors in a multi-scale octree representation of a signed distance function. To estimate the camera poses, we construct a pose graph and use dense image alignment to determine the relative pose between pairs of frames. We add edges between nodes when we detect loop-closures and optimize the pose graph to correct for long-term drift. Our implementation is highly parallelized on graphics hardware to achieve real-time performance. More specifically, we can reconstruct, store, and continuously update a colored 3D model of an entire corridor of nine rooms at high levels of detail in real-time on a single GPU with 2.5GB.
4 0.80113 139 iccv-2013-Elastic Fragments for Dense Scene Reconstruction
Author: Qian-Yi Zhou, Stephen Miller, Vladlen Koltun
Abstract: We present an approach to reconstruction of detailed scene geometry from range video. Range data produced by commodity handheld cameras suffers from high-frequency errors and low-frequency distortion. Our approach deals with both sources of error by reconstructing locally smooth scene fragments and letting these fragments deform in order to align to each other. We develop a volumetric registration formulation that leverages the smoothness of the deformation to make optimization practical for large scenes. Experimental results demonstrate that our approach substantially increases the fidelity of complex scene geometry reconstructed with commodity handheld cameras.
5 0.76246971 9 iccv-2013-A Flexible Scene Representation for 3D Reconstruction Using an RGB-D Camera
Author: Diego Thomas, Akihiro Sugimoto
Abstract: Updating a global 3D model with live RGB-D measurements has proven to be successful for 3D reconstruction of indoor scenes. Recently, a Truncated Signed Distance Function (TSDF) volumetric model and a fusion algorithm have been introduced (KinectFusion), showing significant advantages such as computational speed and accuracy of the reconstructed scene. This algorithm, however, is expensive in memory when constructing and updating the global model. As a consequence, the method is not well scalable to large scenes. We propose a new flexible 3D scene representation using a set of planes that is cheap in memory use and, nevertheless, achieves accurate reconstruction of indoor scenes from RGB-D image sequences. Projecting the scene onto different planes reduces significantly the size of the scene representation and thus it allows us to generate a global textured 3D model with lower memory requirement while keeping accuracy and easiness to update with live RGB-D measurements. Experimental results demonstrate that our proposed flexible 3D scene representation achieves accurate reconstruction, while keeping the scalability for large indoor scenes.
6 0.759525 281 iccv-2013-Multi-view Normal Field Integration for 3D Reconstruction of Mirroring Objects
7 0.72447366 56 iccv-2013-Automatic Registration of RGB-D Scans via Salient Directions
8 0.6982038 387 iccv-2013-Shape Anchors for Data-Driven Multi-view Reconstruction
9 0.67156291 343 iccv-2013-Real-World Normal Map Capture for Nearly Flat Reflective Surfaces
10 0.67038196 199 iccv-2013-High Quality Shape from a Single RGB-D Image under Uncalibrated Natural Illumination
11 0.66708207 254 iccv-2013-Live Metric 3D Reconstruction on Mobile Phones
12 0.66238374 102 iccv-2013-Data-Driven 3D Primitives for Single Image Understanding
13 0.65674525 410 iccv-2013-Support Surface Prediction in Indoor Scenes
14 0.64656949 2 iccv-2013-3D Scene Understanding by Voxel-CRF
15 0.6442287 366 iccv-2013-STAR3D: Simultaneous Tracking and Reconstruction of 3D Objects Using RGB-D Data
16 0.62163752 382 iccv-2013-Semi-dense Visual Odometry for a Monocular Camera
17 0.62152737 16 iccv-2013-A Generic Deformation Model for Dense Non-rigid Surface Registration: A Higher-Order MRF-Based Approach
18 0.60577351 209 iccv-2013-Image Guided Depth Upsampling Using Anisotropic Total Generalized Variation
19 0.59408492 183 iccv-2013-Geometric Registration Based on Distortion Estimation
20 0.58963275 108 iccv-2013-Depth from Combining Defocus and Correspondence Using Light-Field Cameras
topicId topicWeight
[(2, 0.058), (6, 0.014), (7, 0.014), (26, 0.064), (31, 0.039), (40, 0.016), (42, 0.096), (48, 0.012), (64, 0.056), (73, 0.05), (78, 0.149), (89, 0.327), (95, 0.012)]
simIndex simValue paperId paperTitle
1 0.97982121 252 iccv-2013-Line Assisted Light Field Triangulation and Stereo Matching
Author: Zhan Yu, Xinqing Guo, Haibing Lin, Andrew Lumsdaine, Jingyi Yu
Abstract: Light fields are image-based representations that use densely sampled rays as a scene description. In this paper, we explore geometric structures of 3D lines in ray space for improving light field triangulation and stereo matching. The triangulation problem aims to fill in the ray space with continuous and non-overlapping simplices anchored at sampled points (rays). Such a triangulation provides a piecewise-linear interpolant useful for light field superresolution. We show that the light field space is largely bilinear due to 3D line segments in the scene, and direct triangulation of these bilinear subspaces leads to large errors. We instead present a simple but effective algorithm to first map bilinear subspaces to line constraints and then apply Constrained Delaunay Triangulation (CDT). Based on our analysis, we further develop a novel line-assisted graphcut (LAGC) algorithm that effectively encodes 3D line constraints into light field stereo matching. Experiments on synthetic and real data show that both our triangulation and LAGC algorithms outperform state-of-the-art solutions in accuracy and visual quality.
2 0.97138906 69 iccv-2013-Capturing Global Semantic Relationships for Facial Action Unit Recognition
Author: Ziheng Wang, Yongqiang Li, Shangfei Wang, Qiang Ji
Abstract: In this paper we tackle the problem of facial action unit (AU) recognition by exploiting the complex semantic relationships among AUs, which carry crucial top-down information yet have not been thoroughly exploited. Towards this goal, we build a hierarchical model that combines the bottom-level image features and the top-level AU relationships to jointly recognize AUs in a principled manner. The proposed model has two major advantages over existing methods. 1) Unlike methods that can only capture local pair-wise AU dependencies, our model is developed upon the restricted Boltzmann machine and therefore can exploit the global relationships among AUs. 2) Although AU relationships are influenced by many related factors such as facial expressions, these factors are generally ignored by the current methods. Our model, however, can successfully capture them to more accurately characterize the AU relationships. Efficient learning and inference algorithms of the proposed model are also developed. Experimental results on benchmark databases demonstrate the effectiveness of the proposed approach in modelling complex AU relationships as well as its superior AU recognition performance over existing approaches.
3 0.95923877 175 iccv-2013-From Actemes to Action: A Strongly-Supervised Representation for Detailed Action Understanding
Author: Weiyu Zhang, Menglong Zhu, Konstantinos G. Derpanis
Abstract: This paper presents a novel approach for analyzing human actions in non-scripted, unconstrained video settings based on volumetric, x-y-t, patch classifiers, termed actemes. Unlike previous action-related work, the discovery of patch classifiers is posed as a strongly-supervised process. Specifically, keypoint labels (e.g., position) across spacetime are used in a data-driven training process to discover patches that are highly clustered in the spacetime keypoint configuration space. To support this process, a new human action dataset consisting of challenging consumer videos is introduced, where notably the action label, the 2D position of a set of keypoints and their visibilities are provided for each video frame. On a novel input video, each acteme is used in a sliding volume scheme to yield a set of sparse, non-overlapping detections. These detections provide the intermediate substrate for segmenting out the action. For action classification, the proposed representation shows significant improvement over state-of-the-art low-level features, while providing spatiotemporal localiza- tion as additional output. This output sheds further light into detailed action understanding.
4 0.95635349 344 iccv-2013-Recognising Human-Object Interaction via Exemplar Based Modelling
Author: Jian-Fang Hu, Wei-Shi Zheng, Jianhuang Lai, Shaogang Gong, Tao Xiang
Abstract: Human action can be recognised from a single still image by modelling Human-object interaction (HOI), which infers the mutual spatial structure information between human and object as well as their appearance. Existing approaches rely heavily on accurate detection of human and object, and estimation of human pose. They are thus sensitive to large variations of human poses, occlusion and unsatisfactory detection of small size objects. To overcome this limitation, a novel exemplar based approach is proposed in this work. Our approach learns a set of spatial pose-object interaction exemplars, which are density functions describing how a person is interacting with a manipulated object for different activities spatially in a probabilistic way. A representation based on our HOI exemplar thus has great potential for being robust to the errors in human/object detection and pose estimation. A new framework consists of a proposed exemplar based HOI descriptor and an activity specific matching model that learns the parameters is formulated for robust human activity recog- nition. Experiments on two benchmark activity datasets demonstrate that the proposed approach obtains state-ofthe-art performance.
same-paper 5 0.95519269 319 iccv-2013-Point-Based 3D Reconstruction of Thin Objects
Author: Benjamin Ummenhofer, Thomas Brox
Abstract: 3D reconstruction deals with the problem of finding the shape of an object from a set of images. Thin objects that have virtually no volume pose a special challenge for reconstruction with respect to shape representation and fusion of depth information. In this paper we present a dense point-based reconstruction method that can deal with this special class of objects. We seek to jointly optimize a set of depth maps by treating each pixel as a point in space. Points are pulled towards a common surface by pairwise forces in an iterative scheme. The method also handles the problem of opposed surfaces by means of penalty forces. Efficient optimization is achieved by grouping points to superpixels and a spatial hashing approach for fast neighborhood queries. We show that the approach is on a par with state-of-the-art methods for standard multi view stereo settings and gives superior results for thin objects.
6 0.94519264 155 iccv-2013-Facial Action Unit Event Detection by Cascade of Tasks
7 0.93405133 268 iccv-2013-Modeling 4D Human-Object Interactions for Event and Object Recognition
8 0.92829216 105 iccv-2013-DeepFlow: Large Displacement Optical Flow with Deep Matching
9 0.92404521 290 iccv-2013-New Graph Structured Sparsity Model for Multi-label Image Annotations
10 0.92045617 129 iccv-2013-Dynamic Scene Deblurring
11 0.91980517 265 iccv-2013-Mining Motion Atoms and Phrases for Complex Action Recognition
12 0.91780609 256 iccv-2013-Locally Affine Sparse-to-Dense Matching for Motion and Occlusion Estimation
13 0.91760266 317 iccv-2013-Piecewise Rigid Scene Flow
14 0.91619527 382 iccv-2013-Semi-dense Visual Odometry for a Monocular Camera
15 0.91432858 410 iccv-2013-Support Surface Prediction in Indoor Scenes
16 0.91376764 444 iccv-2013-Viewing Real-World Faces in 3D
17 0.91358292 343 iccv-2013-Real-World Normal Map Capture for Nearly Flat Reflective Surfaces
18 0.91301775 9 iccv-2013-A Flexible Scene Representation for 3D Reconstruction Using an RGB-D Camera
19 0.9124949 439 iccv-2013-Video Co-segmentation for Meaningful Action Extraction
20 0.91237748 174 iccv-2013-Forward Motion Deblurring