cvpr cvpr2013 cvpr2013-230 knowledge-graph by maker-knowledge-mining

230 cvpr-2013-Joint 3D Scene Reconstruction and Class Segmentation

Source: pdf

Author: Christian Häne, Christopher Zach, Andrea Cohen, Roland Angst, Marc Pollefeys

Abstract: Both image segmentation and dense 3D modeling from images represent an intrinsically ill-posed problem. Strong regularizers are therefore required to constrain the solutions from being ’too noisy’. Unfortunately, these priors generally yield overly smooth reconstructions and/or segmentations in certain regions whereas they fail in other areas to constrain the solution sufficiently. In this paper we argue that image segmentation and dense 3D reconstruction contribute valuable information to each other’s task. As a consequence, we propose a rigorous mathematical framework to formulate and solve a joint segmentation and dense reconstruction problem. Image segmentations provide geometric cues about which surface orientations are more likely to appear at a certain location in space whereas a dense 3D reconstruction yields a suitable regularization for the segmentation problem by lifting the labeling from 2D images to 3D space. We show how appearance-based cues and 3D surface orientation priors can be learned from training data and subsequently used for class-specific regularization. Experimental results on several real data sets highlight the advantages of our joint formulation.

Reference: text

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Both image segmentation and dense 3D modeling from images represent an intrinsically ill-posed problem. [sent-3, score-0.173]

2 Unfortunately, these priors generally yield overly smooth reconstructions and/or segmentations in certain regions whereas they fail in other areas to constrain the solution sufficiently. [sent-5, score-0.152]

3 In this paper we argue that image segmentation and dense 3D reconstruction contribute valuable information to each other’s task. [sent-6, score-0.241]

4 As a consequence, we propose a rigorous mathematical framework to formulate and solve a joint segmentation and dense reconstruction problem. [sent-7, score-0.36]

5 Image segmentations provide geometric cues about which surface orientations are more likely to appear at a certain location in space whereas a dense 3D reconstruction yields a suitable regularization for the segmentation problem by lifting the labeling from 2D images to 3D space. [sent-8, score-0.523]

6 We show how appearance-based cues and 3D surface orientation priors can be learned from training data and subsequently used for class-specific regularization. [sent-9, score-0.298]

7 Introduction Even though remarkable progress has been made in recent years, both image segmentation and dense 3D modeling from images remain intrinsically ill-posed problems. [sent-12, score-0.173]

8 Traditionally, the priors enforced in image segmentation approaches are stated entirely in the 2D image domain (e. [sent-14, score-0.185]

9 a contrast-sensitive spatial smoothness assumption), whereas priors employed for image-based reconstruction typically yield piece-wise smooth surfaces in 3D as their solutions. [sent-16, score-0.438]

10 In this paper we demonstrate that joint image segmentation and dense 3D reconstruction is beneficial for both tasks. [sent-17, score-0.354]

11 We extend volumetric scene reconstruction methods, which segment a volume of interest into occupied and free-space regions, to a multi-label volumetric segmentation framework assigning object classes or a free-space label to voxels. [sent-21, score-0.651]

12 On the one hand, such a joint approach is highly beneficial since the associated appearance (and therefore a likely semantic category) of surface elements can influence the spatial smoothness prior. [sent-22, score-0.474]

13 Thus, a class-specific regularizer guided by image appearances can adaptively enforce spatial smoothness and preferred orientations of 3D surfaces. [sent-23, score-0.225]

14 In a nutshell, we propose to learn appearance likelihoods and class-specific geometry priors for surface orientations from training data in an initial step. [sent-26, score-0.526]

15 These data-driven priors can then be used to define unary and pairwise potentials in a volumetric segmentation framework, complementary to the measured evidence acquired from depth maps. [sent-27, score-0.652]

16 While optimizing over the label assignment in this volume, the image-based appearance likelihoods, depth maps from computational stereo, and geometric priors interact with each other yielding an improved dense reconstruction and labeling. [sent-28, score-0.512]

17 Given a collection of depth images (or equivalently densely sampled oriented 3D points) the methods proposed in [13, 27, 23] essentially utilize the surface area as regularization prior, and obtain the final surface representation indirectly via volumetric optimization. [sent-35, score-0.691]

18 independent of the surface normal (up to the impact of the underlying discretization), corresponding to a total variation (TV) regularizer in the volumetric representation. [sent-39, score-0.417]

19 The work of [10] utilizes an anisotropic TV prior for 3D modeling in order to enforce the consistency of the surface normals with a given normal field, thus better preserving high frequency details in the final reconstruction. [sent-40, score-0.308]

20 All of the above mentioned work on volumetric 3D modeling from images returns solely a binary decision on the occupancy state of a voxel. [sent-41, score-0.256]

21 Hence, these methods are unaware of typical class-specific geometry, such as the normals of the ground plane pointing upwards. [sent-42, score-0.181]

22 These methods are therefore unable to adjust the utilized smoothness prior in an object- or class-specific way. [sent-43, score-0.181]

23 More specifically, it is notoriously difficult to faithfully reconstruct weakly or indirectly observed parts of the scene such as the ground, which is usually captured in images at very slanted angles (at least in terrestrial image data). [sent-45, score-0.168]

24 [9] proposes to extend an adaptive volumetric method for surface reconstruction in order not to miss important parts of the scene in the final geometry. [sent-46, score-0.478]

25 The assumption in their method is that surfaces with weak evidence are likely to be real surfaces if adjacent to strongly observed freespace. [sent-47, score-0.158]

26 A key property of our work is that weakly supported scene geometry can be assisted by a class-specific smoothness prior. [sent-48, score-0.244]

27 If only a single image is considered and direct depth cues from multiple images are not available, assigning object categories to pixels yields crucial information about the 3D scene layout [8, 19], e. [sent-49, score-0.19]

28 by exploiting the fact that building facades are usually vertical, and ground is typically horizontal. [sent-51, score-0.146]

29 by assuming a particular layout for indoor images [15], a tiered layout [6] or class-specific 2D smoothness priors [22]. [sent-55, score-0.398]

30 Utilizing appearance-based pixel categories and stereo cues in a joint framework was proposed in [11] in order to improve the quality of obtained depth maps and semantic image segmentations. [sent-56, score-0.343]

31 In our work, we also aim at joint estimation of 3D scene geometry and assignment of semantic categories, but use a completely different problem representation (one that intrinsically uses multiple images) and solution method. [sent-57, score-0.25]

32 [18, 2] also present joint segmentation and 3D reconstruction methods, but the determined segments correspond to individual objects (in terms of an underlying smooth geometry) rather than to semantic categories. [sent-58, score-0.312]

33 Furthermore, a method [1] using semantic information for dense object reconstruction in form of shape priors has been developed concurrently to our work. [sent-59, score-0.38]

34 Joint 3D Reconstruction and Classification In this section we describe the underlying energy formulation for our proposed joint surface reconstruction and classification framework and its motivation. [sent-61, score-0.344]

35 Similar to previous works on global surface reconstruction we lift the problem from an explicit surface representation to an implicit volumetric one. [sent-62, score-0.622]

36 Continuous Formulation We cast the ultimate goal of semantically guided shape reconstruction as a volumetric labeling problem, where one out of L + 1 labels is assigned to each location z ∈ Ω in a continuous volumetric domain Ω ⊂ R^3. [sent-66, score-0.665]

37 With this notation in place, the convex relaxation of the labeling problem in a continuous volumetric domain Ω reads as Econt(x,y) =? [sent-85, score-0.368]

38 We choose η(dˆ − d) = β sgn(dˆ − d) for β > 0, corresponding to exponentially distributed noise for depth inliers. [sent-94, score-0.151]

39 Inserting unaries only near the observed depth corresponds to truncating the cost function, hence we assume exponentially distributed inliers and uniformly distributed outlier depth values. [sent-95, score-0.501]

40 Figure 2: Unaries assigned to voxels along a particular line-of-sight. [sent-98, score-0.157]

41 Since we enforce spatial smoothness of the labeling (i. [sent-99, score-0.181]

42 multiple crossings within the narrow band near dˆ are very unlikely), we expect three possible configurations for voxels in [dˆ − δ, dˆ + δ] described below. [sent-101, score-0.228]

43 In the labeling of interest we have that free-space transitions to a particular object class iat depth d. [sent-104, score-0.308]

44 7 over th [de, voxels in [dˆ − δ, dˆ + δ] yields [dˆ dˆ+ σclass i+ ? [sent-107, score-0.157]

45 If all voxels in the particular range − δ, + δ] are freespace (x0s = 1for all the voxels ind dt h−is range), then the contribution to the total energy is just σsky. [sent-124, score-0.454]

46 Since a potential transition to a solid object class outside the near band is not taken into account, this choice of unary potentials implicitly encodes the assumption that that freespace near the observed depth implies freespace along the whole ray. [sent-125, score-0.785]

47 All voxels in the range are assigned to object label i(i. [sent-127, score-0.157]

48 This means that there [dˆ dˆ 111000000 was a transition from freespace to object type iearlier along the ray. [sent-130, score-0.2]

49 Overall, our choice of unaries will faithfully approximate the desired true data costs in most cases. [sent-132, score-0.191]

50 Since camera centers are in free-space by definition, we add a slight bias towards free-space along the line-of-sight from the respective camera center to the observed depth (i. [sent-133, score-0.245]

51 dˆ Missing depth: If no depth was observed at a particular pixel p, we cannot assign unaries along the corresponding ray. [sent-137, score-0.315]

52 Since missing depth values mostly occur in the sky regions of images, we found the following modification helpful to avoid “bleeding” of buildings etc. [sent-138, score-0.266]

53 beyond their respective silhouettes in the image: in case of missing depth we set the unary potentials to ρs0 = min {0, σsky − mini? [sent-139, score-0.354]

54 =sky σi } (8) and ρsi = 0 for i > 0 for all voxels s along ray(p). [sent-140, score-0.157]

55 This choice of unaries favors freespace along the whole ray whenever depth is missing and sky is the most likely class label in the image. [sent-141, score-0.729]

56 Training the Priors In this section, we will explain how the appearance likelihoods used in the unary potentials ρsi and the class-specific geometric priors φij are learned from training data. [sent-143, score-0.414]

57 While the appearance terms are based on classification scores of a standard classifier, training of geometric priors from labeled data is more involved. [sent-144, score-0.219]

58 We first start describing the training of the appearance likelihoods before discussing the training procedure for smoothness priors. [sent-145, score-0.332]

59 Appearance Likelihoods In order to get classification scores for the labels in the input images we train a boosted decision tree classifier [7] on manually labeled training data. [sent-148, score-0.144]

60 It should be noted that the geometry and location features are extracted by using 2-D information on the images (superpixel size, shape, and relative position in the image) and they are not related to the 3-D geometry of the scene. [sent-152, score-0.142]

61 The extracted features and ground truth annotations are fed into the boosted decision tree. [sent-153, score-0.123]

62 Class-Specific Geometric Priors We use a parametric model for the functions φisj appearing in the smoothness term of Eq. [sent-163, score-0.137]

63 Let si↔j denote a transition event between labels i and j at some voxel s, and let nisj be the (unit-length) boundary normal at this voxel. [sent-169, score-0.174]

64 where the summation goes over all the Nij transition samples between labels i and j. [sent-205, score-0.129]

65 3) which enables us to train ψij for the transitions ground ↔ free space, ground ↔ building and building ↔ free space. [sent-210, score-0.226]

66 5 are usually non-normalized gradient directions yisj =def xisj − xsji ∈ [−1, 1]^3. [sent-214, score-0.158]

67 However, remember that ψij is a convex and positively 1-homogeneous function. [sent-215, score-0.133]

68 Together with the fact that the area of the surface element in finite difference discretizations is captured exactly by ? [sent-216, score-0.144]

69 2, we derive the contribution of yisj to the regularizer as ? [sent-218, score-0.216]

70 be composed of an anisotropic, direction-dependent component ψij and an isotropic contribution proportional to Cij = log Zij − log P(i ↔ j). [sent-245, score-0.148]

71 14 above is positively 1-homogeneous if ψij is, but convexity can only be guaranteed whenever Cij = log Zij − log P(i ↔ j) ≥ 0, i.e. P(i ↔ j) ≤ Zij. [sent-249, score-0.133]

72 This is in practice not a severe restriction, since for a sufficiently fine discretization of the domain the occurrence of a boundary surface is a very rare event and therefore P(i ↔ j) ? [sent-250, score-0.176]
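The condition on the isotropic weight Cij = log Zij − log P(i ↔ j) can be checked numerically. The following Python sketch uses made-up values for Zij and P(i ↔ j), and the function names are hypothetical. It only illustrates that a rare transition gives a non-negative weight.

```python
import math


def isotropic_weight(z_ij, p_transition):
    """C_ij = log Z_ij - log P(i <-> j), as defined in the text."""
    return math.log(z_ij) - math.log(p_transition)


def convexity_guaranteed(z_ij, p_transition):
    """Convexity holds whenever C_ij >= 0, i.e. P(i <-> j) <= Z_ij."""
    return isotropic_weight(z_ij, p_transition) >= 0.0


# Illustrative numbers only: a rare transition (P = 0.01) with Z = 1.
c_rare = isotropic_weight(1.0, 0.01)
ok_rare = convexity_guaranteed(1.0, 0.01)
# A frequent transition whose probability exceeds Z violates the condition.
ok_frequent = convexity_guaranteed(0.5, 0.9)
```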

73 Choices for ψij We need to restrict ψij to be convex and positively 1-homogeneous. [sent-254, score-0.133]

74 nt route and parametrize the convex conjugate of ψij, ? [sent-258, score-0.123]

75 Given remark 1 above there is no need to model the Wulff shape with an isotropic and an anisotropic component (i. [sent-271, score-0.232]

76 The Wulff shapes described below are designed to model two frequent surface priors encountered in urban environments: one prior favors surface normals that are in alignment with a specific direction (e. [sent-274, score-0.589]

77 ground surface normals prefer to be aligned with the vertical direction), and the second Wulff shape favors surface normals orthogonal to a given direction (such as facade surfaces having generally normals perpendicular to the vertical direction). [sent-276, score-0.821]

78 We refer to the supplementary material for graphical illustrations of the Wulff shapes and induced smoothness costs. [sent-278, score-0.174]
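As a rough illustration of how a Wulff shape induces a direction-dependent smoothness cost, the sketch below evaluates the support function ψ(n) = max over p in W of ⟨p, n⟩ for a finite point set W. This is the standard correspondence between a convex, positively 1-homogeneous function and a convex set. The box-shaped W used here is a made-up example, not one of the shapes from the paper.

```python
from itertools import product


def support_function(wulff_points, n):
    """psi(n) = max_{p in W} <p, n> for a finite sample W of a Wulff shape."""
    return max(sum(p_k * n_k for p_k, n_k in zip(p, n)) for p in wulff_points)


# Hypothetical Wulff shape: the 8 corners of a box that is wide in x and y
# but flat in z, so normals along z are cheap and horizontal ones expensive.
BOX = [(sx * 2.0, sy * 2.0, sz * 0.5)
       for sx, sy, sz in product((-1.0, 1.0), repeat=3)]

cost_vertical = support_function(BOX, (0.0, 0.0, 1.0))
cost_horizontal = support_function(BOX, (1.0, 0.0, 0.0))
# Positive 1-homogeneity: scaling the normal scales the cost.
cost_scaled = support_function(BOX, (0.0, 0.0, 3.0))
```

Because a maximum of linear functions is always convex and positively 1-homogeneous, parametrizing the shape rather than ψ itself keeps both properties by construction.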

79 The corresponding function ψ favors directions pointing upwards and isotropically penalizes downward pointing normals. [sent-284, score-0.149]

80 We compare our geometry to a standard volumetric fusion (in particular “TV-Flux” [23]) and also illustrate the improvement of the class segmentation compared to a single image best-cost segmentation. [sent-301, score-0.436]

81 The depth maps are computed using plane sweep stereo matching for each of the images with zero mean normalized cross correlation (ZNCC) matching costs. [sent-304, score-0.215]
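For reference, zero-mean normalized cross correlation between two image patches can be computed as in the following Python sketch. It uses the standard ZNCC definition on flattened patches. How the paper converts the score into a matching cost is not stated in this summary, so that step is left out, and the `eps` guard is an assumption of this sketch.

```python
import math


def zncc(patch_a, patch_b, eps=1e-12):
    """Zero-mean normalized cross correlation of two equally sized patches.

    Patches are flat sequences of intensities. The result lies in [-1, 1];
    textureless (constant) patches return 0.0 instead of dividing by zero.
    """
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    da = [a - mean_a for a in patch_a]
    db = [b - mean_b for b in patch_b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den > eps else 0.0


# Invariant to gain and offset: a brighter, rescaled copy still matches.
score_same = zncc([1, 2, 3, 4], [10, 20, 30, 40])
# A reversed gradient is perfectly anti-correlated.
score_flipped = zncc([1, 2, 3, 4], [4, 3, 2, 1])
# A constant patch carries no texture to match.
score_flat = zncc([5, 5, 5, 5], [1, 2, 3, 4])
```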

82 To get rid of the noise the raw depth maps are filtered by discarding depth values with a ZNCC matching score above 0. [sent-306, score-0.302]

83 The class scores are obtained by using the boosted decision tree classifier explained in Section 5. [sent-308, score-0.128]

84 As expected, computational stereo in particular struggles with faithfully capturing the ground, which is represented by relatively few depth samples. [sent-315, score-0.276]

85 Consequently, depth integration methods with a generic surface prior such as TV-Flux easily remove the ground and other weakly observed surfaces (due to the well-known shrinking bias of the employed boundary regularizer). [sent-316, score-0.506]

86 In contrast, our proposed joint optimization leads to more accurate geometry, and at the same time image segmentation is clearly improved over a greedy best-cost class assignment. [sent-317, score-0.199]

87 4 illustrates that the most probable class labels according to the trained appearance likelihoods especially confuse ground, building, and clutter categories. [sent-319, score-0.215]

88 Fusing appearance likelihood over multiple images and incorporating the surface geometry almost perfectly disambiguates the assigned object classes. [sent-320, score-0.248]

89 The joint determination of the right smoothness prior also enables our approach to fully reconstruct ground and all the facades as seen in Fig. [sent-321, score-0.329]

90 The ground is consistently missing in the TV-Flux results, and partially the facades and roof structure suffer from the generic smoothness assumption Fig. [sent-323, score-0.29]

91 We selected a weighting between data fidelity and smoothness in the TV-Flux method such that successfully reconstructed surfaces have a (visually) similar level of smoothness to the results of our proposed method. [sent-325, score-0.336]

92 Conclusion We present an approach for dense 3D scene reconstruction from multiple images and simultaneous image segmentation. [sent-327, score-0.176]

93 This challenging problem is formulated as a joint volumetric inference task over multiple labels, which enables us to utilize class-specific smoothness assumptions in order to improve the quality of the obtained reconstruction. [sent-328, score-0.433]

94 We use a parametric representation for the respective smoothness priors, which yields a compact representation for the priors and—at the same time—allows to adjust the underlying parameters from training data. [sent-329, score-0.351]

95 We demonstrate the benefits of our approach over standard smoothness assumptions for volumetric scene reconstruction on several challenging data sets. [sent-330, score-0.471]

96 As a volumetric approach operating in a regular voxel grid, our method shares the limitations in terms of spatial resolution with most other volumetric approaches. [sent-332, score-0.475]

97 Adaptive representations for volumetric data can be a potential solution. [sent-333, score-0.215]

98 Input images, example depth map, raw image labeling, our result, TV-Flux fusion result. The different class labels are depicted using the following color scheme: building → red, ground → dark gray, vegetation → green, clutter → light gray. [sent-398, score-0.406]

99 Joint optimisation for object class segmentation and dense stereo reconstruction. [sent-426, score-0.239]

100 A comparison and evaluation of multi-view stereo reconstruction algorithms. [sent-495, score-0.183]

similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('ij', 0.485), ('wulff', 0.286), ('volumetric', 0.215), ('nsij', 0.19), ('zij', 0.167), ('yisj', 0.158), ('voxels', 0.157), ('depth', 0.151), ('surface', 0.144), ('freespace', 0.14), ('smoothness', 0.137), ('unaries', 0.13), ('priors', 0.12), ('reconstruction', 0.119), ('likelihoods', 0.094), ('zach', 0.088), ('normals', 0.087), ('joint', 0.081), ('nij', 0.078), ('anisotropic', 0.077), ('convex', 0.074), ('sky', 0.073), ('geometry', 0.071), ('segmentation', 0.065), ('stereo', 0.064), ('nikj', 0.063), ('tiered', 0.063), ('zncc', 0.063), ('facades', 0.063), ('def', 0.063), ('surfaces', 0.062), ('faithfully', 0.061), ('respective', 0.06), ('isotropic', 0.06), ('transitions', 0.06), ('transition', 0.06), ('positively', 0.059), ('regularizer', 0.058), ('remark', 0.058), ('dense', 0.057), ('favors', 0.057), ('cadastral', 0.056), ('ray', 0.053), ('class', 0.053), ('ptn', 0.052), ('vegetation', 0.052), ('isj', 0.052), ('si', 0.052), ('unary', 0.052), ('intrinsically', 0.051), ('potentials', 0.049), ('parametrize', 0.049), ('xsi', 0.049), ('ground', 0.048), ('semantic', 0.047), ('pointing', 0.046), ('voxel', 0.045), ('roland', 0.045), ('labeling', 0.044), ('utilized', 0.044), ('log', 0.044), ('missing', 0.042), ('un', 0.042), ('decision', 0.041), ('minimizer', 0.041), ('relaxations', 0.04), ('layout', 0.039), ('cij', 0.039), ('cap', 0.038), ('rigorous', 0.038), ('shape', 0.037), ('shapes', 0.037), ('indirectly', 0.037), ('occupied', 0.037), ('weakly', 0.036), ('band', 0.036), ('argm', 0.035), ('near', 0.035), ('continuous', 0.035), ('labels', 0.035), ('building', 0.035), ('training', 0.034), ('observed', 0.034), ('vertical', 0.034), ('iand', 0.034), ('boosted', 0.034), ('pock', 0.034), ('appearance', 0.033), ('discretization', 0.032), ('carlo', 0.032), ('beneficial', 0.032), ('geometric', 0.032), ('segmentations', 0.032), ('fusion', 0.032), ('city', 0.031), ('integration', 0.031), ('monte', 0.031), ('relations', 0.03), ('whenever', 0.03), ('orientations', 0.03)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000005 230 cvpr-2013-Joint 3D Scene Reconstruction and Class Segmentation

Author: Christian Häne, Christopher Zach, Andrea Cohen, Roland Angst, Marc Pollefeys

2 0.24573748 397 cvpr-2013-Simultaneous Super-Resolution of Depth and Images Using a Single Camera

Author: Hee Seok Lee, Kyoung Mu Lee

Abstract: In this paper, we propose a convex optimization framework for simultaneous estimation of super-resolved depth map and images from a single moving camera. The pixel measurement error in 3D reconstruction is directly related to the resolution of the images at hand. In turn, even a small measurement error can cause significant errors in reconstructing 3D scene structure or camera pose. Therefore, enhancing image resolution can be an effective solution for securing the accuracy as well as the resolution of 3D reconstruction. In the proposed method, depth map estimation and image super-resolution are formulated in a single energy minimization framework with a convex function and solved efficiently by a first-order primal-dual algorithm. Explicit inter-frame pixel correspondences are not required for our super-resolution procedure, thus we can avoid a huge computation time and obtain improved depth map in the accuracy and resolution as well as high-resolution images with reasonable time. The superiority of our algorithm is demonstrated by presenting the improved depth map accuracy, image super-resolution results, and camera pose estimation.

3 0.21046665 24 cvpr-2013-A Principled Deep Random Field Model for Image Segmentation

Author: Pushmeet Kohli, Anton Osokin, Stefanie Jegelka

Abstract: We discuss a model for image segmentation that is able to overcome the short-boundary bias observed in standard pairwise random field based approaches. To wit, we show that a random field with multi-layered hidden units can encode boundary preserving higher order potentials such as the ones used in the cooperative cuts model of [11] while still allowing for fast and exact MAP inference. Exact inference allows our model to outperform previous image segmentation methods, and to see the true effect of coupling graph edges. Finally, our model can be easily extended to handle segmentation instances with multiple labels, for which it yields promising results.

4 0.1810564 86 cvpr-2013-Composite Statistical Inference for Semantic Segmentation

Author: Fuxin Li, Joao Carreira, Guy Lebanon, Cristian Sminchisescu

Abstract: In this paper we present an inference procedure for the semantic segmentation of images. Different from many CRF approaches that rely on dependencies modeled with unary and pairwise pixel or superpixel potentials, our method is entirely based on estimates of the overlap between each of a set of mid-level object segmentation proposals and the objects present in the image. We define continuous latent variables on superpixels obtained by multiple intersections of segments, then output the optimal segments from the inferred superpixel statistics. The algorithm is capable of recombining and refining initial mid-level proposals, as well as handling multiple interacting objects, even from the same class, all in a consistent joint inference framework by maximizing the composite likelihood of the underlying statistical model using an EM algorithm. In the PASCAL VOC segmentation challenge, the proposed approach obtains high accuracy and successfully handles images of complex object interactions.

5 0.16722158 111 cvpr-2013-Dense Reconstruction Using 3D Object Shape Priors

Author: Amaury Dame, Victor A. Prisacariu, Carl Y. Ren, Ian Reid

Abstract: We propose a formulation of monocular SLAM which combines live dense reconstruction with shape priors-based 3D tracking and reconstruction. Current live dense SLAM approaches are limited to the reconstruction of visible surfaces. Moreover, most of them are based on the minimisation of a photo-consistency error, which usually makes them sensitive to specularities. In the 3D pose recovery literature, problems caused by imperfect and ambiguous image information have been dealt with by using prior shape knowledge. At the same time, the success of depth sensors has shown that combining joint image and depth information drastically increases the robustness of the classical monocular 3D tracking and 3D reconstruction approaches. In this work we link dense SLAM to 3D object pose and shape recovery. More specifically, we automatically augment our SLAM system with object specific identity, together with 6D pose and additional shape degrees of freedom for the object(s) of known class in the scene, combining image data and depth information for the pose and shape recovery. This leads to a system that allows for full scaled 3D reconstruction with the known object(s) segmented from the scene. The segmentation enhances the clarity, accuracy and completeness of the maps built by the dense SLAM system, while the dense 3D data aids the segmentation process, yielding faster and more reliable convergence than when using 2D image data alone.

6 0.16136509 61 cvpr-2013-Beyond Point Clouds: Scene Understanding by Reasoning Geometry and Physics

7 0.15219243 245 cvpr-2013-Layer Depth Denoising and Completion for Structured-Light RGB-D Cameras

8 0.13617313 423 cvpr-2013-Template-Based Isometric Deformable 3D Reconstruction with Sampling-Based Focal Length Self-Calibration

9 0.13340352 187 cvpr-2013-Geometric Context from Videos

10 0.13336711 117 cvpr-2013-Detecting Changes in 3D Structure of a Scene from Multi-view Images Captured by a Vehicle-Mounted Camera

11 0.13122812 394 cvpr-2013-Shading-Based Shape Refinement of RGB-D Images

12 0.13097668 196 cvpr-2013-HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences

13 0.12083395 284 cvpr-2013-Mesh Based Semantic Modelling for Indoor and Outdoor Scenes

14 0.11699259 43 cvpr-2013-Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs

15 0.1166074 303 cvpr-2013-Multi-view Photometric Stereo with Spatially Varying Isotropic Materials

16 0.11345995 329 cvpr-2013-Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images

17 0.11323068 227 cvpr-2013-Intrinsic Scene Properties from a Single RGB-D Image

18 0.11258654 354 cvpr-2013-Relative Volume Constraints for Single View 3D Reconstruction

19 0.11115962 443 cvpr-2013-Uncalibrated Photometric Stereo for Unknown Isotropic Reflectances

20 0.10824358 188 cvpr-2013-Globally Consistent Multi-label Assignment on the Ray Space of 4D Light Fields

similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.255), (1, 0.182), (2, 0.036), (3, 0.037), (4, 0.054), (5, -0.059), (6, -0.058), (7, 0.13), (8, -0.032), (9, 0.001), (10, 0.051), (11, -0.05), (12, -0.066), (13, 0.079), (14, -0.035), (15, -0.011), (16, -0.021), (17, 0.079), (18, -0.024), (19, -0.03), (20, -0.083), (21, -0.037), (22, 0.022), (23, 0.042), (24, -0.002), (25, 0.058), (26, -0.024), (27, 0.019), (28, -0.037), (29, -0.115), (30, -0.038), (31, 0.007), (32, -0.02), (33, -0.011), (34, 0.058), (35, -0.043), (36, 0.038), (37, -0.027), (38, -0.047), (39, -0.064), (40, 0.129), (41, -0.008), (42, -0.009), (43, -0.037), (44, -0.03), (45, 0.014), (46, -0.08), (47, 0.017), (48, -0.047), (49, -0.069)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95916229 230 cvpr-2013-Joint 3D Scene Reconstruction and Class Segmentation

Author: Christian Häne, Christopher Zach, Andrea Cohen, Roland Angst, Marc Pollefeys

2 0.74157453 354 cvpr-2013-Relative Volume Constraints for Single View 3D Reconstruction

Author: Eno Töppe, Claudia Nieuwenhuis, Daniel Cremers

Abstract: We introduce the concept of relative volume constraints in order to account for insufficient information in the reconstruction of 3D objects from a single image. The key idea is to formulate a variational reconstruction approach with shape priors in form of relative depth profiles or volume ratios relating object parts. Such shape priors can easily be derived either from a user sketch or from the object’s shading profile in the image. They can handle textured or shadowed object regions by propagating information. We propose a convex relaxation of the constrained optimization problem which can be solved optimally in a few seconds on graphics hardware. In contrast to existing single view reconstruction algorithms, the proposed algorithm provides substantially more flexibility to recover shape details such as self-occlusions, dents and holes, which are not visible in the object silhouette.

3 0.71387863 219 cvpr-2013-In Defense of 3D-Label Stereo

Author: Carl Olsson, Johannes Ulén, Yuri Boykov

Abstract: It is commonly believed that higher order smoothness should be modeled using higher order interactions. For example, 2nd order derivatives for deformable (active) contours are represented by triple cliques. Similarly, the 2nd order regularization methods in stereo predominantly use MRF models with scalar (1D) disparity labels and triple clique interactions. In this paper we advocate a largely overlooked alternative approach to stereo where 2nd order surface smoothness is represented by pairwise interactions with 3D-labels, e.g. tangent planes. This general paradigm has been criticized due to perceived computational complexity of optimization in higher-dimensional label space. Contrary to popular beliefs, we demonstrate that representing 2nd order surface smoothness with 3D labels leads to simpler optimization problems with (nearly) submodular pairwise interactions. Our theoretical and experimental results demonstrate advantages over state-of-the-art methods for 2nd order smoothness stereo.

4 0.7061621 397 cvpr-2013-Simultaneous Super-Resolution of Depth and Images Using a Single Camera

Author: Hee Seok Lee, Kyoung Mu Lee

5 0.69191504 61 cvpr-2013-Beyond Point Clouds: Scene Understanding by Reasoning Geometry and Physics

Author: Bo Zheng, Yibiao Zhao, Joey C. Yu, Katsushi Ikeuchi, Song-Chun Zhu

Abstract: In this paper, we present an approach for scene understanding by reasoning physical stability of objects from point cloud. We utilize a simple observation that, by human design, objects in static scenes should be stable with respect to gravity. This assumption is applicable to all scene categories and poses useful constraints for the plausible interpretations (parses) in scene understanding. Our method consists of two major steps: 1) geometric reasoning: recovering solid 3D volumetric primitives from defective point cloud; and 2) physical reasoning: grouping the unstable primitives to physically stable objects by optimizing the stability and the scene prior. We propose to use a novel disconnectivity graph (DG) to represent the energy landscape and use a Swendsen-Wang Cut (MCMC) method for optimization. In experiments, we demonstrate that the algorithm achieves substantially better performance for i) object segmentation, ii) 3D volumetric recovery of the scene, and iii) better parsing result for scene understanding in comparison to state-of-the-art methods in both public dataset and our own new dataset.

6 0.66852468 423 cvpr-2013-Template-Based Isometric Deformable 3D Reconstruction with Sampling-Based Focal Length Self-Calibration

7 0.66371107 111 cvpr-2013-Dense Reconstruction Using 3D Object Shape Priors

8 0.66042703 117 cvpr-2013-Detecting Changes in 3D Structure of a Scene from Multi-view Images Captured by a Vehicle-Mounted Camera

9 0.63311046 286 cvpr-2013-Mirror Surface Reconstruction from a Single Image

10 0.61888701 114 cvpr-2013-Depth Acquisition from Density Modulated Binary Patterns

11 0.61074525 289 cvpr-2013-Monocular Template-Based 3D Reconstruction of Extensible Surfaces with Local Linear Elasticity

12 0.60910374 284 cvpr-2013-Mesh Based Semantic Modelling for Indoor and Outdoor Scenes

13 0.60181123 394 cvpr-2013-Shading-Based Shape Refinement of RGB-D Images

14 0.59918827 24 cvpr-2013-A Principled Deep Random Field Model for Image Segmentation

15 0.5980764 227 cvpr-2013-Intrinsic Scene Properties from a Single RGB-D Image

16 0.59233958 303 cvpr-2013-Multi-view Photometric Stereo with Spatially Varying Isotropic Materials

17 0.59160095 435 cvpr-2013-Towards Contactless, Low-Cost and Accurate 3D Fingerprint Identification

18 0.59099305 466 cvpr-2013-Whitened Expectation Propagation: Non-Lambertian Shape from Shading and Shadow

19 0.58955711 467 cvpr-2013-Wide-Baseline Hair Capture Using Strand-Based Refinement

20 0.58090162 110 cvpr-2013-Dense Object Reconstruction with Semantic Priors

similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(10, 0.095), (16, 0.019), (26, 0.032), (33, 0.211), (67, 0.033), (69, 0.041), (87, 0.504)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.88936746 274 cvpr-2013-Lost! Leveraging the Crowd for Probabilistic Visual Self-Localization

Author: Marcus A. Brubaker, Andreas Geiger, Raquel Urtasun

Abstract: In this paper we propose an affordable solution to self-localization, which utilizes visual odometry and road maps as the only inputs. To this end, we present a probabilistic model as well as an efficient approximate inference algorithm, which is able to utilize distributed computation to meet the real-time requirements of autonomous systems. Because of the probabilistic nature of the model we are able to cope with uncertainty due to noisy visual odometry and inherent ambiguities in the map (e.g., in a Manhattan world). By exploiting freely available, community developed maps and visual odometry measurements, we are able to localize a vehicle up to 3m after only a few seconds of driving on maps which contain more than 2,150km of drivable roads.

same-paper 2 0.87987459 230 cvpr-2013-Joint 3D Scene Reconstruction and Class Segmentation

Author: Christian Häne, Christopher Zach, Andrea Cohen, Roland Angst, Marc Pollefeys

3 0.87433976 354 cvpr-2013-Relative Volume Constraints for Single View 3D Reconstruction

Author: Eno Töppe, Claudia Nieuwenhuis, Daniel Cremers

4 0.84388649 209 cvpr-2013-Hypergraphs for Joint Multi-view Reconstruction and Multi-object Tracking

Author: Martin Hofmann, Daniel Wolf, Gerhard Rigoll

Abstract: We generalize the network flow formulation for multi-object tracking to multi-camera setups. In the past, reconstruction of multi-camera data was done as a separate extension. In this work, we present a combined maximum a posteriori (MAP) formulation, which jointly models multi-camera reconstruction as well as global temporal data association. A flow graph is constructed, which tracks objects in 3D world space. The multi-camera reconstruction can be efficiently incorporated as additional constraints on the flow graph without making the graph unnecessarily large. The final graph is efficiently solved using binary linear programming. On the PETS 2009 dataset we achieve results that significantly exceed the current state of the art.

5 0.82926822 125 cvpr-2013-Dictionary Learning from Ambiguously Labeled Data

Author: Yi-Chen Chen, Vishal M. Patel, Jaishanker K. Pillai, Rama Chellappa, P. Jonathon Phillips

Abstract: We propose a novel dictionary-based learning method for ambiguously labeled multiclass classification, where each training sample has multiple labels and only one of them is the correct label. The dictionary learning problem is solved using an iterative alternating algorithm. At each iteration of the algorithm, two alternating steps are performed: a confidence update and a dictionary update. The confidence of each sample is defined as the probability distribution on its ambiguous labels. The dictionaries are updated using either soft (EM-based) or hard decision rules. Extensive evaluations on existing datasets demonstrate that the proposed method performs significantly better than state-of-the-art ambiguously labeled learning approaches.

6 0.81947416 337 cvpr-2013-Principal Observation Ray Calibration for Tiled-Lens-Array Integral Imaging Display

7 0.81495059 39 cvpr-2013-Alternating Decision Forests

8 0.80923611 107 cvpr-2013-Deformable Spatial Pyramid Matching for Fast Dense Correspondences

9 0.77957416 222 cvpr-2013-Incorporating User Interaction and Topological Constraints within Contour Completion via Discrete Calculus

10 0.74836946 396 cvpr-2013-Simultaneous Active Learning of Classifiers & Attributes via Relative Feedback

11 0.73922193 298 cvpr-2013-Multi-scale Curve Detection on Surfaces

12 0.69596279 71 cvpr-2013-Boundary Cues for 3D Object Shape Recovery

13 0.68634301 155 cvpr-2013-Exploiting the Power of Stereo Confidences

14 0.6732868 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities

15 0.66636074 279 cvpr-2013-Manhattan Scene Understanding via XSlit Imaging

16 0.66298288 147 cvpr-2013-Ensemble Learning for Confidence Measures in Stereo Vision

17 0.65912449 467 cvpr-2013-Wide-Baseline Hair Capture Using Strand-Based Refinement

18 0.6570791 373 cvpr-2013-SWIGS: A Swift Guided Sampling Method

19 0.6562717 443 cvpr-2013-Uncalibrated Photometric Stereo for Unknown Isotropic Reflectances

20 0.65498573 289 cvpr-2013-Monocular Template-Based 3D Reconstruction of Extensible Surfaces with Local Linear Elasticity