cvpr cvpr2013 cvpr2013-114 knowledge-graph by maker-knowledge-mining

114 cvpr-2013-Depth Acquisition from Density Modulated Binary Patterns


Source: pdf

Author: Zhe Yang, Zhiwei Xiong, Yueyi Zhang, Jiao Wang, Feng Wu

Abstract: This paper proposes novel density modulated binary patterns for depth acquisition. Similar to Kinect, the illumination patterns do not need a projector for generation and can be emitted by infrared lasers and diffraction gratings. Our key idea is to use the density of light spots in the patterns to carry phase information. Two technical problems are addressed here. First, we propose an algorithm to design the patterns to carry more phase information without compromising the depth reconstruction from a single captured image as with Kinect. Second, since the carried phase is not strictly sinusoidal, the depth reconstructed from the phase contains a systematic error. We further propose a pixel-based phase matching algorithm to reduce the error. Experimental results show that the depth quality can be greatly improved using the phase carried by the density of light spots. Furthermore, our scheme can achieve 20 fps depth reconstruction with GPU assistance.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 This paper proposes novel density modulated binary patterns for depth acquisition. [sent-15, score-0.657]

2 Similar to Kinect, the illumination patterns do not need a projector for generation and can be emitted by infrared lasers and diffraction gratings. [sent-16, score-0.744]

3 Our key idea is to use the density of light spots in the patterns to carry phase information. [sent-17, score-1.32]

4 First, we propose an algorithm to design the patterns to carry more phase information without compromising the depth reconstruction from a single captured image as with Kinect. [sent-19, score-1.194]

5 Second, since the carried phase is not strictly sinusoidal, the depth reconstructed from the phase contains a systematic error. [sent-20, score-1.553]

6 We further propose a pixel-based phase matching algorithm to reduce the error. [sent-21, score-0.619]

7 Experimental results show that the depth quality can be greatly improved using the phase carried by the density of light spots. [sent-22, score-1.188]

8 The position of light spots in every small region is unique. [sent-28, score-0.544]

9 So depth can be reconstructed by establishing the correspondences between the reference and captured images. [sent-29, score-0.467]

10 However, depth reconstructed in this way suffers to some extent from holes and severe noise caused by the pattern. [sent-30, score-0.434]

11 First, the position of light spots has to be identified by a block of pixels. [sent-31, score-0.631]

12 Phase shifting, which projects a series of phase-shifted sinusoidal patterns [23], achieves better quality. [sent-39, score-0.51]

13 Furthermore, the depth is calculated from sinusoidal phase differences. [sent-42, score-1.172]

14 However, as shown in Figure 1(b), the phase shifting patterns are grayscale and hard to generate using infrared lasers and diffraction gratings like Kinect. [sent-44, score-1.278]

15 In this paper, we propose a novel approach to embed phase information into binary patterns that can still be generated with infrared lasers and diffraction gratings. [sent-45, score-1.063]

16 Similar to image dithering, our idea is to use the density of light spots to represent phase, as shown in Figure 1(c). [sent-46, score-0.549]

17 Figure 1(d) shows the energy images averaged in a sliding window from the patterns in Figure 1(c), which have similar properties to the phase shifting patterns in Figure 1(b). [sent-47, score-1.218]

18 The immediate advantage is that the depth quality can be improved with extracted phase information. [sent-48, score-0.853]

19 The goal is to carry more phase information without compromising the depth reconstructed from a single captured image as with Kinect. [sent-51, score-1.054]

20 Second, since the carried phase is not strictly sinusoidal, the depth reconstructed from the phase contains a systematic error. [sent-52, score-1.553]

21 A pixel-based phase matching algorithm is further proposed to reduce the error. [sent-53, score-0.588]

22 Finally, the depth data reconstructed by the position of light spots in one captured image and by the phase carried in three captured images are adaptively integrated. [sent-54, score-1.648]

23 (b) Sinusoidal fringe patterns used in phase shifting. [sent-56, score-0.805]

24 Before this paper, Zhang proposed generating phase-shifting patterns from binary patterns by projector defocusing [31]. [sent-60, score-0.641]

25 In our scheme, every captured image consists of light spots and thus depth can be reconstructed just like Kinect. [sent-62, score-0.956]

26 Meanwhile, experimental results show that the depth quality can be greatly improved using the phase carried by the density of light spots. [sent-63, score-1.188]

27 Section 2 gives an overview of structured light and phase shifting. [sent-65, score-0.792]

28 They require multiple patterns and the scene cannot have motion when the patterns are projected. [sent-75, score-0.396]

29 proposed combining a set of parallel color stripes and a perpendicular set of sinusoidal intensity stripes [7]. [sent-90, score-0.404]

30 For pattern emitting, although some patterns can be emitted by a laser, all the above schemes use a projector. [sent-93, score-0.409]

31 Kinect features an infrared laser that can generate and emit a constant pattern with light spots [8, 14], which makes a depth camera available as a consumer-grade device. [sent-94, score-0.993]

32 By designing new patterns with the phase information embedded, our scheme greatly improves the depth quality. [sent-96, score-1.099]

33 Phase Shifting: Phase shifting is a special SL scheme that emits a series of phase-shifted sinusoidal patterns. [sent-99, score-1.144]

34 Increasing the number of stripes in the patterns can improve the measurement accuracy, but high frequency results in ambiguities that require phase unwrapping. [sent-100, score-0.809]

35 proposed embedding a period cue into the phase shifting patterns without reducing the signal-to-noise ratio [26]. [sent-104, score-1.04]

36 As a result, each period of the sinusoidal patterns can be identified. [sent-105, score-0.617]

37 Global illumination and defocusing are practical factors that often introduce errors into depth measurement. [sent-108, score-0.408]

38 showed that high-frequency sinusoidal patterns can be used to separate global illumination [18]. [sent-110, score-0.541]

39 constrained the frequencies of sinusoidal patterns to a narrow high-frequency band, which greatly reduces global illumination and defocusing [10]. [sent-116, score-0.685]

40 For phase shifting, when a set of patterns is emitted, the scene is assumed to be static. [sent-119, score-0.745]

41 proposed estimating lines from the projected sinusoidal patterns and calculating motion as line translation [16]. [sent-122, score-0.557]

42 compensated for the motion by introducing the first-order Taylor term in phase shifting [27]. [sent-127, score-0.771]

43 used hybrid patterns of random speckle and sinusoidal fringe [33]. [sent-131, score-0.57]

44 Our key contribution in this paper is designing the phase shifting patterns that can be emitted by an infrared laser, where the phase information is approximated by the density of light spots in a local region. [sent-132, score-2.243]

45 In our scheme, the phase ambiguity problem is solved naturally by the random position of light spots. [sent-133, score-0.781]

46 When the scene contains moving objects, depth in the corresponding regions can still be reconstructed from a single captured image as with Kinect. [sent-134, score-0.497]

47 Density Modulated Binary Patterns As shown in Figure 1(c), the three proposed patterns are binary and can thus be generated using infrared lasers and diffraction gratings. [sent-136, score-0.504]

48 In this section, we discuss how to modulate the density of light spots to represent phase. [sent-137, score-0.549]

49 In Kinect, the light spots are randomly and uniformly distributed in P(r, c). [sent-147, score-0.484]

50 The position of light spots is random in a basic unit but the same for all basic units in the same row. [sent-154, score-0.512]

51 At the same time, since the number of light spots and their positions are different in different rows, the position of light spots in every block is still unique. [sent-156, score-1.147]

52 In the proposed algorithm, there are two parameters that must be determined, namely, the scaling factor α and the period T. Algorithm 1 Pattern Generation. Require: the number of rows in one period T, the scaling factor α, and the initial phase θ. 1: for r = 1, …, M do [sent-157, score-0.699]

53 … 5: Randomly select k(r) positions from 1 to N. 6: Let pixels at selected positions be light spots. 7: end for. 8: end for. Figure 2. [sent-164, score-0.484]
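
A minimal Python sketch of this generation procedure, under stated assumptions: the extracted pseudocode omits steps 2-4, so the spot-count formula k(r) = round(α·N·(1 + sin(2πr/T + θ))/2) is an assumed reading of Algorithm 1, and the basic-unit width N and all parameter values below are illustrative, not taken from the paper.

```python
import numpy as np

def generate_pattern(M, N, units, T, alpha, theta, seed=None):
    """Density modulated binary pattern.

    One basic unit is N pixels wide; spot positions are random inside a
    basic unit but identical for every basic unit in the same row (sentence
    50), and the spot count k(r) per unit follows a sinusoid with a period
    of T rows. The k(r) formula below is an assumed reading of Algorithm 1:
        k(r) = round(alpha * N * (1 + sin(2*pi*r/T + theta)) / 2)
    """
    rng = np.random.default_rng(seed)
    unit = np.zeros((M, N), dtype=np.uint8)
    for r in range(M):
        k = int(round(alpha * N * (1.0 + np.sin(2.0 * np.pi * r / T + theta)) / 2.0))
        cols = rng.choice(N, size=k, replace=False)  # random spot positions in the unit
        unit[r, cols] = 1
    return np.tile(unit, (1, units))  # replicate the basic unit along the row

# Three patterns with 2*pi/3 offsets, as in three-step phase shifting.
patterns = [generate_pattern(480, 32, 20, T=48, alpha=0.5, theta=i * 2.0 * np.pi / 3.0)
            for i in range(3)]
```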

54 (a) The pattern with light spots in every row. [sent-169, score-0.595]

55 (c) The pattern with light spots in every other row. [sent-171, score-0.595]

56 In (c), every other row contains the light spots while the remaining rows are black. [sent-190, score-0.573]

57 From the captured images in (b) and (d), it is difficult to keep the position of light spots clear if using the pattern in (a), since the captured image may contain too many light spots. [sent-191, score-0.955]

58 As mentioned above, the smaller the period T is, the more accurate the depth measurement will be. [sent-193, score-0.388]

59 Implicit Phase: Next we evaluate whether the densities of the generated patterns can approximate the sinusoidal fringe well. [sent-209, score-0.597]

60 Thus the energy E in a sliding window of the proposed patterns is mathematically a sinusoidal function. [sent-217, score-0.609]
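
As a quick check of that claim, one can box-filter a pattern and inspect a column of the resulting energy image; a small sketch, where the window size 15 is an illustrative assumption:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def energy_image(img, win=15):
    """Mean intensity in a win-by-win sliding window; on the proposed
    patterns the per-row profile of E approximates a sinusoid of period T."""
    return uniform_filter(img.astype(np.float64), size=win, mode='nearest')

# e.g. inspect one column of the energy image for the sinusoidal row profile:
# profile = energy_image(patterns[0])[:, 100]
```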

61 Second, it can be reconstructed from the phase. [sent-228, score-0.657]

62 Since the position of light spots in every small region is still unique, we can use a block matching method to get the disparity of every pixel between the reference image Ī(r, c) and the captured image I(r, c). [sent-242, score-0.907]
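
The paper's term list includes NCC, so a zero-normalized cross correlation score is a natural choice here; below is a minimal per-pixel sketch, assuming the disparity is searched along the row direction (consistent with the φ̄(r − Δr, c) search described later). Block size and search range are illustrative.

```python
import numpy as np

def block_match(ref, cap, r, c, b=9, max_disp=64):
    """Disparity at (r, c) via zero-normalized cross correlation (NCC),
    searching the reference image along the row direction. Block size b
    and search range max_disp are illustrative; the caller must keep the
    b-by-b block inside both images."""
    h = b // 2
    patch = cap[r - h:r + h + 1, c - h:c + h + 1].astype(np.float64)
    patch -= patch.mean()
    best, best_d = -np.inf, 0
    for d in range(min(max_disp, r - h + 1)):  # keep candidate block in bounds
        cand = ref[r - d - h:r - d + h + 1, c - h:c + h + 1].astype(np.float64)
        cand -= cand.mean()
        denom = np.sqrt((patch ** 2).sum() * (cand ** 2).sum()) + 1e-9
        score = (patch * cand).sum() / denom
        if score > best:
            best, best_d = score, d
    return best_d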

63 (d) Depth reconstructed from three energy images by pixel-based phase matching. [sent-277, score-0.711]

64 Similar to original phase shifting [23], the three energy images can be represented as Ei(r, c) = E′(r, c) + E″(r, c)·cos(φ(r, c) + 2π(i − 2)/3) for i = 1, 2, 3, [sent-280, score-0.824]

65 where E′(r, c) is the average energy, E″(r, c) is the sinusoidal amplitude, and φ(r, c) is the phase to be solved. [sent-293, score-0.883]
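
Under that representation the wrapped phase has the standard three-step closed form; a sketch:

```python
import numpy as np

def wrapped_phase(E1, E2, E3):
    """Standard three-step solve for 2*pi/3 shifts:
    phi = atan2(sqrt(3)*(E1 - E3), 2*E2 - E1 - E3), wrapped to (-pi, pi]."""
    return np.arctan2(np.sqrt(3.0) * (E1 - E3), 2.0 * E2 - E1 - E3)
```

The combination E1 − E3 isolates sin φ (up to the factor √3·E″) and 2E2 − E1 − E3 isolates cos φ (up to 3·E″), so the amplitude cancels in the arctangent.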

66 Since the proposed patterns contain a stair-wise error in approximating sinusoidal fringes, as shown in Figure 3, we cannot calculate the depth directly from φ(r, c). [sent-296, score-0.827]

67 For every phase φ(r, c), we then search for the best-matched φ̄(r − Δr, c) within one period. [sent-299, score-0.591]

68 Here we assume the phase varies linearly at the sub-pixel level. [sent-311, score-0.559]
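
A sketch of this per-pixel search, assuming the reference phase φ̄ has been precomputed from reference energy images and that phase differences are compared on the unit circle; the sub-pixel step implements the stated linear assumption (function and argument names are hypothetical):

```python
import numpy as np

def phase_match(phi, phi_ref, r, c, T):
    """Search phi_ref(r - d, c) over one period for the best match to
    phi(r, c), then refine d to sub-pixel precision assuming the reference
    phase varies linearly between neighboring rows (requires r >= T)."""
    def circ_diff(a, b):  # signed phase difference on the unit circle
        return np.angle(np.exp(1j * (a - b)))
    errs = [abs(circ_diff(phi[r, c], phi_ref[r - d, c])) for d in range(T)]
    d = int(np.argmin(errs))
    # linear refinement between reference rows r-d and r-d-1
    residual = circ_diff(phi[r, c], phi_ref[r - d, c])
    slope = circ_diff(phi_ref[r - d - 1, c], phi_ref[r - d, c])
    frac = residual / slope if slope != 0.0 else 0.0
    return d + float(np.clip(frac, -1.0, 1.0))
```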

69 The phase ambiguity problem still needs to be solved. [sent-313, score-0.559]

70 It is easy to solve for the proposed patterns because the position of light spots is unique in every period. [sent-315, score-0.73]
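
One plausible way to realize this, assuming the coarse single-image block-matching disparity is available: choose the integer number of periods that brings the fine, period-ambiguous phase disparity closest to the coarse estimate. This is a sketch of the idea, not the paper's exact rule.

```python
def unwrap_disparity(d_phase, d_bm, T):
    """d_phase in [0, T) is the fine but period-ambiguous disparity from
    phase matching; d_bm is the coarse block-matching disparity from a
    single image. Pick the period index k putting k*T + d_phase nearest
    to d_bm (assumed disambiguation rule)."""
    k = round((d_bm - d_phase) / T)
    return k * T + d_phase
```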

71 The blue curve “a” is the reconstruction from original phase shifting. [sent-324, score-0.671]

72 The light-blue curve “d” is the reconstruction by the proposed pixel-based phase matching algorithm. [sent-330, score-0.677]

73 Depth Integration: Although using the embedded phase to reconstruct depth produces better quality, this method requires three captured images and is not well suited to handling moving objects in the scene. [sent-335, score-0.984]

74 However, the depth reconstructed by block matching only needs one captured image, which is more suitable for moving objects. [sent-336, score-0.645]
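
The extracted sentences do not spell out the integration rule; a minimal per-pixel sketch of one plausible policy, assuming a motion mask is available (e.g., from frame differencing), consistent with the later remark that reconstruction degrades to block matching when objects move. All names here are hypothetical.

```python
import numpy as np

def integrate_depth(depth_phase, depth_bm, moving_mask):
    """Per-pixel integration: phase-based depth where the scene is static,
    block-matching depth where motion is detected. The motion mask and the
    hard selection rule are assumptions; the paper only states that the two
    reconstructions are adaptively integrated."""
    return np.where(moving_mask, depth_bm, depth_phase)
```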

75 The phase calculation and pixel-based phase matching are carried out on an Intel Xeon E5440 CPU. [sent-343, score-1.23]

76 The second is original phase shifting, which uses three grayscale sinusoidal patterns emitted by a DLP projector and reconstructs depth directly from phase. [sent-350, score-1.593]

77 The third is the proposed scheme, and the binary patterns are also emitted by the DLP projector for simulation. [sent-351, score-0.438]

78 Figure 6(b) is the depth reconstructed from original phase shifting. [sent-357, score-0.937]

79 Figure 6(c) is the depth reconstructed by the proposed scheme but using block matching only. [sent-359, score-0.576]

80 Figure 6(d) is the depth reconstructed by the proposed scheme using the embedded phase and pixel-based phase matching. [sent-362, score-1.474]

81 It is much better than that from Kinect and also close to that from original phase shifting. [sent-363, score-0.582]

82 (d) The proposed scheme using pixel-based phase matching. [sent-369, score-0.632]

83 that carrying phase information in the binary patterns is a feasible way to improve the depth quality over Kinect. [sent-374, score-1.085]

84 We further compare the point cloud from the proposed scheme and original phase shifting, as shown in Figure 7. [sent-375, score-0.655]

85 Continuously improving the quality deserves future effort, yet there should be an inherent compromise, as binary patterns cannot represent phase as perfectly as grayscale patterns. [sent-379, score-0.866]

86 Although our results look close to the results from original phase shifting using grayscale patterns, there are some noticeable differences. [sent-397, score-0.808]

87 Since these scenes are static, they favor original phase shifting. [sent-398, score-0.582]

88 (b) is the result obtained using pixel-based phase matching. [sent-406, score-0.559]

89 The depth of the moving object is really poor because there is motion when the three patterns are emitted. [sent-407, score-0.524]

90 Original phase shifting will suffer from the same problem. [sent-408, score-0.747]

91 In our scheme, when the objects in the scene start to move, the depth reconstruction will degrade to block matching. [sent-417, score-0.428]

92 For example, can we use the depth from block matching to drive the previously obtained high-quality depth? [sent-419, score-0.442]

93 Or can we use the depth from block matching to compensate for the motion? [sent-420, score-0.405]

94 Conclusions: The proposed density modulated binary patterns carve out a feasible way to improve the depth quality over Kinect. [sent-423, score-0.694]

95 By embedding phase into the binary patterns, our scheme provides more information for depth acquisition. [sent-424, score-0.862]

96 We propose the pattern generation algorithm and the pixel-based phase matching algorithm for reconstruction. [sent-425, score-0.716]

97 Experimental results show that our scheme consistently outperforms Kinect for static scenes and original phase shifting for moving objects. [sent-426, score-0.933]

98 Accuracy and resolution of kinect depth data for indoor mapping applications. [sent-528, score-0.429]

99 Period coded phase shifting strategy for real-time 3-d structured light illumination. [sent-610, score-1.037]

100 Fast and accurate 3d scanning using coded phase shifting and high speed pattern projection. [sent-622, score-0.911]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('phase', 0.559), ('sinusoidal', 0.324), ('spots', 0.29), ('depth', 0.257), ('light', 0.194), ('shifting', 0.188), ('patterns', 0.186), ('kinect', 0.146), ('defocusing', 0.12), ('block', 0.119), ('period', 0.107), ('modulated', 0.103), ('emitted', 0.103), ('projector', 0.103), ('infrared', 0.099), ('reconstructed', 0.098), ('diffraction', 0.093), ('captured', 0.085), ('koninckx', 0.08), ('lasers', 0.08), ('pattern', 0.079), ('scheme', 0.073), ('disparity', 0.071), ('density', 0.065), ('ncc', 0.062), ('fringe', 0.06), ('fringes', 0.06), ('moving', 0.057), ('coded', 0.057), ('energy', 0.054), ('carried', 0.052), ('reconstruction', 0.052), ('sl', 0.051), ('generation', 0.049), ('weise', 0.049), ('binary', 0.046), ('sliding', 0.045), ('laser', 0.045), ('optics', 0.043), ('schemes', 0.041), ('emitting', 0.041), ('acqusition', 0.04), ('albitar', 0.04), ('decodable', 0.04), ('dlp', 0.04), ('fong', 0.04), ('graebling', 0.04), ('grating', 0.04), ('maurice', 0.04), ('photoresist', 0.04), ('wissmann', 0.04), ('stripes', 0.04), ('structured', 0.039), ('grayscale', 0.038), ('sin', 0.037), ('curve', 0.037), ('quality', 0.037), ('xbox', 0.035), ('gratings', 0.035), ('kawasaki', 0.035), ('northeastern', 0.035), ('static', 0.033), ('looks', 0.033), ('rows', 0.033), ('multiplexed', 0.033), ('shpunt', 0.033), ('workshops', 0.032), ('every', 0.032), ('calculated', 0.032), ('salvi', 0.031), ('pixelbased', 0.031), ('illumination', 0.031), ('emit', 0.029), ('compromising', 0.029), ('matching', 0.029), ('taguchi', 0.028), ('patent', 0.028), ('position', 0.028), ('systematic', 0.028), ('scanning', 0.028), ('cos', 0.028), ('reference', 0.027), ('decided', 0.027), ('integer', 0.027), ('round', 0.027), ('densities', 0.027), ('microsoft', 0.026), ('indoor', 0.026), ('carry', 0.026), ('rounding', 0.026), ('opaque', 0.026), ('zhe', 0.026), ('embedded', 0.026), ('chen', 0.025), ('measurement', 0.024), ('greatly', 0.024), ('motion', 0.024), ('row', 0.024), ('calculating', 0.023), ('original', 0.023)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000002 114 cvpr-2013-Depth Acquisition from Density Modulated Binary Patterns

Author: Zhe Yang, Zhiwei Xiong, Yueyi Zhang, Jiao Wang, Feng Wu

Abstract: This paper proposes novel density modulated binary patterns for depth acquisition. Similar to Kinect, the illumination patterns do not need a projector for generation and can be emitted by infrared lasers and diffraction gratings. Our key idea is to use the density of light spots in the patterns to carry phase information. Two technical problems are addressed here. First, we propose an algorithm to design the patterns to carry more phase information without compromising the depth reconstruction from a single captured image as with Kinect. Second, since the carried phase is not strictly sinusoidal, the depth reconstructed from the phase contains a systematic error. We further propose a pixel-based phase matching algorithm to reduce the error. Experimental results show that the depth quality can be greatly improved using the phase carried by the density of light spots. Furthermore, our scheme can achieve 20 fps depth reconstruction with GPU assistance.

2 0.26688722 245 cvpr-2013-Layer Depth Denoising and Completion for Structured-Light RGB-D Cameras

Author: Ju Shen, Sen-Ching S. Cheung

Abstract: The recent popularity of structured-light depth sensors has enabled many new applications from gesture-based user interface to 3D reconstructions. The quality of the depth measurements of these systems, however, is far from perfect. Some depth values can have significant errors, while others can be missing altogether. The uncertainty in depth measurements among these sensors can significantly degrade the performance of any subsequent vision processing. In this paper, we propose a novel probabilistic model to capture various types of uncertainties in the depth measurement process among structured-light systems. The key to our model is the use of depth layers to account for the differences between foreground objects and background scene, the missing depth value phenomenon, and the correlation between color and depth channels. The depth layer labeling is solved as a maximum a-posteriori estimation problem, and a Markov Random Field attuned to the uncertainty in measurements is used to spatially smooth the labeling process. Using the depth-layer labels, we propose a depth correction and completion algorithm that outperforms oth- er techniques in the literature.

3 0.15932348 227 cvpr-2013-Intrinsic Scene Properties from a Single RGB-D Image

Author: Jonathan T. Barron, Jitendra Malik

Abstract: In this paper we extend the “shape, illumination and reflectance from shading ” (SIRFS) model [3, 4], which recovers intrinsic scene properties from a single image. Though SIRFS performs well on images of segmented objects, it performs poorly on images of natural scenes, which contain occlusion and spatially-varying illumination. We therefore present Scene-SIRFS, a generalization of SIRFS in which we have a mixture of shapes and a mixture of illuminations, and those mixture components are embedded in a “soft” segmentation of the input image. We additionally use the noisy depth maps provided by RGB-D sensors (in this case, the Kinect) to improve shape estimation. Our model takes as input a single RGB-D image and produces as output an improved depth map, a set of surface normals, a reflectance image, a shading image, and a spatially varying model of illumination. The output of our model can be used for graphics applications, or for any application involving RGB-D images.

4 0.15824772 432 cvpr-2013-Three-Dimensional Bilateral Symmetry Plane Estimation in the Phase Domain

Author: Ramakrishna Kakarala, Prabhu Kaliamoorthi, Vittal Premachandran

Abstract: We show that bilateral symmetry plane estimation for three-dimensional (3-D) shapes may be carried out accurately, and efficiently, in the spherical harmonic domain. Our methods are valuable for applications where spherical harmonic expansion is already employed, such as 3-D shape registration, morphometry, and retrieval. We show that the presence of bilateral symmetry in the 3-D shape is equivalent to a linear phase structure in the corresponding spherical harmonic coefficients, and provide algorithms for estimating the orientation of the symmetry plane. The benefit of using spherical harmonic phase is that symmetry estimation reduces to matching a compact set of descriptors, without the need to solve a correspondence problem. Our methods work on point clouds as well as large-scale mesh models of 3-D shapes.

5 0.14640924 394 cvpr-2013-Shading-Based Shape Refinement of RGB-D Images

Author: Lap-Fai Yu, Sai-Kit Yeung, Yu-Wing Tai, Stephen Lin

Abstract: We present a shading-based shape refinement algorithm which uses a noisy, incomplete depth map from Kinect to help resolve ambiguities in shape-from-shading. In our framework, the partial depth information is used to overcome bas-relief ambiguity in normals estimation, as well as to assist in recovering relative albedos, which are needed to reliably estimate the lighting environment and to separate shading from albedo. This refinement of surface normals using a noisy depth map leads to high-quality 3D surfaces. The effectiveness of our algorithm is demonstrated through several challenging real-world examples.

6 0.13873656 397 cvpr-2013-Simultaneous Super-Resolution of Depth and Images Using a Single Camera

7 0.13628416 117 cvpr-2013-Detecting Changes in 3D Structure of a Scene from Multi-view Images Captured by a Vehicle-Mounted Camera

8 0.13355289 431 cvpr-2013-The Variational Structure of Disparity and Regularization of 4D Light Fields

9 0.13062534 181 cvpr-2013-Fusing Depth from Defocus and Stereo with Coded Apertures

10 0.12506835 27 cvpr-2013-A Theory of Refractive Photo-Light-Path Triangulation

11 0.11657838 115 cvpr-2013-Depth Super Resolution by Rigid Body Self-Similarity in 3D

12 0.11314696 188 cvpr-2013-Globally Consistent Multi-label Assignment on the Ray Space of 4D Light Fields

13 0.1102641 108 cvpr-2013-Dense 3D Reconstruction from Severely Blurred Images Using a Single Moving Camera

14 0.10711596 232 cvpr-2013-Joint Geodesic Upsampling of Depth Images

15 0.10591415 357 cvpr-2013-Revisiting Depth Layers from Occlusions

16 0.10555644 196 cvpr-2013-HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences

17 0.10026743 111 cvpr-2013-Dense Reconstruction Using 3D Object Shape Priors

18 0.098738231 205 cvpr-2013-Hollywood 3D: Recognizing Actions in 3D Natural Scenes

19 0.095700368 303 cvpr-2013-Multi-view Photometric Stereo with Spatially Varying Isotropic Materials

20 0.094662867 407 cvpr-2013-Spatio-temporal Depth Cuboid Similarity Feature for Activity Recognition Using Depth Camera


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.155), (1, 0.198), (2, 0.022), (3, 0.059), (4, -0.043), (5, -0.066), (6, -0.068), (7, 0.097), (8, 0.028), (9, 0.004), (10, -0.048), (11, -0.04), (12, 0.072), (13, 0.09), (14, -0.038), (15, 0.018), (16, -0.149), (17, 0.026), (18, -0.017), (19, -0.017), (20, -0.009), (21, 0.023), (22, -0.017), (23, -0.004), (24, 0.056), (25, 0.045), (26, -0.002), (27, 0.012), (28, -0.031), (29, -0.019), (30, -0.001), (31, 0.049), (32, 0.085), (33, -0.082), (34, 0.007), (35, -0.059), (36, -0.0), (37, 0.004), (38, -0.037), (39, -0.018), (40, -0.06), (41, 0.004), (42, 0.003), (43, 0.057), (44, -0.085), (45, -0.03), (46, -0.092), (47, 0.005), (48, -0.026), (49, 0.032)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97558784 114 cvpr-2013-Depth Acquisition from Density Modulated Binary Patterns

Author: Zhe Yang, Zhiwei Xiong, Yueyi Zhang, Jiao Wang, Feng Wu

Abstract: This paper proposes novel density modulated binary patterns for depth acquisition. Similar to Kinect, the illumination patterns do not need a projector for generation and can be emitted by infrared lasers and diffraction gratings. Our key idea is to use the density of light spots in the patterns to carry phase information. Two technical problems are addressed here. First, we propose an algorithm to design the patterns to carry more phase information without compromising the depth reconstruction from a single captured image as with Kinect. Second, since the carried phase is not strictly sinusoidal, the depth reconstructed from the phase contains a systematic error. We further propose a pixelbased phase matching algorithm to reduce the error. Experimental results show that the depth quality can be greatly improved using the phase carried by the density of light spots. Furthermore, our scheme can achieve 20 fps depth reconstruction with GPU assistance.

2 0.86366355 245 cvpr-2013-Layer Depth Denoising and Completion for Structured-Light RGB-D Cameras

Author: Ju Shen, Sen-Ching S. Cheung

Abstract: The recent popularity of structured-light depth sensors has enabled many new applications from gesture-based user interface to 3D reconstructions. The quality of the depth measurements of these systems, however, is far from perfect. Some depth values can have significant errors, while others can be missing altogether. The uncertainty in depth measurements among these sensors can significantly degrade the performance of any subsequent vision processing. In this paper, we propose a novel probabilistic model to capture various types of uncertainties in the depth measurement process among structured-light systems. The key to our model is the use of depth layers to account for the differences between foreground objects and background scene, the missing depth value phenomenon, and the correlation between color and depth channels. The depth layer labeling is solved as a maximum a-posteriori estimation problem, and a Markov Random Field attuned to the uncertainty in measurements is used to spatially smooth the labeling process. Using the depth-layer labels, we propose a depth correction and completion algorithm that outperforms oth- er techniques in the literature.

3 0.8120814 232 cvpr-2013-Joint Geodesic Upsampling of Depth Images

Author: Ming-Yu Liu, Oncel Tuzel, Yuichi Taguchi

Abstract: We propose an algorithm utilizing geodesic distances to upsample a low resolution depth image using a registered high resolution color image. Specifically, it computes depth for each pixel in the high resolution image using geodesic paths to the pixels whose depths are known from the low resolution one. Though this is closely related to the all-pairshortest-path problem which has O(n2 log n) complexity, we develop a novel approximation algorithm whose complexity grows linearly with the image size and achieve realtime performance. We compare our algorithm with the state of the art on the benchmark dataset and show that our approach provides more accurate depth upsampling with fewer artifacts. In addition, we show that the proposed algorithm is well suited for upsampling depth images using binary edge maps, an important sensor fusion application.

4 0.7750777 117 cvpr-2013-Detecting Changes in 3D Structure of a Scene from Multi-view Images Captured by a Vehicle-Mounted Camera

Author: Ken Sakurada, Takayuki Okatani, Koichiro Deguchi

Abstract: This paper proposes a method for detecting temporal changes of the three-dimensional structure of an outdoor scene from its multi-view images captured at two separate times. For the images, we consider those captured by a camera mounted on a vehicle running in a city street. The method estimates scene structures probabilistically, not deterministically, and based on their estimates, it evaluates the probability of structural changes in the scene, where the inputs are the similarity of the local image patches among the multi-view images. The aim of the probabilistic treatment is to maximize the accuracy of change detection, behind which there is our conjecture that although it is difficult to estimate the scene structures deterministically, it should be easier to detect their changes. The proposed method is compared with the methods that use multi-view stereo (MVS) to reconstruct the scene structures of the two time points and then differentiate them to detect changes. The experimental results show that the proposed method outperforms such MVS-based methods.

5 0.7724511 397 cvpr-2013-Simultaneous Super-Resolution of Depth and Images Using a Single Camera

Author: Hee Seok Lee, Kuoung Mu Lee

Abstract: In this paper, we propose a convex optimization framework for simultaneous estimation of super-resolved depth map and images from a single moving camera. The pixel measurement error in 3D reconstruction is directly related to the resolution of the images at hand. In turn, even a small measurement error can cause significant errors in reconstructing 3D scene structure or camera pose. Therefore, enhancing image resolution can be an effective solution for securing the accuracy as well as the resolution of 3D reconstruction. In the proposed method, depth map estimation and image super-resolution are formulated in a single energy minimization framework with a convex function and solved efficiently by a first-order primal-dual algorithm. Explicit inter-frame pixel correspondences are not required for our super-resolution procedure, thus we can avoid a huge computation time and obtain improved depth map in the accuracy and resolution as well as highresolution images with reasonable time. The superiority of our algorithm is demonstrated by presenting the improved depth map accuracy, image super-resolution results, and camera pose estimation.

6 0.75485152 115 cvpr-2013-Depth Super Resolution by Rigid Body Self-Similarity in 3D

7 0.73029703 227 cvpr-2013-Intrinsic Scene Properties from a Single RGB-D Image

8 0.72039944 181 cvpr-2013-Fusing Depth from Defocus and Stereo with Coded Apertures

9 0.72007817 394 cvpr-2013-Shading-Based Shape Refinement of RGB-D Images

10 0.71731317 407 cvpr-2013-Spatio-temporal Depth Cuboid Similarity Feature for Activity Recognition Using Depth Camera

11 0.71367878 219 cvpr-2013-In Defense of 3D-Label Stereo

12 0.66160321 56 cvpr-2013-Bayesian Depth-from-Defocus with Shading Constraints

13 0.65380722 27 cvpr-2013-A Theory of Refractive Photo-Light-Path Triangulation

14 0.64271778 196 cvpr-2013-HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences

15 0.63433814 354 cvpr-2013-Relative Volume Constraints for Single View 3D Reconstruction

16 0.61860925 428 cvpr-2013-The Episolar Constraint: Monocular Shape from Shadow Correspondence

17 0.60628706 230 cvpr-2013-Joint 3D Scene Reconstruction and Class Segmentation

18 0.59469604 357 cvpr-2013-Revisiting Depth Layers from Occlusions

19 0.55828768 205 cvpr-2013-Hollywood 3D: Recognizing Actions in 3D Natural Scenes

20 0.54592085 111 cvpr-2013-Dense Reconstruction Using 3D Object Shape Priors


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(10, 0.092), (16, 0.06), (26, 0.028), (33, 0.197), (67, 0.029), (69, 0.448), (87, 0.044)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.84081727 1 cvpr-2013-3D-Based Reasoning with Blocks, Support, and Stability

Author: Zhaoyin Jia, Andrew Gallagher, Ashutosh Saxena, Tsuhan Chen

Abstract: 3D volumetric reasoning is important for truly understanding a scene. Humans are able to both segment each object in an image, and perceive a rich 3D interpretation of the scene, e.g., the space an object occupies, which objects support other objects, and which objects would, if moved, cause other objects to fall. We propose a new approach for parsing RGB-D images using 3D block units for volumetric reasoning. The algorithm fits image segments with 3D blocks, and iteratively evaluates the scene based on block interaction properties. We produce a 3D representation of the scene based on jointly optimizing over segmentations, block fitting, supporting relations, and object stability. Our algorithm incorporates the intuition that a good 3D representation of the scene is the one that fits the data well, and is a stable, self-supporting (i.e., one that does not topple) arrangement of objects. We experiment on several datasets including controlled and real indoor scenarios. Results show that our stability-reasoning framework improves RGB-D segmentation and scene volumetric representation.

2 0.81041628 172 cvpr-2013-Finding Group Interactions in Social Clutter

Author: Ruonan Li, Parker Porfilio, Todd Zickler

Abstract: We consider the problem of finding distinctive social interactions involving groups of agents embedded in larger social gatherings. Given a pre-defined gallery of short exemplar interaction videos, and a long input video of a large gathering (with approximately-tracked agents), we identify within the gathering small sub-groups of agents exhibiting social interactions that resemble those in the exemplars. The participants of each detected group interaction are localized in space; the extent of their interaction is localized in time; and when the gallery ofexemplars is annotated with group-interaction categories, each detected interaction is classified into one of the pre-defined categories. Our approach represents group behaviors by dichotomous collections of descriptors for (a) individual actions, and (b) pairwise interactions; and it includes efficient algorithms for optimally distinguishing participants from by-standers in every temporal unit and for temporally localizing the extent of the group interaction. Most importantly, the method is generic and can be applied whenever numerous interacting agents can be approximately tracked over time. We evaluate the approach using three different video collections, two that involve humans and one that involves mice.

same-paper 3 0.80868912 114 cvpr-2013-Depth Acquisition from Density Modulated Binary Patterns

Author: Zhe Yang, Zhiwei Xiong, Yueyi Zhang, Jiao Wang, Feng Wu

Abstract: This paper proposes novel density modulated binary patterns for depth acquisition. Similar to Kinect, the illumination patterns do not need a projector for generation and can be emitted by infrared lasers and diffraction gratings. Our key idea is to use the density of light spots in the patterns to carry phase information. Two technical problems are addressed here. First, we propose an algorithm to design the patterns to carry more phase information without compromising the depth reconstruction from a single captured image as with Kinect. Second, since the carried phase is not strictly sinusoidal, the depth reconstructed from the phase contains a systematic error. We further propose a pixelbased phase matching algorithm to reduce the error. Experimental results show that the depth quality can be greatly improved using the phase carried by the density of light spots. Furthermore, our scheme can achieve 20 fps depth reconstruction with GPU assistance.

4 0.79927772 135 cvpr-2013-Discriminative Subspace Clustering

Author: Vasileios Zografos, Liam Ellis, Rudolf Mester

Abstract: We present a novel method for clustering data drawn from a union of arbitrary dimensional subspaces, called Discriminative Subspace Clustering (DiSC). DiSC solves the subspace clustering problem by using a quadratic classifier trained from unlabeled data (clustering by classification). We generate labels by exploiting the locality of points from the same subspace and a basic affinity criterion. A number of classifiers are then diversely trained from different partitions of the data, and their results are combined together in an ensemble, in order to obtain the final clustering result. We have tested our method with 4 challenging datasets and compared against 8 state-of-the-art methods from literature. Our results show that DiSC is a very strong performer in both accuracy and robustness, and also of low computational complexity.

5 0.7812953 86 cvpr-2013-Composite Statistical Inference for Semantic Segmentation

Author: Fuxin Li, Joao Carreira, Guy Lebanon, Cristian Sminchisescu

Abstract: In this paper we present an inference procedure for the semantic segmentation of images. Differentfrom many CRF approaches that rely on dependencies modeled with unary and pairwise pixel or superpixel potentials, our method is entirely based on estimates of the overlap between each of a set of mid-level object segmentation proposals and the objects present in the image. We define continuous latent variables on superpixels obtained by multiple intersections of segments, then output the optimal segments from the inferred superpixel statistics. The algorithm is capable of recombine and refine initial mid-level proposals, as well as handle multiple interacting objects, even from the same class, all in a consistent joint inference framework by maximizing the composite likelihood of the underlying statistical model using an EM algorithm. In the PASCAL VOC segmentation challenge, the proposed approach obtains high accuracy and successfully handles images of complex object interactions.

6 0.76463896 231 cvpr-2013-Joint Detection, Tracking and Mapping by Semantic Bundle Adjustment

7 0.76013184 392 cvpr-2013-Separable Dictionary Learning

8 0.7156319 371 cvpr-2013-SCaLE: Supervised and Cascaded Laplacian Eigenmaps for Visual Object Recognition Based on Nearest Neighbors

9 0.67060488 292 cvpr-2013-Multi-agent Event Detection: Localization and Role Assignment

10 0.63125926 61 cvpr-2013-Beyond Point Clouds: Scene Understanding by Reasoning Geometry and Physics

11 0.60332507 70 cvpr-2013-Bottom-Up Segmentation for Top-Down Detection

12 0.60227954 288 cvpr-2013-Modeling Mutual Visibility Relationship in Pedestrian Detection

13 0.59765995 445 cvpr-2013-Understanding Bayesian Rooms Using Composite 3D Object Models

14 0.59152341 282 cvpr-2013-Measuring Crowd Collectiveness

15 0.58775705 372 cvpr-2013-SLAM++: Simultaneous Localisation and Mapping at the Level of Objects

16 0.58709246 381 cvpr-2013-Scene Parsing by Integrating Function, Geometry and Appearance Models

17 0.58696407 402 cvpr-2013-Social Role Discovery in Human Events

18 0.58274651 132 cvpr-2013-Discriminative Re-ranking of Diverse Segmentations

19 0.57760388 364 cvpr-2013-Robust Object Co-detection

20 0.57476979 116 cvpr-2013-Designing Category-Level Attributes for Discriminative Visual Recognition