iccv iccv2013 iccv2013-282 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Abdelaziz Djelouah, Jean-Sébastien Franco, Edmond Boyer, François Le_Clerc, Patrick Pérez
Abstract: In this paper, we address the problem of object segmentation in multiple views or videos when two or more viewpoints of the same scene are available. We propose a new approach that propagates segmentation coherence information in both space and time, hence allowing evidence in one image to be shared over the complete set. To this aim the segmentation is cast as a single efficient labeling problem over space and time with graph cuts. In contrast to most existing multi-view segmentation methods that rely on some form of dense reconstruction, ours only requires a sparse 3D sampling to propagate information between viewpoints. The approach is thoroughly evaluated on standard multiview datasets, as well as on videos. With static views, results compete with state-of-the-art methods but are achieved with significantly fewer viewpoints. With multiple videos, we report results that demonstrate the benefit of segmentation propagation through temporal cues.
Reference: text
sentIndex sentText sentNum sentScore
1 We propose a new approach that propagates segmentation coherence information in both space and time, hence allowing evidence in one image to be shared over the complete set. [sent-2, score-0.381]
2 To this aim the segmentation is cast as a single efficient labeling problem over space and time with graph cuts. [sent-3, score-0.37]
3 In contrast to most existing multi-view segmentation methods that rely on some form of dense reconstruction, ours only requires a sparse 3D sampling to propagate information between viewpoints. [sent-4, score-0.32]
4 With multiple videos, we report results that demonstrate the benefit of segmentation propagation through temporal cues. [sent-7, score-0.37]
5 This paper addresses the task of unsupervised multiple image segmentation of a single physical object, possibly moving, as seen from two or more calibrated cameras, which we refer to as multi-view object segmentation (MVOS), see Fig. [sent-13, score-0.443]
6 As noted by [25], this is an intrinsically challenging problem, especially when the number of views is small, and viewpoints far apart. [sent-15, score-0.522]
7 In MVOS, one cannot simply rely on shared appearance models of the object between views, since parts of the background seen from several viewpoints will also present similar aspects. [sent-18, score-0.638]
8 In that respect, the MVOS problem significantly differs from the object cosegmentation problem [21, 14], which assumes shared appearance models for the foreground but different backgrounds. [sent-19, score-0.431]
9 In most applications where viewpoints see a single scene and object, calibration is available or computable using off-the-shelf tools such as Bundler [23]. [sent-20, score-0.278]
10 We propose a new iterative formulation (§4) of multi-view object segmentation that uses a joint graph cut linking pixels through space and time. [sent-24, score-0.323]
11 This formulation is inspired by the efficient tools developed by the cosegmentation community to correlate segmentations of different views [13, 24]. [sent-25, score-0.461]
12 Fourth, the framework straightforwardly extends to the use of temporal links for multiple video sequences, to propagate momentarily reliable segmentation evidence across time in multi-view setups. [sent-31, score-0.657]
13 Many methods follow this initial trend by building explicit 3D object reconstructions and alternating with image segmentations of the views based on foreground/background appearance models [7, 18, 11, 19]. [sent-38, score-0.319]
14 Our focus is therefore on how to propagate information between views and across time for consistent pixel labeling and not precise 3D modeling. [sent-44, score-0.402]
15 Indeed, the simple 3D definition of geometric consistency given above often leads to a complex counterpart in images, with regions carved if no compound occupancy from other views is observed on the epipolar lines of a pixel, e.g. [sent-46, score-0.364]
16 This requires semi-circular setups with short baseline (as in [16]) and using specific heuristics to sparsify the superpixel interaction matrix with unclear complexity outcome. [sent-50, score-0.442]
17 The key assumption of these methods is the observation of a common foreground region, or objects sharing appearance properties, versus a background with higher variability across images. [sent-58, score-0.327]
18 As noted by [25], cosegmentation increasingly refers to diverse scenarios, ranging from user-guided segmentation to segmentation of classes of objects rather than instances. [sent-59, score-0.588]
19 Interestingly, some cosegmentation methods [13] have created tools to link segmentations across views based on appearance, formulating segmentation as a joint graph cut on the views. [sent-61, score-0.87]
20 Similarly, we introduce a graph structure specifically relevant to propagate geometric cues for MVOS, rather than photometric cues. [sent-62, score-0.302]
21 Such cues may be used to propagate manually specified segmentation information [26, 2, 27], or in a completely automated fashion [8, 5]. [sent-66, score-0.384]
22 Interestingly, some approaches construct a graph over the full 2D+t volume to link segmentations in time [26], which we propose to unify in a single graph-based framework to include intra-view, inter-view and temporal links. [sent-70, score-0.275]
23 Links between superpixels are shown as white lines; the iterative process alternates between graph cut on superpixels and color model updates. [sent-78, score-0.378]
24 First, in order to ensure inter-view propagation of segmentation information, we build on the idea that sparse 3D points (or samples) randomly picked in the region of interest (the common field of view of all the cameras) provide sufficient links between images [9]. [sent-81, score-0.46]
25 Each sample creates links in the graph between itself and pixels at its projection, whose strength reflects the object coherence probability of the sample. [sent-82, score-0.335]
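As an illustration of these links, the following minimal Python sketch projects a 3D sample into every calibrated view and records the superpixel it lands in. The 3x4 projection matrices, the per-view superpixel label maps and the function name are assumptions for illustration, not part of the paper.

```python
import numpy as np

def link_sample_to_superpixels(X, cameras, superpixel_maps):
    """Project one 3D sample X into every view and return the superpixel hit
    in each view, mimicking the sample-to-pixel links described above.

    X               : (3,) world point
    cameras         : list of 3x4 projection matrices, one per view
    superpixel_maps : list of HxW integer label images (e.g. SLIC output)
    """
    links = []
    Xh = np.append(X, 1.0)                      # homogeneous coordinates
    for view, (P, labels) in enumerate(zip(cameras, superpixel_maps)):
        x = P @ Xh
        if x[2] <= 0:                           # sample behind this camera
            continue
        u = int(round(x[0] / x[2]))             # pixel column
        v = int(round(x[1] / x[2]))             # pixel row
        h, w = labels.shape
        if 0 <= v < h and 0 <= u < w:           # projection inside the image
            links.append((view, int(labels[v, u])))
    return links
```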
26 Second, to ensure efficient intra-frame propagation, we compute a superpixel oversegmentation of each image, and define two neighborhood sets on each superpixel in the graph based on image-space and texture-space proximity. [sent-83, score-0.894]
27 Resorting to superpixels also allows one to benefit from richer region characterizations reducing colorspace ambiguity. [sent-84, score-0.279]
28 Third, the resulting MRF energy is minimized using s-t mincut [4], and the resulting segmented regions are used to re-estimate per-view foreground/background appearance models, which are in turn used to update the 3D sample object coherence probabilities. [sent-85, score-0.355]
29 For each image Iit (view i at time t) we have the set Pit of its superpixels p. [sent-92, score-0.279]
30 Segmenting the object in all the views consists in finding for every superpixel p ∈ Pit its label xp, with xp ∈ {f, b}, the foreground and background labels. [sent-94, score-1.088]
31 We denote St the set of 3D samples used to model dependencies between the views at instant t. [sent-95, score-0.311]
32 MRF Energy Principles Given the superpixel decomposition and 3D samples (shown in Fig. [sent-99, score-0.421]
33 3), we wish the MRF energy to reward a given labeling of all superpixels as follows, each principle leading to MRF energy terms described in the next subsections. [sent-100, score-0.531]
34 The appearance of a superpixel should comply with image-wide foreground or background models, depending on its label. [sent-102, score-0.714]
35 Neighboring superpixels likely have the same labels if they have similar appearance. [sent-104, score-0.279]
36 Two superpixels with similar color/texture are more likely to be part of the same object and thus more likely to have the same label. [sent-106, score-0.308]
37 These superpixels may not be neighbors due to occluding objects, etc. [sent-107, score-0.279]
38 Assuming sufficient 3D sampling of the scene, a superpixel should be foreground if it sees at least one object-consistent sample in the scene. [sent-111, score-0.675]
39 Conversely, a superpixel should be background if it sees no object-consistent 3D sample. [sent-112, score-0.51]
40 In the case of video data, superpixels in a sequence likely have the same label when they share similar appearance and are temporally linked through an observed flow field (e. [sent-114, score-0.5]
41 Intra-view appearance terms We use the classic unary data and binary spatial smoothness terms on superpixels, to which we add non-local appearance similarity terms on superpixel pairs for broader information propagation and a finer appearance criterion. [sent-119, score-0.91]
42 We denote Ec the unary data-term related to each superpixel appearance. [sent-121, score-0.473]
43 We characterize appearance by the sum of pixel-wise log-probabilities of being predicted by an image-wide foreground or background appearance distribution: Ec(xp) = −Σz∈p log P(I(z) | xp), summing over the pixels z of superpixel p.
44 This clustering is used to create a texture and color vocabulary on which foreground and background histograms (HiF and HiB) are computed. [sent-133, score-0.327]
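As a sketch, assuming each pixel has been quantized to a word of this vocabulary and the histograms are normalized, the unary term reconstructed above can be evaluated per superpixel as follows (function and variable names are illustrative):

```python
import numpy as np

def unary_data_term(words_in_p, H_f, H_b, eps=1e-8):
    """E_c costs for one superpixel p: sums of pixel-wise negative
    log-probabilities under the image-wide foreground/background histograms.

    words_in_p : 1-D array of vocabulary indices for the pixels of p
    H_f, H_b   : normalized foreground / background histograms over the words
    """
    e_fg = -np.log(H_f[words_in_p] + eps).sum()  # cost of labeling p foreground
    e_bg = -np.log(H_b[words_in_p] + eps).sum()  # cost of labeling p background
    return e_fg, e_bg
```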
45 The links between superpixels of different frames use both interest point matches and optical flow. [sent-135, score-0.414]
46 This binary term, denoted En, discourages the assignment of different labels to neighboring superpixels that exhibit similar appearance. [sent-137, score-0.279]
47 To model this similarity we use the previously defined texture and color vocabulary to create superpixel descriptors. [sent-139, score-0.474]
48 The appearance descriptor of a given superpixel p is denoted Ap. [sent-141, score-0.508]
49 Let Nni,t define the set of adjacent superpixel pairs in view i at time t. [sent-142, score-0.416]
50 For (p, q) ∈ Nni,t, the proposed En is inversely proportional to the distance between the two superpixel descriptors, as follows: En(xp, xq) = θn exp(−χ²(Ap, Aq)) if xp ≠ xq, and 0 otherwise.
51 χ²(·, ·) here is the χ² distance between the superpixel descriptors. [sent-148, score-0.387]
52 Retrieving for each superpixel its k-nearest neighbors in χ² distance, we define the set Nai,t of similar superpixel pairs and, for each of these pairs, define: Ea(xp, xq) = θa exp(−χ²(Ap, Aq)) if xp ≠ xq, and 0 otherwise.
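A brute-force sketch of this construction under the same descriptor assumptions (an approximate nearest-neighbor search would replace the inner loop at scale):

```python
import numpy as np

def chi2(a, b, eps=1e-8):
    """Chi-squared distance between two normalized histogram descriptors."""
    return 0.5 * np.sum((a - b) ** 2 / (a + b + eps))

def knn_pairs(descriptors, k):
    """Build the set N_a: each superpixel is linked to its k nearest
    neighbors in chi-squared distance over the descriptors A_p."""
    n = len(descriptors)
    pairs = set()
    for p in range(n):
        d = np.array([chi2(descriptors[p], descriptors[q]) for q in range(n)])
        d[p] = np.inf                    # exclude the superpixel itself
        for q in np.argsort(d)[:k]:
            pairs.add((min(p, int(q)), max(p, int(q))))  # undirected pair
    return pairs
```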
53 Inter-view geometric consistency terms To propagate inter-view information, we use a graph structure connecting a 3D sample to pixels it projects on. [sent-157, score-0.417]
54 We associate a unary term and a label xs to sample s, allowing the cut algorithm the flexibility of deciding on the fly whether to include s in the object segmentation, based on all MRF terms: Es(xs) = −log Psf if xs = f, and −log(1 − Psf) if xs = b, where Psf is the object coherence probability of sample s.
55 To ensure projection consistency, we connect each sample s to the superpixels p it projects onto in all views, which defines a neighborhood Ns. [sent-167, score-0.414]
56 As shown in Fig. 4, no cut of the corresponding graph may simultaneously assign a superpixel p to background and a sample s that projects on p to foreground. [sent-171, score-0.879]
57 Thus it enforces the following desirable projection consistency property: labeling a superpixel p as background is only possible if all the samples s projecting onto it are also labeled background. [sent-172, score-0.676]
58 If a sample s is labeled as foreground, then superpixels at its projection positions cannot be labeled as background. [sent-175, score-0.502]
59 The converse property, inclusion of segmentations in the sample’s projected set, cannot be ensured: a superpixel can be labeled foreground even though it sees no foreground sample. [sent-177, score-0.875]
60 This would require enforcing a foreground superpixel p to see at least one foreground sample s, which can only be expressed with higher-order MRF terms. [sent-178, score-0.792]
61 The desired behavior can be achieved by associating to each superpixel p a sample reprojection term P(xp | Vp). [sent-181, score-0.442]
62 Its purpose is to discourage foreground labeling of p when no sample was labeled foreground in the 3D region Vp seen by the superpixel, and conversely to encourage foreground superpixel labeling as soon as a sample s in Vp is foreground. [sent-182, score-1.18]
63 Time consistency terms In the case of video segmentation, the idea is to benefit from information at different instants and to propagate consistent foreground/background labeling for the frames of the same viewpoint. [sent-185, score-0.412]
64 A set Nfi of related superpixels between the frames of the same view can be obtained using interest point matches and optical flow. [sent-186, score-0.279]
65 The propagation is done through the energy term Ef that enforces consistent labeling of linked superpixels (pt, qt+1) ∈ Nfi as follows: Ef(xpt, xqt+1) = θf exp(−d(pt, qt+1)²/(2σ²)) if xpt ≠ xqt+1, and 0 otherwise, where d measures the dissimilarity of the matched superpixels.
66 Thus, a good matching will constrain the two linked superpixels to have the same label. [sent-191, score-0.317]
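One simple way to build such a set Nf from optical flow alone is sketched below; the paper also uses interest-point matches, and the centroid-based matching rule here is an assumption:

```python
import numpy as np

def temporal_links(labels_t, labels_t1, flow):
    """Link each superpixel at time t to the superpixel at t+1 containing
    its flow-displaced centroid.

    labels_t, labels_t1 : HxW superpixel label maps at t and t+1
    flow                : HxWx2 optical flow from t to t+1 (dx, dy)
    """
    h, w = labels_t.shape
    links = []
    for p in np.unique(labels_t):
        ys, xs = np.nonzero(labels_t == p)
        cx = xs.mean() + flow[ys, xs, 0].mean()   # displaced centroid column
        cy = ys.mean() + flow[ys, xs, 1].mean()   # displaced centroid row
        u, v = int(round(cx)), int(round(cy))
        if 0 <= v < h and 0 <= u < w:
            links.append((int(p), int(labels_t1[v, u])))
    return links
```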
67 MRF energy and graph construction Let X be the conjunction of all possible sample and superpixel labels. [sent-195, score-0.598]
68 Our MRF energy can thus be written with three groups of terms: the intra-view group, the inter-view group with its own multi-view binary and unary terms, and finally the time consistency group with only binary terms between successive instants t and t + 1. [sent-196, score-0.383]
69 Finding a multiview segmentation for our set of images, given the set of histograms HiB and HiF, and the probabilities Psf, consists in finding the labeling X minimizing the total energy E(X) = Eintra(X) + Einter(X) + Etime(X), where Eintra gathers the Ec, En and Ea terms, Einter the sample terms Es together with the projection links, and Etime the Ef terms.
70 This graph contains the two terminal nodes source and sink, one node for each superpixel and one node for each 3D sample s. [sent-217, score-0.526]
71 Edges are added between superpixels and samples according to the energy terms previously defined. [sent-218, score-0.414]
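A minimal sketch of this graph with the PyMaxflow library (an assumption; the paper uses Kolmogorov's implementation [4] directly), with all energy terms reduced to precomputed weights:

```python
import maxflow  # PyMaxflow; the source side encodes the foreground label here

def segment_once(unaries_sp, unaries_s, smooth_edges, projection_links, inf=1e9):
    """One joint graph cut over superpixel and 3D sample nodes.

    unaries_sp       : list of (E_fg, E_bg) costs per superpixel
    unaries_s        : list of (E_fg, E_bg) costs per 3D sample
    smooth_edges     : (p, q, w) symmetric edges from the En / Ea terms
    projection_links : (s, p) pairs, sample s projecting onto superpixel p
    """
    g = maxflow.Graph[float]()
    sp = g.add_nodes(len(unaries_sp))        # one node per superpixel
    sm = g.add_nodes(len(unaries_s))         # one node per 3D sample
    # Terminal edges: labeling a node foreground costs E_fg, background E_bg.
    for node, (e_fg, e_bg) in zip(sp, unaries_sp):
        g.add_tedge(node, e_bg, e_fg)
    for node, (e_fg, e_bg) in zip(sm, unaries_s):
        g.add_tedge(node, e_bg, e_fg)
    for p, q, w in smooth_edges:             # symmetric smoothness/similarity
        g.add_edge(sp[p], sp[q], w, w)
    # Directed infinite edge: the configuration (s foreground, p background)
    # would cut this edge at infinite cost, enforcing projection consistency.
    for s, p in projection_links:
        g.add_edge(sm[s], sp[p], inf, 0)
    g.maxflow()
    return [g.get_segment(node) == 0 for node in sp]  # True = foreground
```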
72 Computational approach Similar to most state-of-the-art segmentation methods, we adopt an iterative scheme where we alternate between the graph cut optimization described above and an update of the color models. [sent-222, score-0.488]
73 The extraction, description and linking of superpixels are done once, at initialization time. [sent-226, score-0.31]
74 In the iterative process, the unary terms (objectness, superpixel sample projection and silhouette labeling probabilities) are computed using the appearance models of the previous iteration. [sent-227, score-0.837]
75 The algorithm converges when no more superpixels are re-labeled from an iteration to another. [sent-228, score-0.279]
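A sketch of this outer loop, with the individual steps abstracted behind caller-supplied functions (the callable names are hypothetical; the paper does not publish code):

```python
def iterate_segmentation(init_models, compute_unaries, graph_cut, update_models,
                         max_iters=50):
    """Alternate the joint graph cut with color-model re-estimation until
    no superpixel changes label, mirroring the scheme described above."""
    models, labels = init_models, None
    for _ in range(max_iters):
        new_labels = graph_cut(compute_unaries(models))
        if new_labels == labels:      # convergence: no superpixel re-labeled
            break
        labels = new_labels
        models = update_models(labels)
    return labels
```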
76 Superpixel labeling at convergence is used to estimate foreground/background appearance models, which are then used in a standard graph cut segmentation at pixel level, with unary terms based on appearance and smoothing binary terms using color dissimilarity. [sent-229, score-0.735]
77 Experimental protocol We implemented our approach using publicly available software for superpixel segmentation (SLIC [1]) and using Kolmogorov’s s-t mincut implementation [4]. [sent-235, score-0.641]
78 We use superpixel sizes of 30-50 pixels to ensure oversegmentation, obtaining around 2000 superpixels per image. [sent-236, score-0.666]
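For instance, the oversegmentation step can be reproduced with the SLIC implementation in scikit-image; this is a sketch, and the exact parameters (the paper uses the original SLIC code [1]) are assumptions:

```python
from skimage import io
from skimage.segmentation import slic

image = io.imread("view_0.png")  # hypothetical input view
# n_segments around 2000 matches the ~30-50 pixel superpixels reported above.
labels = slic(image, n_segments=2000, compactness=10)
print(labels.max() + 1, "superpixels")
```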
79 We show the graph cut result on superpixels at convergence and the final segmentation at pixel level. [sent-259, score-0.683]
80 Very good results are obtained with only 3 widespread viewpoints (such as Fig. [sent-261, score-0.323]
81 The second and third columns contain, respectively, superpixel- and pixel-level segmentation results. [sent-269, score-0.594]
82 As shown in Fig. 8, approaches relying only on color [9] fail to segment foreground objects, whereas our approach benefits from a more complex appearance model. [sent-277, score-0.305]
83 In this example with 8 viewpoints, the table is seen by all the views and is identified as part of the foreground. [sent-286, score-0.457]
84 We evaluate here the sensitivity of our approach to the number of viewpoints, and the quality of the segmentation result compared to state-of-the-art approaches [9, 16, 25], by randomly picking 10 viewpoint subsets for a given tested number of viewpoints and averaging the results. [sent-294, score-0.807]
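The subset-sampling part of this protocol amounts to the following sketch (function name and seed are assumptions):

```python
import random

def viewpoint_subsets(n_views, subset_size, n_draws=10, seed=0):
    """Draw the random viewpoint subsets over which scores are averaged."""
    rng = random.Random(seed)
    return [rng.sample(range(n_views), subset_size) for _ in range(n_draws)]

subsets = viewpoint_subsets(8, 3)  # e.g. 10 subsets of 3 views out of 8
```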
85 Fig. 9 shows that our approach exhibits very little sensitivity to the number of viewpoints and achieves excellent segmentation results even with only 3 widespread viewpoints. [sent-296, score-0.563]
86 Let us emphasize the excellent performance of the algorithm on CAR and CHAIR1 datasets, despite the very low number of viewpoints used and the challenging nature of color ambiguities in the datasets. [sent-297, score-0.339]
87 The difference in segmentation precision between approaches is mainly due to some difficult color ambiguities in the model, such as shadows, which appear consistent with the hypotheses of both geometric and photometric cosegmentation methods. [sent-298, score-0.48]
88 It should be noted that in [16], depth information and plane detection significantly help, especially through the identification of the ground plane, which eliminates some ambiguities at the price of requiring more viewpoints to obtain the stereo. [sent-299, score-0.33]
89 Video segmentation results In the case of video sequences, our framework has the ability to propagate multi-view segmentation evidence over time. [sent-302, score-0.688]
90 It also makes it possible to propagate temporal evidence from a given viewpoint to other viewpoints, e.g. [sent-303, score-0.296]
91 They can help resolve local segmentation ambiguities in a few views in time or space. [sent-306, score-0.466]
92 Multi-video segmentation results using space and time propagation (middle) vs. [sent-308, score-0.296]
93 Fig. 10 shows segmentation results with and without temporal consistency. [sent-317, score-0.281]
94 In this dataset, the complex nature of the environment, the handheld cameras in general motion and non-static backgrounds, and the few, widespread viewpoints make the segmentation very challenging. [sent-322, score-0.658]
95 As shown in Fig. 10, specifying ambiguous foreground/background regions with two strokes in a single view (second row, left image) is sufficient to obtain visually satisfying results. This demonstrates that cues in an image can benefit other images with different viewpoints and at different times. [sent-324, score-0.34]
96 To our knowledge, we propose the first unified solution dealing with intra-view, inter-view, and temporal cues in a multi-view image and video segmentation context, within a single consistent MRF model. [sent-327, score-0.397]
97 Automatic 3d object segmentation in multiple views using volumetric graph-cuts. [sent-387, score-0.446]
98 Joint 3d-reconstruction and background separation in multiple views using graph cuts. [sent-411, score-0.359]
99 Multiple view object cosegmentation using appearance and stereo cues. [sent-431, score-0.285]
100 Graph cut based multiple view segmentation for 3d reconstruction. [sent-488, score-0.317]
wordName wordTfidf (topN-words)
[('superpixel', 0.387), ('mvos', 0.341), ('superpixels', 0.279), ('viewpoints', 0.247), ('views', 0.21), ('segmentation', 0.207), ('foreground', 0.175), ('cosegmentation', 0.14), ('mrf', 0.12), ('propagate', 0.113), ('xp', 0.111), ('evidences', 0.109), ('links', 0.102), ('propagation', 0.089), ('appearance', 0.087), ('vp', 0.086), ('unary', 0.086), ('djelouah', 0.085), ('hib', 0.085), ('graph', 0.084), ('psf', 0.083), ('cut', 0.081), ('segmentations', 0.08), ('labeling', 0.079), ('widespread', 0.076), ('instants', 0.076), ('temporal', 0.074), ('energy', 0.072), ('dancers', 0.07), ('nfi', 0.07), ('handheld', 0.068), ('silhouette', 0.066), ('background', 0.065), ('coherence', 0.065), ('cues', 0.064), ('consistency', 0.063), ('cameras', 0.06), ('sees', 0.058), ('visibility', 0.058), ('buste', 0.057), ('hif', 0.057), ('interview', 0.057), ('xhpe', 0.057), ('graphcut', 0.056), ('sample', 0.055), ('setups', 0.055), ('rother', 0.054), ('video', 0.052), ('rwi', 0.05), ('occupancy', 0.05), ('ambiguities', 0.049), ('slic', 0.048), ('projection', 0.048), ('static', 0.047), ('mincut', 0.047), ('objectness', 0.045), ('flow', 0.044), ('art', 0.044), ('campbell', 0.044), ('texture', 0.044), ('color', 0.043), ('dozen', 0.042), ('pit', 0.042), ('weighing', 0.042), ('geometric', 0.041), ('vogiatzis', 0.04), ('coined', 0.04), ('kowdle', 0.039), ('xs', 0.038), ('linked', 0.038), ('link', 0.037), ('oversegmentation', 0.036), ('ef', 0.036), ('en', 0.036), ('optic', 0.035), ('ioft', 0.035), ('noted', 0.034), ('user', 0.034), ('samples', 0.034), ('monocular', 0.034), ('interest', 0.033), ('sensitivity', 0.033), ('ej', 0.032), ('projects', 0.032), ('convergence', 0.032), ('plant', 0.032), ('intrinsically', 0.031), ('tools', 0.031), ('black', 0.031), ('sponsored', 0.031), ('eea', 0.031), ('instant', 0.031), ('linking', 0.031), ('blake', 0.03), ('inversely', 0.03), ('proportional', 0.03), ('state', 0.029), ('object', 0.029), ('terms', 0.029), ('view', 0.029)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 282 iccv-2013-Multi-view Object Segmentation in Space and Time
Author: Abdelaziz Djelouah, Jean-Sébastien Franco, Edmond Boyer, François Le_Clerc, Patrick Pérez
Abstract: In this paper, we address the problem of object segmentation in multiple views or videos when two or more viewpoints of the same scene are available. We propose a new approach that propagates segmentation coherence information in both space and time, hence allowing evidence in one image to be shared over the complete set. To this aim the segmentation is cast as a single efficient labeling problem over space and time with graph cuts. In contrast to most existing multi-view segmentation methods that rely on some form of dense reconstruction, ours only requires a sparse 3D sampling to propagate information between viewpoints. The approach is thoroughly evaluated on standard multiview datasets, as well as on videos. With static views, results compete with state-of-the-art methods but are achieved with significantly fewer viewpoints. With multiple videos, we report results that demonstrate the benefit of segmentation propagation through temporal cues.
2 0.31891197 299 iccv-2013-Online Video SEEDS for Temporal Window Objectness
Author: Michael Van_Den_Bergh, Gemma Roig, Xavier Boix, Santiago Manen, Luc Van_Gool
Abstract: Superpixel and objectness algorithms are broadly used as a pre-processing step to generate support regions and to speed-up further computations. Recently, many algorithms have been extended to video in order to exploit the temporal consistency between frames. However, most methods are computationally too expensive for real-time applications. We introduce an online, real-time video superpixel algorithm based on the recently proposed SEEDS superpixels. A new capability is incorporated which delivers multiple diverse samples (hypotheses) of superpixels in the same image or video sequence. The multiple samples are shown to provide a strong cue to efficiently measure the objectness of image windows, and we introduce the novel concept of objectness in temporal windows. Experiments show that the video superpixels achieve comparable performance to state-of-the-art offline methods while running at 30 fps on a single 2.8 GHz i7 CPU. State-of-the-art performance on objectness is also demonstrated, yet orders of magnitude faster and extended to temporal windows in video.
3 0.28875056 414 iccv-2013-Temporally Consistent Superpixels
Author: Matthias Reso, Jörn Jachalsky, Bodo Rosenhahn, Jörn Ostermann
Abstract: Superpixel algorithms represent a very useful and increasingly popular preprocessing step for a wide range of computer vision applications, as they offer the potential to boost efficiency and effectiveness. In this regard, this paper presents a highly competitive approach for temporally consistent superpixels for video content. The approach is based on energy-minimizing clustering utilizing a novel hybrid clustering strategy for a multi-dimensional feature space working in a global color subspace and local spatial subspaces. Moreover, a new contour evolution based strategy is introduced to ensure spatial coherency of the generated superpixels. For a thorough evaluation the proposed approach is compared to state-of-the-art supervoxel algorithms using established benchmarks and shows a superior performance.
4 0.28261772 383 iccv-2013-Semi-supervised Learning for Large Scale Image Cosegmentation
Author: Zhengxiang Wang, Rujie Liu
Abstract: This paper introduces the use of semi-supervised learning for large scale image cosegmentation. Different from traditional unsupervised cosegmentation that does not use any segmentation groundtruth, semi-supervised cosegmentation exploits the similarity from both the very limited training image foregrounds, as well as the common object shared between the large number of unsegmented images. This would be a much more practical way to effectively cosegment a large number of related images simultaneously, where previous unsupervised cosegmentation works poorly due to the large variances in appearance between different images and the lack of segmentation groundtruth for guidance in cosegmentation. For semi-supervised cosegmentation in large scale, we propose an effective method by minimizing an energy function, which consists of the inter-image distance, the intra-image distance and the balance term. We also propose an iterative updating algorithm to efficiently solve this energy function, which decomposes the original energy minimization problem into sub-problems, and updates each image alternately to reduce the number of variables in each subproblem for computation efficiency. Experiment results on iCoseg and Pascal VOC datasets show that the proposed cosegmentation method can effectively cosegment hundreds of images in less than one minute. Our semi-supervised cosegmentation is able to outperform both unsupervised cosegmentation as well as fully supervised single image segmentation, especially when the training data is limited.
5 0.24066207 160 iccv-2013-Fast Object Segmentation in Unconstrained Video
Author: Anestis Papazoglou, Vittorio Ferrari
Abstract: We present a technique for separating foreground objects from the background in a video. Our method is fast, fully automatic, and makes minimal assumptions about the video. This enables handling essentially unconstrained settings, including rapidly moving background, arbitrary object motion and appearance, and non-rigid deformations and articulations. In experiments on two datasets containing over 1400 video shots, our method outperforms a state-of-the-art background subtraction technique [4] as well as methods based on clustering point tracks [6, 18, 19]. Moreover, it performs comparably to recent video object segmentation methods based on object proposals [14, 16, 27], while being orders of magnitude faster.
6 0.17422092 144 iccv-2013-Estimating the 3D Layout of Indoor Scenes and Its Clutter from Depth Sensors
7 0.15651159 225 iccv-2013-Joint Segmentation and Pose Tracking of Human in Natural Videos
8 0.15038384 442 iccv-2013-Video Segmentation by Tracking Many Figure-Ground Segments
9 0.14716443 186 iccv-2013-GrabCut in One Cut
10 0.1449655 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation
11 0.14336273 317 iccv-2013-Piecewise Rigid Scene Flow
12 0.13873091 78 iccv-2013-Coherent Motion Segmentation in Moving Camera Videos Using Optical Flow Orientations
13 0.13398777 150 iccv-2013-Exemplar Cut
14 0.13201392 208 iccv-2013-Image Co-segmentation via Consistent Functional Maps
15 0.12762146 318 iccv-2013-PixelTrack: A Fast Adaptive Algorithm for Tracking Non-rigid Objects
16 0.12424268 33 iccv-2013-A Unified Video Segmentation Benchmark: Annotation, Metrics and Analysis
17 0.12334774 71 iccv-2013-Category-Independent Object-Level Saliency Detection
18 0.11934092 411 iccv-2013-Symbiotic Segmentation and Part Localization for Fine-Grained Categorization
19 0.11904786 379 iccv-2013-Semantic Segmentation without Annotating Segments
20 0.11834739 95 iccv-2013-Cosegmentation and Cosketch by Unsupervised Learning
topicId topicWeight
[(0, 0.257), (1, -0.091), (2, 0.099), (3, 0.055), (4, 0.068), (5, 0.065), (6, -0.143), (7, 0.095), (8, 0.048), (9, -0.113), (10, 0.015), (11, 0.198), (12, 0.035), (13, 0.015), (14, -0.1), (15, -0.023), (16, -0.064), (17, -0.098), (18, -0.153), (19, -0.096), (20, 0.138), (21, -0.174), (22, -0.089), (23, -0.027), (24, -0.036), (25, 0.025), (26, -0.089), (27, 0.039), (28, -0.068), (29, 0.064), (30, 0.035), (31, 0.125), (32, 0.073), (33, 0.073), (34, 0.106), (35, -0.055), (36, 0.109), (37, 0.075), (38, -0.06), (39, -0.01), (40, 0.027), (41, -0.129), (42, -0.041), (43, 0.066), (44, -0.051), (45, -0.043), (46, 0.055), (47, -0.01), (48, 0.135), (49, 0.002)]
simIndex simValue paperId paperTitle
same-paper 1 0.94559658 282 iccv-2013-Multi-view Object Segmentation in Space and Time
Author: Abdelaziz Djelouah, Jean-Sébastien Franco, Edmond Boyer, François Le_Clerc, Patrick Pérez
Abstract: In this paper, we address the problem of object segmentation in multiple views or videos when two or more viewpoints of the same scene are available. We propose a new approach that propagates segmentation coherence information in both space and time, hence allowing evidence in one image to be shared over the complete set. To this aim the segmentation is cast as a single efficient labeling problem over space and time with graph cuts. In contrast to most existing multi-view segmentation methods that rely on some form of dense reconstruction, ours only requires a sparse 3D sampling to propagate information between viewpoints. The approach is thoroughly evaluated on standard multiview datasets, as well as on videos. With static views, results compete with state-of-the-art methods but are achieved with significantly fewer viewpoints. With multiple videos, we report results that demonstrate the benefit of segmentation propagation through temporal cues.
2 0.84406352 383 iccv-2013-Semi-supervised Learning for Large Scale Image Cosegmentation
Author: Zhengxiang Wang, Rujie Liu
Abstract: This paper introduces the use of semi-supervised learning for large scale image cosegmentation. Different from traditional unsupervised cosegmentation that does not use any segmentation groundtruth, semi-supervised cosegmentation exploits the similarity from both the very limited training image foregrounds, as well as the common object shared between the large number of unsegmented images. This would be a much more practical way to effectively cosegment a large number of related images simultaneously, where previous unsupervised cosegmentation works poorly due to the large variances in appearance between different images and the lack of segmentation groundtruth for guidance in cosegmentation. For semi-supervised cosegmentation in large scale, we propose an effective method by minimizing an energy function, which consists of the inter-image distance, the intra-image distance and the balance term. We also propose an iterative updating algorithm to efficiently solve this energy function, which decomposes the original energy minimization problem into sub-problems, and updates each image alternately to reduce the number of variables in each subproblem for computation efficiency. Experiment results on iCoseg and Pascal VOC datasets show that the proposed cosegmentation method can effectively cosegment hundreds of images in less than one minute. Our semi-supervised cosegmentation is able to outperform both unsupervised cosegmentation as well as fully supervised single image segmentation, especially when the training data is limited.
3 0.83687866 299 iccv-2013-Online Video SEEDS for Temporal Window Objectness
Author: Michael Van_Den_Bergh, Gemma Roig, Xavier Boix, Santiago Manen, Luc Van_Gool
Abstract: Superpixel and objectness algorithms are broadly used as a pre-processing step to generate support regions and to speed-up further computations. Recently, many algorithms have been extended to video in order to exploit the temporal consistency between frames. However, most methods are computationally too expensive for real-time applications. We introduce an online, real-time video superpixel algorithm based on the recently proposed SEEDS superpixels. A new capability is incorporated which delivers multiple diverse samples (hypotheses) of superpixels in the same image or video sequence. The multiple samples are shown to provide a strong cue to efficiently measure the objectness of image windows, and we introduce the novel concept of objectness in temporal windows. Experiments show that the video superpixels achieve comparable performance to state-of-the-art offline methods while running at 30 fps on a single 2.8 GHz i7 CPU. State-of-the-art performance on objectness is also demonstrated, yet orders of magnitude faster and extended to temporal windows in video.
4 0.81452847 414 iccv-2013-Temporally Consistent Superpixels
Author: Matthias Reso, Jörn Jachalsky, Bodo Rosenhahn, Jörn Ostermann
Abstract: Superpixel algorithms represent a very useful and increasingly popular preprocessing step for a wide range of computer vision applications, as they offer the potential to boost efficiency and effectiveness. In this regard, this paper presents a highly competitive approach for temporally consistent superpixels for video content. The approach is based on energy-minimizing clustering utilizing a novel hybrid clustering strategy for a multi-dimensional feature space working in a global color subspace and local spatial subspaces. Moreover, a new contour evolution based strategy is introduced to ensure spatial coherency of the generated superpixels. For a thorough evaluation the proposed approach is compared to state-of-the-art supervoxel algorithms using established benchmarks and shows a superior performance.
5 0.67538083 160 iccv-2013-Fast Object Segmentation in Unconstrained Video
Author: Anestis Papazoglou, Vittorio Ferrari
Abstract: We present a technique for separating foreground objects from the background in a video. Our method is fast, fully automatic, and makes minimal assumptions about the video. This enables handling essentially unconstrained settings, including rapidly moving background, arbitrary object motion and appearance, and non-rigid deformations and articulations. In experiments on two datasets containing over 1400 video shots, our method outperforms a state-of-the-art background subtraction technique [4] as well as methods based on clustering point tracks [6, 18, 19]. Moreover, it performs comparably to recent video object segmentation methods based on object proposals [14, 16, 27], while being orders of magnitude faster.
6 0.61943734 33 iccv-2013-A Unified Video Segmentation Benchmark: Annotation, Metrics and Analysis
7 0.61439556 186 iccv-2013-GrabCut in One Cut
8 0.60202038 63 iccv-2013-Bounded Labeling Function for Global Segmentation of Multi-part Objects with Geometric Constraints
9 0.59690934 208 iccv-2013-Image Co-segmentation via Consistent Functional Maps
10 0.58006507 95 iccv-2013-Cosegmentation and Cosketch by Unsupervised Learning
11 0.56215405 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation
12 0.56035161 110 iccv-2013-Detecting Curved Symmetric Parts Using a Deformable Disc Model
13 0.52554131 76 iccv-2013-Coarse-to-Fine Semantic Video Segmentation Using Supervoxel Trees
14 0.52510947 150 iccv-2013-Exemplar Cut
15 0.50456405 74 iccv-2013-Co-segmentation by Composition
16 0.49026963 330 iccv-2013-Proportion Priors for Image Sequence Segmentation
17 0.47921765 78 iccv-2013-Coherent Motion Segmentation in Moving Camera Videos Using Optical Flow Orientations
18 0.47381973 245 iccv-2013-Learning a Dictionary of Shape Epitomes with Applications to Image Labeling
19 0.46668059 42 iccv-2013-Active MAP Inference in CRFs for Efficient Semantic Segmentation
20 0.46620655 379 iccv-2013-Semantic Segmentation without Annotating Segments
topicId topicWeight
[(2, 0.042), (4, 0.014), (26, 0.487), (31, 0.037), (42, 0.094), (64, 0.042), (73, 0.019), (89, 0.161), (98, 0.011)]
simIndex simValue paperId paperTitle
1 0.9642818 405 iccv-2013-Structured Light in Sunlight
Author: Mohit Gupta, Qi Yin, Shree K. Nayar
Abstract: Strong ambient illumination severely degrades the performance of structured light based techniques. This is especially true in outdoor scenarios, where the structured light sources have to compete with sunlight, whose power is often 2-5 orders of magnitude larger than the projected light. In this paper, we propose the concept of light-concentration to overcome strong ambient illumination. Our key observation is that given a fixed light (power) budget, it is always better to allocate it sequentially in several portions of the scene, as compared to spreading it over the entire scene at once. For a desired level of accuracy, we show that by distributing light appropriately, the proposed approach requires 1-2 orders lower acquisition time than existing approaches. Our approach is illumination-adaptive as the optimal light distribution is determined based on a measurement of the ambient illumination level. Since current light sources have a fixed light distribution, we have built a prototype light source that supports flexible light distribution by controlling the scanning speed of a laser scanner. We show several high quality 3D scanning results in a wide range of outdoor scenarios. The proposed approach will benefit 3D vision systems that need to operate outdoors under extreme ambient illumination levels on a limited time and power budget.
2 0.93258369 51 iccv-2013-Anchored Neighborhood Regression for Fast Example-Based Super-Resolution
Author: Radu Timofte, Vincent De_Smet, Luc Van_Gool
Abstract: Recently there have been significant advances in image upscaling or image super-resolution based on a dictionary of low and high resolution exemplars. The running time of the methods is often ignored despite the fact that it is a critical factor for real applications. This paper proposes fast super-resolution methods while making no compromise on quality. First, we support the use of sparse learned dictionaries in combination with neighbor embedding methods. In this case, the nearest neighbors are computed using the correlation with the dictionary atoms rather than the Euclidean distance. Moreover, we show that most of the current approaches reach top performance for the right parameters. Second, we show that using global collaborative coding has considerable speed advantages, reducing the super-resolution mapping to a precomputed projective matrix. Third, we propose the anchored neighborhood regression. That is to anchor the neighborhood embedding of a low resolution patch to the nearest atom in the dictionary and to precompute the corresponding embedding matrix. These proposals are contrasted with current state-of-the-art methods on standard images. We obtain similar or improved quality and one or two orders of magnitude speed improvements.
3 0.93237549 395 iccv-2013-Slice Sampling Particle Belief Propagation
Author: Oliver Müller, Michael Ying Yang, Bodo Rosenhahn
Abstract: Inference in continuous label Markov random fields is a challenging task. We use particle belief propagation (PBP) for solving the inference problem in continuous label space. Sampling particles from the belief distribution is typically done by using Metropolis-Hastings (MH) Markov chain Monte Carlo (MCMC) methods which involves sampling from a proposal distribution. This proposal distribution has to be carefully designed depending on the particular model and input data to achieve fast convergence. We propose to avoid dependence on a proposal distribution by introducing a slice sampling based PBP algorithm. The proposed approach shows superior convergence performance on an image denoising toy example. Our findings are validated on a challenging relational 2D feature tracking application.
4 0.9179948 125 iccv-2013-Drosophila Embryo Stage Annotation Using Label Propagation
Author: Tomáš Kazmar, Evgeny Z. Kvon, Alexander Stark, Christoph H. Lampert
Abstract: In this work we propose a system for automatic classification of Drosophila embryos into developmental stages. While the system is designed to solve an actual problem in biological research, we believe that the principle underlying it is interesting not only for biologists, but also for researchers in computer vision. The main idea is to combine two orthogonal sources of information: one is a classifier trained on strongly invariant features, which makes it applicable to images of very different conditions, but also leads to rather noisy predictions. The other is a label propagation step based on a more powerful similarity measure that however is only consistent within specific subsets of the data at a time. In our biological setup, the information sources are the shape and the staining patterns of embryo images. We show experimentally that while neither of the methods can be used by itself to achieve satisfactory results, their combination achieves prediction quality comparable to human performance.
same-paper 5 0.90559483 282 iccv-2013-Multi-view Object Segmentation in Space and Time
Author: Abdelaziz Djelouah, Jean-Sébastien Franco, Edmond Boyer, François Le_Clerc, Patrick Pérez
Abstract: In this paper, we address the problem of object segmentation in multiple views or videos when two or more viewpoints of the same scene are available. We propose a new approach that propagates segmentation coherence information in both space and time, hence allowing evidence in one image to be shared over the complete set. To this aim the segmentation is cast as a single efficient labeling problem over space and time with graph cuts. In contrast to most existing multi-view segmentation methods that rely on some form of dense reconstruction, ours only requires a sparse 3D sampling to propagate information between viewpoints. The approach is thoroughly evaluated on standard multiview datasets, as well as on videos. With static views, results compete with state-of-the-art methods but are achieved with significantly fewer viewpoints. With multiple videos, we report results that demonstrate the benefit of segmentation propagation through temporal cues.
6 0.90198302 198 iccv-2013-Hierarchical Part Matching for Fine-Grained Visual Categorization
7 0.90002847 348 iccv-2013-Refractive Structure-from-Motion on Underwater Images
8 0.84555936 295 iccv-2013-On One-Shot Similarity Kernels: Explicit Feature Maps and Properties
9 0.82185835 8 iccv-2013-A Deformable Mixture Parsing Model with Parselets
10 0.82161224 102 iccv-2013-Data-Driven 3D Primitives for Single Image Understanding
11 0.73879117 156 iccv-2013-Fast Direct Super-Resolution by Simple Functions
12 0.73700488 414 iccv-2013-Temporally Consistent Superpixels
13 0.71532178 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation
14 0.69894773 150 iccv-2013-Exemplar Cut
15 0.69653904 161 iccv-2013-Fast Sparsity-Based Orthogonal Dictionary Learning for Image Restoration
16 0.68660313 330 iccv-2013-Proportion Priors for Image Sequence Segmentation
18 0.68179494 423 iccv-2013-Towards Motion Aware Light Field Video for Dynamic Scenes
19 0.681445 411 iccv-2013-Symbiotic Segmentation and Part Localization for Fine-Grained Categorization
20 0.67484206 95 iccv-2013-Cosegmentation and Cosketch by Unsupervised Learning