iccv iccv2013 iccv2013-33 knowledge-graph by maker-knowledge-mining

33 iccv-2013-A Unified Video Segmentation Benchmark: Annotation, Metrics and Analysis


Source: pdf

Author: Fabio Galasso, Naveen Shankar Nagaraja, Tatiana Jiménez Cárdenas, Thomas Brox, Bernt Schiele

Abstract: Video segmentation research is currently limited by the lack of a benchmark dataset that covers the large variety of subproblems appearing in video segmentation and that is large enough to avoid overfitting. Consequently, there is little analysis of video segmentation which generalizes across subtasks, and it is not yet clear which and how video segmentation should leverage the information from the still-frames, as previously studied in image segmentation, alongside video specific information, such as temporal volume, motion and occlusion. In this work we provide such an analysis based on annotations of a large video dataset, where each video is manually segmented by multiple persons. Moreover, we introduce a new volume-based metric that includes the important aspect of temporal consistency, that can deal with segmentation hierarchies, and that reflects the tradeoff between over-segmentation and segmentation accuracy.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 In this work we provide such an analysis based on annotations of a large video dataset, where each video is manually segmented by multiple persons. [sent-3, score-0.469]

2 Moreover, we introduce a new volume-based metric that includes the important aspect of temporal consistency, that can deal with segmentation hierarchies, and that reflects the tradeoff between over-segmentation and segmentation accuracy. [sent-4, score-0.731]

3 Introduction. Video segmentation is a fundamental problem with many applications such as action recognition, 3D reconstruction, classification, or video indexing. [sent-6, score-0.486]

4 While there are standard benchmark datasets for still image segmentation, such as the Berkeley segmentation dataset (BSDS) [18], a similar standard is missing for video segmentation. [sent-8, score-0.562]

5 Recent influential works have introduced video datasets that specialize on subproblems in video segmentation, such as motion segmentation [4], occlusion boundaries [22, 23], or video superpixels [28]. [sent-9, score-1.108]

6 This work aims for a dataset with corresponding annotation and an evaluation metric that can generalize over subproblems and help in analyzing the various challenges of video segmentation. [sent-10, score-0.405]

7 In contrast to [23], where only a single frame of each video is segmented by a single person, we extend this segmentation to multiple frames and multiple persons per frame. [sent-12, score-0.57]

8 (Column-wise) Frames from three video sequences of the dataset [23] and 2 of the human annotations which we collected for each. [sent-14, score-0.364]

9 We provide an analysis of state-of-the-art video segmentation algorithms by means of novel metrics leveraging multiple ground truths. [sent-16, score-0.63]

10 We also analyze additional video-specific subproblems, such as motion, non-rigid motion and camera motion. [sent-17, score-0.315]

11 Two properties are required of the evaluation metric: (1) temporally inconsistent segmentations are penalized by the metric; (2) the metric can take the ambiguity of correct segmentations into account. [sent-18, score-0.38]

12 The latter property has been a strong point of the BSDS benchmark on single image segmentation [18]. [sent-19, score-0.35]

13 Some segmentation ambiguities vanish by the use of videos (e. [sent-20, score-0.332]

14 As in [18], we approach the scale issue with multiple human annotations, to measure the natural level of ambiguity, and a precision-recall metric that allows comparing segmentations tuned for different scales. [sent-25, score-0.262]

15 Moreover, the dataset is also supposed to cover the various subproblems of video segmentation. [sent-26, score-0.302]

16 The annotation also enables deeper analysis of typical limitations of current video segmentation methods. [sent-28, score-0.534]

17 Video Segmentation Literature. A large body of literature exists on video segmentation leveraging appearance [2, 27, 13, 29], motion [21, 4, 11], or multiple cues [8, 12, 14, 15, 20, 16, 17, 19, 10, 6, 9]. [sent-31, score-0.593]

18 Recent works on video segmentation exploit the motion history contained in point trajectories [4, 17, 19, 9]. [sent-38, score-0.593]

19 This literature overview, which is far from complete, shows the large diversity of video segmentation approaches. [sent-41, score-0.517]

20 There is a fundamental need for a common dataset and benchmark evaluation metrics that can cover all the various subtasks and allow an analysis highlighting the strengths and limitations of each approach. [sent-42, score-0.273]

21 Video Segmentation Dataset and Annotation. A good video segmentation dataset should consist of a large number of diverse sequences with the diversity spanning across different aspects. [sent-45, score-0.607]

22 Some of those aspects are equivalent to those of a good image segmentation dataset, i. [sent-46, score-0.322]

23 In addition to the image-based diversity, the diversity of video sequences should also include occlusion and different kinds of object and camera motion: translational, scaling and perspective motion. [sent-49, score-0.324]

24 Current video segmentation datasets are limited in the aforementioned aspects: figment [9] only includes equally sized basketball players; CamVid [3] fulfills appearance heterogeneity but only includes 4 sequences, all of them recorded from a driving car. [sent-50, score-0.514]

25 A recent video dataset, introduced in [23] for occlusion boundary detection, fulfills the desired criteria of diversity. [sent-56, score-0.298]

26 While the number of frames per video is limited to a maximum of 121 frames, the video sequences are HD, and the dataset includes 100 videos arranged into 40 train + 60 test sequences. [sent-57, score-0.511]

27 The dataset is also challenging for current video segmentation algorithms as experiments show in Sections 5 and 6. [sent-58, score-0.555]

28 We adopt this dataset and provide the annotation necessary to make it a general video segmentation benchmark. [sent-59, score-0.583]

29 Video annotations should be accurate at object boundaries and most importantly temporally consistent: an object should have the same label in all ground truth frames throughout the video sequence. [sent-61, score-0.423]

30 Benchmark evaluation metrics. We propose to benchmark video segmentation performance with a boundary-oriented metric and with a volumetric one. [sent-71, score-0.789]

31 Boundary precision-recall (BPR). The boundary metric is the most popular in the BSDS benchmark for image segmentation [18, 1]. [sent-75, score-0.505]

32 It casts the boundary detection problem as one of classifying boundary versus non-boundary pixels and measures the quality of a segmentation boundary map in the precision-recall framework:

$$P = \frac{\left|S \cap \left(\bigcup_{i=1}^{M} G_i\right)\right|}{|S|}, \qquad R = \frac{\sum_{i=1}^{M} |S \cap G_i|}{\sum_{i=1}^{M} |G_i|}, \qquad F = \frac{2PR}{P+R} \qquad (3)$$

[sent-76, score-0.552]

33 where $S$ is the set of machine-generated segmentation boundaries and $\{G_i\}_{i=1}^{M}$ are the $M$ sets of human annotation boundaries. [sent-82, score-0.4]
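
A minimal NumPy sketch of the BPR computation described above, under simplifying assumptions: boundary maps are binary arrays and a boundary pixel counts as matched only on exact coincidence (the actual benchmark tolerates small localization offsets when matching boundary pixels, which is omitted here). Function and variable names are illustrative, not from the paper.

```python
import numpy as np

def boundary_pr(machine, humans):
    """Simplified boundary precision-recall (BPR).

    machine: binary (H, W) boundary map produced by the algorithm.
    humans:  list of M binary (H, W) human boundary annotations.
    Exact pixel matching is used; the real benchmark allows a small
    spatial tolerance when matching boundary pixels.
    """
    machine = machine.astype(bool)
    union_gt = np.zeros(machine.shape, dtype=bool)
    for g in humans:
        union_gt |= g.astype(bool)

    # Precision: machine boundary pixels confirmed by any annotator.
    precision = (machine & union_gt).sum() / max(machine.sum(), 1)

    # Recall: human boundary pixels recovered, summed over annotators.
    matched = sum(int((machine & g.astype(bool)).sum()) for g in humans)
    total = sum(int(g.astype(bool).sum()) for g in humans)
    recall = matched / max(total, 1)

    f = 2 * precision * recall / max(precision + recall, 1e-12)
    return precision, recall, f
```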

34 The metric is of limited use in a video segmentation benchmark, as it evaluates every frame independently, i.e. [sent-85, score-0.577]

35 temporal consistency of the segmentation does not play a role. [sent-87, score-0.394]

36 We keep this metric from image segmentation, as it is a good measure for the localization accuracy of segmentation boundaries. [sent-89, score-0.371]

37 Volume precision-recall (VPR). VPR optimally assigns spatio-temporal volumes between the computer-generated segmentation $S$ and the $M$ human-annotated segmentations $\{G_i\}_{i=1}^{M}$ and measures their overlap. [sent-93, score-0.558]

38 Perfect recall is achieved with volumes that fully cover the human volumes. [sent-107, score-0.245]

39 Obviously, degenerate segmentations (one volume covering the whole video or every pixel being a separate volume) achieve relatively high scores with this metric. [sent-109, score-0.409]
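
To make the volumetric metric concrete, here is a hedged sketch of a VPR-style computation: it credits each volume with its single best-overlapping volume on the other side, a greedy stand-in for the optimal assignment the paper describes, so it approximates the benchmark metric rather than reproducing it. Note how it exhibits the degenerate behavior mentioned above: one all-covering volume scores perfect recall, and per-pixel volumes score perfect precision.

```python
import numpy as np

def volume_pr(seg, gts):
    """Sketch of a volume precision-recall (VPR) computation.

    seg: (T, H, W) integer label video from the algorithm.
    gts: list of M (T, H, W) integer human-annotated label videos.
    """
    n = seg.size
    precisions, recalls = [], []
    for gt in gts:
        S = int(seg.max()) + 1
        G = int(gt.max()) + 1
        # Joint histogram: counts[i, j] = voxels with machine label i
        # and ground-truth label j.
        pairs = seg.ravel().astype(np.int64) * G + gt.ravel().astype(np.int64)
        counts = np.bincount(pairs, minlength=S * G).reshape(S, G)
        # Precision: each machine volume credited with its best GT volume.
        precisions.append(counts.max(axis=1).sum() / n)
        # Recall: each GT volume credited with its best machine volume.
        recalls.append(counts.max(axis=0).sum() / n)
    p, r = float(np.mean(precisions)), float(np.mean(recalls))
    f = 2 * p * r / max(p + r, 1e-12)
    return p, r, f
```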

40 For both BPR and VPR we report average precision (AP), the area under the PR curve, and optimal aggregate measures by means of the F-measure: optimal dataset scale (ODS), aggregated at a fixed scale over the dataset, and optimal segmentation scale (OSS), optimally selected for each segmentation. [sent-123, score-0.457]
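
A small sketch of how these aggregates can be computed from per-video precision-recall curves; the array layout and the use of the dataset-average F for ODS are our assumptions about one common variant, not the paper's exact aggregation.

```python
import numpy as np

def aggregate_measures(P, R):
    """ODS / OSS / AP from per-video PR curves.

    P, R: (num_videos, num_scales) precision and recall arrays,
    one row per video, one column per hierarchy scale.
    """
    F = 2 * P * R / np.maximum(P + R, 1e-12)

    # ODS: a single scale fixed over the whole dataset.
    ods = F.mean(axis=0).max()

    # OSS: the best scale is selected per segmentation/video,
    # so OSS is always an upper bound on ODS.
    oss = F.max(axis=1).mean()

    # AP: area under the dataset-average PR curve over recall.
    p, r = P.mean(axis=0), R.mean(axis=0)
    order = np.argsort(r)
    ap = float(np.trapz(p[order], r[order]))
    return float(ods), float(oss), ap
```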

41 More object-centric segmentation methods that tend to yield few larger object volumes will be found in the high-recall area of VPR and the high-precision area of BPR. [sent-138, score-0.556]

42 The one in [4] is restricted to motion segmentation and does not satisfy (iii) and (v). [sent-145, score-0.441]

43 The boundary metric in [1] is designed for still image segmentation and does not satisfy (vii). [sent-147, score-0.489]

44 The region metrics in [1] have been extended to volumes [27, 10] but do not satisfy (i) and (v). [sent-148, score-0.253]

45 We report the optimal dataset scale (ODS) and the optimal segmentation scale (OSS), achieved in terms of F-measure, alongside the average precision (AP), e. [sent-190, score-0.425]

46 (*) indicates evaluated on video frames resized by 0. [sent-194, score-0.249]

47 In particular, BPR plots indicate high precision, which reflects the very strong human capability to localize object boundaries in video material. [sent-202, score-0.307]

48 Selection of methods. We selected a number of recent state-of-the-art video segmentation algorithms based on the availability of public code. [sent-208, score-0.529]

49 Moreover, we aimed to cover a large set of different working regimes: [19] provides a single segmentation result and specifically addresses the estimation of the number of moving objects and their segmentation. [sent-209, score-0.509]

50 Others [7, 13, 10, 29] provide a hierarchy of segmentations and therefore cover multiple working regimes. [sent-210, score-0.24]

51 According to these working regimes, we separately discuss the performance of the methods in the (VPR) high-precision area (corresponding to super-voxelization) and the (VPR) high-recall area (corresponding to object segmentation with a tendency to under-segmentation). [sent-211, score-0.474]

52 [10] provide coarse-to-fine video segmentation and could be additionally employed for the task. [sent-215, score-0.509]

53 [28] defined important properties for the supervoxel methods: supervoxels should respect object boundaries, be aligned with objects without spanning multiple of them (known as leaking), be temporally consistent and parsimonious, i. [sent-216, score-0.236]

54 VPR, like BPR, is also consistent with the multiple human annotators: perfect volume precision is obtained by supervoxels not leaking into any of the multiple GTs. [sent-226, score-0.318]

55 Mean length vs. volume precision and mean number of clusters vs. volume precision curves (Length and NCL, respectively) complement the PR curves. [sent-233, score-0.458]
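
A sketch of how such per-segmentation statistics could be computed, assuming "length" means the temporal extent of a volume in frames (our reading of the metric); all names are illustrative.

```python
import numpy as np

def length_and_ncl(seg):
    """Length and NCL statistics for one segmentation.

    seg: (T, H, W) integer label video. 'Length' is taken to be the
    number of frames in which a volume appears; NCL is the number
    of distinct volumes (clusters).
    """
    labels = np.unique(seg)
    lengths = [int(np.any(seg == l, axis=(1, 2)).sum()) for l in labels]
    return float(np.mean(lengths)), int(labels.size)
```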

56 Object segmentation. Object segmentation stands for algorithms that identify the visual objects in the video sequence while reducing the over-segmentation. [sent-243, score-0.875]

57 Intuitively, important aspects are the parsimonious detection of salient object boundaries and the segmentation of the video sequences into volumes which “explain” (i. [sent-245, score-0.83]

58 Each of the visual objects should be covered by a single volume over the entire video, at the cost of having volumes overlapping multiple objects. [sent-248, score-0.287]

59 An ideal object segmentation method detects the few salient boundaries in the video. [sent-249, score-0.386]

60 The volume property of explaining visual objects while maintaining temporal consistency is benchmarked by VPR in the high-recall regimes. [sent-252, score-0.292]

61 All algorithms may achieve perfect recall by labeling the whole video as one object. [sent-253, score-0.29]

62 Statistics on the average length (Length) and number of clusters (NCL) at the given ∼15% volume precision in Figure 2 complement the object segmentation analysis. [sent-256, score-0.599]

63 Aggregate measures of boundary precision-recall (BPR) and volume precision-recall (VPR) for the VS algorithms on the motion subtasks. [sent-333, score-0.334]

64 (*) indicates evaluated on video frames resized by 0. [sent-334, score-0.249]

65 Image segmentation and a baseline. We have also benchmarked a state-of-the-art image segmentation algorithm [1], alongside an associated oracle performance and a proposed baseline. [sent-341, score-0.771]

66 In the proposed framework, it can be directly compared to video segmentation methods with regard to the boundary BPR metric, but it is heavily penalized on the VPR metric, where temporal consistency across frames is measured. [sent-343, score-0.777]

67 The number of clusters for [1] exceeds that of video segmentation algorithms by two orders of magnitude, as image segments re-initialize at each frame of the video sequences, ∼100 frames long. [sent-347, score-0.755]

68 [1] consistently outperforms all selected video segmentation algorithms on the boundary metric, indicating that current video segmentation methods are not proficient in finding good boundaries. [sent-349, score-1.119]

69 The dashed magenta curve in the VPR illustrates performance of [1], when the per-frame segmentation is bound into volumes over time by an oracle prediction based on the ground truth. [sent-351, score-0.514]

70 The performance on the volume metric reaches levels far beyond state-of-the-art video segmentation performance. [sent-352, score-0.61]

71 The results of [1] (across the hierarchy) at the central frame of the video sequences are propagated to the other frames in the video with optical flow [30] and used to label corresponding image segments (across the hierarchy) by maximum voting. [sent-354, score-0.499]
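
A sketch of this baseline's two ingredients: label propagation along optical flow and per-segment maximum voting. The nearest-neighbor warping and all names are our simplifications, and the optical flow of [30] is assumed to be precomputed.

```python
import numpy as np

def warp_labels(labels, flow_to_labeled):
    """Pull labels from an already-labeled frame into the current one.

    labels:          (H, W) integer labels of the neighboring frame.
    flow_to_labeled: (H, W, 2) flow mapping current-frame pixels to
                     that frame (x displacement in [..., 0], y in [..., 1]).
    Nearest-neighbor lookup; a simplification of proper warping.
    """
    h, w = labels.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.round(ys + flow_to_labeled[..., 1]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + flow_to_labeled[..., 0]).astype(int), 0, w - 1)
    return labels[src_y, src_x]

def vote_segments(frame_seg, propagated):
    """Relabel each per-frame image segment with the majority
    propagated label (the 'maximum voting' step)."""
    out = np.empty_like(frame_seg)
    for s in np.unique(frame_seg):
        mask = frame_seg == s
        out[mask] = np.bincount(propagated[mask]).argmax()
    return out

# Propagating outward from the central frame t0, e.g. forward in time:
#   labels[t] = vote_segments(frame_segs[t],
#                             warp_labels(labels[t - 1], flow[t]))
```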

72 As expected, the working regimes of the simple baseline extend neither to superpixelization nor to segmenting few objects, due to the non-repeatability of image segmentation (cf. [sent-355, score-0.592]

73 Although simple, this baseline outperforms all considered video segmentation algorithms [7, 13, 19, 29, 10] consistently over the mid-recall range (however with a low average length and a large number of clusters, an undesirable quality for a real video segmenter). [sent-358, score-0.808]

74 This analysis suggests that state-of-the-art image segmentation is at a more mature research level than video segmentation. [sent-359, score-0.486]

75 In fact, video segmentation has the potential to benefit from additional important cues, e. [sent-360, score-0.486]

76 Evaluation of Motion Segmentation Tasks. Compared to still images, videos provide additional information that adds to the segmentation into visual objects. [sent-369, score-0.355]

77 object vs. camera motion, translational vs. zooming and rotating, and it affects a number of other video specificities, e. [sent-373, score-0.401]

78 The additional indication of moving objects allows us to … (Figure 5: Motion segmentation, Non-rigid motion). [sent-377, score-0.509]

79 Boundary precision-recall (BPR) and volume precision-recall (VPR) curves for VS algorithms on two motion subtasks. [sent-378, score-0.269]

80 This allows us to compare the performance of segmentation methods on moving objects vs. all objects. [sent-381, score-0.476]

81 Most other video segmentation methods perform worse on the moving objects than on the static ones. [sent-388, score-0.611]

82 The performance of the still image segmentation algorithm of [1] and the proposed baseline is also interesting. [sent-394, score-0.332]

83 While [1] is strongly penalized by VPR for the missing temporal consistency, it outperforms all considered video segmentation algorithms on the boundary metric (Figure 7: Moving camera, BPR, VPR). [sent-395, score-0.765]

84 Boundary precision-recall (BPR) and volume precision-recall (VPR) curves for the selected VS algorithms on the moving camera segmentation subtask. [sent-396, score-0.54]

85 This shows that the field of image segmentation is much more advanced than the field of video segmentation, and that performance on video segmentation can potentially be improved by transferring ideas from image segmentation. [sent-398, score-1.272]

86 For the graphs in Figure 7 and the corresponding statistics in Table 2, we ignored all video sequences where the camera was not undergoing a considerable motion with respect to the depicted static 3D scene (video sequences with jitter were also not included). [sent-401, score-0.534]

87 All algorithms maintained the same performance as on the general benchmark video set (cf. [sent-402, score-0.279]

88 Figure 2 and Table 1), clearly indicating that a moving camera is not an issue for state-of-the-art video segmentation algorithms. [sent-403, score-0.564]

89 Conclusion and future work. In this work, we have addressed two fundamental limitations in the field of video segmentation: the lack of a common dataset with sufficient annotation and the lack of an evaluation metric that is general enough to be employed on a large set of video segmentation subtasks. [sent-405, score-0.9]

90 We showed that the dataset allows for an analysis of the current state-of-the-art in video segmentation, as we could address many working regimes - from over-segmentation to motion segmentation - with the same dataset and metric. [sent-406, score-0.905]

91 This encourages progress on new aspects of the video segmentation problem. [sent-408, score-0.508]

92 This sets an important challenge in video segmentation and will foster progress in the field. [sent-410, score-0.486]

93 Object segmentation by long term analysis of point trajectories. [sent-436, score-0.3]

94 Efficient multilevel brain tumor segmentation with integrated Bayesian model classification. [sent-459, score-0.3]

95 Spatio-temporal segmentation of video by hierarchical mean shift analysis. [sent-463, score-0.509]

96 Video segmentation by tracing discontinuities in a trajectory embedding. [sent-468, score-0.3]

97 Track to the future: Spatio-temporal video segmentation with long-range motion cues. [sent-521, score-0.593]

98 A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. [sent-528, score-0.399]

99 Object segmentation in video: a hierarchical variational approach for turning point trajectories into dense regions. [sent-533, score-0.323]

100 A benchmark for the comparison of 3-D motion segmentation algorithms. [sent-561, score-0.457]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('vpr', 0.609), ('bpr', 0.42), ('segmentation', 0.3), ('regimes', 0.189), ('video', 0.186), ('volumes', 0.141), ('ncl', 0.126), ('motion', 0.107), ('volume', 0.1), ('supervoxel', 0.093), ('boundary', 0.084), ('segmentations', 0.082), ('metrics', 0.078), ('vs', 0.074), ('metric', 0.071), ('working', 0.071), ('penalized', 0.07), ('boundaries', 0.065), ('sequences', 0.064), ('subtasks', 0.063), ('supervoxelization', 0.063), ('precision', 0.061), ('temporal', 0.06), ('moving', 0.056), ('annotators', 0.055), ('subproblems', 0.054), ('annotations', 0.053), ('benchmarked', 0.052), ('leaking', 0.052), ('aggregate', 0.05), ('benchmark', 0.05), ('oracle', 0.049), ('annotation', 0.048), ('objects', 0.046), ('clusters', 0.046), ('bsds', 0.045), ('algorithms', 0.043), ('frames', 0.043), ('galasso', 0.042), ('highrecall', 0.042), ('subtask', 0.042), ('supervoxels', 0.042), ('length', 0.041), ('degenerate', 0.041), ('gi', 0.039), ('alongside', 0.038), ('gsi', 0.037), ('cover', 0.036), ('pr', 0.036), ('layered', 0.036), ('hd', 0.036), ('human', 0.035), ('satisfy', 0.034), ('temporally', 0.034), ('consistency', 0.034), ('recall', 0.033), ('oss', 0.032), ('baseline', 0.032), ('videos', 0.032), ('diversity', 0.031), ('parsimonious', 0.031), ('complement', 0.03), ('regime', 0.03), ('perfect', 0.028), ('fulfills', 0.028), ('granularity', 0.028), ('hierarchy', 0.028), ('ods', 0.026), ('dataset', 0.026), ('im', 0.025), ('streaming', 0.024), ('translational', 0.024), ('spatiotemporal', 0.024), ('undergoing', 0.024), ('superpixels', 0.024), ('levels', 0.024), ('curve', 0.024), ('providing', 0.024), ('hierarchical', 0.023), ('static', 0.023), ('provide', 0.023), ('addressed', 0.023), ('camera', 0.022), ('aspects', 0.022), ('statistics', 0.022), ('graphs', 0.022), ('ambiguity', 0.021), ('animals', 0.021), ('object', 0.021), ('segmented', 0.021), ('arbelaez', 0.021), ('maire', 0.02), ('lack', 0.02), ('area', 0.02), ('resized', 0.02), ('evaluation', 0.02), ('germany', 0.02), ('consistently', 0.02), ('frame', 0.02), ('curves', 0.019)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999964 33 iccv-2013-A Unified Video Segmentation Benchmark: Annotation, Metrics and Analysis

Author: Fabio Galasso, Naveen Shankar Nagaraja, Tatiana Jiménez Cárdenas, Thomas Brox, Bernt Schiele

Abstract: Video segmentation research is currently limited by the lack of a benchmark dataset that covers the large variety of subproblems appearing in video segmentation and that is large enough to avoid overfitting. Consequently, there is little analysis of video segmentation which generalizes across subtasks, and it is not yet clear which and how video segmentation should leverage the information from the still-frames, as previously studied in image segmentation, alongside video specific information, such as temporal volume, motion and occlusion. In this work we provide such an analysis based on annotations of a large video dataset, where each video is manually segmented by multiple persons. Moreover, we introduce a new volume-based metric that includes the important aspect of temporal consistency, that can deal with segmentation hierarchies, and that reflects the tradeoff between over-segmentation and segmentation accuracy.

2 0.15628204 172 iccv-2013-Flattening Supervoxel Hierarchies by the Uniform Entropy Slice

Author: Chenliang Xu, Spencer Whitt, Jason J. Corso

Abstract: Supervoxel hierarchies provide a rich multiscale decomposition of a given video suitable for subsequent processing in video analysis. The hierarchies are typically computed by an unsupervised process that is susceptible to undersegmentation at coarse levels and over-segmentation at fine levels, which make it a challenge to adopt the hierarchies for later use. In this paper, we propose the first method to overcome this limitation and flatten the hierarchy into a single segmentation. Our method, called the uniform entropy slice, seeks a selection of supervoxels that balances the relative level of information in the selected supervoxels based on some post hoc feature criterion such as objectness. For example, with this criterion, in regions nearby objects, our method prefers finer supervoxels to capture the local details, but in regions away from any objects we prefer coarser supervoxels. We formulate the uniform entropy slice as a binary quadratic program and implement four different feature criteria, both unsupervised and supervised, to drive the flattening. Although we apply it only to supervoxel hierarchies in this paper, our method is generally applicable to segmentation tree hierarchies. Our experiments demonstrate both strong qualitative performance and superior quantitative performance to state of the art baselines on benchmark internet videos.

3 0.15496939 160 iccv-2013-Fast Object Segmentation in Unconstrained Video

Author: Anestis Papazoglou, Vittorio Ferrari

Abstract: We present a technique for separating foreground objects from the background in a video. Our method is fast, fully automatic, and makes minimal assumptions about the video. This enables handling essentially unconstrained settings, including rapidly moving background, arbitrary object motion and appearance, and non-rigid deformations and articulations. In experiments on two datasets containing over 1400 video shots, our method outperforms a state-of-the-art background subtraction technique [4] as well as methods based on clustering point tracks [6, 18, 19]. Moreover, it performs comparably to recent video object segmentation methods based on object proposals [14, 16, 27], while being orders of magnitude faster.

4 0.14323251 76 iccv-2013-Coarse-to-Fine Semantic Video Segmentation Using Supervoxel Trees

Author: Aastha Jain, Shuanak Chatterjee, René Vidal

Abstract: We propose an exact, general and efficient coarse-to-fine energy minimization strategy for semantic video segmentation. Our strategy is based on a hierarchical abstraction of the supervoxel graph that allows us to minimize an energy defined at the finest level of the hierarchy by minimizing a series of simpler energies defined over coarser graphs. The strategy is exact, i.e., it produces the same solution as minimizing over the finest graph. It is general, i.e., it can be used to minimize any energy function (e.g., unary, pairwise, and higher-order terms) with any existing energy minimization algorithm (e.g., graph cuts and belief propagation). It also gives significant speedups in inference for several datasets with varying degrees of spatio-temporal continuity. We also discuss the strengths and weaknesses of our strategy relative to existing hierarchical approaches, and the kinds of image and video data that provide the best speedups.

5 0.13755208 297 iccv-2013-Online Motion Segmentation Using Dynamic Label Propagation

Author: Ali Elqursh, Ahmed Elgammal

Abstract: The vast majority of work on motion segmentation adopts the affine camera model due to its simplicity. Under the affine model, the motion segmentation problem becomes that of subspace separation. Due to this assumption, such methods are mainly offline and exhibit poor performance when the assumption is not satisfied. This is made evident in state-of-the-art methods that relax this assumption by using piecewise affine spaces and spectral clustering techniques to achieve better results. In this paper, we formulate the problem of motion segmentation as that of manifold separation. We then show how label propagation can be used in an online framework to achieve manifold separation. The performance of our framework is evaluated on a benchmark dataset and achieves competitive performance while being online.

6 0.12424268 282 iccv-2013-Multi-view Object Segmentation in Space and Time

7 0.12315188 78 iccv-2013-Coherent Motion Segmentation in Moving Camera Videos Using Optical Flow Orientations

8 0.12306597 225 iccv-2013-Joint Segmentation and Pose Tracking of Human in Natural Videos

9 0.11805715 379 iccv-2013-Semantic Segmentation without Annotating Segments

10 0.11523504 414 iccv-2013-Temporally Consistent Superpixels

11 0.11423044 432 iccv-2013-Uncertainty-Driven Efficiently-Sampled Sparse Graphical Models for Concurrent Tumor Segmentation and Atlas Registration

12 0.1121283 442 iccv-2013-Video Segmentation by Tracking Many Figure-Ground Segments

13 0.10515301 314 iccv-2013-Perspective Motion Segmentation via Collaborative Clustering

14 0.10044613 318 iccv-2013-PixelTrack: A Fast Adaptive Algorithm for Tracking Non-rigid Objects

15 0.095237762 208 iccv-2013-Image Co-segmentation via Consistent Functional Maps

16 0.094207272 317 iccv-2013-Piecewise Rigid Scene Flow

17 0.094140574 361 iccv-2013-Robust Trajectory Clustering for Motion Segmentation

18 0.090071246 322 iccv-2013-Pose Estimation and Segmentation of People in 3D Movies

19 0.089898035 150 iccv-2013-Exemplar Cut

20 0.089670837 447 iccv-2013-Volumetric Semantic Segmentation Using Pyramid Context Features


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.187), (1, -0.033), (2, 0.076), (3, 0.089), (4, 0.041), (5, 0.076), (6, -0.067), (7, 0.098), (8, 0.056), (9, -0.043), (10, 0.009), (11, 0.101), (12, 0.03), (13, -0.006), (14, -0.088), (15, 0.017), (16, -0.058), (17, -0.044), (18, -0.093), (19, -0.037), (20, 0.03), (21, -0.067), (22, -0.028), (23, 0.039), (24, -0.095), (25, 0.01), (26, -0.002), (27, 0.028), (28, -0.012), (29, 0.004), (30, 0.025), (31, -0.074), (32, 0.01), (33, -0.043), (34, 0.029), (35, -0.066), (36, 0.038), (37, 0.006), (38, -0.006), (39, 0.071), (40, 0.06), (41, 0.073), (42, -0.051), (43, -0.05), (44, 0.095), (45, 0.047), (46, 0.004), (47, 0.011), (48, -0.038), (49, 0.003)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95277095 33 iccv-2013-A Unified Video Segmentation Benchmark: Annotation, Metrics and Analysis

Author: Fabio Galasso, Naveen Shankar Nagaraja, Tatiana Jiménez Cárdenas, Thomas Brox, Bernt Schiele

Abstract: Video segmentation research is currently limited by the lack of a benchmark dataset that covers the large variety of subproblems appearing in video segmentation and that is large enough to avoid overfitting. Consequently, there is little analysis of video segmentation which generalizes across subtasks, and it is not yet clear which and how video segmentation should leverage the information from the still-frames, as previously studied in image segmentation, alongside video specific information, such as temporal volume, motion and occlusion. In this work we provide such an analysis based on annotations of a large video dataset, where each video is manually segmented by multiple persons. Moreover, we introduce a new volume-based metric that includes the important aspect of temporal consistency, that can deal with segmentation hierarchies, and that reflects the tradeoff between over-segmentation and segmentation accuracy.

2 0.79923987 172 iccv-2013-Flattening Supervoxel Hierarchies by the Uniform Entropy Slice

Author: Chenliang Xu, Spencer Whitt, Jason J. Corso

Abstract: Supervoxel hierarchies provide a rich multiscale decomposition of a given video suitable for subsequent processing in video analysis. The hierarchies are typically computed by an unsupervised process that is susceptible to undersegmentation at coarse levels and over-segmentation at fine levels, which make it a challenge to adopt the hierarchies for later use. In this paper, we propose the first method to overcome this limitation and flatten the hierarchy into a single segmentation. Our method, called the uniform entropy slice, seeks a selection of supervoxels that balances the relative level of information in the selected supervoxels based on some post hoc feature criterion such as objectness. For example, with this criterion, in regions nearby objects, our method prefers finer supervoxels to capture the local details, but in regions away from any objects we prefer coarser supervoxels. We formulate the uniform entropy slice as a binary quadratic program and implement four different feature criteria, both unsupervised and supervised, to drive the flattening. Although we apply it only to supervoxel hierarchies in this paper, our method is generally applicable to segmentation tree hierarchies. Our experiments demonstrate both strong qualitative performance and superior quantitative performance to state of the art baselines on benchmark internet videos.

3 0.76207703 76 iccv-2013-Coarse-to-Fine Semantic Video Segmentation Using Supervoxel Trees

Author: Aastha Jain, Shuanak Chatterjee, René Vidal

Abstract: We propose an exact, general and efficient coarse-to-fine energy minimization strategy for semantic video segmentation. Our strategy is based on a hierarchical abstraction of the supervoxel graph that allows us to minimize an energy defined at the finest level of the hierarchy by minimizing a series of simpler energies defined over coarser graphs. The strategy is exact, i.e., it produces the same solution as minimizing over the finest graph. It is general, i.e., it can be used to minimize any energy function (e.g., unary, pairwise, and higher-order terms) with any existing energy minimization algorithm (e.g., graph cuts and belief propagation). It also gives significant speedups in inference for several datasets with varying degrees of spatio-temporal continuity. We also discuss the strengths and weaknesses of our strategy relative to existing hierarchical approaches, and the kinds of image and video data that provide the best speedups.

4 0.71291709 414 iccv-2013-Temporally Consistent Superpixels

Author: Matthias Reso, Jörn Jachalsky, Bodo Rosenhahn, Jörn Ostermann

Abstract: Superpixel algorithms represent a very useful and increasingly popular preprocessing step for a wide range of computer vision applications, as they offer the potential to boost efficiency and effectiveness. In this regard, this paper presents a highly competitive approach for temporally consistent superpixels for video content. The approach is based on energy-minimizing clustering utilizing a novel hybrid clustering strategy for a multi-dimensional feature space working in a global color subspace and local spatial subspaces. Moreover, a new contour evolution based strategy is introduced to ensure spatial coherency of the generated superpixels. For a thorough evaluation the proposed approach is compared to state-of-the-art supervoxel algorithms using established benchmarks and shows a superior performance.

5 0.71107054 160 iccv-2013-Fast Object Segmentation in Unconstrained Video

Author: Anestis Papazoglou, Vittorio Ferrari

Abstract: We present a technique for separating foreground objects from the background in a video. Our method is fast, fully automatic, and makes minimal assumptions about the video. This enables handling essentially unconstrained settings, including rapidly moving background, arbitrary object motion and appearance, and non-rigid deformations and articulations. In experiments on two datasets containing over 1400 video shots, our method outperforms a state-of-the-art background subtraction technique [4] as well as methods based on clustering point tracks [6, 18, 19]. Moreover, it performs comparably to recent video object segmentation methods based on object proposals [14, 16, 27], while being orders of magnitude faster.

6 0.68784523 282 iccv-2013-Multi-view Object Segmentation in Space and Time

7 0.66668242 299 iccv-2013-Online Video SEEDS for Temporal Window Objectness

8 0.65271032 432 iccv-2013-Uncertainty-Driven Efficiently-Sampled Sparse Graphical Models for Concurrent Tumor Segmentation and Atlas Registration

9 0.63484335 150 iccv-2013-Exemplar Cut

10 0.63297653 63 iccv-2013-Bounded Labeling Function for Global Segmentation of Multi-part Objects with Geometric Constraints

11 0.61526561 186 iccv-2013-GrabCut in One Cut

12 0.60574567 78 iccv-2013-Coherent Motion Segmentation in Moving Camera Videos Using Optical Flow Orientations

13 0.59279561 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation

14 0.59244668 297 iccv-2013-Online Motion Segmentation Using Dynamic Label Propagation

15 0.59055966 379 iccv-2013-Semantic Segmentation without Annotating Segments

16 0.58564413 442 iccv-2013-Video Segmentation by Tracking Many Figure-Ground Segments

17 0.57805401 145 iccv-2013-Estimating the Material Properties of Fabric from Video

18 0.57654947 329 iccv-2013-Progressive Multigrid Eigensolvers for Multiscale Spectral Segmentation

19 0.57469797 330 iccv-2013-Proportion Priors for Image Sequence Segmentation

20 0.56598973 22 iccv-2013-A New Adaptive Segmental Matching Measure for Human Activity Recognition


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(2, 0.062), (7, 0.014), (26, 0.1), (31, 0.036), (34, 0.011), (40, 0.011), (42, 0.078), (64, 0.059), (73, 0.023), (86, 0.018), (89, 0.168), (97, 0.011), (98, 0.298)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.83534843 435 iccv-2013-Unsupervised Domain Adaptation by Domain Invariant Projection

Author: Mahsa Baktashmotlagh, Mehrtash T. Harandi, Brian C. Lovell, Mathieu Salzmann

Abstract: Domain-invariant representations are key to addressing the domain shift problem where the training and test examples follow different distributions. Existing techniques that have attempted to match the distributions of the source and target domains typically compare these distributions in the original feature space. This space, however, may not be directly suitable for such a comparison, since some of the features may have been distorted by the domain shift, or may be domain specific. In this paper, we introduce a Domain Invariant Projection approach: An unsupervised domain adaptation method that overcomes this issue by extracting the information that is invariant across the source and target domains. More specifically, we learn a projection of the data to a low-dimensional latent space where the distance between the empirical distributions of the source and target examples is minimized. We demonstrate the effectiveness of our approach on the task of visual object recognition and show that it outperforms state-of-the-art methods on a standard domain adaptation benchmark dataset.

2 0.82524061 431 iccv-2013-Unbiased Metric Learning: On the Utilization of Multiple Datasets and Web Images for Softening Bias

Author: Chen Fang, Ye Xu, Daniel N. Rockmore

Abstract: Many standard computer vision datasets exhibit biases due to a variety of sources including illumination condition, imaging system, and preference of dataset collectors. Biases like these can have downstream effects in the use of vision datasets in the construction of generalizable techniques, especially for the goal of the creation of a classification system capable of generalizing to unseen and novel datasets. In this work we propose Unbiased Metric Learning (UML), a metric learning approach, to achieve this goal. UML operates in the following two steps: (1) By varying hyperparameters, it learns a set of less biased candidate distance metrics on training examples from multiple biased datasets. The key idea is to learn a neighborhood for each example, which consists of not only examples of the same category from the same dataset, but those from other datasets. The learning framework is based on structural SVM. (2) We do model validation on a set of weakly-labeled web images retrieved by issuing class labels as keywords to a search engine. The metric with the best validation performance is selected. Although the web images sometimes have noisy labels, they often tend to be less biased, which makes them suitable for the validation set in our task. Cross-dataset image classification experiments are carried out. Results show significant performance improvement on four well-known computer vision datasets.

3 0.82221955 271 iccv-2013-Modeling the Calibration Pipeline of the Lytro Camera for High Quality Light-Field Image Reconstruction

Author: Donghyeon Cho, Minhaeng Lee, Sunyeong Kim, Yu-Wing Tai

Abstract: Light-field imaging systems have received much attention recently as the next generation camera model. A light-field imaging system consists of three parts: data acquisition, manipulation, and application. Given an acquisition system, it is important to understand how a light-field camera converts from its raw image to its resulting refocused image. In this paper, using the Lytro camera as an example, we describe step-by-step procedures to calibrate a raw light-field image. In particular, we are interested in knowing the spatial and angular coordinates of the micro lens array and the resampling process for image reconstruction. Since Lytro uses a hexagonal arrangement of a micro lens image, additional treatments in calibration are required. After calibration, we analyze and compare the performances of several resampling methods for image reconstruction with and without calibration. Finally, a learning based interpolation method is proposed which demonstrates a higher quality image reconstruction than previous interpolation methods including a method used in Lytro software.

4 0.80769646 434 iccv-2013-Unifying Nuclear Norm and Bilinear Factorization Approaches for Low-Rank Matrix Decomposition

Author: Ricardo Cabral, Fernando De_La_Torre, João P. Costeira, Alexandre Bernardino

Abstract: Low rank models have been widely used for the representation of shape, appearance or motion in computer vision problems. Traditional approaches to fit low rank models make use of an explicit bilinear factorization. These approaches benefit from fast numerical methods for optimization and easy kernelization. However, they suffer from serious local minima problems depending on the loss function and the amount/type of missing data. Recently, these lowrank models have alternatively been formulated as convex problems using the nuclear norm regularizer; unlike factorization methods, their numerical solvers are slow and it is unclear how to kernelize them or to impose a rank a priori. This paper proposes a unified approach to bilinear factorization and nuclear norm regularization, that inherits the benefits of both. We analyze the conditions under which these approaches are equivalent. Moreover, based on this analysis, we propose a new optimization algorithm and a “rank continuation ” strategy that outperform state-of-theart approaches for Robust PCA, Structure from Motion and Photometric Stereo with outliers and missing data.

5 0.80697668 19 iccv-2013-A Learning-Based Approach to Reduce JPEG Artifacts in Image Matting

Author: Inchang Choi, Sunyeong Kim, Michael S. Brown, Yu-Wing Tai

Abstract: Single image matting techniques assume high-quality input images. The vast majority of images on the web and in personal photo collections are encoded using JPEG compression. JPEG images exhibit quantization artifacts that adversely affect the performance of matting algorithms. To address this situation, we propose a learning-based post-processing method to improve the alpha mattes extracted from JPEG images. Our approach learns a set of sparse dictionaries from training examples that are used to transfer details from high-quality alpha mattes to alpha mattes corrupted by JPEG compression. Three different dictionaries are defined to accommodate different object structure (long hair, short hair, and sharp boundaries). A back-projection criterion combined within an MRF framework is used to automatically select the best dictionary to apply on the object’s local boundary. We demonstrate that our method can produce superior results over existing state-of-the-art matting algorithms on a variety of inputs and compression levels.

6 0.80011678 402 iccv-2013-Street View Motion-from-Structure-from-Motion

same-paper 7 0.78515685 33 iccv-2013-A Unified Video Segmentation Benchmark: Annotation, Metrics and Analysis

8 0.70012093 181 iccv-2013-Frustratingly Easy NBNN Domain Adaptation

9 0.69793999 123 iccv-2013-Domain Adaptive Classification

10 0.68999088 427 iccv-2013-Transfer Feature Learning with Joint Distribution Adaptation

11 0.67032212 438 iccv-2013-Unsupervised Visual Domain Adaptation Using Subspace Alignment

12 0.66445124 1 iccv-2013-3DNN: Viewpoint Invariant 3D Geometry Matching for Scene Understanding

13 0.64610487 222 iccv-2013-Joint Learning of Discriminative Prototypes and Large Margin Nearest Neighbor Classifiers

14 0.63813531 396 iccv-2013-Space-Time Robust Representation for Action Recognition

15 0.63587284 423 iccv-2013-Towards Motion Aware Light Field Video for Dynamic Scenes

16 0.63261729 172 iccv-2013-Flattening Supervoxel Hierarchies by the Uniform Entropy Slice

17 0.63232052 183 iccv-2013-Geometric Registration Based on Distortion Estimation

18 0.63119566 196 iccv-2013-Hierarchical Data-Driven Descent for Efficient Optimal Deformation Estimation

19 0.63098979 44 iccv-2013-Adapting Classification Cascades to New Domains

20 0.63039416 43 iccv-2013-Active Visual Recognition with Expertise Estimation in Crowdsourcing