iccv iccv2013 iccv2013-299 knowledge-graph by maker-knowledge-mining

299 iccv-2013-Online Video SEEDS for Temporal Window Objectness


Source: pdf

Author: Michael Van_Den_Bergh, Gemma Roig, Xavier Boix, Santiago Manen, Luc Van_Gool

Abstract: Superpixel and objectness algorithms are broadly used as a pre-processing step to generate support regions and to speed-up further computations. Recently, many algorithms have been extended to video in order to exploit the temporal consistency between frames. However, most methods are computationally too expensive for real-time applications. We introduce an online, real-time video superpixel algorithm based on the recently proposed SEEDS superpixels. A new capability is incorporated which delivers multiple diverse samples (hypotheses) of superpixels in the same image or video sequence. The multiple samples are shown to provide a strong cue to efficiently measure the objectness of image windows, and we introduce the novel concept of objectness in temporal windows. Experiments show that the video superpixels achieve comparable performance to state-of-the-art offline methods while running at 30 fps on a single 2.8 GHz i7 CPU. State-of-the-art performance on objectness is also demonstrated, yet orders of magnitude faster and extended to temporal windows in video.

Reference: text


Summary: the most important sentences generated by the tf-idf model

sentIndex sentText sentNum sentScore

1 Superpixel and objectness algorithms are broadly used as a pre-processing step to generate support regions and to speed-up further computations. [sent-4, score-0.517]

2 We introduce an online, real-time video superpixel algorithm based on the recently proposed SEEDS superpixels. [sent-7, score-0.496]

3 A new capability is incorporated which delivers multiple diverse samples (hypotheses) of superpixels in the same image or video sequence. [sent-8, score-0.549]

4 The multiple samples are shown to provide a strong cue to efficiently measure the objectness of image windows, and we introduce the novel concept of objectness in temporal windows. [sent-9, score-1.227]

5 Experiments show that the video superpixels achieve comparable performance to state-of-the-art offline methods while running at 30 fps on a single 2.8 GHz i7 CPU. [sent-10, score-0.524]

6 State-of-the-art performance on objectness is also demonstrated, yet orders of magnitude faster and extended to temporal windows in video. [sent-12, score-0.798]

7 Introduction: Many algorithms use superpixels or objectness scores to efficiently select areas to analyze further. [sent-14, score-0.887]

8 In terms of its still-image counterparts, it comes closest to the recently introduced SEEDS superpixels [15]. [sent-21, score-0.413]

9 Similar to SEEDS, we define an objective function that prefers video superpixels to have a homogeneous color, and our video superpixels can be extracted efficiently. [sent-22, score-0.976]

10 When starting off the partition of a new video frame, we exploit the hierarchical superpixel organization of the previous frame, the coarser levels of which serve as initialization. [sent-27, score-0.592]

11 Moreover, we propose a method to extract multiple superpixel partitions with a value of the objective function close to that of the optimum. [sent-28, score-0.436]

12 This allows us to introduce a new and highly efficient objectness measure, together with its natural extension to videos (a tube of bounding boxes spanning a time interval). [sent-30, score-0.764]

13 We experimentally validate the video superpixel and objectness algorithms, using standard benchmarks where possible. [sent-33, score-1.013]

14 Related Work: In this section, we review previous work related to superpixels and objectness in videos, the two tasks tackled in this paper. [sent-36, score-0.887]

15 Thus, our approach can be seen to add a third strand to video superpixel extraction, namely one that moves the boundaries in an initial superpixel partition. [sent-45, score-0.908]

16 [16, 17] proposed a benchmark to evaluate video superpixels and a framework for streaming video segmentation using the graph-based superpixel approach of [5]. [sent-47, score-1.027]

17 The objectness measure was introduced by Alexe et al. [sent-52, score-0.563]

18 To the best of our knowledge, objectness throughout video shots has not been introduced before. [sent-54, score-0.688]

19 It should not be confused with the recently introduced dynamic objectness [13], which extracts objectness within a frame by including instantaneous motion. [sent-55, score-1.151]

20 SEEDS for stills: Let s represent the superpixel partition of an image, such that s : {1, . . . , N} → {1, . . . , K} assigns each of the N pixels to one of the K superpixels. [sent-62, score-0.483]

21 The SEEDS approach [15] for extracting superpixels in stills serves as the starting point for our video extension. [sent-74, score-0.546]

22 SEEDS extracts superpixels by maximizing an objective function that enforces the color histogram of each superpixel to be concentrated in a single bin. [sent-78, score-0.74]
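
A minimal sketch may help make this objective concrete. The code below assumes the concentration term is the sum of squared normalized color-bin counts, one common instantiation; the paper's exact energy, including its temporal histogram terms, is not reproduced here, and the function name is illustrative.

```python
import numpy as np

def partition_energy(labels, color_bins, n_bins=16):
    """Histogram-concentration energy of a superpixel partition.

    labels     : (H, W) int array, superpixel id for every pixel
    color_bins : (H, W) int array, quantized color bin per pixel
    The sum of squared normalized bin counts is maximal when the
    colors of each superpixel fall into a single bin.
    """
    energy = 0.0
    for k in np.unique(labels):
        hist = np.bincount(color_bins[labels == k], minlength=n_bins)
        p = hist / hist.sum()            # normalized color histogram
        energy += float(np.sum(p ** 2))  # concentration term
    return energy
```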

23 SEEDS for videos: Our video approach propagates superpixels over multiple frames to build 3D spatio-temporal constructs. [sent-83, score-0.559]

24 As time goes on, new video superpixels can appear and others may terminate. [sent-84, score-0.488]

25 In the literature, this is controlled by constraining the number of superpixel tubes in the sequence. [sent-85, score-0.45]

26 In order to fulfill both constraints, the termination of a superpixel implies the creation of a new one in the same frame. [sent-89, score-0.531]

27 These are the partitions for which the superpixels are contiguous blobs in all frames and which exhibit the correct superpixels-per-frame and superpixel-rate behavior. [sent-92, score-0.487]

28 Let Atk denote the set of pixels that belong to superpixel k at frame t. [sent-95, score-0.52]

29 To indicate all pixels of the video superpixel up to frame t, we use Atk:0. [sent-96, score-0.638]

30 It maximizes the energy by exchanging pixels between superpixels at their boundaries. [sent-115, score-0.522]

31 Both the pixel exchange between superpixels and their temporal propagation are regulated through blocks of pixels. [sent-120, score-0.722]

32 The block size at the second layer (2×2 or 3×3) and the number of layers are chosen such that the image subdivision at the highest layer approximately yields the prescribed number of superpixels per frame. [sent-125, score-0.476]
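
As a rough illustration of how such a hierarchy can be dimensioned, the sketch below doubles the block side at each layer and stops at the grid closest to the prescribed superpixels per frame; the function name and the selection rule are assumptions for illustration, not the paper's procedure.

```python
def choose_hierarchy(height, width, n_superpixels, base_side=2):
    """Pick the number of block layers so that the top-layer grid of
    blocks approximately matches the prescribed superpixels per frame;
    the block side doubles at every layer above the base (2x2 or 3x3)."""
    side, layers = base_side, 1
    cells = (height // side) * (width // side)
    best = (abs(cells - n_superpixels), layers, side)
    while side * 2 <= min(height, width):
        side *= 2
        layers += 1
        cells = (height // side) * (width // side)
        best = min(best, (abs(cells - n_superpixels), layers, side))
        if cells < n_superpixels:
            break
    return best[1], best[2]  # (number of layers, top-layer block side)

# e.g. choose_hierarchy(480, 640, 400) -> (5, 32): a 15 x 20 top grid
```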

33 Multiple pixel block exchanges between superpixels are considered, one after the other. [sent-137, score-0.533]

34 The exchanged pixel blocks are adjacent to the superpixel boundaries. [sent-139, score-0.573]

35 Let Bnt be a block of pixels of the current frame that belongs to superpixel n, i.e. Bnt ⊂ Atn. [sent-145, score-0.587]

36 To check whether exchanging a block Bnt from superpixel n to m increases the objective function, we can use one histogram intersection computation, rather than evaluating the complete energy function. [sent-150, score-0.467]

37 Thus, if the intersection of Bnt with Atm:0 is higher than the intersection with the superpixel it currently belongs to, the exchange is accepted; otherwise it is discarded. [sent-153, score-0.436]
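
A hedged sketch of this test: assuming unnormalized color histograms and the standard histogram intersection Int(h1, h2) = Σ min(h1, h2), the acceptance rule reduces to one comparison per candidate block (function names are illustrative).

```python
import numpy as np

def hist_intersection(h1, h2):
    """Histogram intersection: larger values mean more similar."""
    return float(np.minimum(h1, h2).sum())

def accept_block_exchange(h_block, h_current, h_candidate):
    """Accept moving a boundary block from its current video superpixel
    (histogram h_current, i.e. of Atn:0) to the neighboring candidate
    (h_candidate, i.e. of Atm:0) when the block's histogram intersects
    the candidate more strongly, as stated in the text above."""
    return (hist_intersection(h_block, h_candidate)
            > hist_intersection(h_block, h_current))
```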

38 The first assumption is that video superpixels are of similar size and that the blocks are much smaller than the video superpixels. [sent-161, score-0.623]

39 This holds most of the time, since superpixels indeed tend to be of the same size, and the blocks are defined to be at most one fourth of a superpixel in a frame, and hence, are much smaller than superpixels extending over multiple frames in the video. [sent-162, score-1.288]

40 According to the superpixel rate, some frames are selected to terminate and create superpixels. [sent-166, score-0.463]

41 They also allow evaluating which terminations and creations of superpixels yield higher energy, using efficient intersection distances. [sent-170, score-0.585]

42 In Fig. 5 there is an illustration of the creation and termination of superpixels with the notation used. [sent-172, score-0.496]

43 When a superpixel is terminated, its pixels at frame t are incorporated into a neighboring superpixel. [sent-173, score-0.52]

44 (3) We terminate the superpixel with the highest intersection with its neighbor among all superpixels in the frame. [sent-179, score-0.837]
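
A sketch of this termination rule under the same histogram-intersection similarity; the data layout (dicts of histograms and adjacency) is an assumption made for illustration.

```python
import numpy as np

def pick_superpixel_to_terminate(histograms, neighbors):
    """Return (victim, absorber): the superpixel in the current frame
    whose color histogram has the highest intersection with one of its
    neighbors, and that neighbor, which absorbs its frame-t pixels.

    histograms : dict id -> 1-D histogram array
    neighbors  : dict id -> iterable of adjacent superpixel ids
    """
    best_score, victim, absorber = -1.0, None, None
    for k, h in histograms.items():
        for m in neighbors.get(k, ()):
            score = float(np.minimum(h, histograms[m]).sum())
            if score > best_score:
                best_score, victim, absorber = score, k, m
    return victim, absorber
```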

45 When a superpixel is terminated, a new one should be created to fulfill the constraint on the number of superpixels per frame (Sec. [sent-192, score-0.522]

46 The candidates to form a new superpixel are blocks of pixels that belong to an existing video superpixel. [sent-195, score-0.678]

47 Let Bnt ⊂ Atn:0 and Bmt ⊂ Atm:0 be blocks of superpixels, candidates to create a new superpixel. [sent-196, score-0.505]

48 In principle, the algorithm can run for an infinitely long video, since it generates the partition online, and in memory we only need the histograms of the video superpixels that propagate to the current frame. [sent-212, score-0.556]

49 In the first frame of the video, the superpixels are initialized along a grid using the hierarchy of blocks. [sent-214, score-0.538]

50 In this way, the superpixel structure can be propagated from the previous frame while discarding small details. [sent-218, score-0.502]

51 Randomized SEEDS: Some superpixel methods offer extra capabilities, such as the extraction of a hierarchy of superpixels [17]. [sent-222, score-0.821]

52 In the next section we exploit it to design an objectness measure of temporal windows, though we expect that applications may not be limited to that one. [sent-224, score-0.67]

53 In Fig. 6, we give an example of different partitions with the same number of superpixels and similar energy values, whose solutions have very similar accuracy according to the superpixel benchmarks. [sent-228, score-0.486]

54 This shows that we can extract multiple samples of superpixel partitions from the same video, all of them of comparable quality. [sent-229, score-0.476]

55 Fig. 6 shows that when superimposing a diverse set of superpixel samples obtained with randomized SEEDS, the boundaries of the objects are preserved, while the boundaries due to over-segmentation fade away. [sent-249, score-0.59]
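
One plausible way to draw such diverse samples is sketched below: the acceptance test is perturbed with small noise so that repeated runs settle on different near-optimal partitions. This is an assumption for illustration; the paper's actual randomization (added in its equation) may take a different form, and sigma is arbitrary.

```python
import numpy as np

def randomized_accept(h_block, h_current, h_candidate, sigma=0.05,
                      rng=np.random.default_rng()):
    """Noisy exchange test used to sample diverse partitions: the gain
    in histogram intersection is perturbed by Gaussian noise, so two
    runs can accept different exchanges of similar quality."""
    gain = (np.minimum(h_block, h_candidate).sum()
            - np.minimum(h_block, h_current).sum())
    return gain + rng.normal(0.0, sigma) > 0.0
```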

56 In the following, we first define the objectness measure in a still image, and then we show how to extend it to temporal windows (tubes of bounding boxes). [sent-251, score-0.889]

57 Objectness Measure for Still Images: the objectness score is computed as the sum of the distances to the agreed superpixel boundaries, defined below. [sent-252, score-0.569]

58 We use O to represent the intersection of several superpixel samples of randomized SEEDS. [sent-253, score-0.561]

59 O(i) takes value 1 if all samples have a superpixel boundary at pixel i, and 0 otherwise. [sent-254, score-0.485]

60 Thus, O is an image that indicates in which pixels the samples of randomized SEEDS agree that there is a superpixel boundary. [sent-255, score-0.569]

61 We define the objectness score for a still image using O. [sent-256, score-0.59]

62 Let X be the set of pixels inside the bounding box, Per(X) the set of pixels on the perimeter of the bounding box, and RC(p) the pixels that are inside the bounding box and in the same row or column as pixel p. [sent-263, score-0.549]
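
The sketch below shows how O can be computed and one plausible reading of the score: for every perimeter pixel, the distance along its row or column to the nearest agreed boundary inside the box. The paper's exact equation, including its sign and normalization, is not reproduced here, so this is an interpretation of the Per(X)/RC(p) notation rather than a definitive implementation.

```python
import numpy as np

def agreement_map(boundary_maps):
    """O(i) = 1 where ALL randomized-SEEDS samples place a superpixel
    boundary at pixel i (pixelwise AND over the samples)."""
    O = np.ones_like(boundary_maps[0], dtype=bool)
    for b in boundary_maps:
        O &= b.astype(bool)
    return O

def box_score(O, x0, y0, x1, y1):
    """For each pixel on the box perimeter, add the distance to the
    nearest agreed boundary inside the box along the same row/column
    (Per(X) and RC(p) in the notation above); sketch only."""
    total = 0.0
    h, w = y1 - y0, x1 - x0
    for x in range(x0, x1 + 1):                    # top and bottom edges
        ys = np.flatnonzero(O[y0:y1 + 1, x])
        total += (ys.min() if ys.size else h)      # down from the top
        total += (h - ys.max() if ys.size else h)  # up from the bottom
    for y in range(y0, y1 + 1):                    # left and right edges
        xs = np.flatnonzero(O[y, x0:x1 + 1])
        total += (xs.min() if xs.size else w)
        total += (w - xs.max() if xs.size else w)
    return total
```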

63 To the best of our knowledge, no earlier work has used multiple superpixel hypotheses to build an objectness score. [sent-267, score-0.915]

64 Comparison of our online video superpixel method to the state-of-the-art (s-o-a). [sent-270, score-0.551]

65 The temporal windows in shots allow incorporating features and classifiers that exploit spatio-temporal regions, and can easily be integrated into any video application that uses bounding boxes. [sent-278, score-0.476]

66 The aim of video objectness is to reduce the dense set of temporal windows to the 100-1000 most likely to contain an object. [sent-282, score-0.865]

67 The video objectness score is proposed as a volumetric extension of Eq. [sent-283, score-0.687]

68 In the first frame, all possible bounding boxes are extracted densely and ranked based on the objectness score for still images. [sent-285, score-0.744]

69 In the subsequent frames, each bounding box is propagated in time via the video superpixels that were completely inside the bounding box in the first frame. [sent-286, score-0.789]

70 The score is updated online as each new frame is added until the shot is finished, and accordingly, the ranking of the temporal windows is updated online as well. [sent-287, score-0.535]
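
A sketch of this online propagation and score update; box_score is the still-image score sketched earlier, and the bounding-rectangle propagation rule is an assumption consistent with the text rather than the paper's exact procedure.

```python
import numpy as np

def propagate_box(tracked_ids, labels_t):
    """New position of a temporal window: the bounding rectangle of the
    video superpixels (tracked_ids) that were completely inside the box
    in the first frame, in the current frame's label image labels_t."""
    ys, xs = np.nonzero(np.isin(labels_t, list(tracked_ids)))
    if ys.size == 0:
        return None                     # all tracked superpixels ended
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

def update_tube_score(score, O_t, box):
    """Online update: add the still-image score of the propagated box
    for the newly arrived frame (box_score as sketched above)."""
    x0, y0, x1, y1 = box
    return score + box_score(O_t, x0, y0, x1, y1)
```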

71 Experiments: In this section we report an experimental evaluation of the introduced online video superpixel method. [sent-289, score-0.602]

72 Evaluation of Online Video SEEDS: We report results of the online video superpixels on the Chen Xiph.org dataset. [sent-300, score-0.572]

73 To achieve the desired number of temporal superpixels, we select the number of superpixels per frame from a range between 200 and 600, and the superpixel rate from a range between 0 and 6. [sent-306, score-1.03]

74 This results in a total number of video superpixels between 200 and 1086. [sent-307, score-0.488]

75 Evolution of superpixel metrics as a function of the amount of randomization introduced in Eq. [sent-329, score-0.507]

76 Evaluation of Randomized SEEDS: We evaluate the accuracy of the randomized superpixel samples by analyzing the effect of different levels of randomization added in Eq. [sent-334, score-0.567]

77 Evaluation of Video Objectness: We report results of the video objectness measure on temporal windows to showcase the advantages of randomized SEEDS on video. [sent-348, score-1.014]

78 We also report results of the objectness measure in still images, as a compromise between accuracy and efficiency. [sent-349, score-0.538]

79 We report results of the objectness measure on PASCAL VOC07 [4]. [sent-351, score-0.562]

80 We use our score with the randomized SEEDS to measure the objectness in still images, without temporal propagation. [sent-355, score-0.847]

81 In this way, we are able to compare it to s-o-a objectness measures [1, 11, 6, 14]. [sent-356, score-0.517]

82 As baselines, we use the output of boundary detectors, instead of using randomized SEEDS, to compute our objectness score in still images. [sent-357, score-0.731]

83 The objectness measure based on randomized SEEDS with 5 samples outperforms the one computed using only one sample, which emphasizes the usefulness of randomized SEEDS. [sent-362, score-0.605]

84 In Fig. 9b we show the results compared to s-o-a objectness measures in still images. [sent-366, score-0.538]

85 It shows that our objectness method is competitive with the s-o-a, while being an order of magnitude faster. [sent-367, score-0.539]

86 Also note that the presented objectness measure only uses superpixels, while the others rely on additional cues. [sent-368, score-0.541]

87 We report results for our video objectness score using the Chen dataset [3] where we manually annotated object bounding boxes in the video sequences. [sent-376, score-0.98]

88 In the video case, a stricter 50% criterion is used over the entire bounding box tube: the temporal window must overlap at least 50% with the ground truth over the entire shot of the video. [sent-377, score-0.415]
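
One reading of this criterion is sketched below (an assumption: the paper may instead define the 50% overlap volumetrically over the whole tube): the PASCAL-style IoU between the propagated box and the ground truth must stay at or above 0.5 in every frame of the shot.

```python
def iou(a, b):
    """PASCAL-style intersection-over-union of boxes (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def tube_is_correct(pred_tube, gt_tube, thresh=0.5):
    """The temporal window counts as correct only if the overlap with
    the ground truth holds in every frame of the shot."""
    return all(iou(p, g) >= thresh for p, g in zip(pred_tube, gt_tube))
```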

89 As these temporal objectness windows are presented as a novel concept, we compare our method to some baselines. [sent-379, score-0.747]

90 Additionally, to show the usefulness of the video objectness score (noted as 3D edge in the figure), we compare with a baseline method. [sent-381, score-0.711]

91 Comparison of the objectness measure with sampling superpixels on PASCAL VOC07 to (a) baselines, (b) s-o-a, and (c) evaluation of video objectness on the Chen dataset. [sent-389, score-1.546]

92 With the video objectness score (3D edge) there is an improvement in accuracy because the score is updated over time. [sent-392, score-0.739]

93 It is interesting to note that the 1-sample-version benefits much more from the video objectness score than the 5-sample-version. [sent-394, score-0.687]

94 The reason is that the video objectness score can be seen as a form of multiple samples as well: the score is the sum over 25 samples in time. [sent-395, score-0.819]

95 The timings are 0.03 s for the superpixel samples and 10−5 s for the score computation. [sent-398, score-0.43]

96 Conclusions: In this paper we have introduced a novel online video superpixel algorithm that is able to run in real time, with accuracy comparable to offline methods. [sent-404, score-0.617]

97 To achieve this, we have introduced novel concepts for temporal propagation, termination and creation of superpixels in time, using hierarchical block sizes and temporal histograms. [sent-405, score-0.871]

98 We have demonstrated a new capability of our superpixel algorithm by efficiently extracting multiple diverse samples of superpixels. [sent-406, score-0.46]

99 This allowed us to introduce a new, highly efficient objectness measure, together with its extension to video objectness. [sent-407, score-0.635]

100 Finally, our experiments have shown that both the video superpixel and objectness algorithms match s-o-a offline methods in terms of accuracy, but at much higher speeds. [sent-409, score-1.049]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('objectness', 0.517), ('superpixel', 0.378), ('superpixels', 0.37), ('seeds', 0.319), ('blocks', 0.135), ('bnt', 0.13), ('temporal', 0.129), ('video', 0.118), ('atn', 0.107), ('randomized', 0.104), ('windows', 0.101), ('bounding', 0.097), ('frame', 0.095), ('cbtn', 0.093), ('atm', 0.091), ('int', 0.076), ('catm', 0.074), ('catn', 0.074), ('hierarchy', 0.073), ('tubes', 0.072), ('climbing', 0.072), ('partition', 0.068), ('termination', 0.068), ('block', 0.067), ('exchanges', 0.066), ('online', 0.063), ('hill', 0.061), ('partitions', 0.058), ('exchange', 0.058), ('creation', 0.058), ('tube', 0.057), ('boxes', 0.057), ('atp', 0.056), ('bmt', 0.056), ('exchanging', 0.055), ('score', 0.052), ('terminate', 0.05), ('energy', 0.05), ('atk', 0.049), ('pixels', 0.047), ('injecting', 0.046), ('gbh', 0.046), ('randomization', 0.045), ('streaming', 0.043), ('samples', 0.04), ('intersection', 0.039), ('box', 0.039), ('boundary', 0.037), ('aated', 0.037), ('bsd', 0.037), ('catk', 0.037), ('cfurramreent', 0.037), ('oath', 0.037), ('stills', 0.037), ('videos', 0.036), ('offline', 0.036), ('frames', 0.035), ('metrics', 0.034), ('boundaries', 0.034), ('atq', 0.033), ('intermediary', 0.033), ('shot', 0.032), ('shots', 0.031), ('exchanged', 0.03), ('pixel', 0.03), ('per', 0.03), ('orders', 0.029), ('propagated', 0.029), ('nystrom', 0.029), ('supplementary', 0.028), ('hierarchical', 0.028), ('amount', 0.028), ('perimeter', 0.027), ('ntr', 0.027), ('fulfill', 0.027), ('undersegmentation', 0.027), ('sharon', 0.026), ('hj', 0.026), ('gpb', 0.026), ('layers', 0.026), ('layer', 0.025), ('starts', 0.024), ('measure', 0.024), ('usefulness', 0.024), ('blobs', 0.024), ('pascal', 0.023), ('canny', 0.023), ('magnitude', 0.022), ('cuts', 0.022), ('introduced', 0.022), ('baselines', 0.022), ('material', 0.022), ('capability', 0.021), ('hofe', 0.021), ('extracting', 0.021), ('report', 0.021), ('stream', 0.021), ('still', 0.021), ('seconds', 0.02), ('hypotheses', 0.02)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000004 299 iccv-2013-Online Video SEEDS for Temporal Window Objectness

Author: Michael Van_Den_Bergh, Gemma Roig, Xavier Boix, Santiago Manen, Luc Van_Gool

Abstract: Superpixel and objectness algorithms are broadly used as a pre-processing step to generate support regions and to speed-up further computations. Recently, many algorithms have been extended to video in order to exploit the temporal consistency between frames. However, most methods are computationally too expensive for real-time applications. We introduce an online, real-time video superpixel algorithm based on the recently proposed SEEDS superpixels. A new capability is incorporated which delivers multiple diverse samples (hypotheses) of superpixels in the same image or video sequence. The multiple samples are shown to provide a strong cue to efficiently measure the objectness of image windows, and we introduce the novel concept of objectness in temporal windows. Experiments show that the video superpixels achieve comparable performance to state-of-the-art offline methods while running at 30 fps on a single 2.8 GHz i7 CPU. State-of-the-art performance on objectness is also demonstrated, yet orders of magnitude faster and extended to temporal windows in video.

2 0.32365906 414 iccv-2013-Temporally Consistent Superpixels

Author: Matthias Reso, Jörn Jachalsky, Bodo Rosenhahn, Jörn Ostermann

Abstract: Superpixel algorithms represent a very useful and increasingly popular preprocessing step for a wide range of computer vision applications, as they offer the potential to boost efficiency and effectiveness. In this regard, this paper presents a highly competitive approach for temporally consistent superpixels for video content. The approach is based on energy-minimizing clustering utilizing a novel hybrid clustering strategy for a multi-dimensional feature space working in a global color subspace and local spatial subspaces. Moreover, a new contour evolution based strategy is introduced to ensure spatial coherency of the generated superpixels. For a thorough evaluation the proposed approach is compared to state of the art supervoxel algorithms using established benchmarks and shows a superior performance.

3 0.31891197 282 iccv-2013-Multi-view Object Segmentation in Space and Time

Author: Abdelaziz Djelouah, Jean-Sébastien Franco, Edmond Boyer, François Le_Clerc, Patrick Pérez

Abstract: In this paper, we address the problem of object segmentation in multiple views or videos when two or more viewpoints of the same scene are available. We propose a new approach that propagates segmentation coherence information in both space and time, hence allowing evidences in one image to be shared over the complete set. To this aim the segmentation is cast as a single efficient labeling problem over space and time with graph cuts. In contrast to most existing multi-view segmentation methods that rely on some form of dense reconstruction, ours only requires a sparse 3D sampling to propagate information between viewpoints. The approach is thoroughly evaluated on standard multiview datasets, as well as on videos. With static views, results compete with state of the art methods but they are achieved with significantly fewer viewpoints. With multiple videos, we report results that demonstrate the benefit of segmentation propagation through temporal cues.

4 0.25796276 71 iccv-2013-Category-Independent Object-Level Saliency Detection

Author: Yangqing Jia, Mei Han

Abstract: It is known that purely low-level saliency cues such as frequency does not lead to a good salient object detection result, requiring high-level knowledge to be adopted for successful discovery of task-independent salient objects. In this paper, we propose an efficient way to combine such high-level saliency priors and low-level appearance models. We obtain the high-level saliency prior with the objectness algorithm to find potential object candidates without the need of category information, and then enforce the consistency among the salient regions using a Gaussian MRF with the weights scaled by diverse density that emphasizes the influence of potential foreground pixels. Our model obtains saliency maps that assign high scores for the whole salient object, and achieves state-of-the-art performance on benchmark datasets covering various foreground statistics.

5 0.18499869 160 iccv-2013-Fast Object Segmentation in Unconstrained Video

Author: Anestis Papazoglou, Vittorio Ferrari

Abstract: We present a technique for separating foreground objects from the background in a video. Our method is fast, fully automatic, and makes minimal assumptions about the video. This enables handling essentially unconstrained settings, including rapidly moving background, arbitrary object motion and appearance, and non-rigid deformations and articulations. In experiments on two datasets containing over 1400 video shots, our method outperforms a state-of-the-art background subtraction technique [4] as well as methods based on clustering point tracks [6, 18, 19]. Moreover, it performs comparably to recent video object segmentation methods based on object proposals [14, 16, 27], while being orders of magnitude faster.

6 0.14951317 383 iccv-2013-Semi-supervised Learning for Large Scale Image Cosegmentation

7 0.14947973 374 iccv-2013-Salient Region Detection by UFO: Uniqueness, Focusness and Objectness

8 0.14936909 377 iccv-2013-Segmentation Driven Object Detection with Fisher Vectors

9 0.12872888 442 iccv-2013-Video Segmentation by Tracking Many Figure-Ground Segments

10 0.12612364 144 iccv-2013-Estimating the 3D Layout of Indoor Scenes and Its Clutter from Depth Sensors

11 0.116025 379 iccv-2013-Semantic Segmentation without Annotating Segments

12 0.1148451 110 iccv-2013-Detecting Curved Symmetric Parts Using a Deformable Disc Model

13 0.087655768 172 iccv-2013-Flattening Supervoxel Hierarchies by the Uniform Entropy Slice

14 0.081379801 33 iccv-2013-A Unified Video Segmentation Benchmark: Annotation, Metrics and Analysis

15 0.079885721 424 iccv-2013-Tracking Revisited Using RGBD Camera: Unified Benchmark and Baselines

16 0.077903472 91 iccv-2013-Contextual Hypergraph Modeling for Salient Object Detection

17 0.073541693 76 iccv-2013-Coarse-to-Fine Semantic Video Segmentation Using Supervoxel Trees

18 0.073072061 318 iccv-2013-PixelTrack: A Fast Adaptive Algorithm for Tracking Non-rigid Objects

19 0.071868226 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation

20 0.071457401 201 iccv-2013-Holistic Scene Understanding for 3D Object Detection with RGBD Cameras


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.165), (1, -0.034), (2, 0.141), (3, 0.021), (4, 0.042), (5, 0.048), (6, -0.086), (7, 0.071), (8, -0.015), (9, -0.09), (10, -0.009), (11, 0.13), (12, 0.06), (13, 0.029), (14, -0.097), (15, -0.078), (16, -0.04), (17, -0.072), (18, -0.146), (19, -0.047), (20, 0.056), (21, -0.147), (22, -0.094), (23, -0.037), (24, -0.117), (25, -0.018), (26, -0.08), (27, 0.025), (28, -0.122), (29, 0.028), (30, 0.008), (31, 0.051), (32, -0.055), (33, 0.064), (34, 0.163), (35, -0.119), (36, 0.193), (37, -0.055), (38, -0.07), (39, -0.02), (40, 0.064), (41, -0.074), (42, -0.064), (43, 0.042), (44, -0.03), (45, 0.006), (46, 0.129), (47, 0.048), (48, 0.146), (49, 0.016)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96135116 299 iccv-2013-Online Video SEEDS for Temporal Window Objectness

Author: Michael Van_Den_Bergh, Gemma Roig, Xavier Boix, Santiago Manen, Luc Van_Gool

Abstract: Superpixel and objectness algorithms are broadly used as a pre-processing step to generate support regions and to speed-up further computations. Recently, many algorithms have been extended to video in order to exploit the temporal consistency between frames. However, most methods are computationally too expensive for real-time applications. We introduce an online, real-time video superpixel algorithm based on the recently proposed SEEDS superpixels. A new capability is incorporated which delivers multiple diverse samples (hypotheses) of superpixels in the same image or video sequence. The multiple samples are shown to provide a strong cue to efficiently measure the objectness of image windows, and we introduce the novel concept of objectness in temporal windows. Experiments show that the video superpixels achieve comparable performance to state-of-the-art offline methods while running at 30 fps on a single 2.8 GHz i7 CPU. State-of-the-art performance on objectness is also demonstrated, yet orders of magnitude faster and extended to temporal windows in video.

2 0.88765335 414 iccv-2013-Temporally Consistent Superpixels

Author: Matthias Reso, Jörn Jachalsky, Bodo Rosenhahn, Jörn Ostermann

Abstract: Superpixel algorithms represent a very useful and increasingly popular preprocessing step for a wide range of computer vision applications, as they offer the potential to boost efficiency and effectiveness. In this regard, this paper presents a highly competitive approach for temporally consistent superpixels for video content. The approach is based on energy-minimizing clustering utilizing a novel hybrid clustering strategy for a multi-dimensional feature space working in a global color subspace and local spatial subspaces. Moreover, a new contour evolution based strategy is introduced to ensure spatial coherency of the generated superpixels. For a thorough evaluation the proposed approach is compared to state of the art supervoxel algorithms using established benchmarks and shows a superior performance.

3 0.72693884 282 iccv-2013-Multi-view Object Segmentation in Space and Time

Author: Abdelaziz Djelouah, Jean-Sébastien Franco, Edmond Boyer, François Le_Clerc, Patrick Pérez

Abstract: In this paper, we address the problem of object segmentation in multiple views or videos when two or more viewpoints of the same scene are available. We propose a new approach that propagates segmentation coherence information in both space and time, hence allowing evidences in one image to be shared over the complete set. To this aim the segmentation is cast as a single efficient labeling problem over space and time with graph cuts. In contrast to most existing multi-view segmentation methods that rely on some form of dense reconstruction, ours only requires a sparse 3D sampling to propagate information between viewpoints. The approach is thoroughly evaluated on standard multiview datasets, as well as on videos. With static views, results compete with state of the art methods but they are achieved with significantly fewer viewpoints. With multiple videos, we report results that demonstrate the benefit of segmentation propagation through temporal cues.

4 0.62420064 383 iccv-2013-Semi-supervised Learning for Large Scale Image Cosegmentation

Author: Zhengxiang Wang, Rujie Liu

Abstract: This paper introduces to use semi-supervised learning for large scale image cosegmentation. Different from traditional unsupervised cosegmentation that does not use any segmentation groundtruth, semi-supervised cosegmentation exploits the similarity from both the very limited training image foregrounds, as well as the common object shared between the large number of unsegmented images. This would be a much practical way to effectively cosegment a large number of related images simultaneously, where previous unsupervised cosegmentation works poorly due to the large variances in appearance between different images and the lack of segmentation groundtruth for guidance in cosegmentation. For semi-supervised cosegmentation in large scale, we propose an effective method by minimizing an energy function, which consists of the inter-image distance, the intra-image distance and the balance term. We also propose an iterative updating algorithm to efficiently solve this energy function, which decomposes the original energy minimization problem into sub-problems, and updates each image alternatively to reduce the number of variables in each sub-problem for computation efficiency. Experiment results on iCoseg and Pascal VOC datasets show that the proposed cosegmentation method can effectively cosegment hundreds of images in less than one minute. And our semi-supervised cosegmentation is able to outperform both unsupervised cosegmentation as well as fully supervised single image segmentation, especially when the training data is limited.

5 0.54479569 76 iccv-2013-Coarse-to-Fine Semantic Video Segmentation Using Supervoxel Trees

Author: Aastha Jain, Shuanak Chatterjee, René Vidal

Abstract: We propose an exact, general and efficient coarse-to-fine energy minimization strategy for semantic video segmentation. Our strategy is based on a hierarchical abstraction of the supervoxel graph that allows us to minimize an energy defined at the finest level of the hierarchy by minimizing a series of simpler energies defined over coarser graphs. The strategy is exact, i.e., it produces the same solution as minimizing over the finest graph. It is general, i.e., it can be used to minimize any energy function (e.g., unary, pairwise, and higher-order terms) with any existing energy minimization algorithm (e.g., graph cuts and belief propagation). It also gives significant speedups in inference for several datasets with varying degrees of spatio-temporal continuity. We also discuss the strengths and weaknesses of our strategy relative to existing hierarchical approaches, and the kinds of image and video data that provide the best speedups.

6 0.52857596 33 iccv-2013-A Unified Video Segmentation Benchmark: Annotation, Metrics and Analysis

7 0.52856612 160 iccv-2013-Fast Object Segmentation in Unconstrained Video

8 0.50087667 172 iccv-2013-Flattening Supervoxel Hierarchies by the Uniform Entropy Slice

9 0.49407521 110 iccv-2013-Detecting Curved Symmetric Parts Using a Deformable Disc Model

10 0.43430513 377 iccv-2013-Segmentation Driven Object Detection with Fisher Vectors

11 0.43012333 275 iccv-2013-Motion-Aware KNN Laplacian for Video Matting

12 0.36796194 442 iccv-2013-Video Segmentation by Tracking Many Figure-Ground Segments

13 0.36117548 397 iccv-2013-Space-Time Tradeoffs in Photo Sequencing

14 0.354229 186 iccv-2013-GrabCut in One Cut

15 0.35037306 318 iccv-2013-PixelTrack: A Fast Adaptive Algorithm for Tracking Non-rigid Objects

16 0.34727234 71 iccv-2013-Category-Independent Object-Level Saliency Detection

17 0.34384915 416 iccv-2013-The Interestingness of Images

18 0.33769241 374 iccv-2013-Salient Region Detection by UFO: Uniqueness, Focusness and Objectness

19 0.33711049 349 iccv-2013-Regionlets for Generic Object Detection

20 0.33164269 144 iccv-2013-Estimating the 3D Layout of Indoor Scenes and Its Clutter from Depth Sensors


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(2, 0.056), (7, 0.013), (12, 0.275), (26, 0.128), (31, 0.028), (40, 0.013), (42, 0.078), (48, 0.022), (64, 0.073), (73, 0.028), (89, 0.15), (98, 0.017)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.88452256 413 iccv-2013-Target-Driven Moire Pattern Synthesis by Phase Modulation

Author: Pei-Hen Tsai, Yung-Yu Chuang

Abstract: This paper investigates an approach for generating two grating images so that the moiré pattern of their superposition resembles the target image. Our method is grounded on the fundamental moiré theorem. By focusing on the visually most dominant (1, −1)-moiré component, we obtain the phase modulation constraint on the phase shifts between the two grating images. For improving the visual appearance of the grating images and the hiding capability of the embedded image, a smoothness term is added to spread information between the two grating images and an appearance phase function is used to add irregular structures into grating images. The grating images can be printed on transparencies and the hidden image decoding can be performed optically by overlaying them together. The proposed method enables the creation of moiré art and allows visual decoding without computers.

2 0.85090387 305 iccv-2013-POP: Person Re-identification Post-rank Optimisation

Author: Chunxiao Liu, Chen Change Loy, Shaogang Gong, Guijin Wang

Abstract: Owing to visual ambiguities and disparities, person re-identification methods inevitably produce a suboptimal rank list, which still requires exhaustive human eyeballing to identify the correct target from hundreds of different likely candidates. Existing re-identification studies focus on improving the ranking performance, but rarely look into the critical problem of optimising the time-consuming and error-prone post-rank visual search at the user end. In this study, we present a novel one-shot Post-rank OPtimisation (POP) method, which allows a user to quickly refine their search by either “one-shot” or a couple of sparse negative selections during a re-identification process. We conduct systematic behavioural studies to understand user’s searching behaviour and show that the proposed method allows correct re-identification to converge 2.6 times faster than the conventional exhaustive search. Importantly, through extensive evaluations we demonstrate that the method is capable of achieving significant improvement over the state-of-the-art distance metric learning based ranking models, even with just “one shot” feedback optimisation, by as much as over 30% performance improvement for rank 1 re-identification on the VIPeR and i-LIDS datasets.

3 0.7897796 451 iccv-2013-Write a Classifier: Zero-Shot Learning Using Purely Textual Descriptions

Author: Mohamed Elhoseiny, Babak Saleh, Ahmed Elgammal

Abstract: The main question we address in this paper is how to use purely textual description of categories with no training images to learn visual classifiers for these categories. We propose an approach for zero-shot learning of object categories where the description of unseen categories comes in the form of typical text such as an encyclopedia entry, without the need for explicitly defined attributes. We propose and investigate two baseline formulations, based on regression and domain adaptation. Then, we propose a new constrained optimization formulation that combines a regression function and a knowledge transfer function with additional constraints to predict the classifier parameters for new classes. We applied the proposed approach on two fine-grained categorization datasets, and the results indicate successful classifier prediction.

same-paper 4 0.78232473 299 iccv-2013-Online Video SEEDS for Temporal Window Objectness

Author: Michael Van_Den_Bergh, Gemma Roig, Xavier Boix, Santiago Manen, Luc Van_Gool

Abstract: Superpixel and objectness algorithms are broadly used as a pre-processing step to generate support regions and to speed-up further computations. Recently, many algorithms have been extended to video in order to exploit the temporal consistency between frames. However, most methods are computationally too expensive for real-time applications. We introduce an online, real-time video superpixel algorithm based on the recently proposed SEEDS superpixels. A new capability is incorporated which delivers multiple diverse samples (hypotheses) of superpixels in the same image or video sequence. The multiple samples are shown to provide a strong cue to efficiently measure the objectness of image windows, and we introduce the novel concept of objectness in temporal windows. Experiments show that the video superpixels achieve comparable performance to state-of-the-art offline methods while running at 30 fps on a single 2.8 GHz i7 CPU. State-of-the-art performance on objectness is also demonstrated, yet orders of magnitude faster and extended to temporal windows in video.

5 0.7522552 417 iccv-2013-The Moving Pose: An Efficient 3D Kinematics Descriptor for Low-Latency Action Recognition and Detection

Author: Mihai Zanfir, Marius Leordeanu, Cristian Sminchisescu

Abstract: Human action recognition under low observational latency is receiving a growing interest in computer vision due to rapidly developing technologies in human-robot interaction, computer gaming and surveillance. In this paper we propose a fast, simple, yet powerful non-parametric Moving Pose (MP) framework for low-latency human action and activity recognition. Central to our methodology is a moving pose descriptor that considers both pose information as well as differential quantities (speed and acceleration) of the human body joints within a short time window around the current frame. The proposed descriptor is used in conjunction with a modified kNN classifier that considers both the temporal location of a particular frame within the action sequence as well as the discrimination power of its moving pose descriptor compared to other frames in the training set. The resulting method is non-parametric and enables low-latency recognition, one-shot learning, and action detection in difficult unsegmented sequences. Moreover, the framework is real-time, scalable, and outperforms more sophisticated approaches on challenging benchmarks like MSR-Action3D or MSR-DailyActivities3D.

6 0.72723716 367 iccv-2013-SUN3D: A Database of Big Spaces Reconstructed Using SfM and Object Labels

7 0.69170141 274 iccv-2013-Monte Carlo Tree Search for Scheduling Activity Recognition

8 0.68261492 338 iccv-2013-Randomized Ensemble Tracking

9 0.66834068 136 iccv-2013-Efficient Pedestrian Detection by Directly Optimizing the Partial Area under the ROC Curve

10 0.66426092 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation

11 0.65419227 452 iccv-2013-YouTube2Text: Recognizing and Describing Arbitrary Activities Using Semantic Hierarchies and Zero-Shot Recognition

12 0.64989042 440 iccv-2013-Video Event Understanding Using Natural Language Descriptions

13 0.6492756 61 iccv-2013-Beyond Hard Negative Mining: Efficient Detector Learning via Block-Circulant Decomposition

14 0.64913923 150 iccv-2013-Exemplar Cut

15 0.64889324 428 iccv-2013-Translating Video Content to Natural Language Descriptions

16 0.64734989 190 iccv-2013-Handling Occlusions with Franken-Classifiers

17 0.64708686 127 iccv-2013-Dynamic Pooling for Complex Event Recognition

18 0.64481574 180 iccv-2013-From Where and How to What We See

19 0.64080012 241 iccv-2013-Learning Near-Optimal Cost-Sensitive Decision Policy for Object Detection

20 0.63817728 316 iccv-2013-Pictorial Human Spaces: How Well Do Humans Perceive a 3D Articulated Pose?