cvpr cvpr2013 cvpr2013-450 knowledge-graph by maker-knowledge-mining

450 cvpr-2013-Unsupervised Joint Object Discovery and Segmentation in Internet Images


Source: pdf

Author: Michael Rubinstein, Armand Joulin, Johannes Kopf, Ce Liu

Abstract: We present a new unsupervised algorithm to discover and segment out common objects from large and diverse image collections. In contrast to previous co-segmentation methods, our algorithm performs well even in the presence of significant amounts of noise images (images not containing a common object), as typical for datasets collected from Internet search. The key insight to our algorithm is that common object patterns should be salient within each image, while being sparse with respect to smooth transformations across images. We propose to use dense correspondences between images to capture the sparsity and visual variability of the common object over the entire database, which enables us to ignore noise objects that may be salient within their own images but do not commonly occur in others. We performed extensive numerical evaluation on established co-segmentation datasets, as well as several new datasets generated using Internet search. Our approach is able to effectively segment out the common object for diverse object categories, while naturally identifying images where the common object is not present.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Image datasets collected from Internet search vary considerably in their appearance, and typically include many noise images that do not contain the object of interest (a small subset of the car image dataset is shown in (a); the full dataset is available in the accompanying material). [sent-2, score-0.631]

2 Our algorithm automatically discovers and segments out the common object (b). [sent-3, score-0.196]

3 Note how no objects are discovered for noise images in (b). [sent-4, score-0.211]

4 Most previous co-segmentation methods, in contrast, are designed for more homogeneous datasets in which every image contains the object of interest, and, therefore, their performance degrades in the presence of noise (c). [sent-5, score-0.24]

5 Abstract We present a new unsupervised algorithm to discover and segment out common objects from large and diverse image collections. [sent-6, score-0.365]

6 In contrast to previous co-segmentation methods, our algorithm performs well even in the presence of significant amounts of noise images (images not containing a common object), as typical for datasets collected from Internet search. [sent-7, score-0.36]

7 The key insight to our algorithm is that common object patterns should be salient within each image, while being sparse with respect to smooth transformations across images. [sent-8, score-0.297]

8 We propose to use dense correspondences between images to capture the sparsity and visual variability of the common object over the entire database, which enables us to ignore noise objects that may be salient within their own images but do not commonly occur in others. [sent-9, score-0.727]

9 Our approach is able to effectively segment out the common object for diverse object categories, while naturally identifying images where the common object is not present. [sent-11, score-0.518]

10 Introduction. We consider the task of jointly segmenting multiple images containing a common object. [sent-13, score-0.238]

11 The goal is to label each pixel in a set of images according to whether or not it belongs to the underlying common object, with no additional information on the images or the object class1 . [sent-14, score-0.332]

12 While numerous co-segmentation methods have been proposed, they were shown to work well mostly on small datasets, namely MSRC and iCoseg, containing salient and similar objects. [sent-17, score-0.215]

13 In fact, in most of the images in those datasets the foreground can be quite easily separated from the background based on each image alone (i. [sent-18, score-0.445]

14 Not only do the objects in images downloaded from the Internet exhibit drastically different styles, colors, textures, shapes, poses, sizes, locations and viewpoints, but such image collections also contain many noise images—images which do not contain the object of interest at all. [sent-23, score-0.445]

15 These challenges, as we demonstrate, pose great difficulties for existing co-segmentation techniques (Figure 1(c)). [sent-24, score-0.205]

16 In this paper, we propose a novel correspondence-based object discovery and co-segmentation algorithm that performs well even in the presence of many noise images. [sent-26, score-0.273]

17 Our algorithm automatically discovers the common object among the majority of images and computes a binary object/background label mask for each image. [sent-27, score-0.303]

18 Images that do not contain the common object are naturally handled by returning an empty labeling (Figure 1(b), Figure 2). [sent-28, score-0.2]

19 Our algorithm is designed based on the assumption that pixels (features) belonging to the common object should be: (a) salient, i.e., distinct from the background within their own image, and (b) common, i.e., recurring throughout the majority of the images. [sent-29, score-0.221]

20 Given an input image dataset, we build a large-scale graphical model connecting similar images, where dense pixel correspondences are used to capture the object’s visual variability. [sent-34, score-0.275]

21 These correspondences between images allow us to separate the common object from the background and visual noise. [sent-35, score-0.516]

22 Our algorithm produces state-of-the-art results on the established MSRC and iCoseg co-segmentation datasets, and provides considerable improvement over previous methods on several new challenging Internet datasets containing rigid and non-rigid object categories. [sent-37, score-0.216]

23 In a supervised setup, objects were treated as topics and images as documents, and generative models such as Latent Dirichlet Allocation (LDA) and Hierarchical Pitman-Yor (HPY) have been used to learn the distribution and segmentation of multiple classes simultaneously [24, 22]. [sent-45, score-0.186]

24 Recently, PageRank [7] was used to discover regions of interest in a bounding box representation [10], and self-similarities were used to discover a common pattern in several images [1]. [sent-48, score-0.371]

25 Although no generative models were used in these works to learn the distribution of visual objects, reliable matching and saliency were found to be helpful for object discovery. [sent-49, score-0.417]

26 The notions of matching and saliency were also successfully applied by Fakor et al. [sent-50, score-0.349]

27 [5], a work done in parallel to ours, for unsupervised discovery of image categories. [sent-51, score-0.167]

28 (Footnote 1) We note that while we call our method “unsupervised”, we do assume that the input image dataset contains a common visual category. [sent-53, score-0.212]

29 Since then, numerous methods have been proposed to improve and refine co-segmentation [16, 6, 2, 8], many of which work in the context of a pair of images with the exact same object [19, 16, 6] or require some form of user interaction [2, 4]. [sent-57, score-0.209]

30 [25] introduced the notion of “objectness” to the co-segmentation framework, showing that requiring the foreground segment to be an object often improves co-segmentation results significantly. [sent-63, score-0.296]

31 Other methods were proposed to handle images which might not contain the common object, either implicitly [9] or explicitly [11]. [sent-65, score-0.191]

32 While image annotations may facilitate object discovery and segmentation, image tags are often noisy, and bounding boxes or class labels are usually unavailable. [sent-69, score-0.249]

33 B = {b_1, . . . , b_N}, where for each image I_i and pixel x = (x, y), b_i(x) = 1 indicates foreground (the common object), and b_i(x) = 0 indicates background (not the object) at location x. [sent-79, score-0.504]

34 Recall our assumption that for an object of interest the foreground pixels should be salient, i.e., distinct from the background within their own image. [sent-80, score-0.362]

35 The saliency of a pixel or a region in an image can be defined in numerous ways. (Figure 2 residue: rows labeled source, saliency, warped neighbors, matching, and final result, for columns 1–5.) [sent-88, score-0.371]

36 The images are shown in the top row, with two images common to the two datasets – the face and horse images in columns 1 and 2, respectively. [sent-91, score-0.493]

37 Left: when adding to the two common images three images containing horses (columns 3–5), our algorithm successfully identifies horses as the common object and the face as “noise”, [sent-92, score-0.384]

38 resulting in the horses being labeled as foreground and the face being labeled as background (bottom row). [sent-93, score-0.434]

39 Right: when adding to the two common images three images containing faces, face is now recognized as common and horse as noise, and the algorithm labels the faces as foreground and the horse as background. [sent-94, score-0.804]

40 In our experiments, we used an off-the-shelf saliency measure—Cheng et al.’s Contrast-based Saliency [3]—that produced sufficiently good saliency estimates for our purposes. [sent-97, score-0.276]

41 Our formulation, however, is not limited to a particular saliency measure and others can be used. [sent-98, score-0.552]

42 [3] define the saliency of a pixel based on its color contrast to other pixels in the image (how different it is from the other pixels). [sent-100, score-0.446]

43 Since high contrast to surrounding regions is usually stronger evidence for the saliency of a region than high contrast to far-away regions, they weigh the contrast by the spatial distances in the image. [sent-101, score-0.276]
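To make the idea concrete, here is a rough sketch of a contrast-based saliency measure weighted by spatial distance. It is only an illustration of the concept, not the formulation of [3] (which uses an efficient histogram/region-based computation); the Gaussian bandwidth and the brute-force per-pixel loop are assumptions, and the loop is only practical on a downsampled image.

```python
import numpy as np

def contrast_saliency(image, sigma_spatial=0.25):
    """Toy contrast-based saliency: a pixel is salient if its color differs
    from the rest of the image, with nearby pixels weighted more strongly.
    image: HxWx3 float array in [0, 1]; returns an HxW saliency map.
    O(N^2) per image, so intended for small/downsampled images only."""
    h, w, _ = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([ys / h, xs / w], axis=-1).reshape(-1, 2)  # normalized positions
    colors = image.reshape(-1, 3)

    saliency = np.zeros(h * w)
    for i in range(h * w):
        color_dist = np.linalg.norm(colors - colors[i], axis=1)         # color contrast
        spatial_dist = np.linalg.norm(coords - coords[i], axis=1)
        weight = np.exp(-spatial_dist ** 2 / (2 * sigma_spatial ** 2))  # nearby pixels count more
        saliency[i] = (weight * color_dist).sum() / weight.sum()
    s = saliency.reshape(h, w)
    return (s - s.min()) / (s.max() - s.min() + 1e-8)                   # normalize to [0, 1]
```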

44 To exploit the dataset structure and similarity between image regions, we need to establish reliable correspondences between pixels in different images. [sent-110, score-0.327]

45 This enables us to label a pixel as background even when it may be very salient within its own image. [sent-111, score-0.284]

46 However, instead of establishing the correspondence between all pixels in a pair of images, as done by previous work, we solve and update the correspondences based on our estimation of the foreground regions. [sent-113, score-0.525]

47 This helps ignore background clutter and ultimately improves the correspondence between foreground pixels (Figure 3). [sent-114, score-0.374]

48 Formally, let wij denote the flow field from image Ii to image Ij . [sent-115, score-0.198]

49 Given the binary masks b_i, b_j, the SIFT flow objective function becomes E(w_ij; b_i, b_j) = Σ_x b_i(x) b_j(x + w_ij(x)) ‖S_i(x) − S_j(x + w_ij(x))‖_1 + (the small-displacement and smoothness regularization terms of the original SIFT flow [15]).

50 We then denote by W the set of all pixel correspondences in the dataset: W = ∪_{i=1}^{N} ∪_{I_j ∈ N_i} w_ij. [sent-130, score-0.385]

51 The difference between this objective function and the original SIFT flow [15] is that it encourages matching foreground pixels in image I_i with foreground pixels in image I_j. [sent-131, score-0.608]
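The effect of the masks on the matching objective can be sketched as a masked data term: descriptor distances are accumulated only where both endpoints are currently labeled foreground, and everything else pays a fixed penalty. The L1 distance and the constant penalty are illustrative assumptions, and the small-displacement and smoothness terms of SIFT flow are omitted; this is not the paper's exact energy.

```python
import numpy as np

def masked_matching_cost(feat_i, feat_j, flow_ij, mask_i, mask_j, unmatched_penalty=1.0):
    """Data term of a masked dense-correspondence objective (illustrative only).
    feat_i, feat_j : HxWxD dense descriptors (e.g. dense SIFT) of images I_i, I_j.
    flow_ij        : HxWx2 integer flow (dy, dx) from I_i to I_j.
    mask_i, mask_j : HxW binary foreground estimates b_i, b_j.
    Matching costs count only where both endpoints are foreground; other
    pixels pay a fixed penalty (an assumed stand-in for the paper's term)."""
    h, w, _ = feat_i.shape
    total = 0.0
    for y in range(h):
        for x in range(w):
            dy, dx = flow_ij[y, x]
            ty, tx = y + dy, x + dx
            inside = 0 <= ty < h and 0 <= tx < w
            if inside and mask_i[y, x] == 1 and mask_j[ty, tx] == 1:
                total += np.abs(feat_i[y, x] - feat_j[ty, tx]).sum()  # L1 descriptor distance
            else:
                total += unmatched_penalty
    return total
```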

52 (c) Nearest neighbor ordering (bottom row; left to right) for the source image in (a), computed with a weighted Gist descriptor using the foreground estimates (top row). [sent-135, score-0.31]

53 We use the foreground mask estimates to remove background clutter when computing correspondences (a), and to improve the retrieval of neighbor images (compared to (b), the ordering in (c) places right-facing horses first, followed by left-facing horses, with the (noise) image of a person last). [sent-138, score-0.758]

54 the contribution of this modification for establishing reliable correspondences between similar images. [sent-139, score-0.236]

55 For small datasets we can estimate the correspondences between every pair of images; for large datasets, however, such computation is clearly prohibitive. [sent-140, score-0.293]

56 Therefore, we first find for each image Ii a set of similar images, Ni, based on global image statistics that are more efficient to compute, and estimate pixel correspondences with those images only. [sent-141, score-0.3]

57 We use the Gist descriptor [17] in our implementation, and similarly modify it to account for the foreground estimates by giving lower weight in the descriptor to pixels labeled as background. [sent-143, score-0.324]

58 Figure 3(b–c) demonstrates that better sorting of the images is achieved when using this weighted Gist descriptor, which in turn improves the set of images with which pixel correspondences are computed. [sent-144, score-0.358]
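A sketch of this retrieval step is below. The gist_fn argument is a hypothetical wrapper around any off-the-shelf Gist implementation, and the background down-weighting factor is an illustrative choice; the paper only states that background pixels receive lower weight.

```python
import numpy as np

def weighted_gist(image, fg_mask, gist_fn, bg_weight=0.3):
    """Gist-style descriptor that down-weights pixels currently labeled background.
    gist_fn is assumed to map an HxWx3 image to a 1-D descriptor."""
    weight = np.where(fg_mask > 0, 1.0, bg_weight)[..., None]
    return gist_fn(image * weight)

def nearest_neighbor_sets(descriptors, k=16):
    """For each image, return the indices of its k most similar images under
    Euclidean distance between (weighted) Gist descriptors."""
    D = np.asarray(descriptors, dtype=float)
    dists = np.linalg.norm(D[:, None, :] - D[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)                # exclude the image itself
    return np.argsort(dists, axis=1)[:, :k]
```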

59 We use the above saliency and matching terms to define the likelihood of a pixel label: Φ_i(x) = Φ_i^saliency(x) + λ_match Φ_i^match(x) if b_i(x) = 1, and Φ_i(x) = β (a constant) if b_i(x) = 0.

60 We would like the masks b_i to be spatially consistent within each image, i.e., neighboring pixels should be encouraged to take the same label. [sent-156, score-0.206]

61 We would also like the labeling to be consistent between images, and so we add a term accounting for the inter-image compatibility between a pixel x in image I_i and its corresponding pixel y = x + w_ij(x) in image I_j, Ψ_ij^ext(x, y), which penalizes giving different labels to x and y.

62 Finally, once we have an estimate of b_i, we can learn the color histograms h_i of the background and foreground regions of each image, and collect them into the set of color models H = ∪_i h_i.

63 We define the contribution of the pixel x to the foreground or background color model based on the segmentation estimate b_i(x): Φ_i^color(x) = −log h_i^{b_i(x)}(x). [sent-182, score-0.579]
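A possible sketch of these per-image color models and the resulting unary term is shown below, assuming RGB images scaled to [0, 1]; the number of histogram bins and the add-one smoothing are illustrative choices, not the paper's parameters.

```python
import numpy as np

def _quantize(image, bins):
    """Map each RGB pixel to a single histogram bin index."""
    q = np.clip((image * bins).astype(int), 0, bins - 1)
    return (q[..., 0] * bins + q[..., 1]) * bins + q[..., 2]

def fit_color_models(image, mask, bins=16):
    """Fit background (h^0) and foreground (h^1) color histograms from the
    current segmentation estimate b_i (add-one smoothed, then normalized)."""
    idx = _quantize(image, bins)
    models = []
    for label in (0, 1):                            # 0 = background, 1 = foreground
        counts = np.bincount(idx[mask == label], minlength=bins ** 3) + 1.0
        models.append(counts / counts.sum())
    return models                                   # [h^0, h^1]

def color_unary(image, mask, models, bins=16):
    """Phi^color_i(x) = -log h^{b_i(x)}(x): cost of each pixel's color under
    the model selected by its current label."""
    idx = _quantize(image, bins)
    h0, h1 = models
    prob = np.where(mask == 1, h1[idx], h0[idx])
    return -np.log(prob)
```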

64 By combining all the aforementioned terms, we obtain a cost function, E(B; W, H), for the segmentations B given the correspondences W and the color models H: E(B; W, H) = Σ_i Σ_x [Φ_i(x) + Φ_i^color(x)] + (intra-image smoothness terms within each image) + λ_ext Σ_i Σ_{I_j ∈ N_i} Σ_x Ψ_ij^ext(x, x + w_ij(x)).

65 Our algorithm alternates between optimizing the correspondences W (Equation 2), and the binary masks B (Equation 8). [sent-192, score-0.254]

66 The algorithm then recomputes neighboring images and pixel correspondences based on the current foreground estimates, and the process is repeated for a few iterations until convergence (we typically used 5–10 iterations). [sent-197, score-0.492]
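At a high level, the alternation described above can be outlined as follows. The four callables are stand-ins for the components discussed in the text (saliency estimation, weighted-Gist neighbor retrieval, masked dense correspondence, and the joint graph-cut style mask update), so this is a sketch of the control flow rather than the authors' implementation.

```python
def discover_and_segment(images, saliency_fn, neighbor_fn, flow_fn, mask_fn,
                         n_iters=5, k_neighbors=16):
    """Alternate between correspondence estimation and mask optimization.
    Returns one binary foreground mask per image (empty for noise images)."""
    masks = [saliency_fn(img) > 0.5 for img in images]        # initial foreground guess
    for _ in range(n_iters):                                  # typically 5-10 iterations
        neighbors = neighbor_fn(images, masks, k_neighbors)   # 1. similar images (weighted Gist)
        flows = flow_fn(images, masks, neighbors)             # 2. masked dense correspondences W
        masks = mask_fn(images, masks, flows, neighbors)      # 3. jointly update binary masks B
    return masks
```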

67 Results. We conducted extensive experiments to verify our approach, both on standard co-segmentation datasets and on image collections downloaded from the Internet. [sent-205, score-0.207]

68 We use two performance metrics: precision, P (the ratio of correctly labeled pixels, both foreground and background), and Jaccard similarity, J (the intersection over union of the result and ground truth segmentations). [sent-211, score-0.192]
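Both metrics are simple to compute from a predicted mask and a ground-truth mask; a small sketch follows (the convention J = 1 when both masks are empty, e.g. a correctly rejected noise image, is an assumption).

```python
import numpy as np

def precision(pred, gt):
    """P: ratio of correctly labeled pixels, both foreground and background."""
    pred, gt = np.asarray(pred, bool), np.asarray(gt, bool)
    return float(np.mean(pred == gt))

def jaccard(pred, gt):
    """J: intersection over union of predicted and ground-truth foreground."""
    pred, gt = np.asarray(pred, bool), np.asarray(gt, bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0                      # both masks empty (e.g. noise image handled correctly)
    return float(np.logical_and(pred, gt).sum() / union)
```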

69 Results on Co-segmentation datasets. We report results for the MSRC dataset [23] (14 object classes; about 30 images per class) and the iCoseg dataset [2] (30 classes; varying number of images per class), which have been widely used by previous work to evaluate co-segmentation performance. [sent-216, score-0.381]

70 Both datasets include human-given segmentations that are used for the quantitative evaluation. [sent-217, score-0.181]

71 when using the parameters above, and when setting λmatch = λext = 0, respectively), where the latter effectively reduces the method to segmenting every image independently using its saliency map and spatial regularization (combined in a Grabcut-style iterative optimization). [sent-220, score-0.327]

72 Moreover, this simple algorithm—an off-the-shelf, low-level saliency measure combined with spatial regularization—which does not use co-segmentation, is sufficient to produce accurate results (and outperforms recent techniques; see below) on the standard co-segmentation datasets! [sent-222, score-0.591]
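That per-image configuration (λ_match = λ_ext = 0) can be approximated with off-the-shelf pieces, for instance by thresholding a saliency map and refining it with OpenCV's GrabCut; the thresholds below are illustrative and the sketch is only a stand-in for the paper's saliency-plus-regularization baseline.

```python
import cv2
import numpy as np

def saliency_grabcut_baseline(image_bgr, saliency, lo=0.3, hi=0.7, iters=5):
    """Per-image baseline: seed GrabCut from a thresholded saliency map.
    image_bgr: HxWx3 uint8 image; saliency: HxW float map in [0, 1]."""
    mask = np.full(saliency.shape, cv2.GC_PR_BGD, np.uint8)   # default: probably background
    mask[saliency > lo] = cv2.GC_PR_FGD                       # probably foreground
    mask[saliency > hi] = cv2.GC_FGD                          # confident foreground
    bgd_model = np.zeros((1, 65), np.float64)
    fgd_model = np.zeros((1, 65), np.float64)
    cv2.grabCut(image_bgr, mask, None, bgd_model, fgd_model, iters,
                cv2.GC_INIT_WITH_MASK)
    return np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD)).astype(np.uint8)
```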

73 The reason is twofold: (a) all images in each visual category in those datasets contain the object of interest, and (b) for most of the images the foreground is quite easily separated from the background based on its relative saliency alone. [sent-223, score-0.922]

74 Our method outperforms theirs on all classes in MSRC and 9/16 of the classes in iCoseg (see supplementary material), and our average precision and Jaccard similarity are slightly better than theirs (Table 1). [sent-257, score-0.241]

75 Results on Internet Datasets. Using the Bing API, we automatically downloaded images for three queries with query expansion through Wikipedia: car (4,347 images), horse (6,381 images), and airplane (4,542 images). [sent-261, score-0.529]

76 Many objects are not very distinctive from the background in terms of color, but they were still successfully discovered due to good correspondences to other images. [sent-266, score-0.39]

77 For car, some car parts are occasionally missing as they may be less salient within their image or not well aligned to other images. [sent-267, score-0.306]

78 Similarly, for horse, the bodies of the horses are consistently discovered, but the legs are sometimes missing. [sent-268, score-0.162]

79 More flexible transforms might be needed for establishing correspondences between horses. [sent-269, score-0.236]

80 For airplane, saliency plays a more important role as the uniform skies always match best regardless of the transform. [sent-270, score-0.276]

81 However, the algorithm manages to correctly segment out airplanes even when they are less salient, and identifies noise images, such as those of plane cabins and jet engines, as background, since those have an overall worse matching to other images in the dataset. [sent-271, score-0.249]

82 For quantitative evaluation, we collected partial human labels for each dataset using the LabelMe annotation toolbox [21] and a combination of volunteers and Mechanical Turk workers, resulting in 1,306 car, 879 horse, and 561 airplane images labeled. [sent-272, score-0.319]

83 Qualitative results for these datasets are shown in Figure 6 and the supplementary material. [sent-297, score-0.173]

84 The performance on airplane is slightly better than on horse and car, as in many of the images the airplane can easily be segmented out from the uniform sky background. [sent-308, score-0.58]

85 Image correspondences helped the most on the car dataset (+11% precision, +17% Jaccard similarity), probably because in many of the images the cars are not particularly salient, yet they can be matched reliably to similar car images and thus segmented correctly. [sent-309, score-0.67]

86 We also compared our results with the same three state-of-the-art co-segmentation methods as in Section 4. [sent-311, score-0.205]

87 We also compared to two baselines, one where all the pixels are classified as background (“Baseline 1”), and one where all pixels are classified as foreground (“Baseline 2”). [sent-315, score-0.405]

88 The largest gain in precision by our method is on the airplane dataset, which has the highest noise level of these three datasets. [sent-317, score-0.282]

89 False positives include a motorcycle and a headlight in the car dataset, and a tree in the horse dataset. [sent-322, score-0.24]

90 The algorithm also occasionally fails to discover objects with unique views or backgrounds. [sent-324, score-0.172]

91 Automatic discovery of cars, horses and airplanes downloaded from the Internet, containing 4,347, 6,381 and 4,542 images, respectively. [sent-327, score-0.395]

92 Notice how images that do not contain the object are labeled as background. [sent-329, score-0.168]

93 The last row of each dataset shows some failure cases where no object was discovered or where the discovery is wrong or incomplete. [sent-330, score-0.347]

94 For example, if a dataset of 100 car images contained 80 images of cars and 20 images of car wheels, then using K = 16 neighbor images our algorithm might form only intra-group connections, relating images of cars to other images of cars and images of wheels to other images of wheels. [sent-337, score-1.061]

95 In such a case the algorithm may not be able to infer that one category is more common than the other, and both cars and wheels would be segmented as foreground. [sent-338, score-0.247]

96 Conclusion. We explored automatic visual object discovery and segmentation from the Internet using one query of an object category. [sent-341, score-0.405]

97 The common object often differs drastically in appearance, and a significant portion of the images may not contain the object at all. [sent-343, score-0.325]

98 We model the sparsity and saliency properties of the common object, and construct a large-scale graphical model to jointly infer a binary mask for each image. [sent-345, score-0.415]

99 We demonstrated improvement over existing co-segmentation techniques on standard co-segmentation datasets and several challenging Internet datasets. [sent-346, score-0.315]

100 Using multiple segmentations to discover objects and their extent in image collections. [sent-493, score-0.198]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('internet', 0.311), ('saliency', 0.276), ('icoseg', 0.273), ('cosegmentation', 0.205), ('foreground', 0.192), ('correspondences', 0.183), ('msrc', 0.172), ('jaccard', 0.158), ('discovery', 0.143), ('wij', 0.143), ('airplane', 0.141), ('salient', 0.14), ('bi', 0.135), ('car', 0.121), ('horse', 0.119), ('datasets', 0.11), ('horses', 0.107), ('ciolor', 0.106), ('joulin', 0.103), ('gist', 0.097), ('armand', 0.094), ('discover', 0.092), ('common', 0.09), ('bj', 0.088), ('cars', 0.085), ('background', 0.085), ('rubinstein', 0.082), ('vicente', 0.08), ('precision', 0.078), ('ii', 0.078), ('ext', 0.075), ('wheels', 0.072), ('masks', 0.071), ('segmentations', 0.071), ('unsupervised', 0.069), ('object', 0.067), ('pixels', 0.064), ('supplementary', 0.063), ('sift', 0.063), ('noise', 0.063), ('segmentation', 0.061), ('rother', 0.059), ('pixel', 0.059), ('images', 0.058), ('nxi', 0.058), ('michael', 0.057), ('downloaded', 0.056), ('flow', 0.055), ('discovered', 0.055), ('establishing', 0.053), ('kim', 0.052), ('pagerank', 0.052), ('kuettel', 0.052), ('segmenting', 0.051), ('airplanes', 0.05), ('mask', 0.049), ('user', 0.048), ('ni', 0.048), ('color', 0.047), ('cheng', 0.046), ('occasionally', 0.045), ('dataset', 0.044), ('contain', 0.043), ('si', 0.043), ('ij', 0.043), ('ordering', 0.042), ('diverse', 0.042), ('accompanying', 0.042), ('engines', 0.042), ('neighbor', 0.042), ('collections', 0.041), ('matching', 0.041), ('comparisons', 0.04), ('interest', 0.039), ('discovers', 0.039), ('labels', 0.039), ('containing', 0.039), ('russell', 0.039), ('labelme', 0.038), ('row', 0.038), ('segment', 0.037), ('qualitative', 0.037), ('similarity', 0.036), ('yuen', 0.036), ('numerous', 0.036), ('dissimilar', 0.036), ('objects', 0.035), ('xing', 0.034), ('query', 0.034), ('pages', 0.034), ('descriptor', 0.034), ('supplemental', 0.034), ('equation', 0.034), ('noticed', 0.034), ('correspondence', 0.033), ('winn', 0.033), ('nearest', 0.033), ('visual', 0.033), ('classes', 0.032), ('successfully', 0.032)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999994 450 cvpr-2013-Unsupervised Joint Object Discovery and Segmentation in Internet Images

Author: Michael Rubinstein, Armand Joulin, Johannes Kopf, Ce Liu

Abstract: We present a new unsupervised algorithm to discover and segment out common objects from large and diverse image collections. In contrast to previous co-segmentation methods, our algorithm performs well even in the presence of significant amounts of noise images (images not containing a common object), as typical for datasets collected from Internet search. The key insight to our algorithm is that common object patterns should be salient within each image, while being sparse with respect to smooth transformations across images. We propose to use dense correspondences between images to capture the sparsity and visual variability of the common object over the entire database, which enables us to ignore noise objects that may be salient within their own images but do not commonly occur in others. We performed extensive numerical evaluation on established co-segmentation datasets, as well as several new datasets generated using Internet search. Our approach is able to effectively segment out the common object for diverse object categories, while naturally identifying images where the common object is not present.

2 0.3167831 375 cvpr-2013-Saliency Detection via Graph-Based Manifold Ranking

Author: Chuan Yang, Lihe Zhang, Huchuan Lu, Xiang Ruan, Ming-Hsuan Yang

Abstract: Most existing bottom-up methods measure the foreground saliency of a pixel or region based on its contrast within a local context or the entire image, whereas a few methods focus on segmenting out background regions and thereby salient objects. Instead of considering the contrast between the salient objects and their surrounding regions, we consider both foreground and background cues in a different way. We rank the similarity of the image elements (pixels or regions) with foreground cues or background cues via graph-based manifold ranking. The saliency of the image elements is defined based on their relevances to the given seeds or queries. We represent the image as a close-loop graph with superpixels as nodes. These nodes are ranked based on the similarity to background and foreground queries, based on affinity matrices. Saliency detection is carried out in a two-stage scheme to extract background regions and foreground salient objects efficiently. Experimental results on two large benchmark databases demonstrate the proposed method performs well when against the state-of-the-art methods in terms of accuracy and speed. We also create a more difficult benchmark database containing 5,172 images to test the proposed saliency model and make this database publicly available with this paper for further studies in the saliency field.

3 0.29185152 376 cvpr-2013-Salient Object Detection: A Discriminative Regional Feature Integration Approach

Author: Huaizu Jiang, Jingdong Wang, Zejian Yuan, Yang Wu, Nanning Zheng, Shipeng Li

Abstract: Salient object detection has been attracting a lot of interest, and recently various heuristic computational models have been designed. In this paper, we regard saliency map computation as a regression problem. Our method, which is based on multi-level image segmentation, uses the supervised learning approach to map the regional feature vector to a saliency score, and finally fuses the saliency scores across multiple levels, yielding the saliency map. The contributions lie in two-fold. One is that we show our approach, which integrates the regional contrast, regional property and regional backgroundness descriptors together to form the master saliency map, is able to produce superior saliency maps to existing algorithms most of which combine saliency maps heuristically computed from different types of features. The other is that we introduce a new regional feature vector, backgroundness, to characterize the background, which can be regarded as a counterpart of the objectness descriptor [2]. The performance evaluation on several popular benchmark data sets validates that our approach outperforms existing state-of-the-arts.

4 0.29007846 202 cvpr-2013-Hierarchical Saliency Detection

Author: Qiong Yan, Li Xu, Jianping Shi, Jiaya Jia

Abstract: When dealing with objects with complex structures, saliency detection confronts a critical problem namely that detection accuracy could be adversely affected if salient foreground or background in an image contains small-scale high-contrast patterns. This issue is common in natural images and forms a fundamental challenge for prior methods. We tackle it from a scale point of view and propose a multi-layer approach to analyze saliency cues. The final saliency map is produced in a hierarchical model. Different from varying patch sizes or downsizing images, our scale-based region handling is by finding saliency values optimally in a tree model. Our approach improves saliency detection on many images that cannot be handled well traditionally. A new dataset is also constructed. –

5 0.27659735 273 cvpr-2013-Looking Beyond the Image: Unsupervised Learning for Object Saliency and Detection

Author: Parthipan Siva, Chris Russell, Tao Xiang, Lourdes Agapito

Abstract: We propose a principled probabilistic formulation of object saliency as a sampling problem. This novel formulation allows us to learn, from a large corpus of unlabelled images, which patches of an image are of the greatest interest and most likely to correspond to an object. We then sample the object saliency map to propose object locations. We show that using only a single object location proposal per image, we are able to correctly select an object in over 42% of the images in the PASCAL VOC 2007 dataset, substantially outperforming existing approaches. Furthermore, we show that our object proposal can be used as a simple unsupervised approach to the weakly supervised annotation problem. Our simple unsupervised approach to annotating objects of interest in images achieves a higher annotation accuracy than most weakly supervised approaches.

6 0.26940697 322 cvpr-2013-PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Spatial Priors

7 0.23424463 374 cvpr-2013-Saliency Aggregation: A Data-Driven Approach

8 0.18530862 258 cvpr-2013-Learning Video Saliency from Human Gaze Using Candidate Selection

9 0.1767907 325 cvpr-2013-Part Discovery from Partial Correspondence

10 0.17155512 200 cvpr-2013-Harvesting Mid-level Visual Concepts from Large-Scale Internet Images

11 0.16324389 418 cvpr-2013-Submodular Salient Region Detection

12 0.15404925 411 cvpr-2013-Statistical Textural Distinctiveness for Salient Region Detection in Natural Images

13 0.15345573 107 cvpr-2013-Deformable Spatial Pyramid Matching for Fast Dense Correspondences

14 0.14472355 460 cvpr-2013-Weakly-Supervised Dual Clustering for Image Semantic Segmentation

15 0.14028077 205 cvpr-2013-Hollywood 3D: Recognizing Actions in 3D Natural Scenes

16 0.13965815 138 cvpr-2013-Efficient 2D-to-3D Correspondence Filtering for Scalable 3D Object Recognition

17 0.13349448 43 cvpr-2013-Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs

18 0.13034314 148 cvpr-2013-Ensemble Video Object Cut in Highly Dynamic Scenes

19 0.12960127 216 cvpr-2013-Improving Image Matting Using Comprehensive Sampling Sets

20 0.12846506 235 cvpr-2013-Jointly Aligning and Segmenting Multiple Web Photo Streams for the Inference of Collective Photo Storylines


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.288), (1, -0.104), (2, 0.308), (3, 0.151), (4, 0.011), (5, -0.017), (6, -0.002), (7, -0.027), (8, -0.017), (9, 0.001), (10, 0.052), (11, 0.03), (12, 0.033), (13, 0.023), (14, 0.003), (15, -0.081), (16, 0.03), (17, -0.035), (18, 0.005), (19, -0.031), (20, 0.023), (21, -0.012), (22, 0.019), (23, -0.092), (24, 0.069), (25, -0.103), (26, 0.024), (27, 0.097), (28, -0.006), (29, -0.018), (30, 0.04), (31, -0.021), (32, -0.016), (33, -0.029), (34, 0.102), (35, -0.015), (36, -0.003), (37, 0.012), (38, 0.042), (39, -0.002), (40, 0.004), (41, 0.01), (42, 0.027), (43, 0.01), (44, -0.023), (45, -0.028), (46, 0.033), (47, 0.009), (48, 0.042), (49, -0.067)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.92110419 450 cvpr-2013-Unsupervised Joint Object Discovery and Segmentation in Internet Images

Author: Michael Rubinstein, Armand Joulin, Johannes Kopf, Ce Liu

Abstract: We present a new unsupervised algorithm to discover and segment out common objects from large and diverse image collections. In contrast to previous co-segmentation methods, our algorithm performs well even in the presence of significant amounts of noise images (images not containing a common object), as typical for datasets collected from Internet search. The key insight to our algorithm is that common object patterns should be salient within each image, while being sparse with respect to smooth transformations across images. We propose to use dense correspondences between images to capture the sparsity and visual variability of the common object over the entire database, which enables us to ignore noise objects that may be salient within their own images but do not commonly occur in others. We performed extensive numerical evaluation on established co-segmentation datasets, as well as several new datasets generated using Internet search. Our approach is able to effectively segment out the common object for diverse object categories, while naturally identifying images where the common object is not present.

2 0.83599126 375 cvpr-2013-Saliency Detection via Graph-Based Manifold Ranking

Author: Chuan Yang, Lihe Zhang, Huchuan Lu, Xiang Ruan, Ming-Hsuan Yang

Abstract: Most existing bottom-up methods measure the foreground saliency of a pixel or region based on its contrast within a local context or the entire image, whereas a few methods focus on segmenting out background regions and thereby salient objects. Instead of considering the contrast between the salient objects and their surrounding regions, we consider both foreground and background cues in a different way. We rank the similarity of the image elements (pixels or regions) with foreground cues or background cues via graph-based manifold ranking. The saliency of the image elements is defined based on their relevances to the given seeds or queries. We represent the image as a close-loop graph with superpixels as nodes. These nodes are ranked based on the similarity to background and foreground queries, based on affinity matrices. Saliency detection is carried out in a two-stage scheme to extract background regions and foreground salient objects efficiently. Experimental results on two large benchmark databases demonstrate the proposed method performs well when against the state-of-the-art methods in terms of accuracy and speed. We also create a more difficult benchmark database containing 5,172 images to test the proposed saliency model and make this database publicly available with this paper for further studies in the saliency field.

3 0.83209288 202 cvpr-2013-Hierarchical Saliency Detection

Author: Qiong Yan, Li Xu, Jianping Shi, Jiaya Jia

Abstract: When dealing with objects with complex structures, saliency detection confronts a critical problem namely that detection accuracy could be adversely affected if salient foreground or background in an image contains small-scale high-contrast patterns. This issue is common in natural images and forms a fundamental challenge for prior methods. We tackle it from a scale point of view and propose a multi-layer approach to analyze saliency cues. The final saliency map is produced in a hierarchical model. Different from varying patch sizes or downsizing images, our scale-based region handling is by finding saliency values optimally in a tree model. Our approach improves saliency detection on many images that cannot be handled well traditionally. A new dataset is also constructed. –

4 0.82011944 322 cvpr-2013-PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Spatial Priors

Author: Keyang Shi, Keze Wang, Jiangbo Lu, Liang Lin

Abstract: Driven by recent vision and graphics applications such as image segmentation and object recognition, assigning pixel-accurate saliency values to uniformly highlight foreground objects becomes increasingly critical. More often, such fine-grained saliency detection is also desired to have a fast runtime. Motivated by these, we propose a generic and fast computational framework called PISA Pixelwise Image Saliency Aggregating complementary saliency cues based on color and structure contrasts with spatial priors holistically. Overcoming the limitations of previous methods often using homogeneous superpixel-based and color contrast-only treatment, our PISA approach directly performs saliency modeling for each individual pixel and makes use of densely overlapping, feature-adaptive observations for saliency measure computation. We further impose a spatial prior term on each of the two contrast measures, which constrains pixels rendered salient to be compact and also centered in image domain. By fusing complementary contrast measures in such a pixelwise adaptive manner, the detection effectiveness is significantly boosted. Without requiring reliable region segmentation or post– relaxation, PISA exploits an efficient edge-aware image representation and filtering technique and produces spatially coherent yet detail-preserving saliency maps. Extensive experiments on three public datasets demonstrate PISA’s superior detection accuracy and competitive runtime speed over the state-of-the-arts approaches.

5 0.81997621 376 cvpr-2013-Salient Object Detection: A Discriminative Regional Feature Integration Approach

Author: Huaizu Jiang, Jingdong Wang, Zejian Yuan, Yang Wu, Nanning Zheng, Shipeng Li

Abstract: Salient object detection has been attracting a lot of interest, and recently various heuristic computational models have been designed. In this paper, we regard saliency map computation as a regression problem. Our method, which is based on multi-level image segmentation, uses the supervised learning approach to map the regional feature vector to a saliency score, and finally fuses the saliency scores across multiple levels, yielding the saliency map. The contributions lie in two-fold. One is that we show our approach, which integrates the regional contrast, regional property and regional backgroundness descriptors together to form the master saliency map, is able to produce superior saliency maps to existing algorithms most of which combine saliency maps heuristically computed from different types of features. The other is that we introduce a new regional feature vector, backgroundness, to characterize the background, which can be regarded as a counterpart of the objectness descriptor [2]. The performance evaluation on several popular benchmark data sets validates that our approach outperforms existing state-of-the-arts.

6 0.80877924 273 cvpr-2013-Looking Beyond the Image: Unsupervised Learning for Object Saliency and Detection

7 0.80652922 411 cvpr-2013-Statistical Textural Distinctiveness for Salient Region Detection in Natural Images

8 0.79099029 418 cvpr-2013-Submodular Salient Region Detection

9 0.76987672 263 cvpr-2013-Learning the Change for Automatic Image Cropping

10 0.72476643 258 cvpr-2013-Learning Video Saliency from Human Gaze Using Candidate Selection

11 0.71640426 374 cvpr-2013-Saliency Aggregation: A Data-Driven Approach

12 0.68002993 235 cvpr-2013-Jointly Aligning and Segmenting Multiple Web Photo Streams for the Inference of Collective Photo Storylines

13 0.60648495 234 cvpr-2013-Joint Spectral Correspondence for Disparate Image Matching

14 0.60088158 464 cvpr-2013-What Makes a Patch Distinct?

15 0.58551615 22 cvpr-2013-A Non-parametric Framework for Document Bleed-through Removal

16 0.57579726 200 cvpr-2013-Harvesting Mid-level Visual Concepts from Large-Scale Internet Images

17 0.57367873 183 cvpr-2013-GRASP Recurring Patterns from a Single View

18 0.56904346 107 cvpr-2013-Deformable Spatial Pyramid Matching for Fast Dense Correspondences

19 0.56892157 352 cvpr-2013-Recovering Stereo Pairs from Anaglyphs

20 0.55192471 193 cvpr-2013-Graph Transduction Learning with Connectivity Constraints with Application to Multiple Foreground Cosegmentation


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(10, 0.174), (16, 0.03), (26, 0.056), (27, 0.068), (33, 0.314), (67, 0.088), (69, 0.047), (73, 0.019), (76, 0.015), (77, 0.012), (87, 0.07), (94, 0.021)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.98677909 324 cvpr-2013-Part-Based Visual Tracking with Online Latent Structural Learning

Author: Rui Yao, Qinfeng Shi, Chunhua Shen, Yanning Zhang, Anton van_den_Hengel

Abstract: Despite many advances made in the area, deformable targets and partial occlusions continue to represent key problems in visual tracking. Structured learning has shown good results when applied to tracking whole targets, but applying this approach to a part-based target model is complicated by the need to model the relationships between parts, and to avoid lengthy initialisation processes. We thus propose a method which models the unknown parts using latent variables. In doing so we extend the online algorithm pegasos to the structured prediction case (i.e., predicting the location of the bounding boxes) with latent part variables. To better estimate the parts, and to avoid over-fitting caused by the extra model complexity/capacity introduced by theparts, wepropose a two-stage trainingprocess, based on the primal rather than the dual form. We then show that the method outperforms the state-of-the-art (linear and non-linear kernel) trackers.

2 0.97427136 414 cvpr-2013-Structure Preserving Object Tracking

Author: Lu Zhang, Laurens van_der_Maaten

Abstract: Model-free trackers can track arbitrary objects based on a single (bounding-box) annotation of the object. Whilst the performance of model-free trackers has recently improved significantly, simultaneously tracking multiple objects with similar appearance remains very hard. In this paper, we propose a new multi-object model-free tracker (based on tracking-by-detection) that resolves this problem by incorporating spatial constraints between the objects. The spatial constraints are learned along with the object detectors using an online structured SVM algorithm. The experimental evaluation ofour structure-preserving object tracker (SPOT) reveals significant performance improvements in multi-object tracking. We also show that SPOT can improve the performance of single-object trackers by simultaneously tracking different parts of the object.

3 0.97404712 248 cvpr-2013-Learning Collections of Part Models for Object Recognition

Author: Ian Endres, Kevin J. Shih, Johnston Jiaa, Derek Hoiem

Abstract: We propose a method to learn a diverse collection of discriminative parts from object bounding box annotations. Part detectors can be trained and applied individually, which simplifies learning and extension to new features or categories. We apply the parts to object category detection, pooling part detections within bottom-up proposed regions and using a boosted classifier with proposed sigmoid weak learners for scoring. On PASCAL VOC 2010, we evaluate the part detectors ’ ability to discriminate and localize annotated keypoints. Our detection system is competitive with the best-existing systems, outperforming other HOG-based detectors on the more deformable categories.

4 0.97020984 325 cvpr-2013-Part Discovery from Partial Correspondence

Author: Subhransu Maji, Gregory Shakhnarovich

Abstract: We study the problem of part discovery when partial correspondence between instances of a category are available. For visual categories that exhibit high diversity in structure such as buildings, our approach can be used to discover parts that are hard to name, but can be easily expressed as a correspondence between pairs of images. Parts naturally emerge from point-wise landmark matches across many instances within a category. We propose a learning framework for automatic discovery of parts in such weakly supervised settings, and show the utility of the rich part library learned in this way for three tasks: object detection, category-specific saliency estimation, and fine-grained image parsing.

5 0.9701966 408 cvpr-2013-Spatiotemporal Deformable Part Models for Action Detection

Author: Yicong Tian, Rahul Sukthankar, Mubarak Shah

Abstract: Deformable part models have achieved impressive performance for object detection, even on difficult image datasets. This paper explores the generalization of deformable part models from 2D images to 3D spatiotemporal volumes to better study their effectiveness for action detection in video. Actions are treated as spatiotemporal patterns and a deformable part model is generated for each action from a collection of examples. For each action model, the most discriminative 3D subvolumes are automatically selected as parts and the spatiotemporal relations between their locations are learned. By focusing on the most distinctive parts of each action, our models adapt to intra-class variation and show robustness to clutter. Extensive experiments on several video datasets demonstrate the strength of spatiotemporal DPMs for classifying and localizing actions.

6 0.96984351 314 cvpr-2013-Online Object Tracking: A Benchmark

7 0.96921813 285 cvpr-2013-Minimum Uncertainty Gap for Robust Visual Tracking

8 0.96881831 225 cvpr-2013-Integrating Grammar and Segmentation for Human Pose Estimation

9 0.96699321 104 cvpr-2013-Deep Convolutional Network Cascade for Facial Point Detection

10 0.96424526 164 cvpr-2013-Fast Convolutional Sparse Coding

11 0.96411568 14 cvpr-2013-A Joint Model for 2D and 3D Pose Estimation from a Single Image

12 0.96325493 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases

13 0.96306247 360 cvpr-2013-Robust Estimation of Nonrigid Transformation for Point Set Registration

14 0.96154428 426 cvpr-2013-Tensor-Based Human Body Modeling

15 0.96148831 267 cvpr-2013-Least Soft-Threshold Squares Tracking

16 0.96128213 277 cvpr-2013-MODEC: Multimodal Decomposable Models for Human Pose Estimation

17 0.96051759 60 cvpr-2013-Beyond Physical Connections: Tree Models in Human Pose Estimation

18 0.96014386 143 cvpr-2013-Efficient Large-Scale Structured Learning

19 0.96007514 131 cvpr-2013-Discriminative Non-blind Deblurring

20 0.95985806 221 cvpr-2013-Incorporating Structural Alternatives and Sharing into Hierarchy for Multiclass Object Recognition and Detection