cvpr cvpr2013 cvpr2013-374 knowledge-graph by maker-knowledge-mining

374 cvpr-2013-Saliency Aggregation: A Data-Driven Approach


Source: pdf

Author: Long Mai, Yuzhen Niu, Feng Liu

Abstract: A variety of methods have been developed for visual saliency analysis. These methods often complement each other. This paper addresses the problem of aggregating various saliency analysis methods such that the aggregation result outperforms each individual one. We have two major observations. First, different methods perform differently in saliency analysis. Second, the performance of a saliency analysis method varies with individual images. Our idea is to use data-driven approaches to saliency aggregation that appropriately consider the performance gaps among individual methods and the performance dependence of each method on individual images. This paper discusses various data-driven approaches and finds that the image-dependent aggregation method works best. Specifically, our method uses a Conditional Random Field (CRF) framework for saliency aggregation that not only models the contribution from each individual saliency map but also the interaction between neighboring pixels. To account for the dependence of aggregation on an individual image, our approach selects a subset of images similar to the input image from a training data set and trains the CRF aggregation model using only this subset instead of the whole training set. Our experiments on public saliency benchmarks show that our aggregation method outperforms each individual saliency method and is robust to the selection of aggregated methods.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Saliency Aggregation: A Data-driven Approach. Long Mai, Yuzhen Niu, Feng Liu. Department of Computer Science, Portland State University, Portland, OR 97207, USA. Abstract: A variety of methods have been developed for visual saliency analysis. [sent-1, score-0.782]

2 This paper addresses the problem of aggregating various saliency analysis methods such that the aggregation result outperforms each individual one. [sent-3, score-1.297]

3 Second, the performance of a saliency analysis method varies with individual images. [sent-6, score-0.929]

4 Our idea is to use data-driven approaches to saliency aggregation that appropriately consider the performance gaps among individual methods and the performance dependence of each method on individual images. [sent-7, score-1.495]

5 This paper discusses various data-driven approaches and finds that the image-dependent aggregation method works best. [sent-8, score-0.382]

6 Specifically, our method uses a Conditional Random Field (CRF) framework for saliency aggregation that not only models the contribution from each individual saliency map but also the interaction between neighboring pixels. [sent-9, score-2.094]

7 To account for the dependence of aggregation on an individual image, our approach selects a subset of images similar to the input image from a training data set and trains the CRF aggregation model only using this subset instead of the whole training set. [sent-10, score-0.974]

8 Our experiments on public saliency benchmarks show that our aggregation method outperforms each individual saliency method and is robust to the selection of aggregated methods. [sent-11, score-2.122]

9 Introduction. Visual saliency measures low-level stimuli to the human vision system that grab a viewer's attention in the early stage of visual processing [17]. [sent-13, score-0.798]

10 It has been used in a wide range of computer vision, multimedia, and graphics applications, such as automatic object detection [16], image retrieval [23], video summarization [25], adaptive image compression [7], and content-aware image/video resizing [32]. [sent-14, score-0.033]

11 There is a rich literature on image saliency analysis [1, 2, 4–6, 8–13, 15, 17–20, 22, 24–31, 33, 35–45]. [sent-15, score-0.798]

12 These methods design a variety of biologically plausible models or use data-driven approaches to compute a saliency map from an image. [sent-16, score-0.015]

13 Individual saliency methods, such as GC [6], FT [2], and CA [12], often complement each other. [sent-20, score-0.827]

14 Saliency aggregation can effectively combine their results and perform better than each of them. [sent-21, score-0.397]

15 data-driven approaches to compute a saliency map from an image. [sent-22, score-0.81]

16 While these methods achieve good results statistically on public benchmarks, each of these methods has its own advantages and disadvantages. [sent-23, score-0.024]

17 More interestingly, different saliency methods can often complement each other. [sent-25, score-0.827]

18 Therefore, the aggregation of these saliency analysis results can likely outperform each individual one, as reported in a recent study [5]. [sent-26, score-1.28]

19 Their study shows that the combination of a few best-performed saliency analysis methods using pre-defined functions, such as averaging, can improve each individual one. [sent-27, score-0.916]

20 In this paper, we present a data-driven approach to saliency aggregation. [sent-28, score-0.782]

21 Our method combines saliency maps from various methods with divergent properties and large performance gaps. [sent-29, score-0.829]

22 To effectively combine these saliency maps, our approach uses machine learning methods to learn an aggregation model that appropriately determines the contribution of each individual method. [sent-30, score-1.349]

23 Specifically, we use a Conditional Random Field (CRF) framework [21] for saliency aggregation that not only models the contribution from each individual saliency map but also the interaction between neighboring pixels. [sent-31, score-2.142]

24 It has been observed that the performance of each individual method varies over images. [sent-32, score-0.131]

25 Therefore, saliency aggregation should be customized to each individual image. [sent-33, score-1.282]

26 To account for the dependence of aggregation on an individual image, our approach first selects from a training dataset a subset of images similar to the input image and trains the CRF aggregation model using only this subset instead of the whole training set. [sent-34, score-0.974]
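
A minimal sketch of this image-dependent training step is given below, assuming images are float RGB arrays in [0, 1] and using a plain global color histogram as the similarity descriptor; the descriptor choice, the value of k, and the train_crf_aggregator name are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def color_histogram(image, bins=8):
    """Global RGB color histogram used here as a simple image-similarity descriptor."""
    hist, _ = np.histogramdd(image.reshape(-1, 3), bins=bins, range=[(0, 1)] * 3)
    hist = hist.ravel()
    return hist / (hist.sum() + 1e-8)

def select_similar_training_subset(input_image, training_images, k=30):
    """Return indices of the k training images closest to the input image."""
    query = color_histogram(input_image)
    distances = [np.linalg.norm(color_histogram(img) - query) for img in training_images]
    return np.argsort(distances)[:k]

# Usage (train_crf_aggregator is a hypothetical trainer for the CRF aggregation model):
# idx = select_similar_training_subset(I, training_images)
# model = train_crf_aggregator([training_images[i] for i in idx])
```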

27 Compared to standard aggregation methods that use predefined combination functions and treat each individual method equally, our data-driven method has the following advantages. [sent-35, score-0.529]

28 First, our method considers the performance gaps among individual saliency analysis methods and better determines their contribution in aggregation. [sent-36, score-0.999]

29 Second, our method considers that the performance of each individual saliency analysis method varies over images, and it is able to customize an appropriate aggregation model to each input image. [sent-37, score-1.349]

30 As more and more saliency analysis methods have been developed recently, our research provides a way to make the best use of existing and forthcoming saliency methods and opens the possibility of pushing forward the state-of-the-art results in saliency analysis. [sent-38, score-2.378]

31 Saliency Aggregation. Our method starts from running a set of m saliency analysis algorithms, {Mi | 1 ≤ i ≤ m}, on a given image I, [sent-40, score-0.798]

32 and produces m saliency maps, {Si | 1 ≤ i ≤ m}, [sent-41, score-0.782]

33 one for each algorithm. [sent-42, score-0.023]

34 Si(p) in a saliency map encodes the saliency value at pixel p. [sent-44, score-1.656]

35 The saliency value in each map is normalized to [0, 1]. [sent-45, score-0.824]
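
A one-function sketch of the per-map normalization mentioned in sentence 35; the small epsilon guard is an assumed implementation detail to avoid division by zero on constant maps.

```python
import numpy as np

def normalize_saliency_map(S, eps=1e-8):
    """Rescale a saliency map to [0, 1] before aggregation."""
    S = S.astype(np.float64)
    return (S - S.min()) / (S.max() - S.min() + eps)
```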

36 Our goal is to take these m saliency maps as input and produce a final saliency map S. [sent-46, score-1.631]

37 This section begins with the standard aggregation methods from previous work that use pre-defined combination functions and then elaborates our data-driven saliency aggregation approaches. [sent-47, score-1.592]

38 Standard Saliency Aggregation. To serve as our baseline method, we first apply the combination strategies from [5] to saliency aggregation. [sent-50, score-0.8]

39 We then discuss their performance to motivate our data-driven aggregation approaches. [sent-51, score-0.396]

40 Given m saliency maps {Si | 1 ≤ i ≤ m} computed from an image I, the aggregated saliency value S(p) at pixel p of I is modeled as the probability S(p) = P(yp = 1 | S1(p), S2(p), ..., Sm(p)). [sent-53, score-0.862]

41 S(p) = P(yp = 1 | S1(p), ..., Sm(p)) = (1/Z) Σ_{i=1}^{m} ζ(Si(p)), (1) where Si(p) represents the saliency value of pixel p in the saliency map Si, yp is a binary random variable taking the value 1 if p is a salient pixel and 0 otherwise, and Z is a constant. [sent-56, score-2.018]

42 Following [5], we implemented three different options for the function ζ in Equation 1, including ζ1(x) = x, ζ2(x) = exp(x), and ζ3(x) = −log(1 − x). (2) [sent-57, score-0.014]
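
The following sketch illustrates one plausible reading of the standard aggregation in Equations 1-2, assuming the per-pixel ζ-transformed scores are summed across the m maps and the result rescaled to [0, 1] (the rescaling plays the role of the constant Z); this is an illustration, not the authors' implementation.

```python
import numpy as np

ZETA = {
    "LIN": lambda x: x,                        # zeta_1(x) = x
    "EXP": lambda x: np.exp(x),                # zeta_2(x) = exp(x)
    "LOG": lambda x: -np.log(1.0 - x + 1e-8),  # zeta_3(x) = -log(1 - x), epsilon for stability
}

def standard_aggregation(saliency_maps, mode="LIN"):
    """Combine m normalized saliency maps (each HxW in [0, 1]) with a fixed zeta function."""
    zeta = ZETA[mode]
    combined = np.sum([zeta(s) for s in saliency_maps], axis=0)
    # Rescale back to [0, 1]; this plays the role of the normalization constant Z.
    combined -= combined.min()
    return combined / (combined.max() + 1e-8)
```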

43 We used these standard aggregation methods to combine a range of saliency analysis methods and tested them on two public saliency benchmarks, FT [2] and SS [31]. [sent-58, score-2.031]

44 Figure 2 (a) shows that when these methods are used to aggregate three best-performed methods, they can produce encouraging results. [sent-59, score-0.036]

45 On the FT benchmark, the aggregation methods produce comparable results to the best individual method, and on the SS benchmark, they outperform each individual one. [sent-60, score-0.597]

46 When the individual methods have large performance gaps, these standard aggregation methods produce less successful results, as shown in Figure 2 (b). [sent-61, score-0.497]

47 The main reason is that they do not consider the performance difference among individual methods and treat them equally. [sent-62, score-0.115]

48 Therefore, the low-performance individual methods compromise the aggregation result. [sent-63, score-0.496]

49 This happens even when only the best-performed methods are aggregated. [sent-64, score-0.014]

50 Data-driven Saliency Aggregation. We observe that while various saliency analysis methods often complement each other, there are performance gaps among them. [sent-67, score-0.912]

51 Moreover, the performance of each method varies over individual images. [sent-68, score-0.131]

52 Therefore, saliency aggregation should be individual method-aware and individual image-aware. [sent-69, score-1.364]

53 We design data-driven approaches to achieve such saliency aggregation. [sent-70, score-0.782]

54 1 Pixel-wise Aggregation. Our first method associates each pixel p with a feature vector x(p) = (S1(p), S2(p), · · · , Sm(p)), where Si(p) is the saliency value at p in the saliency map Si. [sent-73, score-1.658]

55 We also assign a binary random variable yp, which indicates whether the pixel is salient or not. [sent-74, score-0.066]

56 We compute the final saliency value S(p) as the posterior probability P(yp = 1|x(p)). [sent-76, score-0.796]

57 Specifically, we model P(yp = 1|x(p)) using the logistic model. [sent-77, score-0.02]

58 λ = {λi | 1 ≤ i ≤ m+1} is the set of model parameters which weigh the contribution of each individual saliency map. [sent-87, score-0.062]

59 σ(·) denotes the sigmoid function σ(z) = 1/(1 + exp(−z)). (4) The parameter λ can be learned using a standard logistic regression technique on the training data. [sent-91, score-0.034]
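
Sentences 54-59 describe the pixel-wise aggregation. Below is a minimal sketch of that idea, assuming binary ground-truth masks are available for the training images and using scikit-learn's logistic regression as the learner; the helper names are illustrative, not from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def stack_features(saliency_maps):
    """Per-pixel feature matrix x(p) = (S1(p), ..., Sm(p)), one row per pixel."""
    return np.stack([s.ravel() for s in saliency_maps], axis=1)

def train_pixelwise_aggregator(train_map_sets, train_masks):
    """train_map_sets: list of m-map lists, one per training image; train_masks: binary ground truth."""
    X = np.vstack([stack_features(maps) for maps in train_map_sets])
    y = np.concatenate([mask.ravel().astype(int) for mask in train_masks])
    return LogisticRegression(max_iter=1000).fit(X, y)

def pixelwise_aggregate(model, saliency_maps):
    """Final map: posterior P(yp = 1 | x(p)) reshaped to the image size."""
    prob = model.predict_proba(stack_features(saliency_maps))[:, 1]
    return prob.reshape(saliency_maps[0].shape)
```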

60 LIN, EXP, and LOG refer to the three ζ functions in Equation 2 that are used in the standard combination function defined in Equation 1, respectively. [sent-94, score-0.032]

61 The learned parameters can then appropriately account for the performance gaps among individual saliency methods. [sent-95, score-0.983]

62 2 Aggregation using Conditional Random Field. One potential problem with estimating the saliency value for each pixel individually is that it ignores the interaction between neighboring pixels. [sent-98, score-0.919]

63 Our second method addresses this problem by modeling saliency estimation using a binary Conditional Random Field (CRF) [21]. [sent-99, score-0.799]

64 We use CRF to capture the relation between neighboring pixels. [sent-100, score-0.048]

65 Their method estimates the saliency map directly using image features. [sent-103, score-0.81]

66 In contrast, our method uses CRF to aggregate saliency analysis results from multiple methods. [sent-104, score-0.819]

67 Like the pixel-wise aggregation method, we associate each node with a saliency feature vector x(p) = (S1(p), S2(p), · · · , Sm(p)) and a binary random label yp, 1 for salient and 0 for non-salient. [sent-106, score-1.164]

68 The saliency label of each pixel depends not only on its feature vector, but also on the labels of neighboring pixels. [sent-107, score-0.886]

69 We use a grid-shaped CRF to model the relationship between the label and feature and the feature-dependent relationship between the labels of neighboring pixels. [sent-109, score-0.119]

70 We define the conditional distribution of labels Y = {yp | p ∈ I} on the features X = {xp | p ∈ I} as follows: P(Y|X; θ) = (1/Z) exp( Σ_{p∈I} fd(xp, yp) + Σ_{p∈I} Σ_{q∈Np} fs(xp, xq, yp, yq) ), [sent-110, score-0.097]

71 where p is a pixel in image I, xp is its feature, and yp is its saliency label. [sent-115, score-1.241]

72 fd(xp, yp) is the feature function that defines the relationship between the feature and label. [sent-117, score-0.025]

73 fs (xp, xq, yp, yq) is another feature function that defines the feature-dependent relationship between the labels of neighboring pixels p and q. [sent-118, score-0.138]

74 We define the feature function fd(xp, yp) based only on the input saliency maps Si. [sent-122, score-0.806]

75 where {λi} is a subset of the CRF model parameters and Si(p) is the saliency value of the pixel p in the saliency map Si. [sent-127, score-1.696]

76 The feature function fs(xp, xq, yp, yq) has two components to model the data-dependent relationship between the labels of neighboring pixels. [sent-128, score-0.138]

77 In particular, if a pixel takes a higher saliency value than its neighbor in an individual saliency map, it is also more likely to take the salient label after aggregation. [sent-130, score-1.744]

78 Σ_{i=1}^{m} αi (1(yp = 1, yq = 0) − 1(yp = 0, yq = 1)) (Si(p) − Si(q)), (8) where αi are CRF model parameters in this feature function. [sent-133, score-0.189]

79 fc(xp, xq, yp, yq) follows the idea from [24] to incorporate the observation that neighboring pixels with similar colors should have similar saliency labels. [sent-136, score-0.968]

80 where I(p) − I(q) is the color difference between pixels p and q in the RGB color space. [sent-141, score-0.055]
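
To make the CRF terms in sentences 67-80 concrete, here is a rough sketch of the unary and pairwise feature functions written as per-pixel and per-edge scores. The function and parameter names are illustrative, the color component (its beta weight and sigma bandwidth) is an assumed form rather than the paper's exact definition, and CRF learning and inference are omitted.

```python
import numpy as np

def f_d(S_at_p, y_p, lam):
    """Unary term: weighted sum of the m saliency values when the pixel is labeled salient."""
    return float(y_p) * float(np.dot(lam, S_at_p))

def f_order(S_at_p, S_at_q, y_p, y_q, alpha):
    """First pairwise component (Eq. 8): the pixel with the higher saliency should get the salient label."""
    sign = float(y_p == 1 and y_q == 0) - float(y_p == 0 and y_q == 1)
    return sign * float(np.dot(alpha, S_at_p - S_at_q))

def f_color(I_p, I_q, y_p, y_q, beta=1.0, sigma=0.1):
    """Second pairwise component f_c: neighboring pixels with similar colors prefer the same label."""
    affinity = np.exp(-np.linalg.norm(I_p - I_q) ** 2 / (2.0 * sigma ** 2))
    return beta * float(y_p == y_q) * affinity
```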


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('saliency', 0.782), ('aggregation', 0.382), ('yp', 0.297), ('yq', 0.189), ('xp', 0.127), ('xq', 0.122), ('crf', 0.114), ('individual', 0.1), ('gaps', 0.069), ('neighboring', 0.048), ('complement', 0.045), ('fs', 0.044), ('fd', 0.04), ('portland', 0.04), ('fc', 0.037), ('si', 0.037), ('pixel', 0.035), ('isi', 0.033), ('appropriately', 0.032), ('conditional', 0.032), ('varies', 0.031), ('salient', 0.031), ('benchmarks', 0.03), ('dependence', 0.03), ('map', 0.028), ('fe', 0.028), ('trains', 0.028), ('sm', 0.027), ('relationship', 0.025), ('public', 0.024), ('maps', 0.024), ('ilso', 0.023), ('customize', 0.023), ('caotnurdeisti', 0.023), ('emn', 0.023), ('tehnec', 0.023), ('ofmodel', 0.023), ('meanpts', 0.023), ('ior', 0.023), ('diverged', 0.023), ('aeandch', 0.023), ('wwhheicrhe', 0.023), ('ft', 0.023), ('aggregated', 0.022), ('labels', 0.021), ('efe', 0.021), ('tribution', 0.021), ('sali', 0.021), ('otdhee', 0.021), ('aggregate', 0.021), ('interaction', 0.02), ('logistic', 0.02), ('np', 0.02), ('bq', 0.02), ('ahnerde', 0.02), ('ignorance', 0.02), ('niu', 0.02), ('combination', 0.018), ('customized', 0.018), ('selects', 0.018), ('resizing', 0.018), ('subset', 0.017), ('addresses', 0.017), ('associates', 0.017), ('gc', 0.017), ('oxn', 0.017), ('determines', 0.017), ('viewer', 0.016), ('ors', 0.016), ('stimuli', 0.016), ('pushing', 0.016), ('weigh', 0.016), ('analysis', 0.016), ('exp', 0.016), ('tht', 0.016), ('ss', 0.016), ('produce', 0.015), ('biologically', 0.015), ('hpe', 0.015), ('considers', 0.015), ('treat', 0.015), ('combine', 0.015), ('cso', 0.015), ('imn', 0.015), ('encodes', 0.015), ('summarization', 0.015), ('value', 0.014), ('motivate', 0.014), ('mai', 0.014), ('options', 0.014), ('happens', 0.014), ('field', 0.014), ('ys', 0.014), ('ca', 0.014), ('oin', 0.014), ('sigmoid', 0.014), ('compromise', 0.014), ('begins', 0.014), ('functions', 0.014), ('mp', 0.014)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0 374 cvpr-2013-Saliency Aggregation: A Data-Driven Approach

Author: Long Mai, Yuzhen Niu, Feng Liu

Abstract: A variety of methods have been developed for visual saliency analysis. These methods often complement each other. This paper addresses the problem of aggregating various saliency analysis methods such that the aggregation result outperforms each individual one. We have two major observations. First, different methods perform differently in saliency analysis. Second, the performance of a saliency analysis method varies with individual images. Our idea is to use data-driven approaches to saliency aggregation that appropriately consider the performance gaps among individual methods and the performance dependence of each method on individual images. This paper discusses various data-driven approaches and finds that the image-dependent aggregation method works best. Specifically, our method uses a Conditional Random Field (CRF) framework for saliency aggregation that not only models the contribution from each individual saliency map but also the interaction between neighboring pixels. To account for the dependence of aggregation on an individual image, our approach selects a subset of images similar to the input image from a training data set and trains the CRF aggregation model using only this subset instead of the whole training set. Our experiments on public saliency benchmarks show that our aggregation method outperforms each individual saliency method and is robust to the selection of aggregated methods.

2 0.61319149 322 cvpr-2013-PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Spatial Priors

Author: Keyang Shi, Keze Wang, Jiangbo Lu, Liang Lin

Abstract: Driven by recent vision and graphics applications such as image segmentation and object recognition, assigning pixel-accurate saliency values to uniformly highlight foreground objects becomes increasingly critical. More often, such fine-grained saliency detection is also desired to have a fast runtime. Motivated by these, we propose a generic and fast computational framework called PISA - Pixelwise Image Saliency Aggregating complementary saliency cues based on color and structure contrasts with spatial priors holistically. Overcoming the limitations of previous methods often using homogeneous superpixel-based and color contrast-only treatment, our PISA approach directly performs saliency modeling for each individual pixel and makes use of densely overlapping, feature-adaptive observations for saliency measure computation. We further impose a spatial prior term on each of the two contrast measures, which constrains pixels rendered salient to be compact and also centered in image domain. By fusing complementary contrast measures in such a pixelwise adaptive manner, the detection effectiveness is significantly boosted. Without requiring reliable region segmentation or post-relaxation, PISA exploits an efficient edge-aware image representation and filtering technique and produces spatially coherent yet detail-preserving saliency maps. Extensive experiments on three public datasets demonstrate PISA's superior detection accuracy and competitive runtime speed over the state-of-the-art approaches.

3 0.59191203 375 cvpr-2013-Saliency Detection via Graph-Based Manifold Ranking

Author: Chuan Yang, Lihe Zhang, Huchuan Lu, Xiang Ruan, Ming-Hsuan Yang

Abstract: Most existing bottom-up methods measure the foreground saliency of a pixel or region based on its contrast within a local context or the entire image, whereas a few methods focus on segmenting out background regions and thereby salient objects. Instead of considering the contrast between the salient objects and their surrounding regions, we consider both foreground and background cues in a different way. We rank the similarity of the image elements (pixels or regions) with foreground cues or background cues via graph-based manifold ranking. The saliency of the image elements is defined based on their relevances to the given seeds or queries. We represent the image as a close-loop graph with superpixels as nodes. These nodes are ranked based on the similarity to background and foreground queries, based on affinity matrices. Saliency detection is carried out in a two-stage scheme to extract background regions and foreground salient objects efficiently. Experimental results on two large benchmark databases demonstrate the proposed method performs well against the state-of-the-art methods in terms of accuracy and speed. We also create a more difficult benchmark database containing 5,172 images to test the proposed saliency model and make this database publicly available with this paper for further studies in the saliency field.

4 0.58572358 376 cvpr-2013-Salient Object Detection: A Discriminative Regional Feature Integration Approach

Author: Huaizu Jiang, Jingdong Wang, Zejian Yuan, Yang Wu, Nanning Zheng, Shipeng Li

Abstract: Salient object detection has been attracting a lot of interest, and recently various heuristic computational models have been designed. In this paper, we regard saliency map computation as a regression problem. Our method, which is based on multi-level image segmentation, uses the supervised learning approach to map the regional feature vector to a saliency score, and finally fuses the saliency scores across multiple levels, yielding the saliency map. The contributions lie in two-fold. One is that we show our approach, which integrates the regional contrast, regional property and regional backgroundness descriptors together to form the master saliency map, is able to produce superior saliency maps to existing algorithms most of which combine saliency maps heuristically computed from different types of features. The other is that we introduce a new regional feature vector, backgroundness, to characterize the background, which can be regarded as a counterpart of the objectness descriptor [2]. The performance evaluation on several popular benchmark data sets validates that our approach outperforms existing state-of-the-arts.

5 0.58416331 202 cvpr-2013-Hierarchical Saliency Detection

Author: Qiong Yan, Li Xu, Jianping Shi, Jiaya Jia

Abstract: When dealing with objects with complex structures, saliency detection confronts a critical problem, namely that detection accuracy could be adversely affected if salient foreground or background in an image contains small-scale high-contrast patterns. This issue is common in natural images and forms a fundamental challenge for prior methods. We tackle it from a scale point of view and propose a multi-layer approach to analyze saliency cues. The final saliency map is produced in a hierarchical model. Different from varying patch sizes or downsizing images, our scale-based region handling is by finding saliency values optimally in a tree model. Our approach improves saliency detection on many images that cannot be handled well traditionally. A new dataset is also constructed.

6 0.53020585 273 cvpr-2013-Looking Beyond the Image: Unsupervised Learning for Object Saliency and Detection

7 0.4366031 258 cvpr-2013-Learning Video Saliency from Human Gaze Using Candidate Selection

8 0.30179748 411 cvpr-2013-Statistical Textural Distinctiveness for Salient Region Detection in Natural Images

9 0.27168509 418 cvpr-2013-Submodular Salient Region Detection

10 0.23424463 450 cvpr-2013-Unsupervised Joint Object Discovery and Segmentation in Internet Images

11 0.22068462 205 cvpr-2013-Hollywood 3D: Recognizing Actions in 3D Natural Scenes

12 0.21598595 325 cvpr-2013-Part Discovery from Partial Correspondence

13 0.18002678 384 cvpr-2013-Segment-Tree Based Cost Aggregation for Stereo Matching

14 0.13765793 200 cvpr-2013-Harvesting Mid-level Visual Concepts from Large-Scale Internet Images

15 0.12506483 263 cvpr-2013-Learning the Change for Automatic Image Cropping

16 0.088323772 464 cvpr-2013-What Makes a Patch Distinct?

17 0.084689915 53 cvpr-2013-BFO Meets HOG: Feature Extraction Based on Histograms of Oriented p.d.f. Gradients for Image Classification

18 0.080468304 326 cvpr-2013-Patch Match Filter: Efficient Edge-Aware Filtering Meets Randomized Search for Fast Correspondence Field Estimation

19 0.066191003 157 cvpr-2013-Exploring Implicit Image Statistics for Visual Representativeness Modeling

20 0.064335793 291 cvpr-2013-Motionlets: Mid-level 3D Parts for Human Motion Recognition


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.125), (1, -0.226), (2, 0.614), (3, 0.322), (4, -0.151), (5, -0.039), (6, -0.012), (7, -0.084), (8, 0.076), (9, 0.033), (10, -0.02), (11, 0.058), (12, -0.028), (13, 0.014), (14, -0.048), (15, 0.057), (16, -0.02), (17, -0.0), (18, -0.017), (19, -0.045), (20, -0.04), (21, -0.038), (22, 0.019), (23, 0.071), (24, -0.026), (25, 0.022), (26, -0.046), (27, -0.016), (28, 0.003), (29, 0.018), (30, -0.009), (31, 0.005), (32, -0.012), (33, -0.007), (34, -0.017), (35, -0.024), (36, -0.017), (37, -0.025), (38, 0.021), (39, -0.005), (40, 0.006), (41, -0.002), (42, -0.002), (43, -0.024), (44, 0.029), (45, -0.019), (46, 0.008), (47, 0.021), (48, -0.032), (49, -0.017)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99116164 374 cvpr-2013-Saliency Aggregation: A Data-Driven Approach

Author: Long Mai, Yuzhen Niu, Feng Liu

Abstract: A variety of methods have been developed for visual saliency analysis. These methods often complement each other. This paper addresses the problem of aggregating various saliency analysis methods such that the aggregation result outperforms each individual one. We have two major observations. First, different methods perform differently in saliency analysis. Second, the performance of a saliency analysis method varies with individual images. Our idea is to use data-driven approaches to saliency aggregation that appropriately consider the performance gaps among individual methods and the performance dependence of each method on individual images. This paper discusses various data-driven approaches and finds that the image-dependent aggregation method works best. Specifically, our method uses a Conditional Random Field (CRF) framework for saliency aggregation that not only models the contribution from individual saliency map but also the interaction between neighboringpixels. To account for the dependence of aggregation on an individual image, our approach selects a subset of images similar to the input image from a training data set and trains the CRF aggregation model only using this subset instead of the whole training set. Our experiments on public saliency benchmarks show that our aggregation method outperforms each individual saliency method and is robust with the selection of aggregated methods.

2 0.92346388 322 cvpr-2013-PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Spatial Priors

Author: Keyang Shi, Keze Wang, Jiangbo Lu, Liang Lin

Abstract: Driven by recent vision and graphics applications such as image segmentation and object recognition, assigning pixel-accurate saliency values to uniformly highlight foreground objects becomes increasingly critical. More often, such fine-grained saliency detection is also desired to have a fast runtime. Motivated by these, we propose a generic and fast computational framework called PISA - Pixelwise Image Saliency Aggregating complementary saliency cues based on color and structure contrasts with spatial priors holistically. Overcoming the limitations of previous methods often using homogeneous superpixel-based and color contrast-only treatment, our PISA approach directly performs saliency modeling for each individual pixel and makes use of densely overlapping, feature-adaptive observations for saliency measure computation. We further impose a spatial prior term on each of the two contrast measures, which constrains pixels rendered salient to be compact and also centered in image domain. By fusing complementary contrast measures in such a pixelwise adaptive manner, the detection effectiveness is significantly boosted. Without requiring reliable region segmentation or post-relaxation, PISA exploits an efficient edge-aware image representation and filtering technique and produces spatially coherent yet detail-preserving saliency maps. Extensive experiments on three public datasets demonstrate PISA's superior detection accuracy and competitive runtime speed over the state-of-the-art approaches.

3 0.90710455 376 cvpr-2013-Salient Object Detection: A Discriminative Regional Feature Integration Approach

Author: Huaizu Jiang, Jingdong Wang, Zejian Yuan, Yang Wu, Nanning Zheng, Shipeng Li

Abstract: Salient object detection has been attracting a lot of interest, and recently various heuristic computational models have been designed. In this paper, we regard saliency map computation as a regression problem. Our method, which is based on multi-level image segmentation, uses the supervised learning approach to map the regional feature vector to a saliency score, and finally fuses the saliency scores across multiple levels, yielding the saliency map. The contributions lie in two-fold. One is that we show our approach, which integrates the regional contrast, regional property and regional backgroundness descriptors together to form the master saliency map, is able to produce superior saliency maps to existing algorithms most of which combine saliency maps heuristically computed from different types of features. The other is that we introduce a new regional feature vector, backgroundness, to characterize the background, which can be regarded as a counterpart of the objectness descriptor [2]. The performance evaluation on several popular benchmark data sets validates that our approach outperforms existing state-of-the-arts.

4 0.87693882 375 cvpr-2013-Saliency Detection via Graph-Based Manifold Ranking

Author: Chuan Yang, Lihe Zhang, Huchuan Lu, Xiang Ruan, Ming-Hsuan Yang

Abstract: Most existing bottom-up methods measure the foreground saliency of a pixel or region based on its contrast within a local context or the entire image, whereas a few methods focus on segmenting out background regions and thereby salient objects. Instead of considering the contrast between the salient objects and their surrounding regions, we consider both foreground and background cues in a different way. We rank the similarity of the image elements (pixels or regions) with foreground cues or background cues via graph-based manifold ranking. The saliency of the image elements is defined based on their relevances to the given seeds or queries. We represent the image as a close-loop graph with superpixels as nodes. These nodes are ranked based on the similarity to background and foreground queries, based on affinity matrices. Saliency detection is carried out in a two-stage scheme to extract background regions and foreground salient objects efficiently. Experimental results on two large benchmark databases demonstrate the proposed method performs well against the state-of-the-art methods in terms of accuracy and speed. We also create a more difficult benchmark database containing 5,172 images to test the proposed saliency model and make this database publicly available with this paper for further studies in the saliency field.

5 0.85367996 202 cvpr-2013-Hierarchical Saliency Detection

Author: Qiong Yan, Li Xu, Jianping Shi, Jiaya Jia

Abstract: When dealing with objects with complex structures, saliency detection confronts a critical problem, namely that detection accuracy could be adversely affected if salient foreground or background in an image contains small-scale high-contrast patterns. This issue is common in natural images and forms a fundamental challenge for prior methods. We tackle it from a scale point of view and propose a multi-layer approach to analyze saliency cues. The final saliency map is produced in a hierarchical model. Different from varying patch sizes or downsizing images, our scale-based region handling is by finding saliency values optimally in a tree model. Our approach improves saliency detection on many images that cannot be handled well traditionally. A new dataset is also constructed.

6 0.84131789 411 cvpr-2013-Statistical Textural Distinctiveness for Salient Region Detection in Natural Images

7 0.75547302 418 cvpr-2013-Submodular Salient Region Detection

8 0.74345845 273 cvpr-2013-Looking Beyond the Image: Unsupervised Learning for Object Saliency and Detection

9 0.74295789 258 cvpr-2013-Learning Video Saliency from Human Gaze Using Candidate Selection

10 0.49110892 263 cvpr-2013-Learning the Change for Automatic Image Cropping

11 0.4409219 464 cvpr-2013-What Makes a Patch Distinct?

12 0.41609341 450 cvpr-2013-Unsupervised Joint Object Discovery and Segmentation in Internet Images

13 0.3144339 205 cvpr-2013-Hollywood 3D: Recognizing Actions in 3D Natural Scenes

14 0.30579889 200 cvpr-2013-Harvesting Mid-level Visual Concepts from Large-Scale Internet Images

15 0.29706302 157 cvpr-2013-Exploring Implicit Image Statistics for Visual Representativeness Modeling

16 0.27602804 325 cvpr-2013-Part Discovery from Partial Correspondence

17 0.2066936 291 cvpr-2013-Motionlets: Mid-level 3D Parts for Human Motion Recognition

18 0.16367993 321 cvpr-2013-PDM-ENLOR: Learning Ensemble of Local PDM-Based Regressions

19 0.1593888 384 cvpr-2013-Segment-Tree Based Cost Aggregation for Stereo Matching

20 0.15634128 416 cvpr-2013-Studying Relationships between Human Gaze, Description, and Computer Vision


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(1, 0.211), (10, 0.076), (16, 0.014), (26, 0.034), (33, 0.223), (67, 0.169), (69, 0.049), (87, 0.067)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.81655616 374 cvpr-2013-Saliency Aggregation: A Data-Driven Approach

Author: Long Mai, Yuzhen Niu, Feng Liu

Abstract: A variety of methods have been developed for visual saliency analysis. These methods often complement each other. This paper addresses the problem of aggregating various saliency analysis methods such that the aggregation result outperforms each individual one. We have two major observations. First, different methods perform differently in saliency analysis. Second, the performance of a saliency analysis method varies with individual images. Our idea is to use data-driven approaches to saliency aggregation that appropriately consider the performance gaps among individual methods and the performance dependence of each method on individual images. This paper discusses various data-driven approaches and finds that the image-dependent aggregation method works best. Specifically, our method uses a Conditional Random Field (CRF) framework for saliency aggregation that not only models the contribution from each individual saliency map but also the interaction between neighboring pixels. To account for the dependence of aggregation on an individual image, our approach selects a subset of images similar to the input image from a training data set and trains the CRF aggregation model using only this subset instead of the whole training set. Our experiments on public saliency benchmarks show that our aggregation method outperforms each individual saliency method and is robust to the selection of aggregated methods.

2 0.8095417 103 cvpr-2013-Decoding Children's Social Behavior

Author: James M. Rehg, Gregory D. Abowd, Agata Rozga, Mario Romero, Mark A. Clements, Stan Sclaroff, Irfan Essa, Opal Y. Ousley, Yin Li, Chanho Kim, Hrishikesh Rao, Jonathan C. Kim, Liliana Lo Presti, Jianming Zhang, Denis Lantsman, Jonathan Bidwell, Zhefan Ye

Abstract: We introduce a new problem domain for activity recognition: the analysis of children ’s social and communicative behaviors based on video and audio data. We specifically target interactions between children aged 1–2 years and an adult. Such interactions arise naturally in the diagnosis and treatment of developmental disorders such as autism. We introduce a new publicly-available dataset containing over 160 sessions of a 3–5 minute child-adult interaction. In each session, the adult examiner followed a semistructured play interaction protocol which was designed to elicit a broad range of social behaviors. We identify the key technical challenges in analyzing these behaviors, and describe methods for decoding the interactions. We present experimental results that demonstrate the potential of the dataset to drive interesting research questions, and show preliminary results for multi-modal activity recognition.

3 0.80644614 398 cvpr-2013-Single-Pedestrian Detection Aided by Multi-pedestrian Detection

Author: Wanli Ouyang, Xiaogang Wang

Abstract: In this paper, we address the challenging problem of detecting pedestrians who appear in groups and have interaction. A new approach is proposed for single-pedestrian detection aided by multi-pedestrian detection. A mixture model of multi-pedestrian detectors is designed to capture the unique visual cues which are formed by nearby multiple pedestrians but cannot be captured by single-pedestrian detectors. A probabilistic framework is proposed to model the relationship between the configurations estimated by single- and multi-pedestrian detectors, and to refine the single-pedestrian detection result with multi-pedestrian detection. It can integrate with any single-pedestrian detector without significantly increasing the computation load. 15 state-of-the-art single-pedestrian detection approaches are investigated on three widely used public datasets: Caltech, TUD-Brussels and ETH. Experimental results show that our framework significantly improves all these approaches. The average improvement is 9% on the Caltech-Test dataset, 11% on the TUD-Brussels dataset and 17% on the ETH dataset in terms of average miss rate. The lowest average miss rate is reduced from 48% to 43% on the Caltech-Test dataset, from 55% to 50% on the TUD-Brussels dataset and from 51% to 41% on the ETH dataset.

4 0.80601025 383 cvpr-2013-Seeking the Strongest Rigid Detector

Author: Rodrigo Benenson, Markus Mathias, Tinne Tuytelaars, Luc Van_Gool

Abstract: The current state of the art solutions for object detection describe each class by a set of models trained on discovered sub-classes (so called "components"), with each model itself composed of collections of interrelated parts (deformable models). These detectors build upon the now classic Histogram of Oriented Gradients+linear SVM combo. In this paper we revisit some of the core assumptions in HOG+SVM and show that by properly designing the feature pooling, feature selection, preprocessing, and training methods, it is possible to reach top quality, at least for pedestrian detections, using a single rigid component. We provide experiments for a large design space, that give insights into the design of classifiers, as well as relevant information for practitioners. Our best detector is fully feed-forward, has a single unified architecture, uses only histograms of oriented gradients and colour information in monocular static images, and improves over 23 other methods on the INRIA, ETH and Caltech-USA datasets, reducing the average miss-rate over HOG+SVM by more than 30%.

5 0.80558527 160 cvpr-2013-Face Recognition in Movie Trailers via Mean Sequence Sparse Representation-Based Classification

Author: Enrique G. Ortiz, Alan Wright, Mubarak Shah

Abstract: This paper presents an end-to-end video face recognition system, addressing the difficult problem of identifying a video face track using a large dictionary of still face images of a few hundred people, while rejecting unknown individuals. A straightforward application of the popular ℓ1-minimization for face recognition on a frame-by-frame basis is prohibitively expensive, so we propose a novel algorithm Mean Sequence SRC (MSSRC) that performs video face recognition using a joint optimization leveraging all of the available video data and the knowledge that the face track frames belong to the same individual. By adding a strict temporal constraint to the ℓ1-minimization that forces individual frames in a face track to all reconstruct a single identity, we show the optimization reduces to a single minimization over the mean of the face track. We also introduce a new Movie Trailer Face Dataset collected from 101 movie trailers on YouTube. Finally, we show that our method matches or outperforms the state-of-the-art on three existing datasets (YouTube Celebrities, YouTube Faces, and Buffy) and our unconstrained Movie Trailer Face Dataset. More importantly, our method excels at rejecting unknown identities by at least 8% in average precision.

6 0.80547011 45 cvpr-2013-Articulated Pose Estimation Using Discriminative Armlet Classifiers

7 0.80471992 2 cvpr-2013-3D Pictorial Structures for Multiple View Articulated Pose Estimation

8 0.80163032 275 cvpr-2013-Lp-Norm IDF for Large Scale Image Search

9 0.79746753 142 cvpr-2013-Efficient Detector Adaptation for Object Detection in a Video

10 0.79461515 375 cvpr-2013-Saliency Detection via Graph-Based Manifold Ranking

11 0.79347664 345 cvpr-2013-Real-Time Model-Based Rigid Object Pose Estimation and Tracking Combining Dense and Sparse Visual Cues

12 0.79330677 246 cvpr-2013-Learning Binary Codes for High-Dimensional Data Using Bilinear Projections

13 0.78906482 339 cvpr-2013-Probabilistic Graphlet Cut: Exploiting Spatial Structure Cue for Weakly Supervised Image Segmentation

14 0.78702062 254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection

15 0.78381336 119 cvpr-2013-Detecting and Aligning Faces by Image Retrieval

16 0.7766695 288 cvpr-2013-Modeling Mutual Visibility Relationship in Pedestrian Detection

17 0.77407438 122 cvpr-2013-Detection Evolution with Multi-order Contextual Co-occurrence

18 0.76592797 322 cvpr-2013-PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Spatial Priors

19 0.76456434 438 cvpr-2013-Towards Pose Robust Face Recognition

20 0.76278245 60 cvpr-2013-Beyond Physical Connections: Tree Models in Human Pose Estimation