iccv iccv2013 iccv2013-137 knowledge-graph by maker-knowledge-mining

137 iccv-2013-Efficient Salient Region Detection with Soft Image Abstraction


Source: pdf

Author: Ming-Ming Cheng, Jonathan Warrell, Wen-Yan Lin, Shuai Zheng, Vibhav Vineet, Nigel Crook

Abstract: Detecting visually salient regions in images is one of the fundamental problems in computer vision. We propose a novel method to decompose an image into large scale perceptually homogeneous elements for efficient salient region detection, using a soft image abstraction representation. By considering both appearance similarity and spatial distribution of image pixels, the proposed representation abstracts out unnecessary image details, allowing the assignment of comparable saliency values across similar regions, and producing perceptually accurate salient region detection. We evaluate our salient region detection approach on the largest publicly available dataset with pixel accurate annotations. The experimental results show that the proposed method outperforms 18 alternate methods, reducing the mean absolute error by 25.2% compared to the previous best result, while being computationally more efficient.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Efficient Salient Region Detection with Soft Image Abstraction Ming-Ming Cheng Jonathan Warrell Wen-Yan Lin Shuai Zheng Vibhav Vineet Vision Group, Oxford Brookes University Nigel Crook Abstract Detecting visually salient regions in images is one of the fundamental problems in computer vision. [sent-1, score-0.359]

2 We propose a novel method to decompose an image into large scale perceptually homogeneous elements for efficient salient region detection, using a soft image abstraction representation. [sent-2, score-1.452]

3 By considering both appearance similarity and spatial distribution of image pixels, the proposed representation abstracts out unnecessary image details, allowing the assignment of comparable saliency values across similar regions, and producing perceptually accurate salient region detection. [sent-3, score-1.469]

4 We evaluate our salient region detection approach on the largest publicly available dataset with pixel accurate annotations. [sent-4, score-0.454]

5 The experimental results show that the proposed method outperforms 18 alternate methods, reducing the mean absolute error by 25.2% compared to the previous best result, while being computationally more efficient. [sent-5, score-0.041]

6 Introduction. The automatic detection of salient object regions in images involves a soft decomposition of foreground and background image elements [7]. [sent-8, score-0.67]

7 This kind of decomposition is a key component of many computer vision and graphics tasks. [sent-9, score-0.069]

8 In terms of improving salient region detection, there are two emerging trends: • Global cues: which enable the assignment of comparable saliency values across similar image regions, and which are preferred to local cues [2, 14, 16, 26, 31, 42]. [sent-11, score-0.945]

9 • Image abstraction: where an image is decomposed into perceptually homogeneous elements, a process which abstracts out unnecessary detail and which is important for high quality saliency detection [42]. [sent-12, score-1.152]

10 We use soft image abstraction to decompose an image into large scale perceptually homogeneous elements (see Fig. [sent-14, score-1.098]

11 3), which abstract unnecessary details, assign comparable saliency values across similar image regions, and produce perceptually accurate salient region detection results (b). [sent-15, score-1.26]

12 In this paper, we propose a novel soft image abstraction approach that captures large scale perceptually homogeneous elements, thus enabling effective estimation of global saliency cues. [sent-16, score-1.4]

13 Unlike previous techniques that rely on super-pixels for image abstraction [42], we use histogram quantization to collect appearance samples for a global Gaussian Mixture Model (GMM) based decomposition. [sent-17, score-0.434]

14 Components sharing the same spatial support are further grouped to provide a more compact and meaningful presentation. [sent-18, score-0.113]

15 This soft abstraction avoids the hard decision boundaries of super pixels, allowing abstraction components with very large spatial support. [sent-19, score-0.858]

16 This allows the subsequent global saliency cues to uniformly highlight entire salient object regions. [sent-20, score-0.98]

17 Finally, we integrate the two global saliency cues, Global Uniqueness (GU) and Color Spatial Distribution (CSD), by automatically identifying which one is more likely to provide the correct identification of the salient region. [sent-21, score-0.714]

18 We extensively evaluate our salient object region detection method on the largest publicly available dataset with 1000 images containing pixel accurate salient region annotations [2]. [sent-22, score-0.808]

19 The evaluation results show that each of our individual measures (GU and CSD) significantly outperforms the 18 existing alternate approaches, and the final Global Cues (GC) saliency map reduces the mean absolute error by 25.2%. [sent-23, score-0.388]

20 1Results for these methods on the entire dataset and our prototype software can be found on our project page: http://mmcheng. [sent-26, score-0.038]

21 Related work While often treated as an image processing operation, saliency has its roots within human perception. [sent-34, score-0.377]

22 A considerable amount of research in cognitive psychology [48] and neurobiology [18] has been devoted to discovering the mechanisms of visual attention in humans [38]. [sent-36, score-0.081]

23 These regions where attention is focused are termed salient regions. [sent-37, score-0.404]

24 Insights from psycho-visual research have influenced computational saliency detection methods, resulting in significant improvements in performance [5]. [sent-38, score-0.403]

25 Our research is situated in the highly active field of visual attention modelling. [sent-39, score-0.075]

26 We refer interested readers to recent survey papers for a detailed discussion of 65 models [5], as well as quantitative analysis of different methods in the two major research directions: salient object region detection [7] and human fixation prediction [6, 32]. [sent-41, score-0.461]

27 Here, we mainly focus on discussing bottom-up, low-level salient object region detection methods. [sent-42, score-0.443]

28 Inspired by the early representation model of Koch and Ullman [35], Itti et al. [sent-43, score-0.057]

29 [30] proposed highly influential computational methods, which use local centre-surrounded differences across multi-scale image features to detect image saliency. [sent-44, score-0.057]

30 A large number of methods have been proposed to extend this method, including the fuzzy growing method by Ma and Zhang [37], and graph-based visual saliency detection by Harel et al. [sent-45, score-0.431]

31 Later, Hou and Zhang [28] proposed an interesting spectral-based method, which finds differential components in the spectral domain. [sent-47, score-0.115]

32 Methods modeling global properties have become popular recently as they enable the assignment of comparable saliency values across similar image regions, and thus can uniformly highlight the entire object regions [16]. [sent-51, score-0.741]

33 [25] use a patch based approach to incorporate global properties. [sent-53, score-0.079]

34 [47] estimate saliency over the whole image relative to a large dictionary of images. [sent-55, score-0.347]

35 [36] measure center-surrounded histograms over windows of various sizes and aspect ratios in a sliding window manner, and learn the combination weights relative to other saliency cues. [sent-57, score-0.347]

36 While these algorithms are generally better at preserving global image structures and are able to highlight entire salient object regions, they suffer from high computational complexity. [sent-58, score-0.48]

37 Finding efficient and compact representations has been shown to be a promising way of modeling global considerations. [sent-59, score-0.079]

38 Initial efforts tried to adopt only luminance [50] or first-order average color [2] to effectively estimate consistent results. [sent-60, score-0.058]

39 However, they ignored complex color variations in natural images and spatial relationships across image parts. [sent-61, score-0.139]

40 [16] proposed a region contrast-based method to model global contrast, showing significantly improved performance. [sent-63, score-0.145]

41 However, due to the use of image segments, saliency cues like spatial distribution cannot be easily formulated. [sent-64, score-0.48]

42 [42] made the important observation that decomposing an image into perceptually homogeneous elements, which abstract unnecessary details, is important for high quality salient object detection. [sent-66, score-0.949]

43 They used superpixels to abstract the image into perceptually uniform regions and efficient N-D Gaussian filtering to estimate global saliency cues. [sent-67, score-0.93]

44 As detailed in §3, we propose a GMM based abstract saliency representation to capture large scale perceptually homogeneous elements, resulting in the efficient evaluation of global cues and improved salient object region detection accuracy. [sent-68, score-1.166]

45 Histogram based efficient GMM decomposition. In order to get an abstract global representation which effectively captures perceptually homogeneous elements, we cluster image colors and represent them using Gaussian Mixture Models (GMM). [sent-72, score-0.88]

46 Each pixel color Ix is represented as a weighted combination of several GMM components, with its probability of belonging to a component c given by: p(c|Ix) = ωc N(Ix | μc, Σc) / Σc′ ωc′ N(Ix | μc′, Σc′). [sent-73, score-0.302]

47 Here ωc, μc and Σc represent respectively the weight, mean color, and covariance matrix of the cth component. [sent-75, score-0.067]
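This component posterior can be sketched in NumPy as follows (a toy illustration with made-up parameters, not the values learned by the method):

```python
import numpy as np

def gmm_posterior(x, weights, means, covs):
    """p(c | x): posterior probability of color x under each GMM component."""
    k = len(weights)
    likes = np.empty(k)
    for c in range(k):
        d = x - means[c]
        norm = 1.0 / np.sqrt(((2 * np.pi) ** len(x)) * np.linalg.det(covs[c]))
        likes[c] = weights[c] * norm * np.exp(-0.5 * d @ np.linalg.solve(covs[c], d))
    return likes / likes.sum()   # normalize over components

# Two toy components in RGB space (illustrative parameters only)
weights = np.array([0.6, 0.4])
means = np.array([[200.0, 30.0, 30.0], [40.0, 40.0, 200.0]])
covs = np.stack([np.eye(3) * 100.0, np.eye(3) * 100.0])
p = gmm_posterior(np.array([190.0, 35.0, 35.0]), weights, means, covs)
# the reddish pixel is dominated by the first (reddish) component
```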

48 We use the GMM to decompose an image into perceptually homogenous elements. [sent-76, score-0.505]

49 These elements are structurally representative and abstract away unnecessary details. [sent-77, score-0.241]

50 Notice that our GMM-based representation better captures large scale perceptually homogeneous elements than superpixel representations (as in Fig. [sent-80, score-0.749]

51 We will discuss how our global homogeneous components representation benefits global saliency cue estimation in §4. [sent-82, score-0.868]

52 The key step in building the GMM-based representation is clustering pixel colors and fitting them to each GMM component. [sent-84, score-0.225]

53 Such clustering can be achieved using Orchard and Bouman’s algorithm [40], which starts with all pixels in a single cluster and iteratively uses the eigenvalues and eigenvector of the covariance matrix to decide which cluster to split and the splitting point. [sent-85, score-0.251]
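A single split step of this scheme can be sketched as follows (a simplified illustration of the idea, not Orchard and Bouman's full algorithm, which also maintains per-cluster statistics incrementally):

```python
import numpy as np

def split_cluster(colors, weights):
    """Split a weighted color cluster along the principal eigenvector of its
    covariance matrix, thresholding at the projection of the cluster mean."""
    w = weights / weights.sum()
    mean = (colors * w[:, None]).sum(axis=0)
    d = colors - mean
    cov = (w[:, None, None] * d[:, :, None] * d[:, None, :]).sum(axis=0)
    _, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    axis = eigvecs[:, -1]                # direction of largest color variance
    return colors @ axis <= mean @ axis  # boolean split of the samples

colors = np.array([[0.0, 0.0, 0.0], [10.0, 10.0, 10.0],
                   [200.0, 200.0, 200.0], [210.0, 210.0, 210.0]])
side = split_cluster(colors, np.ones(4))
# dark samples land on one side of the split, bright samples on the other
```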

54 Inspired by [16], we first run color quantization in RGB color space with each color channel divided into 12 uniform parts, and choose the most frequently occurring colors which account for 95% of the image pixels. [sent-86, score-0.302]

55 This typically results in a histogram based representation with N bins (on average N = 85 for the 1000-image dataset [2], as reported by [16]). [sent-87, score-0.114]

56 We take each histogram bin as a weighted color sample to build the color covariance matrix and learn the remaining parameters of the GMM (the means and probabilities for belonging to each component) from the weighted bins. [sent-88, score-0.416]
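The quantization and weighted-sample step can be sketched as follows (a minimal reading of the description above; the function and variable names are ours, not the paper's):

```python
import numpy as np

def quantized_color_samples(img, bins_per_channel=12, coverage=0.95):
    """Coarsely quantize RGB colors and keep the most frequent bins that
    together cover `coverage` of the pixels, as weighted color samples."""
    pixels = img.reshape(-1, 3).astype(np.float64)
    q = np.minimum((pixels * bins_per_channel / 256.0).astype(int),
                   bins_per_channel - 1)
    idx = (q[:, 0] * bins_per_channel + q[:, 1]) * bins_per_channel + q[:, 2]
    counts = np.bincount(idx, minlength=bins_per_channel ** 3)
    order = np.argsort(counts)[::-1]                  # bins by frequency
    cum = np.cumsum(counts[order])
    n_keep = np.searchsorted(cum, coverage * len(pixels)) + 1
    keep = [b for b in order[:n_keep] if counts[b] > 0]
    # mean color of each kept bin, weighted by its pixel count
    means = np.stack([pixels[idx == b].mean(axis=0) for b in keep])
    return means, counts[keep]

# toy image: 60 red pixels and 40 blue pixels -> two weighted samples
img = np.zeros((10, 10, 3))
img[:6] = [255.0, 0.0, 0.0]
img[6:] = [0.0, 0.0, 255.0]
means, bin_weights = quantized_color_samples(img)
```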

57 3) to associate image pixels with histogram bins for computational efficiency. [sent-90, score-0.164]

58 Spatial overlap based components clustering. Direct GMM based color clustering ignores valuable spatial correlations in images. [sent-93, score-0.389]

59 3(a), the 0th and 6th GMM components have similar spatial supports, and thus have high probability of belonging to the same object, even when their colors (shown in side of Fig. [sent-95, score-0.397]

60 We explore the potential of such spatial relations to build pairwise correlation be- tween GMM components as illustrated in Fig. [sent-97, score-0.199]

61 3(b), where the correlation of two GMM components ci and cj is defined as their spatial agreement: C(ci, cj) = Σx min(p(ci|Ix), p(cj|Ix)) / min(Σx p(ci|Ix), Σx p(cj|Ix)). [sent-98, score-0.199]

62 The probability of a pixel Ix belonging to each GMM component is typically sparse [sent-109, score-0.286]

63 (a) Illustration of our soft image abstraction (b)Cor elation(c)Cluster dGM (d)Abstractionby[42] Figure 3. [sent-116, score-0.387]

64 Example of global components representation for the source image shown in Fig. [sent-117, score-0.251]

65 Components in our GMM based representation (a) are further clustered according to their spatial correlations (b) to get a more meaningful global representation (c), which better captures perceptually homogeneous regions. [sent-119, score-0.92]

66 In (a), the 0-6th sub-images represent the probabilities of image pixels belonging to each GMM component, while the last subimage shows a reconstruction using these GMM components. [sent-120, score-0.24]

67 An abstract representation by [42] using superpixels is shown in (d). [sent-121, score-0.09]

68 (with a very high probability of belonging to one of the top two components). [sent-122, score-0.163]

69 This allows us to scan the image once and find all the pairwise component correlations simultaneously. [sent-123, score-0.084]

70 For every pixel, we only choose the top two components with the highest probability and make this pixel only contribute to these two components. [sent-124, score-0.195]

71 In the implementation, we blur the probability maps of the image pixels belonging to each component by a 3×3 uniform kernel to allow the correlation calculation to consider a small surrounding neighborhood. [sent-125, score-0.269]
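One plausible reading of this computation can be sketched in pure NumPy (our sketch; the exact normalization in the paper may differ): blur each membership map with a 3×3 box filter, then accumulate the min-overlap of each pixel's top-two components.

```python
import numpy as np

def box3(p):
    """3x3 box blur with edge replication."""
    pad = np.pad(p, 1, mode="edge")
    h, w = p.shape
    return sum(pad[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

def component_correlation(prob_maps):
    """Pairwise spatial-agreement matrix between component membership maps.
    prob_maps: (K, H, W), per-pixel probabilities of each GMM component."""
    k = prob_maps.shape[0]
    flat = np.stack([box3(p) for p in prob_maps]).reshape(k, -1)
    top2 = np.argsort(flat, axis=0)[-2:]      # top-2 components per pixel
    corr = np.zeros((k, k))
    for x in range(flat.shape[1]):
        i, j = top2[0, x], top2[1, x]
        m = min(flat[i, x], flat[j, x])       # overlap contributed by pixel x
        corr[i, j] += m
        corr[j, i] += m
    return corr

# components 0 and 1 share the left half; component 2 owns the right half
prob = np.zeros((3, 4, 6))
prob[0, :, :3] = 0.6
prob[1, :, :3] = 0.4
prob[2, :, 3:] = 1.0
C = component_correlation(prob)
# components with the same spatial support correlate most strongly
```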

72 The correlation matrix of these GMM components is taken as their similarity for message-passing based clustering [22]. [sent-126, score-0.204]

73 We use message-passing based clustering as it does not need a predefined cluster number, making it applicable to an unknown underlying distribution. [sent-127, score-0.117]

74 After such clustering, the probability of each pixel color Ix belonging to each cluster C is the sum of its probabilities of belonging to all GMM components c of its cluster: p(C|Ix) = p(C|Ib) = Σc∈C p(c|Ib). (3) [sent-128, score-0.637]

75 Here Ib is the quantized histogram bin color of Ix. [sent-130, score-0.149]
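Equation (3) reduces to summing columns of the component-posterior matrix over each cluster's members (a minimal sketch; the labels would come from the message-passing clustering step):

```python
import numpy as np

def cluster_posteriors(comp_probs, labels):
    """p(C | I_x): sum per-pixel component posteriors over each cluster.
    comp_probs: (N, K) probabilities of N pixels under K GMM components;
    labels: (K,) cluster id assigned to each component."""
    clusters = np.unique(labels)
    return np.stack([comp_probs[:, labels == c].sum(axis=1)
                     for c in clusters], axis=1)

comp_probs = np.array([[0.7, 0.2, 0.1],
                       [0.1, 0.5, 0.4]])
labels = np.array([0, 0, 1])       # components 0 and 1 merged into cluster 0
p_C = cluster_posteriors(comp_probs, labels)
# rows still sum to 1: [[0.9, 0.1], [0.6, 0.4]]
```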

76 3(c), we demonstrate an example of clustering a GMM of 7 initial components to 3 clusters with more homogenous semantic relations. [sent-132, score-0.263]

77 In this example, although the red flower and its dark shadows have quite dissimilar colors, they are successfully grouped together since they cover approximately the same spatial region. [sent-133, score-0.084]

78 Fig. 3 is a toy example for easier illustration, and we use 15 initial GMM components in order to capture images with more complicated structure in our final implementation. [sent-135, score-0.115]

79 Hierarchical representation and indexing. The proposed representation forms a 4-layer hierarchical structure with an index table to associate cross-layer relations efficiently. [sent-138, score-0.185]

80 The 0th layer contains all the image pixels, thus allowing us to generate full resolution saliency maps. [sent-139, score-0.516]

81 When estimating global saliency cues in §4, we mainly work at the higher layers in order to allow large scale perceptually homogenous elements to receive similar saliency values, and to speed up the computation time. [sent-141, score-1.581]

82 In the hierarchical representation, only the 0th layer contains a large number of elements (the same as image pixels). [sent-142, score-0.24]
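How such an index table lets saliency computed at a high layer be propagated back to full resolution can be sketched as follows (our illustrative names; the paper's actual layers are pixels, histogram bins, GMM components, and clustered components):

```python
import numpy as np

def full_res_saliency(pixel_bin, bin_comp_probs, comp_labels, cluster_saliency):
    """Propagate per-cluster saliency down through the index tables:
    cluster -> component -> histogram bin -> pixel."""
    clusters = np.unique(comp_labels)
    # (n_bins, n_clusters): probability of each bin under each cluster
    bin_cluster = np.stack([bin_comp_probs[:, comp_labels == c].sum(axis=1)
                            for c in clusters], axis=1)
    bin_saliency = bin_cluster @ cluster_saliency   # saliency per bin
    return bin_saliency[pixel_bin]                  # full-resolution map

pixel_bin = np.array([[0, 0], [1, 1]])              # 2x2 image, bin per pixel
bin_comp_probs = np.array([[0.9, 0.1],              # bin 0: mostly component 0
                           [0.2, 0.8]])             # bin 1: mostly component 1
comp_labels = np.array([0, 1])                      # trivial 1:1 clustering
sal = full_res_saliency(pixel_bin, bin_comp_probs, comp_labels,
                        np.array([1.0, 0.0]))       # cluster 0 is salient
# top row of the map is high (0.9), bottom row low (0.2)
```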


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('gmm', 0.458), ('perceptually', 0.367), ('saliency', 0.347), ('salient', 0.288), ('abstraction', 0.269), ('homogeneous', 0.191), ('layer', 0.135), ('belonging', 0.127), ('soft', 0.118), ('components', 0.115), ('elements', 0.105), ('unnecessary', 0.103), ('ix', 0.103), ('homogenous', 0.09), ('abstracts', 0.088), ('cues', 0.08), ('global', 0.079), ('highlight', 0.075), ('gc', 0.074), ('csd', 0.073), ('regions', 0.071), ('colors', 0.066), ('region', 0.066), ('ib', 0.063), ('cluster', 0.059), ('color', 0.058), ('clustering', 0.058), ('representation', 0.057), ('histogram', 0.057), ('detection', 0.056), ('spatial', 0.053), ('fixation', 0.051), ('decompose', 0.048), ('correlations', 0.047), ('attention', 0.045), ('pixel', 0.044), ('probabilities', 0.043), ('alternate', 0.041), ('clustered', 0.04), ('orchard', 0.039), ('oscseale', 0.039), ('brookes', 0.039), ('ility', 0.039), ('ncn', 0.039), ('nigel', 0.039), ('representat', 0.039), ('shuai', 0.039), ('covariance', 0.039), ('uniformly', 0.038), ('assignment', 0.038), ('entire', 0.038), ('component', 0.037), ('associate', 0.037), ('neurobiology', 0.036), ('esult', 0.036), ('crook', 0.036), ('mss', 0.036), ('ooppyyrriigghhtt', 0.036), ('pixels', 0.036), ('cheng', 0.036), ('gu', 0.036), ('probability', 0.036), ('subsequent', 0.035), ('bin', 0.034), ('efodr', 0.034), ('subimage', 0.034), ('pci', 0.034), ('bouman', 0.034), ('warrell', 0.034), ('indexing', 0.034), ('allowing', 0.034), ('superpixels', 0.033), ('uniform', 0.033), ('structurally', 0.033), ('discussing', 0.033), ('ullman', 0.033), ('decomposition', 0.032), ('wtoe', 0.031), ('jonathan', 0.031), ('perazzi', 0.031), ('grouped', 0.031), ('correlation', 0.031), ('layers', 0.031), ('seg', 0.03), ('roots', 0.03), ('benefiting', 0.03), ('ofvisual', 0.03), ('quantization', 0.029), ('influential', 0.029), ('harel', 0.029), ('captures', 0.029), ('meaningful', 0.029), ('ns', 0.029), ('cor', 0.028), ('ttoo', 0.028), ('goferman', 0.028), ('cth', 0.028), ('across', 0.028), 
('fuzzy', 0.028), ('enable', 0.027)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000011 137 iccv-2013-Efficient Salient Region Detection with Soft Image Abstraction

Author: Ming-Ming Cheng, Jonathan Warrell, Wen-Yan Lin, Shuai Zheng, Vibhav Vineet, Nigel Crook

Abstract: Detecting visually salient regions in images is one of the fundamental problems in computer vision. We propose a novel method to decompose an image into large scale perceptually homogeneous elements for efficient salient region detection, using a soft image abstraction representation. By considering both appearance similarity and spatial distribution of image pixels, the proposed representation abstracts out unnecessary image details, allowing the assignment of comparable saliency values across similar regions, and producing perceptually accurate salient region detection. We evaluate our salient region detection approach on the largest publicly available dataset with pixel accurate annotations. The experimental results show that the proposed method outperforms 18 alternate methods, reducing the mean absolute error by 25.2% compared to the previous best result, while being computationally more efficient.

2 0.3728056 71 iccv-2013-Category-Independent Object-Level Saliency Detection

Author: Yangqing Jia, Mei Han

Abstract: It is known that purely low-level saliency cues such as frequency does not lead to a good salient object detection result, requiring high-level knowledge to be adopted for successful discovery of task-independent salient objects. In this paper, we propose an efficient way to combine such high-level saliency priors and low-level appearance models. We obtain the high-level saliency prior with the objectness algorithm to find potential object candidates without the need of category information, and then enforce the consistency among the salient regions using a Gaussian MRF with the weights scaled by diverse density that emphasizes the influence of potential foreground pixels. Our model obtains saliency maps that assign high scores for the whole salient object, and achieves state-of-the-art performance on benchmark datasets covering various foreground statistics.

3 0.34641117 372 iccv-2013-Saliency Detection via Dense and Sparse Reconstruction

Author: Xiaohui Li, Huchuan Lu, Lihe Zhang, Xiang Ruan, Ming-Hsuan Yang

Abstract: In this paper, we propose a visual saliency detection algorithm from the perspective of reconstruction errors. The image boundaries are first extracted via superpixels as likely cues for background templates, from which dense and sparse appearance models are constructed. For each image region, we first compute dense and sparse reconstruction errors. Second, the reconstruction errors are propagated based on the contexts obtained from K-means clustering. Third, pixel-level saliency is computed by an integration of multi-scale reconstruction errors and refined by an object-biased Gaussian model. We apply the Bayes formula to integrate saliency measures based on dense and sparse reconstruction errors. Experimental results show that the proposed algorithm performs favorably against seventeen state-of-the-art methods in terms of precision and recall. In addition, the proposed algorithm is demonstrated to be more effective in highlighting salient objects uniformly and robust to background noise.

4 0.32650298 91 iccv-2013-Contextual Hypergraph Modeling for Salient Object Detection

Author: Xi Li, Yao Li, Chunhua Shen, Anthony Dick, Anton Van_Den_Hengel

Abstract: Salient object detection aims to locate objects that capture human attention within images. Previous approaches often pose this as a problem of image contrast analysis. In this work, we model an image as a hypergraph that utilizes a set of hyperedges to capture the contextual properties of image pixels or regions. As a result, the problem of salient object detection becomes one of finding salient vertices and hyperedges in the hypergraph. The main advantage of hypergraph modeling is that it takes into account each pixel’s (or region ’s) affinity with its neighborhood as well as its separation from image background. Furthermore, we propose an alternative approach based on centerversus-surround contextual contrast analysis, which performs salient object detection by optimizing a cost-sensitive support vector machine (SVM) objective function. Experimental results on four challenging datasets demonstrate the effectiveness of the proposed approaches against the stateof-the-art approaches to salient object detection.

5 0.25776908 50 iccv-2013-Analysis of Scores, Datasets, and Models in Visual Saliency Prediction

Author: Ali Borji, Hamed R. Tavakoli, Dicky N. Sihite, Laurent Itti

Abstract: Significant recent progress has been made in developing high-quality saliency models. However, less effort has been undertaken on fair assessment of these models, over large standardized datasets and correctly addressing confounding factors. In this study, we pursue a critical and quantitative look at challenges (e.g., center-bias, map smoothing) in saliency modeling and the way they affect model accuracy. We quantitatively compare 32 state-of-the-art models (using the shuffled AUC score to discount center-bias) on 4 benchmark eye movement datasets, for prediction of human fixation locations and scanpath sequence. We also account for the role of map smoothing. We find that, although model rankings vary, some (e.g., AWS, LG, AIM, and HouNIPS) consistently outperform other models over all datasets. Some models work well for prediction of both fixation locations and scanpath sequence (e.g., Judd, GBVS). Our results show low prediction accuracy for models over emotional stimuli from the NUSEF dataset. Our last benchmark, for the first time, gauges the ability of models to decode the stimulus category from statistics of fixations, saccades, and model saliency values at fixated locations. In this test, ITTI and AIM models win over other models. Our benchmark provides a comprehensive high-level picture of the strengths and weaknesses of many popular models, and suggests future research directions in saliency modeling.

6 0.24181257 373 iccv-2013-Saliency and Human Fixations: State-of-the-Art and Study of Comparison Metrics

7 0.23842002 374 iccv-2013-Salient Region Detection by UFO: Uniqueness, Focusness and Objectness

8 0.21899962 371 iccv-2013-Saliency Detection via Absorbing Markov Chain

9 0.21864042 396 iccv-2013-Space-Time Robust Representation for Action Recognition

10 0.20348679 370 iccv-2013-Saliency Detection in Large Point Sets

11 0.19361787 369 iccv-2013-Saliency Detection: A Boolean Map Approach

12 0.16510355 217 iccv-2013-Initialization-Insensitive Visual Tracking through Voting with Salient Local Features

13 0.15141842 287 iccv-2013-Neighbor-to-Neighbor Search for Fast Coding of Feature Vectors

14 0.12976927 381 iccv-2013-Semantically-Based Human Scanpath Estimation with HMMs

15 0.10680857 56 iccv-2013-Automatic Registration of RGB-D Scans via Salient Directions

16 0.09800192 411 iccv-2013-Symbiotic Segmentation and Part Localization for Fine-Grained Categorization

17 0.092986889 74 iccv-2013-Co-segmentation by Composition

18 0.089549512 439 iccv-2013-Video Co-segmentation for Meaningful Action Extraction

19 0.083593845 420 iccv-2013-Topology-Constrained Layered Tracking with Latent Flow

20 0.074682236 25 iccv-2013-A Novel Earth Mover's Distance Methodology for Image Matching with Gaussian Mixture Models


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.183), (1, -0.044), (2, 0.386), (3, -0.193), (4, -0.123), (5, 0.009), (6, 0.024), (7, -0.034), (8, 0.015), (9, -0.028), (10, -0.002), (11, 0.027), (12, 0.005), (13, -0.019), (14, -0.007), (15, -0.021), (16, 0.079), (17, -0.021), (18, -0.007), (19, 0.1), (20, -0.008), (21, -0.011), (22, 0.047), (23, -0.01), (24, -0.041), (25, -0.04), (26, 0.032), (27, -0.005), (28, -0.011), (29, 0.003), (30, -0.009), (31, 0.015), (32, 0.044), (33, 0.02), (34, 0.009), (35, 0.023), (36, 0.003), (37, -0.01), (38, -0.01), (39, 0.041), (40, -0.033), (41, 0.049), (42, -0.017), (43, -0.044), (44, -0.019), (45, -0.025), (46, -0.017), (47, -0.009), (48, -0.042), (49, 0.009)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.95225227 91 iccv-2013-Contextual Hypergraph Modeling for Salient Object Detection

Author: Xi Li, Yao Li, Chunhua Shen, Anthony Dick, Anton Van_Den_Hengel

Abstract: Salient object detection aims to locate objects that capture human attention within images. Previous approaches often pose this as a problem of image contrast analysis. In this work, we model an image as a hypergraph that utilizes a set of hyperedges to capture the contextual properties of image pixels or regions. As a result, the problem of salient object detection becomes one of finding salient vertices and hyperedges in the hypergraph. The main advantage of hypergraph modeling is that it takes into account each pixel’s (or region ’s) affinity with its neighborhood as well as its separation from image background. Furthermore, we propose an alternative approach based on centerversus-surround contextual contrast analysis, which performs salient object detection by optimizing a cost-sensitive support vector machine (SVM) objective function. Experimental results on four challenging datasets demonstrate the effectiveness of the proposed approaches against the stateof-the-art approaches to salient object detection.

same-paper 2 0.94440073 137 iccv-2013-Efficient Salient Region Detection with Soft Image Abstraction

Author: Ming-Ming Cheng, Jonathan Warrell, Wen-Yan Lin, Shuai Zheng, Vibhav Vineet, Nigel Crook

Abstract: Detecting visually salient regions in images is one of the fundamental problems in computer vision. We propose a novel method to decompose an image into large scale perceptually homogeneous elements for efficient salient region detection, using a soft image abstraction representation. By considering both appearance similarity and spatial distribution of image pixels, the proposed representation abstracts out unnecessary image details, allowing the assignment of comparable saliency values across similar regions, and producing perceptually accurate salient region detection. We evaluate our salient region detection approach on the largest publicly available dataset with pixel accurate annotations. The experimental results show that the proposed method outperforms 18 alternate methods, reducing the mean absolute error by 25.2% compared to the previous best result, while being computationally more efficient.

3 0.93063211 369 iccv-2013-Saliency Detection: A Boolean Map Approach

Author: Jianming Zhang, Stan Sclaroff

Abstract: A novel Boolean Map based Saliency (BMS) model is proposed. An image is characterized by a set of binary images, which are generated by randomly thresholding the image ’s color channels. Based on a Gestalt principle of figure-ground segregation, BMS computes saliency maps by analyzing the topological structure of Boolean maps. BMS is simple to implement and efficient to run. Despite its simplicity, BMS consistently achieves state-of-the-art performance compared with ten leading methods on five eye tracking datasets. Furthermore, BMS is also shown to be advantageous in salient object detection.

4 0.91610509 71 iccv-2013-Category-Independent Object-Level Saliency Detection

Author: Yangqing Jia, Mei Han

Abstract: It is known that purely low-level saliency cues such as frequency does not lead to a good salient object detection result, requiring high-level knowledge to be adopted for successful discovery of task-independent salient objects. In this paper, we propose an efficient way to combine such high-level saliency priors and low-level appearance models. We obtain the high-level saliency prior with the objectness algorithm to find potential object candidates without the need of category information, and then enforce the consistency among the salient regions using a Gaussian MRF with the weights scaled by diverse density that emphasizes the influence of potential foreground pixels. Our model obtains saliency maps that assign high scores for the whole salient object, and achieves state-of-the-art performance on benchmark datasets covering various foreground statistics.

5 0.91582733 374 iccv-2013-Salient Region Detection by UFO: Uniqueness, Focusness and Objectness

Author: Peng Jiang, Haibin Ling, Jingyi Yu, Jingliang Peng

Abstract: The goal of saliency detection is to locate important pixels or regions in an image which attract humans ’ visual attention the most. This is a fundamental task whose output may serve as the basis for further computer vision tasks like segmentation, resizing, tracking and so forth. In this paper we propose a novel salient region detection algorithm by integrating three important visual cues namely uniqueness, focusness and objectness (UFO). In particular, uniqueness captures the appearance-derived visual contrast; focusness reflects the fact that salient regions are often photographed in focus; and objectness helps keep completeness of detected salient regions. While uniqueness has been used for saliency detection for long, it is new to integrate focusness and objectness for this purpose. In fact, focusness and objectness both provide important saliency information complementary of uniqueness. In our experiments using public benchmark datasets, we show that, even with a simple pixel level combination of the three components, the proposed approach yields significant improve- ment compared with previously reported methods.

6 0.90544826 372 iccv-2013-Saliency Detection via Dense and Sparse Reconstruction

7 0.90484202 50 iccv-2013-Analysis of Scores, Datasets, and Models in Visual Saliency Prediction

8 0.89054769 373 iccv-2013-Saliency and Human Fixations: State-of-the-Art and Study of Comparison Metrics

9 0.87485301 370 iccv-2013-Saliency Detection in Large Point Sets

10 0.86217016 371 iccv-2013-Saliency Detection via Absorbing Markov Chain

11 0.77590662 396 iccv-2013-Space-Time Robust Representation for Action Recognition

12 0.6057446 217 iccv-2013-Initialization-Insensitive Visual Tracking through Voting with Salient Local Features

13 0.50728697 381 iccv-2013-Semantically-Based Human Scanpath Estimation with HMMs

14 0.4360829 74 iccv-2013-Co-segmentation by Composition

15 0.34893006 56 iccv-2013-Automatic Registration of RGB-D Scans via Salient Directions

16 0.34866703 416 iccv-2013-The Interestingness of Images

17 0.34766841 112 iccv-2013-Detecting Irregular Curvilinear Structures in Gray Scale and Color Imagery Using Multi-directional Oriented Flux

18 0.33917081 411 iccv-2013-Symbiotic Segmentation and Part Localization for Fine-Grained Categorization

19 0.33875456 325 iccv-2013-Predicting Primary Gaze Behavior Using Social Saliency Fields

20 0.33852017 193 iccv-2013-Heterogeneous Auto-similarities of Characteristics (HASC): Exploiting Relational Information for Classification


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(1, 0.076), (2, 0.113), (7, 0.037), (26, 0.075), (31, 0.107), (35, 0.017), (42, 0.096), (45, 0.066), (57, 0.013), (64, 0.057), (73, 0.037), (89, 0.163), (97, 0.032)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.92310429 137 iccv-2013-Efficient Salient Region Detection with Soft Image Abstraction

Author: Ming-Ming Cheng, Jonathan Warrell, Wen-Yan Lin, Shuai Zheng, Vibhav Vineet, Nigel Crook

Abstract: Detecting visually salient regions in images is one of the fundamental problems in computer vision. We propose a novel method to decompose an image into large scale perceptually homogeneous elements for efficient salient region detection, using a soft image abstraction representation. By considering both appearance similarity and spatial distribution of image pixels, the proposed representation abstracts out unnecessary image details, allowing the assignment of comparable saliency values across similar regions, and producing perceptually accurate salient region detection. We evaluate our salient region detection approach on the largest publicly available dataset with pixel accurate annotations. The experimental results show that the proposed method outperforms 18 alternate methods, reducing the mean absolute error by 25.2% compared to the previous best result, while being computationally more efficient.
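The abstract's idea of scoring large-scale abstracted elements by both appearance similarity and spatial distribution can be illustrated at the component level: each component's saliency is its colour contrast to the others, with spatially close components contributing more. The sketch below assumes the soft abstraction (component colours, positions, and pixel-coverage weights) has already been computed; the Gaussian spatial weighting and `sigma_s` are assumptions, not the authors' exact formulation.

```python
import numpy as np

def component_saliency(colors, positions, weights, sigma_s=0.4):
    """Contrast-based saliency over abstracted image components.

    `colors` (N, 3): mean colour per component; `positions` (N, 2):
    mean position normalised to [0, 1]; `weights` (N,): fraction of
    image pixels each component covers.  Saliency of component i is
    the coverage- and distance-weighted colour contrast to all others.
    """
    colors = np.asarray(colors, float)
    positions = np.asarray(positions, float)
    weights = np.asarray(weights, float)
    # Pairwise colour contrast and spatial distance between components.
    dc = np.linalg.norm(colors[:, None] - colors[None, :], axis=-1)
    ds = np.linalg.norm(positions[:, None] - positions[None, :], axis=-1)
    # Nearby, large components contribute more to a component's contrast.
    w = weights[None, :] * np.exp(-ds ** 2 / (2 * sigma_s ** 2))
    sal = (w * dc).sum(axis=1)
    rng = sal.max() - sal.min()
    return (sal - sal.min()) / (rng + 1e-12)
```

A small, distinctly coloured component surrounded by a large homogeneous background receives the highest score, matching the intuition of comparable saliency values across similar regions.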

2 0.90080643 73 iccv-2013-Class-Specific Simplex-Latent Dirichlet Allocation for Image Classification

Author: Mandar Dixit, Nikhil Rasiwasia, Nuno Vasconcelos

Abstract: An extension of the latent Dirichlet allocation (LDA), denoted class-specific-simplex LDA (css-LDA), is proposed for image classification. An analysis of the supervised LDA models currently used for this task shows that the impact of class information on the topics discovered by these models is very weak in general. This implies that the discovered topics are driven by general image regularities, rather than the semantic regularities of interest for classification. To address this, we introduce a model that induces supervision in topic discovery, while retaining the original flexibility of LDA to account for unanticipated structures of interest. The proposed css-LDA is an LDA model with class supervision at the level of image features. In css-LDA topics are discovered per class, i.e. a single set of topics shared across classes is replaced by multiple class-specific topic sets. This model can be used for generative classification using the Bayes decision rule or even extended to discriminative classification with support vector machines (SVMs). A css-LDA model can endow an image with a vector of class and topic specific count statistics that are similar to the Bag-of-words (BoW) histogram. SVM-based discriminants can be learned for classes in the space of these histograms. The effectiveness of css-LDA model in both generative and discriminative classification frameworks is demonstrated through an extensive experimental evaluation, involving multiple benchmark datasets, where it is shown to outperform all existing LDA based image classification approaches.

3 0.89337516 180 iccv-2013-From Where and How to What We See

Author: S. Karthikeyan, Vignesh Jagadeesh, Renuka Shenoy, Miguel Ecksteinz, B.S. Manjunath

Abstract: Eye movement studies have confirmed that overt attention is highly biased towards faces and text regions in images. In this paper we explore a novel problem of predicting face and text regions in images using eye tracking data from multiple subjects. The problem is challenging as we aim to predict the semantics (face/text/background) only from eye tracking data without utilizing any image information. The proposed algorithm spatially clusters eye tracking data obtained in an image into different coherent groups and subsequently models the likelihood of the clusters containing faces and text using a fully connected Markov Random Field (MRF). Given the eye tracking data from a test image, it predicts potential face/head (humans, dogs and cats) and text locations reliably. Furthermore, the approach can be used to select regions of interest for further analysis by object detectors for faces and text. The hybrid eye position/object detector approach achieves better detection performance and reduced computation time compared to using only the object detection algorithm. We also present a new eye tracking dataset on 300 images selected from ICDAR, Street-view, Flickr and Oxford-IIIT Pet Dataset from 15 subjects.
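The first stage described in this abstract, spatially grouping fixations into coherent clusters, can be sketched with plain k-means; the MRF that subsequently labels clusters as face/text/background is not reproduced here, and the choice of k-means itself is an assumption.

```python
import numpy as np

def cluster_fixations(points, k=3, iters=50, seed=0):
    """Spatially cluster eye-tracking fixation points with k-means.

    `points` is an (N, 2) array of fixation coordinates.  Returns the
    (k, 2) cluster centres and an (N,) array of cluster labels.
    """
    pts = np.asarray(points, float)
    rng = np.random.default_rng(seed)
    # Initialise centres at k distinct fixation points.
    centers = pts[rng.choice(len(pts), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each fixation to its nearest centre, then update centres.
        d = np.linalg.norm(pts[:, None] - centers[None, :], axis=-1)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pts[labels == j].mean(axis=0)
    return centers, labels
```

The resulting clusters would then serve as the nodes of the fully connected MRF described in the abstract.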

4 0.89125854 38 iccv-2013-Action Recognition with Actons

Author: Jun Zhu, Baoyuan Wang, Xiaokang Yang, Wenjun Zhang, Zhuowen Tu

Abstract: With the improved accessibility to an exploding amount of video data and growing demands in a wide range of video analysis applications, video-based action recognition/classification becomes an increasingly important task in computer vision. In this paper, we propose a two-layer structure for action recognition to automatically exploit a mid-level “acton” representation. The weakly-supervised actons are learned via a new max-margin multi-channel multiple instance learning framework, which can capture multiple mid-level action concepts simultaneously. The learned actons (with no requirement for detailed manual annotations) observe the properties of being compact, informative, discriminative, and easy to scale. The experimental results demonstrate the effectiveness of applying the learned actons in our two-layer structure, and show the state-of-the-art recognition performance on two challenging action datasets, i.e., Youtube and HMDB51.

5 0.89017701 376 iccv-2013-Scene Text Localization and Recognition with Oriented Stroke Detection

Author: Lukáš Neumann, Jiri Matas

Abstract: An unconstrained end-to-end text localization and recognition method is presented. The method introduces a novel approach for character detection and recognition which combines the advantages of sliding-window and connected component methods. Characters are detected and recognized as image regions which contain strokes of specific orientations in a specific relative position, where the strokes are efficiently detected by convolving the image gradient field with a set of oriented bar filters. Additionally, a novel character representation efficiently calculated from the values obtained in the stroke detection phase is introduced. The representation is robust to shift at the stroke level, which makes it less sensitive to intra-class variations and the noise induced by normalizing character size and positioning. The effectiveness of the representation is demonstrated by the results achieved in the classification of real-world characters using a Euclidean nearest-neighbor classifier trained on synthetic data in a plain form. The method was evaluated on a standard dataset, where it achieves state-of-the-art results in both text localization and recognition.
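The stroke-detection idea in this abstract, convolving the gradient field with oriented bar filters, can be sketched as follows. The kernel construction, size, and normalisation are illustrative assumptions, not the paper's exact filter bank.

```python
import numpy as np

def oriented_bar_response(grad_mag, angle_deg, length=7):
    """Response of an oriented bar filter on a gradient-magnitude map.

    Builds a thin bar kernel at `angle_deg` and convolves it with the
    gradient magnitude, so strokes aligned with the bar respond strongly.
    """
    # Build a normalised bar kernel: ones along a line at `angle_deg`.
    k = np.zeros((length, length))
    c = length // 2
    t = np.deg2rad(angle_deg)
    for r in range(-c, c + 1):
        y = int(round(c - r * np.sin(t)))
        x = int(round(c + r * np.cos(t)))
        k[y, x] = 1.0
    k /= k.sum()

    # Plain 'same'-size 2-D convolution (the kernel is point-symmetric
    # about its centre, so correlation and convolution coincide here).
    H, W = grad_mag.shape
    pad = np.pad(grad_mag.astype(float), c)
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(pad[i:i + length, j:j + length] * k)
    return out
```

On a horizontal edge, the 0° bar responds far more strongly than the 90° bar, which is the orientation selectivity the stroke detector relies on.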

6 0.88693643 420 iccv-2013-Topology-Constrained Layered Tracking with Latent Flow

7 0.88609493 117 iccv-2013-Discovering Details and Scene Structure with Hierarchical Iconoid Shift

8 0.88024127 315 iccv-2013-PhotoOCR: Reading Text in Uncontrolled Conditions

9 0.87381709 210 iccv-2013-Image Retrieval Using Textual Cues

10 0.87303495 275 iccv-2013-Motion-Aware KNN Laplacian for Video Matting

11 0.87055218 415 iccv-2013-Text Localization in Natural Images Using Stroke Feature Transform and Text Covariance Descriptors

12 0.86935699 197 iccv-2013-Hierarchical Joint Max-Margin Learning of Mid and Top Level Representations for Visual Recognition

13 0.86926007 412 iccv-2013-Synergistic Clustering of Image and Segment Descriptors for Unsupervised Scene Understanding

14 0.86890829 72 iccv-2013-Characterizing Layouts of Outdoor Scenes Using Spatial Topic Processes

15 0.86837649 253 iccv-2013-Linear Sequence Discriminant Analysis: A Model-Based Dimensionality Reduction Method for Vector Sequences

16 0.86810917 269 iccv-2013-Modeling Occlusion by Discriminative AND-OR Structures

17 0.86613905 384 iccv-2013-Semi-supervised Robust Dictionary Learning via Efficient l-Norms Minimization

18 0.86588115 340 iccv-2013-Real-Time Articulated Hand Pose Estimation Using Semi-supervised Transductive Regression Forests

19 0.8649472 227 iccv-2013-Large-Scale Image Annotation by Efficient and Robust Kernel Metric Learning

20 0.86482847 406 iccv-2013-Style-Aware Mid-level Representation for Discovering Visual Connections in Space and Time