iccv iccv2013 iccv2013-373 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Nicolas Riche, Matthieu Duvinage, Matei Mancas, Bernard Gosselin, Thierry Dutoit
Abstract: Visual saliency has been an increasingly active research area in the last ten years with dozens of saliency models recently published. Nowadays, one of the big challenges in the field is to find a way to fairly evaluate all of these models. In this paper, we compare the ranking of 12 state-of-the-art saliency models on human eye fixations using 12 similarity metrics. The comparison is done on Jian Li's database containing several hundreds of natural images. Based on the Kendall concordance coefficient, it is shown that some of the metrics are strongly correlated, leading to a redundancy in the performance metrics reported in the available benchmarks. On the other hand, other metrics provide a more diverse picture of models' overall performance. As a recommendation, three similarity metrics should be used to obtain a complete point of view of saliency model performance.
Reference: text
sentIndex sentText sentNum sentScore
1 Visual saliency has been an increasingly active research area in the last ten years with dozens of saliency models recently published. [sent-4, score-1.144]
2 In this paper, we compare the ranking of 12 state-of-the-art saliency models on human eye fixations using 12 similarity metrics. [sent-6, score-1.103]
3 Based on Kendall concordance coefficient, it is shown that some of the metrics are strongly correlated leading to a redundancy in the performance metrics reported in the available benchmarks. [sent-8, score-1.003]
4 On the other hand, other metrics provide a more diverse picture of models' overall performance. [sent-9, score-0.428]
5 As a recommendation, three similarity metrics should be used to obtain a complete point of view of saliency model performance. [sent-10, score-1.02]
6 By outputting saliency maps that estimate the probability of each image area to grab our attention, those models make it possible to automatically predict the most relevant regions of images. [sent-13, score-0.633]
7 Since the early 2000s, an increasing number of saliency models has been proposed, mainly splitting into two approaches. [sent-15, score-0.587]
8 In terms of validation, the first category uses one gold standard: Precision/Recall/F-measure metrics, while the other category uses many different metrics. [sent-17, score-0.421]
9 Due to the diversity of available metrics for eye fixation prediction assessment, several benchmarks were proposed. [sent-18, score-0.805]
10 In 2011, Toets proposed in [27] to compare saliency models based on Spearman's rank correlation coefficient. [sent-19, score-0.667]
11 Le Meur in [16] also reported on methods for comparing scanpaths and saliency maps. [sent-22, score-0.585]
12 Although these benchmarks are major contributions, none of those studies deeply discussed the relevance of their mix of similarity metrics. [sent-23, score-0.484]
13 Therefore, the redundancy of these metrics is discussed in this paper, which is organized as follows. [sent-24, score-0.398]
14 Section 2 contains a review of all similarity metrics based on human eye fixations. [sent-25, score-0.652]
15 Section 3 describes the methods and experiments used to study metrics based on Kendall concordance scores. [sent-26, score-0.632]
16 Literature Review of Similarity Metrics In this section, all the similarity metrics that have been used to assess saliency models are presented. [sent-29, score-1.072]
17 In the next sections, the similarity metrics are briefly described within the first three categories of the proposed taxonomy, namely value-based, location-based and distribution-based. [sent-35, score-0.463]
18 Proposed two-dimensional taxonomy for visual salience metrics. [sent-37, score-0.44]
19 Value-based metrics: focus on saliency map values at eye gaze positions This first category of metrics compares the saliency amplitudes with the corresponding eye fixation maps. [sent-39, score-2.123]
20 The idea is to quantify the saliency map values at the eye fixation locations and to normalize them by the saliency map standard deviation: NSS(p) = (SM(p) − μ_SM) / σ_SM (1), where p is the location of one fixation and SM is the saliency map, which is normalized to have zero mean and unit standard deviation. [sent-43, score-2.409]
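As an illustration, a minimal NumPy sketch of the NSS computation in Eq. (1) could look as follows; the function name and the (row, col) fixation format are our own conventions, not taken from the paper:

```python
import numpy as np

def nss(saliency_map, fixations):
    """Normalized Scanpath Saliency (Eq. 1): mean of the standardized
    saliency values taken at the human eye fixation locations."""
    sm = (saliency_map - saliency_map.mean()) / saliency_map.std()
    return float(np.mean([sm[r, c] for (r, c) in fixations]))
```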
21 2 Percentile The percentile metric is, for each fixated pixel p on the eye fixation map, the ratio between the number of pixels in the saliency map with values smaller than the saliency value at p and the total number of pixels (Eq. [sent-49, score-1.572]
22 P(p) = |{x ∈ X : SM(x) < SM(p)}| / |SM| (3), where X is the set of all pixels of the saliency map SM, p is the location of one eye fixation and |SM| indicates the total number of pixels. [sent-52, score-0.977]
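A hedged sketch of Eq. (3); averaging the per-fixation ratios into a single score is our own assumption about the aggregation step:

```python
import numpy as np

def percentile(saliency_map, fixations):
    """Percentile metric (Eq. 3): for each fixation p, the fraction of
    saliency-map pixels whose value is smaller than SM(p)."""
    n = saliency_map.size
    ratios = [(saliency_map < saliency_map[r, c]).sum() / n
              for (r, c) in fixations]
    return float(np.mean(ratios))  # averaging over fixations is assumed
```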
23 (8), where the saliency is normalized between 0 and 1. [sent-59, score-0.577]
24 Distribution-based metrics: focus on saliency and gaze statistical distributions In the literature, there are similarity- and dissimilarity-based metrics between two distributions. [sent-64, score-1.124]
25 Here, two dissimilarity and three similarity metrics are described. [sent-65, score-0.491]
26 Many authors like [24], [26] and [17] already used this metric to compare saliency maps with human eye fixations. [sent-69, score-0.731]
27 The KL-Div is a measure of the information lost when the saliency map probability distribution (called SM) is used to approximate the human eye fixation map probability distribution (called FM). [sent-70, score-1.056]
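A minimal sketch of this definition, i.e. KL(FM || SM) with both maps normalized to probability distributions; the epsilon regularization is our own addition to avoid log(0):

```python
import numpy as np

def kl_div(sm, fm, eps=1e-12):
    """Information lost when the saliency distribution SM is used to
    approximate the fixation distribution FM: KL(FM || SM)."""
    p = fm / fm.sum()  # eye fixation map as a distribution
    q = sm / sm.sum()  # saliency map as a distribution
    return float(np.sum(p * np.log((p + eps) / (q + eps))))
```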
28 The Earth Mover's Distance (EMD) computes the minimal cost to transform the probability distribution of the saliency map SM into the one of the human eye fixation map FM. [sent-89, score-0.98]
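As a rough, non-authoritative sketch (the paper relies on Pele's fast EMD, not on this tooling), an exact EMD with Euclidean ground distance could be computed with the POT library on heavily downsampled maps, since the full-resolution cost matrix would be far too large:

```python
import numpy as np
import ot  # POT: Python Optimal Transport (pip install pot)

def emd(sm, fm):
    """Exact EMD between SM and FM seen as distributions over pixel
    locations; only practical on small, downsampled maps."""
    h, w = sm.shape
    coords = np.array([(r, c) for r in range(h) for c in range(w)],
                      dtype=float)
    cost = ot.dist(coords, coords, metric='euclidean')
    a = (sm / sm.sum()).ravel()
    b = (fm / fm.sum()).ravel()
    return float(ot.emd2(a, b, cost))
```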
29 4 Spearman’s rank correlation coefficient The Spearman’s rank correlation coefficient metric [27] is defined as the CC metric (Eq. [sent-115, score-0.498]
30 Toets uses this metric in [27] to evaluate 13 models. [sent-118, score-0.398]
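Since this is simply the CC metric computed on ranks, a one-liner with SciPy suffices (a sketch, not the authors' code):

```python
from scipy.stats import spearmanr

def spearman_cc(sm, fm):
    """Spearman's rank correlation between the two maps: the linear
    correlation coefficient applied to the ranks of the pixel values."""
    rho, _ = spearmanr(sm.ravel(), fm.ravel())
    return rho
```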
31 5 Similarity metric The similarity metric [14] also uses the normalized probability distributions of the saliency map SM and human eye fixation map FM. [sent-121, score-1.346]
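The Similarity metric of [14] is commonly implemented as a histogram intersection of the two normalized maps; a minimal sketch under that reading:

```python
import numpy as np

def similarity(sm, fm):
    """Sum of the pixel-wise minima of the two maps, each normalized
    to a probability distribution; 1 = identical, 0 = no overlap."""
    p = sm / sm.sum()
    q = fm / fm.sum()
    return float(np.minimum(p, q).sum())
```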
32 Location-based Metrics: focus on location of salient regions at gaze positions Location-based metrics are based on the notion of AUROC (Area under the Receiver Operating Characteristic curve) coming from signal detection theory. [sent-140, score-0.51]
33 First, fixation pixels are counted once and the same number of random pixels is extracted from the saliency map. [sent-145, score-0.745]
34 The idea is that no saliency algorithm can perform better (on average) than the area under the ROC curve dictated by inter-subject variability for each image. [sent-154, score-0.557]
35 Finally, the AUC of the saliency map is normalized by this ideal AUC. [sent-156, score-0.611]
36 3 Borji implementation (AUC Borji) In [5], Borji applied a suitable AUC metric called shuffled AUC to saliency map validation. [sent-159, score-0.781]
37 In the classical AUC, saliency map values at random points of the image are used to create a binary mask. [sent-160, score-0.632]
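A minimal sketch of this classical AUC, assuming uniformly random negative pixels and scikit-learn for the ROC computation (our tooling choice, not the paper's):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def auc_classic(saliency_map, fixations, seed=0):
    """Saliency values at fixated pixels are positives; values at an
    equal number of uniformly random pixels are negatives."""
    rng = np.random.default_rng(seed)
    pos = np.array([saliency_map[r, c] for (r, c) in fixations])
    neg = rng.choice(saliency_map.ravel(), size=len(pos), replace=False)
    labels = np.r_[np.ones(len(pos)), np.zeros(len(neg))]
    return roc_auc_score(labels, np.r_[pos, neg])
```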
38 In the shuffled AUC metric, saliency values at fixations from another image of the same dataset (instead of random points) are taken into account. [sent-161, score-0.778]
39 This point is important because the AUROC scores can dramatically increase if a saliency map is weighted by a centred Gaussian. [sent-163, score-0.635]
40 Indeed, human eye fixations are rarely near the edges of general test images and the amateur photographer often places salient objects in the image centre. [sent-164, score-0.412]
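Under the same assumptions as the previous sketch, the shuffled AUC only changes where the negatives come from:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def auc_shuffled(saliency_map, fixations, other_image_fixations):
    """AUC-Borji [5]: negatives are saliency values at fixation
    locations borrowed from another image of the dataset, so a pure
    centre-bias map no longer scores well."""
    pos = np.array([saliency_map[r, c] for (r, c) in fixations])
    neg = np.array([saliency_map[r, c]
                    for (r, c) in other_image_fixations])
    labels = np.r_[np.ones(len(pos)), np.zeros(len(neg))]
    return roc_auc_score(labels, np.r_[pos, neg])
```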
41 Experimental Setup The 12 metrics presented in section 2 have two inputs: the saliency map SM and the human fixation map FM. [sent-171, score-1.298]
42 Database and Eye Fixation Maps The human eye fixation maps used in this paper are those in the database published by Li [18]. [sent-176, score-0.497]
43 This database provides eye fixation ground truth (collected with an eye tracker) for 235 colour images, each of size 480 x 640 pixels. [sent-177, score-0.574]
44 The images are quite different with objects of interest of different sizes (small, medium and large) to avoid a size-based bias of the saliency models. [sent-179, score-0.557]
45 Saliency maps from 12 models Twelve state-of-the-art models are used to obtain different saliency maps in this study. [sent-182, score-0.709]
46 They represent the updated versions of the models available online from Borji's review paper [4], where models are sorted based on the mechanism they use to obtain a saliency map. [sent-183, score-0.617]
47 So, we use a wide, method-based range of recently published saliency models. [sent-184, score-0.557]
48 SR [9], PFT [8], PQFT [8] and Achanta [1] use a spectral analysis approach to compute their saliency map. [sent-188, score-0.557]
49 First, it shows which metrics are close to each other. [sent-195, score-0.398]
50 Second, it intends to reduce the dimensionality of the metric set and to see which metrics should be used for an efficient benchmark. [sent-196, score-0.796]
51 Indeed, it is important to decide which metrics should be used together because they are complementary, and which ones are useless to compute together because they provide redundant information. [sent-197, score-0.481]
52 The 12 models presented in section 3.2 produce a saliency map for each of the 235 images of the database, and these saliency maps are compared with the corresponding human eye fixation map. [sent-199, score-1.645]
53 The comparison is achieved using all the 12 previously described metrics (section 2). [sent-200, score-0.398]
54 A mean score can be computed on the whole database for each model using the different metrics, which leads to 12 different rankings of the 12 models, one per comparison metric. [sent-201, score-0.537]
55 This is due to the fact that the outputs of the metrics can be very different in terms of score range, and some of them should be maximized (correlation measures) while others should be minimized (divergence measures). [sent-203, score-0.454]
56 By contrast, the relative rank of the different models is a consistent measure common to all metrics and its range is here between 1 and 12 (from the best model to the weakest). [sent-205, score-0.473]
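A sketch of this score-to-rank conversion; the direction flag reflects the maximize/minimize distinction above:

```python
import numpy as np

def scores_to_ranks(mean_scores, higher_is_better=True):
    """Turn one metric's mean scores over the models into ranks
    1 (best) .. 12 (weakest); divergence-type metrics such as
    KL-Div should pass higher_is_better=False."""
    order = np.argsort(mean_scores)
    if higher_is_better:
        order = order[::-1]
    ranks = np.empty(len(mean_scores), dtype=int)
    ranks[order] = np.arange(1, len(mean_scores) + 1)
    return ranks
```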
57 [Figure: (b) Kendall's measure on groups of metrics, with classical multidimensional scaling of the evaluation measures in 2D.] [sent-231, score-0.418]
58 The third uses previous results to provide a fair assessment of the 12 saliency models. [sent-251, score-0.646]
59 1 Intra-Group Metrics The concordance is computed between all metrics within the three categories: value-based (amplitude), location-based and distribution-based metrics (Tab. [sent-256, score-1.003]
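For reference, Kendall's coefficient of concordance W over an (m metrics x n models) rank matrix can be sketched as follows (the no-ties formula; tie correction is omitted):

```python
import numpy as np

def kendalls_w(ranks):
    """W = 12*S / (m^2 * (n^3 - n)), where S is the sum of squared
    deviations of the per-model rank sums from their mean;
    W = 1 means all metrics rank the models identically."""
    m, n = ranks.shape
    rank_sums = ranks.sum(axis=0)
    s = ((rank_sums - rank_sums.mean()) ** 2).sum()
    return 12.0 * s / (m ** 2 * (n ** 3 - n))
```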
60 This means that these metrics provide some complementary information: they might provide different results for the same [sent-259, score-0.425]
61 saliency map; thus one of those metrics cannot simply be ignored without a possible loss of information about model ranking. [sent-263, score-0.955]
62 However, one can see that the concordance between the amplitude metrics is high, which means that those measures are close and can therefore be summarized by a small subset of value-based metrics. [sent-264, score-0.65]
63 2 Inter-Group Metrics Contrary to the intra-group study, which does not achieve enough concordance, the inter-group study suggests that some metrics are very close, as shown in the Kendall matrix of Fig. [sent-267, score-0.448]
64 On the opposite side, the KL-Div metric seems to be an outlier in this matrix: it differs from most of the other metrics in terms of model ranking. [sent-270, score-0.493]
65 [Figure: (b) Kendall's measure on groups of metrics, with classical multidimensional scaling of the evaluation measures in 2D.] [sent-275, score-0.418]
66 1, we decide to use a concordance of 98% as a threshold to fuse metrics (in terms of rank). [sent-287, score-0.629]
67 This threshold means that the ranks of only 2 or 3 pairs of models out of the 12 can be inverted, which means that the differences in terms of classification of the saliency models are really minor. [sent-288, score-0.716]
68 By using this threshold, five metrics can be fused to create a new metric called Cluster. [sent-289, score-0.52]
69 1b), the concordance of the metrics contained in the Cluster is 98. [sent-291, score-0.605]
70 The ranking of Cluster is defined as the mean ranking of all the metrics composing it. [sent-293, score-0.569]
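A sketch of this fusion step, assuming the fused metrics' ranks are stacked row-wise and re-ranked after averaging:

```python
import numpy as np

def cluster_rank(rank_matrix):
    """Fused 'Cluster' ranking: mean rank of each model over the
    metrics inside the Cluster, re-ranked 1 (best) .. 12 (weakest)."""
    mean_ranks = rank_matrix.mean(axis=0)  # smaller mean rank = better
    order = np.argsort(mean_ranks)
    ranks = np.empty(len(mean_ranks), dtype=int)
    ranks[order] = np.arange(1, len(mean_ranks) + 1)
    return ranks
```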
71 In this case the five metrics can be summarized well enough by any of them. [sent-295, score-0.421]
72 Tab. 5 shows an assessment of each of the 12 saliency models with different metrics. [sent-307, score-0.651]
73 The resulting Cluster and the Global metrics are included. [sent-308, score-0.398]
74 One can see the rankings of the different saliency models from the best (1) to the weakest (12) on the 8 remaining metrics after dimensionality reduction. [sent-309, score-1.066]
75 The Global metric is the final ranking result after mixing all the remaining metrics (rank mean between the different rankings). [sent-310, score-0.567]
76 Tab. 5 shows that the best saliency models in predicting human eye-tracking on Li's database are RARE, AWS and PQFT. [sent-312, score-0.652]
77 Discussion Our approach is able to fairly compare saliency model rankings and not values; thus it is not possible here to test whether the differences between the saliency models are statistically significant. [sent-314, score-1.252]
78 To obtain this information, one should go back to the metrics values. [sent-316, score-0.398]
79 Our study also shows that one metric is not enough to evaluate the saliency model ranking on eye fixation data. [sent-317, score-1.162]
80 Model assessment for Cluster, Global and all metrics not included in Cluster. [sent-319, score-0.462]
81 The 3 bold-marked similarity metrics are enough to provide a fair rank comparison of saliency models in terms of similarity with human eye-tracking data. [sent-320, score-1.241]
82 The minimal set of similarity metrics which should be used is: a) one of the metrics composing the Cluster, b) AUC-Borji and c) KL-Div. [sent-321, score-0.884]
83 The use of those three metrics is enough to cover most of the space in Fig. [sent-322, score-0.421]
84 2) which means that Cluster must be used to assess any saliency model. [sent-326, score-0.579]
85 To represent the Cluster metric, any of its component metrics, namely Percentile, AUC-Judd, CC, NSS or Similarity, can be used (Fig. [sent-328, score-0.891]
86 The third recommended metric is AUC-Borji, which is complementary in terms of information to the two others by comparing the eye-tracking data with the saliency map peak locations. [sent-336, score-0.752]
87 Therefore, some saliency model benchmarks existing online, such as Borji [3] or Judd [13], use partly redundant similarity measures. [sent-339, score-0.672]
88 Concerning the Judd benchmark, while the choice of the AUC-Judd and EMD metrics makes sense, the use of the Similarity metric is redundant with AUC-Judd. [sent-341, score-0.522]
89 Conclusions In this paper, we reviewed 12 state-of-the-art similarity metrics for visual saliency model validation and compared them on Jian Li's human eye-tracking fixation database with 12 recently published saliency models. [sent-344, score-1.883]
90 The conclusion of our comparison study is that evaluating a saliency model with human fixations using only one similarity metric is not enough to be fair. [sent-346, score-0.988]
91 In addition to Cluster, we suggest that two other metrics (AUC-Borji and KL-Div) with complementary interpretations should be used to fairly compare saliency models based on human eye-tracking data. [sent-348, score-1.129]
92 An implementation of the code used in this paper, allowing one to fairly assess one's own saliency models on Li's database, is provided online [2] in the project website section. [sent-349, score-0.671]
93 Quantitative analysis of human-model agreement in visual saliency modeling: A comparative study. [sent-387, score-0.6]
94 Spatio-temporal saliency detection using phase spectrum of quaternion fourier transform. [sent-406, score-0.557]
95 A benchmark of computational models of saliency to predict human fixations. [sent-442, score-0.641]
96 Methods for comparing scanpaths and saliency maps: strengths and weaknesses. [sent-452, score-0.585]
97 Visual saliency based on scale-space analysis in the frequency domain. [sent-467, score-0.557]
98 Rare2012: A multi-scale raritybased saliency detection with its comparative statistical analysis. [sent-517, score-0.557]
99 Sun: A bayesian framework for saliency using natural statistics. [sent-566, score-0.557]
100 Learning a saliency map using fixated locations in natural scenes. [sent-571, score-0.591]
wordName wordTfidf (topN-words)
[('saliency', 0.557), ('metrics', 0.398), ('kendall', 0.262), ('fixation', 0.23), ('nss', 0.217), ('concordance', 0.207), ('fixations', 0.188), ('eye', 0.156), ('emd', 0.12), ('sm', 0.113), ('auc', 0.107), ('borji', 0.106), ('metric', 0.095), ('judd', 0.092), ('percentile', 0.08), ('gaze', 0.077), ('ranking', 0.074), ('coefficient', 0.074), ('cc', 0.068), ('similarity', 0.065), ('assessment', 0.064), ('cluster', 0.063), ('aws', 0.058), ('fij', 0.058), ('duvinage', 0.057), ('mancas', 0.057), ('itti', 0.052), ('mover', 0.051), ('pf', 0.051), ('fm', 0.049), ('rankings', 0.048), ('attention', 0.047), ('pele', 0.046), ('earth', 0.046), ('maps', 0.046), ('rank', 0.045), ('rare', 0.045), ('centred', 0.044), ('agreement', 0.043), ('taxonomy', 0.042), ('classical', 0.041), ('spearman', 0.039), ('multidimensional', 0.038), ('auroc', 0.038), ('cmds', 0.038), ('fmi', 0.038), ('gosselin', 0.038), ('mons', 0.038), ('peeters', 0.038), ('riche', 0.038), ('smj', 0.038), ('toets', 0.038), ('correlation', 0.035), ('salient', 0.035), ('map', 0.034), ('meur', 0.033), ('shuffled', 0.033), ('belgian', 0.033), ('spear', 0.033), ('hft', 0.033), ('weakest', 0.033), ('human', 0.033), ('jian', 0.033), ('database', 0.032), ('models', 0.03), ('fairly', 0.03), ('score', 0.029), ('redundant', 0.029), ('indeed', 0.029), ('scanpath', 0.028), ('scanpaths', 0.028), ('dissimilarity', 0.028), ('interpretation', 0.027), ('called', 0.027), ('distributions', 0.027), ('study', 0.027), ('complementary', 0.027), ('peters', 0.027), ('useless', 0.027), ('others', 0.027), ('selective', 0.026), ('zhao', 0.026), ('fair', 0.025), ('threshold', 0.024), ('divergence', 0.024), ('composing', 0.023), ('amplitude', 0.023), ('lot', 0.023), ('validation', 0.023), ('xx', 0.023), ('enough', 0.023), ('measures', 0.022), ('assess', 0.022), ('author', 0.022), ('ieee', 0.022), ('benchmarks', 0.021), ('benchmark', 0.021), ('normalized', 0.02), ('scaling', 0.02), ('achanta', 0.02)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999911 373 iccv-2013-Saliency and Human Fixations: State-of-the-Art and Study of Comparison Metrics
Author: Nicolas Riche, Matthieu Duvinage, Matei Mancas, Bernard Gosselin, Thierry Dutoit
Abstract: Visual saliency has been an increasingly active research area in the last ten years with dozens of saliency models recently published. Nowadays, one of the big challenges in the field is to find a way to fairly evaluate all of these models. In this paper, on human eye fixations ,we compare the ranking of 12 state-of-the art saliency models using 12 similarity metrics. The comparison is done on Jian Li ’s database containing several hundreds of natural images. Based on Kendall concordance coefficient, it is shown that some of the metrics are strongly correlated leading to a redundancy in the performance metrics reported in the available benchmarks. On the other hand, other metrics provide a more diverse picture of models ’ overall performance. As a recommendation, three similarity metrics should be used to obtain a complete point of view of saliency model performance.
2 0.58768719 50 iccv-2013-Analysis of Scores, Datasets, and Models in Visual Saliency Prediction
Author: Ali Borji, Hamed R. Tavakoli, Dicky N. Sihite, Laurent Itti
Abstract: Significant recent progress has been made in developing high-quality saliency models. However, less effort has been undertaken on fair assessment of these models, over large standardized datasets and correctly addressing confounding factors. In this study, we pursue a critical and quantitative look at challenges (e.g., center-bias, map smoothing) in saliency modeling and the way they affect model accuracy. We quantitatively compare 32 state-of-the-art models (using the shuffled AUC score to discount center-bias) on 4 benchmark eye movement datasets, for prediction of human fixation locations and scanpath sequence. We also account for the role of map smoothing. We find that, although model rankings vary, some (e.g., AWS, LG, AIM, and HouNIPS) consistently outperform other models over all datasets. Some models work well for prediction of both fixation locations and scanpath sequence (e.g., Judd, GBVS). Our results show low prediction accuracy for models over emotional stimuli from the NUSEF dataset. Our last benchmark, for the first time, gauges the ability of models to decode the stimulus category from statistics of fixations, saccades, and model saliency values at fixated locations. In this test, ITTI and AIM models win over other models. Our benchmark provides a comprehensive high-level picture of the strengths and weaknesses of many popular models, and suggests future research directions in saliency modeling.
3 0.4423629 71 iccv-2013-Category-Independent Object-Level Saliency Detection
Author: Yangqing Jia, Mei Han
Abstract: It is known that purely low-level saliency cues such as frequency does not lead to a good salient object detection result, requiring high-level knowledge to be adopted for successful discovery of task-independent salient objects. In this paper, we propose an efficient way to combine such high-level saliency priors and low-level appearance models. We obtain the high-level saliency prior with the objectness algorithm to find potential object candidates without the need of category information, and then enforce the consistency among the salient regions using a Gaussian MRF with the weights scaled by diverse density that emphasizes the influence of potential foreground pixels. Our model obtains saliency maps that assign high scores for the whole salient object, and achieves state-of-the-art performance on benchmark datasets covering various foreground statistics.
4 0.44142917 372 iccv-2013-Saliency Detection via Dense and Sparse Reconstruction
Author: Xiaohui Li, Huchuan Lu, Lihe Zhang, Xiang Ruan, Ming-Hsuan Yang
Abstract: In this paper, we propose a visual saliency detection algorithm from the perspective of reconstruction errors. The image boundaries are first extracted via superpixels as likely cues for background templates, from which dense and sparse appearance models are constructed. For each image region, we first compute dense and sparse reconstruction errors. Second, the reconstruction errors are propagated based on the contexts obtained from K-means clustering. Third, pixel-level saliency is computed by an integration of multi-scale reconstruction errors and refined by an object-biased Gaussian model. We apply the Bayes formula to integrate saliency measures based on dense and sparse reconstruction errors. Experimental results show that the proposed algorithm performs favorably against seventeen state-of-the-art methods in terms of precision and recall. In addition, the proposed algorithm is demonstrated to be more effective in highlighting salient objects uniformly and robust to background noise.
5 0.37203154 91 iccv-2013-Contextual Hypergraph Modeling for Salient Object Detection
Author: Xi Li, Yao Li, Chunhua Shen, Anthony Dick, Anton Van_Den_Hengel
Abstract: Salient object detection aims to locate objects that capture human attention within images. Previous approaches often pose this as a problem of image contrast analysis. In this work, we model an image as a hypergraph that utilizes a set of hyperedges to capture the contextual properties of image pixels or regions. As a result, the problem of salient object detection becomes one of finding salient vertices and hyperedges in the hypergraph. The main advantage of hypergraph modeling is that it takes into account each pixel’s (or region ’s) affinity with its neighborhood as well as its separation from image background. Furthermore, we propose an alternative approach based on centerversus-surround contextual contrast analysis, which performs salient object detection by optimizing a cost-sensitive support vector machine (SVM) objective function. Experimental results on four challenging datasets demonstrate the effectiveness of the proposed approaches against the stateof-the-art approaches to salient object detection.
6 0.31766868 396 iccv-2013-Space-Time Robust Representation for Action Recognition
7 0.29498473 369 iccv-2013-Saliency Detection: A Boolean Map Approach
8 0.26270971 374 iccv-2013-Salient Region Detection by UFO: Uniqueness, Focusness and Objectness
9 0.25966397 381 iccv-2013-Semantically-Based Human Scanpath Estimation with HMMs
10 0.256271 370 iccv-2013-Saliency Detection in Large Point Sets
11 0.2455876 371 iccv-2013-Saliency Detection via Absorbing Markov Chain
12 0.24181257 137 iccv-2013-Efficient Salient Region Detection with Soft Image Abstraction
13 0.23827551 217 iccv-2013-Initialization-Insensitive Visual Tracking through Voting with Salient Local Features
14 0.19022463 180 iccv-2013-From Where and How to What We See
15 0.15267567 316 iccv-2013-Pictorial Human Spaces: How Well Do Humans Perceive a 3D Articulated Pose?
16 0.1464117 247 iccv-2013-Learning to Predict Gaze in Egocentric Video
17 0.14393017 67 iccv-2013-Calibration-Free Gaze Estimation Using Human Gaze Patterns
18 0.12401581 325 iccv-2013-Predicting Primary Gaze Behavior Using Social Saliency Fields
19 0.10344283 411 iccv-2013-Symbiotic Segmentation and Part Localization for Fine-Grained Categorization
20 0.10339738 25 iccv-2013-A Novel Earth Mover's Distance Methodology for Image Matching with Gaussian Mixture Models
topicId topicWeight
[(0, 0.154), (1, -0.046), (2, 0.535), (3, -0.296), (4, -0.183), (5, -0.001), (6, 0.093), (7, -0.047), (8, 0.024), (9, 0.065), (10, -0.009), (11, -0.046), (12, -0.037), (13, 0.004), (14, 0.032), (15, 0.012), (16, -0.028), (17, -0.005), (18, 0.027), (19, 0.044), (20, -0.005), (21, 0.001), (22, 0.026), (23, 0.006), (24, 0.035), (25, 0.006), (26, 0.044), (27, 0.007), (28, 0.011), (29, 0.022), (30, -0.017), (31, -0.045), (32, -0.02), (33, -0.009), (34, 0.044), (35, 0.006), (36, 0.012), (37, -0.026), (38, -0.029), (39, -0.01), (40, 0.043), (41, 0.013), (42, 0.018), (43, -0.009), (44, 0.031), (45, -0.01), (46, -0.012), (47, 0.003), (48, -0.011), (49, 0.041)]
simIndex simValue paperId paperTitle
same-paper 1 0.97728747 373 iccv-2013-Saliency and Human Fixations: State-of-the-Art and Study of Comparison Metrics
Author: Nicolas Riche, Matthieu Duvinage, Matei Mancas, Bernard Gosselin, Thierry Dutoit
Abstract: Visual saliency has been an increasingly active research area in the last ten years with dozens of saliency models recently published. Nowadays, one of the big challenges in the field is to find a way to fairly evaluate all of these models. In this paper, on human eye fixations ,we compare the ranking of 12 state-of-the art saliency models using 12 similarity metrics. The comparison is done on Jian Li ’s database containing several hundreds of natural images. Based on Kendall concordance coefficient, it is shown that some of the metrics are strongly correlated leading to a redundancy in the performance metrics reported in the available benchmarks. On the other hand, other metrics provide a more diverse picture of models ’ overall performance. As a recommendation, three similarity metrics should be used to obtain a complete point of view of saliency model performance.
2 0.94819427 50 iccv-2013-Analysis of Scores, Datasets, and Models in Visual Saliency Prediction
Author: Ali Borji, Hamed R. Tavakoli, Dicky N. Sihite, Laurent Itti
Abstract: Significant recent progress has been made in developing high-quality saliency models. However, less effort has been undertaken on fair assessment of these models, over large standardized datasets and correctly addressing confounding factors. In this study, we pursue a critical and quantitative look at challenges (e.g., center-bias, map smoothing) in saliency modeling and the way they affect model accuracy. We quantitatively compare 32 state-of-the-art models (using the shuffled AUC score to discount center-bias) on 4 benchmark eye movement datasets, for prediction of human fixation locations and scanpath sequence. We also account for the role of map smoothing. We find that, although model rankings vary, some (e.g., AWS, LG, AIM, and HouNIPS) consistently outperform other models over all datasets. Some models work well for prediction of both fixation locations and scanpath sequence (e.g., Judd, GBVS). Our results show low prediction accuracy for models over emotional stimuli from the NUSEF dataset. Our last benchmark, for the first time, gauges the ability of models to decode the stimulus category from statistics of fixations, saccades, and model saliency values at fixated locations. In this test, ITTI and AIM models win over other models. Our benchmark provides a comprehensive high-level picture of the strengths and weaknesses of many popular models, and suggests future research directions in saliency modeling.
3 0.9167847 369 iccv-2013-Saliency Detection: A Boolean Map Approach
Author: Jianming Zhang, Stan Sclaroff
Abstract: A novel Boolean Map based Saliency (BMS) model is proposed. An image is characterized by a set of binary images, which are generated by randomly thresholding the image ’s color channels. Based on a Gestalt principle of figure-ground segregation, BMS computes saliency maps by analyzing the topological structure of Boolean maps. BMS is simple to implement and efficient to run. Despite its simplicity, BMS consistently achieves state-of-the-art performance compared with ten leading methods on five eye tracking datasets. Furthermore, BMS is also shown to be advantageous in salient object detection.
4 0.89697963 91 iccv-2013-Contextual Hypergraph Modeling for Salient Object Detection
Author: Xi Li, Yao Li, Chunhua Shen, Anthony Dick, Anton Van_Den_Hengel
Abstract: Salient object detection aims to locate objects that capture human attention within images. Previous approaches often pose this as a problem of image contrast analysis. In this work, we model an image as a hypergraph that utilizes a set of hyperedges to capture the contextual properties of image pixels or regions. As a result, the problem of salient object detection becomes one of finding salient vertices and hyperedges in the hypergraph. The main advantage of hypergraph modeling is that it takes into account each pixel’s (or region ’s) affinity with its neighborhood as well as its separation from image background. Furthermore, we propose an alternative approach based on centerversus-surround contextual contrast analysis, which performs salient object detection by optimizing a cost-sensitive support vector machine (SVM) objective function. Experimental results on four challenging datasets demonstrate the effectiveness of the proposed approaches against the stateof-the-art approaches to salient object detection.
5 0.8538745 71 iccv-2013-Category-Independent Object-Level Saliency Detection
Author: Yangqing Jia, Mei Han
Abstract: It is known that purely low-level saliency cues such as frequency does not lead to a good salient object detection result, requiring high-level knowledge to be adopted for successful discovery of task-independent salient objects. In this paper, we propose an efficient way to combine such high-level saliency priors and low-level appearance models. We obtain the high-level saliency prior with the objectness algorithm to find potential object candidates without the need of category information, and then enforce the consistency among the salient regions using a Gaussian MRF with the weights scaled by diverse density that emphasizes the influence of potential foreground pixels. Our model obtains saliency maps that assign high scores for the whole salient object, and achieves state-of-the-art performance on benchmark datasets covering various foreground statistics.
6 0.84136933 374 iccv-2013-Salient Region Detection by UFO: Uniqueness, Focusness and Objectness
7 0.82752967 372 iccv-2013-Saliency Detection via Dense and Sparse Reconstruction
8 0.82392514 370 iccv-2013-Saliency Detection in Large Point Sets
9 0.79751682 371 iccv-2013-Saliency Detection via Absorbing Markov Chain
10 0.77933669 137 iccv-2013-Efficient Salient Region Detection with Soft Image Abstraction
11 0.69284034 396 iccv-2013-Space-Time Robust Representation for Action Recognition
12 0.65536582 381 iccv-2013-Semantically-Based Human Scanpath Estimation with HMMs
13 0.59689486 217 iccv-2013-Initialization-Insensitive Visual Tracking through Voting with Salient Local Features
14 0.54411262 325 iccv-2013-Predicting Primary Gaze Behavior Using Social Saliency Fields
15 0.3595632 247 iccv-2013-Learning to Predict Gaze in Egocentric Video
16 0.33572212 67 iccv-2013-Calibration-Free Gaze Estimation Using Human Gaze Patterns
17 0.32562762 180 iccv-2013-From Where and How to What We See
18 0.26222104 74 iccv-2013-Co-segmentation by Composition
19 0.26131773 25 iccv-2013-A Novel Earth Mover's Distance Methodology for Image Matching with Gaussian Mixture Models
20 0.24971297 312 iccv-2013-Perceptual Fidelity Aware Mean Squared Error
topicId topicWeight
[(2, 0.047), (7, 0.015), (12, 0.034), (26, 0.058), (31, 0.04), (42, 0.105), (64, 0.02), (73, 0.025), (78, 0.01), (84, 0.015), (89, 0.113), (95, 0.024), (97, 0.393)]
simIndex simValue paperId paperTitle
same-paper 1 0.77026868 373 iccv-2013-Saliency and Human Fixations: State-of-the-Art and Study of Comparison Metrics
Author: Nicolas Riche, Matthieu Duvinage, Matei Mancas, Bernard Gosselin, Thierry Dutoit
Abstract: Visual saliency has been an increasingly active research area in the last ten years with dozens of saliency models recently published. Nowadays, one of the big challenges in the field is to find a way to fairly evaluate all of these models. In this paper, on human eye fixations ,we compare the ranking of 12 state-of-the art saliency models using 12 similarity metrics. The comparison is done on Jian Li ’s database containing several hundreds of natural images. Based on Kendall concordance coefficient, it is shown that some of the metrics are strongly correlated leading to a redundancy in the performance metrics reported in the available benchmarks. On the other hand, other metrics provide a more diverse picture of models ’ overall performance. As a recommendation, three similarity metrics should be used to obtain a complete point of view of saliency model performance.
2 0.66772217 347 iccv-2013-Recursive Estimation of the Stein Center of SPD Matrices and Its Applications
Author: Hesamoddin Salehian, Guang Cheng, Baba C. Vemuri, Jeffrey Ho
Abstract: Symmetric positive-definite (SPD) matrices are ubiquitous in Computer Vision, Machine Learning and Medical Image Analysis. Finding the center/average of a population of such matrices is a common theme in many algorithms such as clustering, segmentation, principal geodesic analysis, etc. The center of a population of such matrices can be defined using a variety of distance/divergence measures as the minimizer of the sum of squared distances/divergences from the unknown center to the members of the population. It is well known that the computation of the Karcher mean for the space of SPD matrices which is a negativelycurved Riemannian manifold is computationally expensive. Recently, the LogDet divergence-based center was shown to be a computationally attractive alternative. However, the LogDet-based mean of more than two matrices can not be computed in closed form, which makes it computationally less attractive for large populations. In this paper we present a novel recursive estimator for center based on the Stein distance which is the square root of the LogDet di– vergence that is significantly faster than the batch mode computation of this center. The key theoretical contribution is a closed-form solution for the weighted Stein center of two SPD matrices, which is used in the recursive computation of the Stein center for a population of SPD matrices. Additionally, we show experimental evidence of the convergence of our recursive Stein center estimator to the batch mode Stein center. We present applications of our recursive estimator to K-means clustering and image indexing depicting significant time gains over corresponding algorithms that use the batch mode computations. For the latter application, we develop novel hashing functions using the Stein distance and apply it to publicly available data sets, and experimental results have shown favorable com– ∗This research was funded in part by the NIH grant NS066340 to BCV. †Corresponding author parisons to other competing methods.
3 0.60045117 412 iccv-2013-Synergistic Clustering of Image and Segment Descriptors for Unsupervised Scene Understanding
Author: Daniel M. Steinberg, Oscar Pizarro, Stefan B. Williams
Abstract: With the advent of cheap, high fidelity, digital imaging systems, the quantity and rate of generation of visual data can dramatically outpace a humans ability to label or annotate it. In these situations there is scope for the use of unsupervised approaches that can model these datasets and automatically summarise their content. To this end, we present a totally unsupervised, and annotation-less, model for scene understanding. This model can simultaneously cluster whole-image and segment descriptors, therebyforming an unsupervised model of scenes and objects. We show that this model outperforms other unsupervised models that can only cluster one source of information (image or segment) at once. We are able to compare unsupervised and supervised techniques using standard measures derived from confusion matrices and contingency tables. This shows that our unsupervised model is competitive with current supervised and weakly-supervised models for scene understanding on standard datasets. We also demonstrate our model operating on a dataset with more than 100,000 images col- lected by an autonomous underwater vehicle.
4 0.582807 227 iccv-2013-Large-Scale Image Annotation by Efficient and Robust Kernel Metric Learning
Author: Zheyun Feng, Rong Jin, Anil Jain
Abstract: One of the key challenges in search-based image annotation models is to define an appropriate similarity measure between images. Many kernel distance metric learning (KML) algorithms have been developed in order to capture the nonlinear relationships between visual features and semantics ofthe images. Onefundamental limitation in applying KML to image annotation is that it requires converting image annotations into binary constraints, leading to a significant information loss. In addition, most KML algorithms suffer from high computational cost due to the requirement that the learned matrix has to be positive semi-definitive (PSD). In this paper, we propose a robust kernel metric learning (RKML) algorithm based on the regression technique that is able to directly utilize image annotations. The proposed method is also computationally more efficient because PSD property is automatically ensured by regression. We provide the theoretical guarantee for the proposed algorithm, and verify its efficiency and effectiveness for image annotation by comparing it to state-of-the-art approaches for both distance metric learning and image annotation. ,
5 0.56670511 50 iccv-2013-Analysis of Scores, Datasets, and Models in Visual Saliency Prediction
Author: Ali Borji, Hamed R. Tavakoli, Dicky N. Sihite, Laurent Itti
Abstract: Significant recent progress has been made in developing high-quality saliency models. However, less effort has been undertaken on fair assessment of these models, over large standardized datasets and correctly addressing confounding factors. In this study, we pursue a critical and quantitative look at challenges (e.g., center-bias, map smoothing) in saliency modeling and the way they affect model accuracy. We quantitatively compare 32 state-of-the-art models (using the shuffled AUC score to discount center-bias) on 4 benchmark eye movement datasets, for prediction of human fixation locations and scanpath sequence. We also account for the role of map smoothing. We find that, although model rankings vary, some (e.g., AWS, LG, AIM, and HouNIPS) consistently outperform other models over all datasets. Some models work well for prediction of both fixation locations and scanpath sequence (e.g., Judd, GBVS). Our results show low prediction accuracy for models over emotional stimuli from the NUSEF dataset. Our last benchmark, for the first time, gauges the ability of models to decode the stimulus category from statistics of fixations, saccades, and model saliency values at fixated locations. In this test, ITTI and AIM models win over other models. Our benchmark provides a comprehensive high-level picture of the strengths and weaknesses of many popular models, and suggests future research directions in saliency modeling.
6 0.56025267 20 iccv-2013-A Max-Margin Perspective on Sparse Representation-Based Classification
7 0.56009638 372 iccv-2013-Saliency Detection via Dense and Sparse Reconstruction
8 0.52996004 425 iccv-2013-Tracking via Robust Multi-task Multi-view Joint Sparse Representation
9 0.52652955 369 iccv-2013-Saliency Detection: A Boolean Map Approach
10 0.49235478 91 iccv-2013-Contextual Hypergraph Modeling for Salient Object Detection
11 0.48839283 371 iccv-2013-Saliency Detection via Absorbing Markov Chain
12 0.48022097 71 iccv-2013-Category-Independent Object-Level Saliency Detection
13 0.440411 180 iccv-2013-From Where and How to What We See
14 0.44022813 257 iccv-2013-Log-Euclidean Kernels for Sparse Representation and Dictionary Learning
15 0.43796536 396 iccv-2013-Space-Time Robust Representation for Action Recognition
16 0.43666014 338 iccv-2013-Randomized Ensemble Tracking
17 0.43533909 137 iccv-2013-Efficient Salient Region Detection with Soft Image Abstraction
18 0.43157923 45 iccv-2013-Affine-Constrained Group Sparse Coding and Its Application to Image-Based Classifications
19 0.43133163 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning
20 0.42880535 381 iccv-2013-Semantically-Based Human Scanpath Estimation with HMMs