nips nips2006 nips2006-86 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jonathan Harel, Christof Koch, Pietro Perona
Abstract: A new bottom-up visual saliency model, Graph-Based Visual Saliency (GBVS), is proposed. It consists of two steps: first forming activation maps on certain feature channels, and then normalizing them in a way which highlights conspicuity and admits combination with other maps. The model is simple, and biologically plausible insofar as it is naturally parallelized. This model powerfully predicts human fixations on 749 variations of 108 natural images, achieving 98% of the ROC area of a human-based control, whereas the classical algorithms of Itti & Koch ([2], [3], [4]) achieve only 84%. 1
Reference: text
sentIndex sentText sentNum sentScore
1 A new bottom-up visual saliency model, Graph-Based Visual Saliency (GBVS), is proposed. [sent-5, score-0.578]
2 It consists of two steps: first forming activation maps on certain feature channels, and then normalizing them in a way which highlights conspicuity and admits combination with other maps. [sent-6, score-0.617]
3 This model powerfully predicts human fixations on 749 variations of 108 natural images, achieving 98% of the ROC area of a human-based control, whereas the classical algorithms of Itti & Koch ([2], [3], [4]) achieve only 84%. [sent-8, score-0.355]
4 The ability to predict, given an image (or video), where a human might fixate in a fixed-time free-viewing scenario has long been of interest in the vision community. [sent-11, score-0.223]
5 , [2], [9]) are based on biologically motivated feature selection, followed by center-surround operations which highlight local gradients, and finally a combination step leading to a "master map". [sent-17, score-0.147]
6 However, ultimately, Bruce computes a function which is additive in feature maps, with the main contribution materializing as a method of operating on a feature map in such a way as to obtain an activation, or saliency, map. [sent-19, score-0.273]
7 Itti and Baldi define "surprise" in general, but ultimately compute a saliency map in the classical [2] sense for each of a number of feature channels, then operate on these maps using another function aimed at highlighting local variation. [sent-20, score-0.888]
8 In the classic algorithms, step (s1) is done using biologically inspired filters, step (s2) is accomplished by subtracting feature maps at different scales (henceforth, "c-s" for "center" - "surround"), and step (s3) is accomplished in one of three ways: 1. [sent-25, score-0.45]
9 a normalization scheme based on local maxima [2] ("max-ave"), 2. [sent-26, score-0.16]
10 We take a different approach, exploiting the computational power, topographical structure, and parallel nature of graph algorithms to achieve natural and efficient saliency computations. [sent-29, score-0.652]
11 We define Markov chains over various image maps, and treat the equilibrium distribution over map locations as activation and saliency values. [sent-30, score-1.197]
12 This idea is not completely new: Brockmann and Geisel [8] suggest that scanpaths might be predicted by properly defined Lévy flights over saliency fields, and more recently Boccignone and Ferraro [7] do the same. [sent-31, score-0.591]
13 Importantly, they assume that a saliency map is already available, and offer an alternative to the winner-takes-all approach of mapping this object to a set of fixation locations. [sent-32, score-0.819]
14 Here, we take a unified approach to steps (s2) and (s3) of saliency computation, by using dissimilarity and saliency to define edge weights on graphs which are interpreted as Markov chains. [sent-36, score-1.135]
15 We also directly compare our method to others, using power to predict human fixations as a performance metric. [sent-38, score-0.324]
16 The contributions of this paper are as follows: (1) A complete bottom-up saliency model based on graph computations, GBVS, including a framework for "activation" and "normalization/combination". [sent-39, score-0.652]
17 , foliage) with the eye-movement fixation data of seven human subjects, from a recent study by Einhäuser et al. [sent-41, score-0.275]
18 2 The Proposed Method: Graph-Based Saliency (GBVS) Given an image I, we wish to ultimately highlight a handful of `significant' locations where the image is `informative' according to some criterion, e. [sent-44, score-0.198]
19 As previously explained, this process is conditioned on first computing feature maps (s1), e. [sent-47, score-0.196]
20 Our goal is to compute an activation map A : [n]^2 → R, [sent-54, score-0.487]
21 such that, intuitively, locations (i, j) ∈ [n]^2 where I, or as a proxy, M(i, j), is somehow unusual in its neighborhood will correspond to high values of activation A. [sent-55, score-0.494]
22 Also, the maps M, and later A, are presented as square (n × n) only for expository simplicity. [sent-61, score-0.144]
23 Nothing in this paper will depend critically on the square assumption, and, in practice, rectangular maps are used instead. [sent-62, score-0.144]
24 Consider now the fully-connected directed graph GA, obtained by connecting every node of the lattice M, labelled with two indices (i, j) ∈ [n]^2, with all other n^2 − 1 nodes. [sent-68, score-0.278]
25 The directed edge from node (i, j) to node (p, q) will be assigned a weight w1((i, j), (p, q)) ≜ d((i, j) ‖ (p, q)) · F(i − p, j − q), where F(a, b) ≜ exp(−(a² + b²)/(2σ²)) and σ is a free parameter of our algorithm (footnote 2). [sent-69, score-0.169]
26 Thus, the weight of the edge from node (i, j) to node (p, q) is proportional to their dissimilarity and to their closeness in the domain of M. [sent-70, score-0.249]
27 We may now define a Markov chain on GA by normalizing the weights of the outbound edges of each node to 1, and drawing an equivalence between nodes and states, and between edge weights and transition probabilities. [sent-72, score-0.268]
28 The result is an activation measure which is derived from pairwise contrast. [sent-74, score-0.347]
29 Computations can be carried out independently at each node: in a synchronous environment, at each time step, each node simply sums incoming mass, then passes along measured partitions of this mass to its neighbors according to outbound edge weights. [sent-77, score-0.231]
30 The same simple process happening at all nodes simultaneously gives rise to an equilibrium distribution of mass. [sent-78, score-0.15]
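This mass-passing procedure is straightforward to sketch. The Python/NumPy sketch below builds the dense transition matrix of GA and runs the synchronous update just described, which is simply power iteration toward the equilibrium distribution. The log-ratio dissimilarity d((i, j) ‖ (p, q)) = |log(M(i, j)/M(p, q))| is an assumption carried over from the full paper (this excerpt omits the definition of d), and all function and parameter names are illustrative.

```python
import numpy as np

def activation_map(M, sigma, tol=1e-9, max_iter=10000):
    """Minimal sketch of GBVS activation (s2): build the fully connected
    graph GA over the feature map M, normalize outbound edge weights into
    a Markov chain, and return its equilibrium distribution as activation.
    The log-ratio dissimilarity is an assumed form (omitted in this excerpt)."""
    n_r, n_c = M.shape
    ys, xs = np.mgrid[0:n_r, 0:n_c]
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    m = M.ravel().astype(float) + 1e-12            # avoid log(0)

    # d((i,j)||(p,q)) = |log(M(i,j)/M(p,q))|  (assumed dissimilarity)
    d = np.abs(np.log(m[:, None]) - np.log(m[None, :]))
    # F(a,b) = exp(-(a^2+b^2)/(2 sigma^2)): closeness in the lattice
    sq = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)
    F = np.exp(-sq / (2.0 * sigma ** 2))

    W = d * F                                      # w1 = dissimilarity * closeness
    P = W / (W.sum(axis=1, keepdims=True) + 1e-12) # outbound weights sum to 1

    # Synchronous mass passing = power iteration toward the equilibrium.
    mass = np.full(m.size, 1.0 / m.size)
    for _ in range(max_iter):
        new_mass = mass @ P
        if np.abs(new_mass - mass).sum() < tol:
            break
        mass = new_mass
    return mass.reshape(n_r, n_c)
```

For a 25 × 37 map the chain has 925 states, so the dense 925 × 925 transition matrix is small enough for this direct iteration.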
31 Technical Notes The equilibrium distribution of this chain exists and is unique because the chain is ergodic, a property which emerges from the fact that our underlying graph GA is by construction strongly connected. [sent-79, score-0.289]
32 2 "Normalizing" an Activation Map (s3) The aim of the "normalization" step of the algorithm is much less clear than that of the activation step. [sent-84, score-0.378]
33 Earlier, three separate approaches were mentioned as existing benchmarks, and also the recent work of Itti on surprise [4] comes into the saliency computation at this stage of the process (although it can also be applied to s2 as mentioned above). [sent-86, score-0.617]
34 We shall state the goal of this step as: concentrating mass on activation maps. [sent-87, score-0.488]
35 If mass is not concentrated on individual activation maps prior to additive combination, then the resulting master map may be too nearly uniform and hence uninformative. [sent-88, score-0.769]
36 Although this may seem trivial, it is on some level the very soul of any saliency algorithm: concentrating activation into a few key locations. [sent-89, score-0.901]
37 2 In our experiments, this parameter was set to approximately one tenth to one fifth of the map width. [sent-90, score-0.14]
38 3 Our implementation, not optimized for speed, converges on a single map of size 25 × 37 in fractions of a second on a 2. [sent-92, score-0.14]
39 Armed with the mass-concentration definition, we propose another Markovian algorithm as follows: this time, we begin with an activation map (footnote 4) A : [n]^2 → R. [sent-94, score-0.347]
40 We construct a graph GN with n^2 nodes labelled with indices from [n]^2. [sent-96, score-0.27]
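This excerpt omits the edge weights of GN; in the full paper they take the form w2((i, j), (p, q)) ≜ A(p, q) · F(i − p, j − q), so that mass flows preferentially into nodes that already carry high activation. Below is a minimal sketch under that assumption, reusing the Gaussian falloff F from the activation sketch above.

```python
import numpy as np

def normalize_map(A, sigma, tol=1e-9, max_iter=10000):
    """Sketch of GBVS normalization (s3): concentrate mass via a second
    Markov chain. Edge weight w2((i,j),(p,q)) = A(p,q) * F(i-p, j-q) is an
    assumption carried over from the full paper (omitted in this excerpt)."""
    n_r, n_c = A.shape
    ys, xs = np.mgrid[0:n_r, 0:n_c]
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    a = A.ravel().astype(float)

    sq = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)
    F = np.exp(-sq / (2.0 * sigma ** 2))
    W = a[None, :] * F                             # inbound weight scales with A(p,q)
    P = W / (W.sum(axis=1, keepdims=True) + 1e-12)

    mass = np.full(a.size, 1.0 / a.size)           # start from the uniform distribution
    for _ in range(max_iter):
        new_mass = mass @ P
        if np.abs(new_mass - mass).sum() < tol:
            break
        mass = new_mass
    return mass.reshape(n_r, n_c)
```

Because every edge into a node is scaled by that node's activation, the equilibrium distribution piles mass onto the few strongest regions, which is exactly the stated goal of this step.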
41 1 Experimental Results Preliminaries and paradigm We perform saliency computations on real images of the natural world, and compare the power of the resulting maps to predict human fixations. [sent-102, score-0.875]
42 The experimental paradigm we pursue is the following: for each of a set of images, we compute a set of feature maps using standard techniques. [sent-103, score-0.196]
43 Then, we process each of these feature maps using some activation algorithm, and then some normalization algorithm, and then simply sum over the feature channels. [sent-104, score-0.755]
44 The resulting master saliency map is scored (using an ROC area metric described below) relative to fixation data collected for the corresponding image, and labelled according to the activation and normalization algorithms used to obtain it. [sent-105, score-1.602]
45 We then pool over a corpus of images, and the resulting set of scored and labelled master saliency maps is analyzed in various ways presented below. [sent-106, score-0.826]
46 Some notes follow: Algorithm Labels: Hereafter, "graph (i)" and "graph (ii)" refer to the activation algorithm described in section 2. [sent-107, score-0.383]
47 The difference is that in graph (i), the parameter σ = 2.5, whereas in graph (ii), σ = 5. [sent-110, score-0.28]
48 "graph (iii)" and "graph (iv)" refer to an iterated repetition of the normalization algorithm described in section 2. [sent-111, score-0.187]
49 The normalization algorithm referred to as "I" corresponds to "Identity", with the most naive normalization rule: it does nothing, leaving activations unchanged prior to subsequent combination. [sent-114, score-0.32]
50 Performance metric: We wish to give a reward quantity to a saliency map, given some target locations, e. [sent-118, score-0.539]
51 , in the case of natural images, a set of locations at which human observers fixated. [sent-120, score-0.168]
52 For any one threshold saliency value, one can treat the saliency map as a classifier, with all points above threshold indicated as "target" and all points below threshold as "background". [sent-121, score-1.245]
53 For any particular value of the threshold, there is some fraction of the actual target points which are labelled as such (true positive rate), and some fraction of points which were not target but labelled as such anyway (false positive rate). [sent-122, score-0.252]
54 This is the performance metric we use to measure how well a saliency map predicts fixation locations on a given image. [sent-124, score-0.921]
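A sketch of this metric: sweep a threshold over the map's saliency values, compute true and false positive rates at each threshold with fixated locations as targets and all remaining map locations as background, and integrate the resulting curve by the trapezoidal rule. Function and argument names are illustrative.

```python
import numpy as np

def roc_area(sal_map, fixations):
    """Sketch of the ROC-area metric: at every threshold, treat the saliency
    map as a classifier with fixated locations as targets and all other map
    locations as background, then integrate the ROC curve."""
    target = np.zeros(sal_map.shape, dtype=bool)
    for r, c in fixations:                  # fixations: list of (row, col)
        target[r, c] = True
    pos = sal_map[target]                   # saliency at fixated points
    neg = sal_map[~target]                  # saliency everywhere else

    thresholds = np.unique(sal_map)[::-1]   # sweep from high to low
    tpr = np.array([(pos >= t).mean() for t in thresholds])
    fpr = np.array([(neg >= t).mean() for t in thresholds])
    tpr = np.concatenate([[0.0], tpr, [1.0]])
    fpr = np.concatenate([[0.0], fpr, [1.0]])
    return float(np.trapz(tpr, fpr))        # trapezoidal area under the curve
```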
55 , if the graph-based activation step is concatenated with the graph-based normalization step, we will call the resulting algorithm GBVS. [sent-128, score-0.538]
56 5 We note that this normalization step of GBVS can be iterated several times to improve performance. [sent-130, score-0.218]
57 6 with the intuition being that competition among salient regions can settle, at which point it is wise to terminate 7 http://www. [sent-133, score-0.512]
58 [1], human and primate fixation data was collected on 108 images, each modified (footnote 8) in nine ways. [sent-137, score-0.275]
59 Figure 2 shows an example image from this collection, together with "x"s marking the fixation points of three human subjects on this particular picture. [sent-138, score-0.394]
60 In the present study, 749 unique modifications of the 108 original images, and 24149 human fixations from [1] were used. [sent-139, score-0.291]
61 Only pictures for which fixation data from three human subjects were available were used. [sent-140, score-0.345]
62 Each image was cropped to 600 × 400 pixels and was presented to subjects so that it took up 76° × 55° of their visual field. [sent-141, score-0.185]
63 In order to facilitate a fair comparison of algorithms, the first step of the saliency algorithm, feature extraction (s1), was the same for every experiment. [sent-142, score-0.595]
64 Each of these 12 maps was finally downsampled to a 25 × 37 raw feature map. [sent-144, score-0.196]
65 "c-s" (center-surround) activation maps were computed by subtracting, from each raw feature map, a feature map on the same channel originally computed at a scale 4 binary orders of magnitude smaller in overall resolution and then resized smoothly to size 25 × 37. [sent-145, score-0.805]
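As a concrete reading of this baseline, the sketch below recomputes the surround by downsampling and smoothly resizing back, then subtracts. Interpreting "4 binary orders of magnitude smaller in overall resolution" as 2^4 = 16× fewer pixels (a factor of 4 per dimension) is our assumption, as is the use of the absolute difference.

```python
import numpy as np
from scipy.ndimage import zoom

def center_surround(raw):
    """Sketch of the "c-s" baseline: subtract from a raw feature map a
    smoothly resized coarse version of itself. The 4x-per-dimension scale
    and the absolute difference are assumed interpretations."""
    coarse = zoom(raw, 0.25, order=1)                 # surround scale
    factors = (raw.shape[0] / coarse.shape[0],
               raw.shape[1] / coarse.shape[1])
    surround = zoom(coarse, factors, order=3)         # resize smoothly back
    return np.abs(raw - surround[:raw.shape[0], :raw.shape[1]])
```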
66 The other activation procedures are described in section 2. [sent-147, score-0.347]
67 The normalization procedures are all earlier described and named. [sent-152, score-0.16]
68 Figure 2 shows an actual image with the resulting saliency maps from two different (activation, normalization) schemes. [sent-153, score-0.705]
69 Figure 2: (a) An image from the data-set with fixations indicated using x's. [sent-156, score-0.232]
70 (b) The saliency map formed when using (activation,normalization)= (graph (i),graph (iii)). [sent-157, score-0.652]
71 (c) Saliency map for (activation,normalization)=(c-s,DoG) Finally, we show the performance of this algorithm on the corpus of images. [sent-158, score-0.14]
72 For each image, a mean inter-subject ROC area was computed as follows: for each of the three subjects who viewed an image, the fixation points of the remaining two subjects were convolved with a circular, decaying kernel with decay constant matched to the decaying cone density in the retina. [sent-159, score-0.46]
73 This was treated as a saliency map derived directly from human fixations, and with the target points being set to the [sent-160, score-0.858]
74 fixations of the first subject, an ROC area was computed for a single subject. (Footnote 8: Modifications were made to change the luminance contrast either up or down in selected circular regions.) [sent-163, score-0.282]
75 For each range of this quantity, a mean performance metric was computed for various activation and normalization schemes. [sent-165, score-0.584]
76 For any particular scheme, an ROC area was computed using the resulting saliency map together with the fixations from all 3 human subjects as target points to detect. [sent-166, score-1.139]
77 Each curve represents a different activation scheme, while averaging over individual image numbers and normalization schemes. [sent-193, score-0.556]
78 (b) A mean ROC metric is similarly computed, instead holding the normalization constant while varying the activation scheme. [sent-194, score-0.549]
79 Figure 4: We compare the predictive power of five saliency algorithms. [sent-210, score-0.545]
80 The best performer is the method which combines a graph based activation algorithm with a graph based normalization algorithm. [sent-211, score-0.787]
81 The combination of a few possible pairs of activation schemes together with normalization schemes is summarized in Table 1, with notes indicating where certain combinations correspond to established benchmarks. [sent-212, score-0.603]
82 0.55 for the Itti & Koch saliency algorithms [2] on these images. [sent-215, score-0.512]
83 In [1] (Footnote 9: To form a true upper bound, one would need the fixation data of many more than three humans on each image.) [sent-216, score-0.196]
84 0.57, which is remarkably close and plausible if you assume slightly more sophisticated feature maps (for instance, at more scales). [sent-218, score-0.196]
85 Table 1: Performance of end-to-end algorithms [sent-219, score-1.811]
activation algorithm | normalization algorithm | ROC area (fraction; footnote 10)
graph (ii)           | graph (iv)              | 0.…
graph (i)            | graph (iv)              | 0.…
graph (ii)           | I                       | 0.…
graph (ii)           | ave-max                 | 0.…
graph (ii)           | graph (iii)             | 0.…
graph (i)            | graph (iii)             | 0.…
self-info            | I                       | 0.…
86 The first observation is that, because nodes are on average closer to a few center nodes than to any particular point along the image periphery, it is an emergent property that GBVS promotes higher saliency values in the center of the image plane. [sent-231, score-0.843]
87 We hypothesize that this "center bias" is favorable with respect to predicting fixations due to human experience both with photographs, which are typically taken with a central subject, and with everyday life in which head motion often results in gazing straight ahead. [sent-232, score-0.291]
88 However, if we introduce this center bias to the output of the standard algorithms' master maps (via pointwise multiplication), we find that the standard algorithms predict fixations better, but still worse than GBVS. [sent-235, score-0.436]
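A sketch of this center-bias control: pointwise-multiply a master map by a centered weighting. The Gaussian bump and its width are assumptions made for illustration; the text specifies only the pointwise multiplication.

```python
import numpy as np

def apply_center_bias(master_map, width_frac=0.25):
    """Sketch of the center-bias control: pointwise-multiply the master map
    by a centered weighting. The Gaussian shape and width are assumptions."""
    n_r, n_c = master_map.shape
    ys, xs = np.mgrid[0:n_r, 0:n_c]
    cy, cx = (n_r - 1) / 2.0, (n_c - 1) / 2.0
    s = width_frac * max(n_r, n_c)
    bias = np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * s ** 2))
    return master_map * bias
```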
89 We note here that what is lacking from GBVS as described above is any notion of a multiresolution representation of map data. [sent-245, score-0.199]
90 Therefore, because multiresolution representations are so basic, one may extend both the graph-based activation and normalization steps to a multiresolution version as follows: we begin with, instead of a single map A : [n]^2 → R, [sent-246, score-0.765]
91 a collection of maps {A_i}, with each A_i : [n_i]^2 → R. [sent-247, score-0.144]
92 The authors suggest a definition whereby: (1) each point in each map is assigned a set of locations, (2) this set corresponds to the spatial support of this point in the highest resolution map, and (3) the distance between two sets of locations is given as the mean of the set of pairwise distances. [sent-251, score-0.2]
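This definition is direct to compute. In the sketch below, a node at scale s is assumed to cover an s × s block of finest-resolution pixels (the block shape is our assumption; the text says only "spatial support"), and the inter-node distance is the mean over all pairwise Euclidean distances between the two support sets.

```python
import numpy as np

def multires_distance(node_a, node_b, scale_a, scale_b):
    """Sketch of the suggested multiresolution distance: mean pairwise
    Euclidean distance between the two nodes' finest-resolution supports.
    Block-shaped supports are an assumed interpretation."""
    def support(node, s):
        r, c = node
        ys, xs = np.mgrid[r * s:(r + 1) * s, c * s:(c + 1) * s]
        return np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)

    A = support(node_a, scale_a)
    B = support(node_b, scale_b)
    diffs = A[:, None, :] - B[None, :, :]
    return float(np.sqrt((diffs ** 2).sum(-1)).mean())
```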
93 Therefore, we have presented a method of computing bottom-up saliency maps which shows a remarkable consistency with the attentional deployment of human subjects. [sent-254, score-0.764]
94 The method uses a novel application of ideas from graph theory to concentrate mass on activation maps, and to form activation maps from raw features. [sent-255, score-1.046]
95 We compared our method with established models and found that ours performed favorably, for both of the key steps in our organization of saliency computations. [sent-256, score-0.512]
96 Acknowledgments The authors express sincere gratitude to Wolfgang Einhäuser for his offering of natural images, and the fixation data associated with them from a study with seven human subjects. [sent-258, score-0.275]
97 Niebur, "A model of saliency-based visual attention for rapid scene analysis", IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998 [3] L. [sent-269, score-0.512]
98 Koch, "A saliency-based search mechanism for overt and covert shifts of visual attention", Vision Research, 2000 [4] L. [sent-271, score-0.133]
99 König, "Does luminance-contrast contribute to a saliency map for overt visual attention? [sent-297, score-0.785]
100 Gilchrist, "Visual correlates of fixation selection: Effects of scale and time. [sent-312, score-0.167]
wordName wordTfidf (topN-words)
[('saliency', 0.512), ('activation', 0.347), ('gbvs', 0.21), ('itti', 0.202), ('roc', 0.187), ('xations', 0.183), ('koch', 0.176), ('xation', 0.167), ('normalization', 0.16), ('maps', 0.144), ('graph', 0.14), ('map', 0.14), ('human', 0.108), ('einh', 0.105), ('surprise', 0.105), ('dog', 0.1), ('nl', 0.093), ('bruce', 0.091), ('equilibrium', 0.089), ('dissimilarity', 0.08), ('subjects', 0.07), ('master', 0.07), ('node', 0.069), ('labelled', 0.069), ('mass', 0.068), ('overt', 0.067), ('iv', 0.067), ('visual', 0.066), ('attention', 0.064), ('biologically', 0.064), ('area', 0.064), ('outbound', 0.063), ('nodes', 0.061), ('locations', 0.06), ('multiresolution', 0.059), ('feature', 0.052), ('images', 0.051), ('image', 0.049), ('iii', 0.048), ('perona', 0.048), ('niebur', 0.047), ('normalizing', 0.045), ('luminance', 0.044), ('braun', 0.044), ('user', 0.043), ('metric', 0.042), ('boccignone', 0.042), ('concentrating', 0.042), ('ferraro', 0.042), ('foliage', 0.042), ('geisel', 0.042), ('nig', 0.042), ('organic', 0.042), ('scanpaths', 0.042), ('tsotsos', 0.042), ('ga', 0.04), ('ultimately', 0.04), ('center', 0.039), ('ii', 0.037), ('termination', 0.037), ('xate', 0.037), ('brockmann', 0.037), ('ights', 0.037), ('ro', 0.037), ('tremendous', 0.037), ('notes', 0.036), ('computed', 0.035), ('accomplished', 0.035), ('surround', 0.033), ('somehow', 0.033), ('emergent', 0.033), ('costa', 0.033), ('power', 0.033), ('salient', 0.032), ('edge', 0.031), ('step', 0.031), ('baldi', 0.031), ('scored', 0.031), ('levy', 0.031), ('chain', 0.03), ('fraction', 0.03), ('schemes', 0.03), ('humans', 0.029), ('activating', 0.029), ('highlights', 0.029), ('operating', 0.029), ('vision', 0.029), ('psychophysics', 0.028), ('unusual', 0.028), ('jj', 0.028), ('benchmarks', 0.028), ('scene', 0.027), ('computations', 0.027), ('target', 0.027), ('subtracting', 0.027), ('circular', 0.027), ('iterated', 0.027), ('decaying', 0.027), ('threshold', 0.027), ('neighborhood', 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999964 86 nips-2006-Graph-Based Visual Saliency
Author: Jonathan Harel, Christof Koch, Pietro Perona
Abstract: A new bottom-up visual saliency model, Graph-Based Visual Saliency (GBVS), is proposed. It consists of two steps: first forming activation maps on certain feature channels, and then normalizing them in a way which highlights conspicuity and admits combination with other maps. The model is simple, and biologically plausible insofar as it is naturally parallelized. This model powerfully predicts human fixations on 749 variations of 108 natural images, achieving 98% of the ROC area of a human-based control, whereas the classical algorithms of Itti & Koch ([2], [3], [4]) achieve only 84%. 1
2 0.50377297 8 nips-2006-A Nonparametric Approach to Bottom-Up Visual Saliency
Author: Wolf Kienzle, Felix A. Wichmann, Matthias O. Franz, Bernhard Schölkopf
Abstract: This paper addresses the bottom-up influence of local image information on human eye movements. Most existing computational models use a set of biologically plausible linear filters, e.g., Gabor or Difference-of-Gaussians filters as a front-end, the outputs of which are nonlinearly combined into a real number that indicates visual saliency. Unfortunately, this requires many design parameters such as the number, type, and size of the front-end filters, as well as the choice of nonlinearities, weighting and normalization schemes etc., for which biological plausibility cannot always be justified. As a result, these parameters have to be chosen in a more or less ad hoc way. Here, we propose to learn a visual saliency model directly from human eye movement data. The model is rather simplistic and essentially parameter-free, and therefore contrasts recent developments in the field that usually aim at higher prediction rates at the cost of additional parameters and increasing model complexity. Experimental results show that—despite the lack of any biological prior knowledge—our model performs comparably to existing approaches, and in fact learns image features that resemble findings from several previous studies. In particular, its maximally excitatory stimuli have center-surround structure, similar to receptive fields in the early human visual system. 1
3 0.11961122 167 nips-2006-Recursive ICA
Author: Honghao Shan, Lingyun Zhang, Garrison W. Cottrell
Abstract: Independent Component Analysis (ICA) is a popular method for extracting independent features from visual data. However, as a fundamentally linear technique, there is always nonlinear residual redundancy that is not captured by ICA. Hence there have been many attempts to try to create a hierarchical version of ICA, but so far none of the approaches have a natural way to apply them more than once. Here we show that there is a relatively simple technique that transforms the absolute values of the outputs of a previous application of ICA into a normal distribution, to which ICA may be applied again. This results in a recursive ICA algorithm that may be applied any number of times in order to extract higher order structure from previous layers. 1
4 0.11749637 91 nips-2006-Hierarchical Dirichlet Processes with Random Effects
Author: Seyoung Kim, Padhraic Smyth
Abstract: Data sets involving multiple groups with shared characteristics frequently arise in practice. In this paper we extend hierarchical Dirichlet processes to model such data. Each group is assumed to be generated from a template mixture model with group level variability in both the mixing proportions and the component parameters. Variabilities in mixing proportions across groups are handled using hierarchical Dirichlet processes, also allowing for automatic determination of the number of components. In addition, each group is allowed to have its own component parameters coming from a prior described by a template mixture model. This group-level variability in the component parameters is handled using a random effects model. We present a Markov Chain Monte Carlo (MCMC) sampling algorithm to estimate model parameters and demonstrate the method by applying it to the problem of modeling spatial brain activation patterns across multiple images collected via functional magnetic resonance imaging (fMRI). 1
5 0.10460953 9 nips-2006-A Nonparametric Bayesian Method for Inferring Features From Similarity Judgments
Author: Daniel J. Navarro, Thomas L. Griffiths
Abstract: The additive clustering model is widely used to infer the features of a set of stimuli from their similarities, on the assumption that similarity is a weighted linear function of common features. This paper develops a fully Bayesian formulation of the additive clustering model, using methods from nonparametric Bayesian statistics to allow the number of features to vary. We use this to explore several approaches to parameter estimation, showing that the nonparametric Bayesian approach provides a straightforward way to obtain estimates of both the number of features used in producing similarity judgments and their importance. 1
6 0.09707208 39 nips-2006-Balanced Graph Matching
7 0.087477297 14 nips-2006-A Small World Threshold for Economic Network Formation
8 0.082629539 70 nips-2006-Doubly Stochastic Normalization for Spectral Clustering
9 0.078343108 31 nips-2006-Analysis of Contour Motions
10 0.064839534 66 nips-2006-Detecting Humans via Their Pose
11 0.063886888 78 nips-2006-Fast Discriminative Visual Codebooks using Randomized Clustering Forests
12 0.060947292 94 nips-2006-Image Retrieval and Classification Using Local Distance Functions
13 0.059418991 74 nips-2006-Efficient Structure Learning of Markov Networks using $L 1$-Regularization
14 0.058106992 18 nips-2006-A selective attention multi--chip system with dynamic synapses and spiking neurons
15 0.057867359 92 nips-2006-High-Dimensional Graphical Model Selection Using $\ell 1$-Regularized Logistic Regression
16 0.057517737 117 nips-2006-Learning on Graph with Laplacian Regularization
17 0.056469016 110 nips-2006-Learning Dense 3D Correspondence
18 0.05601434 77 nips-2006-Fast Computation of Graph Kernels
19 0.055376865 185 nips-2006-Subordinate class recognition using relational object models
20 0.054042254 80 nips-2006-Fundamental Limitations of Spectral Clustering
topicId topicWeight
[(0, -0.205), (1, -0.021), (2, 0.14), (3, -0.067), (4, 0.073), (5, -0.162), (6, -0.136), (7, 0.02), (8, -0.007), (9, -0.087), (10, 0.307), (11, -0.055), (12, 0.035), (13, -0.17), (14, -0.211), (15, 0.03), (16, 0.09), (17, 0.016), (18, -0.087), (19, -0.144), (20, 0.311), (21, -0.125), (22, -0.241), (23, 0.215), (24, -0.177), (25, -0.074), (26, -0.012), (27, 0.017), (28, -0.095), (29, 0.162), (30, 0.09), (31, 0.025), (32, -0.106), (33, -0.018), (34, -0.026), (35, 0.049), (36, 0.082), (37, 0.015), (38, 0.075), (39, 0.028), (40, 0.07), (41, 0.021), (42, -0.015), (43, 0.001), (44, 0.031), (45, 0.034), (46, -0.065), (47, 0.047), (48, -0.018), (49, 0.022)]
simIndex simValue paperId paperTitle
same-paper 1 0.96575063 86 nips-2006-Graph-Based Visual Saliency
Author: Jonathan Harel, Christof Koch, Pietro Perona
Abstract: A new bottom-up visual saliency model, Graph-Based Visual Saliency (GBVS), is proposed. It consists of two steps: first forming activation maps on certain feature channels, and then normalizing them in a way which highlights conspicuity and admits combination with other maps. The model is simple, and biologically plausible insofar as it is naturally parallelized. This model powerfully predicts human fixations on 749 variations of 108 natural images, achieving 98% of the ROC area of a human-based control, whereas the classical algorithms of Itti & Koch ([2], [3], [4]) achieve only 84%. 1
2 0.83398032 8 nips-2006-A Nonparametric Approach to Bottom-Up Visual Saliency
Author: Wolf Kienzle, Felix A. Wichmann, Matthias O. Franz, Bernhard Schölkopf
Abstract: This paper addresses the bottom-up influence of local image information on human eye movements. Most existing computational models use a set of biologically plausible linear filters, e.g., Gabor or Difference-of-Gaussians filters as a front-end, the outputs of which are nonlinearly combined into a real number that indicates visual saliency. Unfortunately, this requires many design parameters such as the number, type, and size of the front-end filters, as well as the choice of nonlinearities, weighting and normalization schemes etc., for which biological plausibility cannot always be justified. As a result, these parameters have to be chosen in a more or less ad hoc way. Here, we propose to learn a visual saliency model directly from human eye movement data. The model is rather simplistic and essentially parameter-free, and therefore contrasts recent developments in the field that usually aim at higher prediction rates at the cost of additional parameters and increasing model complexity. Experimental results show that—despite the lack of any biological prior knowledge—our model performs comparably to existing approaches, and in fact learns image features that resemble findings from several previous studies. In particular, its maximally excitatory stimuli have center-surround structure, similar to receptive fields in the early human visual system. 1
3 0.34726626 174 nips-2006-Similarity by Composition
Author: Oren Boiman, Michal Irani
Abstract: We propose a new approach for measuring similarity between two signals, which is applicable to many machine learning tasks, and to many signal types. We say that a signal S1 is “similar” to a signal S2 if it is “easy” to compose S1 from few large contiguous chunks of S2 . Obviously, if we use small enough pieces, then any signal can be composed of any other. Therefore, the larger those pieces are, the more similar S1 is to S2 . This induces a local similarity score at every point in the signal, based on the size of its supported surrounding region. These local scores can in turn be accumulated in a principled information-theoretic way into a global similarity score of the entire S1 to S2 . “Similarity by Composition” can be applied between pairs of signals, between groups of signals, and also between different portions of the same signal. It can therefore be employed in a wide variety of machine learning problems (clustering, classification, retrieval, segmentation, attention, saliency, labelling, etc.), and can be applied to a wide range of signal types (images, video, audio, biological data, etc.) We show a few such examples. 1
4 0.32464039 9 nips-2006-A Nonparametric Bayesian Method for Inferring Features From Similarity Judgments
Author: Daniel J. Navarro, Thomas L. Griffiths
Abstract: The additive clustering model is widely used to infer the features of a set of stimuli from their similarities, on the assumption that similarity is a weighted linear function of common features. This paper develops a fully Bayesian formulation of the additive clustering model, using methods from nonparametric Bayesian statistics to allow the number of features to vary. We use this to explore several approaches to parameter estimation, showing that the nonparametric Bayesian approach provides a straightforward way to obtain estimates of both the number of features used in producing similarity judgments and their importance. 1
5 0.29774827 18 nips-2006-A selective attention multi--chip system with dynamic synapses and spiking neurons
Author: Chiara Bartolozzi, Giacomo Indiveri
Abstract: Selective attention is the strategy used by biological sensory systems to solve the problem of limited parallel processing capacity: salient subregions of the input stimuli are serially processed, while non–salient regions are suppressed. We present a mixed-mode analog/digital Very Large Scale Integration implementation of a building block for a multi–chip neuromorphic hardware model of selective attention. We describe the chip’s architecture and its behavior, when it is part of a multi–chip system with a spiking retina as input, and show how it can be used to implement in real-time flexible models of bottom-up attention. 1
6 0.28707933 73 nips-2006-Efficient Methods for Privacy Preserving Face Detection
7 0.28633431 39 nips-2006-Balanced Graph Matching
8 0.28407028 91 nips-2006-Hierarchical Dirichlet Processes with Random Effects
9 0.28233761 70 nips-2006-Doubly Stochastic Normalization for Spectral Clustering
10 0.27571625 52 nips-2006-Clustering appearance and shape by learning jigsaws
11 0.27137879 31 nips-2006-Analysis of Contour Motions
12 0.271321 167 nips-2006-Recursive ICA
13 0.26285285 78 nips-2006-Fast Discriminative Visual Codebooks using Randomized Clustering Forests
14 0.2312375 42 nips-2006-Bayesian Image Super-resolution, Continued
15 0.23084243 76 nips-2006-Emergence of conjunctive visual features by quadratic independent component analysis
16 0.22509402 169 nips-2006-Relational Learning with Gaussian Processes
17 0.22158717 74 nips-2006-Efficient Structure Learning of Markov Networks using $L 1$-Regularization
18 0.2208288 120 nips-2006-Learning to Traverse Image Manifolds
19 0.21882699 14 nips-2006-A Small World Threshold for Economic Network Formation
20 0.21463251 117 nips-2006-Learning on Graph with Laplacian Regularization
topicId topicWeight
[(1, 0.09), (3, 0.038), (7, 0.082), (9, 0.051), (12, 0.012), (20, 0.019), (22, 0.04), (28, 0.303), (44, 0.059), (57, 0.091), (64, 0.015), (65, 0.038), (69, 0.04), (71, 0.024), (90, 0.011)]
simIndex simValue paperId paperTitle
1 0.90576506 90 nips-2006-Hidden Markov Dirichlet Process: Modeling Genetic Recombination in Open Ancestral Space
Author: Kyung-ah Sohn, Eric P. Xing
Abstract: We present a new statistical framework called hidden Markov Dirichlet process (HMDP) to jointly model the genetic recombinations among possibly infinite number of founders and the coalescence-with-mutation events in the resulting genealogies. The HMDP posits that a haplotype of genetic markers is generated by a sequence of recombination events that select an ancestor for each locus from an unbounded set of founders according to a 1st-order Markov transition process. Conjoining this process with a mutation model, our method accommodates both between-lineage recombination and within-lineage sequence variations, and leads to a compact and natural interpretation of the population structure and inheritance process underlying haplotype data. We have developed an efficient sampling algorithm for HMDP based on a two-level nested P´ lya urn scheme. On both simulated o and real SNP haplotype data, our method performs competitively or significantly better than extant methods in uncovering the recombination hotspots along chromosomal loci; and in addition it also infers the ancestral genetic patterns and offers a highly accurate map of ancestral compositions of modern populations. 1
same-paper 2 0.80679911 86 nips-2006-Graph-Based Visual Saliency
Author: Jonathan Harel, Christof Koch, Pietro Perona
Abstract: A new bottom-up visual saliency model, Graph-Based Visual Saliency (GBVS), is proposed. It consists of two steps: first forming activation maps on certain feature channels, and then normalizing them in a way which highlights conspicuity and admits combination with other maps. The model is simple, and biologically plausible insofar as it is naturally parallelized. This model powerfully predicts human fixations on 749 variations of 108 natural images, achieving 98% of the ROC area of a human-based control, whereas the classical algorithms of Itti & Koch ([2], [3], [4]) achieve only 84%. 1
3 0.60293192 8 nips-2006-A Nonparametric Approach to Bottom-Up Visual Saliency
Author: Wolf Kienzle, Felix A. Wichmann, Matthias O. Franz, Bernhard Schölkopf
Abstract: This paper addresses the bottom-up influence of local image information on human eye movements. Most existing computational models use a set of biologically plausible linear filters, e.g., Gabor or Difference-of-Gaussians filters as a front-end, the outputs of which are nonlinearly combined into a real number that indicates visual saliency. Unfortunately, this requires many design parameters such as the number, type, and size of the front-end filters, as well as the choice of nonlinearities, weighting and normalization schemes etc., for which biological plausibility cannot always be justified. As a result, these parameters have to be chosen in a more or less ad hoc way. Here, we propose to learn a visual saliency model directly from human eye movement data. The model is rather simplistic and essentially parameter-free, and therefore contrasts recent developments in the field that usually aim at higher prediction rates at the cost of additional parameters and increasing model complexity. Experimental results show that—despite the lack of any biological prior knowledge—our model performs comparably to existing approaches, and in fact learns image features that resemble findings from several previous studies. In particular, its maximally excitatory stimuli have center-surround structure, similar to receptive fields in the early human visual system. 1
4 0.51029181 167 nips-2006-Recursive ICA
Author: Honghao Shan, Lingyun Zhang, Garrison W. Cottrell
Abstract: Independent Component Analysis (ICA) is a popular method for extracting independent features from visual data. However, as a fundamentally linear technique, there is always nonlinear residual redundancy that is not captured by ICA. Hence there have been many attempts to try to create a hierarchical version of ICA, but so far none of the approaches have a natural way to apply them more than once. Here we show that there is a relatively simple technique that transforms the absolute values of the outputs of a previous application of ICA into a normal distribution, to which ICA may be applied again. This results in a recursive ICA algorithm that may be applied any number of times in order to extract higher order structure from previous layers. 1
5 0.50898588 34 nips-2006-Approximate Correspondences in High Dimensions
Author: Kristen Grauman, Trevor Darrell
Abstract: Pyramid intersection is an efficient method for computing an approximate partial matching between two sets of feature vectors. We introduce a novel pyramid embedding based on a hierarchy of non-uniformly shaped bins that takes advantage of the underlying structure of the feature space and remains accurate even for sets with high-dimensional feature vectors. The matching similarity is computed in linear time and forms a Mercer kernel. Whereas previous matching approximation algorithms suffer from distortion factors that increase linearly with the feature dimension, we demonstrate that our approach can maintain constant accuracy even as the feature dimension increases. When used as a kernel in a discriminative classifier, our approach achieves improved object recognition results over a state-of-the-art set kernel. 1
6 0.50661939 119 nips-2006-Learning to Rank with Nonsmooth Cost Functions
7 0.50580448 160 nips-2006-Part-based Probabilistic Point Matching using Equivalence Constraints
8 0.5042882 118 nips-2006-Learning to Model Spatial Dependency: Semi-Supervised Discriminative Random Fields
9 0.50346971 158 nips-2006-PG-means: learning the number of clusters in data
10 0.50262868 32 nips-2006-Analysis of Empirical Bayesian Methods for Neuroelectromagnetic Source Localization
11 0.50261045 72 nips-2006-Efficient Learning of Sparse Representations with an Energy-Based Model
12 0.50257248 112 nips-2006-Learning Nonparametric Models for Probabilistic Imitation
13 0.50135458 110 nips-2006-Learning Dense 3D Correspondence
14 0.50124162 51 nips-2006-Clustering Under Prior Knowledge with Application to Image Segmentation
15 0.50106013 184 nips-2006-Stratification Learning: Detecting Mixed Density and Dimensionality in High Dimensional Point Clouds
16 0.50102472 80 nips-2006-Fundamental Limitations of Spectral Clustering
17 0.50097406 65 nips-2006-Denoising and Dimension Reduction in Feature Space
18 0.50082517 43 nips-2006-Bayesian Model Scoring in Markov Random Fields
19 0.5000025 3 nips-2006-A Complexity-Distortion Approach to Joint Pattern Alignment
20 0.49954733 87 nips-2006-Graph Laplacian Regularization for Large-Scale Semidefinite Programming