nips nips2008 nips2008-46 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Rama Natarajan, Iain Murray, Ladan Shams, Richard S. Zemel
Abstract: We explore a recently proposed mixture model approach to understanding interactions between conflicting sensory cues. Alternative model formulations, differing in their sensory noise models and inference methods, are compared based on their fit to experimental data. Heavy-tailed sensory likelihoods yield a better description of the subjects’ response behavior than standard Gaussian noise models. We study the underlying cause for this result, and then present several testable predictions of these models. 1
Reference: text
sentIndex sentText sentNum sentScore
1 Heavy-tailed sensory likelihoods yield a better description of the subjects’ response behavior than standard Gaussian noise models. [sent-8, score-0.355]
2 A well-tested hypothesis with regards to multi-sensory cue interaction is that the individual sensory estimates are combined in a linear fashion, weighted by their relative reliabilities. [sent-14, score-0.26]
3 Most studies that expound this linear approach assume that the sensory noise in the different modalities is independent across modalities, and that the sensory likelihoods can be well approximated by Gaussian distributions. [sent-15, score-0.436]
4 Recent studies [2; 3; 4; 5] have proposed a particular form of mixture model to address response behavior in situations with a large conflict between sensory stimuli. [sent-19, score-0.287]
5 The basic intuition behind these models is that large stimulus disparities might be a consequence of the stimuli having resulted from multiple underlying causal factors. [sent-21, score-0.439]
6 We describe two approaches to inference under this model — causal averaging and causal selection — and analyze the model predictions on our simulation of an auditory localization task [6]. [sent-27, score-0.927]
7 The environmental variables of interest are the spatial locations of an auditory and visual stimulus, denoted by sa and sv respectively. [sent-28, score-1.025]
8 Information about the stimuli is provided by noisy sensory cues xa and xv . [sent-29, score-0.946]
9 The model evaluates sensory cues under two discrete hypotheses (C = {1, 2}) regarding the causal structure underlying the generation of the stimuli. [sent-30, score-0.473]
10 The hypotheses are that the two stimuli could arise from the same (C = 1) or different (C = 2) causal events. [sent-31, score-0.306]
11 The model is characterized by (i) the sensory likelihoods P (xv |sv ) and P(xa |sa ), (ii) the prior distributions P (sv , sa ) over true stimulus positions and (iii) the prior over hypotheses P (C). [sent-33, score-0.837]
12 1 Generating sensory data: The standard model assumes Gaussian sensory likelihoods and prior distributions. [sent-35, score-0.44]
13 The true auditory and visual stimulus positions are assumed to be the same for C = 1, i.e. [sent-36, score-0.255]
14 sa = sv = s, drawn from a zero-mean Gaussian prior distribution: s ∼ N(0, σp²), where σp is the standard deviation of the distribution. [sent-38, score-0.778]
15 The noisy sensory evidence xa is a sample from a Gaussian distribution with mean sa = s and standard deviation σa: xa ∼ N(xa; sa = s, σa²). [sent-39, score-1.689]
16 Similarly for the visual evidence: xv ∼ N(xv; sv = s, σv²). [sent-40, score-0.757]
17 When there are C = 2 underlying causes, they are drawn independently from the zero-mean Gaussian prior distribution: sv ∼ N(0, σp²); sa ∼ N(0, σp²). [sent-41, score-0.778]
18 Then xv ∼ N(xv; sv, σv²) and xa ∼ N(xa; sa, σa²). [sent-42, score-1.483]
19 Given this particular causal generative model, the conditional likelihoods in Equation 1 are defined as P(xv, xa|C = 1) = ∫ P(xv|sv = s)P(xa|sa = s)P(s) ds and P(xv, xa|C = 2) = ∫ P(xv|sv)P(sv) dsv ∫ P(xa|sa)P(sa) dsa. [sent-44, score-1.047]
20 The conditional sensory likelihoods are specified as: P (xv , xa |sv , sa , C) = P (xv |sv )P (xa |sa ). [sent-45, score-1.04]
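Because everything in this standard model is Gaussian, the marginal likelihoods P(xv, xa|C = 1) and P(xv, xa|C = 2) have simple closed forms. The sketch below only illustrates that (it is not the authors' code), and the parameter values sigma_p, sigma_a and sigma_v are arbitrary placeholders rather than fitted quantities.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

# Illustrative (hypothetical) parameter values in degrees; the paper fits these to data.
sigma_p, sigma_a, sigma_v = 10.0, 2.0, 1.0   # prior, auditory and visual std. deviations

def lik_common_cause(x_v, x_a):
    """P(xv, xa | C = 1): integrating out the single shared source s ~ N(0, sigma_p^2)
    leaves (xv, xa) jointly zero-mean Gaussian with covariance sigma_p^2 off-diagonal."""
    cov = np.array([[sigma_p**2 + sigma_v**2, sigma_p**2],
                    [sigma_p**2,              sigma_p**2 + sigma_a**2]])
    return multivariate_normal.pdf([x_v, x_a], mean=[0.0, 0.0], cov=cov)

def lik_independent_causes(x_v, x_a):
    """P(xv, xa | C = 2): sv and sa are drawn independently, so the joint
    likelihood factorises into two one-dimensional Gaussians."""
    return (norm.pdf(x_v, loc=0.0, scale=np.sqrt(sigma_p**2 + sigma_v**2)) *
            norm.pdf(x_a, loc=0.0, scale=np.sqrt(sigma_p**2 + sigma_a**2)))
```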
21 1 Causal averaging: The conditional posterior over stimulus variables is calculated for each hypothesis as P(sv, sa|xv, xa, C = 1) and P(sv, sa|xv, xa, C = 2). [sent-49, score-1.748]
22 The standard approach to computing the full posterior distribution of interest P (sa , sv |xa , xv ) is by integrating the evidence over both hypotheses weighted by the posterior distribution over C (Equation 1). [sent-50, score-0.822]
23 2 Causal selection: An alternative approach is to calculate an approximate posterior distribution by first selecting the hypothesis C∗ that maximizes the posterior distribution P(C|xv, xa). [sent-54, score-0.465]
24 C∗ = argmax_{C∈{1,2}} P(C|xv, xa) (4). Then the posterior distribution over stimulus location is approximated as follows: Psel(sv, sa|xv, xa) ≈ P(sv, sa|xv, xa, C = C∗) (5) = P(xv, xa|sv, sa, C = C∗) P(sv, sa|C = C∗) / P(xv, xa|C = C∗) (6). [sent-56, score-3.549]
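Continuing the same illustrative sketch, the two inference schemes (Equations 3 to 6) can be contrasted numerically on a discretised grid of stimulus locations. The grid, the prior weight omega = P(C = 1) and the helper names below are assumptions introduced only for this example.

```python
import numpy as np
from scipy.stats import norm

# Reuses sigma_p, sigma_a, sigma_v and the two marginal-likelihood helpers defined above.
grid = np.linspace(-60.0, 60.0, 601)   # discretised stimulus locations (deg.), illustrative
omega = 0.8                            # illustrative prior mixture proportion P(C = 1)

def posteriors(x_v, x_a):
    prior_s = norm.pdf(grid, 0.0, sigma_p)
    lv = norm.pdf(x_v, grid, sigma_v)          # P(xv | sv) evaluated along the grid
    la = norm.pdf(x_a, grid, sigma_a)          # P(xa | sa) evaluated along the grid

    # C = 1: one shared source, so the posterior lives on the diagonal sv = sa = s.
    post_c1 = lv * la * prior_s
    post_c1 /= post_c1.sum()
    # C = 2: independent sources, so the joint posterior factorises; keep the sa marginal.
    post_sa_c2 = la * prior_s
    post_sa_c2 /= post_sa_c2.sum()

    # Posterior over the causal hypotheses, P(C = 1 | xv, xa) (Equation 1).
    p1 = omega * lik_common_cause(x_v, x_a)
    p2 = (1.0 - omega) * lik_independent_causes(x_v, x_a)
    p_c1 = p1 / (p1 + p2)

    # Causal averaging (Equation 3): mix the conditional posteriors, weighted by P(C | xv, xa).
    post_sa_avg = p_c1 * post_c1 + (1.0 - p_c1) * post_sa_c2
    # Causal selection (Equations 4-6): keep only the posterior under the MAP hypothesis C*.
    post_sa_sel = post_c1 if p_c1 >= 0.5 else post_sa_c2
    return p_c1, post_sa_avg, post_sa_sel
```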
25 3 Evaluating the models on experimental data: Here, we evaluate the causal averaging and selection models on an auditory localization task [6] where visual and auditory stimuli were presented at varying spatial and temporal disparities. [sent-57, score-0.887]
26 In addition to reporting the location of the auditory target, subjects were also asked to report on whether they perceived the two stimuli to be perceptually unified. [sent-58, score-0.31]
27 The visual stimuli were assumed to be temporally coincident with the auditory stimuli and presented at varying spatial disparities {0◦ , 5◦ , 10◦ , 15◦ , 20◦ , 25◦ } left or right of sound. [sent-64, score-0.398]
28 Sensory evidence xa and xv were corrupted by Gaussian noise as described earlier. [sent-65, score-0.722]
29 Each stimulus combination {sa , sv } was presented with equal probability 2000 times. [sent-66, score-0.447]
30 On each trial, the model computes a posterior probability distribution over stimulus locations conditioned on the noisy cues xa and xv according to one of Equations 3 or 6. [sent-68, score-0.912]
31 It then estimates the visual and auditory locations as the peak of the posterior distribution (maximum a posteriori estimate): ŝa = argmax_sa P(sa, sv|xa, xv) and ŝv = argmax_sv P(sa, sv|xa, xv). [sent-69, score-2.149]
32 Goodness of fit was computed using squared error loss to quantify the amount by which model estimates differed from the behavioral data. [sent-74, score-0.429]
33 For analysis, the trials were dichotomized into unity and non-unity trials based on the perception of spatial unity. [sent-75, score-0.692]
34 A trial was classified as unity if the posterior probability P (C = 1|xv , xa ) was greater than some threshold ρ and non-unity otherwise. [sent-76, score-0.752]
35 The estimates ŝa and ŝv were averaged across trials in each category. [sent-79, score-0.919]
36 The parameters of the model are: 1) the stimulus location variance σp², 2–3) the observation variances σa² and σv², 4) the prior mixture proportion ω = P(C = 1), and 5) the unity perception threshold ρ. [sent-80, score-0.57]
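A rough sketch of this per-trial simulation loop, reusing the helpers and placeholder parameters introduced above, is given below; the threshold rho is arbitrary here, and the sketch is meant only to make the procedure explicit, not to reproduce the reported fits.

```python
import numpy as np

# Reuses grid, sigma_a, sigma_v and posteriors() from the sketches above.
rho = 0.5                                # illustrative unity-perception threshold
rng = np.random.default_rng(0)

def simulate_condition(s_a, s_v, n_trials=2000):
    """Simulate one {sa, sv} stimulus pair; return auditory localisation errors
    split into unity and non-unity trials (causal-selection estimates)."""
    unity_err, nonunity_err = [], []
    for _ in range(n_trials):
        x_a = rng.normal(s_a, sigma_a)   # noisy auditory cue
        x_v = rng.normal(s_v, sigma_v)   # noisy visual cue
        p_c1, _, post_sa_sel = posteriors(x_v, x_a)
        s_a_hat = grid[np.argmax(post_sa_sel)]   # MAP estimate of the auditory location
        err = s_a_hat - s_a
        (unity_err if p_c1 > rho else nonunity_err).append(err)
    return np.array(unity_err), np.array(nonunity_err)

# Example usage: 10 degrees of spatial disparity between the visual and auditory stimuli.
u_err, n_err = simulate_condition(s_a=0.0, s_v=10.0)
print(u_err.mean(), u_err.std(), n_err.mean(), n_err.std())
```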
37 Regardless of stimulus disparity, whenever visual and auditory stimuli were perceived as unity, the predicted response bias was very high (dashed gray; Figure 1A). [sent-87, score-0.501]
38 When the stimuli appeared to not be unified, the auditory location was biased away from the visual stimulus — increasingly so as disparity decreased (dashed black; Figure 1A). [sent-89, score-0.524]
39–40 [Figure 1 axis and legend residue: percent bias and std. dev. (+/− deg) plotted against spatial disparity sv − sa (deg.) for unity and non-unity trials; panels E (causal averaging model) and F (causal selection model) plot percent of trials against localisation error (deg.).]
41 Figure 1: Simulation results - Gaussian sensory likelihoods: In this, and all subsequent figures, solid lines plot the actual behavioral data reported in [6] and dashed lines are the model predictions. [sent-97, score-0.3]
42 (D) Distribution of localization errors in data, for sv − sa = 0; re-printed with permission from [6]. [sent-107, score-0.886]
43 (E,F) Localization errors predicted by the causal averaging and causal selection models respectively. [sent-108, score-0.658]
44 The predicted curves for unity trials (dashed gray; Figures 1B,C) are all concave, whereas they were actually observed to be convex (solid gray lines). [sent-110, score-0.526]
45 An analysis of the behavioral data derived from the spatially coincident stimulus conditions (sv − sa = 0) revealed a distinct pattern (Figure 1D). [sent-113, score-0.568]
46 On unity trials, localization error was 0◦ implying that the responses were clustered around the auditory target. [sent-114, score-0.585]
47 1 Heavy-tailed likelihood formulation: In this section, we re-formulate the sensory likelihoods P(xa|sa) and P(xv|sv) as a mixture of Gaussian and uniform distributions. [sent-118, score-0.332]
48 xv ∼ πN(xv; sv, σv²) + (1 − π)/rl;  xa ∼ πN(xa; sa, σa²) + (1 − π)/rl. (7)
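As a minimal illustration of Equation 7, the density and a sampler for this Gaussian-plus-uniform mixture might look as follows, with pi and rl set to arbitrary placeholder values. Note that with this likelihood the marginal likelihoods P(xv, xa|C) no longer have the simple closed forms used in the Gaussian sketch and would instead be computed by summing over the location grid.

```python
from scipy.stats import norm

pi_mix = 0.9    # illustrative mixture weight on the Gaussian component (pi in Equation 7)
r_l = 180.0     # illustrative width of the uniform component (deg.)

def heavy_tailed_lik(x, s, sigma):
    """Heavy-tailed sensory likelihood of Equation 7: a Gaussian centred on the
    true location plus a broad uniform floor of density (1 - pi) / rl."""
    return pi_mix * norm.pdf(x, s, sigma) + (1.0 - pi_mix) / r_l

def sample_heavy_tailed(s, sigma, rng):
    """Draw one noisy cue from the same Gaussian-plus-uniform mixture."""
    if rng.random() < pi_mix:
        return rng.normal(s, sigma)
    return rng.uniform(-r_l / 2.0, r_l / 2.0)
```

Substituting heavy_tailed_lik for the Gaussian likelihood terms in the earlier grid-based sketch would give the heavy-tailed versions of both inference schemes.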
49 2 Simulation results with heavy-tailed sensory likelihoods: Figure 2 presents predictions made by the theoretical models based on heavy-tailed likelihoods. [sent-121, score-0.339]
50 Both models now provide a much better fit to bias and variance, compared to their Gaussian counterparts. [sent-122, score-0.274]
51–52 [Figure 2 axis and legend residue: panels A (localisation biases), B (causal averaging model), C (causal selection model), E (causal averaging model) and F (causal selection model); axes show percent bias and std. dev. (+/− deg) against spatial disparity sv − sa (deg.), and percent of trials against localisation error (deg.), for unity and non-unity trials in data and model.]
53 (D) Distribution of localization errors in data, for sv − sa = 0. [sent-137, score-0.886]
54 (E,F) Localization errors predicted by the heavy-tailed causal averaging and causal selection models. [sent-138, score-0.64]
55 The heavy-tailed causal averaging model (Figure 2B) makes reasonable predictions with regards to variability. [sent-140, score-0.412]
56 Here too, the best-fitting model is causal selection (dashed line; Figures 2A,C). [sent-142, score-0.294]
57 The localization error distribution (Figure 2F) very closely matches the true observations (Figure 2D) in how the unity responses are unimodally distributed about the target location sa, and non-unity responses are bimodally distributed on either side of the target. [sent-143, score-0.933]
58 Visually, this is a better prediction of the true distribution of errors, compared to the prediction made by the Gaussian causal selection model (Figure 1F); we are unable to make a quantitative comparison for want of access to the raw data. [sent-144, score-0.326]
59 Compared with the results in Figure 1, our models make very different bias and variance predictions for spatial disparities not tested. [sent-145, score-0.246]
60 The congruent case |sv − sa | = 0 is chosen for reference; |sv − sa | = 10 and |sv − sa | = 25 are chosen since the Gaussian and heavy-tailed models tend to differ most in their predictions at these disparities. [sent-153, score-1.329]
61 In general, most of the samples on unity trials are from the region of space where both the auditory and visual likelihoods overlap. [sent-155, score-0.774]
62 When the true disparity |sv − sa| = 0, the two likelihoods overlap maximally (Figures 3Aii and 3Cii). [sent-156, score-0.761]
63 Hence regardless of the form of the likelihood, variability on unity trials at |sv − sa | = 0 should be roughly between σv and σa . [sent-157, score-1.001]
64 [Figure 3 panel titles: A: Gaussian likelihoods, unity; B: Gaussian likelihoods, non-unity; C: Heavy-tailed likelihoods, unity; D: Heavy-tailed likelihoods, non-unity.]
65 Figure 3: Analyzing the likelihood models: Results from the causal selection models. [sent-175, score-0.302]
66 In all plots, light-gray histograms are samples xv from the visual likelihood distribution; dark-gray histograms plot xa. [sent-176, score-0.817]
67 Black histograms are built only from samples xa on which either unity (A,C) or non-unity (B,D) judgment was made. [sent-177, score-0.722]
68 Now one of the biggest differences between the likelihood models is what happens to this variability as |sv − sa | increases. [sent-179, score-0.585]
69 This is reflected in the gradually increasing variability on unity trials, corresponding to the better-matching convex curves predicted by the heavy-tailed model (Figure 2C). [sent-184, score-0.666]
70 Here, the biggest difference between the likelihood models is that in the Gaussian case, after a certain spatial limit, the variability tends to increase with increasing |sv − sa |. [sent-186, score-0.669]
71 This is because as disparity increases, the degree of overlap between the two likelihoods decreases and the variability approaches σa (Figures 3Bi,3Biii). [sent-188, score-0.457]
72 With heavy-tailed likelihoods, the tails of the two likelihoods continue to overlap even as disparity increases; hence the variability is roughly constant (Figures 3Di,3Diii). [sent-190, score-0.459]
73 4 Model Predictions: Quantitative predictions — variance and bias: Our heavy-tailed causal selection model makes two predictions with regards to variability and bias for stimulus conditions not yet tested. [sent-191, score-0.679]
74 One prediction is that on non-unity trials, as spatial disparity sv − sa increases, the localisation variability remains roughly constant at a value equivalent to the standard deviation of the auditory likelihood (Figure 2C; black dashed plot). [sent-192, score-1.518]
75 However, response percent bias approaches zero (Figure 2A; black dashed plot), indicating that when spatial disparity is very high and the stimuli are perceived as being independent, auditory localisation response is consistent with auditory dominance. [sent-193, score-1.052]
76 A second prediction is that percent bias gradually decreases with increasing disparity on unity trials as well. [sent-194, score-0.879]
77 This suggests that even when highly disparate stimuli are perceived as being unified, perception may be dominated by the auditory cues. [sent-195, score-0.247]
78 Our results also predict that the variability in this case continues to increase very gradually with increasing disparity up to some spatial limit (|sv − sa| = 20◦ in our simulations), after which it begins to decrease. [sent-196, score-0.412]
79 Qualitative prediction — distribution of localization errors: Our model also makes a qualitative prediction concerning the distribution of localisation errors for incongruent (sv − sa ≠ 0) stimulus conditions. [sent-198, score-0.832]
80 In both Figures 4A and B, localization error on unity trials is equivalent to the stimulus disparity sv − sa = 10◦, indicating that even at this high disparity, responses are clustered closer to the visual stimulus location. [sent-199, score-1.678]
81 [Figure 4 axis and legend residue: panels A (Gaussian predictions, Psel) and C (heavy-tailed predictions: variability); legends include unity and non-unity trials and causal averaging versus causal selection; axes show percent bias, std. dev. (+/− deg) and percent of trials against spatial disparity sv − sa (deg.).]
82 Figure 4: Model predictions: (A,B) Localization error distributions predicted by the Gaussian and heavy-tailed causal selection models. [sent-207, score-0.274]
83 Plots correspond to stimulus condition sv = 20;sa = 10. [sent-208, score-0.447]
84 (C,D) Response variability and bias predicted by the heavy-tailed causal averaging and selection models on a simulation of an audio-visual localization task [3]. [sent-209, score-0.679]
85 Specificity to experimental task: In the experimental task we have examined here [6], subjects were asked to first indicate the perceived location of the sound on each trial and then to report their judgement of unity. [sent-210, score-0.257]
86 The requirement to explicitly make a unity judgement may incur an experimental bias towards the causal selection model. [sent-211, score-0.733]
87 To explore the potential influence of task instructions on subjects’ inference strategy, we tested our models on a simulation of a different audio-visual spatial localisation task [3]. [sent-212, score-0.276]
88 Here, subjects were asked to report on both visual and auditory stimulus locations and were not explicitly instructed to make unity judgements. [sent-213, score-0.688]
89 However, they do not analyse variability in the subjects’ responses and this aspect of behavior as a function of spatial disparity is not readily obvious in their published data. [sent-215, score-0.401]
90 We evaluated both our heavy-tailed causal averaging and causal selection models on a simulation of this experiment. [sent-216, score-0.633]
91 Causal averaging predicts that response variability will monotonically increase with increasing disparity, while selection predicts a less straightforward trend (Figure 4C). [sent-218, score-0.353]
92 Both models predict a similar amount of response bias and that it will decrease with increasing disparity (Figure 4C). [sent-219, score-0.349]
93 Adaptation of the prior: One interesting aspect of inference under this generative model is that as the value of ω = P (C = 1) increases, the variability also increases for both unity and non-unity trials across all disparities. [sent-222, score-0.619]
94 5 Discussion: In this paper, we ventured to understand the computational mechanisms underlying sensory cue interactions that give rise to a particular pattern of non-linear response behavior [6], using a mixture of two different models that could have generated the sensory data. [sent-228, score-0.472]
95 We proposed that the form of the sensory likelihood is a critical feature that drives non-linear behavior, especially at large stimulus disparities. [sent-229, score-0.263]
96 Qualitative fits of summarised statistics such as bias and variance are insufficient to make any strong claims about human perceptual processes; nevertheless, this work provides some insight into the potential functional role of sensory noise. [sent-233, score-0.249]
97 One downside of our results is that even though the model bias for unity trials captures the slightly increasing trend as disparity decreases, it is not as large as in the behavioral data (close to 100%) or as that predicted by the Gaussian models. [sent-240, score-0.859]
98 Then one response strategy might be to ignore the posterior probability P(sa|xv, xa) once unity is judged and instead set ŝa = ŝv; although this results in a prediction of higher bias, the strategy is not Bayes-optimal. [sent-243, score-0.355]
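A hypothetical sketch of that heuristic, reusing the grid and helper functions from the earlier illustrative code (the names, the threshold and the simplified visual-only estimate are assumptions, not the authors' proposal):

```python
import numpy as np
from scipy.stats import norm

# Reuses grid, rho, sigma_v, sigma_p and posteriors() from the earlier sketches.
def heuristic_auditory_estimate(x_v, x_a):
    """Non-Bayes-optimal strategy discussed above: once unity is judged, ignore
    P(sa | xv, xa) and report a (simplified, visual-only) estimate of sv instead;
    otherwise fall back on the causal-selection auditory estimate."""
    p_c1, _, post_sa_sel = posteriors(x_v, x_a)
    if p_c1 > rho:
        post_sv = norm.pdf(x_v, grid, sigma_v) * norm.pdf(grid, 0.0, sigma_p)
        return grid[np.argmax(post_sv)]   # set the auditory report equal to the visual estimate
    return grid[np.argmax(post_sa_sel)]
```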
99 On the experimental side, one of the major inadequacies of most experimental paradigms is that the only (approximate) measure of a subject’s perceptual uncertainty involves measuring the response variability across a large number of trials. [sent-247, score-0.25]
100 Bayesian inference explains perception of unity and ventriloquism aftereffect. [sent-262, score-0.426]
wordName wordTfidf (topN-words)
[('sa', 0.422), ('xv', 0.363), ('unity', 0.357), ('sv', 0.356), ('xa', 0.342), ('causal', 0.231), ('disparity', 0.191), ('sensory', 0.144), ('localisation', 0.14), ('likelihoods', 0.132), ('auditory', 0.126), ('trials', 0.121), ('variability', 0.101), ('stimulus', 0.091), ('dat', 0.081), ('localization', 0.081), ('averaging', 0.078), ('percent', 0.075), ('non', 0.07), ('spatial', 0.066), ('bias', 0.065), ('figures', 0.06), ('response', 0.057), ('disparities', 0.052), ('cues', 0.05), ('stimuli', 0.047), ('deg', 0.047), ('perceived', 0.047), ('predictions', 0.045), ('subjects', 0.044), ('heavy', 0.044), ('selection', 0.043), ('perceptual', 0.04), ('tailed', 0.038), ('regards', 0.038), ('visual', 0.038), ('dashed', 0.036), ('cue', 0.036), ('behavioral', 0.033), ('ladan', 0.032), ('psel', 0.032), ('simulation', 0.032), ('location', 0.031), ('predicted', 0.03), ('biases', 0.03), ('posterior', 0.029), ('likelihood', 0.028), ('mixture', 0.028), ('hypotheses', 0.028), ('icting', 0.028), ('errors', 0.027), ('gaussian', 0.027), ('std', 0.027), ('perception', 0.027), ('trend', 0.024), ('lines', 0.024), ('trial', 0.024), ('interactions', 0.023), ('xation', 0.023), ('histograms', 0.023), ('behavior', 0.022), ('hypothesis', 0.022), ('alongside', 0.022), ('coincident', 0.022), ('inadequacies', 0.022), ('judgement', 0.022), ('ventriloquism', 0.022), ('uni', 0.021), ('responses', 0.021), ('estimates', 0.02), ('model', 0.02), ('inference', 0.02), ('tails', 0.019), ('ict', 0.019), ('gradually', 0.019), ('black', 0.019), ('qualitative', 0.019), ('solid', 0.019), ('multisensory', 0.019), ('rding', 0.019), ('stocker', 0.019), ('gray', 0.018), ('models', 0.018), ('increasing', 0.018), ('continues', 0.017), ('evidence', 0.017), ('decreases', 0.017), ('locations', 0.017), ('overlap', 0.016), ('prediction', 0.016), ('faculty', 0.016), ('biggest', 0.016), ('eero', 0.016), ('konrad', 0.016), ('predicts', 0.016), ('proportion', 0.016), ('studies', 0.016), ('formulations', 0.016), ('experimental', 0.015), ('asked', 0.015)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 46 nips-2008-Characterizing response behavior in multisensory perception with conflicting cues
Author: Rama Natarajan, Iain Murray, Ladan Shams, Richard S. Zemel
Abstract: We explore a recently proposed mixture model approach to understanding interactions between conflicting sensory cues. Alternative model formulations, differing in their sensory noise models and inference methods, are compared based on their fit to experimental data. Heavy-tailed sensory likelihoods yield a better description of the subjects’ response behavior than standard Gaussian noise models. We study the underlying cause for this result, and then present several testable predictions of these models. 1
2 0.13776162 223 nips-2008-Structure Learning in Human Sequential Decision-Making
Author: Daniel Acuna, Paul R. Schrater
Abstract: We use graphical models and structure learning to explore how people learn policies in sequential decision making tasks. Studies of sequential decision-making in humans frequently find suboptimal performance relative to an ideal actor that knows the graph model that generates reward in the environment. We argue that the learning problem humans face also involves learning the graph structure for reward generation in the environment. We formulate the structure learning problem using mixtures of reward models, and solve the optimal action selection problem using Bayesian Reinforcement Learning. We show that structure learning in one and two armed bandit problems produces many of the qualitative behaviors deemed suboptimal in previous studies. Our argument is supported by the results of experiments that demonstrate humans rapidly learn and exploit new reward structure. 1
3 0.12337431 40 nips-2008-Bounds on marginal probability distributions
Author: Joris M. Mooij, Hilbert J. Kappen
Abstract: We propose a novel bound on single-variable marginal probability distributions in factor graphs with discrete variables. The bound is obtained by propagating local bounds (convex sets of probability distributions) over a subtree of the factor graph, rooted in the variable of interest. By construction, the method not only bounds the exact marginal probability distribution of a variable, but also its approximate Belief Propagation marginal (“belief”). Thus, apart from providing a practical means to calculate bounds on marginals, our contribution also lies in providing a better understanding of the error made by Belief Propagation. We show that our bound outperforms the state-of-the-art on some inference problems arising in medical diagnosis. 1
4 0.12103796 191 nips-2008-Recursive Segmentation and Recognition Templates for 2D Parsing
Author: Leo Zhu, Yuanhao Chen, Yuan Lin, Chenxi Lin, Alan L. Yuille
Abstract: Language and image understanding are two major goals of artificial intelligence which can both be conceptually formulated in terms of parsing the input signal into a hierarchical representation. Natural language researchers have made great progress by exploiting the 1D structure of language to design efficient polynomialtime parsing algorithms. By contrast, the two-dimensional nature of images makes it much harder to design efficient image parsers and the form of the hierarchical representations is also unclear. Attempts to adapt representations and algorithms from natural language have only been partially successful. In this paper, we propose a Hierarchical Image Model (HIM) for 2D image parsing which outputs image segmentation and object recognition. This HIM is represented by recursive segmentation and recognition templates in multiple layers and has advantages for representation, inference, and learning. Firstly, the HIM has a coarse-to-fine representation which is capable of capturing long-range dependency and exploiting different levels of contextual information. Secondly, the structure of the HIM allows us to design a rapid inference algorithm, based on dynamic programming, which enables us to parse the image rapidly in polynomial time. Thirdly, we can learn the HIM efficiently in a discriminative manner from a labeled dataset. We demonstrate that HIM outperforms other state-of-the-art methods by evaluation on the challenging public MSRC image dataset. Finally, we sketch how the HIM architecture can be extended to model more complex image phenomena. 1
5 0.097792819 153 nips-2008-Nonlinear causal discovery with additive noise models
Author: Patrik O. Hoyer, Dominik Janzing, Joris M. Mooij, Jan R. Peters, Bernhard Schölkopf
Abstract: The discovery of causal relationships between a set of observed variables is a fundamental problem in science. For continuous-valued data linear acyclic causal models with additive noise are often used because these models are well understood and there are well-known methods to fit them to data. In reality, of course, many causal relationships are more or less nonlinear, raising some doubts as to the applicability and usefulness of purely linear methods. In this contribution we show that the basic linear framework can be generalized to nonlinear models. In this extended framework, nonlinearities in the data-generating process are in fact a blessing rather than a curse, as they typically provide information on the underlying causal system and allow more aspects of the true data-generating mechanisms to be identified. In addition to theoretical results we show simulations and some simple real data experiments illustrating the identification power provided by nonlinearities. 1
6 0.083566129 38 nips-2008-Bio-inspired Real Time Sensory Map Realignment in a Robotic Barn Owl
7 0.082231492 244 nips-2008-Unifying the Sensory and Motor Components of Sensorimotor Adaptation
8 0.080308184 89 nips-2008-Gates
9 0.076522365 206 nips-2008-Sequential effects: Superstition or rational behavior?
10 0.075886697 231 nips-2008-Temporal Dynamics of Cognitive Control
11 0.069791667 24 nips-2008-An improved estimator of Variance Explained in the presence of noise
12 0.064401962 92 nips-2008-Generative versus discriminative training of RBMs for classification of fMRI images
13 0.062217198 108 nips-2008-Integrating Locally Learned Causal Structures with Overlapping Variables
14 0.056775987 120 nips-2008-Learning the Semantic Correlation: An Alternative Way to Gain from Unlabeled Text
15 0.055761158 67 nips-2008-Effects of Stimulus Type and of Error-Correcting Code Design on BCI Speller Performance
16 0.052527376 74 nips-2008-Estimating the Location and Orientation of Complex, Correlated Neural Activity using MEG
17 0.052428473 86 nips-2008-Finding Latent Causes in Causal Networks: an Efficient Approach Based on Markov Blankets
18 0.050878525 172 nips-2008-Optimal Response Initiation: Why Recent Experience Matters
19 0.047843456 60 nips-2008-Designing neurophysiology experiments to optimally constrain receptive field models along parametric submanifolds
20 0.045364074 66 nips-2008-Dynamic visual attention: searching for coding length increments
topicId topicWeight
[(0, -0.108), (1, 0.042), (2, 0.093), (3, -0.017), (4, -0.01), (5, -0.043), (6, -0.054), (7, 0.064), (8, 0.106), (9, 0.029), (10, -0.015), (11, 0.102), (12, -0.056), (13, -0.029), (14, -0.046), (15, 0.068), (16, 0.107), (17, -0.031), (18, -0.083), (19, 0.011), (20, 0.067), (21, 0.006), (22, -0.003), (23, -0.062), (24, 0.156), (25, -0.115), (26, 0.01), (27, 0.015), (28, -0.078), (29, 0.023), (30, -0.141), (31, -0.051), (32, 0.075), (33, 0.012), (34, -0.144), (35, -0.099), (36, -0.042), (37, -0.046), (38, -0.06), (39, -0.113), (40, 0.125), (41, -0.0), (42, -0.121), (43, -0.046), (44, -0.042), (45, -0.17), (46, 0.075), (47, 0.053), (48, 0.195), (49, -0.015)]
simIndex simValue paperId paperTitle
same-paper 1 0.95847821 46 nips-2008-Characterizing response behavior in multisensory perception with conflicting cues
Author: Rama Natarajan, Iain Murray, Ladan Shams, Richard S. Zemel
Abstract: We explore a recently proposed mixture model approach to understanding interactions between conflicting sensory cues. Alternative model formulations, differing in their sensory noise models and inference methods, are compared based on their fit to experimental data. Heavy-tailed sensory likelihoods yield a better description of the subjects’ response behavior than standard Gaussian noise models. We study the underlying cause for this result, and then present several testable predictions of these models. 1
2 0.51294154 89 nips-2008-Gates
Author: Tom Minka, John Winn
Abstract: Gates are a new notation for representing mixture models and context-sensitive independence in factor graphs. Factor graphs provide a natural representation for message-passing algorithms, such as expectation propagation. However, message passing in mixture models is not well captured by factor graphs unless the entire mixture is represented by one factor, because the message equations have a containment structure. Gates capture this containment structure graphically, allowing both the independences and the message-passing equations for a model to be readily visualized. Different variational approximations for mixture models can be understood as different ways of drawing the gates in a model. We present general equations for expectation propagation and variational message passing in the presence of gates. 1
3 0.43781853 124 nips-2008-Load and Attentional Bayes
Author: Peter Dayan
Abstract: Selective attention is a most intensively studied psychological phenomenon, rife with theoretical suggestions and schisms. A critical idea is that of limited capacity, the allocation of which has produced continual conflict about such phenomena as early and late selection. An influential resolution of this debate is based on the notion of perceptual load (Lavie, 2005), which suggests that low-load, easy tasks, because they underuse the total capacity of attention, mandatorily lead to the processing of stimuli that are irrelevant to the current attentional set; whereas high-load, difficult tasks grab all resources for themselves, leaving distractors high and dry. We argue that this theory presents a challenge to Bayesian theories of attention, and suggest an alternative, statistical, account of key supporting data. 1
4 0.42056972 153 nips-2008-Nonlinear causal discovery with additive noise models
Author: Patrik O. Hoyer, Dominik Janzing, Joris M. Mooij, Jan R. Peters, Bernhard Schölkopf
Abstract: The discovery of causal relationships between a set of observed variables is a fundamental problem in science. For continuous-valued data linear acyclic causal models with additive noise are often used because these models are well understood and there are well-known methods to fit them to data. In reality, of course, many causal relationships are more or less nonlinear, raising some doubts as to the applicability and usefulness of purely linear methods. In this contribution we show that the basic linear framework can be generalized to nonlinear models. In this extended framework, nonlinearities in the data-generating process are in fact a blessing rather than a curse, as they typically provide information on the underlying causal system and allow more aspects of the true data-generating mechanisms to be identified. In addition to theoretical results we show simulations and some simple real data experiments illustrating the identification power provided by nonlinearities. 1
5 0.41322583 86 nips-2008-Finding Latent Causes in Causal Networks: an Efficient Approach Based on Markov Blankets
Author: Jean-philippe Pellet, AndrĂŠElisseeff
Abstract: Causal structure-discovery techniques usually assume that all causes of more than one variable are observed. This is the so-called causal sufficiency assumption. In practice, it is untestable, and often violated. In this paper, we present an efficient causal structure-learning algorithm, suited for causally insufficient data. Similar to algorithms such as IC* and FCI, the proposed approach drops the causal sufficiency assumption and learns a structure that indicates (potential) latent causes for pairs of observed variables. Assuming a constant local density of the data-generating graph, our algorithm makes a quadratic number of conditionalindependence tests w.r.t. the number of variables. We show with experiments that our algorithm is comparable to the state-of-the-art FCI algorithm in accuracy, while being several orders of magnitude faster on large problems. We conclude that MBCS* makes a new range of causally insufficient problems computationally tractable. Keywords: Graphical Models, Structure Learning, Causal Inference. 1 Introduction: Task Definition & Related Work The statistical definition of causality pioneered by Pearl (2000) and Spirtes et al. (2001) has shed new light on how to detect causation. Central in this approach is the automated detection of causeeffect relationships using observational (i.e., non-experimental) data. This can be a necessary task, as in many situations, performing randomized controlled experiments to unveil causation can be impossible, unethical , or too costly. When the analysis deals with variables that cannot be manipulated, being able to learn from data collected by observing the running system is the only possibility. It turns out that learning the full causal structure of a set of variables is, in its most general form , impossible. If we suppose that the
6 0.40221986 40 nips-2008-Bounds on marginal probability distributions
7 0.38363731 66 nips-2008-Dynamic visual attention: searching for coding length increments
8 0.37471041 172 nips-2008-Optimal Response Initiation: Why Recent Experience Matters
9 0.35181737 244 nips-2008-Unifying the Sensory and Motor Components of Sensorimotor Adaptation
10 0.33840552 108 nips-2008-Integrating Locally Learned Causal Structures with Overlapping Variables
11 0.3372004 223 nips-2008-Structure Learning in Human Sequential Decision-Making
12 0.33467638 231 nips-2008-Temporal Dynamics of Cognitive Control
13 0.32687351 222 nips-2008-Stress, noradrenaline, and realistic prediction of mouse behaviour using reinforcement learning
14 0.32081041 38 nips-2008-Bio-inspired Real Time Sensory Map Realignment in a Robotic Barn Owl
15 0.3149254 24 nips-2008-An improved estimator of Variance Explained in the presence of noise
16 0.29347512 94 nips-2008-Goal-directed decision making in prefrontal cortex: a computational framework
17 0.28890017 187 nips-2008-Psychiatry: Insights into depression through normative decision-making models
18 0.26907939 206 nips-2008-Sequential effects: Superstition or rational behavior?
19 0.25118214 7 nips-2008-A computational model of hippocampal function in trace conditioning
20 0.25003102 82 nips-2008-Fast Computation of Posterior Mode in Multi-Level Hierarchical Models
topicId topicWeight
[(6, 0.078), (7, 0.056), (12, 0.018), (15, 0.019), (21, 0.373), (28, 0.159), (57, 0.051), (59, 0.018), (63, 0.015), (71, 0.012), (77, 0.035), (78, 0.013), (83, 0.044)]
simIndex simValue paperId paperTitle
same-paper 1 0.81648535 46 nips-2008-Characterizing response behavior in multisensory perception with conflicting cues
Author: Rama Natarajan, Iain Murray, Ladan Shams, Richard S. Zemel
Abstract: We explore a recently proposed mixture model approach to understanding interactions between conflicting sensory cues. Alternative model formulations, differing in their sensory noise models and inference methods, are compared based on their fit to experimental data. Heavy-tailed sensory likelihoods yield a better description of the subjects’ response behavior than standard Gaussian noise models. We study the underlying cause for this result, and then present several testable predictions of these models. 1
2 0.8116352 13 nips-2008-Adapting to a Market Shock: Optimal Sequential Market-Making
Author: Sanmay Das, Malik Magdon-Ismail
Abstract: We study the profit-maximization problem of a monopolistic market-maker who sets two-sided prices in an asset market. The sequential decision problem is hard to solve because the state space is a function. We demonstrate that the belief state is well approximated by a Gaussian distribution. We prove a key monotonicity property of the Gaussian state update which makes the problem tractable, yielding the first optimal sequential market-making algorithm in an established model. The algorithm leads to a surprising insight: an optimal monopolist can provide more liquidity than perfectly competitive market-makers in periods of extreme uncertainty, because a monopolist is willing to absorb initial losses in order to learn a new valuation rapidly so she can extract higher profits later. 1
3 0.70513797 153 nips-2008-Nonlinear causal discovery with additive noise models
Author: Patrik O. Hoyer, Dominik Janzing, Joris M. Mooij, Jan R. Peters, Bernhard Schölkopf
Abstract: The discovery of causal relationships between a set of observed variables is a fundamental problem in science. For continuous-valued data linear acyclic causal models with additive noise are often used because these models are well understood and there are well-known methods to fit them to data. In reality, of course, many causal relationships are more or less nonlinear, raising some doubts as to the applicability and usefulness of purely linear methods. In this contribution we show that the basic linear framework can be generalized to nonlinear models. In this extended framework, nonlinearities in the data-generating process are in fact a blessing rather than a curse, as they typically provide information on the underlying causal system and allow more aspects of the true data-generating mechanisms to be identified. In addition to theoretical results we show simulations and some simple real data experiments illustrating the identification power provided by nonlinearities. 1
4 0.6472044 134 nips-2008-Mixed Membership Stochastic Blockmodels
Author: Edoardo M. Airoldi, David M. Blei, Stephen E. Fienberg, Eric P. Xing
Abstract: In many settings, such as protein interactions and gene regulatory networks, collections of author-recipient email, and social networks, the data consist of pairwise measurements, e.g., presence or absence of links between pairs of objects. Analyzing such data with probabilistic models requires non-standard assumptions, since the usual independence or exchangeability assumptions no longer hold. In this paper, we introduce a class of latent variable models for pairwise measurements: mixed membership stochastic blockmodels. Models in this class combine a global model of dense patches of connectivity (blockmodel) with a local model to instantiate node-specific variability in the connections (mixed membership). We develop a general variational inference algorithm for fast approximate posterior inference. We demonstrate the advantages of mixed membership stochastic blockmodel with applications to social networks and protein interaction networks. 1
5 0.52277851 48 nips-2008-Clustering via LP-based Stabilities
Author: Nikos Komodakis, Nikos Paragios, Georgios Tziritas
Abstract: A novel center-based clustering algorithm is proposed in this paper. We first formulate clustering as an NP-hard linear integer program and we then use linear programming and the duality theory to derive the solution of this optimization problem. This leads to an efficient and very general algorithm, which works in the dual domain, and can cluster data based on an arbitrary set of distances. Despite its generality, it is independent of initialization (unlike EM-like methods such as K-means), has guaranteed convergence, can automatically determine the number of clusters, and can also provide online optimality bounds about the quality of the estimated clustering solutions. To deal with the most critical issue in a centerbased clustering algorithm (selection of cluster centers), we also introduce the notion of stability of a cluster center, which is a well defined LP-based quantity that plays a key role to our algorithm’s success. Furthermore, we also introduce, what we call, the margins (another key ingredient in our algorithm), which can be roughly thought of as dual counterparts to stabilities and allow us to obtain computationally efficient approximations to the latter. Promising experimental results demonstrate the potentials of our method.
6 0.49700919 86 nips-2008-Finding Latent Causes in Causal Networks: an Efficient Approach Based on Markov Blankets
7 0.47903639 108 nips-2008-Integrating Locally Learned Causal Structures with Overlapping Variables
8 0.47117028 79 nips-2008-Exploring Large Feature Spaces with Hierarchical Multiple Kernel Learning
9 0.47107881 62 nips-2008-Differentiable Sparse Coding
10 0.47073418 202 nips-2008-Robust Regression and Lasso
11 0.47028315 217 nips-2008-Sparsity of SVMs that use the epsilon-insensitive loss
12 0.46962339 162 nips-2008-On the Design of Loss Functions for Classification: theory, robustness to outliers, and SavageBoost
13 0.46911585 135 nips-2008-Model Selection in Gaussian Graphical Models: High-Dimensional Consistency of \boldmath$\ell 1$-regularized MLE
14 0.46839488 14 nips-2008-Adaptive Forward-Backward Greedy Algorithm for Sparse Learning with Linear Models
15 0.46826914 226 nips-2008-Supervised Dictionary Learning
16 0.46805015 21 nips-2008-An Homotopy Algorithm for the Lasso with Online Observations
17 0.46662122 231 nips-2008-Temporal Dynamics of Cognitive Control
18 0.46644029 24 nips-2008-An improved estimator of Variance Explained in the presence of noise
19 0.46632442 96 nips-2008-Hebbian Learning of Bayes Optimal Decisions
20 0.46611461 196 nips-2008-Relative Margin Machines