nips nips2009 nips2009-41 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Marcel V. Gerven, Botond Cseke, Robert Oostenveld, Tom Heskes
Abstract: We introduce a novel multivariate Laplace (MVL) distribution as a sparsity promoting prior for Bayesian source localization that allows the specification of constraints between and within sources. We represent the MVL distribution as a scale mixture that induces a coupling between source variances instead of their means. Approximation of the posterior marginals using expectation propagation is shown to be very efficient due to properties of the scale mixture representation. The computational bottleneck amounts to computing the diagonal elements of a sparse matrix inverse. Our approach is illustrated using a mismatch negativity paradigm for which MEG data and a structural MRI have been acquired. We show that spatial coupling leads to sources which are active over larger cortical areas as compared with an uncoupled prior. 1
Reference: text
sentIndex sentText sentNum sentScore
1 We represent the MVL distribution as a scale mixture that induces a coupling between source variances instead of their means. [sent-2, score-0.646]
2 Approximation of the posterior marginals using expectation propagation is shown to be very efficient due to properties of the scale mixture representation. [sent-3, score-0.287]
3 We show that spatial coupling leads to sources which are active over larger cortical areas as compared with an uncoupled prior. [sent-6, score-0.54]
4 Let q, p, and t denote the number of sensors, sources and time points, respectively. [sent-8, score-0.223]
5 Sensor readings Y ∈ R^{q×t} and source currents S ∈ R^{p×t} are related by Y = XS + E (1), where X ∈ R^{q×p} is a lead field matrix that represents how sources project onto the sensors and E ∈ R^{q×t} represents sensor noise. [sent-9, score-0.77]
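To make the forward model in (1) concrete, here is a minimal numpy sketch that simulates sensor readings from a handful of active sources; the dimensions, the random lead field, and the noise level are illustrative assumptions, not values from the paper.

import numpy as np

# Illustrative sizes only: q sensors, p sources, t time points.
q, p, t = 32, 200, 100
rng = np.random.default_rng(0)

X = rng.standard_normal((q, p))                 # stand-in lead field (a real one comes from a head model)
S = np.zeros((p, t))
active = rng.choice(p, size=5, replace=False)   # a few truly active sources
S[active, :] = rng.standard_normal((5, t))
E = 0.1 * rng.standard_normal((q, t))           # sensor noise

Y = X @ S + E                                   # sensor readings, Eq. (1)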
6 Unfortunately, localizing distributed sources is an ill-posed inverse problem that only admits a unique solution when additional constraints are defined. [sent-10, score-0.313]
7 In a Bayesian setting, these constraints take the form of a prior on the sources [3, 19]. [sent-11, score-0.306]
8 Popular choices of prior source amplitude distributions are Gaussian or Laplace priors, whose MAP estimates correspond to minimum norm and minimum current estimates, respectively [18]. [sent-12, score-0.442]
9 In contrast, minimum current estimates lead to focal source estimates that may be scattered too much throughout the brain volume [9]. [sent-14, score-0.521]
10 In this paper, we take the Laplace prior as our point of departure for Bayesian source localization (instead of using just the MAP estimate). [sent-15, score-0.428]
11 Here, in contrast, we assume a multivariate Laplace distribution over all sources, which allows sources to be coupled. [sent-17, score-0.357]
12 Since the posterior cannot be computed exactly, we formulate an efficient expectation propagation algorithm [12] which allows us to approximate the posterior of interest for very large models. [sent-20, score-0.21]
13 Efficiency arises from the block diagonal form of the approximate posterior covariance matrix due to properties of the scale mixture representation. [sent-21, score-0.301]
14 The computational bottleneck then reduces to computation of the diagonal elements of a sparse matrix inverse, which can be solved through Cholesky decomposition of a sparse matrix and application of the Takahashi equation [17]. [sent-22, score-0.232]
15 2 Bayesian source localization In a Bayesian setting, the goal of source localization is to estimate the posterior p(S | Y, X, Σ, Θ) ∝ p(Y | S, X, Σ) p(S | Θ) (2), where the likelihood term p(Y | S) = ∏_t N(y_t | X s_t, Σ) factorizes over time and Σ represents sensor noise. [sent-25, score-0.921]
16 The source localization problem can be formulated as a (Bayesian) linear regression problem where the source currents s play the role of the regression coefficients and rows of the lead field matrix X can be interpreted as covariates. [sent-31, score-0.442]
17 In the following, we define a multivariate Laplace distribution, represented in terms of a scale mixture, as a convenient prior that incorporates both spatio-temporal and sparsity constraints. [sent-32, score-0.276]
18 L(s | λ) = … (4). Eltoft et al. [5] defined the multivariate Laplace distribution as a scale mixture of a multivariate Gaussian given by √z Σ^{1/2} s, where s is a standard normal multivariate Gaussian, Σ is a positive definite matrix, and z is drawn from a univariate exponential distribution. [sent-36, score-0.604]
19 In contrast, we use an alternative formulation of the multivariate Laplace distribution that couples the variances of the sources rather than the source currents themselves. [sent-38, score-0.809]
20 For an uncoupled multivariate Laplace distribution, this generalization reads L(s | λ) = ∏_i ∫∫ N(s_i | 0, u_i² + v_i²) N(u_i | 0, 1/λ²) N(v_i | 0, 1/λ²) du dv (5), such that each source current s_i gets assigned scale variables u_i and v_i. [sent-41, score-3.115]
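The scale-mixture form in (5) can be sketched directly: draw the scale variables from zero-mean Gaussians with variance 1/λ², then draw each source current from a zero-mean Gaussian whose variance is u_i² + v_i². The choice of λ and the sample size below are arbitrary assumptions for illustration.

import numpy as np

def sample_uncoupled_mvl(p, lam, rng=None):
    # One draw from the uncoupled multivariate Laplace via the scale mixture of Eq. (5).
    rng = rng or np.random.default_rng()
    u = rng.normal(0.0, 1.0 / lam, size=p)          # u_i ~ N(0, 1/lam^2)
    v = rng.normal(0.0, 1.0 / lam, size=p)          # v_i ~ N(0, 1/lam^2)
    s = rng.normal(0.0, np.sqrt(u**2 + v**2))       # s_i ~ N(0, u_i^2 + v_i^2)
    return s, u, v

s, u, v = sample_uncoupled_mvl(p=10000, lam=2.0)    # marginals of s are univariate Laplace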
21 We can interpret the scale variables corresponding to source i as indicators of its relevance: the larger (the posterior estimate of) u_i² + v_i², the more relevant the corresponding source. [sent-42, score-0.802]
22 Figure 1: Factor graph representation of Bayesian source localization with a multivariate Laplace prior (variable nodes s_1, …, s_p, u_1, …, u_p, v_1, …, v_p and factor nodes f, g_1, …, g_p, h_1, h_2). [sent-44, score-0.508]
23 Factors gi correspond to the coupling between sources and scales. [sent-46, score-0.445]
24 Factors h1 and h2 represent the (identical) multivariate Gaussians on u and v with prior precision matrix J. [sent-47, score-0.288]
25 This definition yields a coupling in the magnitudes of the source currents through their variances. [sent-50, score-0.604]
26 Note that this approach is defining the multivariate Laplace with the help of a multivariate exponential distribution [10]. [sent-52, score-0.292]
27 3 Approximate inference Our goal is to compute posterior marginals for sources si as well as scale variables ui and vi in order to determine source relevance. [sent-57, score-1.753]
28 Using EP, we will approximate p(z) with q(z) ∝ t_0(z) ∏_i t̄_i(z), where the t̄_i(z) are Gaussian functions as well. [sent-68, score-0.36]
29 Equation (6) introduces 2p auxiliary Gaussian variables (u, v) that are coupled to the s_i's by p non-Gaussian factors; thus, we have to approximate p terms. [sent-70, score-0.457]
30 The multivariate Laplace distribution defined in [5] introduces one auxiliary variable and couples all the s_i s_j terms to it; therefore, it would lead to p² non-Gaussian terms to be approximated. [sent-71, score-0.547]
31 Moreover, as we will see below, the a priori independence of u and v and the form of the terms t_i(z) results in an approximation of the posterior with the same block-diagonal structure as that of t_0(z). [sent-72, score-0.225]
32 In each step, EP updates t̄_i with t̄_i* by defining q^\i ∝ t_0(z) ∏_{j≠i} t̄_j, minimizing KL( t_i q^\i ‖ q* ) with respect to q*, and setting t̄_i* ∝ q*/q^\i. [sent-73, score-0.355]
33 It can be shown that when t_i depends only on a subset of variables z_i (in our case on z_i = (s_i, u_i, v_i)) then so does t̄_i. [sent-74, score-1.289]
34 The minimization of the KL divergence then boils down to the minimization of KL( t_i(z_i) q^\i(z_i) ‖ q*(z_i) ) with respect to q*(z_i), and t̄_i is updated to t̄_i*(z_i) ∝ q*(z_i)/q^\i(z_i). [sent-75, score-0.33]
35 That is, q*(s_i, u_i, v_i) is a Gaussian with the same mean and covariance matrix as q^i(z_i) ∝ t_i(z_i) q^\i(z_i). [sent-78, score-0.982]
36 We will now work out the EP update for the i-th term approximation in more detail to show by induction that t̄_i(s_i, u_i, v_i) factorizes into independent terms for s_i, u_i, and v_i. [sent-84, score-2.08]
37 Since ui and vi play exactly the same role, it is also easy to see that the term approximation is always symmetric in ui and vi . [sent-85, score-1.565]
38 Let us suppose that q(s_i, u_i, v_i) and consequently q^\i(s_i, u_i, v_i) factorize into independent terms for s_i, u_i, and v_i, e.g. [sent-86, score-2.663]
39 , we can write q^\i(s_i, u_i, v_i) = N(s_i | m_i, σ_i²) N(u_i | 0, ν_i²) N(v_i | 0, ν_i²) (9). [sent-88, score-0.797]
40 By initializing t̄_i(s_i, u_i, v_i) = 1, we have q(z) ∝ t_0(z), and the factorization of q^\i(s_i, u_i, v_i) follows directly from the factorization of t_0(z) into independent terms for s, u, and v. [sent-89, score-1.707]
41 (10) (11) Since q^i(u_i, v_i) only depends on u_i² and v_i² and is thus invariant under sign changes of u_i and v_i, we must have E[u_i] = E[v_i] = 0, as well as E[u_i v_i] = 0. [sent-92, score-1.98]
42 Because of symmetry, we further have E[u_i²] = E[v_i²] = (E[u_i²] + E[v_i²])/2. [sent-93, score-0.806]
43 Since q^i(u_i, v_i) can be expressed as a function of u_i² + v_i² only, this variance can be computed from (11) using one-dimensional Gauss-Laguerre numerical quadrature [15]. [sent-94, score-0.893]
44 The first and second moments of si conditioned upon ui and vi follow directly from (10). [sent-95, score-1.136]
45 Because both (10) and (11) are invariant under sign changes of u_i and v_i, we must have E[s_i u_i] = E[s_i v_i] = 0. [sent-96, score-1.542]
46 Furthermore, since the conditional moments again depend only on u_i² + v_i², also E[s_i] and E[s_i²] can be computed with one-dimensional Gauss-Laguerre integration. [sent-97, score-0.768]
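To illustrate the one-dimensional Gauss-Laguerre step (a sketch under assumptions, not the authors' code): with the factorized Gaussian cavity of (9), r = u_i² + v_i² is exponentially distributed with mean 2ν_i², so the tilted moments reduce to a single quadrature over r. The inputs in the final call are placeholder values.

import numpy as np

def tilted_moments(m, s2, nu2, n_nodes=32):
    # Moments of the tilted distribution q^i, obtained by integrating out r = u_i^2 + v_i^2,
    # which under the cavity is exponential with mean 2*nu2 (since u_i, v_i ~ N(0, nu2)).
    x, w = np.polynomial.laguerre.laggauss(n_nodes)   # Gauss-Laguerre nodes/weights for weight e^{-x}
    r = 2.0 * nu2 * x                                 # change of variables r = 2*nu2*x

    # Normalizing factor of N(s | 0, r) * N(s | m, s2) after integrating out s.
    lik = np.exp(-0.5 * m**2 / (r + s2)) / np.sqrt(2.0 * np.pi * (r + s2))
    Z = np.sum(w * lik)

    mean_s = m * r / (r + s2)                         # conditional mean of s_i given r
    var_s = r * s2 / (r + s2)                         # conditional variance of s_i given r

    E_s = np.sum(w * lik * mean_s) / Z
    E_s2 = np.sum(w * lik * (var_s + mean_s**2)) / Z
    E_r = np.sum(w * lik * r) / Z                     # E[u_i^2 + v_i^2]; splits evenly over u_i, v_i by symmetry
    return E_s, E_s2, E_r

print(tilted_moments(m=0.5, s2=1.0, nu2=0.25))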
47 Summarizing, we have shown that if the old term approximations factorize into independent terms for s_i, u_i, and v_i, the new term approximation after an EP update, t̄_i*(s_i, u_i, v_i) ∝ q*(s_i, u_i, v_i)/q^\i(s_i, u_i, v_i), must do the same. [sent-98, score-3.427]
48 Furthermore, given the cavity distribution q^\i(s_i, u_i, v_i), all required moments can be computed using one-dimensional numerical integration. [sent-99, score-0.878]
49 The crucial observation here is that the terms t_i(s_i, u_i, v_i) introduce dependencies between s_i and (u_i, v_i), as expressed in Eqs. [sent-100, score-1.636]
50 That is, also when the expectations are taken with respect to the exact p(s, u, v), we have E[u_i] = E[v_i] = E[u_i v_i] = E[s_i u_i] = E[s_i v_i] = 0 and E[u_i²] = E[v_i²]. [sent-103, score-1.577]
51 The variance of the scales, E[u_i² + v_i²], determines the amount of regularization on the source parameter s_i, such that large variance implies little regularization. [sent-104, score-1.079]
52 Calculating q^\i(s_i, u_i, v_i) requires the computation of the marginal moments q(s_i), q(u_i) and q(v_i). [sent-107, score-0.839]
53 This precision matrix has the block-diagonal form K = diag( X^T X/σ² + K_s, λ² J + K_u, λ² J + K_v ) (12), where J is a sparse precision matrix which determines the coupling, and K_s, K_u, and K_v = K_u are diagonal matrices that contain the contributions of the term approximations. [sent-109, score-0.297]
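A small sketch of the block-diagonal precision in (12). For brevity the diagonal of the inverse is obtained here by naive column-by-column solves with a sparse LU factorization; the paper instead uses a sparse Cholesky factorization combined with the Takahashi equations, which scales far better. The sizes, the toy chain coupling in J, and the placeholder K_s, K_u are assumptions.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

rng = np.random.default_rng(1)
p, lam, sigma2 = 50, 1.0, 0.1                       # toy sizes and hyperparameters
X = rng.standard_normal((10, p))

# Toy chain coupling as a stand-in for the anatomical neighbourhood precision J.
J = sp.diags([-0.2 * np.ones(p - 1), np.ones(p), -0.2 * np.ones(p - 1)], [-1, 0, 1], format="csc")
Ks = sp.identity(p, format="csc")                   # placeholder term-approximation contributions
Ku = sp.identity(p, format="csc")

K = sp.block_diag([sp.csc_matrix(X.T @ X / sigma2) + Ks,
                   lam**2 * J + Ku,
                   lam**2 * J + Ku], format="csc")  # block-diagonal precision, Eq. (12)

lu = spla.splu(K)                                   # sparse LU factorization
n = K.shape[0]
diag_inv = np.array([lu.solve(np.eye(n)[:, i])[i] for i in range(n)])  # naive diagonal of K^{-1}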
54 4 Experiments Returning to the source localization problem, we will show that the MVL prior can be used to induce constraints on the source estimates. [sent-113, score-0.74]
55 The lead field matrix is defined for the three x, y, and z orientations in each of the source locations and was normalized to correct for depth bias. [sent-124, score-0.447]
56 In the next section, we compare source estimates for the MMN difference wave that have been obtained when using either a decoupled or a coupled MVL prior. [sent-127, score-0.554]
57 For ease of exposition, we focus on a spatial prior induced by the coupling of neighboring sources. [sent-128, score-0.354]
58 Differences in the source estimates will therefore arise only from the form of the 11589 × 11589 sparse precision matrix J. [sent-130, score-0.46]
59 The first estimate is obtained by assuming that there is no coupling between elements of the lead field matrix, such that J = I. [sent-131, score-0.26]
60 The second estimate is obtained by assuming a coupling between neighboring sources i and j within the brain volume with fixed strength c. [sent-133, score-0.577]
61 This coupling is specified through the unnormalized precision matrix Ĵ by assuming Ĵ_{i_x, j_x} = Ĵ_{i_y, j_y} = Ĵ_{i_z, j_z} = −c, while diagonal elements Ĵ_{ii} are set to 1 − Σ_{j≠i} Ĵ_{ij}. [sent-134, score-0.362]
62 This prior dictates that the magnitudes of the variances of the source currents are coupled between sources. [sent-135, score-0.558]
63 For the coupling strength c, we use correlation as a guiding principle. [sent-136, score-0.271]
64 Specifically, correlation between sources s_i and s_j is given by r_ij = [Ĵ⁻¹]_{ij} / ( [Ĵ⁻¹]_{ii}^{1/2} [Ĵ⁻¹]_{jj}^{1/2} ). [sent-138, score-0.569]
65 Note that this also leads to more distant sources having non-zero correlations. [sent-141, score-0.248]
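The construction of Ĵ and the induced correlations can be sketched as follows; the one-dimensional chain neighbourhood is a toy stand-in for the 3-D brain volume, and the values of p and c are arbitrary.

import numpy as np

def coupled_precision(p, neighbours, c):
    # Unnormalized precision: J_hat[i, j] = -c for neighbouring pairs,
    # diagonal J_hat[i, i] = 1 - sum of the off-diagonal entries in row i.
    J = np.zeros((p, p))
    for i, j in neighbours:
        J[i, j] = J[j, i] = -c
    np.fill_diagonal(J, 1.0 - J.sum(axis=1))
    return J

p, c = 6, 0.15
neighbours = [(i, i + 1) for i in range(p - 1)]     # toy 1-D neighbourhood
J_hat = coupled_precision(p, neighbours, c)

C = np.linalg.inv(J_hat)                            # covariance implied by the precision
R = C / np.sqrt(np.outer(np.diag(C), np.diag(C)))   # correlations r_ij
print(np.round(R, 3))                               # even non-neighbours get small non-zero correlations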
66 Figure 2 (panels J, L, C): Spatial coupling leads to the normalized precision matrix J with coupling of neighboring source orientations in the x, y, and z directions. [sent-143, score-0.954]
67 The correlation matrix C shows the correlations between the source orientations. [sent-145, score-0.411]
68 The coupling of neighboring sources is motivated by the notion that we expect neighboring sources to be similarly, though not equivalently, involved for a given task. [sent-147, score-0.544]
69 Figure 2 demonstrates how a chosen coupling leads to a particular structure of J, where irregularities in J are caused by the structure of the imaged brain volume. [sent-149, score-0.304]
70 The correlation matrix C shows the correlations between the sources induced by the structure of J. [sent-153, score-0.351]
71 Zeros in the correlation matrix arise from the independence between source orientations x, y, and z. [sent-154, score-0.431]
72 5 Results Figure 3 depicts the difference wave that was obtained by subtracting the trial average for standard tones from the trial average for deviant tones. [sent-155, score-0.232]
73 Figure 4: Source estimates using a decoupled prior (top) or a coupled prior (bottom). [sent-165, score-0.323]
74 Figure 5: Relative variance using a decoupled prior (top) or a coupled prior (bottom). [sent-167, score-0.328]
75 We now proceed to localizing the sources of the activation induced by mismatch negativity. [sent-170, score-0.328]
76 Figure 4 depicts the localized sources when using either a decoupled MVL prior or a coupled MVL prior. [sent-171, score-0.481]
77 The coupled spatial prior leads to stronger source currents that are spread over a larger brain volume. [sent-172, score-0.632]
78 MVL source localization has correctly identified the source over left temporal cortex but does not capture the source over right temporal cortex that is also hypothesized to be present (cf. [sent-173, score-1.004]
79 Differences between the decoupled and the coupled prior become more salient when we look at the relative variance of the auxiliary variables as shown in Fig. [sent-178, score-0.319]
80 Relative variance is defined here as posterior variance minus prior variance of the auxiliary variables, normalized to be between zero and one. [sent-180, score-0.303]
81 This measure indicates the change in magnitude of the variance of the auxiliary variables, and thus indirectly that of the sources via Eq. [sent-181, score-0.316]
82 Since only sources with non-zero contributions should have high variance, this measure can be used to indicate the relevance of a source. [sent-183, score-0.255]
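As a toy illustration (not the authors' code) of relative variance: posterior minus prior variance of the auxiliary scale variables, min-max rescaled to [0, 1]. The exact normalization used in the paper may differ, and the arrays below are placeholders.

import numpy as np

def relative_variance(post_var, prior_var):
    # Posterior minus prior variance of the auxiliary variables, rescaled to [0, 1].
    rv = post_var - prior_var
    rv = rv - rv.min()
    return rv / rv.max() if rv.max() > 0 else rv

post_var = np.array([0.26, 0.31, 1.40, 0.27])       # hypothetical posterior variances of the scales
prior_var = np.full(4, 0.25)                        # prior variance of the scales (same for each source)
print(relative_variance(post_var, prior_var))       # large values flag relevant sources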
83 Figure 5 shows that temporal sources in both left and right hemispheres are relevant. [sent-184, score-0.255]
84 The relevance of the temporal source in the right hemisphere becomes more pronounced when using the coupled prior. [sent-185, score-0.432]
85 6 Discussion In this paper, we introduced a multivariate Laplace prior as the basis for Bayesian source localization. [sent-186, score-0.471]
86 By formulating this prior as a scale mixture we were able to approximate posteriors of interest using expectation propagation in an efficient manner. [sent-187, score-0.291]
87 Computation time is mainly influenced by the sparsity structure of the precision matrix J which is used to specify interactions between sources by coupling their variances. [sent-188, score-0.6]
88 It was shown that coupling of neighboring sources leads to source estimates that are somewhat more spatially smeared as compared with a decoupled prior. [sent-190, score-0.932]
89 However, posterior marginals can still be used to exclude irrelevant sources since these will typically have a mean activation close to zero with small variance. [sent-195, score-0.346]
90 Note that it is straightforward to impose other constraints since this only requires the specification of suitable interactions between sources through J. [sent-199, score-0.275]
91 For instance, the spatial prior could be made more realistic by taking anatomical constraints into account or by the inclusion of coupling between sources over time. [sent-200, score-0.557]
92 Other constraints that can be implemented with our approach are the coupling of individual orientations within a source, or even the coupling of source estimates between different subjects. [sent-201, score-0.852]
93 Coupling of source orientations has been realized before in [9] through an 1 / 2 norm, although not using a fully Bayesian approach. [sent-202, score-0.336]
94 In future work, we aim to examine the effect of the proposed priors and optimize the regularization and coupling parameters via empirical Bayes [4]. [sent-203, score-0.248]
95 Other directions for further research are inclusion of the noise variance in the optimization procedure and dealing with the depth bias that often arises in distributed source models in a more principled way. [sent-204, score-0.379]
96 To obtain a generalization of the univariate Laplace distribution, we used a multivariate exponential distribution of the scales, to be compared with the multivariate log-normal distribution in [11]. [sent-207, score-0.339]
97 Since (the efficiency of) our method for approximate inference only depends on the sparsity of the multivariate scale distribution, and not on its precise form, it should be feasible to compute approximate marginals for the model presented in [11] as well. [sent-210, score-0.345]
98 Concluding, we believe the scale mixture representation of the multivariate Laplace distribution to be a promising approach to Bayesian distributed source localization. [sent-211, score-0.521]
99 Combining sparsity and rotational invariance in EEG/MEG source reconstruction. [sent-282, score-0.315]
100 Classes of multivariate exponential and multivariate geometric distributions derived from Markov processes. [sent-287, score-0.292]
wordName wordTfidf (topN-words)
[('vi', 0.403), ('ui', 0.368), ('si', 0.297), ('source', 0.283), ('mvl', 0.226), ('sources', 0.223), ('coupling', 0.222), ('laplace', 0.216), ('ti', 0.165), ('ep', 0.144), ('multivariate', 0.134), ('negativity', 0.103), ('currents', 0.099), ('zi', 0.094), ('localization', 0.091), ('decoupled', 0.087), ('coupled', 0.085), ('deviant', 0.082), ('mismatch', 0.072), ('takahashi', 0.072), ('moments', 0.068), ('marginals', 0.063), ('meg', 0.062), ('dudv', 0.062), ('mmn', 0.062), ('tones', 0.062), ('posterior', 0.06), ('brain', 0.057), ('bayesian', 0.057), ('ku', 0.056), ('wave', 0.056), ('scale', 0.056), ('xs', 0.055), ('prior', 0.054), ('precision', 0.054), ('factorizes', 0.053), ('orientations', 0.053), ('neighboring', 0.049), ('correlation', 0.049), ('mixture', 0.048), ('variance', 0.048), ('cholesky', 0.047), ('univariate', 0.047), ('eld', 0.046), ('matrix', 0.046), ('auxiliary', 0.045), ('kl', 0.045), ('netherlands', 0.044), ('sensors', 0.044), ('posteriors', 0.043), ('estimates', 0.043), ('eltoft', 0.041), ('nijmegen', 0.041), ('oddball', 0.041), ('uncoupled', 0.041), ('diagonal', 0.04), ('rq', 0.04), ('numerical', 0.039), ('lead', 0.038), ('propagation', 0.038), ('sensor', 0.037), ('ks', 0.037), ('variances', 0.037), ('ims', 0.036), ('magnetoencephalography', 0.036), ('sparse', 0.034), ('correlations', 0.033), ('couples', 0.033), ('localizing', 0.033), ('depicts', 0.032), ('bottleneck', 0.032), ('neuroimage', 0.032), ('temporal', 0.032), ('relevance', 0.032), ('sparsity', 0.032), ('amd', 0.031), ('ministry', 0.031), ('minimum', 0.031), ('ms', 0.03), ('approximate', 0.03), ('spatial', 0.029), ('kv', 0.029), ('constraints', 0.029), ('inverse', 0.028), ('normal', 0.027), ('depth', 0.027), ('mi', 0.026), ('priors', 0.026), ('volume', 0.026), ('tj', 0.025), ('mri', 0.025), ('leads', 0.025), ('exponential', 0.024), ('characteristics', 0.024), ('gaussian', 0.024), ('interactions', 0.023), ('differs', 0.023), ('term', 0.023), ('expectation', 0.022), ('arises', 0.021)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000006 41 nips-2009-Bayesian Source Localization with the Multivariate Laplace Prior
Author: Marcel V. Gerven, Botond Cseke, Robert Oostenveld, Tom Heskes
Abstract: We introduce a novel multivariate Laplace (MVL) distribution as a sparsity promoting prior for Bayesian source localization that allows the specification of constraints between and within sources. We represent the MVL distribution as a scale mixture that induces a coupling between source variances instead of their means. Approximation of the posterior marginals using expectation propagation is shown to be very efficient due to properties of the scale mixture representation. The computational bottleneck amounts to computing the diagonal elements of a sparse matrix inverse. Our approach is illustrated using a mismatch negativity paradigm for which MEG data and a structural MRI have been acquired. We show that spatial coupling leads to sources which are active over larger cortical areas as compared with an uncoupled prior. 1
2 0.20123374 100 nips-2009-Gaussian process regression with Student-t likelihood
Author: Jarno Vanhatalo, Pasi Jylänki, Aki Vehtari
Abstract: In the Gaussian process regression the observation model is commonly assumed to be Gaussian, which is convenient in computational perspective. However, the drawback is that the predictive accuracy of the model can be significantly compromised if the observations are contaminated by outliers. A robust observation model, such as the Student-t distribution, reduces the influence of outlying observations and improves the predictions. The problem, however, is the analytically intractable inference. In this work, we discuss the properties of a Gaussian process regression model with the Student-t likelihood and utilize the Laplace approximation for approximate inference. We compare our approach to a variational approximation and a Markov chain Monte Carlo scheme, which utilize the commonly used scale mixture representation of the Student-t distribution. 1
3 0.19220026 140 nips-2009-Linearly constrained Bayesian matrix factorization for blind source separation
Author: Mikkel Schmidt
Abstract: We present a general Bayesian approach to probabilistic matrix factorization subject to linear constraints. The approach is based on a Gaussian observation model and Gaussian priors with bilinear equality and inequality constraints. We present an efficient Markov chain Monte Carlo inference procedure based on Gibbs sampling. Special cases of the proposed model are Bayesian formulations of nonnegative matrix factorization and factor analysis. The method is evaluated on a blind source separation problem. We demonstrate that our algorithm can be used to extract meaningful and interpretable features that are remarkably different from features extracted using existing related matrix factorization techniques.
4 0.18266319 17 nips-2009-A Sparse Non-Parametric Approach for Single Channel Separation of Known Sounds
Author: Paris Smaragdis, Madhusudana Shashanka, Bhiksha Raj
Abstract: In this paper we present an algorithm for separating mixed sounds from a monophonic recording. Our approach makes use of training data which allows us to learn representations of the types of sounds that compose the mixture. In contrast to popular methods that attempt to extract compact generalizable models for each sound from training data, we employ the training data itself as a representation of the sources in the mixture. We show that mixtures of known sounds can be described as sparse combinations of the training data itself, and in doing so produce significantly better separation results as compared to similar systems based on compact statistical models. Keywords: Example-Based Representation, Signal Separation, Sparse Models. 1
5 0.14643021 170 nips-2009-Nonlinear directed acyclic structure learning with weakly additive noise models
Author: Arthur Gretton, Peter Spirtes, Robert E. Tillman
Abstract: The recently proposed additive noise model has advantages over previous directed structure learning approaches since it (i) does not assume linearity or Gaussianity and (ii) can discover a unique DAG rather than its Markov equivalence class. However, for certain distributions, e.g. linear Gaussians, the additive noise model is invertible and thus not useful for structure learning, and it was originally proposed for the two variable case with a multivariate extension which requires enumerating all possible DAGs. We introduce weakly additive noise models, which extends this framework to cases where the additive noise model is invertible and when additive noise is not present. We then provide an algorithm that learns an equivalence class for such models from data, by combining a PC style search using recent advances in kernel measures of conditional dependence with local searches for additive noise models in substructures of the Markov equivalence class. This results in a more computationally efficient approach that is useful for arbitrary distributions even when additive noise models are invertible. 1
6 0.10353205 240 nips-2009-Sufficient Conditions for Agnostic Active Learnable
7 0.10253958 43 nips-2009-Bayesian estimation of orientation preference maps
8 0.10134234 94 nips-2009-Fast Learning from Non-i.i.d. Observations
9 0.090513639 225 nips-2009-Sparsistent Learning of Varying-coefficient Models with Structural Changes
10 0.08965607 150 nips-2009-Maximum likelihood trajectories for continuous-time Markov chains
11 0.082360595 247 nips-2009-Time-rescaling methods for the estimation and assessment of non-Poisson neural encoding models
12 0.080288745 187 nips-2009-Particle-based Variational Inference for Continuous Systems
13 0.077967256 224 nips-2009-Sparse and Locally Constant Gaussian Graphical Models
14 0.07613761 254 nips-2009-Variational Gaussian-process factor analysis for modeling spatio-temporal data
15 0.075075477 228 nips-2009-Speeding up Magnetic Resonance Image Acquisition by Bayesian Multi-Slice Adaptive Compressed Sensing
16 0.0710335 255 nips-2009-Variational Inference for the Nested Chinese Restaurant Process
17 0.070054397 131 nips-2009-Learning from Neighboring Strokes: Combining Appearance and Context for Multi-Domain Sketch Recognition
18 0.069500536 57 nips-2009-Conditional Random Fields with High-Order Features for Sequence Labeling
19 0.065473706 38 nips-2009-Augmenting Feature-driven fMRI Analyses: Semi-supervised learning and resting state activity
20 0.063252762 23 nips-2009-Accelerating Bayesian Structural Inference for Non-Decomposable Gaussian Graphical Models
topicId topicWeight
[(0, -0.209), (1, -0.027), (2, 0.057), (3, 0.021), (4, 0.042), (5, -0.084), (6, 0.177), (7, -0.088), (8, -0.043), (9, -0.097), (10, -0.051), (11, -0.027), (12, -0.06), (13, 0.053), (14, 0.03), (15, -0.012), (16, -0.068), (17, 0.102), (18, -0.091), (19, -0.227), (20, -0.031), (21, 0.028), (22, -0.333), (23, 0.013), (24, -0.127), (25, -0.116), (26, -0.038), (27, 0.08), (28, 0.114), (29, -0.042), (30, -0.012), (31, -0.048), (32, 0.002), (33, -0.146), (34, -0.01), (35, -0.098), (36, 0.152), (37, 0.03), (38, -0.014), (39, -0.003), (40, 0.073), (41, 0.078), (42, -0.041), (43, 0.078), (44, -0.012), (45, -0.005), (46, 0.029), (47, 0.064), (48, -0.172), (49, 0.008)]
simIndex simValue paperId paperTitle
same-paper 1 0.98241538 41 nips-2009-Bayesian Source Localization with the Multivariate Laplace Prior
Author: Marcel V. Gerven, Botond Cseke, Robert Oostenveld, Tom Heskes
Abstract: We introduce a novel multivariate Laplace (MVL) distribution as a sparsity promoting prior for Bayesian source localization that allows the specification of constraints between and within sources. We represent the MVL distribution as a scale mixture that induces a coupling between source variances instead of their means. Approximation of the posterior marginals using expectation propagation is shown to be very efficient due to properties of the scale mixture representation. The computational bottleneck amounts to computing the diagonal elements of a sparse matrix inverse. Our approach is illustrated using a mismatch negativity paradigm for which MEG data and a structural MRI have been acquired. We show that spatial coupling leads to sources which are active over larger cortical areas as compared with an uncoupled prior. 1
2 0.79370308 140 nips-2009-Linearly constrained Bayesian matrix factorization for blind source separation
Author: Mikkel Schmidt
Abstract: We present a general Bayesian approach to probabilistic matrix factorization subject to linear constraints. The approach is based on a Gaussian observation model and Gaussian priors with bilinear equality and inequality constraints. We present an efficient Markov chain Monte Carlo inference procedure based on Gibbs sampling. Special cases of the proposed model are Bayesian formulations of nonnegative matrix factorization and factor analysis. The method is evaluated on a blind source separation problem. We demonstrate that our algorithm can be used to extract meaningful and interpretable features that are remarkably different from features extracted using existing related matrix factorization techniques.
3 0.62522632 17 nips-2009-A Sparse Non-Parametric Approach for Single Channel Separation of Known Sounds
Author: Paris Smaragdis, Madhusudana Shashanka, Bhiksha Raj
Abstract: In this paper we present an algorithm for separating mixed sounds from a monophonic recording. Our approach makes use of training data which allows us to learn representations of the types of sounds that compose the mixture. In contrast to popular methods that attempt to extract compact generalizable models for each sound from training data, we employ the training data itself as a representation of the sources in the mixture. We show that mixtures of known sounds can be described as sparse combinations of the training data itself, and in doing so produce significantly better separation results as compared to similar systems based on compact statistical models. Keywords: Example-Based Representation, Signal Separation, Sparse Models. 1
4 0.60505736 100 nips-2009-Gaussian process regression with Student-t likelihood
Author: Jarno Vanhatalo, Pasi Jylänki, Aki Vehtari
Abstract: In the Gaussian process regression the observation model is commonly assumed to be Gaussian, which is convenient in computational perspective. However, the drawback is that the predictive accuracy of the model can be significantly compromised if the observations are contaminated by outliers. A robust observation model, such as the Student-t distribution, reduces the influence of outlying observations and improves the predictions. The problem, however, is the analytically intractable inference. In this work, we discuss the properties of a Gaussian process regression model with the Student-t likelihood and utilize the Laplace approximation for approximate inference. We compare our approach to a variational approximation and a Markov chain Monte Carlo scheme, which utilize the commonly used scale mixture representation of the Student-t distribution. 1
5 0.52761346 170 nips-2009-Nonlinear directed acyclic structure learning with weakly additive noise models
Author: Arthur Gretton, Peter Spirtes, Robert E. Tillman
Abstract: The recently proposed additive noise model has advantages over previous directed structure learning approaches since it (i) does not assume linearity or Gaussianity and (ii) can discover a unique DAG rather than its Markov equivalence class. However, for certain distributions, e.g. linear Gaussians, the additive noise model is invertible and thus not useful for structure learning, and it was originally proposed for the two variable case with a multivariate extension which requires enumerating all possible DAGs. We introduce weakly additive noise models, which extends this framework to cases where the additive noise model is invertible and when additive noise is not present. We then provide an algorithm that learns an equivalence class for such models from data, by combining a PC style search using recent advances in kernel measures of conditional dependence with local searches for additive noise models in substructures of the Markov equivalence class. This results in a more computationally efficient approach that is useful for arbitrary distributions even when additive noise models are invertible. 1
6 0.47170284 150 nips-2009-Maximum likelihood trajectories for continuous-time Markov chains
7 0.44243625 42 nips-2009-Bayesian Sparse Factor Models and DAGs Inference and Comparison
8 0.44040358 43 nips-2009-Bayesian estimation of orientation preference maps
9 0.42356494 254 nips-2009-Variational Gaussian-process factor analysis for modeling spatio-temporal data
10 0.40248522 247 nips-2009-Time-rescaling methods for the estimation and assessment of non-Poisson neural encoding models
11 0.39893588 59 nips-2009-Construction of Nonparametric Bayesian Models from Parametric Bayes Equations
12 0.39541104 94 nips-2009-Fast Learning from Non-i.i.d. Observations
13 0.3859821 152 nips-2009-Measuring model complexity with the prior predictive
14 0.38563451 78 nips-2009-Efficient Moments-based Permutation Tests
15 0.37988156 131 nips-2009-Learning from Neighboring Strokes: Combining Appearance and Context for Multi-Domain Sketch Recognition
16 0.35283431 23 nips-2009-Accelerating Bayesian Structural Inference for Non-Decomposable Gaussian Graphical Models
17 0.3466424 165 nips-2009-Noise Characterization, Modeling, and Reduction for In Vivo Neural Recording
18 0.33839032 188 nips-2009-Perceptual Multistability as Markov Chain Monte Carlo Inference
19 0.33569151 240 nips-2009-Sufficient Conditions for Agnostic Active Learnable
20 0.32953382 222 nips-2009-Sparse Estimation Using General Likelihoods and Non-Factorial Priors
topicId topicWeight
[(21, 0.01), (24, 0.065), (25, 0.045), (35, 0.072), (36, 0.129), (39, 0.048), (42, 0.011), (58, 0.098), (62, 0.199), (71, 0.088), (81, 0.053), (86, 0.08), (91, 0.013)]
simIndex simValue paperId paperTitle
1 0.90632623 13 nips-2009-A Neural Implementation of the Kalman Filter
Author: Robert Wilson, Leif Finkel
Abstract: Recent experimental evidence suggests that the brain is capable of approximating Bayesian inference in the face of noisy input stimuli. Despite this progress, the neural underpinnings of this computation are still poorly understood. In this paper we focus on the Bayesian filtering of stochastic time series and introduce a novel neural network, derived from a line attractor architecture, whose dynamics map directly onto those of the Kalman filter in the limit of small prediction error. When the prediction error is large we show that the network responds robustly to changepoints in a way that is qualitatively compatible with the optimal Bayesian model. The model suggests ways in which probability distributions are encoded in the brain and makes a number of testable experimental predictions. 1
2 0.9050222 164 nips-2009-No evidence for active sparsification in the visual cortex
Author: Pietro Berkes, Ben White, Jozsef Fiser
Abstract: The proposal that cortical activity in the visual cortex is optimized for sparse neural activity is one of the most established ideas in computational neuroscience. However, direct experimental evidence for optimal sparse coding remains inconclusive, mostly due to the lack of reference values on which to judge the measured sparseness. Here we analyze neural responses to natural movies in the primary visual cortex of ferrets at different stages of development and of rats while awake and under different levels of anesthesia. In contrast with prediction from a sparse coding model, our data shows that population and lifetime sparseness decrease with visual experience, and increase from the awake to anesthetized state. These results suggest that the representation in the primary visual cortex is not actively optimized to maximize sparseness. 1
same-paper 3 0.86901832 41 nips-2009-Bayesian Source Localization with the Multivariate Laplace Prior
Author: Marcel V. Gerven, Botond Cseke, Robert Oostenveld, Tom Heskes
Abstract: We introduce a novel multivariate Laplace (MVL) distribution as a sparsity promoting prior for Bayesian source localization that allows the specification of constraints between and within sources. We represent the MVL distribution as a scale mixture that induces a coupling between source variances instead of their means. Approximation of the posterior marginals using expectation propagation is shown to be very efficient due to properties of the scale mixture representation. The computational bottleneck amounts to computing the diagonal elements of a sparse matrix inverse. Our approach is illustrated using a mismatch negativity paradigm for which MEG data and a structural MRI have been acquired. We show that spatial coupling leads to sources which are active over larger cortical areas as compared with an uncoupled prior. 1
4 0.7494604 162 nips-2009-Neural Implementation of Hierarchical Bayesian Inference by Importance Sampling
Author: Lei Shi, Thomas L. Griffiths
Abstract: The goal of perception is to infer the hidden states in the hierarchical process by which sensory data are generated. Human behavior is consistent with the optimal statistical solution to this problem in many tasks, including cue combination and orientation detection. Understanding the neural mechanisms underlying this behavior is of particular importance, since probabilistic computations are notoriously challenging. Here we propose a simple mechanism for Bayesian inference which involves averaging over a few feature detection neurons which fire at a rate determined by their similarity to a sensory stimulus. This mechanism is based on a Monte Carlo method known as importance sampling, commonly used in computer science and statistics. Moreover, a simple extension to recursive importance sampling can be used to perform hierarchical Bayesian inference. We identify a scheme for implementing importance sampling with spiking neurons, and show that this scheme can account for human behavior in cue combination and the oblique effect. 1
5 0.74455887 19 nips-2009-A joint maximum-entropy model for binary neural population patterns and continuous signals
Author: Sebastian Gerwinn, Philipp Berens, Matthias Bethge
Abstract: Second-order maximum-entropy models have recently gained much interest for describing the statistics of binary spike trains. Here, we extend this approach to take continuous stimuli into account as well. By constraining the joint secondorder statistics, we obtain a joint Gaussian-Boltzmann distribution of continuous stimuli and binary neural firing patterns, for which we also compute marginal and conditional distributions. This model has the same computational complexity as pure binary models and fitting it to data is a convex problem. We show that the model can be seen as an extension to the classical spike-triggered average/covariance analysis and can be used as a non-linear method for extracting features which a neural population is sensitive to. Further, by calculating the posterior distribution of stimuli given an observed neural response, the model can be used to decode stimuli and yields a natural spike-train metric. Therefore, extending the framework of maximum-entropy models to continuous variables allows us to gain novel insights into the relationship between the firing patterns of neural ensembles and the stimuli they are processing. 1
6 0.73681736 62 nips-2009-Correlation Coefficients are Insufficient for Analyzing Spike Count Dependencies
7 0.73387599 141 nips-2009-Local Rules for Global MAP: When Do They Work ?
8 0.73212683 260 nips-2009-Zero-shot Learning with Semantic Output Codes
9 0.73061067 132 nips-2009-Learning in Markov Random Fields using Tempered Transitions
10 0.72988927 18 nips-2009-A Stochastic approximation method for inference in probabilistic graphical models
11 0.72936606 129 nips-2009-Learning a Small Mixture of Trees
12 0.72853529 250 nips-2009-Training Factor Graphs with Reinforcement Learning for Efficient MAP Inference
13 0.72555375 40 nips-2009-Bayesian Nonparametric Models on Decomposable Graphs
14 0.72500533 224 nips-2009-Sparse and Locally Constant Gaussian Graphical Models
15 0.72368795 158 nips-2009-Multi-Label Prediction via Sparse Infinite CCA
16 0.72263318 174 nips-2009-Nonparametric Latent Feature Models for Link Prediction
17 0.72229153 210 nips-2009-STDP enables spiking neurons to detect hidden causes of their inputs
18 0.72199094 72 nips-2009-Distribution Matching for Transduction
19 0.72138637 97 nips-2009-Free energy score space
20 0.71970153 99 nips-2009-Functional network reorganization in motor cortex can be explained by reward-modulated Hebbian learning