nips nips2012 nips2012-151 knowledge-graph by maker-knowledge-mining

151 nips-2012-High-Order Multi-Task Feature Learning to Identify Longitudinal Phenotypic Markers for Alzheimer's Disease Progression Prediction


Source: pdf

Author: Hua Wang, Feiping Nie, Heng Huang, Jingwen Yan, Sungeun Kim, Shannon Risacher, Andrew Saykin, Li Shen

Abstract: Alzheimer’s disease (AD) is a neurodegenerative disorder characterized by progressive impairment of memory and other cognitive functions. Regression analysis has been studied to relate neuroimaging measures to cognitive status. However, whether these measures have further predictive power to infer a trajectory of cognitive performance over time is still an under-explored but important topic in AD research. We propose a novel high-order multi-task learning model to address this issue. The proposed model explores the temporal correlations existing in imaging and cognitive data by structured sparsity-inducing norms. The sparsity of the model enables the selection of a small number of imaging measures while maintaining high prediction accuracy. The empirical studies, using the longitudinal imaging and cognitive data of the ADNI cohort, have yielded promising results.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract Alzheimer’s disease (AD) is a neurodegenerative disorder characterized by progressive impairment of memory and other cognitive functions. [sent-6, score-0.59]

2 Regression analysis has been studied to relate neuroimaging measures to cognitive status. [sent-7, score-0.558]

3 However, whether these measures have further predictive power to infer a trajectory of cognitive performance over time is still an under-explored but important topic in AD research. [sent-8, score-0.508]

4 The proposed model explores the temporal correlations existing in imaging and cognitive data by structured sparsity-inducing norms. [sent-10, score-0.759]

5 The sparsity of the model enables the selection of a small number of imaging measures while maintaining high prediction accuracy. [sent-11, score-0.364]

6 The empirical studies, using the longitudinal imaging and cognitive data of the ADNI cohort, have yielded promising results. [sent-12, score-1.002]

7 1 Introduction Neuroimaging is a powerful tool for characterizing neurodegenerative process in the progression of Alzheimer’s disease (AD). [sent-13, score-0.223]

8 Neuroimaging measures have been widely studied to predict disease status and/or cognitive performance [1, 2, 3, 4, 5, 6, 7]. [sent-14, score-0.566]

9 However, whether these measures have further predictive power to infer a trajectory of cognitive performance over time is still an underexplored yet important topic in AD research. [sent-15, score-0.508]

10 A simple strategy typically used in longitudinal studies (e. [sent-16, score-0.368]

11 This approach may be inadequate to capture the complete dynamics of cognitive trajectories and thus may fail to identify the underlying neurodegenerative mechanisms. [sent-19, score-0.471]

12 Therefore, if longitudinal cognitive outcomes are available, it would be beneficial to use the complete information for the identification of relevant imaging markers [9, 10]. [sent-24, score-1.347]

13 1 Figure 1: Longitudinal multi-task regression of cognitive trajectories on MRI measures. [sent-35, score-0.454]

14 However, how to identify the temporal imaging features that predict longitudinal outcomes is a challenging machine learning problem. [sent-36, score-0.742]

15 For example, both input neuroimaging measures (samples × features × time) and output cognitive scores (samples × scores × time) are 3D tensors. [sent-38, score-0.722]

16 Thus, it is not trivial to build the longitudinal learning model for tensor data. [sent-39, score-0.462]

17 Second, the associations between imaging measures and an outcome (e.g., a cognitive score) at two consecutive time points are often correlated. [sent-42, score-0.42]

18 How to efficiently incorporate such correlations of associations across time is unclear. [sent-43, score-0.176]

19 Third, some longitudinal learning tasks are often interrelated. [sent-44, score-0.446]

20 How to integrate such task correlations into the longitudinal learning model is under-explored. [sent-46, score-0.481]

21 In this paper, we focus on the problem of predicting longitudinal cognitive trajectories using neuroimaging measures. [sent-47, score-0.862]

22 We propose a novel high-order multi-task feature learning approach to identify longitudinal neuroimaging markers that can accurately predict cognitive scores over all the time points. [sent-48, score-1.341]

23 The sparsity-inducing norms are introduced to integrate the correlations existing in both features and tasks. [sent-49, score-0.105]

24 As a result, the selected imaging markers can fully differentiate the entire longitudinal trajectory of relevant scores and better capture the associations between imaging markers and cognitive changes over time. [sent-50, score-2.145]

25 Because the structured sparsity-inducing norms enforce the correlations along two directions of the learned coefficient tensor, the parameters in different sparsity norms are tangled together by distinct structures and lead to a difficult optimization problem. [sent-51, score-0.134]

26 We apply the proposed longitudinal multi-task regression method to the ADNI cohort. [sent-54, score-0.426]

27 In our experiments, the proposed method not only achieves competitive prediction accuracy but also identifies a small number of imaging markers that are consistent with prior knowledge. [sent-55, score-0.591]

28 2 High-Order Multi-Task Feature Learning Using Sparsity-Inducing Norms For AD progression prediction using longitudinal phenotypic markers, the input imaging features are a set of matrices X = {X1 , X2 , . [sent-56, score-0.797]

29 , XT } ∈ Rd×n×T corresponding to the measurements at T consecutive time points, where Xt is the phenotypic measurements for a certain type of imaging markers, such as voxel-based morphometry (VBM) markers (see details in Section 3) used in this study, at time t (1 ≤ t ≤ T ). [sent-59, score-0.757]

30 X is thus a data tensor with d imaging features, n subject samples, and T time points. [sent-60, score-0.387]

31 The output cognitive assessments for the same set of subjects are a set of matrices Y = {Y1 , Y2 , . [sent-61, score-0.446]

32 , YT } ∈ Rn×c×T for a certain type of the cognitive measurements, such as RAVLT memory scores (see details in Section 3), at the same T consecutive time points. [sent-64, score-0.527]

33 Again, Y is a data tensor with n samples, c scores, and T time points. [sent-65, score-0.118]
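To make the data layout concrete, here is a minimal sketch in Python/NumPy (the language choice and the sizes below are illustrative assumptions, not the actual ADNI dimensions):

```python
import numpy as np

# Hypothetical sizes for illustration only (not the real ADNI cohort dimensions):
d, n, c, T = 90, 100, 3, 4  # imaging features, subjects, cognitive scores, time points

rng = np.random.default_rng(0)
X = rng.standard_normal((d, n, T))  # input tensor: features x samples x time
Y = rng.standard_normal((n, c, T))  # output tensor: samples x scores x time

# Each frontal slice X[:, :, t] is the matrix Xt of measurements at time t,
# and Y[:, :, t] is the matrix Yt of cognitive scores at the same time point.
print(X[:, :, 0].shape, Y[:, :, 0].shape)
```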

34 Prior regression analyses typically study the associations between imaging features and cognitive measures at each time point separately, which is equivalent to assuming that the learning tasks, i. [sent-67, score-0.911]

35 Although this assumption can simplify the problem and make the solution easier to obtain, it overlooks the temporal correlations of imaging and cognitive measures. [sent-70, score-0.759]

36 Middle: the matrix unfolded from B along the first mode (feature dimension). [sent-72, score-0.118]

37 Right: the matrix unfolded from B along the second mode (task dimension). [sent-73, score-0.118]

38 As a result, we aim to learn a coefficient tensor (a stack of coefficient matrices) B = {B1 , · · · , BT } ∈ Rd×c×T , as illustrated in the left panel of Figure 2, to reveal the temporal changes of the coefficient matrices. [sent-75, score-0.226]
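The two unfoldings described above can be sketched as follows (a NumPy illustration; the concrete slice-stacking order is an assumption consistent with the description of Figure 2):

```python
import numpy as np

d, c, T = 5, 3, 4
B = np.arange(d * c * T, dtype=float).reshape(d, c, T)  # toy coefficient tensor

# Mode-1 unfolding (feature dimension): place the T slices B_t side by side -> d x (cT)
B1 = np.concatenate([B[:, :, t] for t in range(T)], axis=1)

# Mode-2 unfolding (task dimension): place the transposed slices side by side -> c x (dT)
B2 = np.concatenate([B[:, :, t].T for t in range(T)], axis=1)

# The Gram matrix of an unfolding sums over time, which is what the
# trace-norm surrogates in the algorithm rely on: B1 @ B1.T == sum_t B_t B_t^T
S = sum(B[:, :, t] @ B[:, :, t].T for t in range(T))
assert np.allclose(B1 @ B1.T, S)
print(B1.shape, B2.shape)
```

Note that the Gram matrix is invariant to the column order of the unfolding, so any consistent stacking convention yields the same quantities in the derivations.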

39 Therefore it does not take into account the longitudinal correlations between imaging features and cognitive measures. [sent-82, score-1.078]

40 Because our goal in the association study is to select the imaging markers which are connected to the temporal changes of all the cognitive measures, the T groups of regression tasks at different time points should not be decoupled and have to be performed simultaneously. [sent-83, score-1.205]

41 By solving the objective J1 , the imaging features with common influences across all the time points for all the cognitive measures will be selected due to the second term in Eq. [sent-94, score-0.783]

42 (2) couples all the learning tasks together but still does not address the correlations among different learning tasks at different time points. [sent-98, score-0.174]

43 As discussed earlier, during the AD progression many cognitive measures are interrelated and their effects during the process can overlap; thus, it is necessary to further develop the objective J1 in Eq. [sent-99, score-0.531]

44 (2) to leverage the useful information conveyed by the correlations among different cognitive measures. [sent-100, score-0.441]

45 In order to capture the longitudinal patterns of the AD data, we consider two types of task correlations. [sent-101, score-0.405]

46 First, for an individual cognitive measure, although its association with the imaging features at different stages of the disease could differ, its association patterns at two consecutive time points tend to be similar [9]. [sent-102, score-0.881]

47 Second, we know [4, 14] that during the AD progression different cognitive measures are interrelated. [sent-103, score-0.501]

48 Mathematically speaking, both types of correlations can be described by the low ranks of the coefficient matrices unfolded from the coefficient tensor along different modes. [sent-104, score-0.263]
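The link between low rank and the trace norm of an unfolding can be checked numerically (an illustrative NumPy sketch, not part of the paper's code):

```python
import numpy as np

rng = np.random.default_rng(4)
B1 = rng.standard_normal((5, 12))  # stand-in for a mode-1 unfolding of the coefficient tensor

# Trace norm = sum of singular values; penalizing it encourages low rank.
trace_norm = np.linalg.svd(B1, compute_uv=False).sum()

# The identical quantity written as tr((B1 B1^T)^(1/2)), the form that
# appears in the convergence analysis: singular values of B1 are the
# square roots of the eigenvalues of B1 B1^T.
w = np.linalg.eigvalsh(B1 @ B1.T)
assert np.isclose(trace_norm, np.sqrt(np.clip(w, 0, None)).sum())
```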

49 (3) indeed minimizes the ranks of the matrices unfolded from B, such that the two types of correlations among the learning tasks at different time points can be utilized. [sent-114, score-0.23]

50 Due to its capabilities for both imaging marker selection and task correlation integration on longitudinal data, we call J2 defined in Eq. [sent-115, score-0.662]

51 (3) the proposed High-Order Multi-Task Feature Learning model, by which we will study the problem of longitudinal data analysis to predict cognitive trajectories and identify relevant imaging markers. [sent-116, score-1.136]

52 Lemma 2 Given two positive semi-definite matrices A and Ã, the following inequality holds: tr(Ã^(1/2)) − (1/2) tr(Ã A^(−1/2)) ≤ tr(A^(1/2)) − (1/2) tr(A A^(−1/2)). [sent-134, score-0.784]
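A quick numeric sanity check of the inequality in Lemma 2 (an illustrative sketch; `psd_power` and `rand_pd` are hypothetical helper names, and the random matrices are made strictly positive definite so that A^(−1/2) exists):

```python
import numpy as np

def psd_power(M, p):
    """Matrix power of a symmetric positive (semi-)definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.clip(w, 0.0, None) ** p) @ V.T

rng = np.random.default_rng(1)

def rand_pd(k):
    R = rng.standard_normal((k, k))
    return R @ R.T + 1e-3 * np.eye(k)  # strictly positive definite

A, A_tilde = rand_pd(4), rand_pd(4)
A_inv_half = psd_power(A, -0.5)
lhs = np.trace(psd_power(A_tilde, 0.5)) - 0.5 * np.trace(A_tilde @ A_inv_half)
rhs = np.trace(psd_power(A, 0.5)) - 0.5 * np.trace(A @ A_inv_half)
assert lhs <= rhs + 1e-9  # the inequality of Lemma 2 holds on this draw
```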

53 Following the same idea, we also introduce a small perturbation ξ > 0 to replace ||M||_* by tr((M M^T + ξI)^(1/2)) for the same reason. [sent-140, score-0.196]

54 Calculate the diagonal matrix D^(g), where the i-th diagonal element is computed as D^(g)(i, i) = 1 / (2 (Σ_{t=1}^T ||b_t^(g),i||_2^2)^(1/2)); calculate D̄^(g) = (1/2) (B(1)^(g) (B(1)^(g))^T)^(−1/2); calculate D̂^(g) = (1/2) (B(2)^(g) (B(2)^(g))^T)^(−1/2).
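This reweighting step can be sketched as follows (an illustrative NumPy version of the update; the variable names and the ξ-perturbed inverses are assumptions consistent with the surrounding text):

```python
import numpy as np

d, c, T = 5, 3, 4
xi = 1e-6  # small perturbation keeping the inverses well defined
rng = np.random.default_rng(2)
B = rng.standard_normal((d, c, T))  # current coefficient tensor B^(g)

# D: diagonal, with D(i, i) = 1 / (2 * sqrt(sum_t ||b_t^i||_2^2)),
# the reweighting for the tensor l2,1-norm (row i = imaging feature i)
row_norms = np.sqrt((B ** 2).sum(axis=(1, 2)) + xi)
D = np.diag(1.0 / (2.0 * row_norms))

def inv_sqrt(M):
    w, V = np.linalg.eigh(M)
    return (V * w ** -0.5) @ V.T

# D_bar = (1/2) (B(1) B(1)^T + xi I)^(-1/2), with B(1) the mode-1 unfolding
B1 = np.concatenate([B[:, :, t] for t in range(T)], axis=1)
D_bar = 0.5 * inv_sqrt(B1 @ B1.T + xi * np.eye(d))

# D_hat = (1/2) (B(2) B(2)^T + xi I)^(-1/2), with B(2) the mode-2 unfolding
B2 = np.concatenate([B[:, :, t].T for t in range(T)], axis=1)
D_hat = 0.5 * inv_sqrt(B2 @ B2.T + xi * np.eye(c))

print(D.shape, D_bar.shape, D_hat.shape)
```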

55 tr(Ã^(1/2)) − (1/2) tr(Ã A^(−1/2)) ≤ (1/2) tr(A^(1/2)), which is equivalent to Lemma 2. Now we prove the convergence of Algorithm 1, which is summarized by the following theorem. [sent-166, score-0.196]

56 According to Step 3 of Algorithm 1, we know that the following inequality holds: L^(g+1) + α Σ_{t=1}^T tr((B_t^(g+1))^T D B_t^(g+1)) + β Σ_{t=1}^T tr((B_t^(g+1))^T D̄ B_t^(g+1)) + β Σ_{t=1}^T tr(B_t^(g+1) D̂ (B_t^(g+1))^T) ≤ L^(g) + α Σ_{t=1}^T tr((B_t^(g))^T D B_t^(g)) + β Σ_{t=1}^T tr((B_t^(g))^T D̄ B_t^(g)) + β Σ_{t=1}^T tr(B_t^(g) D̂ (B_t^(g))^T). (8) [sent-171, score-1.176]

57 (8) we can derive: L^(g+1) + α tr((B(1)^(g+1))^T D B(1)^(g+1)) + β tr(B(1)^(g+1) (B(1)^(g+1))^T D̄) + β tr(B(2)^(g+1) (B(2)^(g+1))^T D̂) ≤ L^(g) + α tr((B(1)^(g))^T D B(1)^(g)) + β tr(B(1)^(g) (B(1)^(g))^T D̄) + β tr(B(2)^(g) (B(2)^(g))^T D̂). (9) [sent-173, score-1.176]

58 tr((B(1)^(g+1) (B(1)^(g+1))^T)^(1/2)) − (1/2) tr(B(1)^(g+1) (B(1)^(g+1))^T (B(1)^(g) (B(1)^(g))^T)^(−1/2)) ≤ tr((B(1)^(g) (B(1)^(g))^T)^(1/2)) − (1/2) tr(B(1)^(g) (B(1)^(g))^T (B(1)^(g) (B(1)^(g))^T)^(−1/2)), (12) and the analogous inequality holds for the mode-2 unfoldings B(2)^(g+1) and B(2)^(g). (13) [sent-176, score-1.568]

59 (10–13) together, we can obtain: L^(g+1) + α Σ_{k=1}^d (Σ_{t=1}^T ||b_t^(g+1),k||_2^2)^(1/2) + β tr((B(1)^(g+1) (B(1)^(g+1))^T)^(1/2)) + β tr((B(2)^(g+1) (B(2)^(g+1))^T)^(1/2)) ≤ L^(g) + α Σ_{k=1}^d (Σ_{t=1}^T ||b_t^(g),k||_2^2)^(1/2) + β tr((B(1)^(g) (B(1)^(g))^T)^(1/2)) + β tr((B(2)^(g) (B(2)^(g))^T)^(1/2)). (14) Thus, our algorithm decreases the objective value of Eq. [sent-178, score-0.814]

60 3 Experiments We evaluate the proposed method by applying it to the Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohort to examine the association between a wide range of imaging measures and two types of cognitive measures over a certain period of time. [sent-190, score-0.897]

61 Our goal is to discover a compact set of imaging markers that are closely related to cognitive trajectories. [sent-191, score-0.956]

62 One goal of ADNI has been to test whether serial MRI, PET, other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of Mild Cognitive Impairment (MCI) and early AD. [sent-197, score-0.183]

63 We performed voxel-based morphometry (VBM) on the MRI data by following [8], and extracted mean modulated gray matter (GM) measures for 90 target regions of interest (ROIs) (see Figure 3 for the ROI list and detailed definitions of these ROIs in [3]). [sent-203, score-0.12]

64 These measures were adjusted for the baseline intracranial volume (ICV) using the regression weights derived from the healthy control (HC) participants at the baseline. [sent-204, score-0.199]

65 We also downloaded the longitudinal scores of the participants in two independent cognitive assessments including Fluency Test and Rey’s Auditory Verbal Learning Test (RAVLT). [sent-205, score-0.909]

66 The details of these cognitive assessments can be found in the ADNI procedure manuals2 . [sent-206, score-0.413]

67 The time points examined in this study for both imaging markers and cognitive assessments included baseline (BL), Month 6 (M6), Month 12 (M12) and Month 24 (M24). [sent-207, score-1.052]

68 All the participants with no missing BL/M6/M12/M24 MRI measurements and cognitive measures were included in this study. [sent-208, score-0.506]

69 We examined 3 RAVLT scores RAVLT TOTAL, RAVLT TOT6 and RAVLT RECOG, and 2 Fluency scores FLU ANIM and FLU VEG. [sent-210, score-0.164]

70 1 Improved Cognitive Score Prediction from Longitudinal Imaging Markers We first evaluate the proposed method by applying it to the ADNI cohort for predicting the two types of cognitive scores using the VBM markers, tracked over four different time points. [sent-212, score-0.512]

71 We compare the proposed method against its two close counterparts including multivariate linear regression (LR) and ridge regression (RR). [sent-215, score-0.116]

72 Both LR and RR treat each cognitive measure at each time point separately, and thus they cannot make use of the temporal correlation. [sent-234, score-0.438]

73 We also compare our method to a recent longitudinal method called Temporal Group Lasso Multi-Task Regression (TGL) [9]. [sent-235, score-0.368]

74 TGL takes into account the longitudinal property of the data, but it is designed to analyze only a single memory score at a time. [sent-236, score-0.418]

75 In contrast, besides imposing structured sparsity via tensor ℓ2,1 -norm regularization for imaging marker selection, our new method also imposes two trace norm regularizations to capture the interrelationships among different cognitive measures over the temporal dimension. [sent-237, score-1.0]

76 Thus, the proposed method is able to perform association study for all the relevant scores of a cognitive test at the same time, e. [sent-238, score-0.526]

77 First, we only impose the ℓ2,1 -norm regularization on the unfolded coefficient tensor B along the feature mode, denoted as “ℓ2,1 -norm only”. [sent-242, score-0.213]

78 Second, we only impose the trace norm regularizations on the two coefficient matrices unfolded from the coefficient tensor B along the feature and task modes respectively, denoted as “trace norm only”. [sent-243, score-0.346]

79 To measure prediction performance, we use a standard 5-fold cross-validation strategy, computing the root mean square error (RMSE) between the predicted and actual values of the cognitive scores on the testing data only. [sent-247, score-0.447]

80 Specifically, the whole set of subjects is equally and randomly partitioned into five subsets; each time, the subjects in one subset are selected as the testing samples and all subjects in the remaining four subsets are used for training the regression models. [sent-248, score-0.181]
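The evaluation protocol can be sketched with synthetic data, using ridge regression as a stand-in predictor (everything here is illustrative; the sizes, noise level, and regularization strength are arbitrary choices, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, c = 50, 10, 3
X = rng.standard_normal((n, d))                      # subjects x imaging features
W_true = rng.standard_normal((d, c))
Y = X @ W_true + 0.1 * rng.standard_normal((n, c))   # subjects x cognitive scores

idx = rng.permutation(n)
folds = np.array_split(idx, 5)  # equal random partition into five subsets
rmses = []
for k in range(5):
    test = folds[k]
    train = np.concatenate([folds[j] for j in range(5) if j != k])
    # Ridge regression (RR) as a stand-in predictor; the paper's model would be trained here.
    lam = 1.0
    W = np.linalg.solve(X[train].T @ X[train] + lam * np.eye(d), X[train].T @ Y[train])
    pred = X[test] @ W
    rmses.append(float(np.sqrt(np.mean((pred - Y[test]) ** 2))))

print(len(rmses))  # one RMSE per fold; the fold average is the reported score
```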

81 First, because the LR and RR methods by nature can only deal with one individual cognitive measure at a single time point, they cannot benefit from the correlations across different cognitive measures over the entire time course. [sent-253, score-0.949]

82 Second, although the TGL method improves on the previous two methods by taking longitudinal data patterns into account, it still assumes all the test scores (i.e., learning tasks) from one cognitive assessment to be independent, which is not true in reality. [sent-254, score-0.45]

84 (3) and justifies our motivation to impose ℓ2,1-norm regularization for feature selection and trace-norm regularization to capture task correlations. [sent-261, score-0.118]

85 2 Identification of Longitudinal Imaging Markers One of the primary goals of our regression analysis is to identify a subset of imaging markers that are highly correlated with the AD progression reflected by the cognitive changes over time. [sent-263, score-1.171]

86 Therefore, we examine the imaging markers identified by the proposed methods with respect to the longitudinal changes encoded by the cognitive scores recorded at the four consecutive time points. [sent-264, score-1.486]

87 Shown in Figure 3 are (1) the heat map of the learned weights (magnitudes of the average regression weights for all three RAVLT scores at each time point) of the VBM measures at different time points calculated by our method; and (2) the top 10 weights mapped onto the brain anatomy. [sent-273, score-0.283]

88 These findings are in accordance with the known knowledge that in the pathological pathway of AD, the medial temporal lobe is affected first, followed by progressive neocortical damage [19, 20]. [sent-276, score-0.107]

89 In summary, the identified longitudinally stable imaging markers are highly suggestive and strongly agree with existing research findings, which supports the correctness of the discovered imaging-cognition associations in revealing the complex relationships between MRI measures and cognitive scores. [sent-278, score-1.184]

90 4 Conclusion To reveal the relationship between longitudinal cognitive measures and neuroimaging markers, we have proposed a novel high-order multi-task feature learning model, which selects the longitudinal imaging markers that can accurately predict cognitive measures at all the time points. [sent-280, score-2.443]

91 As a result, these imaging markers could fully differentiate the entire longitudinal trajectory of relevant cognitive measures and better capture the associations between imaging markers and cognitive changes over time. [sent-281, score-2.523]

92 The validations using ADNI imaging and cognitive data have demonstrated the promise of our method. [sent-284, score-0.634]

93 Predictive markers for AD in a multi-modality framework: an analysis of MCI progression in the ADNI population. [sent-294, score-0.827]

94 Predicting clinical scores from magnetic resonance scans in Alzheimer’s disease. [sent-297, score-0.113]

95 Whole genome association study of brain-wide imaging phenotypes for identifying quantitative trait loci in MCI and AD: A study of the ADNI cohort. [sent-302, score-0.436]

96 Sparse multi-task regression and feature selection to identify brain imaging predictors for memory performance. [sent-313, score-0.412]

97 Identifying ad-sensitive and cognitionrelevant imaging biomarkers via joint classification and regression. [sent-405, score-0.31]

98 3D mapping of Mini-Mental State Examination performance in clinical and preclinical Alzheimer disease. [sent-467, score-0.167]

99 Atrophy of the medial occipitotemporal, inferior, and middle temporal gyri in non-demented elderly predict decline to Alzheimer’s disease. [sent-474, score-0.126]

100 Cortical thickness analysis to detect progressive mild cognitive impairment: a reference to Alzheimer’s disease. [sent-487, score-0.399]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('longitudinal', 0.368), ('cognitive', 0.365), ('markers', 0.322), ('bt', 0.321), ('imaging', 0.269), ('ravlt', 0.249), ('adni', 0.226), ('tr', 0.196), ('alzheimer', 0.136), ('ad', 0.126), ('risacher', 0.109), ('neuroimaging', 0.098), ('progression', 0.098), ('dbt', 0.096), ('saykin', 0.096), ('measures', 0.095), ('tensor', 0.094), ('unfolded', 0.093), ('disease', 0.084), ('nie', 0.082), ('scores', 0.082), ('fluency', 0.078), ('correlations', 0.076), ('associations', 0.076), ('lr', 0.071), ('mri', 0.071), ('phenotypic', 0.062), ('tgl', 0.062), ('regression', 0.058), ('mci', 0.055), ('assessment', 0.054), ('aa', 0.052), ('vbm', 0.051), ('yt', 0.05), ('temporal', 0.049), ('assessments', 0.048), ('month', 0.047), ('participants', 0.046), ('xt', 0.042), ('interrelated', 0.041), ('biomarkers', 0.041), ('cohort', 0.041), ('impairment', 0.041), ('neurodegenerative', 0.041), ('regularizations', 0.041), ('atrophy', 0.038), ('trait', 0.038), ('neuroimage', 0.037), ('tasks', 0.037), ('aging', 0.036), ('identify', 0.034), ('progressive', 0.034), ('rr', 0.034), ('shen', 0.034), ('subjects', 0.033), ('coef', 0.033), ('trace', 0.032), ('panel', 0.032), ('kim', 0.032), ('association', 0.032), ('consecutive', 0.031), ('huang', 0.031), ('flu', 0.031), ('gyri', 0.031), ('longitudinally', 0.031), ('nho', 0.031), ('recog', 0.031), ('clinical', 0.031), ('trajectories', 0.031), ('objective', 0.03), ('norm', 0.03), ('norms', 0.029), ('loci', 0.027), ('investigators', 0.027), ('unfold', 0.027), ('bk', 0.027), ('feature', 0.026), ('reveal', 0.026), ('score', 0.025), ('marker', 0.025), ('rois', 0.025), ('sylvester', 0.025), ('heng', 0.025), ('morphometry', 0.025), ('memory', 0.025), ('mode', 0.025), ('changes', 0.025), ('time', 0.024), ('trajectory', 0.024), ('study', 0.024), ('neurobiol', 0.024), ('arlington', 0.024), ('recalled', 0.024), ('medial', 0.024), ('remembered', 0.024), ('relevant', 0.023), ('degenerated', 0.023), ('predict', 0.022), ('bioinformatics', 0.022), ('identifying', 0.022)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000012 151 nips-2012-High-Order Multi-Task Feature Learning to Identify Longitudinal Phenotypic Markers for Alzheimer's Disease Progression Prediction

Author: Hua Wang, Feiping Nie, Heng Huang, Jingwen Yan, Sungeun Kim, Shannon Risacher, Andrew Saykin, Li Shen

Abstract: Alzheimer’s disease (AD) is a neurodegenerative disorder characterized by progressive impairment of memory and other cognitive functions. Regression analysis has been studied to relate neuroimaging measures to cognitive status. However, whether these measures have further predictive power to infer a trajectory of cognitive performance over time is still an under-explored but important topic in AD research. We propose a novel high-order multi-task learning model to address this issue. The proposed model explores the temporal correlations existing in imaging and cognitive data by structured sparsity-inducing norms. The sparsity of the model enables the selection of a small number of imaging measures while maintaining high prediction accuracy. The empirical studies, using the longitudinal imaging and cognitive data of the ADNI cohort, have yielded promising results.

2 0.13783975 284 nips-2012-Q-MKL: Matrix-induced Regularization in Multi-Kernel Learning with Applications to Neuroimaging

Author: Chris Hinrichs, Vikas Singh, Jiming Peng, Sterling Johnson

Abstract: Multiple Kernel Learning (MKL) generalizes SVMs to the setting where one simultaneously trains a linear classifier and chooses an optimal combination of given base kernels. Model complexity is typically controlled using various norm regularizations on the base kernel mixing coefficients. Existing methods neither regularize nor exploit potentially useful information pertaining to how kernels in the input set ‘interact’; that is, higher order kernel-pair relationships that can be easily obtained via unsupervised (similarity, geodesics), supervised (correlation in errors), or domain knowledge driven mechanisms (which features were used to construct the kernel?). We show that by substituting the norm penalty with an arbitrary quadratic function Q 0, one can impose a desired covariance structure on mixing weights, and use this as an inductive bias when learning the concept. This formulation significantly generalizes the widely used 1- and 2-norm MKL objectives. We explore the model’s utility via experiments on a challenging Neuroimaging problem, where the goal is to predict a subject’s conversion to Alzheimer’s Disease (AD) by exploiting aggregate information from many distinct imaging modalities. Here, our new model outperforms the state of the art (p-values 10−3 ). We briefly discuss ramifications in terms of learning bounds (Rademacher complexity). 1

3 0.1309157 203 nips-2012-Locating Changes in Highly Dependent Data with Unknown Number of Change Points

Author: Azadeh Khaleghi, Daniil Ryabko

Abstract: The problem of multiple change point estimation is considered for sequences with unknown number of change points. A consistency framework is suggested that is suitable for highly dependent time-series, and an asymptotically consistent algorithm is proposed. In order for the consistency to be established the only assumption required is that the data is generated by stationary ergodic time-series distributions. No modeling, independence or parametric assumptions are made; the data are allowed to be dependent and the dependence can be of arbitrary form. The theoretical results are complemented with experimental evaluations. 1

4 0.13040648 276 nips-2012-Probabilistic Event Cascades for Alzheimer's disease

Author: Jonathan Huang, Daniel Alexander

Abstract: Accurate and detailed models of neurodegenerative disease progression are crucially important for reliable early diagnosis and the determination of effective treatments. We introduce the ALPACA (Alzheimer’s disease Probabilistic Cascades) model, a generative model linking latent Alzheimer’s progression dynamics to observable biomarker data. In contrast with previous works which model disease progression as a fixed event ordering, we explicitly model the variability over such orderings among patients which is more realistic, particularly for highly detailed progression models. We describe efficient learning algorithms for ALPACA and discuss promising experimental results on a real cohort of Alzheimer’s patients from the Alzheimer’s Disease Neuroimaging Initiative. 1

5 0.11104196 153 nips-2012-How Prior Probability Influences Decision Making: A Unifying Probabilistic Model

Author: Yanping Huang, Timothy Hanks, Mike Shadlen, Abram L. Friesen, Rajesh P. Rao

Abstract: How does the brain combine prior knowledge with sensory evidence when making decisions under uncertainty? Two competing descriptive models have been proposed based on experimental data. The first posits an additive offset to a decision variable, implying a static effect of the prior. However, this model is inconsistent with recent data from a motion discrimination task involving temporal integration of uncertain sensory evidence. To explain this data, a second model has been proposed which assumes a time-varying influence of the prior. Here we present a normative model of decision making that incorporates prior knowledge in a principled way. We show that the additive offset model and the time-varying prior model emerge naturally when decision making is viewed within the framework of partially observable Markov decision processes (POMDPs). Decision making in the model reduces to (1) computing beliefs given observations and prior information in a Bayesian manner, and (2) selecting actions based on these beliefs to maximize the expected sum of future rewards. We show that the model can explain both data previously explained using the additive offset model as well as more recent data on the time-varying influence of prior knowledge on decision making. 1

6 0.10402285 41 nips-2012-Ancestor Sampling for Particle Gibbs

7 0.10233539 292 nips-2012-Regularized Off-Policy TD-Learning

8 0.10042726 363 nips-2012-Wavelet based multi-scale shape features on arbitrary surfaces for cortical thickness discrimination

9 0.095179595 187 nips-2012-Learning curves for multi-task Gaussian process regression

10 0.090917125 312 nips-2012-Simultaneously Leveraging Output and Task Structures for Multiple-Output Regression

11 0.07653635 333 nips-2012-Synchronization can Control Regularization in Neural Systems via Correlated Noise Processes

12 0.067436226 334 nips-2012-Tensor Decomposition for Fast Parsing with Latent-Variable PCFGs

13 0.062536731 124 nips-2012-Factorial LDA: Sparse Multi-Dimensional Text Models

14 0.061861649 208 nips-2012-Matrix reconstruction with the local max norm

15 0.061478559 218 nips-2012-Mixing Properties of Conditional Markov Chains with Unbounded Feature Functions

16 0.055086255 252 nips-2012-On Multilabel Classification and Ranking with Partial Feedback

17 0.054332573 199 nips-2012-Link Prediction in Graphs with Autoregressive Features

18 0.053543288 86 nips-2012-Convex Multi-view Subspace Learning

19 0.053450152 319 nips-2012-Sparse Prediction with the $k$-Support Norm

20 0.051424686 299 nips-2012-Scalable imputation of genetic data with a discrete fragmentation-coagulation process


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.138), (1, 0.007), (2, 0.026), (3, 0.049), (4, 0.026), (5, -0.002), (6, 0.005), (7, 0.021), (8, 0.005), (9, -0.028), (10, -0.009), (11, -0.083), (12, 0.001), (13, -0.01), (14, 0.048), (15, -0.006), (16, 0.041), (17, 0.065), (18, 0.095), (19, -0.114), (20, 0.081), (21, 0.094), (22, -0.156), (23, -0.123), (24, 0.022), (25, -0.111), (26, -0.01), (27, 0.072), (28, 0.017), (29, -0.002), (30, 0.084), (31, -0.091), (32, 0.004), (33, -0.005), (34, 0.07), (35, 0.043), (36, -0.003), (37, -0.098), (38, -0.102), (39, -0.108), (40, -0.085), (41, 0.037), (42, 0.033), (43, -0.105), (44, 0.179), (45, -0.057), (46, -0.077), (47, -0.024), (48, -0.133), (49, 0.006)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.94611174 151 nips-2012-High-Order Multi-Task Feature Learning to Identify Longitudinal Phenotypic Markers for Alzheimer's Disease Progression Prediction

Author: Hua Wang, Feiping Nie, Heng Huang, Jingwen Yan, Sungeun Kim, Shannon Risacher, Andrew Saykin, Li Shen

Abstract: Alzheimer’s disease (AD) is a neurodegenerative disorder characterized by progressive impairment of memory and other cognitive functions. Regression analysis has been studied to relate neuroimaging measures to cognitive status. However, whether these measures have further predictive power to infer a trajectory of cognitive performance over time is still an under-explored but important topic in AD research. We propose a novel high-order multi-task learning model to address this issue. The proposed model explores the temporal correlations existing in imaging and cognitive data by structured sparsity-inducing norms. The sparsity of the model enables the selection of a small number of imaging measures while maintaining high prediction accuracy. The empirical studies, using the longitudinal imaging and cognitive data of the ADNI cohort, have yielded promising results.

2 0.57874346 276 nips-2012-Probabilistic Event Cascades for Alzheimer's disease

Author: Jonathan Huang, Daniel Alexander

Abstract: Accurate and detailed models of neurodegenerative disease progression are crucially important for reliable early diagnosis and the determination of effective treatments. We introduce the ALPACA (Alzheimer’s disease Probabilistic Cascades) model, a generative model linking latent Alzheimer’s progression dynamics to observable biomarker data. In contrast with previous works which model disease progression as a fixed event ordering, we explicitly model the variability over such orderings among patients which is more realistic, particularly for highly detailed progression models. We describe efficient learning algorithms for ALPACA and discuss promising experimental results on a real cohort of Alzheimer’s patients from the Alzheimer’s Disease Neuroimaging Initiative. 1

3 0.53947157 203 nips-2012-Locating Changes in Highly Dependent Data with Unknown Number of Change Points

Author: Azadeh Khaleghi, Daniil Ryabko

Abstract: The problem of multiple change point estimation is considered for sequences with an unknown number of change points. A consistency framework is suggested that is suitable for highly dependent time series, and an asymptotically consistent algorithm is proposed. In order for consistency to be established, the only assumption required is that the data are generated by stationary ergodic time-series distributions. No modeling, independence, or parametric assumptions are made; the data are allowed to be dependent, and the dependence can be of arbitrary form. The theoretical results are complemented with experimental evaluations.
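The paper's algorithm handles unknown numbers of change points under only stationarity and ergodicity; a much simpler single-change-point sketch, using a CUSUM-style statistic for one mean shift, conveys the basic flavor of scanning split points and scoring between-segment discrepancy. This is a generic textbook estimator, not the paper's method.

```python
import numpy as np

def single_change_point(x):
    """Return the split index maximizing the normalized difference of
    segment means (a CUSUM-style statistic for a single mean shift)."""
    n = len(x)
    best_k, best_stat = None, -np.inf
    for k in range(1, n):
        left, right = x[:k], x[k:]
        # sqrt(k(n-k)/n) balances the variance of the two segment means
        stat = np.sqrt(k * (n - k) / n) * abs(left.mean() - right.mean())
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k

rng = np.random.default_rng(1)
# mean shifts from 0 to 2 at index 200
x = np.concatenate([rng.normal(0, 1, 200), rng.normal(2, 1, 300)])
cp = single_change_point(x)
print(cp)
```

Handling dependent data and an unknown number of changes, as the abstract describes, requires replacing the mean-difference statistic with a distributional distance and a recursive segmentation, which is where the paper's contribution lies.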

4 0.50452495 53 nips-2012-Bayesian Pedigree Analysis using Measure Factorization

Author: Bonnie Kirkpatrick, Alexandre Bouchard-côté

Abstract: Pedigrees, or family trees, are directed graphs used to identify sites of the genome that are correlated with the presence or absence of a disease. With the advent of genotyping and sequencing technologies, there has been an explosion in the amount of data available, both in the number of individuals and in the number of sites. Some pedigrees number in the thousands of individuals. Meanwhile, analysis methods have remained limited to pedigrees of fewer than 100 individuals, which restricts analyses to many small independent pedigrees. Disease models, such as those used for the linkage analysis log-odds (LOD) estimator, have similarly been limited. This is because linkage analysis was originally designed with a different task in mind, that of ordering the sites in the genome, before there were technologies that could reveal the order. LODs are difficult to interpret and nontrivial to extend to consider interactions among sites. These developments and difficulties call for the creation of modern methods of pedigree analysis. Drawing from recent advances in graphical model inference and transducer theory, we introduce a simple yet powerful formalism for expressing genetic disease models. We show that these disease models can be turned into accurate and computationally efficient estimators. The technique we use for constructing the variational approximation has potential applications to inference in other large-scale graphical models. This method allows inference on larger pedigrees than previously analyzed in the literature, which improves disease site prediction.

5 0.49374431 363 nips-2012-Wavelet based multi-scale shape features on arbitrary surfaces for cortical thickness discrimination

Author: Won H. Kim, Deepti Pachauri, Charles Hatt, Moo. K. Chung, Sterling Johnson, Vikas Singh

Abstract: Hypothesis testing on signals defined on surfaces (such as the cortical surface) is a fundamental component of a variety of studies in Neuroscience. The goal here is to identify regions that exhibit changes as a function of the clinical condition under study. As the clinical questions of interest move towards identifying very early signs of diseases, the corresponding statistical differences at the group level invariably become weaker and increasingly hard to identify. Indeed, after a multiple comparisons correction is adopted (to account for correlated statistical tests over all surface points), very few regions may survive. In contrast to hypothesis tests on point-wise measurements, in this paper, we make the case for performing statistical analysis on multi-scale shape descriptors that characterize the local topological context of the signal around each surface vertex. Our descriptors are based on recent results from harmonic analysis that show how wavelet theory extends to non-Euclidean settings (i.e., irregular weighted graphs). We provide strong evidence that these descriptors successfully pick up group-wise differences where traditional methods either fail or yield unsatisfactory results. Beyond this primary application, we show how the framework allows performing cortical surface smoothing in the native space without mapping to a unit sphere.
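The multi-scale descriptors above rest on spectral graph theory: a signal on the vertices is filtered at several scales through the eigendecomposition of the graph Laplacian. A minimal sketch using a heat kernel as the scale-dependent filter (a simplification; the paper uses a proper wavelet kernel on weighted cortical meshes) looks like this:

```python
import numpy as np

def multiscale_descriptors(W, f, scales=(0.5, 2.0, 8.0)):
    """Filter a vertex signal f at several diffusion scales via the
    combinatorial graph Laplacian; column j is the signal at scale j."""
    L = np.diag(W.sum(axis=1)) - W               # L = D - W
    evals, evecs = np.linalg.eigh(L)
    fh = evecs.T @ f                             # graph Fourier transform of f
    # multiply each frequency by exp(-t * lambda), then transform back
    return np.column_stack([evecs @ (np.exp(-t * evals) * fh) for t in scales])

# toy example: a path graph on 5 vertices with a unit spike at the center
W = np.zeros((5, 5))
for i in range(4):
    W[i, i + 1] = W[i + 1, i] = 1.0
f = np.array([0.0, 0.0, 1.0, 0.0, 0.0])
D = multiscale_descriptors(W, f)
print(D.shape)  # (5, 3)
```

Small scales keep the descriptor local to each vertex; large scales summarize a broad neighborhood, which is the "local topological context" the abstract refers to. Heat diffusion also conserves total signal mass at every scale.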

6 0.46508056 46 nips-2012-Assessing Blinding in Clinical Trials

7 0.46215302 153 nips-2012-How Prior Probability Influences Decision Making: A Unifying Probabilistic Model

8 0.46070784 299 nips-2012-Scalable imputation of genetic data with a discrete fragmentation-coagulation process

9 0.43540564 292 nips-2012-Regularized Off-Policy TD-Learning

10 0.4077341 157 nips-2012-Identification of Recurrent Patterns in the Activation of Brain Networks

11 0.40281007 41 nips-2012-Ancestor Sampling for Particle Gibbs

12 0.37925544 319 nips-2012-Sparse Prediction with the $k$-Support Norm

13 0.36829987 86 nips-2012-Convex Multi-view Subspace Learning

14 0.35353801 333 nips-2012-Synchronization can Control Regularization in Neural Systems via Correlated Noise Processes

15 0.35282519 208 nips-2012-Matrix reconstruction with the local max norm

16 0.3466233 167 nips-2012-Kernel Hyperalignment

17 0.34527946 312 nips-2012-Simultaneously Leveraging Output and Task Structures for Multiple-Output Regression

18 0.3414416 110 nips-2012-Efficient Reinforcement Learning for High Dimensional Linear Quadratic Systems

19 0.34093374 66 nips-2012-Causal discovery with scale-mixture model for spatiotemporal variance dependencies

20 0.32758602 136 nips-2012-Forward-Backward Activation Algorithm for Hierarchical Hidden Markov Models


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.038), (11, 0.317), (17, 0.015), (21, 0.019), (38, 0.1), (42, 0.024), (54, 0.017), (55, 0.031), (64, 0.014), (74, 0.031), (76, 0.15), (80, 0.055), (92, 0.053), (94, 0.052)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.78631175 238 nips-2012-Neurally Plausible Reinforcement Learning of Working Memory Tasks

Author: Jaldert Rombouts, Pieter Roelfsema, Sander M. Bohte

Abstract: A key function of brains is undoubtedly the abstraction and maintenance of information from the environment for later use. Neurons in association cortex play an important role in this process: by learning, these neurons become tuned to relevant features and represent the information that is required later as a persistent elevation of their activity [1]. It is, however, not well known how such neurons acquire these task-relevant working memories. Here we introduce a biologically plausible learning scheme grounded in Reinforcement Learning (RL) theory [2] that explains how neurons become selective for relevant information by trial-and-error learning. The model has memory units which learn useful internal state representations to solve working memory tasks by transforming partially observable Markov decision problems (POMDPs) into MDPs. We propose that synaptic plasticity is guided by a combination of attentional feedback signals from the action selection stage to earlier processing levels and a globally released neuromodulatory signal. Feedback signals interact with feedforward signals to form synaptic tags at those connections that are responsible for the stimulus-response mapping. The neuromodulatory signal interacts with tagged synapses to determine the sign and strength of plasticity. The learning scheme is generic because it can train networks on different tasks, simply by varying inputs and rewards. It explains how neurons in association cortex learn to 1) temporarily store task-relevant information in non-linear stimulus-response mapping tasks [1, 3, 4] and 2) optimally integrate probabilistic evidence for perceptual decision making [5, 6].

same-paper 2 0.7700423 151 nips-2012-High-Order Multi-Task Feature Learning to Identify Longitudinal Phenotypic Markers for Alzheimer's Disease Progression Prediction

Author: Hua Wang, Feiping Nie, Heng Huang, Jingwen Yan, Sungeun Kim, Shannon Risacher, Andrew Saykin, Li Shen

Abstract: Alzheimer’s disease (AD) is a neurodegenerative disorder characterized by progressive impairment of memory and other cognitive functions. Regression analysis has been studied to relate neuroimaging measures to cognitive status. However, whether these measures have further predictive power to infer a trajectory of cognitive performance over time is still an under-explored but important topic in AD research. We propose a novel high-order multi-task learning model to address this issue. The proposed model explores the temporal correlations existing in imaging and cognitive data by structured sparsity-inducing norms. The sparsity of the model enables the selection of a small number of imaging measures while maintaining high prediction accuracy. The empirical studies, using the longitudinal imaging and cognitive data of the ADNI cohort, have yielded promising results.

3 0.74057305 225 nips-2012-Multi-task Vector Field Learning

Author: Binbin Lin, Sen Yang, Chiyuan Zhang, Jieping Ye, Xiaofei He

Abstract: Multi-task learning (MTL) aims to improve generalization performance by learning multiple related tasks simultaneously and identifying the shared information among tasks. Most existing MTL methods focus on learning linear models under the supervised setting. We propose a novel semi-supervised and nonlinear approach for MTL using vector fields. A vector field is a smooth mapping from the manifold to the tangent spaces which can be viewed as a directional derivative of functions on the manifold. We argue that vector fields provide a natural way to exploit the geometric structure of data as well as the shared differential structure of tasks, both of which are crucial for semi-supervised multi-task learning. In this paper, we develop multi-task vector field learning (MTVFL) which learns the predictor functions and the vector fields simultaneously. MTVFL has the following key properties. (1) The vector fields MTVFL learns are close to the gradient fields of the predictor functions. (2) Within each task, the vector field is required to be as parallel as possible, which is expected to span a low dimensional subspace. (3) The vector fields from all tasks share a low dimensional subspace. We formalize our idea in a regularization framework and also provide a convex relaxation method to solve the original non-convex problem. The experimental results on synthetic and real data demonstrate the effectiveness of our proposed approach.

4 0.69908339 327 nips-2012-Structured Learning of Gaussian Graphical Models

Author: Karthik Mohan, Mike Chung, Seungyeop Han, Daniela Witten, Su-in Lee, Maryam Fazel

Abstract: We consider estimation of multiple high-dimensional Gaussian graphical models corresponding to a single set of nodes under several distinct conditions. We assume that most aspects of the networks are shared, but that there are some structured differences between them. Specifically, the network differences are generated from node perturbations: a few nodes are perturbed across networks, and most or all edges stemming from such nodes differ between networks. This corresponds to a simple model for the mechanism underlying many cancers, in which the gene regulatory network is disrupted due to the aberrant activity of a few specific genes. We propose to solve this problem using the perturbed-node joint graphical lasso, a convex optimization problem that is based upon the use of a row-column overlap norm penalty. We then solve the convex problem using an alternating directions method of multipliers algorithm. Our proposal is illustrated on synthetic data and on an application to brain cancer gene expression data.
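The paper's row-column overlap norm is solved by ADMM; the same algorithmic skeleton is easiest to see on the plain ℓ1 graphical lasso (a single network, simpler penalty, not the paper's joint estimator). The sketch below alternates an eigendecomposition-based update, an elementwise soft-threshold, and a dual update; all parameter values are illustrative.

```python
import numpy as np

def graphical_lasso_admm(S, lam, rho=1.0, n_iter=200):
    """ADMM for: minimize tr(S @ Theta) - logdet(Theta) + lam * ||Theta||_1
    (off-diagonal entries penalized). Returns the sparse copy Z."""
    p = S.shape[0]
    Z, U = np.eye(p), np.zeros((p, p))
    for _ in range(n_iter):
        # Theta-update: solve rho*Theta - inv(Theta) = rho*(Z - U) - S
        evals, Q = np.linalg.eigh(rho * (Z - U) - S)
        d = (evals + np.sqrt(evals ** 2 + 4 * rho)) / (2 * rho)
        Theta = Q @ np.diag(d) @ Q.T
        # Z-update: soft-threshold off-diagonal entries of Theta + U
        A = Theta + U
        Z = np.sign(A) * np.maximum(np.abs(A) - lam / rho, 0.0)
        np.fill_diagonal(Z, np.diag(A))          # leave the diagonal unpenalized
        U = U + Theta - Z                        # dual update
    return Z

rng = np.random.default_rng(2)
# sparse ground-truth precision matrix with a chain structure
P = np.eye(4) + 0.4 * (np.eye(4, k=1) + np.eye(4, k=-1))
X = rng.multivariate_normal(np.zeros(4), np.linalg.inv(P), size=2000)
S = np.cov(X, rowvar=False)
Theta_hat = graphical_lasso_admm(S, lam=0.05)
print(np.round(Theta_hat, 2))
```

Swapping the soft-threshold in the Z-update for the proximal operator of a row-column overlap norm is, conceptually, how the paper's node-perturbation structure would enter this loop.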

5 0.67919111 305 nips-2012-Selective Labeling via Error Bound Minimization

Author: Quanquan Gu, Tong Zhang, Jiawei Han, Chris H. Ding

Abstract: In many practical machine learning problems, the acquisition of labeled data is often expensive and/or time consuming. This motivates us to study the following problem: given a label budget, how to select data points to label such that the learning performance is optimized. We propose a selective labeling method by analyzing the out-of-sample error of Laplacian regularized Least Squares (LapRLS). In particular, we derive a deterministic out-of-sample error bound for LapRLS trained on subsampled data, and propose to select a subset of data points to label by minimizing this upper bound. Since the minimization is a combinatorial problem, we relax it into the continuous domain and solve it by projected gradient descent. Experiments on benchmark datasets show that the proposed method outperforms the state-of-the-art methods.
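The LapRLS estimator that the selection bound is built around fits labeled points while penalizing functions that vary across a similarity graph over all points, labeled and unlabeled. A minimal sketch of the linear primal form (the paper works with the kernel version and its error bound; the graph construction and hyperparameters here are illustrative) is:

```python
import numpy as np

def laprls_linear(X_lab, y, X_all, W_graph, lam1=0.1, lam2=0.1):
    """Linear Laplacian-regularized least squares:
    min ||X_lab w - y||^2 + lam1 ||w||^2 + lam2 (X_all w)^T L (X_all w),
    where L is the Laplacian of a similarity graph over ALL points."""
    d = X_all.shape[1]
    L = np.diag(W_graph.sum(axis=1)) - W_graph
    A = X_lab.T @ X_lab + lam1 * np.eye(d) + lam2 * X_all.T @ L @ X_all
    return np.linalg.solve(A, X_lab.T @ y)

# toy data: two well-separated blobs, one labeled point per blob
rng = np.random.default_rng(3)
X0 = rng.normal([-2.0, 0.0], 0.3, size=(20, 2))
X1 = rng.normal([2.0, 0.0], 0.3, size=(20, 2))
X_all = np.vstack([X0, X1])
D2 = ((X_all[:, None] - X_all[None]) ** 2).sum(-1)
W_graph = np.exp(-D2) * (D2 < 1.0)       # local Gaussian similarity graph
X_lab = X_all[[0, 20]]
y = np.array([-1.0, 1.0])
w = laprls_linear(X_lab, y, X_all, W_graph)
pred = np.sign(X_all @ w)
print(pred[:3], pred[-3:])
```

With only two labels, the Laplacian term propagates them through each blob's graph neighborhood, which is exactly why the choice of *which* points to label, the paper's subject, matters so much.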

6 0.55529481 363 nips-2012-Wavelet based multi-scale shape features on arbitrary surfaces for cortical thickness discrimination

7 0.55504322 284 nips-2012-Q-MKL: Matrix-induced Regularization in Multi-Kernel Learning with Applications to Neuroimaging

8 0.54321492 68 nips-2012-Clustering Aggregation as Maximum-Weight Independent Set

9 0.54187101 79 nips-2012-Compressive neural representation of sparse, high-dimensional probabilities

10 0.54032046 276 nips-2012-Probabilistic Event Cascades for Alzheimer's disease

11 0.53744906 157 nips-2012-Identification of Recurrent Patterns in the Activation of Brain Networks

12 0.53022498 364 nips-2012-Weighted Likelihood Policy Search with Model Selection

13 0.52926123 90 nips-2012-Deep Learning of Invariant Features via Simulated Fixations in Video

14 0.52921742 318 nips-2012-Sparse Approximate Manifolds for Differential Geometric MCMC

15 0.52782953 203 nips-2012-Locating Changes in Highly Dependent Data with Unknown Number of Change Points

16 0.52734286 256 nips-2012-On the connections between saliency and tracking

17 0.52643675 188 nips-2012-Learning from Distributions via Support Measure Machines

18 0.5261777 139 nips-2012-Fused sparsity and robust estimation for linear models with unknown variance

19 0.52562404 354 nips-2012-Truly Nonparametric Online Variational Inference for Hierarchical Dirichlet Processes

20 0.52503449 163 nips-2012-Isotropic Hashing