nips nips2009 nips2009-70 knowledge-graph by maker-knowledge-mining

70 nips-2009-Discriminative Network Models of Schizophrenia


Source: pdf

Author: Irina Rish, Benjamin Thyreau, Bertrand Thirion, Marion Plaze, Marie-Laure Paillere-Martinot, Catherine Martelli, Jean-Luc Martinot, Jean-Baptiste Poline, Guillermo A. Cecchi

Abstract: Schizophrenia is a complex psychiatric disorder that has eluded a characterization in terms of local abnormalities of brain activity, and is hypothesized to affect the collective, “emergent” working of the brain. We propose a novel data-driven approach to capture emergent features using functional brain networks [4] extracted from fMRI data, and demonstrate its advantage over traditional region-of-interest (ROI) and local, task-specific linear activation analyses. Our results suggest that schizophrenia is indeed associated with disruption of global brain properties related to its functioning as a network, which cannot be explained by alteration of local activation patterns. Moreover, further exploitation of interactions by sparse Markov Random Field classifiers shows a clear gain over linear methods, such as Gaussian Naive Bayes and SVM, allowing us to reach 86% accuracy (over a 50% random-guess baseline), which is quite remarkable given that it is based on a single fMRI experiment using a simple auditory task.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 We propose a novel data-driven approach to capture emergent features using functional brain networks [4] extracted from fMRI data, and demonstrate its advantage over traditional region-of-interest (ROI) and local, task-specific linear activation analyses. [sent-10, score-0.713]

2 Our results suggest that schizophrenia is indeed associated with disruption of global brain properties related to its functioning as a network, which cannot be explained by alteration of local activation patterns. [sent-11, score-0.963]

3 The objective of this work is to identify biomarkers predictive of schizophrenia based on fMRI data collected for both schizophrenic and non-schizophrenic subjects performing a simple auditory task in the scanner [14]. [sent-16, score-1.008]

4 , stroke or Parkinson's disease), schizophrenia appears to be “delocalized”, i.e. [sent-19, score-0.333]

5 difficult to attribute to a dysfunction of some particular brain areas. [sent-21, score-0.09]

6 To test this hypothesis, we measured diverse topological features of the functional networks and compared them across the groups of normal subjects and schizophrenic patients. [sent-23, score-0.862]

7 Specifically, we decided to ask the following questions: (1) What specific effects does schizophrenia have on the functional connectivity of brain networks? [sent-24, score-0.609]

8 (2) Does schizophrenia affect functional connectivity in ways that are congruent with the effect it has on area-specific, task-dependent activations? [sent-25, score-0.481]

9 (3) Is it possible to use functional connectivity to improve the classification accuracy of schizophrenic patients? [sent-26, score-0.55]

10 In answer to these questions, we will show that degree maps, which assign to each voxel the number of its neighbors in a network, identify spatially clustered groups of voxels with statistically significant group (i. [sent-27, score-0.746]

11 schizophrenic) differences; moreover, these highly significant voxel subsets are quite stable over different data subsets. [sent-30, score-0.244]

12 In contrast, standard linear activation maps commonly used in fMRI analysis show much weaker group differences, as well as lower stability. [sent-31, score-0.732]

13 Moreover, degree maps yield very informative features, allowing for up to 86% classification accuracy (with 50% baseline), as opposed to standard local voxel activations. [sent-32, score-0.628]

14 Finally, we demonstrate that traditional approaches based on a direct comparison of the correlation at the level of relevant regions of interest (ROIs) or using a functional parcellation technique [17], do not reveal any statistically significant differences between the groups. [sent-34, score-0.291]

15 Indeed, a more data-driven approach that exploits properties of voxel-level networks appears to be necessary in order to achieve high discriminative power. [sent-35, score-0.081]

16 2 Background and Related Work: In Functional Magnetic Resonance Imaging (fMRI), an MR scanner non-invasively records a subject’s blood-oxygenation-level dependent (BOLD) signal, known to be correlated with neural activity, as the subject performs a task of interest (e.g. [sent-36, score-0.076]

17 Commonly used activation maps depict the “activity” level of each voxel determined by the linear correlation of its time course with the stimulus (see Supplemental Material for details). [sent-41, score-0.91]
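As a rough illustration of the correlation-based activation map described above, the sketch below correlates each voxel's time course with a stimulus regressor. The array shapes, the boxcar regressor, and the omission of an HRF convolution are simplifying assumptions, not the authors' exact pipeline.

```python
import numpy as np

def activation_map(bold, stimulus):
    """Correlate each voxel's time course with the stimulus regressor.

    bold     : (n_timepoints, n_voxels) array of BOLD time series
    stimulus : (n_timepoints,) regressor (e.g., task on/off, ideally convolved with an HRF)
    Returns a (n_voxels,) vector of Pearson correlations, i.e. "activation" values.
    """
    b = (bold - bold.mean(axis=0)) / (bold.std(axis=0) + 1e-12)
    s = (stimulus - stimulus.mean()) / (stimulus.std() + 1e-12)
    return b.T @ s / len(s)

# Toy example: 420 TRs, 1000 voxels, a blocked on/off stimulus (no HRF for brevity).
rng = np.random.default_rng(0)
bold = rng.standard_normal((420, 1000))
stim = np.tile(np.repeat([0.0, 1.0], 10), 21)   # simple boxcar of length 420
act = activation_map(bold, stim)
print(act.shape)                                # (1000,)
```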

18 Indeed, as was shown in [8], highly predictive models of mental states can be built from voxels with sub-maximal activation. [sent-43, score-0.333]

19 However, our focus herein is not just predictive modeling, but rather discovery of interpretable features with high discriminative power. [sent-45, score-0.157]

20 , less than 15) known brain regions believed to be relevant to the task or phenomenon of interest. [sent-54, score-0.126]

21 In this paper, we demonstrate that such model-based region-of-interest (ROI) analysis may fail to reveal informative interactions which, nevertheless, become visible at the finer-grain voxel level when using a purely data-driven, network-based approach [4]. [sent-55, score-0.291]

22 Moreover, while recent publications have already indicated that functional networks in the schizophrenic brain display disrupted topological properties, we demonstrate, for the first time, that (1) specific topological properties (e. [sent-56, score-0.809]

23 Two groups of 12 subjects each were submitted to the same experimental paradigm involving language: schizophrenic patients and age-matched normal controls (same experiment was performed with a third group of alcoholic patients, yielding similar results - see Suppl. [sent-60, score-0.668]

24 The studies had been performed after approval of the local ethics committee and all subjects were studied after they gave written informed consent. [sent-62, score-0.07]

25 The task is based on auditory stimuli; subjects listen to emotionally neutral sentences in either their native language (French) or a foreign language. [sent-63, score-0.233]

26 In order to catch the subjects' attention, each trial begins with a short (200 ms) auditory tone, followed by the actual sentence. [sent-66, score-0.085]

27 A full fMRI run contains 96 trials, with 32 sentences in French (native), 32 sentences in foreign languages, and 32 silence interval controls. [sent-69, score-0.226]

28 Several subjects were excluded from consideration due to excessive head motion in the scanner, leaving us with 11 schizophrenic and 11 healthy subjects, i.e. [sent-75, score-0.472]

29 Each sample is associated with roughly 53,000 voxels (after removing out-of-brain voxels from the original 53 × 63 × 46 image), over 420 time points (TRs), i.e. [sent-78, score-0.552]

30 1 Model-Driven Approach using ROI: First, we decided to test whether the interactions between several known regions of interest (ROIs) would contain enough discriminative information about schizophrenic versus normal subjects. [sent-84, score-0.638]

31 A second set of 600 ROIs was defined automatically using a parcellation algorithm [17] that estimates, for each subject, a collection of regions based on task-based functional signal similarity and position in MNI space. [sent-88, score-0.183]

32 Time series were extracted as the spatial mean over each ROI, leading to 10 time series per subject for the predefined ROIs and 600 for the parcellation technique. [sent-89, score-0.104]
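A minimal sketch of this ROI time-series extraction step, assuming the 4D data have already been flattened to a (time x voxels) array and each ROI is given as a boolean voxel mask; all names and shapes are illustrative.

```python
import numpy as np

def roi_time_series(bold, roi_masks):
    """Spatial mean over each ROI at every time point.

    bold      : (n_timepoints, n_voxels) array
    roi_masks : list of boolean arrays of length n_voxels, one per ROI
    Returns a (n_timepoints, n_rois) array (10 columns for the predefined ROIs,
    600 for the parcellation in the setting described above).
    """
    return np.column_stack([bold[:, m].mean(axis=1) for m in roi_masks])

# Pairwise ROI-ROI correlation matrix for one subject (toy data).
rng = np.random.default_rng(1)
bold = rng.standard_normal((420, 5000))
masks = [rng.random(5000) < 0.02 for _ in range(10)]   # 10 fake ROIs
ts = roi_time_series(bold, masks)                      # (420, 10)
corr = np.corrcoef(ts.T)                               # (10, 10) connectivity matrix
print(corr.shape)
```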

33 correlation weighted by the “Language French” condition versus correlation weighted by the “Control” condition, after convolution with a standard hemodynamic response function). [sent-94, score-0.088]

34 Those connectivity measures were then tested for significance using standard non-parametric tests between groups (Wilcoxon signed-rank test), with corrected p-values for multiple comparisons. [sent-95, score-0.089]

35 For each subject, and each run, a separate functional network was constructed. [sent-100, score-0.117]

36 Next, we measured a number of its topological features, including the degree distribution, mean degree, the size of the largest connected subgraph (giant component), and so on (see the supplemental material for the full list). [sent-101, score-0.333]
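A hedged sketch of this kind of voxel-level network construction: voxel-voxel correlations are thresholded into an adjacency matrix, from which the per-voxel degree map, mean degree, and giant-component size are read off. The threshold value and the dense computation are illustrative assumptions; the paper's exact edge definition follows [4] and the supplemental material.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def functional_network_features(bold, corr_threshold=0.7):
    """Build a thresholded correlation graph over voxels and summarize it.

    bold           : (n_timepoints, n_voxels) array for one subject/run
    corr_threshold : edge criterion |r| > threshold (an illustrative choice)
    Returns (degree_map, mean_degree, giant_component_size).
    """
    corr = np.corrcoef(bold.T)                      # (n_voxels, n_voxels)
    adj = np.abs(corr) > corr_threshold
    np.fill_diagonal(adj, False)                    # no self-loops
    degree_map = adj.sum(axis=1).astype(float)      # per-voxel degree
    _, labels = connected_components(csr_matrix(adj), directed=False)
    giant = np.bincount(labels).max()               # size of largest connected subgraph
    return degree_map, degree_map.mean(), giant

# Toy run with a few hundred voxels; the real data have ~53,000 voxels,
# which would require block-wise or sparse computation of the correlations.
rng = np.random.default_rng(2)
deg, mean_deg, giant = functional_network_features(
    rng.standard_normal((420, 300)), corr_threshold=0.15)   # loose toy threshold
print(mean_deg, giant)
```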

37 To find out whether local task-dependent linear activations alone could possibly explain the differences between the schizophrenic and normal brains, we used, as a baseline, a set of features based on the standard voxel activation maps. [sent-104, score-1.239]

38 For each subject, and for each run, activation maps, as well as their differences, or activation contrast maps, were obtained using several regressors based on the language task, as described in the supplemental material (for simplicity, we will refer to all such maps as activation maps). [sent-105, score-1.524]
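A sketch of how such regressor-based activation and contrast maps can be obtained with an ordinary-least-squares GLM; the design matrix, contrast vector, and t-statistic formula below are standard textbook choices, not necessarily the exact ones used by the authors.

```python
import numpy as np

def glm_contrast_tmap(bold, design, contrast):
    """Per-voxel OLS GLM fit and t-statistic for a given contrast.

    bold     : (n_timepoints, n_voxels) array
    design   : (n_timepoints, n_regressors) design matrix (task regressors + intercept)
    contrast : (n_regressors,) contrast vector, e.g. one task condition minus another
    """
    beta, _, _, _ = np.linalg.lstsq(design, bold, rcond=None)    # (n_regressors, n_voxels)
    resid = bold - design @ beta
    dof = design.shape[0] - np.linalg.matrix_rank(design)
    sigma2 = (resid ** 2).sum(axis=0) / dof                      # residual variance per voxel
    var_c = contrast @ np.linalg.pinv(design.T @ design) @ contrast
    return (contrast @ beta) / np.sqrt(sigma2 * var_c + 1e-12)   # one t-value per voxel

# Toy example with two task regressors plus an intercept.
rng = np.random.default_rng(8)
n_t, n_v = 420, 1000
design = np.column_stack([rng.standard_normal(n_t), rng.standard_normal(n_t), np.ones(n_t)])
tmap = glm_contrast_tmap(rng.standard_normal((n_t, n_v)), design, np.array([1.0, -1.0, 0.0]))
print(tmap.shape)   # averaging |t| over voxels gives a mean-t-val style global feature
```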

39 The activation values of each voxel were subsequently used as features in the classification task. [sent-106, score-0.695]

40 Similarly to degree maps, we also computed a global feature, mean-activation (mean-t-val), by taking the mean absolute value of the voxel’s t-statistics. [sent-107, score-0.207]

41 Both activation and degree maps for each sample were also normalized, i. [sent-108, score-0.802]

42 3 Classification Approaches: First, off-the-shelf methods such as Gaussian Naive Bayes (GNB) and Support Vector Machines (SVM) were used in order to compare the discriminative power of the different sets of features described above. [sent-112, score-0.07]
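A sketch of these off-the-shelf baselines under leave-one-out cross-validation (which is leave-one-subject-out when there is one sample per subject, as in the setting above); the feature matrix and labels are placeholders.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Placeholder data: one feature vector (e.g., a degree map or activation map,
# restricted to top-ranked voxels) per subject; labels 1 = schizophrenic, 0 = control.
rng = np.random.default_rng(3)
X = rng.standard_normal((22, 500))          # 11 + 11 subjects
y = np.array([1] * 11 + [0] * 11)

for name, clf in [("GNB", GaussianNB()), ("linear SVM", SVC(kernel="linear"))]:
    acc = cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()
    print(f"{name}: leave-one-subject-out accuracy = {acc:.2f}")
```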

43 Moreover, we decided to further investigate our hypothesis that interactions among voxels contain highly discriminative information, and compare those linear classifiers against probabilistic graphical models that explicitly model such interactions. [sent-113, score-0.423]

44 the (inverse) covariance matrix parameter) for each class Y = {0, 1} (schizophrenic vs non-schizophrenic), and then choose the most likely class label arg max_c p(x|c)P(c) for each unlabeled test sample x. [sent-148, score-0.105]
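A minimal sketch of such a class-conditional Gaussian model with a sparse inverse covariance, using scikit-learn's GraphicalLasso as one possible estimator (the paper's own sparse MRF estimator and regularization may differ): fit a (mean, precision) pair per class, then assign arg max_c p(x|c)P(c).

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

def fit_class_models(X, y, alpha=0.5):
    """Fit a sparse Gaussian (mean, precision, prior) model per class with graphical lasso."""
    models = {}
    for c in np.unique(y):
        Xc = X[y == c]
        gl = GraphicalLasso(alpha=alpha).fit(Xc)        # alpha controls sparsity; needs tuning
        models[c] = (Xc.mean(axis=0), gl.precision_, len(Xc) / len(X))
    return models

def log_joint(x, mu, precision, prior):
    """log p(x|c) + log P(c) for a multivariate Gaussian given its precision matrix."""
    d = x - mu
    _, logdet = np.linalg.slogdet(precision)
    return (0.5 * logdet - 0.5 * d @ precision @ d
            - 0.5 * len(x) * np.log(2 * np.pi) + np.log(prior))

def predict(models, X):
    return np.array([max(models, key=lambda c: log_joint(x, *models[c])) for x in X])

# Toy example with a handful of features (e.g., top-ranked voxel degrees).
rng = np.random.default_rng(4)
X = rng.standard_normal((22, 10))
y = np.array([1] * 11 + [0] * 11)
models = fit_class_models(X, y)
print(predict(models, X[:3]))
```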

45 The variables were ranked in the ascending order of their p-values (lower p = higher confidence in between-group differences), and classification results on top k voxels will be presented for a range of k values. [sent-151, score-0.276]
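A sketch of this ranking-and-selection step, assuming a two-sample t-test per voxel computed on the training subjects only (consistent with the 2-sample t-tests reported below); the function and data are placeholders.

```python
import numpy as np
from scipy.stats import ttest_ind

def top_k_voxels(X_train, y_train, k):
    """Rank voxels by two-sample t-test p-value (group 1 vs group 0) and keep the top k."""
    _, pvals = ttest_ind(X_train[y_train == 1], X_train[y_train == 0], axis=0)
    order = np.argsort(pvals)            # ascending p-value = most significant first
    return order[:k], pvals

rng = np.random.default_rng(5)
X = rng.standard_normal((22, 5000))      # e.g., degree-map features per subject
y = np.array([1] * 11 + [0] * 11)
idx, pvals = top_k_voxels(X, y, k=300)
X_reduced = X[:, idx]                    # features fed to GNB / SVM / MRF classifiers
print(X_reduced.shape)
```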

46 First, we observed that correlations (blind to experimental paradigm) between regions and within subjects were very strong and significant (p-value of 0. [sent-158, score-0.106]

47 05, corrected for the number of comparisons) when tested against 0 for all subjects (mean correlation > 0. [sent-159, score-0.141]

48 The parcellation technique led to some smaller p-values, but also to a stricter correction for multiple comparisons, and no correlation was close to the corrected threshold. [sent-162, score-0.186]

49 In conclusion, we could not detect significant differences between the schizophrenic patient data and normal subjects in either the BOLD signal correlation or the interaction between the signal and the main experimental contrast (native language versus silence). [sent-164, score-0.682]

50 Note that the mean (normalized) degree at those voxels was always (significantly) higher for normals than for schizophrenics. [sent-173, score-0.456]

51 (b) Direct comparison of voxel p-values and FDR threshold: p-values sorted in ascending order; the FDR test selects voxels with p < α · k/N (α = false-positive rate, k = the index of a p-value in the sorted sequence, N = the total number of voxels). [sent-174, score-0.52]
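The selection rule quoted in this caption is the Benjamini-Hochberg FDR procedure; a small self-contained sketch:

```python
import numpy as np

def fdr_select(pvals, alpha=0.05):
    """Benjamini-Hochberg: keep all p-values up to the largest k with p_(k) <= alpha * k / N."""
    p = np.asarray(pvals)
    order = np.argsort(p)
    n = len(p)
    below = p[order] <= alpha * np.arange(1, n + 1) / n
    if not below.any():
        return np.zeros(n, dtype=bool)          # nothing survives the correction
    cutoff = np.max(np.where(below)[0])         # largest sorted index passing the line
    keep = np.zeros(n, dtype=bool)
    keep[order[:cutoff + 1]] = True
    return keep

print(fdr_select([0.001, 0.02, 0.04, 0.3, 0.8], alpha=0.05))   # [True True False False False]
```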

52 Degree maps show much stronger statistical differences between the schizophrenic vs. normal groups. [sent-180, score-0.67]

53 Figure 2 shows the 2-sample t-test results for the full degree map and the activation maps, after False-Discovery Rate (FDR) correction for multiple comparisons (standard in fMRI analysis), at α = 0. [sent-182, score-0.712]

54 , have significantly lower (normalized) voxel degrees in that area than the normal group (possibly due to a more even spread of degrees in schizophrenic vs. [sent-189, score-0.77]

55 Moreover, degree maps demonstrate much higher stability than activation maps with respect to selecting a subset of top-ranked voxels over different subsets of data. [sent-191, score-1.282]

56 Figure 3a shows that degree maps have up to almost 70% of top-ranked voxels in common over different training data sets when using leave-one-subject-out cross-validation, while activation maps have below 50% of voxels in common between different selected subsets. [sent-192, score-1.558]
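A sketch of one way to compute such a stability measure: re-rank voxels on each leave-one-subject-out training set and take the fraction of the top-k set that is common to all folds (the exact overlap statistic used in the paper may differ).

```python
import numpy as np
from scipy.stats import ttest_ind

def topk_stability(X, y, k=1000):
    """Fraction of top-k voxels selected in common across all leave-one-out folds."""
    common = None
    for left_out in range(len(y)):
        keep = np.arange(len(y)) != left_out
        _, p = ttest_ind(X[keep][y[keep] == 1], X[keep][y[keep] == 0], axis=0)
        top = set(np.argsort(p)[:k])
        common = top if common is None else common & top
    return len(common) / k

rng = np.random.default_rng(6)
X = rng.standard_normal((22, 5000))      # placeholder degree-map or activation features
y = np.array([1] * 11 + [0] * 11)
print(topk_stability(X, y, k=500))
```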

57 This property of degree vs activation features is particularly important for interpretability of predictive modeling. [sent-193, score-0.76]

58 A closer look at the degree distributions reveals that a large percentage of the differential connectivity appears to be due to long-distance, inter-hemispheric links. [sent-196, score-0.242]

59 Figure 3a compares (normalized) histograms, for schizophrenic (red) versus normal (blue) groups, of the fraction of inter-hemispheric connections over the total number of connections, computed for each subject within the group. [sent-197, score-0.569]
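A sketch of the inter-hemispheric fraction being compared here, assuming voxel coordinates in MNI space so that the hemisphere is given by the sign of the x coordinate, and an adjacency matrix from the thresholded-correlation network sketched earlier; these assumptions are illustrative, not the paper's exact definition.

```python
import numpy as np

def interhemispheric_fraction(adjacency, voxel_x):
    """Fraction of network edges that connect voxels in opposite hemispheres.

    adjacency : (n_voxels, n_voxels) boolean adjacency matrix of the functional network
    voxel_x   : (n_voxels,) x-coordinates in MNI space (sign encodes hemisphere)
    """
    i, j = np.triu_indices_from(adjacency, k=1)
    edges = adjacency[i, j]
    crossing = edges & (np.sign(voxel_x[i]) != np.sign(voxel_x[j]))
    return crossing.sum() / max(edges.sum(), 1)

# Toy graph over 200 "voxels" with random coordinates.
rng = np.random.default_rng(7)
n = 200
adj = np.triu(rng.random((n, n)) > 0.95, k=1)
adj = adj | adj.T
x = rng.uniform(-70, 70, size=n)
print(interhemispheric_fraction(adj, x))
```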

60 The schizophrenic group shows a significant bias towards low relative inter-hemispheric connectivity. [sent-198, score-0.448]

61 A t-test analysis of the distributions indicates that differences are statistically significant (p=2. [sent-199, score-0.064]

62 Moreover, it is evident that a major contributor to the high degree difference discussed before is the presence of a large number of inter-hemispheric connections in the normal group, which is lacking in the schizophrenic group. [sent-201, score-0.706]

63 Furthermore, we selected bilateral regions of interest (ROIs) corresponding to left and right Brodmann Area 22 (roughly, the clusters in Figure 2a), such that the linear activation for these ROIs was not significantly different between the groups, even in the uncorrected case. [sent-202, score-0.454]

64 For each subject, the link between the left and [figure panel title: Stability of top-ranked voxel subset] [sent-203, score-0.244]

65 [figure legend: degree (full), degree (long distance), degree (inter-hemispheric), activation 1 (and 3), activation 2 (and 4), activation 5, activation 6, activation 7, activation 8] [sent-205, score-2.688]

66 [figure x-axis label: # of top-ranked voxels selected] [sent-208, score-0.276]

67 the percent of voxels in common among the subsets of k top variables selected at all CV folds. [sent-221, score-0.276]

68 For each subject, we compute the fraction of inter-hemispheric connections over the total number of connections, and plot a normalized histogram over all subjects in a particular group (normal - blue, schizophrenic - red). [sent-223, score-0.564]

69 [Table 1 header fragments: (G), mean activation (A), D+A, C+A, G+A, G+D+C, G+D+C+A, (GNB 27. ...)] [sent-228, score-0.418]

70 Clearly, the normal group displays a high density of interhemispheric connections, which are significantly disrupted in the schizophrenic group (p=3. [sent-259, score-0.605]

71 This provides a strong indication that the group differences in connectivity cannot be explained by differences in local activation. [sent-261, score-0.236]

72 While more details are presented in the supplemental material, we outline here the main observations: while mean activation (we used map 8, the best performer for SVM on the full set of voxels - see Table 1b) had a relatively low p-value of 5. [sent-267, score-0.796]

73 3 × 10^-2 for mean-degree, the predictive power of the latter, alone or in combination with some other features, was the best among global features, reaching 27. [sent-269, score-0.117]

74 5% in schizophrenic vs normal classification (Table 1a), while mean activation yielded more than 40% error with all classifiers. [sent-270, score-0.97]

75 While mean-degree indicates the presence of discriminative information in voxel degrees, its generalization ability, though the best among global features and their combinations, is relatively poor. [sent-274, score-0.341]

76 However, voxel-level degree maps turned out to be excellent predictive features, often outperforming activation features by far. [sent-275, score-0.919]

77 Table 1b compares predictions made by SVM on complete maps (without voxel subset selection): both full and long-distance degree maps greatly outperform all activation maps, achieving 16% error vs. [sent-276, score-1.284]

78 above 30% for even the best-performing activation map 8. [sent-277, score-0.444]

79 Next, in Figure 4, we compare the predictive power of different maps when using all three classifiers: Support Vector Machines (SVM), Gaussian Naive Bayes (GNB) and sparse Gaussian Markov Random Field (MRF), on the subsets of k top-ranked voxels, for a variety of k values. [sent-278, score-0.261]

80 We used the best-performing activation map 8 from the Table above, as well as maps 1 and 6 (that survived FDR); map 6 was also outperforming other activation maps in low-voxel regime. [sent-279, score-1.347]

81 To avoid clutter, we only plot the two best-performing degree maps out of three (i. [sent-280, score-0.384]

82 We can see that: (a) Degree maps frequently outperform activation maps, for all classifiers we used; the differences are [figure panel titles: Support Vector Machine: schizophrenic vs normal; Gaussian Naive Bayes: schizophrenic vs normal] [sent-285, score-1.79]

83 [figure legend: activation 1 (FrenchNative - Silence), activation 6 (FrenchNative), activation 8 (Silence), degree (long-distance), degree (full)] [sent-288, score-1.614]

84 [figure panel titles: MRF vs GNB vs SVM: schizophrenic vs normal; Markov Random Field: schizophrenic vs normal] [sent-290, score-1.248]

85 [figure legend, repeated: activation 1 (FrenchNative - Silence), activation 6 (FrenchNative), activation 8 (Silence), degree (long-distance), degree (full)] [sent-292, score-1.614]

86 [figure legend, repeated: activation 1 (FrenchNative - Silence), activation 6 (FrenchNative), activation 8 (Silence), degree (long-distance), degree (full)] [sent-295, score-1.614]

87 [figure legend, panel (d): degree (long-distance) under MRF, GNB, and SVM] [sent-297, score-0.36]

88 [figure x-axis label: K top voxels (ttest)] [sent-317, score-0.828]

89 Figure 4: Classification results comparing (a) GNB, (b) SVM and (c) sparse MRF on degree versus activation contrast maps; (d) all three classifiers compared on long-distance degree maps (best-performing for MRF). [sent-318, score-1.258]

90 particularly noticeable when the number of selected voxels is relatively low. [sent-319, score-0.276]

91 (b) Full and long-distance degree maps perform quite similarly, with long-distance map achieving the best result (14% error) using MRFs. [sent-322, score-0.41]

92 (c) Among the activation maps only, while map 8 (“Silence”) outperforms others on the full set of voxels using SVM, its behavior in the low-voxel regime is quite poor (always above 30-35% error); instead, map 6 (“FrenchNative”) achieves the best performance among activation maps in this regime. [sent-323, score-1.606]

93 From a neuroscience perspective, we provided strong support for the hypothesis that schizophrenia is associated with the disruption of global, emergent brain properties which cannot be explained just by alteration of local activation patterns. [sent-329, score-1.003]

94 Note that the schizophrenia patients studied here have been selected for their prominent, persistent, and pharmaco-resistant auditory hallucinations [14], which might have increased their clinical homogeneity. [sent-332, score-0.516]

95 However, the patient group is not representative of the full spectrum of the disease, and thus our conclusions may not necessarily apply to all schizophrenia patients, due to the clinical characteristics and size of the studied samples. [sent-333, score-0.439]

96 We also observed that performing normalization really helped the activation maps, since otherwise their performance could get much worse, especially with MRFs; we provide those results in the supplemental material. [sent-337, score-0.46]

97 Temporal and cross-subject probabilistic models for fmri prediction tasks. [sent-356, score-0.131]

98 Statistical parametric maps in functional imaging - a general linear approach. [sent-422, score-0.29]

99 Left superior temporal gyrus activation during sentence perception negatively correlates with auditory hallucination severity in schizophrenia patients. [sent-470, score-0.989]

100 Dealing with the shortcomings of spatial normalization: Multi-subject parcellation of fmri datasets. [sent-500, score-0.192]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('activation', 0.418), ('schizophrenic', 0.402), ('schizophrenia', 0.333), ('voxels', 0.276), ('voxel', 0.244), ('maps', 0.204), ('degree', 0.18), ('frenchnative', 0.153), ('fmri', 0.131), ('fdr', 0.121), ('roi', 0.113), ('gnb', 0.111), ('silence', 0.108), ('mrf', 0.106), ('brain', 0.09), ('functional', 0.086), ('auditory', 0.085), ('normal', 0.078), ('topological', 0.077), ('rois', 0.073), ('patients', 0.072), ('vs', 0.072), ('subjects', 0.07), ('differences', 0.064), ('disruption', 0.062), ('connectivity', 0.062), ('gyrus', 0.061), ('parcellation', 0.061), ('predictive', 0.057), ('svm', 0.056), ('cea', 0.055), ('neurospin', 0.055), ('angular', 0.054), ('correction', 0.054), ('temporal', 0.051), ('france', 0.05), ('ttest', 0.049), ('interactions', 0.047), ('connections', 0.046), ('group', 0.046), ('correlation', 0.044), ('classi', 0.044), ('networks', 0.044), ('subject', 0.043), ('cij', 0.042), ('sentences', 0.042), ('supplemental', 0.042), ('emergent', 0.042), ('cingulum', 0.042), ('psychiatry', 0.042), ('survive', 0.042), ('sentence', 0.041), ('decided', 0.038), ('discriminative', 0.037), ('cuneus', 0.036), ('inserm', 0.036), ('native', 0.036), ('saclay', 0.036), ('friston', 0.036), ('regions', 0.036), ('ers', 0.035), ('full', 0.034), ('scanner', 0.033), ('maxc', 0.033), ('alteration', 0.033), ('disrupted', 0.033), ('features', 0.033), ('french', 0.031), ('mid', 0.031), ('network', 0.031), ('paris', 0.031), ('herein', 0.03), ('naive', 0.029), ('collective', 0.028), ('anterior', 0.028), ('bellivier', 0.028), ('bilaterally', 0.028), ('biomarkers', 0.028), ('brodmann', 0.028), ('cecchi', 0.028), ('disconnection', 0.028), ('hemispheric', 0.028), ('martinot', 0.028), ('plaze', 0.028), ('shfj', 0.028), ('thirion', 0.028), ('trs', 0.028), ('global', 0.027), ('outperforming', 0.027), ('anatomical', 0.027), ('corrected', 0.027), ('long', 0.027), ('clinical', 0.026), ('map', 0.026), ('field', 0.026), ('disease', 0.025), ('hypothesis', 0.025), ('cortex', 0.025), ('language', 0.024), ('survived', 0.024)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9999997 70 nips-2009-Discriminative Network Models of Schizophrenia

Author: Irina Rish, Benjamin Thyreau, Bertrand Thirion, Marion Plaze, Marie-Laure Paillere-Martinot, Catherine Martelli, Jean-Luc Martinot, Jean-Baptiste Poline, Guillermo A. Cecchi

Abstract: Schizophrenia is a complex psychiatric disorder that has eluded a characterization in terms of local abnormalities of brain activity, and is hypothesized to affect the collective, “emergent” working of the brain. We propose a novel data-driven approach to capture emergent features using functional brain networks [4] extracted from fMRI data, and demonstrate its advantage over traditional region-of-interest (ROI) and local, task-specific linear activation analyses. Our results suggest that schizophrenia is indeed associated with disruption of global brain properties related to its functioning as a network, which cannot be explained by alteration of local activation patterns. Moreover, further exploitation of interactions by sparse Markov Random Field classifiers shows a clear gain over linear methods, such as Gaussian Naive Bayes and SVM, allowing us to reach 86% accuracy (over a 50% random-guess baseline), which is quite remarkable given that it is based on a single fMRI experiment using a simple auditory task.

2 0.35637841 86 nips-2009-Exploring Functional Connectivities of the Human Brain using Multivariate Information Analysis

Author: Barry Chai, Dirk Walther, Diane Beck, Li Fei-fei

Abstract: In this study, we present a new method for establishing fMRI pattern-based functional connectivity between brain regions by estimating their multivariate mutual information. Recent advances in the numerical approximation of highdimensional probability distributions allow us to successfully estimate mutual information from scarce fMRI data. We also show that selecting voxels based on the multivariate mutual information of local activity patterns with respect to ground truth labels leads to higher decoding accuracy than established voxel selection methods. We validate our approach with a 6-way scene categorization fMRI experiment. Multivariate information analysis is able to find strong information sharing between PPA and RSC, consistent with existing neuroscience studies on scenes. Furthermore, an exploratory whole-brain analysis uncovered other brain regions that share information with the PPA-RSC scene network.

3 0.19149496 110 nips-2009-Hierarchical Mixture of Classification Experts Uncovers Interactions between Brain Regions

Author: Bangpeng Yao, Dirk Walther, Diane Beck, Li Fei-fei

Abstract: The human brain can be described as containing a number of functional regions. These regions, as well as the connections between them, play a key role in information processing in the brain. However, most existing multi-voxel pattern analysis approaches either treat multiple regions as one large uniform region or several independent regions, ignoring the connections between them. In this paper we propose to model such connections in a Hidden Conditional Random Field (HCRF) framework, where the classifier of one region of interest (ROI) makes predictions based on not only its voxels but also the predictions from ROIs that it connects to. Furthermore, we propose a structural learning method in the HCRF framework to automatically uncover the connections between ROIs. We illustrate this approach with fMRI data acquired while human subjects viewed images of different natural scene categories and show that our model can improve the top-level (the classifier combining information from all ROIs) and ROI-level prediction accuracy, as well as uncover some meaningful connections between ROIs.

4 0.16835704 38 nips-2009-Augmenting Feature-driven fMRI Analyses: Semi-supervised learning and resting state activity

Author: Andreas Bartels, Matthew Blaschko, Jacquelyn A. Shelton

Abstract: Resting state activity is brain activation that arises in the absence of any task, and is usually measured in awake subjects during prolonged fMRI scanning sessions where the only instruction given is to close the eyes and do nothing. It has been recognized in recent years that resting state activity is implicated in a wide variety of brain function. While certain networks of brain areas have different levels of activation at rest and during a task, there is nevertheless significant similarity between activations in the two cases. This suggests that recordings of resting state activity can be used as a source of unlabeled data to augment discriminative regression techniques in a semi-supervised setting. We evaluate this setting empirically yielding three main results: (i) regression tends to be improved by the use of Laplacian regularization even when no additional unlabeled data are available, (ii) resting state data seem to have a similar marginal distribution to that recorded during the execution of a visual processing task implying largely similar types of activation, and (iii) this source of information can be broadly exploited to improve the robustness of empirical inference in fMRI studies, an inherently data poor domain. 1

5 0.14106639 47 nips-2009-Boosting with Spatial Regularization

Author: Yongxin Xi, Uri Hasson, Peter J. Ramadge, Zhen J. Xiang

Abstract: By adding a spatial regularization kernel to a standard loss function formulation of the boosting problem, we develop a framework for spatially informed boosting. From this regularized loss framework we derive an efficient boosting algorithm that uses additional weights/priors on the base classifiers. We prove that the proposed algorithm exhibits a “grouping effect”, which encourages the selection of all spatially local, discriminative base classifiers. The algorithm’s primary advantage is in applications where the trained classifier is used to identify the spatial pattern of discriminative information, e.g. the voxel selection problem in fMRI. We demonstrate the algorithm’s performance on various data sets. 1

6 0.11997212 251 nips-2009-Unsupervised Detection of Regions of Interest Using Iterative Link Analysis

7 0.1037435 125 nips-2009-Learning Brain Connectivity of Alzheimer's Disease from Neuroimaging Data

8 0.10200758 219 nips-2009-Slow, Decorrelated Features for Pretraining Complex Cell-like Networks

9 0.1000321 83 nips-2009-Estimating image bases for visual image reconstruction from human brain activity

10 0.099219874 43 nips-2009-Bayesian estimation of orientation preference maps

11 0.084679171 261 nips-2009-fMRI-Based Inter-Subject Cortical Alignment Using Functional Connectivity

12 0.076734774 58 nips-2009-Constructing Topological Maps using Markov Random Fields and Loop-Closure Detection

13 0.075327173 260 nips-2009-Zero-shot Learning with Semantic Output Codes

14 0.071357585 246 nips-2009-Time-Varying Dynamic Bayesian Networks

15 0.069960304 141 nips-2009-Local Rules for Global MAP: When Do They Work ?

16 0.06631542 224 nips-2009-Sparse and Locally Constant Gaussian Graphical Models

17 0.061980899 237 nips-2009-Subject independent EEG-based BCI decoding

18 0.052972261 119 nips-2009-Kernel Methods for Deep Learning

19 0.050166372 77 nips-2009-Efficient Match Kernel between Sets of Features for Visual Recognition

20 0.049790747 49 nips-2009-Breaking Boundaries Between Induction Time and Diagnosis Time Active Information Acquisition


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.177), (1, -0.147), (2, -0.031), (3, 0.103), (4, -0.028), (5, 0.189), (6, 0.038), (7, -0.269), (8, -0.344), (9, -0.022), (10, 0.042), (11, -0.069), (12, -0.033), (13, 0.116), (14, -0.016), (15, -0.038), (16, -0.052), (17, -0.037), (18, 0.053), (19, 0.026), (20, -0.099), (21, 0.012), (22, 0.033), (23, 0.034), (24, 0.015), (25, -0.017), (26, 0.047), (27, -0.021), (28, 0.056), (29, 0.056), (30, -0.012), (31, -0.034), (32, -0.014), (33, -0.01), (34, 0.021), (35, -0.023), (36, 0.041), (37, -0.037), (38, -0.055), (39, 0.043), (40, 0.048), (41, 0.019), (42, -0.101), (43, 0.007), (44, 0.002), (45, -0.039), (46, -0.025), (47, 0.026), (48, 0.001), (49, -0.002)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.94956404 70 nips-2009-Discriminative Network Models of Schizophrenia

Author: Irina Rish, Benjamin Thyreau, Bertrand Thirion, Marion Plaze, Marie-Laure Paillere-Martinot, Catherine Martelli, Jean-Luc Martinot, Jean-Baptiste Poline, Guillermo A. Cecchi

Abstract: Schizophrenia is a complex psychiatric disorder that has eluded a characterization in terms of local abnormalities of brain activity, and is hypothesized to affect the collective, “emergent” working of the brain. We propose a novel data-driven approach to capture emergent features using functional brain networks [4] extracted from fMRI data, and demonstrate its advantage over traditional region-of-interest (ROI) and local, task-specific linear activation analyses. Our results suggest that schizophrenia is indeed associated with disruption of global brain properties related to its functioning as a network, which cannot be explained by alteration of local activation patterns. Moreover, further exploitation of interactions by sparse Markov Random Field classifiers shows a clear gain over linear methods, such as Gaussian Naive Bayes and SVM, allowing us to reach 86% accuracy (over a 50% random-guess baseline), which is quite remarkable given that it is based on a single fMRI experiment using a simple auditory task.

2 0.91328859 86 nips-2009-Exploring Functional Connectivities of the Human Brain using Multivariate Information Analysis

Author: Barry Chai, Dirk Walther, Diane Beck, Li Fei-fei

Abstract: In this study, we present a new method for establishing fMRI pattern-based functional connectivity between brain regions by estimating their multivariate mutual information. Recent advances in the numerical approximation of highdimensional probability distributions allow us to successfully estimate mutual information from scarce fMRI data. We also show that selecting voxels based on the multivariate mutual information of local activity patterns with respect to ground truth labels leads to higher decoding accuracy than established voxel selection methods. We validate our approach with a 6-way scene categorization fMRI experiment. Multivariate information analysis is able to find strong information sharing between PPA and RSC, consistent with existing neuroscience studies on scenes. Furthermore, an exploratory whole-brain analysis uncovered other brain regions that share information with the PPA-RSC scene network.

3 0.8096894 110 nips-2009-Hierarchical Mixture of Classification Experts Uncovers Interactions between Brain Regions

Author: Bangpeng Yao, Dirk Walther, Diane Beck, Li Fei-fei

Abstract: The human brain can be described as containing a number of functional regions. These regions, as well as the connections between them, play a key role in information processing in the brain. However, most existing multi-voxel pattern analysis approaches either treat multiple regions as one large uniform region or several independent regions, ignoring the connections between them. In this paper we propose to model such connections in a Hidden Conditional Random Field (HCRF) framework, where the classifier of one region of interest (ROI) makes predictions based on not only its voxels but also the predictions from ROIs that it connects to. Furthermore, we propose a structural learning method in the HCRF framework to automatically uncover the connections between ROIs. We illustrate this approach with fMRI data acquired while human subjects viewed images of different natural scene categories and show that our model can improve the top-level (the classifier combining information from all ROIs) and ROI-level prediction accuracy, as well as uncover some meaningful connections between ROIs.

4 0.77057934 125 nips-2009-Learning Brain Connectivity of Alzheimer's Disease from Neuroimaging Data

Author: Shuai Huang, Jing Li, Liang Sun, Jun Liu, Teresa Wu, Kewei Chen, Adam Fleisher, Eric Reiman, Jieping Ye

Abstract: Recent advances in neuroimaging techniques provide great potentials for effective diagnosis of Alzheimer’s disease (AD), the most common form of dementia. Previous studies have shown that AD is closely related to the alternation in the functional brain network, i.e., the functional connectivity among different brain regions. In this paper, we consider the problem of learning functional brain connectivity from neuroimaging, which holds great promise for identifying image-based markers used to distinguish Normal Controls (NC), patients with Mild Cognitive Impairment (MCI), and patients with AD. More specifically, we study sparse inverse covariance estimation (SICE), also known as exploratory Gaussian graphical models, for brain connectivity modeling. In particular, we apply SICE to learn and analyze functional brain connectivity patterns from different subject groups, based on a key property of SICE, called the “monotone property” we established in this paper. Our experimental results on neuroimaging PET data of 42 AD, 116 MCI, and 67 NC subjects reveal several interesting connectivity patterns consistent with literature findings, and also some new patterns that can help the knowledge discovery of AD. 1 In trod u cti on Alzheimer’s disease (AD) is a fatal, neurodegenerative disorder characterized by progressive impairment of memory and other cognitive functions. It is the most common form of dementia and currently affects over five million Americans; this number will grow to as many as 14 million by year 2050. The current knowledge about the cause of AD is very limited; clinical diagnosis is imprecise with definite diagnosis only possible by autopsy; also, there is currently no cure for AD, while most drugs only alleviate the symptoms. To tackle these challenging issues, the rapidly advancing neuroimaging techniques provide great potentials. These techniques, such as MRI, PET, and fMRI, produce data (images) of brain structure and function, making it possible to identify the difference between AD and normal brains. Recent studies have demonstrated that neuroimaging data provide more sensitive and consistent measures of AD onset and progression than conventional clinical assessment and neuropsychological tests [1]. Recent studies have found that AD is closely related to the alternation in the functional brain network, i.e., the functional connectivity among different brain regions [ 2]-[3]. Specifically, it has been shown that functional connectivity substantially decreases between the hippocampus and other regions of AD brains [3]-[4]. Also, some studies have found increased connectivity between the regions in the frontal lobe [ 6]-[7]. Learning functional brain connectivity from neuroimaging data holds great promise for identifying image-based markers used to distinguish among AD, MCI (Mild Cognitive Impairment), and normal aging. Note that MCI is a transition stage from normal aging to AD. Understanding and precise diagnosis of MCI have significant clinical value since it can serve as an early warning sign of AD. Despite all these, existing research in functional brain connectivity modeling suffers from limitations. A large body of functional connectivity modeling has been based on correlation analysis [2]-[3], [5]. However, correlation only captures pairwise information and fails to provide a complete account for the interaction of many (more than two) brain regions. 
Other multivariate statistical methods have also been used, such as Principle Component Analysis (PCA) [8], PCA-based Scaled Subprofile Model [9], Independent Component Analysis [10]-[11], and Partial Least Squares [12]-[13], which group brain regions into latent components. The brain regions within each component are believed to have strong connectivity, while the connectivity between components is weak. One major drawback of these methods is that the latent components may not correspond to any biological entities, causing difficulty in interpretation. In addition, graphical models have been used to study brain connectivity, such as structural equation models [14]-[15], dynamic causal models [16], and Granger causality. However, most of these approaches are confirmative, rather than exploratory, in the sense that they require a prior model of brain connectivity to begin with. This makes them inadequate for studying AD brain connectivity, because there is little prior knowledge about which regions should be involved and how they are connected. This makes exploratory models highly desirable. In this paper, we study sparse inverse covariance estimation (SICE), also known as exploratory Gaussian graphical models, for brain connectivity modeling. Inverse covariance matrix has a clear interpretation that the off-diagonal elements correspond to partial correlations, i.e., the correlation between each pair of brain regions given all other regions. This provides a much better model for brain connectivity than simple correlation analysis which models each pair of regions without considering other regions. Also, imposing sparsity on the inverse covariance estimation ensures a reliable brain connectivity to be modeled with limited sample size, which is usually the case in AD studies since clinical samples are difficult to obtain. From a domain perspective, imposing sparsity is also valid because neurological findings have demonstrated that a brain region usually only directly interacts with a few other brain regions in neurological processes [ 2]-[3]. Various algorithms for achieving SICE have been developed in recent year [ 17]-[22]. In addition, SICE has been used in various applications [17], [21], [23]-[26]. In this paper, we apply SICE to learn functional brain connectivity from neuroimaging and analyze the difference among AD, MCI, and NC based on a key property of SICE, called the “monotone property” we established in this paper. Unlike the previous study which is based on a specific level of sparsity [26], the monotone property allows us to study the connectivity pattern using different levels of sparsity and obtain an order for the strength of connection between pairs of brain regions. In addition, we apply bootstrap hypothesis testing to assess the significance of the connection. Our experimental results on PET data of 42 AD, 116 MCI, and 67 NC subjects enrolled in the Alzheimer’s Disease Neuroimaging Initiative project reveal several interesting connectivity patterns consistent with literature findings, and also some new patterns that can help the knowledge discovery of AD. 2 S ICE : B ack grou n d an d th e Mon oton e P rop erty An inverse covariance matrix can be represented graphically. If used to represent brain connectivity, the nodes are activated brain regions; existence of an arc between two nodes means that the two brain regions are closely related in the brain's functiona l process. Let be all the brain regions under study. 
We assume that follows a multivariate Gaussian distribution with mean and covariance matrix . Let be the inverse covariance matrix. Suppose we have samples (e.g., subjects with AD) for these brain regions. Note that we will only illustrate here the SICE for AD, whereas the SICE for MCI and NC can be achieved in a similar way. We can formulate the SICE into an optimization problem, i.e., (1) where is the sample covariance matrix; , , and denote the determinant, trace, and sum of the absolute values of all elements of a matrix, respectively. The part “ ” in (1) is the log-likelihood, whereas the part “ ” represents the “sparsity” of the inverse covariance matrix . (1) aims to achieve a tradeoff between the likelihood fit of the inverse covariance estimate and the sparsity. The tradeoff is controlled by , called the regularization parameter; larger will result in more sparse estimate for . The formulation in (1) follows the same line of the -norm regularization, which has been introduced into the least squares formulation to achieve model sparsity and the resulting model is called Lasso [27]. We employ the algorithm in [19] in this paper. Next, we show that with going from small to large, the resulting brain connectivity models have a monotone property. Before introducing the monotone property, the following definitions are needed. Definition: In the graphical representation of the inverse covariance, if node to by an arc, then is called a “neighbor” of . If is connected to chain of arcs, then is called a “connectivity component” of . is connected though some Intuitively, being neighbors means that two nodes (i.e., brain regions) are directly connected, whereas being connectivity components means that two brain regions are indirectly connected, i.e., the connection is mediated through other regions. In other words, not being connectivity components (i.e., two nodes completely separated in the graph) means that the two corresponding brain regions are completely independent of each other. Connectivity components have the following monotone property: Monotone property of SICE: Let components of with and and be the sets of all the connectivity , respectively. If , then . Intuitively, if two regions are connected (either directly or indirectly) at one level of sparseness ( ), they will be connected at all lower levels of sparseness ( ). Proof of the monotone property can be found in the supplementary file [29]. This monotone property can be used to identify how strongly connected each node (brain region) to its connectivity components. For example, assuming that and , this means that is more strongly connected to than . Thus, by changing from small to large, we can obtain an order for the strength of connection between pairs of brain regions. As will be shown in Section 3, this order is different among AD, MCI, and NC. 3 3.1 Ap p l i cati on i n B rai n Con n ecti vi ty M od el i n g of AD D a t a a c q u i s i t i o n a n d p re p ro c e s s i n g We apply SICE on FDG-PET images for 49 AD, 116 MCI, and 67 NC subjects downloaded from the ADNI website. We apply Automated Anatomical Labeling (AAL) [28] to extract data from each of the 116 anatomical volumes of interest (AVOI), and derived average of each AVOI for every subject. The AVOIs represent different regions of the whole brain. 3.2 B r a i n c o n n e c t i v i t y mo d e l i n g b y S I C E 42 AVOIs are selected for brain connectivity modeling, as they are considered to be potentially related to AD. 
These regions distribute in the frontal, parietal, occipital, and temporal lobes. Table 1 list of the names of the AVOIs with their corresponding lobes. The number before each AVOI is used to index the node in the connectivity models. We apply the SICE algorithm to learn one connectivity model for AD, one for MCI, and one for NC, for a given . With different ’s, the resulting connectivity models hold a monotone property, which can help obtain an order for the strength of connection between brain regions. To show the order clearly, we develop a tree-like plot in Fig. 1, which is for the AD group. To generate this plot, we start at a very small value (i.e., the right-most of the horizontal axis), which results in a fully-connected connectivity model. A fully-connected connectivity model is one that contains no region disconnected with the rest of the brain. Then, we decrease by small steps and record the order of the regions disconnected with the rest of the brain regions. Table 1: Names of the AVOIs for connectivity modeling (“L” means that the brain region is located at the left hemisphere; “R” means right hemisphere.) Frontal lobe Parietal lobe Occipital lobe Temporal lobe 1 Frontal_Sup_L 13 Parietal_Sup_L 21 Occipital_Sup_L 27 T emporal_Sup_L 2 Frontal_Sup_R 14 Parietal_Sup_R 22 Occipital_Sup_R 28 T emporal_Sup_R 3 Frontal_Mid_L 15 Parietal_Inf_L 23 Occipital_Mid_L 29 T emporal_Pole_Sup_L 4 Frontal_Mid_R 16 Parietal_Inf_R 24 Occipital_Mid_R 30 T emporal_Pole_Sup_R 5 Frontal_Sup_Medial_L 17 Precuneus_L 25 Occipital_Inf_L 31 T emporal_Mid_L 6 Frontal_Sup_Medial_R 18 Precuneus_R 26 Occipital_Inf_R 32 T emporal_Mid_R 7 Frontal_Mid_Orb_L 19 Cingulum_Post_L 33 T emporal_Pole_Mid_L 8 Frontal_Mid_Orb_R 20 Cingulum_Post_R 34 T emporal_Pole_Mid_R 9 Rectus_L 35 T emporal_Inf_L 8301 10 Rectus_R 36 T emporal_Inf_R 8302 11 Cingulum_Ant_L 37 Fusiform_L 12 Cingulum_Ant_R 38 Fusiform_R 39 Hippocampus_L 40 Hippocampus_R 41 ParaHippocampal_L 42 ParaHippocampal_R For example, in Fig. 1, as decreases below (but still above ), region “Tempora_Sup_L” is the first one becoming disconnected from the rest of the brain. As decreases below (but still above ), the rest of the brain further divides into three disconnected clusters, including the cluster of “Cingulum_Post_R” and “Cingulum_Post_L”, the cluster of “Fusiform_R” up to “Hippocampus_L”, and the cluster of the other regions. As continuously decreases, each current cluster will split into smaller clusters; eventually, when reaches a very large value, there will be no arc in the IC model, i.e., each region is now a cluster of itself and the split will stop. The sequence of the splitting gives an order for the strength of connection between brain regions. Specifically, the earlier (i.e., smaller ) a region or a cluster of regions becomes disconnected from the rest of the brain, the weaker it is connected with the rest of the brain. For example, in Fig. 1, it can be known that “Tempora_Sup_L” may be the weakest region in the brain network of AD; the second weakest ones are the cluster of “Cingulum_Post_R” and “Cingulum_Post_L”, and the cluster of “Fusiform_R” up to “Hippocampus_L”. It is very interesting to see that the weakest and second weakest brain regions in the brain network include “Cingulum_Post_R” and “Cingulum_Post_L” as well as regions all in the temporal lobe, all of which have been found to be affected by AD early and severely [3]-[5]. Next, to facilitate the comparison between AD and NC, a tree-like plot is also constructed for NC, as shown in Fig. 
2. By comparing the plots for AD and NC, we can observe the following two distinct phenomena: First, in AD, between-lobe connectivity tends to be weaker than within-lobe connectivity. This can be seen from Fig. 1 which shows a clear pattern that the lobes become disconnected with each other before the regions within each lobe become disconnected with each other, as goes from small to large. This pattern does not show in Fig. 2 for NC. Second, the same brain regions in the left and right hemisphere are connected much weaker in AD than in NC. This can be seen from Fig. 2 for NC, in which the same brain regions in the left and right hemisphere are still connected even at a very large for NC. However, this pattern does not show in Fig. 1 for AD. Furthermore, a tree-like plot is also constructed for MCI (Fig. 3), and compared with the plots for AD and NC. In terms of the two phenomena discussed previously, MCI shows similar patterns to AD, but these patterns are not as distinct from NC as AD. Specifically, in terms of the first phenomenon, MCI also shows weaker between-lobe connectivity than within-lobe connectivity, which is similar to AD. However, the degree of weakerness is not as distinctive as AD. For example, a few regions in the temporal lobe of MCI, including “Temporal_Mid_R” and “Temporal_Sup_R”, appear to be more strongly connected with the occipital lobe than with other regions in the temporal lobe. In terms of the second phenomenon, MCI also shows weaker between-hemisphere connectivity in the same brain region than NC. However, the degree of weakerness is not as distinctive as AD. For example, several left-right pairs of the same brain regions are still connected even at a very large , such as “Rectus_R” and “Rectus_L”, “Frontal_Mid_Orb_R” and “Frontal_Mid_Orb _L”, “Parietal_Sup_R” and “Parietal_Sup_L”, as well as “Precuneus_R” and “Precuneus_L”. All above findings are consistent with the knowledge that MCI is a transition stage between normal aging and AD. Large λ λ3 λ2 λ1 Small λ Fig 1: Order for the strength of connection between brain regions of AD Large λ Small λ Fig 2: Order for the strength of connection between brain regions of NC Fig 3: Order for the strength of connection between brain regions of MCI Furthermore, we would like to compare how within-lobe and between-lobe connectivity is different across AD, MCI, and NC. To achieve this, we first learn one connectivity model for AD, one for MCI, and one for NC. We adjust the in the learning of each model such that the three models, corresponding to AD, MCI, and NC, respectively, will have the same total number of arcs. This is to “normalize” the models, so that the comparison will be more focused on how the arcs distribute differently across different models. By selecting different values for the total number of arcs, we can obtain models representing the brain connectivity at different levels of strength. Specifically, given a small value for the total number of arcs, only strong arcs will show up in the resulting connectivity model, so the model is a model of strong brain connectivity; when increasing the total number of arcs, mild arcs will also show up in the resulting connectivity model, so the model is a model of mild and strong brain connectivity. For example, Fig. 4 shows the connectivity models for AD, MCI, and NC with the total number of arcs equal to 50 (Fig. 4(a)), 120 (Fig. 4(b)), and 180 (Fig. 4(c)). In this paper, we use a “matrix” representation for the SICE of a connectivity model. 
In the matrix, each row represents one node and each column also represents one node. Please see Table 1 for the correspondence between the numbering of the nodes and the brain region each number represents. The matrix contains black and white cells: a black cell at the -th row, -th column of the matrix represents existence of an arc between nodes and in the SICE-based connectivity model, whereas a white cell represents absence of an arc. According to this definition, the total number of black cells in the matrix is equal to twice the total number of arcs in the SICE-based connectivity model. Moreover, on each matrix, four red cubes are used to highlight the brain regions in each of the four lobes; that is, from top-left to bottom-right, the red cubes highlight the frontal, parietal, occipital, and temporal lobes, respectively. The black cells inside each red cube reflect within-lobe connectivity, whereas the black cells outside the cubes reflect between-lobe connectivity. While the connectivity models in Fig. 4 clearly show some connectivity difference between AD, MCI, and NC, it is highly desirable to test if the observed difference is statistically significant. Therefore, we further perform a hypothesis testing and the results are summarized in Table 2. Specifically, a P-value is recorded in the sub-table if it is smaller than 0.1, such a P-value is further highlighted if it is even smaller than 0.05; a “---” indicates that the corresponding test is not significant (P-value>0.1). We can observe from Fig. 4 and Table 2: Within-lobe connectivity: The temporal lobe of AD has significantly less connectivity than NC. This is true across different strength levels (e.g., strong, mild, and weak) of the connectivity; in other words, even the connectivity between some strongly-connected brain regions in the temporal lobe may be disrupted by AD. In particular, it is clearly from Fig. 4(b) that the regions “Hippocampus” and “ParaHippocampal” (numbered by 39-42, located at the right-bottom corner of Fig. 4(b)) are much more separated from other regions in AD than in NC. The decrease in connectivity in the temporal lobe of AD, especially between the Hippocampus and other regions, has been extensively reported in the literature [3]-[5]. Furthermore, the temporal lobe of MCI does not show a significant decrease in connectivity, compared with NC. This may be because MCI does not disrupt the temporal lobe as badly as AD. AD MCI NC Fig 4(a): SICE-based brain connectivity models (total number of arcs equal to 50) AD MCI NC Fig 4(b): SICE-based brain connectivity models (total number of arcs equal to 120) AD MCI NC Fig 4(c): SICE-based brain connectivity models (total number of arcs equal to 180) The frontal lobe of AD has significantly more connectivity than NC, which is true across different strength levels of the connectivity. This has been interpreted as compensatory reallocation or recruitment of cognitive resources [6]-[7]. Because the regions in the frontal lobe are typically affected later in the course of AD (our data are early AD), the increased connectivity in the frontal lobe may help preserve some cognitive functions in AD patients. Furthermore, the frontal lobe of MCI does not show a significant increase in connectivity, compared with NC. This indicates that the compensatory effect in MCI brain may not be as strong as that in AD brains. 
Table 2: P-values from the statistical significance test of connectivity difference among AD, MCI, and NC (a) Total number of arcs = 50 (b) Total number of arcs = 120 (c) Total number of arcs = 180 There is no significant difference among AD, MCI, and NC in terms of the connectivity within the parietal lobe and within the occipital lobe. Another interesting finding is that all the P-values in the third sub-table of Table 2(a) are insignificant. This implies that distribution of the strong connectivity within and between lobes for MCI is very similar to NC; in other words, MCI has not been able to disrupt the strong connectivity among brain regions (it disrupts some mild and weak connectivity though). Between-lobe connectivity: In general, human brains tend to have less between-lobe connectivity than within-lobe connectivity. A majority of the strong connectivity occurs within lobes, but rarely between lobes. These can be clearly seen from Fig. 4 (especially Fig. 4(a)) in which there are much more black cells along the diagonal direction than the off-diagonal direction, regardless of AD, MCI, and NC. The connectivity between the parietal and occipital lobes of AD is significantly more than NC which is true especially for mild and weak connectivity. The increased connectivity between the parietal and occipital lobes of AD has been previously reported in [3]. It is also interpreted as a compensatory effect in [6]-[7]. Furthermore, MCI also shows increased connectivity between the parietal and occipital lobes, compared with NC, but the increase is not as significant as AD. While the connectivity between the frontal and occipital lobes shows little difference between AD and NC, such connectivity for MCI shows a significant decrease especially for mild and weak connectivity. Also, AD may have less temporal-occipital connectivity, less frontal-parietal connectivity, but more parietal-temporal connectivity than NC. Between-hemisphere connectivity: Recall that we have observed from the tree-like plots in Figs. 3 and 4 that the same brain regions in the left and right hemisphere are connected much weaker in AD than in NC. It is desirable to test if this observed difference is statistically significant. To achieve this, we test the statistical significance of the difference among AD, MCI, and NC, in term of the number of connected same-region left-right pairs. Results show that when the total number of arcs in the connectivity models is equal to 120 or 90, none of the tests is significant. However, when the total number of arcs is equal to 50, the P-values of the tests for “AD vs. NC”, “AD vs. MCI”, and “MCI vs. NC” are 0.009, 0.004, and 0.315, respectively. We further perform tests for the total number of arcs equal to 30 and find the P-values to be 0. 0055, 0.053, and 0.158, respectively. These results indicate that AD disrupts the strong connectivity between the same regions of the left and right hemispheres, whereas this disruption is not significant in MCI. 4 Con cl u si on In the paper, we applied SICE to model functional brain connectivity of AD, MCI, and NC based on PET neuroimaging data, and analyze the patterns based on the monotone property of SICE. Our findings were consistent with the previous literature and also showed some new aspects that may suggest further investigation in brain connectivity research in the future. R e f e re n c e s [1] S. Molchan. (2005) The Alzheimer's disease neuroimaging initiative. Business Briefing: US Neurology Review, pp.30-32, 2005. [2] C.J. Stam, B.F. 
4 Conclusion

In this paper, we applied SICE to model the functional brain connectivity of AD, MCI, and NC based on PET neuroimaging data, and analyzed the connectivity patterns based on the monotone property of SICE. Our findings were consistent with the previous literature and also revealed some new aspects that may suggest further investigation in brain connectivity research.

References

[1] S. Molchan. (2005) The Alzheimer's disease neuroimaging initiative. Business Briefing: US Neurology Review, pp. 30-32.
[2] C.J. Stam, B.F. Jones, G. Nolte, M. Breakspear, and P. Scheltens. (2007) Small-world networks and functional connectivity in Alzheimer's disease. Cerebral Cortex 17:92-99.
[3] K. Supekar, V. Menon, D. Rubin, M. Musen, M.D. Greicius. (2008) Network Analysis of Intrinsic Functional Brain Connectivity in Alzheimer's Disease. PLoS Comput Biol 4(6):1-11.
[4] K. Wang, M. Liang, L. Wang, L. Tian, X. Zhang, K. Li and T. Jiang. (2007) Altered Functional Connectivity in Early Alzheimer's Disease: A Resting-State fMRI Study. Human Brain Mapping 28:967-978.
[5] N.P. Azari, S.I. Rapoport, C.L. Grady, M.B. Schapiro, J.A. Salerno, A. Gonzales-Aviles. (1992) Patterns of interregional correlations of cerebral glucose metabolic rates in patients with dementia of the Alzheimer type. Neurodegeneration 1:101–111.
[6] R.L. Gould, B. Arroyo, R.G. Brown, A.M. Owen, E.T. Bullmore and R.J. Howard. (2006) Brain Mechanisms of Successful Compensation during Learning in Alzheimer Disease. Neurology 67:1011-1017.
[7] Y. Stern. (2006) Cognitive Reserve and Alzheimer Disease. Alzheimer Disease and Associated Disorders 20:69-74.
[8] K.J. Friston. (1994) Functional and effective connectivity: A synthesis. Human Brain Mapping 2:56-78.
[9] G. Alexander, J. Moeller. (1994) Application of the Scaled Subprofile Model: a statistical approach to the analysis of functional patterns in neuropsychiatric disorders: A principal component approach to modeling regional patterns of brain function in disease. Human Brain Mapping, 79-94.
[10] V.D. Calhoun, T. Adali, G.D. Pearlson, J.J. Pekar. (2001) Spatial and temporal independent component analysis of functional MRI data containing a pair of task-related waveforms. Hum. Brain Mapp. 13:43-53.
[11] V.D. Calhoun, T. Adali, J.J. Pekar, G.D. Pearlson. (2003) Latency (in)sensitive ICA: Group independent component analysis of fMRI data in the temporal frequency domain. Neuroimage 20:1661-1669.
[12] A.R. McIntosh, F.L. Bookstein, J.V. Haxby, C.L. Grady. (1996) Spatial pattern analysis of functional brain images using partial least squares. Neuroimage 3:143-157.
[13] K.J. Worsley, J.B. Poline, K.J. Friston, A.C. Evans. (1997) Characterizing the response of PET and fMRI data using multivariate linear models. Neuroimage 6:305-319.
[14] E. Bullmore, B. Horwitz, G. Honey, M. Brammer, S. Williams, T. Sharma. (2000) How good is good enough in path analysis of fMRI data? NeuroImage 11:289–301.
[15] A.R. McIntosh, C.L. Grady, L.G. Ungerleider, J.V. Haxby, S.I. Rapoport, B. Horwitz. (1994) Network analysis of cortical visual pathways mapped with PET. J. Neurosci. 14(2):655–666.
[16] K.J. Friston, L. Harrison, W. Penny. (2003) Dynamic causal modelling. Neuroimage 19:1273-1302.
[17] O. Banerjee, L. El Ghaoui, and A. d'Aspremont. (2008) Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. Journal of Machine Learning Research 9:485-516.
[18] J. Dahl, L. Vandenberghe, and V. Roychowdhury. (2008) Covariance selection for nonchordal graphs via chordal embedding. Optimization Methods and Software 23(4):501-520.
[19] J. Friedman, T. Hastie, and R. Tibshirani. (2007) Sparse inverse covariance estimation with the graphical lasso. Biostatistics 8(1):1-10.
[20] J.Z. Huang, N. Liu, M. Pourahmadi, and L. Liu. (2006) Covariance matrix selection and estimation via penalized normal likelihood. Biometrika 93(1):85-98.
[21] H. Li and J. Gui. (2005) Gradient directed regularization for sparse Gaussian concentration graphs, with applications to inference of genetic networks. Biostatistics 7(2):302-317.
[22] M. Yuan and Y. Lin. (2007) Model selection and estimation in the Gaussian graphical model. Biometrika 94(1):19-35.
[23] A. Dobra, C. Hans, B. Jones, J.R. Nevins, G. Yao, and M. West. (2004) Sparse graphical models for exploring gene expression data. Journal of Multivariate Analysis 90(1):196-212.
[24] A. Berge, A.C. Jensen, and A.H.S. Solberg. (2007) Sparse inverse covariance estimates for hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing 45(5):1399-1407.
[25] J.A. Bilmes. (2000) Factored sparse inverse covariance matrices. In ICASSP: 1009-1012.
[26] L. Sun et al. (2009) Mining Brain Region Connectivity for Alzheimer's Disease Study via Sparse Inverse Covariance Estimation. In KDD: 1335-1344.
[27] R. Tibshirani. (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B 58(1):267-288.
[28] N. Tzourio-Mazoyer et al. (2002) Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. Neuroimage 15:273-289.
[29] Supplemental information for “Learning Brain Connectivity of Alzheimer's Disease from Neuroimaging Data”. http://www.public.asu.edu/~jye02/Publications/AD-supplemental-NIPS09.pdf

5 0.69487476 38 nips-2009-Augmenting Feature-driven fMRI Analyses: Semi-supervised learning and resting state activity

Author: Andreas Bartels, Matthew Blaschko, Jacquelyn A. Shelton

Abstract: Resting state activity is brain activation that arises in the absence of any task, and is usually measured in awake subjects during prolonged fMRI scanning sessions where the only instruction given is to close the eyes and do nothing. It has been recognized in recent years that resting state activity is implicated in a wide variety of brain function. While certain networks of brain areas have different levels of activation at rest and during a task, there is nevertheless significant similarity between activations in the two cases. This suggests that recordings of resting state activity can be used as a source of unlabeled data to augment discriminative regression techniques in a semi-supervised setting. We evaluate this setting empirically yielding three main results: (i) regression tends to be improved by the use of Laplacian regularization even when no additional unlabeled data are available, (ii) resting state data seem to have a similar marginal distribution to that recorded during the execution of a visual processing task implying largely similar types of activation, and (iii) this source of information can be broadly exploited to improve the robustness of empirical inference in fMRI studies, an inherently data poor domain. 1

6 0.45375109 47 nips-2009-Boosting with Spatial Regularization

7 0.39820367 251 nips-2009-Unsupervised Detection of Regions of Interest Using Iterative Link Analysis

8 0.38986906 261 nips-2009-fMRI-Based Inter-Subject Cortical Alignment Using Functional Connectivity

9 0.3531141 224 nips-2009-Sparse and Locally Constant Gaussian Graphical Models

10 0.34161204 43 nips-2009-Bayesian estimation of orientation preference maps

11 0.31221402 83 nips-2009-Estimating image bases for visual image reconstruction from human brain activity

12 0.30048361 164 nips-2009-No evidence for active sparsification in the visual cortex

13 0.28302902 219 nips-2009-Slow, Decorrelated Features for Pretraining Complex Cell-like Networks

14 0.2702722 233 nips-2009-Streaming Pointwise Mutual Information

15 0.25714481 13 nips-2009-A Neural Implementation of the Kalman Filter

16 0.25472268 49 nips-2009-Breaking Boundaries Between Induction Time and Diagnosis Time Active Information Acquisition

17 0.25390628 237 nips-2009-Subject independent EEG-based BCI decoding

18 0.25153747 188 nips-2009-Perceptual Multistability as Markov Chain Monte Carlo Inference

19 0.24620076 260 nips-2009-Zero-shot Learning with Semantic Output Codes

20 0.22869717 41 nips-2009-Bayesian Source Localization with the Multivariate Laplace Prior


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(7, 0.014), (24, 0.029), (25, 0.063), (35, 0.045), (36, 0.098), (39, 0.068), (44, 0.04), (58, 0.083), (61, 0.017), (65, 0.204), (71, 0.056), (81, 0.055), (86, 0.086), (91, 0.019)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.82669133 70 nips-2009-Discriminative Network Models of Schizophrenia

Author: Irina Rish, Benjamin Thyreau, Bertrand Thirion, Marion Plaze, Marie-laure Paillere-martinot, Catherine Martelli, Jean-luc Martinot, Jean-baptiste Poline, Guillermo A. Cecchi

Abstract: Schizophrenia is a complex psychiatric disorder that has eluded a characterization in terms of local abnormalities of brain activity, and is hypothesized to affect the collective, “emergent” working of the brain. We propose a novel data-driven approach to capture emergent features using functional brain networks [4] extracted from fMRI data, and demonstrate its advantage over traditional region-of-interest (ROI) and local, task-specific linear activation analyzes. Our results suggest that schizophrenia is indeed associated with disruption of global brain properties related to its functioning as a network, which cannot be explained by alteration of local activation patterns. Moreover, further exploitation of interactions by sparse Markov Random Field classifiers shows clear gain over linear methods, such as Gaussian Naive Bayes and SVM, allowing to reach 86% accuracy (over 50% baseline - random guess), which is quite remarkable given that it is based on a single fMRI experiment using a simple auditory task. 1

2 0.82553834 14 nips-2009-A Parameter-free Hedging Algorithm

Author: Kamalika Chaudhuri, Yoav Freund, Daniel J. Hsu

Abstract: We study the problem of decision-theoretic online learning (DTOL). Motivated by practical applications, we focus on DTOL when the number of actions is very large. Previous algorithms for learning in this framework have a tunable learning rate parameter, and a barrier to using online-learning in practical applications is that it is not understood how to set this parameter optimally, particularly when the number of actions is large. In this paper, we offer a clean solution by proposing a novel and completely parameter-free algorithm for DTOL. We introduce a new notion of regret, which is more natural for applications with a large number of actions. We show that our algorithm achieves good performance with respect to this new notion of regret; in addition, it also achieves performance close to that of the best bounds achieved by previous algorithms with optimally-tuned parameters, according to previous notions of regret. 1

3 0.76729387 167 nips-2009-Non-Parametric Bayesian Dictionary Learning for Sparse Image Representations

Author: Mingyuan Zhou, Haojun Chen, Lu Ren, Guillermo Sapiro, Lawrence Carin, John W. Paisley

Abstract: Non-parametric Bayesian techniques are considered for learning dictionaries for sparse image representations, with applications in denoising, inpainting and compressive sensing (CS). The beta process is employed as a prior for learning the dictionary, and this non-parametric method naturally infers an appropriate dictionary size. The Dirichlet process and a probit stick-breaking process are also considered to exploit structure within an image. The proposed method can learn a sparse dictionary in situ; training images may be exploited if available, but they are not required. Further, the noise variance need not be known, and can be nonstationary. Another virtue of the proposed method is that sequential inference can be readily employed, thereby allowing scaling to large images. Several example results are presented, using both Gibbs and variational Bayesian inference, with comparisons to other state-of-the-art approaches. 1

4 0.70412314 250 nips-2009-Training Factor Graphs with Reinforcement Learning for Efficient MAP Inference

Author: Khashayar Rohanimanesh, Sameer Singh, Andrew McCallum, Michael J. Black

Abstract: Large, relational factor graphs with structure defined by first-order logic or other languages give rise to notoriously difficult inference problems. Because unrolling the structure necessary to represent distributions over all hypotheses has exponential blow-up, solutions are often derived from MCMC. However, because of limitations in the design and parameterization of the jump function, these samplingbased methods suffer from local minima—the system must transition through lower-scoring configurations before arriving at a better MAP solution. This paper presents a new method of explicitly selecting fruitful downward jumps by leveraging reinforcement learning (RL). Rather than setting parameters to maximize the likelihood of the training data, parameters of the factor graph are treated as a log-linear function approximator and learned with methods of temporal difference (TD); MAP inference is performed by executing the resulting policy on held out test data. Our method allows efficient gradient updates since only factors in the neighborhood of variables affected by an action need to be computed—we bypass the need to compute marginals entirely. Our method yields dramatic empirical success, producing new state-of-the-art results on a complex joint model of ontology alignment, with a 48% reduction in error over state-of-the-art in that domain. 1

5 0.6965102 121 nips-2009-Know Thy Neighbour: A Normative Theory of Synaptic Depression

Author: Jean-pascal Pfister, Peter Dayan, Máté Lengyel

Abstract: Synapses exhibit an extraordinary degree of short-term malleability, with release probabilities and effective synaptic strengths changing markedly over multiple timescales. From the perspective of a fixed computational operation in a network, this seems like a most unacceptable degree of added variability. We suggest an alternative theory according to which short-term synaptic plasticity plays a normatively-justifiable role. This theory starts from the commonplace observation that the spiking of a neuron is an incomplete, digital, report of the analog quantity that contains all the critical information, namely its membrane potential. We suggest that a synapse solves the inverse problem of estimating the pre-synaptic membrane potential from the spikes it receives, acting as a recursive filter. We show that the dynamics of short-term synaptic depression closely resemble those required for optimal filtering, and that they indeed support high quality estimation. Under this account, the local postsynaptic potential and the level of synaptic resources track the (scaled) mean and variance of the estimated presynaptic membrane potential. We make experimentally testable predictions for how the statistics of subthreshold membrane potential fluctuations and the form of spiking non-linearity should be related to the properties of short-term plasticity in any particular cell type. 1

6 0.68138605 162 nips-2009-Neural Implementation of Hierarchical Bayesian Inference by Importance Sampling

7 0.67438364 44 nips-2009-Beyond Categories: The Visual Memex Model for Reasoning About Object Relationships

8 0.67285359 17 nips-2009-A Sparse Non-Parametric Approach for Single Channel Separation of Known Sounds

9 0.67176694 155 nips-2009-Modelling Relational Data using Bayesian Clustered Tensor Factorization

10 0.66956526 210 nips-2009-STDP enables spiking neurons to detect hidden causes of their inputs

11 0.66758245 38 nips-2009-Augmenting Feature-driven fMRI Analyses: Semi-supervised learning and resting state activity

12 0.66727382 158 nips-2009-Multi-Label Prediction via Sparse Infinite CCA

13 0.66722077 99 nips-2009-Functional network reorganization in motor cortex can be explained by reward-modulated Hebbian learning

14 0.66621041 86 nips-2009-Exploring Functional Connectivities of the Human Brain using Multivariate Information Analysis

15 0.66439605 19 nips-2009-A joint maximum-entropy model for binary neural population patterns and continuous signals

16 0.66412193 145 nips-2009-Manifold Embeddings for Model-Based Reinforcement Learning under Partial Observability

17 0.66403967 141 nips-2009-Local Rules for Global MAP: When Do They Work ?

18 0.66219169 174 nips-2009-Nonparametric Latent Feature Models for Link Prediction

19 0.66107178 41 nips-2009-Bayesian Source Localization with the Multivariate Laplace Prior

20 0.66006279 104 nips-2009-Group Sparse Coding