nips nips2012 nips2012-327 knowledge-graph by maker-knowledge-mining

327 nips-2012-Structured Learning of Gaussian Graphical Models


Source: pdf

Author: Karthik Mohan, Mike Chung, Seungyeop Han, Daniela Witten, Su-in Lee, Maryam Fazel

Abstract: We consider estimation of multiple high-dimensional Gaussian graphical models corresponding to a single set of nodes under several distinct conditions. We assume that most aspects of the networks are shared, but that there are some structured differences between them. Specifically, the network differences are generated from node perturbations: a few nodes are perturbed across networks, and most or all edges stemming from such nodes differ between networks. This corresponds to a simple model for the mechanism underlying many cancers, in which the gene regulatory network is disrupted due to the aberrant activity of a few specific genes. We propose to solve this problem using the perturbed-node joint graphical lasso, a convex optimization problem that is based upon the use of a row-column overlap norm penalty. We then solve the convex problem using an alternating directions method of multipliers algorithm. Our proposal is illustrated on synthetic data and on an application to brain cancer gene expression data. 1

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 We assume that most aspects of the networks are shared, but that there are some structured differences between them. [sent-2, score-0.173]

2 Specifically, the network differences are generated from node perturbations: a few nodes are perturbed across networks, and most or all edges stemming from such nodes differ between networks. [sent-3, score-0.595]

3 This corresponds to a simple model for the mechanism underlying many cancers, in which the gene regulatory network is disrupted due to the aberrant activity of a few specific genes. [sent-4, score-0.413]

4 We propose to solve this problem using the perturbed-node joint graphical lasso, a convex optimization problem that is based upon the use of a row-column overlap norm penalty. [sent-5, score-0.222]

5 We then solve the convex problem using an alternating directions method of multipliers algorithm. [sent-6, score-0.16]

6 Our proposal is illustrated on synthetic data and on an application to brain cancer gene expression data. [sent-7, score-0.609]

7 1 Introduction Probabilistic graphical models are widely used in a variety of applications, from computer vision to natural language processing to computational biology. [sent-8, score-0.114]

8 As this modeling framework is used in increasingly complex domains, the problem of selecting from among the exponentially large space of possible network structures is of paramount importance. [sent-9, score-0.06]

9 This problem is especially acute in the high-dimensional setting, in which the number of variables or nodes in the graphical model is much larger than the number of observations that are available to estimate it. [sent-10, score-0.195]

10 As a motivating example, suppose that we have access to gene expression measurements for n1 lung cancer patients and n2 brain cancer patients, and that we would like to estimate the gene regulatory networks underlying these two types of cancer. [sent-11, score-1.395]

11 We can consider estimating a single network on the basis of all n1 + n2 patients. [sent-12, score-0.06]

12 However, this approach is unlikely to be successful, due to fundamental differences between the true lung cancer and brain cancer gene regulatory networks that stem from tissue specificity of gene expression as well as differing etiology of the two diseases. [sent-13, score-1.418]

13 As an alternative, we could simply estimate a gene regulatory network using the n1 lung cancer patients and a separate gene regulatory network using the n2 brain cancer patients. [sent-14, score-1.537]

14 However, this approach fails to exploit the fact that the two underlying gene regulatory networks likely have substantial commonality, such as tumor-specific pathways. [sent-15, score-0.439]

15 In order to effectively make use of the available data, we need a principled approach for jointly estimating the lung cancer and brain cancer networks in such a way that the two network estimates are encouraged to be quite similar to each other, while allowing for certain structured differences. [sent-16, score-0.832]

16 In this paper, we propose a general framework for jointly learning the structure of K networks, under the assumption that the networks are similar overall, but may have certain structured differences. [sent-18, score-0.121]

17 Specifically, we assume that the network differences result from node perturbation – that is, certain nodes are perturbed across the conditions, and so all or most of the edges associated with those nodes differ across the K networks. [sent-32, score-0.702]

18 We detect such differences through the use of a row-column overlap norm penalty. [sent-33, score-0.088]

19 Figure 1 illustrates a toy example in which a pair of networks are identical to each other, except for a single perturbed node (X2 ) that will be detected using our proposal. [sent-34, score-0.271]

20 The problem of estimating multiple networks that differ due to node perturbations arises in a number of applications. [sent-35, score-0.28]

21 For instance, the gene regulatory networks in cancer patients and in normal individuals are likely to be similar to each other, with specific node perturbations that arise from a small set of genes with somatic (cancer-specific) mutations. [sent-36, score-1.104]

22 Another example arises in the analysis of the conditional independence relationships among p stocks at two distinct points in time. [sent-37, score-0.052]

23 We might be interested in detecting stocks that have differential connectivity with all other stocks across the two time points, as these likely correspond to companies that have undergone significant changes. [sent-38, score-0.176]

24 Still another example can be found in the field of neuroscience, where we are interested in learning how the connectivity of neurons in the human brain changes over time. [sent-39, score-0.098]

25 Figure 1: An example of two networks that differ due to node perturbation of X2 . [sent-40, score-0.309]

26 (c) Left: Edges that differ between the two networks. [sent-43, score-0.083]

27 Right: Shaded cells indicate edges that differ between Networks 1 and 2. [sent-44, score-0.168]

28 Our proposal for estimating multiple networks in the presence of node perturbation can be formulated as a convex optimization problem, which we solve using an efficient alternating directions method of multipliers (ADMM) algorithm that significantly outperforms general-purpose optimization tools. [sent-45, score-0.454]

29 We test our method on synthetic data generated from known graphical models, and on one real-world task that involves inferring gene regulatory networks from experimental data. [sent-46, score-0.585]

30 In Section 2, we present recent work in the estimation of Gaussian graphical models (GGMs). [sent-48, score-0.137]

31 In Section 3, we present our proposal for structured learning of multiple GGMs using the row-column overlap norm penalty. [sent-49, score-0.097]

32 (Section 2 Background; 2.1 The graphical lasso.) Suppose that we wish to estimate a GGM on the basis of n observations, X1, …, Xn. [sent-53, score-0.216]

33 This estimate will be positive definite for any λ > 0, and sparse when λ is sufficiently large, due to the ℓ1 penalty [10] in (1). [sent-61, score-0.057]

34 We refer to (1) as the graphical lasso formulation. [sent-62, score-0.216]
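As an illustrative aside (not from the paper), a minimal sketch of solving a problem of the form (1) in Python, using scikit-learn's GraphicalLasso; the data matrix and penalty value are assumptions for demonstration only:

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Illustrative data: n = 50 observations of p = 20 variables.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))

# Solve (1): maximize log det(Theta) - trace(S Theta) - lambda * ||Theta||_1.
# scikit-learn's GraphicalLasso penalizes the off-diagonal entries.
gl = GraphicalLasso(alpha=0.1).fit(X)
Theta_hat = gl.precision_  # sparse inverse covariance estimate
print(np.count_nonzero(np.abs(Theta_hat) > 1e-8), "nonzero entries")
```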

35 (Section 2.2 The fused graphical lasso.) In recent literature, convex formulations have been proposed for extending the graphical lasso (1) to the setting in which one has access to a number of observations from K distinct conditions. [sent-65, score-0.536]

36 The goal of the formulations is to estimate a graphical model for each condition under the assumption that the K networks share certain characteristics [12, 13]. [sent-66, score-0.2]

37 Letting S^k denote the empirical covariance matrix for the kth class, one can maximize the penalized log likelihood L(Θ^1, …, Θ^K). [sent-74, score-0.066]

38 Here L(Θ^1, …, Θ^K) = ∑_{k=1}^K n_k ( log det Θ^k − trace(S^k Θ^k) ) − λ1 ∑_k ∑_{i≠j} |Θ^k_{ij}| − λ2 P(Θ^1, …, Θ^K), where λ1 and λ2 are nonnegative tuning parameters. [sent-86, score-0.103]

39 P(Θ^1, …, Θ^K) is a penalty applied to each off-diagonal element of Θ^1, …, Θ^K. [sent-89, score-0.178]

40 The Θ̂^1, …, Θ̂^K that solve (2) serve as estimates for (Σ^1)^{−1}, …, (Σ^K)^{−1}. [sent-96, score-0.06]

41 For K = 2, the fused penalty (3) is P(Θ^1, Θ^2) = ∑_{i,j} |Θ^1_{ij} − Θ^2_{ij}|; the PNJGL formulation extends to K > 2 GGMs by including K(K−1)/2 RCON penalty terms in (7), one for each pair of models. [sent-103, score-0.32]
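To make the objective concrete, here is a hedged numpy sketch that evaluates the penalized log likelihood (2) with the fused penalty (3) for K = 2; the function and variable names are illustrative, not from the paper:

```python
import numpy as np

def fgl_objective(Theta1, Theta2, S1, S2, n1, n2, lam1, lam2):
    """Evaluate the penalized log likelihood (2) with the fused
    penalty (3) for K = 2. Theta1, Theta2 must be positive definite."""
    def fit_term(Theta, S, n):
        sign, logdet = np.linalg.slogdet(Theta)
        assert sign > 0, "Theta must be positive definite"
        return n * (logdet - np.trace(S @ Theta))

    def off_diag_l1(Theta):
        return np.abs(Theta).sum() - np.abs(np.diag(Theta)).sum()

    lasso = lam1 * (off_diag_l1(Theta1) + off_diag_l1(Theta2))
    fused = lam2 * np.abs(Theta1 - Theta2).sum()
    return fit_term(Theta1, S1, n1) + fit_term(Theta2, S2, n2) - lasso - fused
```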

42 (Section 4 An ADMM algorithm for the PNJGL formulation.) The PNJGL optimization problem (7) is convex, and so can be directly solved in the modeling environment cvx [18], which calls conic interior-point solvers such as SeDuMi or SDPT3. [sent-105, score-0.108]

43 Other algorithms proposed for overlapping group lasso penalties [19, 20, 21] do not apply to our setting, since the PNJGL formulation has a combination of Gaussian log-likelihood loss (instead of squared error loss) and the RCON penalty, along with a positive-definite constraint. [sent-107, score-0.183]

44 We also note that other first-order methods are not easily applied to solve the PNJGL formulation, because the subgradient of the RCON penalty is not easy to compute and its proximal operator is non-trivial to evaluate. [sent-108, score-0.078]

45 In this section we present a fast and scalable alternating directions method of multipliers (ADMM) algorithm [22] to solve the problem (7). [sent-109, score-0.132]
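As context for readers unfamiliar with ADMM, below is a generic scaled-form ADMM loop, illustrated on the ordinary lasso for simplicity; this is a sketch of the general template, not the paper's PNJGL-specific splitting:

```python
import numpy as np

def admm_lasso(A, b, lam, rho=1.0, n_iter=200):
    """Generic scaled-form ADMM for min 0.5*||Ax - b||^2 + lam*||z||_1
    subject to x = z. Shown on the lasso for simplicity; the paper
    applies the same template to the PNJGL splitting."""
    p = A.shape[1]
    x = np.zeros(p); z = np.zeros(p); u = np.zeros(p)
    AtA, Atb = A.T @ A, A.T @ b
    L = np.linalg.cholesky(AtA + rho * np.eye(p))  # factor once, reuse
    for _ in range(n_iter):
        # x-update: ridge-like linear solve via the cached Cholesky factor
        x = np.linalg.solve(L.T, np.linalg.solve(L, Atb + rho * (z - u)))
        # z-update: soft-thresholding, the proximal operator of the l1 norm
        v = x + u
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)
        # dual (scaled multiplier) update
        u = u + x - z
    return z
```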

46 The operator Tq is given by Tq(A, λ) = argmin_X { (1/2) ∥X − A∥_F² + λ ∑_{j=1}^p ∥X_j∥_q }, and is also known as the proximal operator corresponding to the ℓ1/ℓq norm. [sent-116, score-0.079]
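For q = 2 this operator has a closed form, column-wise group soft-thresholding; a small numpy sketch (the function name is ours):

```python
import numpy as np

def T2(A, lam):
    """Proximal operator of lam * sum_j ||X_j||_2 with columns as groups:
    argmin_X 0.5*||X - A||_F^2 + lam * sum_j ||X_j||_2 (the q = 2 case).
    Each column of A is shrunk toward zero by group soft-thresholding."""
    norms = np.linalg.norm(A, axis=0)                       # ||A_j||_2 per column
    scale = np.maximum(1.0 - lam / np.maximum(norms, 1e-12), 0.0)
    return A * scale                                        # broadcasts over rows
```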

47 When p = 30, the interior point method (using cvx, which calls SeDuMi) takes 7 minutes to run, while ADMM takes only 10 seconds. [sent-134, score-0.062]

48 Also, we observe that the average error between the cvx and ADMM solutions, taken over many random generations of the data, is O(10^−4). [sent-137, score-0.063]

49 The networks share individual edges as well as hub nodes, i.e., nodes that are highly connected to many other nodes. [sent-142, score-0.266]

50 There are also perturbed nodes that differ between the networks. [sent-143, score-0.225]

51 We randomly selected m perturbed nodes that differ between A1 and A2, and set the elements of the corresponding row and column of either A1 or A2 (chosen at random) to be i.i.d. [sent-164, score-0.225]

52 We generated n independent observations each from a N (0, Σ1 ) and a N (0, Σ2 ) distribution, and used these to compute the empirical covariance matrices S1 and S2 . [sent-177, score-0.066]
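A hedged numpy sketch of the generating process described in the two sentences above; the sparsity levels, edge-weight range, and positive-definite adjustment are illustrative assumptions, not the paper's exact constants:

```python
import numpy as np

rng = np.random.default_rng(1)
p, m, n = 100, 2, 50

# Shared sparse symmetric base adjacency for both networks.
A = np.triu((rng.random((p, p)) < 0.02) * rng.uniform(0.3, 0.7, (p, p)), k=1)
A1 = A + A.T
A2 = A1.copy()

# Perturb m nodes: overwrite the full row/column in one network at random.
for j in rng.choice(p, size=m, replace=False):
    target = A1 if rng.random() < 0.5 else A2
    new_edges = (rng.random(p) < 0.2) * rng.uniform(0.3, 0.7, p)
    target[j, :] = new_edges
    target[:, j] = new_edges
    target[j, j] = 0.0

def to_cov(M):
    # Shift the spectrum to make the matrix positive definite, then invert.
    Theta = M + (np.abs(np.linalg.eigvalsh(M)).max() + 0.1) * np.eye(p)
    return np.linalg.inv(Theta)

# Sample n observations from each network and form S1, S2.
S1 = np.cov(rng.multivariate_normal(np.zeros(p), to_cov(A1), n), rowvar=False)
S2 = np.cov(rng.multivariate_normal(np.zeros(p), to_cov(A2), n), rowvar=False)
```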

53 We compared the performance of the graphical lasso, FGL, and PNJGL with q = 2, using p = 100, m = 2, and n ∈ {10, 25, 50, 200}. [sent-178, score-0.114]

54 Increasing n yields more accurate results for PNJGL with q = 2, FGL, and graphical lasso. [sent-181, score-0.114]

55 Furthermore, PNJGL with q = 2 identifies non-zero edges and differing edges much more accurately than does FGL, which is in turn more accurate than graphical lasso. [sent-182, score-0.307]

56 2 Inferring biological networks We applied the PNJGL method to a recently-published cancer gene expression data set [26], with mRNA expression measurements for 11,861 genes in 220 patients with glioblastoma multiforme (GBM), a brain cancer. [sent-186, score-0.998]

57 Each patient has one of four distinct clinical subtypes: Proneural, Neural, Classical, and Mesenchymal. [sent-187, score-0.053]

58 We selected two subtypes – Proneural (53 patients) and Mesenchymal (56 patients). Figure 3: Simulation study results for PNJGL with q = 2, FGL, and the graphical lasso (GL), for (a) n = 10, (b) n = 25, (c) n = 50, (d) n = 200, when p = 100. [sent-188, score-0.396]

59 Each plot’s x-axis denotes the number of edges estimated to be non-zero. [sent-190, score-0.085]

60 Left: Number of edges correctly estimated to be non-zero. [sent-192, score-0.085]

61 Center: Number of edges correctly estimated to differ across networks, divided by the number of edges estimated to differ across networks. [sent-193, score-0.418]

62 In this experiment, we aim to reconstruct the gene regulatory networks of the two subtypes, as well as to identify genes whose interactions with other genes vary significantly between the subtypes. [sent-198, score-0.821]

63 Such genes are likely to have many somatic (cancer-specific) mutations. [sent-199, score-0.228]

64 Understanding the molecular basis of these subtypes will lead to a deeper understanding of brain cancer and, eventually, improved patient treatment. [sent-200, score-0.287]

65 We selected the 250 genes with the highest within-subtype variance, as well as 10 genes known to be frequently mutated across the four GBM subtypes [26]: TP53, PTEN, NF1, EGFR, IDH1, PIK3R1, RB1, ERBB2, PIK3CA, PDGFRA. [sent-201, score-0.717]

66 Two of these genes (EGFR, PDGFRA) were in the initial list of 250 genes selected based on the within-subtype variance. [sent-202, score-0.382]

67 We then applied PNJGL with q = 2 and FGL to the resulting 53 × 258 and 56 × 258 gene expression datasets, after standardizing each gene to have variance one. [sent-204, score-0.457]
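A sketch of this preprocessing in pandas; the DataFrame name is a placeholder, and for brevity the variance is computed on the given matrix rather than separately within each subtype as in the paper:

```python
import pandas as pd

MUTATED = ["TP53", "PTEN", "NF1", "EGFR", "IDH1",
           "PIK3R1", "RB1", "ERBB2", "PIK3CA", "PDGFRA"]

def select_and_standardize(expr: pd.DataFrame, k: int = 250) -> pd.DataFrame:
    """expr: patients x genes expression matrix (placeholder name).
    Keep the k highest-variance genes plus the known mutated genes,
    then scale each gene to variance one."""
    top_var = expr.var(axis=0).nlargest(k).index
    keep = top_var.union(pd.Index(MUTATED).intersection(expr.columns))
    sub = expr[keep]           # the union absorbs the overlap (EGFR, PDGFRA)
    return sub / sub.std(axis=0)
```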

68 Tuning parameters were selected so that each approach results in a per-network estimate of approximately 6,000 non-zero edges, as well as approximately 4,000 edges that differ across the two network estimates. [sent-205, score-0.269]

69 However, the results that follow persisted across a wide range of tuning parameter values. [sent-206, score-0.07]

70 Figure 4: PNJGL with q = 2 and FGL were performed on the brain cancer data set corresponding to 258 genes in patients with Proneural and Mesenchymal subtypes. [sent-207, score-0.594]

71 We quantify the extent of node perturbation (NP) in the network estimates as follows: NP_j = ∑_i |V̂_{ij}|; for FGL, which has no V of its own, we take V = (1/2)(Θ̂^1 − Θ̂^2), mirroring the role V plays in the PNJGL formulation. [sent-210, score-0.237]

72 If NP_j = 0 (using a zero threshold of 10^−6), then the jth gene has the same edge weights in the two conditions. [sent-211, score-0.212]
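In code, the score and the thresholding rule above amount to a column sum of absolute values; a minimal sketch:

```python
import numpy as np

def node_perturbation_scores(V, tol=1e-6):
    """NP_j = sum_i |V_ij|. Genes with NP_j <= tol have identical
    edge weights in the two estimated networks."""
    np_scores = np.abs(V).sum(axis=0)
    perturbed = np.flatnonzero(np_scores > tol)
    return np_scores, perturbed

# For FGL, which has no V of its own, use V = 0.5 * (Theta1_hat - Theta2_hat).
```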

73 In Figures 4(a) and 4(b), we plotted the resulting NP values for each of the 258 genes for FGL and PNJGL. [sent-212, score-0.191]

74 Although the network estimates resulting from PNJGL and FGL have approximately the same number of edges that differ across cancer subtypes, PNJGL results in estimates in which only 37 genes appear to have node perturbation. [sent-213, score-0.837]

75 FGL results in estimates in which all 258 genes appear to have node perturbation. [sent-214, score-0.302]

76 Clearly, the pattern of network differences resulting from PNJGL is far more structured. [sent-216, score-0.112]

77 The genes known to be frequently mutated across GBM subtypes are somewhat enriched among those that appear to be perturbed according to the PNJGL estimates (3 out of 10 mutated genes were detected by PNJGL; 37 out of 258 total genes were detected by PNJGL; hypergeometric p-value = 0.
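The enrichment test quoted here is a hypergeometric tail probability; a sketch with scipy using the counts given above:

```python
from scipy.stats import hypergeom

# Population of 258 genes, 10 known mutated, 37 detected as perturbed,
# 3 of the detected genes are among the known mutated ones.
M, n_mut, n_det, overlap = 258, 10, 37, 3
p_value = hypergeom.sf(overlap - 1, M, n_mut, n_det)  # P(X >= 3)
print(f"hypergeometric p-value = {p_value:.4f}")
```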

78 In contrast, FGL detects every gene as having node perturbation (Figure 4(a)). [sent-219, score-0.352]

79 The gene with the highest NP_j value (according to both FGL and PNJGL with q = 2) is CXCL13, a small cytokine that belongs to the CXC chemokine family. [sent-220, score-0.257]

80 This gene was not identified as a frequently mutated gene in GBM [26]. [sent-222, score-0.538]

81 Our results suggest the possibility of a previously unknown role of CXCL13 in brain cancer. [sent-224, score-0.077]

82 (Section 6 Discussion and future work.) We have proposed the perturbed-node joint graphical lasso, a new approach for jointly learning Gaussian graphical models under the assumption that network differences result from node perturbations. [sent-225, score-0.414]

83 We impose this structure using a novel RCON penalty, which encourages the differences between the estimated networks to be the union of just a few rows and columns. [sent-226, score-0.138]

84 We solve the resulting convex optimization problem using ADMM, which is more efficient and scalable than standard interior point methods. [sent-227, score-0.11]

85 Our proposed approach leads to far better performance on synthetic data than two alternative approaches: learning Gaussian graphical models assuming edge perturbation [13], or simply learning each model separately. [sent-228, score-0.212]

86 Future work will involve other forms of structured sparsity beyond simply node perturbation. [sent-229, score-0.109]

87 For instance, if certain subnetworks are known a priori to be related to the conditions under study, then the RCON penalty can be modified in order to encourage some subnetworks to be perturbed across the conditions. [sent-230, score-0.261]

88 Model selection and estimation in the Gaussian graphical model. [sent-253, score-0.157]

89 Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. [sent-266, score-0.064]

90 New insights and faster computations for the graphical lasso. [sent-274, score-0.114]

91 Model selection in gaussian graphical models: high-dimensional consistency of l1-regularized MLE. [sent-288, score-0.157]

92 The joint graphical lasso for inverse covariance estimation across multiple classes, 2012. [sent-319, score-0.323]

93 Group lasso with overlaps: the latent group lasso approach. [sent-348, score-0.229]

94 Smoothing proximal gradient method for general structured sparse learning. [sent-378, score-0.087]

95 A primal-dual algorithm for group sparse regularization with overlapping groups. [sent-385, score-0.066]

96 Distributed optimization and statistical learning via the alternating direction method of multipliers. [sent-394, score-0.077]

97 On the linear convergence of the alternating direction method of multipliers. [sent-399, score-0.056]

98 Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. [sent-416, score-0.22]

99 Chemokine CXCL13 is overexpressed in the tumour tissue and in the peripheral blood of breast cancer patients. [sent-419, score-0.303]

100 CXCL13-CXCR5 interactions support prostate cancer cell migration and invasion in a PI3K p110-, SRC- and FAK-dependent fashion. [sent-422, score-0.269]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('pnjgl', 0.585), ('fgl', 0.383), ('cancer', 0.229), ('gene', 0.212), ('admm', 0.204), ('genes', 0.191), ('subtypes', 0.18), ('regulatory', 0.141), ('rcon', 0.135), ('graphical', 0.114), ('lasso', 0.102), ('patients', 0.097), ('gbm', 0.09), ('mutated', 0.09), ('networks', 0.086), ('edges', 0.085), ('perturbed', 0.084), ('differ', 0.083), ('tq', 0.079), ('lung', 0.079), ('brain', 0.077), ('node', 0.074), ('ij', 0.071), ('ggms', 0.068), ('proneural', 0.068), ('tmax', 0.068), ('perturbation', 0.066), ('cvx', 0.063), ('network', 0.06), ('egfr', 0.06), ('nodes', 0.058), ('alternating', 0.056), ('differences', 0.052), ('unif', 0.049), ('pj', 0.047), ('tissue', 0.045), ('chemokine', 0.045), ('mesenchymal', 0.045), ('pdgfra', 0.045), ('covariance', 0.043), ('across', 0.041), ('expand', 0.04), ('sedumi', 0.04), ('glioblastoma', 0.04), ('prostate', 0.04), ('subnetworks', 0.04), ('nk', 0.039), ('interior', 0.038), ('perturbations', 0.037), ('estimates', 0.037), ('witten', 0.037), ('hub', 0.037), ('somatic', 0.037), ('overlap', 0.036), ('penalty', 0.036), ('det', 0.035), ('structured', 0.035), ('multipliers', 0.033), ('expression', 0.033), ('synthetic', 0.032), ('proximal', 0.031), ('patient', 0.03), ('fused', 0.03), ('stocks', 0.029), ('breast', 0.029), ('qi', 0.029), ('tuning', 0.029), ('primal', 0.028), ('convex', 0.028), ('detected', 0.027), ('jacob', 0.027), ('proposal', 0.026), ('obozinski', 0.025), ('group', 0.025), ('calls', 0.024), ('operator', 0.024), ('frequently', 0.024), ('observations', 0.023), ('aij', 0.023), ('differing', 0.023), ('kth', 0.023), ('distinct', 0.023), ('biostatistics', 0.023), ('solve', 0.023), ('estimation', 0.023), ('gaussian', 0.023), ('banerjee', 0.023), ('yuan', 0.021), ('adjacency', 0.021), ('connectivity', 0.021), ('sparse', 0.021), ('optimization', 0.021), ('encourage', 0.02), ('el', 0.02), ('directions', 0.02), ('selection', 0.02), ('overlapping', 0.02), ('lagrangian', 0.02), ('carbonell', 0.02), ('mosci', 0.02)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000001 327 nips-2012-Structured Learning of Gaussian Graphical Models

Author: Karthik Mohan, Mike Chung, Seungyeop Han, Daniela Witten, Su-in Lee, Maryam Fazel

Abstract: We consider estimation of multiple high-dimensional Gaussian graphical models corresponding to a single set of nodes under several distinct conditions. We assume that most aspects of the networks are shared, but that there are some structured differences between them. Specifically, the network differences are generated from node perturbations: a few nodes are perturbed across networks, and most or all edges stemming from such nodes differ between networks. This corresponds to a simple model for the mechanism underlying many cancers, in which the gene regulatory network is disrupted due to the aberrant activity of a few specific genes. We propose to solve this problem using the perturbed-node joint graphical lasso, a convex optimization problem that is based upon the use of a row-column overlap norm penalty. We then solve the convex problem using an alternating directions method of multipliers algorithm. Our proposal is illustrated on synthetic data and on an application to brain cancer gene expression data. 1

2 0.14949322 6 nips-2012-A Convex Formulation for Learning Scale-Free Networks via Submodular Relaxation

Author: Aaron Defazio, Tibério S. Caetano

Abstract: A key problem in statistics and machine learning is the determination of network structure from data. We consider the case where the structure of the graph to be reconstructed is known to be scale-free. We show that in such cases it is natural to formulate structured sparsity inducing priors using submodular functions, and we use their Lovász extension to obtain a convex relaxation. For tractable classes such as Gaussian graphical models, this leads to a convex optimization problem that can be efficiently solved. We show that our method results in an improvement in the accuracy of reconstructed networks for synthetic data. We also show how our prior encourages scale-free reconstructions on a bioinformatics dataset.

3 0.12638523 147 nips-2012-Graphical Models via Generalized Linear Models

Author: Eunho Yang, Genevera Allen, Zhandong Liu, Pradeep K. Ravikumar

Abstract: Undirected graphical models, also known as Markov networks, enjoy popularity in a variety of applications. The popular instances of these models such as Gaussian Markov Random Fields (GMRFs), Ising models, and multinomial discrete models, however do not capture the characteristics of data in many settings. We introduce a new class of graphical models based on generalized linear models (GLMs) by assuming that node-wise conditional distributions arise from exponential families. Our models allow one to estimate multivariate Markov networks given any univariate exponential distribution, such as Poisson, negative binomial, and exponential, by fitting penalized GLMs to select the neighborhood for each node. A major contribution of this paper is the rigorous statistical analysis showing that with high probability, the neighborhood of our graphical models can be recovered exactly. We also provide examples of non-Gaussian high-throughput genomic networks learned via our GLM graphical models. 1

4 0.10574768 164 nips-2012-Iterative Thresholding Algorithm for Sparse Inverse Covariance Estimation

Author: Benjamin Rolfs, Bala Rajaratnam, Dominique Guillot, Ian Wong, Arian Maleki

Abstract: The ℓ1-regularized maximum likelihood estimation problem has recently become a topic of great interest within the machine learning, statistics, and optimization communities as a method for producing sparse inverse covariance estimators. In this paper, a proximal gradient method (G-ISTA) for performing ℓ1-regularized covariance matrix estimation is presented. Although numerous algorithms have been proposed for solving this problem, this simple proximal gradient method is found to have attractive theoretical and numerical properties. G-ISTA has a linear rate of convergence, resulting in an O(log(1/ε)) iteration complexity to reach a tolerance of ε. This paper gives eigenvalue bounds for the G-ISTA iterates, providing a closed-form linear convergence rate. The rate is shown to be closely related to the condition number of the optimal point. Numerical convergence results and timing comparisons for the proposed method are presented. G-ISTA is shown to perform very well, especially when the optimal point is well-conditioned. 1

5 0.092893541 326 nips-2012-Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses

Author: Po-ling Loh, Martin J. Wainwright

Abstract: We investigate a curious relationship between the structure of a discrete graphical model and the support of the inverse of a generalized covariance matrix. We show that for certain graph structures, the support of the inverse covariance matrix of indicator variables on the vertices of a graph reflects the conditional independence structure of the graph. Our work extends results that have previously been established only in the context of multivariate Gaussian graphical models, thereby addressing an open question about the significance of the inverse covariance matrix of a non-Gaussian distribution. Based on our population-level results, we show how the graphical Lasso may be used to recover the edge structure of certain classes of discrete graphical models, and present simulations to verify our theoretical results. 1

6 0.086641558 346 nips-2012-Topology Constraints in Graphical Models

7 0.077346116 247 nips-2012-Nonparametric Reduced Rank Regression

8 0.071523093 26 nips-2012-A nonparametric variable clustering model

9 0.070405252 7 nips-2012-A Divide-and-Conquer Method for Sparse Inverse Covariance Estimation

10 0.064965226 139 nips-2012-Fused sparsity and robust estimation for linear models with unknown variance

11 0.063960709 312 nips-2012-Simultaneously Leveraging Output and Task Structures for Multiple-Output Regression

12 0.061987057 192 nips-2012-Learning the Dependency Structure of Latent Factors

13 0.061398707 199 nips-2012-Link Prediction in Graphs with Autoregressive Features

14 0.061301541 63 nips-2012-CPRL -- An Extension of Compressive Sensing to the Phase Retrieval Problem

15 0.059676766 172 nips-2012-Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs

16 0.057637151 266 nips-2012-Patient Risk Stratification for Hospital-Associated C. diff as a Time-Series Classification Task

17 0.056950707 319 nips-2012-Sparse Prediction with the $k$-Support Norm

18 0.056112148 186 nips-2012-Learning as MAP Inference in Discrete Graphical Models

19 0.055933394 180 nips-2012-Learning Mixtures of Tree Graphical Models

20 0.054238547 298 nips-2012-Scalable Inference of Overlapping Communities


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.141), (1, 0.056), (2, 0.041), (3, -0.048), (4, -0.01), (5, 0.073), (6, -0.023), (7, -0.124), (8, -0.097), (9, 0.023), (10, -0.039), (11, -0.004), (12, -0.025), (13, 0.036), (14, 0.025), (15, 0.026), (16, 0.033), (17, 0.009), (18, -0.034), (19, -0.022), (20, 0.028), (21, 0.067), (22, -0.01), (23, -0.037), (24, 0.074), (25, -0.136), (26, 0.071), (27, 0.037), (28, -0.049), (29, -0.048), (30, -0.094), (31, 0.015), (32, -0.023), (33, 0.013), (34, 0.064), (35, 0.013), (36, 0.05), (37, 0.015), (38, -0.018), (39, -0.068), (40, -0.045), (41, 0.045), (42, -0.035), (43, 0.038), (44, -0.055), (45, 0.04), (46, 0.03), (47, 0.029), (48, 0.074), (49, 0.033)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.92560852 327 nips-2012-Structured Learning of Gaussian Graphical Models

Author: Karthik Mohan, Mike Chung, Seungyeop Han, Daniela Witten, Su-in Lee, Maryam Fazel

Abstract: We consider estimation of multiple high-dimensional Gaussian graphical models corresponding to a single set of nodes under several distinct conditions. We assume that most aspects of the networks are shared, but that there are some structured differences between them. Specifically, the network differences are generated from node perturbations: a few nodes are perturbed across networks, and most or all edges stemming from such nodes differ between networks. This corresponds to a simple model for the mechanism underlying many cancers, in which the gene regulatory network is disrupted due to the aberrant activity of a few specific genes. We propose to solve this problem using the perturbed-node joint graphical lasso, a convex optimization problem that is based upon the use of a row-column overlap norm penalty. We then solve the convex problem using an alternating directions method of multipliers algorithm. Our proposal is illustrated on synthetic data and on an application to brain cancer gene expression data. 1

2 0.7103101 346 nips-2012-Topology Constraints in Graphical Models

Author: Marcelo Fiori, Pablo Musé, Guillermo Sapiro

Abstract: Graphical models are a very useful tool to describe and understand natural phenomena, from gene expression to climate change and social interactions. The topological structure of these graphs/networks is a fundamental part of the analysis, and in many cases the main goal of the study. However, little work has been done on incorporating prior topological knowledge onto the estimation of the underlying graphical models from sample data. In this work we propose extensions to the basic joint regression model for network estimation, which explicitly incorporate graph-topological constraints into the corresponding optimization approach. The first proposed extension includes an eigenvector centrality constraint, thereby promoting this important prior topological property. The second developed extension promotes the formation of certain motifs, triangle-shaped ones in particular, which are known to exist for example in genetic regulatory networks. The presentation of the underlying formulations, which serve as examples of the introduction of topological constraints in network estimation, is complemented with examples in diverse datasets demonstrating the importance of incorporating such critical prior knowledge. 1

3 0.6834287 326 nips-2012-Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses

Author: Po-ling Loh, Martin J. Wainwright

Abstract: We investigate a curious relationship between the structure of a discrete graphical model and the support of the inverse of a generalized covariance matrix. We show that for certain graph structures, the support of the inverse covariance matrix of indicator variables on the vertices of a graph reflects the conditional independence structure of the graph. Our work extends results that have previously been established only in the context of multivariate Gaussian graphical models, thereby addressing an open question about the significance of the inverse covariance matrix of a non-Gaussian distribution. Based on our population-level results, we show how the graphical Lasso may be used to recover the edge structure of certain classes of discrete graphical models, and present simulations to verify our theoretical results. 1

4 0.66215181 6 nips-2012-A Convex Formulation for Learning Scale-Free Networks via Submodular Relaxation

Author: Aaron Defazio, Tibério S. Caetano

Abstract: A key problem in statistics and machine learning is the determination of network structure from data. We consider the case where the structure of the graph to be reconstructed is known to be scale-free. We show that in such cases it is natural to formulate structured sparsity inducing priors using submodular functions, and we use their Lovász extension to obtain a convex relaxation. For tractable classes such as Gaussian graphical models, this leads to a convex optimization problem that can be efficiently solved. We show that our method results in an improvement in the accuracy of reconstructed networks for synthetic data. We also show how our prior encourages scale-free reconstructions on a bioinformatics dataset.

5 0.5941751 312 nips-2012-Simultaneously Leveraging Output and Task Structures for Multiple-Output Regression

Author: Piyush Rai, Abhishek Kumar, Hal Daume

Abstract: Multiple-output regression models require estimating multiple parameters, one for each output. Structural regularization is usually employed to improve parameter estimation in such models. In this paper, we present a multiple-output regression model that leverages the covariance structure of the latent model parameters as well as the conditional covariance structure of the observed outputs. This is in contrast with existing methods that usually take into account only one of these structures. More importantly, unlike some of the other existing methods, none of these structures need be known a priori in our model, and are learned from the data. Several previously proposed structural regularization based multiple-output regression models turn out to be special cases of our model. Moreover, in addition to being a rich model for multiple-output regression, our model can also be used in estimating the graphical model structure of a set of variables (multivariate outputs) conditioned on another set of variables (inputs). Experimental results on both synthetic and real datasets demonstrate the effectiveness of our method. 1

6 0.59412736 164 nips-2012-Iterative Thresholding Algorithm for Sparse Inverse Covariance Estimation

7 0.56601834 147 nips-2012-Graphical Models via Generalized Linear Models

8 0.54811162 132 nips-2012-Fiedler Random Fields: A Large-Scale Spectral Approach to Statistical Network Modeling

9 0.53904355 199 nips-2012-Link Prediction in Graphs with Autoregressive Features

10 0.53432161 182 nips-2012-Learning Networks of Heterogeneous Influence

11 0.49411607 215 nips-2012-Minimizing Uncertainty in Pipelines

12 0.49187273 317 nips-2012-Smooth-projected Neighborhood Pursuit for High-dimensional Nonparanormal Graph Estimation

13 0.48474655 240 nips-2012-Newton-Like Methods for Sparse Inverse Covariance Estimation

14 0.4751755 192 nips-2012-Learning the Dependency Structure of Latent Factors

15 0.47358921 319 nips-2012-Sparse Prediction with the $k$-Support Norm

16 0.47132182 139 nips-2012-Fused sparsity and robust estimation for linear models with unknown variance

17 0.46606618 172 nips-2012-Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs

18 0.46291342 7 nips-2012-A Divide-and-Conquer Method for Sparse Inverse Covariance Estimation

19 0.45393628 298 nips-2012-Scalable Inference of Overlapping Communities

20 0.43988976 63 nips-2012-CPRL -- An Extension of Compressive Sensing to the Phase Retrieval Problem


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.081), (11, 0.315), (17, 0.017), (21, 0.022), (38, 0.104), (39, 0.013), (42, 0.024), (54, 0.021), (55, 0.019), (74, 0.022), (76, 0.185), (80, 0.065), (92, 0.024)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.80487257 151 nips-2012-High-Order Multi-Task Feature Learning to Identify Longitudinal Phenotypic Markers for Alzheimer's Disease Progression Prediction

Author: Hua Wang, Feiping Nie, Heng Huang, Jingwen Yan, Sungeun Kim, Shannon Risacher, Andrew Saykin, Li Shen

Abstract: Alzheimer’s disease (AD) is a neurodegenerative disorder characterized by progressive impairment of memory and other cognitive functions. Regression analysis has been studied to relate neuroimaging measures to cognitive status. However, whether these measures have further predictive power to infer a trajectory of cognitive performance over time is still an under-explored but important topic in AD research. We propose a novel high-order multi-task learning model to address this issue. The proposed model explores the temporal correlations existing in imaging and cognitive data by structured sparsity-inducing norms. The sparsity of the model enables the selection of a small number of imaging measures while maintaining high prediction accuracy. The empirical studies, using the longitudinal imaging and cognitive data of the ADNI cohort, have yielded promising results.

2 0.80117148 238 nips-2012-Neurally Plausible Reinforcement Learning of Working Memory Tasks

Author: Jaldert Rombouts, Pieter Roelfsema, Sander M. Bohte

Abstract: A key function of brains is undoubtedly the abstraction and maintenance of information from the environment for later use. Neurons in association cortex play an important role in this process: by learning these neurons become tuned to relevant features and represent the information that is required later as a persistent elevation of their activity [1]. It is however not well known how such neurons acquire these task-relevant working memories. Here we introduce a biologically plausible learning scheme grounded in Reinforcement Learning (RL) theory [2] that explains how neurons become selective for relevant information by trial and error learning. The model has memory units which learn useful internal state representations to solve working memory tasks by transforming partially observable Markov decision problems (POMDP) into MDPs. We propose that synaptic plasticity is guided by a combination of attentional feedback signals from the action selection stage to earlier processing levels and a globally released neuromodulatory signal. Feedback signals interact with feedforward signals to form synaptic tags at those connections that are responsible for the stimulus-response mapping. The neuromodulatory signal interacts with tagged synapses to determine the sign and strength of plasticity. The learning scheme is generic because it can train networks in different tasks, simply by varying inputs and rewards. It explains how neurons in association cortex learn to 1) temporarily store task-relevant information in non-linear stimulus-response mapping tasks [1, 3, 4] and 2) learn to optimally integrate probabilistic evidence for perceptual decision making [5, 6]. 1

3 0.78199363 225 nips-2012-Multi-task Vector Field Learning

Author: Binbin Lin, Sen Yang, Chiyuan Zhang, Jieping Ye, Xiaofei He

Abstract: Multi-task learning (MTL) aims to improve generalization performance by learning multiple related tasks simultaneously and identifying the shared information among tasks. Most of existing MTL methods focus on learning linear models under the supervised setting. We propose a novel semi-supervised and nonlinear approach for MTL using vector fields. A vector field is a smooth mapping from the manifold to the tangent spaces which can be viewed as a directional derivative of functions on the manifold. We argue that vector fields provide a natural way to exploit the geometric structure of data as well as the shared differential structure of tasks, both of which are crucial for semi-supervised multi-task learning. In this paper, we develop multi-task vector field learning (MTVFL) which learns the predictor functions and the vector fields simultaneously. MTVFL has the following key properties. (1) The vector fields MTVFL learns are close to the gradient fields of the predictor functions. (2) Within each task, the vector field is required to be as parallel as possible which is expected to span a low dimensional subspace. (3) The vector fields from all tasks share a low dimensional subspace. We formalize our idea in a regularization framework and also provide a convex relaxation method to solve the original non-convex problem. The experimental results on synthetic and real data demonstrate the effectiveness of our proposed approach. 1

same-paper 4 0.76269799 327 nips-2012-Structured Learning of Gaussian Graphical Models

Author: Karthik Mohan, Mike Chung, Seungyeop Han, Daniela Witten, Su-in Lee, Maryam Fazel

Abstract: We consider estimation of multiple high-dimensional Gaussian graphical models corresponding to a single set of nodes under several distinct conditions. We assume that most aspects of the networks are shared, but that there are some structured differences between them. Specifically, the network differences are generated from node perturbations: a few nodes are perturbed across networks, and most or all edges stemming from such nodes differ between networks. This corresponds to a simple model for the mechanism underlying many cancers, in which the gene regulatory network is disrupted due to the aberrant activity of a few specific genes. We propose to solve this problem using the perturbed-node joint graphical lasso, a convex optimization problem that is based upon the use of a row-column overlap norm penalty. We then solve the convex problem using an alternating directions method of multipliers algorithm. Our proposal is illustrated on synthetic data and on an application to brain cancer gene expression data. 1

5 0.72813857 305 nips-2012-Selective Labeling via Error Bound Minimization

Author: Quanquan Gu, Tong Zhang, Jiawei Han, Chris H. Ding

Abstract: In many practical machine learning problems, the acquisition of labeled data is often expensive and/or time consuming. This motivates us to study a problem as follows: given a label budget, how to select data points to label such that the learning performance is optimized. We propose a selective labeling method by analyzing the out-of-sample error of Laplacian regularized Least Squares (LapRLS). In particular, we derive a deterministic out-of-sample error bound for LapRLS trained on subsampled data, and propose to select a subset of data points to label by minimizing this upper bound. Since the minimization is a combinational problem, we relax it into continuous domain and solve it by projected gradient descent. Experiments on benchmark datasets show that the proposed method outperforms the state-of-the-art methods.

6 0.60264635 363 nips-2012-Wavelet based multi-scale shape features on arbitrary surfaces for cortical thickness discrimination

7 0.5954054 354 nips-2012-Truly Nonparametric Online Variational Inference for Hierarchical Dirichlet Processes

8 0.59268218 284 nips-2012-Q-MKL: Matrix-induced Regularization in Multi-Kernel Learning with Applications to Neuroimaging

9 0.59043324 312 nips-2012-Simultaneously Leveraging Output and Task Structures for Multiple-Output Regression

10 0.58943349 157 nips-2012-Identification of Recurrent Patterns in the Activation of Brain Networks

11 0.587843 68 nips-2012-Clustering Aggregation as Maximum-Weight Independent Set

12 0.58612597 203 nips-2012-Locating Changes in Highly Dependent Data with Unknown Number of Change Points

13 0.58591008 294 nips-2012-Repulsive Mixtures

14 0.58503133 233 nips-2012-Multiresolution Gaussian Processes

15 0.58474904 342 nips-2012-The variational hierarchical EM algorithm for clustering hidden Markov models

16 0.58391732 356 nips-2012-Unsupervised Structure Discovery for Semantic Analysis of Audio

17 0.58368886 318 nips-2012-Sparse Approximate Manifolds for Differential Geometric MCMC

18 0.58356237 192 nips-2012-Learning the Dependency Structure of Latent Factors

19 0.5835402 26 nips-2012-A nonparametric variable clustering model

20 0.58335352 307 nips-2012-Semi-Crowdsourced Clustering: Generalizing Crowd Labeling by Robust Distance Metric Learning