nips nips2010 nips2010-117 knowledge-graph by maker-knowledge-mining

117 nips-2010-Identifying graph-structured activation patterns in networks


Source: pdf

Author: James Sharpnack, Aarti Singh

Abstract: We consider the problem of identifying an activation pattern in a complex, large-scale network that is embedded in very noisy measurements. This problem is relevant to several applications, such as identifying traces of a biochemical spread by a sensor network, expression levels of genes, and anomalous activity or congestion in the Internet. Extracting such patterns is a challenging task, especially if the network is large (the pattern is very high-dimensional) and the noise is so excessive that it masks the activity at any single node. However, typically there are statistical dependencies in the network activation process that can be leveraged to fuse the measurements of multiple nodes and enable reliable extraction of high-dimensional noisy patterns. In this paper, we analyze an estimator based on the graph Laplacian eigenbasis, and establish the limits of mean square error recovery of noisy patterns arising from a probabilistic (Gaussian or Ising) model based on an arbitrary graph structure. We consider both deterministic and probabilistic network evolution models, and our results indicate that by leveraging the network interaction structure, it is possible to consistently recover high-dimensional patterns even when the noise variance increases with network size. 1

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Identifying graph-structured activation patterns in networks James Sharpnack Machine Learning Department, Statistics Department Carnegie Mellon University Pittsburgh, PA 15213 jsharpna@andrew. [sent-1, score-0.354]

2 edu Abstract We consider the problem of identifying an activation pattern in a complex, large-scale network that is embedded in very noisy measurements. [sent-4, score-0.458]

3 Extracting such patterns is a challenging task, especially if the network is large (the pattern is very high-dimensional) and the noise is so excessive that it masks the activity at any single node. [sent-6, score-0.448]

4 However, typically there are statistical dependencies in the network activation process that can be leveraged to fuse the measurements of multiple nodes and enable reliable extraction of high-dimensional noisy patterns. [sent-7, score-0.546]

5 In this paper, we analyze an estimator based on the graph Laplacian eigenbasis, and establish the limits of mean square error recovery of noisy patterns arising from a probabilistic (Gaussian or Ising) model based on an arbitrary graph structure. [sent-8, score-0.887]

6 We consider both deterministic and probabilistic network evolution models, and our results indicate that by leveraging the network interaction structure, it is possible to consistently recover high-dimensional patterns even when the noise variance increases with network size. [sent-9, score-0.866]

7 1 Introduction The problem of identifying high-dimensional activation patterns embedded in noise is important for applications such as contamination monitoring by a sensor network, determining the set of differentially expressed genes, and anomaly detection in networks. [sent-10, score-0.549]

8 Formally, we consider the problem of identifying a pattern corrupted by noise that is observed at the p nodes of a network: y_i = x_i + ζ_i, i ∈ [p] = {1, . . . , p}. [sent-11, score-0.221]

9 where x = [x_1, . . . , x_p] ∈ R^p (or {0, 1}^p) is the p-dimensional unknown continuous (or binary) activation pattern, and the noise ζ_i ∼ N(0, σ²) iid, the Gaussian distribution with mean zero and variance σ². [sent-17, score-0.327]

10 This problem is particularly challenging when the network is large-scale, and hence x is a high-dimensional pattern embedded in heavy noise. [sent-18, score-0.256]

11 Classical approaches to this problem in the signal processing and statistics literature involve either thresholding the measurements at every node, or in the discrete case, matching the observed noisy measurements with all possible patterns (also known as the scan statistic). [sent-19, score-0.383]

12 In this case, multiple hypothesis testing effects imply that the noise variance needs to decrease as the number of nodes p increases [10, 1] to enable consistent mean square error (MSE) recovery. [sent-21, score-0.308]

13 The second approach based on the scan statistic is computationally infeasible in high-dimensional settings as the number of discrete patterns scales exponentially (≥ 2^p) in the number of dimensions p. [sent-22, score-0.242]

14 In practice, network activation patterns tend to be structured due to statistical dependencies in the network activation process. [sent-23, score-0.884]

15 If the activation is independent at each node, noise variance needs to decrease as network size p increases (in blue). [sent-25, score-0.505]

16 If dependencies in the activation process are harnessed, noise variance can increase as p^γ, where 0 < γ < 1 depends on network interactions (in red). [sent-26, score-0.575]

17 Specifically, we assume that the patterns x are generated from a probabilistic model, either Gaussian graphical model (GGM) or Ising (binary), based on a general graph structure G(V, E), where V denotes the p vertices and E denotes the edges. [sent-29, score-0.523]

18 Gaussian graphical model: p(x) ∝ exp(−x^T Σ^{−1} x); Ising model: p(x) ∝ exp(−∑_{(i,j)∈E} W_ij (x_i − x_j)²) ∝ exp(−x^T L x). (2) In the Ising model, L = D − W denotes the graph Laplacian, where W is the weighted adjacency matrix and D is the diagonal matrix of node degrees d_i = ∑_{j:(i,j)∈E} W_ij. [sent-30, score-0.364]
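A minimal sketch of the quantities in Eq. (2), assuming NumPy and a small hypothetical 4-node path graph: it builds L = D − W from a weighted adjacency matrix and draws one noisy observation y = x + ζ as in Eq. (1); the ridge term eps is an assumption added only so the degenerate prior can be sampled.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy weighted adjacency matrix W for a 4-node path graph (hypothetical example).
W = np.array([[0., 1., 0., 0.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., 0., 1., 0.]])
D = np.diag(W.sum(axis=1))          # diagonal matrix of node degrees d_i
L = D - W                           # graph Laplacian L = D - W

# The prior p(x) ∝ exp(-x^T L x) is degenerate because L is singular, so a small
# ridge eps*I is added here purely to make this sketch runnable.
eps = 1e-3
cov = np.linalg.inv(2.0 * (L + eps * np.eye(4)))
x = rng.multivariate_normal(np.zeros(4), cov)      # latent activation pattern

sigma = 1.0
y = x + sigma * rng.standard_normal(4)             # noisy node measurements, Eq. (1)
```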

19 The graphical model implies that not all patterns are equally likely to occur in the network. [sent-32, score-0.231]

20 Thus, the graph structure dictates the statistical dependencies in network measurements. [sent-34, score-0.454]

21 We assume that this graph structure is known, either because it corresponds to the physical topology of the network or it can be learnt using network measurements [18, 25]. [sent-35, score-0.588]

22 In this paper, we are concerned with the following problem: What is the largest amount of noise that can be tolerated, as a function of the graph and parameters of the model, while allowing for consistent reconstruction of graph-structured network activation patterns? [sent-36, score-0.678]

23 If the activations at network nodes are independent of each other, the noise variance (σ²) must decrease with network size p to ensure consistent MSE recovery [10, 1]. [sent-37, score-0.67]

24 We show that by exploiting the network dependencies, it is possible to consistently recover high-dimensional patterns when the noise variance is much larger (can grow with the network size p). [sent-38, score-0.643]

25 We characterize the learnability of graph structured patterns based on the eigenspectrum of the network. [sent-40, score-0.564]

26 To this end, we propose using an estimator based on thresholding the projection of the network measurements onto the graph Laplacian eigenvectors. [sent-41, score-0.593]

27 Our results indicate that the noise threshold is determined by the eigenspectrum of the Laplacian. [sent-43, score-0.343]

28 For the GGM this procedure reduces to PCA and the noise threshold depends on the eigenvalues of the covariance matrix, as expected. [sent-44, score-0.328]

29 We show that for simple graph structures, such as hierarchical or lattice graphs, as well as the random Erdős-Rényi graph, the noise threshold can possibly grow with the network size p. [sent-45, score-0.846]

30 Thus, leveraging the structure of network interactions can enable extraction of high-dimensional patterns embedded in heavy noise. [sent-46, score-0.553]

31 Limits of MSE recovery for graph-structured patterns are investigated in Section 3 for the binary Ising model, and in Section 4 for the Gaussian graphical model. [sent-49, score-0.354]

32 In Section 5, we analyze the noise threshold for some simple deterministic and random graph structures. [sent-50, score-0.384]

33 The estimator we propose based on the graph Laplacian eigenbasis is both easy to compute and analyze. [sent-55, score-0.509]

34 The work on graph and manifold regularization [4, 3, 23, 2] is closely related and assumes that the function of interest is smooth with respect to the graph, which is essentially equivalent to assuming a graphical model prior of the form in Eq. [sent-57, score-0.281]

35 However, the use of graph Laplacian is theoretically justified mainly in the embedded setting [6, 21], where the data points are sampled from or near a low-dimensional manifold, and the graph weights are the distances between two points as measured by some kernel. [sent-59, score-0.467]

36 To the best of our knowledge, no previous work studies the noise threshold for consistent MSE recovery of arbitrary graph-structured patterns. [sent-60, score-0.311]

37 There have been several attempts at constructing multi-scale bases for graphs that can efficiently represent localized activation patterns, notably diffusion wavelets [9] and treelets [17]; however, their approximation capabilities are not well understood. [sent-61, score-0.36]

38 1 that the unbalanced Haar wavelets are a special instance of graph Laplacian eigenbasis when the underlying graph is hierarchical. [sent-64, score-0.674]

39 On the other hand, a lattice graph structure yields activations that are globally supported and smooth, and in this case the Laplacian eigenbasis corresponds to the Fourier transform (see Section 5. [sent-65, score-0.572]

40 Thus, the graph Laplacian eigenbasis provides an efficient representation for patterns whose structure is governed by the graph. [sent-67, score-0.573]

41 3 Denoising binary graph-structured patterns The binary Ising model is essentially a discrete version of the GGM; however, the Bayes rule and risk for the Ising model have no known closed form. [sent-68, score-0.272]

42 For binary graph-structured patterns drawn from an Ising prior, we suggest a different estimator based on projections onto the graph Laplacian eigenbasis. [sent-69, score-0.603]

43 Let the graph Laplacian L have spectral decomposition L = UΛU^T, and denote the first k eigenvectors (corresponding to the smallest eigenvalues) of L by U_[k]. [sent-70, score-0.265]

44 Define the estimator x̂_k = U_[k] U_[k]^T y, (3) which is a hard thresholding of the projection of the network measurements y = [y_1, . . . , y_p]. [sent-71, score-0.416]
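A small sketch of the estimator in Eq. (3), assuming a dense symmetric Laplacian that fits in memory (NumPy's eigh); iterative solvers, as mentioned later in the experiments, would replace it for large graphs.

```python
import numpy as np

def eigenbasis_estimator(L, y, k):
    """Sketch of the estimator in Eq. (3): project y onto the k eigenvectors
    of the graph Laplacian L with the smallest eigenvalues."""
    eigvals, U = np.linalg.eigh(L)   # eigh returns eigenvalues in ascending order
    U_k = U[:, :k]                   # U_[k]: first k (smoothest) eigenvectors
    return U_k @ (U_k.T @ y)         # x_hat_k = U_[k] U_[k]^T y

# Example usage with the L and y from the earlier sketch:
# x_hat = eigenbasis_estimator(L, y, k=2)
```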

45 (1), when the binary activation patterns are drawn from the Ising prior of Eq. [sent-79, score-0.408]

46 Through this bias-variance decomposition, we see the eigenspectrum of the graph Laplacian determines a bound on the MSE for binary graph-structured activations. [sent-81, score-0.405]

47 4 Denoising Gaussian graph-structured patterns If the network activation patterns are generated by a Gaussian graphical model, it is easy to see that the eigenvalues of the Laplacian (inverse covariance) determine the MSE decay. [sent-85, score-0.886]

48 The posterior mean is the Bayes optimal estimator; its Bayes MSE is determined by the ordered eigenvalues {λ_i}_{i∈[p]} of L. [sent-88, score-0.346]
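For the Gaussian case, a hedged sketch of the posterior mean and per-node Bayes MSE for a generic prior x ~ N(0, C) under y = x + N(0, σ²I); the covariance C and any constants induced by the graph Laplacian are assumptions not reproduced from the paper.

```python
import numpy as np

def gaussian_posterior_mean(C, y, sigma2):
    """Posterior mean for x ~ N(0, C), y = x + N(0, sigma2*I) (standard Wiener filter)."""
    p = len(y)
    return C @ np.linalg.solve(C + sigma2 * np.eye(p), y)

def bayes_mse(C, sigma2):
    """Per-node Bayes MSE = average posterior variance."""
    p = C.shape[0]
    post_cov = np.linalg.inv(np.linalg.inv(C) + np.eye(p) / sigma2)
    return np.trace(post_cov) / p
```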

49 (1), when the activation patterns are drawn from the Gaussian graphical model prior of Eq. [sent-94, score-0.424]

50 5 Noise threshold for some simple graphs In this section, we discuss the eigenspectrum of some simple graphs and use the MSE bounds derived in the previous section to analyze the amount of noise that can be tolerated while ensuring consistent MSE recovery of high-dimensional patterns. [sent-97, score-0.704]

51 In all these examples, we find that the tolerable noise level scales as σ² = o(p^γ), where γ ∈ (0, 1) characterizes the strength of network interactions. [sent-98, score-0.261]

52 This corresponds to hierarchical graph-structured dependencies between node variables, where the interaction strength between nodes in the same block decreases with the block level ℓ = 0, 1, . . . [sent-101, score-0.471]

53 It is easy to see that in this case the eigenvectors u of the graph Laplacian correspond to unbalanced Haar wavelet basis (proposed in [22, 14]), i. [sent-105, score-0.308]

54 Using the bound on MSE given in Theorems 1 and 2, we can now derive the noise threshold that allows for consistent MSE recovery of high-dimensional patterns as the network size p → ∞. [sent-110, score-0.646]

55 If the level-ℓ interaction strength equals 2^{−ℓ(1−β)} for all ℓ ≤ γ log₂ p + 1, for constants γ, β ∈ (0, 1), and 0 otherwise, then the noise threshold for consistent MSE recovery (R_B = o(1)) is σ² = o(p^γ). [sent-113, score-0.311]

56 Thus, if we take advantage of the network interaction structure, it is possible to tolerate noise with variance that scales with the network size p, whereas without exploiting structure the noise variance needs to decrease with p, as discussed in the introduction. [sent-114, score-0.743]

57 Larger γ implies stronger network interactions, and hence a larger noise threshold. [sent-115, score-0.261]

58 2 Regular Lattice structure Now consider the lattice graph, which is constructed by placing vertices in a regular grid on a d-dimensional torus and adding edges of weight 1 between adjacent points. [sent-117, score-0.478]
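A sketch of this construction, assuming NetworkX and illustrative sizes n = 5, d = 3 (a grid graph with periodic boundary conditions standing in for the torus lattice):

```python
import networkx as nx
import numpy as np

# Assumed construction: a regular grid on a d-dimensional torus with unit edge weights,
# i.e. a grid graph with periodic boundary conditions (sizes here are illustrative).
n, d = 5, 3                                      # p = n**d vertices
G = nx.grid_graph(dim=[n] * d, periodic=True)
L = nx.laplacian_matrix(G).toarray().astype(float)

eigvals = np.linalg.eigvalsh(L)
print(G.number_of_nodes(), eigvals.min(), eigvals.max())   # p = 125; spectrum lies in [0, 4d]
```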

59 This form for W, together with the fact that all nodes have the same degree, gives us a closed form for the eigenvalues of the Laplacian, along with a concentration inequality. [sent-137, score-0.264]

60 Let λ_•^L be an eigenvalue of the Laplacian L of the lattice graph in d dimensions with p = n^d vertices, chosen uniformly at random. [sent-139, score-0.537]

61 Consider a graph-structured pattern drawn from an Ising model or GGM based on a lattice graph in d dimensions with p = n^d vertices. [sent-144, score-0.477]

62 If n is a constant and d = 8γ ln p, for some constant γ ∈ (0, 1), then the noise threshold for consistent MSE recovery (R_B = o(1)) is given as: σ² = o(p^γ). [sent-145, score-0.311]

63 Again, the noise variance can increase with the network size p, and larger γ implies stronger network interactions as each variable interacts with more neighbors (d is larger). [sent-146, score-0.489]

64 3 Erdős-Rényi random graph structure Erdős-Rényi (ER) random graphs are generated by adding edges with weight 1 between any two vertices within the vertex set V (of size p) with probability q_p. [sent-148, score-0.814]

65 It is known that the probability of edge inclusion (q_p) determines large-scale geometric properties of the graph [11]. [sent-149, score-0.273]

66 Real-world networks are generally sparse, so we set q_p = p^{−(1−γ)}, where γ ∈ (0, 1). [sent-150, score-0.249]
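A sketch of this sparse ER regime with illustrative values p = 2000 and γ = 0.5, assuming NetworkX; it only illustrates the concentration of the Laplacian spectrum near the expected degree p·q_p ≈ p^γ.

```python
import networkx as nx
import numpy as np

# Sketch: a sparse Erdős-Rényi graph with edge probability q_p = p**-(1-gamma);
# the bulk of the Laplacian spectrum concentrates near the expected degree ~ p**gamma.
p, gamma = 2000, 0.5              # illustrative sizes, not the paper's settings
qp = p ** -(1.0 - gamma)
G = nx.gnp_random_graph(p, qp, seed=0)
L = nx.laplacian_matrix(G).toarray().astype(float)

eigvals = np.linalg.eigvalsh(L)
print(np.median(eigvals), p * qp)  # median eigenvalue vs. expected degree p**gamma (~44.7)
```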

67 Larger γ implies higher probability of edge inclusion and stronger network interaction structure. [sent-151, score-0.242]

68 Using the degree distribution [7], and a result from perturbation theory, we bound the quantiles of the eigenspectrum of L. [sent-152, score-0.253]

69 Let P_G be the probability measure induced by the ER random graph and P_• be the uniform distribution over eigenvalues conditional on the graph. [sent-155, score-0.362]

70 Then, for any α_p increasing in p, P_G{P_•{λ_• ≤ p^γ/2 − p^{γ−1}} ≥ α_p p^{−γ}} = O(1/α_p). (7) Hence, we are able to set the sequence of quantiles for the eigenvalue distribution k_p = α_p p^{1−γ} such that P_G{λ_{k_p} ≤ p^γ/2 − p^{γ−1}} = O(1/α_p). [sent-156, score-0.227]

71 Consider a graph G drawn from an Erdős-Rényi random graph model with p vertices and probability of edge inclusion q_p = p^{−(1−γ)} for some constant γ ∈ (0, 1). [sent-159, score-0.89]

72 If the latent graph-structured pattern is drawn from an Ising model or a GGM with the Laplacian of G, then the noise variance that can be tolerated while ensuring consistent MSE recovery (R_B = o_{P_G}(1)) is given as: σ² = o(p^γ). [sent-160, score-0.404]

73 6 Experiments We simulate patterns from the Ising model defined on hierarchical, lattice and ER graphs. [sent-161, score-0.373]

74 Histograms of the eigenspectrum for the hierarchical tree graph with a large depth, the lattice graph in high dimensions, and a draw from the ER graph with many nodes are shown in Figures 3(a), 4(a), and 5(a), respectively. [sent-163, score-1.077]

75 The eigenspectra of the lattice and ER graphs illustrate the concentration of the eigenvalues about the expected degree of each node. [sent-164, score-0.626]

76 We use iterative eigenvalue solvers to form our estimator and choose the quantile k by minimizing the bound in Theorem 1. [sent-165, score-0.266]
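A hedged sketch of this step using SciPy's Lanczos solver (eigsh); the k-selection rule below, delta·p/λ_{k+1} + k·σ², is a crude stand-in for minimizing the Theorem 1 bound, with delta treated as a free parameter, and is not the paper's exact rule.

```python
import numpy as np
import scipy.sparse.linalg as spla

def smallest_eigenpairs(L_sparse, k):
    """k smallest eigenpairs of a sparse PSD Laplacian via shift-invert Lanczos
    (an iterative eigenvalue solver, as used for large graphs)."""
    vals, vecs = spla.eigsh(L_sparse.asfptype(), k=k, sigma=-1e-6, which='LM')
    order = np.argsort(vals)
    return vals[order], vecs[:, order]

def estimate_with_k_selection(L_sparse, y, sigma2, k_max=50, delta=1.0):
    """Sketch only: the score delta*p/lambda_{k+1} + k*sigma2 mimics, but is not,
    the bound-minimizing choice of k from Theorem 1."""
    p = L_sparse.shape[0]
    vals, U = smallest_eigenpairs(L_sparse, min(k_max + 1, p - 2))
    scores = [delta * p / max(vals[k], 1e-12) + k * sigma2 for k in range(1, len(vals))]
    k = int(np.argmin(scores)) + 1
    U_k = U[:, :k]
    return U_k @ (U_k.T @ y), k
```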

77 We compute the Bayes MSE (by taking multiple draws) of our estimator for a noisy sample of node measurements. [sent-166, score-0.253]

78 We observe in all of the models that the eigenmap estimator is a substantial improvement over Naive (the Bayes estimator that ignores the structure). [sent-167, score-0.343]

79 (b) Estimator Performance Figure 4: The eigenvalue histogram for the lattice with d = 10 and p = 5^10 (left) and estimator performances (right) with p = 3^d and σ² = 1. [sent-173, score-0.517]

80 (b) Estimator Performance Figure 5: The eigenvalue histogram for a draw from the ER graph with p = 2500 and q_p = p^{−.5} [sent-176, score-0.636]

81 (left) and the estimator performances (right) with q_p = p^{−.5} [sent-177, score-0.402]

82 Notice that the eigenvalues are concentrated around p^γ where q_p = p^{−(1−γ)}. [sent-179, score-0.402]

83 (b) Estimator Performance Figure 6: The eigenvalue histogram for a draw from the Watts-Strogatz graph with d = 5 and p = 4^5 with 0.25 [sent-181, score-0.387]

84 probability of rewiring (left) and estimator performances (right) with 4^d vertices and σ² = 4. [sent-182, score-0.248]

85 We find that the posterior mean is only a slight improvement over the eigenmap estimator (Figure 3(b)), despite its computational difficulty. [sent-186, score-0.23]

86 The ‘small world’ graph is generated by forming the lattice graph described in Section 5. [sent-189, score-0.604]

87 We observe that the eigenvalues concentrate (more tightly than the lattice graph) around the expected degree 2d (Figure 6(a)) and note that, like the ER model, the eigenspectrum converges to a nearly semi-circular distribution [12]. [sent-191, score-0.573]
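A sketch of such a 'small world' draw, assuming NetworkX's standard ring-lattice Watts-Strogatz generator rather than the paper's rewired d-dimensional lattice; parameters are illustrative.

```python
import networkx as nx
import numpy as np

# Sketch: a Watts-Strogatz 'small world' graph obtained by rewiring a ring lattice.
p, k_ring, rewire_prob = 1024, 10, 0.25           # illustrative parameters
G = nx.watts_strogatz_graph(p, k_ring, rewire_prob, seed=0)
L = nx.laplacian_matrix(G).toarray().astype(float)

eigvals = np.linalg.eigvalsh(L)
print(eigvals.mean(), k_ring)    # eigenvalues concentrate around the expected degree
```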

88 While we have only considered MSE recovery, it is often possible to detect the presence of patterns in much heavier noise, even though the activation values may not be accurately recovered [16]. [sent-195, score-0.354]

89 Establishing the noise threshold for detection, deriving upper bounds on the noise threshold, and extensions to graphical models with higher-order interaction terms are some of the directions for future work. [sent-196, score-0.362]

90 In addition, the thresholding estimator based on the graph Laplacian eigenbasis can also be used in a high-dimensional linear regression or compressed sensing framework to incorporate structure, in addition to sparsity, of the relevant variables. [sent-197, score-0.539]

91 Let u_i denote the ith eigenvector of the graph Laplacian L; then under this orthonormal basis, E[‖x̂_k − x‖²] ≤ E[∑_{i=k+1}^p (u_i^T x)² | Ω] + p P(Ω̄) + kσ² ≤ sup_{x: x^T L x ≤ δp} ∑_{i=k+1}^p (u_i^T x)² + p e^{−p} + kσ². [sent-204, score-0.267]

92 . . . , v_d are a subset of the eigenvectors of W with eigenvalues λ_1, . . . [sent-218, score-0.25]

93 Note that, conditioned on this random variable, d_• ∼ Binomial(p − 1, q_p) and Var(d_•) ≤ p q_p. [sent-236, score-0.375]

94 Also, it is known that for γ ∈ (0, 1) the eigenvalues converge to a semicircular distribution [12] such that P_G{|λ_•^W| ≤ 2√(p q_p(1 − q_p))} → 1. [sent-242, score-0.528]

95 Since 2√(p q_p(1 − q_p)) ≤ 2p^{γ/2}, we have E_G[(λ_•^W)²] ≤ 4p^γ for large enough p (ii). [sent-243, score-0.375]

96 P_•{|λ_•^L − (p − 1)q_p| ≥ ε} ≤ ε^{−2} E_•[(λ_•^L − (p − 1)q_p)²]. Note that P_•{λ_•^L ≤ p q_p − q_p − ε} ≤ P_•{|λ_•^L − (p − 1)q_p| ≥ ε}, and setting ε = p q_p/2 = p^γ/2, P_•{λ_•^L ≤ p^γ/2 − p^{γ−1}} ≤ 4p^{−2γ} E_•[(λ_•^L − (p − 1)q_p)²]. [sent-247, score-0.501]

97 Ando and Tong Zhang, Learning on graph with laplacian regularization, Advances in Neural Information Processing Systems (NIPS), 2006. [sent-260, score-0.434]

98 [6] , Convergence of laplacian eigenmaps, Advances in Neural Information Processing Systems (NIPS), 2006. [sent-267, score-0.225]

99 Singer, From graph to manifold laplacian: the convergence rate, Applied and Computational Harmonic Analysis 21 (2006), no. [sent-327, score-0.237]

100 Yeung, On the distribution of laplacian eigenvalues versus node degrees in complex networks, Physica A 389 (2010), 1779–1788. [sent-348, score-0.444]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('mse', 0.361), ('lx', 0.254), ('qp', 0.249), ('ising', 0.226), ('laplacian', 0.225), ('ggm', 0.21), ('graph', 0.209), ('patterns', 0.187), ('lattice', 0.186), ('eg', 0.174), ('eigenspectrum', 0.168), ('activation', 0.167), ('eigenvalues', 0.153), ('estimator', 0.153), ('network', 0.148), ('eigenbasis', 0.147), ('pqp', 0.126), ('pg', 0.118), ('noise', 0.113), ('eigenvalue', 0.113), ('recovery', 0.095), ('rb', 0.093), ('erd', 0.09), ('graphs', 0.085), ('bayes', 0.084), ('nyi', 0.08), ('haar', 0.074), ('xt', 0.073), ('ut', 0.069), ('dependencies', 0.067), ('wavelets', 0.066), ('node', 0.066), ('histogram', 0.065), ('kp', 0.063), ('threshold', 0.062), ('eigenvectors', 0.056), ('tolerated', 0.055), ('vertices', 0.053), ('measurements', 0.053), ('quantiles', 0.051), ('embedded', 0.049), ('hierarchical', 0.048), ('nodes', 0.048), ('variance', 0.047), ('di', 0.045), ('leveraging', 0.045), ('graphical', 0.044), ('unbalanced', 0.043), ('sketch', 0.042), ('rewiring', 0.042), ('treelets', 0.042), ('vd', 0.041), ('donoho', 0.041), ('consistent', 0.041), ('posterior', 0.04), ('niyogi', 0.039), ('belkin', 0.039), ('corollary', 0.037), ('eigenmap', 0.037), ('tolerate', 0.037), ('wij', 0.035), ('inclusion', 0.035), ('er', 0.035), ('noisy', 0.034), ('degree', 0.034), ('lemma', 0.034), ('coifman', 0.034), ('interactions', 0.033), ('identifying', 0.033), ('estimators', 0.033), ('xk', 0.032), ('nadler', 0.032), ('concentrate', 0.032), ('heavy', 0.032), ('interaction', 0.03), ('id', 0.03), ('thresholding', 0.03), ('dyadic', 0.03), ('tc', 0.03), ('binarized', 0.03), ('structure', 0.03), ('decrease', 0.03), ('closed', 0.029), ('dimensions', 0.029), ('edge', 0.029), ('mikhail', 0.029), ('partha', 0.029), ('eigenmaps', 0.029), ('enable', 0.029), ('notice', 0.029), ('binary', 0.028), ('vertex', 0.028), ('manifold', 0.028), ('tl', 0.028), ('pattern', 0.027), ('johnstone', 0.027), ('supx', 0.027), ('ui', 0.026), ('drawn', 0.026), ('scan', 0.026)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000002 117 nips-2010-Identifying graph-structured activation patterns in networks

Author: James Sharpnack, Aarti Singh

Abstract: We consider the problem of identifying an activation pattern in a complex, large-scale network that is embedded in very noisy measurements. This problem is relevant to several applications, such as identifying traces of a biochemical spread by a sensor network, expression levels of genes, and anomalous activity or congestion in the Internet. Extracting such patterns is a challenging task, especially if the network is large (the pattern is very high-dimensional) and the noise is so excessive that it masks the activity at any single node. However, typically there are statistical dependencies in the network activation process that can be leveraged to fuse the measurements of multiple nodes and enable reliable extraction of high-dimensional noisy patterns. In this paper, we analyze an estimator based on the graph Laplacian eigenbasis, and establish the limits of mean square error recovery of noisy patterns arising from a probabilistic (Gaussian or Ising) model based on an arbitrary graph structure. We consider both deterministic and probabilistic network evolution models, and our results indicate that by leveraging the network interaction structure, it is possible to consistently recover high-dimensional patterns even when the noise variance increases with network size. 1

2 0.16307224 265 nips-2010-The LASSO risk: asymptotic results and real world examples

Author: Mohsen Bayati, José Pereira, Andrea Montanari

Abstract: We consider the problem of learning a coefficient vector x0 ∈ RN from noisy linear observation y = Ax0 + w ∈ Rn . In many contexts (ranging from model selection to image processing) it is desirable to construct a sparse estimator x. In this case, a popular approach consists in solving an ℓ1 -penalized least squares problem known as the LASSO or Basis Pursuit DeNoising (BPDN). For sequences of matrices A of increasing dimensions, with independent gaussian entries, we prove that the normalized risk of the LASSO converges to a limit, and we obtain an explicit expression for this limit. Our result is the first rigorous derivation of an explicit formula for the asymptotic mean square error of the LASSO for random instances. The proof technique is based on the analysis of AMP, a recently developed efficient algorithm, that is inspired from graphical models ideas. Through simulations on real data matrices (gene expression data and hospital medical records) we observe that these results can be relevant in a broad array of practical applications.

3 0.12932985 204 nips-2010-Penalized Principal Component Regression on Graphs for Analysis of Subnetworks

Author: Ali Shojaie, George Michailidis

Abstract: Network models are widely used to capture interactions among component of complex systems, such as social and biological. To understand their behavior, it is often necessary to analyze functionally related components of the system, corresponding to subsystems. Therefore, the analysis of subnetworks may provide additional insight into the behavior of the system, not evident from individual components. We propose a novel approach for incorporating available network information into the analysis of arbitrary subnetworks. The proposed method offers an efficient dimension reduction strategy using Laplacian eigenmaps with Neumann boundary conditions, and provides a flexible inference framework for analysis of subnetworks, based on a group-penalized principal component regression model on graphs. Asymptotic properties of the proposed inference method, as well as the choice of the tuning parameter for control of the false positive rate are discussed in high dimensional settings. The performance of the proposed methodology is illustrated using simulated and real data examples from biology. 1

4 0.11440789 190 nips-2010-On the Convexity of Latent Social Network Inference

Author: Seth Myers, Jure Leskovec

Abstract: In many real-world scenarios, it is nearly impossible to collect explicit social network data. In such cases, whole networks must be inferred from underlying observations. Here, we formulate the problem of inferring latent social networks based on network diffusion or disease propagation data. We consider contagions propagating over the edges of an unobserved social network, where we only observe the times when nodes became infected, but not who infected them. Given such node infection times, we then identify the optimal network that best explains the observed data. We present a maximum likelihood approach based on convex programming with a l1 -like penalty term that encourages sparsity. Experiments on real and synthetic data reveal that our method near-perfectly recovers the underlying network structure as well as the parameters of the contagion propagation model. Moreover, our approach scales well as it can infer optimal networks of thousands of nodes in a matter of minutes.

5 0.11296619 259 nips-2010-Subgraph Detection Using Eigenvector L1 Norms

Author: Benjamin Miller, Nadya Bliss, Patrick J. Wolfe

Abstract: When working with network datasets, the theoretical framework of detection theory for Euclidean vector spaces no longer applies. Nevertheless, it is desirable to determine the detectability of small, anomalous graphs embedded into background networks with known statistical properties. Casting the problem of subgraph detection in a signal processing context, this article provides a framework and empirical results that elucidate a “detection theory” for graph-valued data. Its focus is the detection of anomalies in unweighted, undirected graphs through L1 properties of the eigenvectors of the graph’s so-called modularity matrix. This metric is observed to have relatively low variance for certain categories of randomly-generated graphs, and to reveal the presence of an anomalous subgraph with reasonable reliability when the anomaly is not well-correlated with stronger portions of the background graph. An analysis of subgraphs in real network datasets confirms the efficacy of this approach. 1

6 0.11108281 63 nips-2010-Distributed Dual Averaging In Networks

7 0.10714084 80 nips-2010-Estimation of Renyi Entropy and Mutual Information Based on Generalized Nearest-Neighbor Graphs

8 0.10693595 162 nips-2010-Link Discovery using Graph Feature Tracking

9 0.10468838 148 nips-2010-Learning Networks of Stochastic Differential Equations

10 0.10297649 254 nips-2010-Stability Approach to Regularization Selection (StARS) for High Dimensional Graphical Models

11 0.10275682 85 nips-2010-Exact learning curves for Gaussian process regression on large random graphs

12 0.099895261 109 nips-2010-Group Sparse Coding with a Laplacian Scale Mixture Prior

13 0.090895914 150 nips-2010-Learning concept graphs from text with stick-breaking priors

14 0.085791931 30 nips-2010-An Inverse Power Method for Nonlinear Eigenproblems with Applications in 1-Spectral Clustering and Sparse PCA

15 0.08429385 101 nips-2010-Gaussian sampling by local perturbations

16 0.081124127 108 nips-2010-Graph-Valued Regression

17 0.080765277 289 nips-2010-b-Bit Minwise Hashing for Estimating Three-Way Similarities

18 0.079128832 284 nips-2010-Variational bounds for mixed-data factor analysis

19 0.077429749 250 nips-2010-Spectral Regularization for Support Estimation

20 0.077044562 238 nips-2010-Short-term memory in neuronal networks through dynamical compressed sensing


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.213), (1, 0.055), (2, 0.052), (3, 0.166), (4, -0.025), (5, -0.116), (6, 0.008), (7, -0.025), (8, -0.033), (9, 0.043), (10, -0.075), (11, -0.023), (12, -0.079), (13, 0.02), (14, -0.012), (15, -0.099), (16, 0.108), (17, -0.12), (18, -0.057), (19, 0.139), (20, 0.023), (21, 0.012), (22, -0.128), (23, 0.071), (24, 0.062), (25, 0.089), (26, 0.081), (27, 0.121), (28, 0.007), (29, -0.04), (30, 0.075), (31, 0.08), (32, 0.032), (33, -0.0), (34, -0.06), (35, -0.105), (36, -0.042), (37, -0.01), (38, -0.047), (39, 0.031), (40, -0.13), (41, 0.021), (42, 0.044), (43, -0.012), (44, 0.076), (45, 0.033), (46, 0.048), (47, 0.066), (48, -0.067), (49, 0.03)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97051024 117 nips-2010-Identifying graph-structured activation patterns in networks

Author: James Sharpnack, Aarti Singh

Abstract: We consider the problem of identifying an activation pattern in a complex, largescale network that is embedded in very noisy measurements. This problem is relevant to several applications, such as identifying traces of a biochemical spread by a sensor network, expression levels of genes, and anomalous activity or congestion in the Internet. Extracting such patterns is a challenging task specially if the network is large (pattern is very high-dimensional) and the noise is so excessive that it masks the activity at any single node. However, typically there are statistical dependencies in the network activation process that can be leveraged to fuse the measurements of multiple nodes and enable reliable extraction of highdimensional noisy patterns. In this paper, we analyze an estimator based on the graph Laplacian eigenbasis, and establish the limits of mean square error recovery of noisy patterns arising from a probabilistic (Gaussian or Ising) model based on an arbitrary graph structure. We consider both deterministic and probabilistic network evolution models, and our results indicate that by leveraging the network interaction structure, it is possible to consistently recover high-dimensional patterns even when the noise variance increases with network size. 1

2 0.81045288 204 nips-2010-Penalized Principal Component Regression on Graphs for Analysis of Subnetworks

Author: Ali Shojaie, George Michailidis

Abstract: Network models are widely used to capture interactions among component of complex systems, such as social and biological. To understand their behavior, it is often necessary to analyze functionally related components of the system, corresponding to subsystems. Therefore, the analysis of subnetworks may provide additional insight into the behavior of the system, not evident from individual components. We propose a novel approach for incorporating available network information into the analysis of arbitrary subnetworks. The proposed method offers an efficient dimension reduction strategy using Laplacian eigenmaps with Neumann boundary conditions, and provides a flexible inference framework for analysis of subnetworks, based on a group-penalized principal component regression model on graphs. Asymptotic properties of the proposed inference method, as well as the choice of the tuning parameter for control of the false positive rate are discussed in high dimensional settings. The performance of the proposed methodology is illustrated using simulated and real data examples from biology. 1

3 0.79806185 259 nips-2010-Subgraph Detection Using Eigenvector L1 Norms

Author: Benjamin Miller, Nadya Bliss, Patrick J. Wolfe

Abstract: When working with network datasets, the theoretical framework of detection theory for Euclidean vector spaces no longer applies. Nevertheless, it is desirable to determine the detectability of small, anomalous graphs embedded into background networks with known statistical properties. Casting the problem of subgraph detection in a signal processing context, this article provides a framework and empirical results that elucidate a “detection theory” for graph-valued data. Its focus is the detection of anomalies in unweighted, undirected graphs through L1 properties of the eigenvectors of the graph’s so-called modularity matrix. This metric is observed to have relatively low variance for certain categories of randomly-generated graphs, and to reveal the presence of an anomalous subgraph with reasonable reliability when the anomaly is not well-correlated with stronger portions of the background graph. An analysis of subgraphs in real network datasets confirms the efficacy of this approach. 1

4 0.76318383 190 nips-2010-On the Convexity of Latent Social Network Inference

Author: Seth Myers, Jure Leskovec

Abstract: In many real-world scenarios, it is nearly impossible to collect explicit social network data. In such cases, whole networks must be inferred from underlying observations. Here, we formulate the problem of inferring latent social networks based on network diffusion or disease propagation data. We consider contagions propagating over the edges of an unobserved social network, where we only observe the times when nodes became infected, but not who infected them. Given such node infection times, we then identify the optimal network that best explains the observed data. We present a maximum likelihood approach based on convex programming with a l1 -like penalty term that encourages sparsity. Experiments on real and synthetic data reveal that our method near-perfectly recovers the underlying network structure as well as the parameters of the contagion propagation model. Moreover, our approach scales well as it can infer optimal networks of thousands of nodes in a matter of minutes.

5 0.72740799 85 nips-2010-Exact learning curves for Gaussian process regression on large random graphs

Author: Matthew Urry, Peter Sollich

Abstract: We study learning curves for Gaussian process regression which characterise performance in terms of the Bayes error averaged over datasets of a given size. Whilst learning curves are in general very difficult to calculate we show that for discrete input domains, where similarity between input points is characterised in terms of a graph, accurate predictions can be obtained. These should in fact become exact for large graphs drawn from a broad range of random graph ensembles with arbitrary degree distributions where each input (node) is connected only to a finite number of others. Our approach is based on translating the appropriate belief propagation equations to the graph ensemble. We demonstrate the accuracy of the predictions for Poisson (Erdos-Renyi) and regular random graphs, and discuss when and why previous approximations of the learning curve fail. 1

6 0.72376013 105 nips-2010-Getting lost in space: Large sample analysis of the resistance distance

7 0.62095219 254 nips-2010-Stability Approach to Regularization Selection (StARS) for High Dimensional Graphical Models

8 0.59576881 148 nips-2010-Learning Networks of Stochastic Differential Equations

9 0.59512818 63 nips-2010-Distributed Dual Averaging In Networks

10 0.56039101 162 nips-2010-Link Discovery using Graph Feature Tracking

11 0.53827614 289 nips-2010-b-Bit Minwise Hashing for Estimating Three-Way Similarities

12 0.52613574 30 nips-2010-An Inverse Power Method for Nonlinear Eigenproblems with Applications in 1-Spectral Clustering and Sparse PCA

13 0.51571912 87 nips-2010-Extended Bayesian Information Criteria for Gaussian Graphical Models

14 0.48840344 150 nips-2010-Learning concept graphs from text with stick-breaking priors

15 0.48590556 80 nips-2010-Estimation of Renyi Entropy and Mutual Information Based on Generalized Nearest-Neighbor Graphs

16 0.48289084 108 nips-2010-Graph-Valued Regression

17 0.47980243 238 nips-2010-Short-term memory in neuronal networks through dynamical compressed sensing

18 0.47948548 200 nips-2010-Over-complete representations on recurrent neural networks can support persistent percepts

19 0.47105965 265 nips-2010-The LASSO risk: asymptotic results and real world examples

20 0.46551982 127 nips-2010-Inferring Stimulus Selectivity from the Spatial Structure of Neural Network Dynamics


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(13, 0.092), (17, 0.016), (27, 0.053), (30, 0.064), (35, 0.049), (45, 0.149), (50, 0.054), (52, 0.057), (59, 0.01), (60, 0.043), (77, 0.072), (78, 0.012), (85, 0.188), (90, 0.067)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.84573054 117 nips-2010-Identifying graph-structured activation patterns in networks

Author: James Sharpnack, Aarti Singh

Abstract: We consider the problem of identifying an activation pattern in a complex, largescale network that is embedded in very noisy measurements. This problem is relevant to several applications, such as identifying traces of a biochemical spread by a sensor network, expression levels of genes, and anomalous activity or congestion in the Internet. Extracting such patterns is a challenging task specially if the network is large (pattern is very high-dimensional) and the noise is so excessive that it masks the activity at any single node. However, typically there are statistical dependencies in the network activation process that can be leveraged to fuse the measurements of multiple nodes and enable reliable extraction of highdimensional noisy patterns. In this paper, we analyze an estimator based on the graph Laplacian eigenbasis, and establish the limits of mean square error recovery of noisy patterns arising from a probabilistic (Gaussian or Ising) model based on an arbitrary graph structure. We consider both deterministic and probabilistic network evolution models, and our results indicate that by leveraging the network interaction structure, it is possible to consistently recover high-dimensional patterns even when the noise variance increases with network size. 1

2 0.75685436 43 nips-2010-Bootstrapping Apprenticeship Learning

Author: Abdeslam Boularias, Brahim Chaib-draa

Abstract: We consider the problem of apprenticeship learning where the examples, demonstrated by an expert, cover only a small part of a large state space. Inverse Reinforcement Learning (IRL) provides an efficient tool for generalizing the demonstration, based on the assumption that the expert is maximizing a utility function that is a linear combination of state-action features. Most IRL algorithms use a simple Monte Carlo estimation to approximate the expected feature counts under the expert’s policy. In this paper, we show that the quality of the learned policies is highly sensitive to the error in estimating the feature counts. To reduce this error, we introduce a novel approach for bootstrapping the demonstration by assuming that: (i), the expert is (near-)optimal, and (ii), the dynamics of the system is known. Empirical results on gridworlds and car racing problems show that our approach is able to learn good policies from a small number of demonstrations. 1

3 0.72250468 238 nips-2010-Short-term memory in neuronal networks through dynamical compressed sensing

Author: Surya Ganguli, Haim Sompolinsky

Abstract: Recent proposals suggest that large, generic neuronal networks could store memory traces of past input sequences in their instantaneous state. Such a proposal raises important theoretical questions about the duration of these memory traces and their dependence on network size, connectivity and signal statistics. Prior work, in the case of gaussian input sequences and linear neuronal networks, shows that the duration of memory traces in a network cannot exceed the number of neurons (in units of the neuronal time constant), and that no network can out-perform an equivalent feedforward network. However a more ethologically relevant scenario is that of sparse input sequences. In this scenario, we show how linear neural networks can essentially perform compressed sensing (CS) of past inputs, thereby attaining a memory capacity that exceeds the number of neurons. This enhanced capacity is achieved by a class of “orthogonal” recurrent networks and not by feedforward networks or generic recurrent networks. We exploit techniques from the statistical physics of disordered systems to analytically compute the decay of memory traces in such networks as a function of network size, signal sparsity and integration time. Alternately, viewed purely from the perspective of CS, this work introduces a new ensemble of measurement matrices derived from dynamical systems, and provides a theoretical analysis of their asymptotic performance. 1

4 0.72104734 265 nips-2010-The LASSO risk: asymptotic results and real world examples

Author: Mohsen Bayati, José Pereira, Andrea Montanari

Abstract: We consider the problem of learning a coefficient vector x0 ∈ RN from noisy linear observation y = Ax0 + w ∈ Rn . In many contexts (ranging from model selection to image processing) it is desirable to construct a sparse estimator x. In this case, a popular approach consists in solving an ℓ1 -penalized least squares problem known as the LASSO or Basis Pursuit DeNoising (BPDN). For sequences of matrices A of increasing dimensions, with independent gaussian entries, we prove that the normalized risk of the LASSO converges to a limit, and we obtain an explicit expression for this limit. Our result is the first rigorous derivation of an explicit formula for the asymptotic mean square error of the LASSO for random instances. The proof technique is based on the analysis of AMP, a recently developed efficient algorithm, that is inspired from graphical models ideas. Through simulations on real data matrices (gene expression data and hospital medical records) we observe that these results can be relevant in a broad array of practical applications.

5 0.71090066 260 nips-2010-Sufficient Conditions for Generating Group Level Sparsity in a Robust Minimax Framework

Author: Hongbo Zhou, Qiang Cheng

Abstract: Regularization technique has become a principled tool for statistics and machine learning research and practice. However, in most situations, these regularization terms are not well interpreted, especially on how they are related to the loss function and data. In this paper, we propose a robust minimax framework to interpret the relationship between data and regularization terms for a large class of loss functions. We show that various regularization terms are essentially corresponding to different distortions to the original data matrix. This minimax framework includes ridge regression, lasso, elastic net, fused lasso, group lasso, local coordinate coding, multiple kernel learning, etc., as special cases. Within this minimax framework, we further give mathematically exact definition for a novel representation called sparse grouping representation (SGR), and prove a set of sufficient conditions for generating such group level sparsity. Under these sufficient conditions, a large set of consistent regularization terms can be designed. This SGR is essentially different from group lasso in the way of using class or group information, and it outperforms group lasso when there appears group label noise. We also provide some generalization bounds in a classification setting. 1

6 0.71016616 7 nips-2010-A Family of Penalty Functions for Structured Sparsity

7 0.70991081 92 nips-2010-Fast global convergence rates of gradient methods for high-dimensional statistical recovery

8 0.70714539 30 nips-2010-An Inverse Power Method for Nonlinear Eigenproblems with Applications in 1-Spectral Clustering and Sparse PCA

9 0.70505029 222 nips-2010-Random Walk Approach to Regret Minimization

10 0.70431364 148 nips-2010-Learning Networks of Stochastic Differential Equations

11 0.70101964 109 nips-2010-Group Sparse Coding with a Laplacian Scale Mixture Prior

12 0.70075667 31 nips-2010-An analysis on negative curvature induced by singularity in multi-layer neural-network learning

13 0.70047426 200 nips-2010-Over-complete representations on recurrent neural networks can support persistent percepts

14 0.69997293 221 nips-2010-Random Projections for $k$-means Clustering

15 0.69938177 5 nips-2010-A Dirty Model for Multi-task Learning

16 0.69884866 51 nips-2010-Construction of Dependent Dirichlet Processes based on Poisson Processes

17 0.6984697 122 nips-2010-Improving the Asymptotic Performance of Markov Chain Monte-Carlo by Inserting Vortices

18 0.69697666 136 nips-2010-Large-Scale Matrix Factorization with Missing Data under Additional Constraints

19 0.69590819 63 nips-2010-Distributed Dual Averaging In Networks

20 0.69446146 193 nips-2010-Online Learning: Random Averages, Combinatorial Parameters, and Learnability