nips nips2010 nips2010-259 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Benjamin Miller, Nadya Bliss, Patrick J. Wolfe
Abstract: When working with network datasets, the theoretical framework of detection theory for Euclidean vector spaces no longer applies. Nevertheless, it is desirable to determine the detectability of small, anomalous graphs embedded into background networks with known statistical properties. Casting the problem of subgraph detection in a signal processing context, this article provides a framework and empirical results that elucidate a “detection theory” for graph-valued data. Its focus is the detection of anomalies in unweighted, undirected graphs through L1 properties of the eigenvectors of the graph’s so-called modularity matrix. This metric is observed to have relatively low variance for certain categories of randomly-generated graphs, and to reveal the presence of an anomalous subgraph with reasonable reliability when the anomaly is not well-correlated with stronger portions of the background graph. An analysis of subgraphs in real network datasets confirms the efficacy of this approach. 1
Reference: text
sentIndex sentText sentNum sentScore
1 When working with network datasets, the theoretical framework of detection theory for Euclidean vector spaces no longer applies. [sent-10, score-0.2]
2 Nevertheless, it is desirable to determine the detectability of small, anomalous graphs embedded into background networks with known statistical properties. [sent-11, score-0.565]
3 Casting the problem of subgraph detection in a signal processing context, this article provides a framework and empirical results that elucidate a “detection theory” for graph-valued data. [sent-12, score-0.729]
4 Its focus is the detection of anomalies in unweighted, undirected graphs through L1 properties of the eigenvectors of the graph’s so-called modularity matrix. [sent-13, score-1.016]
5 This metric is observed to have relatively low variance for certain categories of randomly-generated graphs, and to reveal the presence of an anomalous subgraph with reasonable reliability when the anomaly is not well-correlated with stronger portions of the background graph. [sent-14, score-0.999]
6 An analysis of subgraphs in real network datasets confirms the efficacy of this approach. [sent-15, score-0.242]
7 1 Introduction A graph G = (V, E) denotes a collection of entities, represented by vertices V , along with some relationship between pairs, represented by edges E. [sent-16, score-0.27]
8 Due to this ubiquitous structure, graphs are used in a variety of applications, including the natural sciences, social network analysis, and engineering. [sent-17, score-0.194]
9 In this article we investigate the problem of detecting a small, dense subgraph embedded into an unweighted, undirected background. [sent-19, score-0.743]
10 We use L1 properties of the eigenvectors of the graph’s modularity matrix to determine the presence of an anomaly, and show empirically that this technique has reasonable power to detect a dense subgraph where lower connectivity would be expected. [sent-20, score-1.339]
11 In Section 4 we give an overview of the modularity matrix and observe how its eigenstructure plays a role in anomaly detection. [sent-23, score-0.493]
12 Sections 5 and 6 respectively detail subgraph detection results on simulated and actual network data, and in Section 7 we summarize and outline future research. [sent-24, score-0.712]
13 2 Related Work The area of anomaly detection has, in recent years, expanded to graph-based data [1, 2]. [sent-25, score-0.317]
14 The work of Noble and Cook [3] focuses on finding a subgraph that is dissimilar to a common substructure in the network. [sent-26, score-0.512]
15 Eberle and Holder [4] extend this work using the minimum description length heuristic to determine a “normative pattern” in the graph from which the anomalous subgraph deviates, basing 3 detection algorithms on this property. [sent-27, score-0.994]
16 This work, however, does not address the kind of anomaly we describe in Section 3; our background graphs may not have such a “normative pattern” that occurs over a significant amount of the graph. [sent-28, score-0.406]
17 Research into anomaly detection in dynamic graphs by Priebe et al. [5] uses the history of a node’s neighborhood to detect anomalous behavior, but this is not directly applicable to our detection of anomalies in static graphs. [sent-29, score-0.922]
18 There has been research on the use of eigenvectors of matrices derived from the graphs of interest to detect anomalies. [sent-30, score-0.475]
19 In [6] the angle of the principal eigenvector is tracked in a graph representing a computer system, and if the angle changes by more than some threshold, an anomaly is declared present. [sent-31, score-0.612]
20 Network anomalies are also dealt with in [7], but here it is assumed that each node in the network has some highly correlated time-domain input. [sent-32, score-0.196]
21 Also, we want to determine the detectability of small anomalies that may not have a significant impact on one or two principal eigenvectors. [sent-34, score-0.26]
22 There has been a significant amount of work on community detection through spectral properties of graphs [8, 9, 10]. [sent-35, score-0.36]
23 The approach taken here is similar to that of [11], in which graph anomalies are detected by way of eigenspace projections. [sent-37, score-0.322]
24 We here focus on smaller and more subtle subgraph anomalies that are not immediately revealed in a graph’s principal components. [sent-38, score-0.682]
25 3 Graph Anomalies As in [12, 11], we cast the problem of detecting a subgraph embedded in a background as one of detecting a signal in noise. [sent-39, score-0.783]
26 We then define the anomalous subgraph (the “signal”) GS = (VS , ES ) with VS ⊂ V . [sent-42, score-0.673]
27 The objective is then to evaluate the following binary hypothesis test, deciding between the null hypothesis H0 and the alternative hypothesis H1: H0: the observed graph is “noise” GB; H1: the observed graph is “signal+noise” GB ∪ GS. [sent-43, score-0.318]
28 The background graph GB is created by a graph generator, such as those outlined in [13], with a certain set of parameters. [sent-46, score-0.369]
29 We then create an anomalous “signal” graph GS to embed into the background. [sent-47, score-0.342]
30 We select the vertex subset VS from the set of vertices in the network and embed GS into GB by updating the edge set to be E ∪ ES . [sent-48, score-0.283]
31 We apply our detection algorithm to graphs with and without the embedding present to evaluate its performance. [sent-49, score-0.349]
32 4 The Modularity Matrix and its Eigenvectors Newman’s notion of the modularity matrix [8] associated with an unweighted, undirected graph G is given by B := A − KKᵀ/(2|E|). (1) [sent-50, score-0.498]
33 Here A = {aij} is the adjacency matrix of G, where aij is 1 if there is an edge between vertex i and vertex j and 0 otherwise, and K is the degree vector of G, where the ith component of K is the number of edges adjacent to vertex i. [sent-51, score-0.449]
34 If we assume that edges from one vertex are equally likely to be shared with all other vertices, then the modularity matrix is the difference between the “actual” and “expected” number of edges between each pair of vertices. [sent-52, score-0.528]
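To make equation (1) concrete, the following is a minimal sketch of the modularity matrix computation; the code and function name are illustrative rather than the authors' implementation, and a dense NumPy adjacency matrix is assumed.

```python
# Minimal sketch of Newman's modularity matrix B = A - K K^T / (2|E|)
# for an unweighted, undirected graph (assumed dense 0/1 adjacency).
import numpy as np

def modularity_matrix(A: np.ndarray) -> np.ndarray:
    """A: symmetric 0/1 adjacency matrix with zero diagonal."""
    K = A.sum(axis=1)        # degree vector; K[i] = degree of vertex i
    two_m = K.sum()          # sum of degrees equals 2|E|
    return A - np.outer(K, K) / two_m
```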
35 Figure 1: Scatterplots of an R-MAT generated graph projected into spaces spanned by two eigenvectors of its modularity matrix, with each point representing a vertex. [sent-53, score-0.714]
36 The graph with no embedding (a) and with an embedded 8-vertex clique (b) look the same in the principal components, but the embedding is visible in the eigenvectors corresponding to the 18th and 21st largest eigenvalues (c). [sent-54, score-0.798]
37 Since B is real and symmetric, it admits the eigendecomposition B = UΛUᵀ, where U ∈ ℝ^(|V|×|V|) is a matrix whose columns are eigenvectors of B, and Λ is a diagonal matrix of eigenvalues. [sent-56, score-0.295]
38 We denote by λi, 1 ≤ i ≤ |V|, the eigenvalues of B, where λi ≥ λi+1 for all i, and by ui the unit-magnitude eigenvector corresponding to λi. [sent-57, score-0.287]
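The descending-eigenvalue convention amounts to sorting the output of a symmetric eigensolver; a sketch with our own naming, building on the block above:

```python
# Sketch: eigendecomposition B = U Lambda U^T with eigenvalues sorted
# in descending order, matching the convention lambda_i >= lambda_{i+1}.
import numpy as np

def sorted_eig(B: np.ndarray):
    lam, U = np.linalg.eigh(B)       # ascending order for symmetric B
    order = np.argsort(lam)[::-1]    # reverse to descending
    return lam[order], U[:, order]
```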
39 Newman analyzed the eigenvalues of the modularity matrix to determine if the graph can be split into two separate communities. [sent-58, score-0.546]
40 As demonstrated in [11], analysis of the principal eigenvectors of B can also reveal the presence of a small, tightly-connected component embedded in a large graph. [sent-59, score-0.488]
41 Figure 1(a) demonstrates the projection of an R-MAT Kronecker graph [15] into the principal components of its modularity matrix. [sent-61, score-0.514]
42 Figure 1(b) demonstrates an 8-vertex clique embedded into the same background graph. [sent-63, score-0.273]
43 The foreground vertices are not at all separated from the background vertices, and the symmetry of the projection has not changed (implying no change in the test statistic). [sent-65, score-0.203]
44 Considering only this subspace, the subgraph of interest cannot be detected reliably; its inward connectivity is not strong enough to stand out in the two principal eigenvectors. [sent-66, score-0.713]
45 The fact that the subgraph is absorbed into the background in the space of u1 and u2 , however, does not imply that it is inseparable in general; only in the subspace with the highest variance. [sent-67, score-0.607]
46 Borrowing language from signal processing, there may be another “channel” in which the anomalous signal subgraph can be separated from the background noise. [sent-68, score-0.85]
47 There is in fact a space spanned by two eigenvectors in which the 8-vertex clique stands out: in the space of u18 and u21, the two eigenvectors with the largest components in the rows corresponding to VS, the subgraph is clearly separable from the background, as shown in Figure 1(c). [sent-69, score-1.162]
48 4.1 Eigenvector L1 Norms The subgraph detection technique we propose here is based on L1 properties of the eigenvectors of the graph’s modularity matrix, where the L1 norm of a vector x = [x1 · · · xN]ᵀ is ‖x‖₁ := Σᵢ₌₁ᴺ |xᵢ|. [sent-71, score-1.272]
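Since the eigenvectors are the columns of U, all |V| of these L1 norms can be computed in one pass; a sketch reusing the helpers defined above:

```python
# Sketch: ||u_i||_1 for every eigenvector of the modularity matrix,
# using modularity_matrix and sorted_eig from the earlier sketches.
import numpy as np

def l1_profile(A: np.ndarray) -> np.ndarray:
    _, U = sorted_eig(modularity_matrix(A))
    return np.abs(U).sum(axis=0)     # column-wise L1 norms
```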
49 If there is a subgraph GS that is significantly different from its expectation, this will manifest itself in the modularity matrix. Figure 2: L1 analysis of modularity matrix eigenvectors. [sent-79, score-1.122]
50 The subgraph GS has a set of vertices VS , which is associated with a set of indices corresponding to rows and columns of the adjacency matrix A. [sent-83, score-0.633]
51 For any S ⊆ V and v ∈ V , let dS (v) denote the number of edges between the vertex v and the vertex set S. [sent-85, score-0.258]
52 Letting each subgraph vertex have uniform internal and external degree, this ratio approaches 1 as Σ_{v∉VS} (dVS(v) − d(v)d(VS)/d(V))² is dominated by Σ_{v∈VS} (dVS(v) − d(v)d(VS)/d(V))². [sent-91, score-0.682]
53 This suggests that if VS is much more dense than a typical subset of background vertices, x is likely to be well-correlated with an eigenvector of B. [sent-92, score-0.348]
54 2 Null Model Characterization To examine the L1 behavior of the modularity matrix’s eigenvectors, we performed the following experiment. [sent-97, score-0.286]
55 Using the R-MAT generator we created 10,000 graphs with 1024 vertices, an average degree of 6 (the result being an average degree of about 12 since we make the graph undirected), and a fixed probability matrix. [sent-98, score-0.456]
56 For each graph, we compute the modularity matrix B and its eigendecomposition. [sent-103, score-0.324]
57 After compiling background data, we computed the mean and standard deviation of the L1 norms for each ui . [sent-108, score-0.256]
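A sketch of this Monte Carlo characterization; as an assumption, an Erdős–Rényi generator stands in for R-MAT so the example is self-contained, and the trial count is reduced from 10,000 to 100 for speed:

```python
# Sketch of the null-model characterization: per-index mean and standard
# deviation of eigenvector L1 norms over random background graphs.
import numpy as np

rng = np.random.default_rng(0)

def sample_background(n: int = 1024, avg_degree: float = 12.0) -> np.ndarray:
    # Erdos-Renyi G(n, p) stand-in for the paper's R-MAT generator.
    p = avg_degree / (n - 1)
    A = (rng.random((n, n)) < p).astype(float)
    A = np.triu(A, 1)
    return A + A.T                   # symmetric, zero diagonal

trials = np.stack([l1_profile(sample_background()) for _ in range(100)])
mean_l1, std_l1 = trials.mean(axis=0), trials.std(axis=0)
```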
58 Using the R-MAT graph with the embedded 8-vertex clique, we observed eigenvector L1 norms as shown in Figure 2(b). [sent-110, score-0.523]
59 The vast majority of eigenvectors have L1 norms close to the mean for the associated index. [sent-112, score-0.407]
60 This suggests that the community formation inherent in the R-MAT generator creates components strongly associated with the eigenvectors with larger eigenvalues. [sent-115, score-0.399]
61 Note that u18 is the horizontal axis in Figure 1(c), which by itself provides significant separation between the subgraph and the background. [sent-117, score-0.512]
62 Given a graph G, compute the eigendecomposition of its modularity matrix. [sent-120, score-0.449]
63 If any of these modified L1 norms is less than a certain threshold (since the embedding makes the L1 norm smaller), H1 is declared, and H0 is declared otherwise. [sent-122, score-0.29]
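A sketch of this decision rule, interpreting the "modified L1 norms" as norms standardized against the null-model statistics above; the threshold parameter k is our assumption, not a value from the paper:

```python
# Sketch: declare H1 if any eigenvector's L1 norm falls more than k
# standard deviations below its null-model mean (an embedding shrinks
# the L1 norm, so only the lower tail matters).
import numpy as np

def detect(A: np.ndarray, mean_l1: np.ndarray, std_l1: np.ndarray,
           k: float = 3.0) -> bool:
    z = (l1_profile(A) - mean_l1) / std_l1
    return bool((z < -k).any())      # True means "declare H1"
```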
64 While the modularity matrix is not sparse, it is the sum of a sparse matrix and a rank-one matrix, so we can still compute its eigenvalues efficiently, as mentioned in [8]. [sent-129, score-0.411]
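One way to exploit that structure is a matrix-free operator whose matvec applies the sparse term and the rank-one correction separately; a sketch using SciPy (our code, not the paper's):

```python
# Sketch: leading eigenpairs of B = A - K K^T / (2|E|) without forming
# B densely, exploiting the sparse-plus-rank-one structure noted above.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, eigsh

def modularity_eigs(A: sp.csr_matrix, k: int = 10):
    Kdeg = np.asarray(A.sum(axis=1)).ravel()
    two_m = Kdeg.sum()
    n = A.shape[0]
    mv = lambda x: A @ x - Kdeg * (Kdeg @ x) / two_m
    B_op = LinearOperator((n, n), matvec=mv, dtype=float)
    return eigsh(B_op, k=k, which='LA')   # k largest algebraic eigenpairs
```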
65 For each simulation, a subgraph density of 70%, 80%, 90% or 100% is chosen. [sent-133, score-0.537]
66 The subgraph is created by, uniformly at random, selecting the chosen proportion of the (8 choose 2) = 28 possible edges. [sent-135, score-0.512]
67 To determine where to embed the subgraph into the background, we find all vertices with at most 1, 3 or 5 edges and select 8 of these at random. [sent-136, score-0.725]
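A sketch of this embedding step, with our own function name and signature:

```python
# Sketch of the embedding: pick 8 low-degree vertices and add a random
# subset of the 28 possible internal edges at the chosen density.
import numpy as np

def embed_subgraph(A: np.ndarray, density: float, max_bg_degree: int,
                   size: int = 8, seed: int = 1):
    rng = np.random.default_rng(seed)
    low = np.flatnonzero(A.sum(axis=1) <= max_bg_degree)
    vs = rng.choice(low, size=size, replace=False)
    pairs = [(vs[i], vs[j]) for i in range(size) for j in range(i + 1, size)]
    chosen = rng.choice(len(pairs), int(round(density * len(pairs))),
                        replace=False)
    A = A.copy()
    for idx in chosen:
        i, j = pairs[idx]
        A[i, j] = A[j, i] = 1.0
    return A, vs
```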
68 In Figure 3(a), each vertex of the subgraph has 1 edge adjacent to the background. [sent-142, score-0.616]
69 In this case the subgraph connectivity is overwhelmingly inward, and the ROC curve reflects this. [sent-143, score-0.553]
70 When the external degree is increased so that a subgraph vertex may have up to 3 edges adjacent to the background, we see a decline in detection performance as shown in Figure 3(b). [sent-145, score-0.929]
71 Figure 3(c) demonstrates the additional decrease in detection performance when the external subgraph connectivity is increased again, to as much as 5 edges per vertex. [sent-146, score-0.85]
72 Figure 3: ROC curves for the detection of 8-vertex subgraphs in a 1024-vertex R-MAT background. [sent-147, score-0.338]
73 Performance is shown for subgraphs of varying density when each foreground vertex is connected to the background by up to 1, 3 and 5 edges in (a), (b) and (c), respectively. [sent-148, score-0.489]
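A sketch of how such ROC curves can be traced empirically from a scalar test statistic (for instance, the minimum standardized L1 norm) computed on graphs simulated under each hypothesis:

```python
# Sketch: empirical ROC curve from a scalar test statistic under H0 and
# H1; smaller values indicate an anomaly, so the threshold sweeps from
# below (pfa = false-alarm probability, pd = detection probability).
import numpy as np

def empirical_roc(stats_h0: np.ndarray, stats_h1: np.ndarray):
    thresholds = np.sort(np.concatenate([stats_h0, stats_h1]))
    pfa = np.array([(stats_h0 <= t).mean() for t in thresholds])
    pd = np.array([(stats_h1 <= t).mean() for t in thresholds])
    return pfa, pd    # plot pd versus pfa for the ROC curve
```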
74 For each graph, we compute the top 110 eigenvectors of the modularity matrix and the L1 norm of each. [sent-153, score-0.65]
75 Comparing each eigenvector’s L1 norm to a smoothed (i.e., low-pass filtered) version, we choose the two eigenvectors that deviate the most from this trend, except in the case of Slashdot, where there is only one significant deviation. [sent-156, score-0.291]
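One plausible realization of this trend comparison, with a moving-average filter standing in for the unspecified low-pass filter and an assumed window length:

```python
# Sketch: compare the L1-norm sequence to a moving-average trend and
# return the indices of the eigenvectors that fall furthest below it;
# the window length is an assumed parameter, not the paper's.
import numpy as np

def most_deviant(l1_norms: np.ndarray, window: int = 9, k: int = 2):
    kernel = np.ones(window) / window
    trend = np.convolve(l1_norms, kernel, mode='same')  # edge effects at ends
    return np.argsort(l1_norms - trend)[:k]   # most negative deviations
```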
76 Plots of the L1 norms and scatterplots in the space of the two eigenvectors that deviate most are shown in Figure 4. [sent-157, score-0.479]
77 Note that, with the exception of asOregon, we see a similar trend in these networks to the one in the R-MAT simulations, with the L1 norms decreasing as the eigenvalues increase (the L1 trend in asOregon is fairly flat). [sent-159, score-0.247]
78 Also, with the exception of Slashdot, each dataset has a few eigenvectors with much smaller norms than those with similar eigenvalues (Slashdot decreases gradually, with one sharp drop at the maximum eigenvalue). [sent-160, score-0.456]
79 The subgraphs detected by L1 analysis are presented in Table 1. [sent-161, score-0.263]
80 Two subgraphs are chosen for each dataset, corresponding to the highlighted points in the scatterplots in Figure 4. [sent-162, score-0.262]
81 For each subgraph we list the size (number of vertices), density (internal degree divided by the maximum number of edges), external degree, and the eigenvector that separates it from the background. [sent-163, score-0.845]
82 To determine whether a detected subgraph is anomalous with respect to the rest of the graph, we sample the network and compare the sample graphs to the detected subgraphs in terms of density and external degree. [sent-165, score-1.33]
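A sketch of the two comparison statistics for a vertex subset:

```python
# Sketch: internal density and external degree of a vertex subset, the
# two statistics used to compare detected subgraphs with network samples.
import numpy as np

def subset_stats(A: np.ndarray, vs: np.ndarray):
    sub = A[np.ix_(vs, vs)]
    n = len(vs)
    density = sub.sum() / (n * (n - 1))   # fraction of possible internal edges
    external = A[vs].sum() - sub.sum()    # edges leaving the subset
    return density, external
```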
83 Note that the thresholds are set so that the detected subgraphs comfortably meet them. [sent-170, score-0.323]
84 For the Slashdot dataset, no sample was nearly as dense as the two subgraphs we selected by thresholding along the principal eigenvector. [sent-173, score-0.308]
85 Upon further inspection, those remaining are either correlated with another eigenvector that deviates from the overall L1 trend, or correlated with multiple eigenvectors, as we discuss in the next section. [sent-176, score-0.284]
86 Table 1 lists, for each dataset, the eigenvector separating the two detected subgraphs (the size, density, and external degree columns did not survive extraction): Epinions, u36 and u45; AstroPh, u57 and u106; CondMat, u29 and u36; asOregon, u6 and u32; Slashdot, both by thresholding u1. [sent-178, score-0.705]
87 The scatterplot looks similar to one in which the subgraph is detectable, but is rotated. [sent-182, score-0.664]
88 7 Conclusion In this article we have demonstrated the efficacy of using eigenvector L1 norms of a graph’s modularity matrix to detect small, dense anomalous subgraphs embedded in a background. [sent-183, score-1.191]
89 Casting the problem of subgraph detection in a signal processing context, we have provided the intuition behind the utility of this approach, and empirically demonstrated its effectiveness on a concrete example: detection of a dense subgraph embedded into a graph generated using known parameters. [sent-184, score-1.635]
90 In real network data we observe trends similar to those in simulation, and examine outliers to see which subgraphs are detected in real-world datasets. [sent-185, score-0.315]
91 Future research will include the expansion of this technique to reliably detect subgraphs that can be separated from the background in the space of a small number of eigenvectors, but not necessarily one. [sent-186, score-0.327]
92 While the L1 norm itself can indicate the presence of an embedding, it requires the subgraph to be highly correlated with a single eigenvector. [sent-187, score-0.612]
93 Figure 5 demonstrates a case where considering multiple eigenvectors at once would likely improve detection performance. [sent-188, score-0.472]
94 The scatterplot in this figure looks similar to the one in Figure 1(c), but is rotated such that the subgraph is equally aligned with the two eigenvectors into which the matrix has been projected. [sent-189, score-1.02]
95 Other future work will focus on developing detectability bounds, the application of which would be useful when developing detection methods like the algorithm outlined here. [sent-192, score-0.202]
96 Faloutsos, “Neighborhood formation and anomaly detection in bipartite graphs,” in Proc. [sent-199, score-0.342]
97 Kashima, “Eigenspace-based anomaly detection in computer systems,” in Proc. [sent-239, score-0.317]
98 Fujimaki, “Network anomaly detection based on eigen equation compression,” in Proc. [sent-246, score-0.317]
99 Newman, “Finding community structure in networks using the eigenvectors of matrices,” Phys. [sent-252, score-0.333]
100 Jordan, “Nonnegative matrix factorization for combinatorial optimization: Spectral clustering, graph matching, and clique finding,” in Proc. [sent-317, score-0.243]
wordName wordTfidf (topN-words)
[('subgraph', 0.512), ('eigenvectors', 0.291), ('modularity', 0.286), ('eigenvector', 0.193), ('subgraphs', 0.19), ('slashdot', 0.181), ('anomaly', 0.169), ('anomalous', 0.161), ('detection', 0.148), ('vs', 0.144), ('graphs', 0.142), ('graph', 0.137), ('asoregon', 0.127), ('scatterplot', 0.127), ('norms', 0.116), ('anomalies', 0.112), ('epinions', 0.109), ('faloutsos', 0.109), ('vertex', 0.104), ('gs', 0.099), ('background', 0.095), ('astroph', 0.091), ('condmat', 0.091), ('dvs', 0.091), ('vertices', 0.083), ('int', 0.081), ('embedded', 0.077), ('detected', 0.073), ('scatterplots', 0.072), ('gb', 0.068), ('clique', 0.068), ('external', 0.066), ('bx', 0.064), ('dense', 0.06), ('embedding', 0.059), ('leskovec', 0.058), ('principal', 0.058), ('declared', 0.055), ('detectability', 0.054), ('mining', 0.054), ('newman', 0.052), ('network', 0.052), ('edges', 0.05), ('degree', 0.049), ('eigenvalues', 0.049), ('wolfe', 0.048), ('ui', 0.045), ('embed', 0.044), ('chakrabarti', 0.044), ('null', 0.044), ('detect', 0.042), ('community', 0.042), ('signal', 0.041), ('connectivity', 0.041), ('generator', 0.041), ('trend', 0.041), ('matrix', 0.038), ('undirected', 0.037), ('anomalously', 0.036), ('bliss', 0.036), ('densi', 0.036), ('diameters', 0.036), ('eberle', 0.036), ('eigs', 0.036), ('parenthetical', 0.036), ('priebe', 0.036), ('shinking', 0.036), ('determine', 0.036), ('meet', 0.036), ('norm', 0.035), ('presence', 0.033), ('demonstrates', 0.033), ('communities', 0.032), ('correlated', 0.032), ('noble', 0.032), ('holder', 0.032), ('lexington', 0.032), ('lincoln', 0.032), ('axes', 0.031), ('unweighted', 0.031), ('ds', 0.03), ('reveal', 0.029), ('cook', 0.029), ('normative', 0.029), ('inward', 0.029), ('detecting', 0.029), ('article', 0.028), ('spectral', 0.028), ('deviates', 0.027), ('aligned', 0.027), ('roc', 0.026), ('kleinberg', 0.026), ('eigendecomposition', 0.026), ('density', 0.025), ('casting', 0.025), ('formation', 0.025), ('foreground', 0.025), ('looks', 0.025), ('threshold', 0.025), ('thresholds', 0.024)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 259 nips-2010-Subgraph Detection Using Eigenvector L1 Norms
Author: Benjamin Miller, Nadya Bliss, Patrick J. Wolfe
Abstract: When working with network datasets, the theoretical framework of detection theory for Euclidean vector spaces no longer applies. Nevertheless, it is desirable to determine the detectability of small, anomalous graphs embedded into background networks with known statistical properties. Casting the problem of subgraph detection in a signal processing context, this article provides a framework and empirical results that elucidate a “detection theory” for graph-valued data. Its focus is the detection of anomalies in unweighted, undirected graphs through L1 properties of the eigenvectors of the graph’s so-called modularity matrix. This metric is observed to have relatively low variance for certain categories of randomly-generated graphs, and to reveal the presence of an anomalous subgraph with reasonable reliability when the anomaly is not well-correlated with stronger portions of the background graph. An analysis of subgraphs in real network datasets confirms the efficacy of this approach. 1
Author: Matthias Hein, Thomas Bühler
Abstract: Many problems in machine learning and statistics can be formulated as (generalized) eigenproblems. In terms of the associated optimization problem, computing linear eigenvectors amounts to finding critical points of a quadratic function subject to quadratic constraints. In this paper we show that a certain class of constrained optimization problems with nonquadratic objective and constraints can be understood as nonlinear eigenproblems. We derive a generalization of the inverse power method which is guaranteed to converge to a nonlinear eigenvector. We apply the inverse power method to 1-spectral clustering and sparse PCA which can naturally be formulated as nonlinear eigenproblems. In both applications we achieve state-of-the-art results in terms of solution quality and runtime. Moving beyond the standard eigenproblem should be useful also in many other applications and our inverse power method can be easily adapted to new problems. 1
3 0.11355907 44 nips-2010-Brain covariance selection: better individual functional connectivity models using population prior
Author: Gael Varoquaux, Alexandre Gramfort, Jean-baptiste Poline, Bertrand Thirion
Abstract: Spontaneous brain activity, as observed in functional neuroimaging, has been shown to display reproducible structure that expresses brain architecture and carries markers of brain pathologies. An important view of modern neuroscience is that such large-scale structure of coherent activity reflects modularity properties of brain connectivity graphs. However, to date, there has been no demonstration that the limited and noisy data available in spontaneous activity observations could be used to learn full-brain probabilistic models that generalize to new data. Learning such models entails two main challenges: i) modeling full brain connectivity is a difficult estimation problem that faces the curse of dimensionality and ii) variability between subjects, coupled with the variability of functional signals between experimental runs, makes the use of multiple datasets challenging. We describe subject-level brain functional connectivity structure as a multivariate Gaussian process and introduce a new strategy to estimate it from group data, by imposing a common structure on the graphical model in the population. We show that individual models learned from functional Magnetic Resonance Imaging (fMRI) data using this population prior generalize better to unseen data than models based on alternative regularization schemes. To our knowledge, this is the first report of a cross-validated model of spontaneous brain activity. Finally, we use the estimated graphical model to explore the large-scale characteristics of functional architecture and show for the first time that known cognitive networks appear as the integrated communities of functional connectivity graph. 1
4 0.113244 162 nips-2010-Link Discovery using Graph Feature Tracking
Author: Emile Richard, Nicolas Baskiotis, Theodoros Evgeniou, Nicolas Vayatis
Abstract: We consider the problem of discovering links of an evolving undirected graph given a series of past snapshots of that graph. The graph is observed through the time sequence of its adjacency matrix and only the presence of edges is observed. The absence of an edge on a certain snapshot cannot be distinguished from a missing entry in the adjacency matrix. Additional information can be provided by examining the dynamics of the graph through a set of topological features, such as the degrees of the vertices. We develop a novel methodology by building on both static matrix completion methods and the estimation of the future state of relevant graph features. Our procedure relies on the formulation of an optimization problem which can be approximately solved by a fast alternating linearized algorithm whose properties are examined. We show experiments with both simulated and real data which reveal the interest of our methodology. 1
5 0.11296619 117 nips-2010-Identifying graph-structured activation patterns in networks
Author: James Sharpnack, Aarti Singh
Abstract: We consider the problem of identifying an activation pattern in a complex, large-scale network that is embedded in very noisy measurements. This problem is relevant to several applications, such as identifying traces of a biochemical spread by a sensor network, expression levels of genes, and anomalous activity or congestion in the Internet. Extracting such patterns is a challenging task especially if the network is large (pattern is very high-dimensional) and the noise is so excessive that it masks the activity at any single node. However, typically there are statistical dependencies in the network activation process that can be leveraged to fuse the measurements of multiple nodes and enable reliable extraction of high-dimensional noisy patterns. In this paper, we analyze an estimator based on the graph Laplacian eigenbasis, and establish the limits of mean square error recovery of noisy patterns arising from a probabilistic (Gaussian or Ising) model based on an arbitrary graph structure. We consider both deterministic and probabilistic network evolution models, and our results indicate that by leveraging the network interaction structure, it is possible to consistently recover high-dimensional patterns even when the noise variance increases with network size. 1
6 0.085979432 204 nips-2010-Penalized Principal Component Regression on Graphs for Analysis of Subnetworks
7 0.077820987 254 nips-2010-Stability Approach to Regularization Selection (StARS) for High Dimensional Graphical Models
8 0.070323467 288 nips-2010-Worst-case bounds on the quality of max-product fixed-points
9 0.070019729 85 nips-2010-Exact learning curves for Gaussian process regression on large random graphs
10 0.069943443 105 nips-2010-Getting lost in space: Large sample analysis of the resistance distance
11 0.069812015 230 nips-2010-Robust Clustering as Ensembles of Affinity Relations
12 0.068798713 63 nips-2010-Distributed Dual Averaging In Networks
13 0.058344182 181 nips-2010-Network Flow Algorithms for Structured Sparsity
14 0.055088259 150 nips-2010-Learning concept graphs from text with stick-breaking priors
15 0.052883469 148 nips-2010-Learning Networks of Stochastic Differential Equations
16 0.051501561 165 nips-2010-MAP estimation in Binary MRFs via Bipartite Multi-cuts
17 0.050672047 91 nips-2010-Fast detection of multiple change-points shared by many signals using group LARS
18 0.048888233 258 nips-2010-Structured sparsity-inducing norms through submodular functions
19 0.045358133 198 nips-2010-Optimal Web-Scale Tiering as a Flow Problem
20 0.044865195 210 nips-2010-Practical Large-Scale Optimization for Max-norm Regularization
topicId topicWeight
[(0, 0.129), (1, 0.052), (2, -0.0), (3, 0.098), (4, -0.026), (5, -0.13), (6, 0.02), (7, -0.074), (8, 0.049), (9, 0.032), (10, -0.043), (11, 0.003), (12, 0.029), (13, 0.025), (14, 0.08), (15, -0.061), (16, 0.046), (17, -0.094), (18, -0.076), (19, 0.06), (20, 0.056), (21, 0.038), (22, -0.138), (23, 0.06), (24, 0.096), (25, -0.006), (26, 0.127), (27, 0.028), (28, 0.04), (29, 0.097), (30, 0.085), (31, 0.026), (32, 0.012), (33, -0.072), (34, -0.058), (35, -0.047), (36, -0.073), (37, -0.004), (38, 0.003), (39, 0.026), (40, -0.09), (41, 0.008), (42, 0.057), (43, -0.108), (44, 0.015), (45, -0.018), (46, 0.042), (47, -0.024), (48, -0.065), (49, 0.05)]
simIndex simValue paperId paperTitle
same-paper 1 0.96324086 259 nips-2010-Subgraph Detection Using Eigenvector L1 Norms
Author: Benjamin Miller, Nadya Bliss, Patrick J. Wolfe
Abstract: When working with network datasets, the theoretical framework of detection theory for Euclidean vector spaces no longer applies. Nevertheless, it is desirable to determine the detectability of small, anomalous graphs embedded into background networks with known statistical properties. Casting the problem of subgraph detection in a signal processing context, this article provides a framework and empirical results that elucidate a “detection theory” for graph-valued data. Its focus is the detection of anomalies in unweighted, undirected graphs through L1 properties of the eigenvectors of the graph’s so-called modularity matrix. This metric is observed to have relatively low variance for certain categories of randomly-generated graphs, and to reveal the presence of an anomalous subgraph with reasonable reliability when the anomaly is not well-correlated with stronger portions of the background graph. An analysis of subgraphs in real network datasets confirms the efficacy of this approach. 1
2 0.76005387 204 nips-2010-Penalized Principal Component Regression on Graphs for Analysis of Subnetworks
Author: Ali Shojaie, George Michailidis
Abstract: Network models are widely used to capture interactions among components of complex systems, such as social and biological ones. To understand their behavior, it is often necessary to analyze functionally related components of the system, corresponding to subsystems. Therefore, the analysis of subnetworks may provide additional insight into the behavior of the system, not evident from individual components. We propose a novel approach for incorporating available network information into the analysis of arbitrary subnetworks. The proposed method offers an efficient dimension reduction strategy using Laplacian eigenmaps with Neumann boundary conditions, and provides a flexible inference framework for analysis of subnetworks, based on a group-penalized principal component regression model on graphs. Asymptotic properties of the proposed inference method, as well as the choice of the tuning parameter for control of the false positive rate are discussed in high dimensional settings. The performance of the proposed methodology is illustrated using simulated and real data examples from biology. 1
3 0.72366863 105 nips-2010-Getting lost in space: Large sample analysis of the resistance distance
Author: Ulrike V. Luxburg, Agnes Radl, Matthias Hein
Abstract: The commute distance between two vertices in a graph is the expected time it takes a random walk to travel from the first to the second vertex and back. We study the behavior of the commute distance as the size of the underlying graph increases. We prove that the commute distance converges to an expression that does not take into account the structure of the graph at all and that is completely meaningless as a distance function on the graph. Consequently, the use of the raw commute distance for machine learning purposes is strongly discouraged for large graphs and in high dimensions. As an alternative we introduce the amplified commute distance that corrects for the undesired large sample effects. 1
4 0.71479559 117 nips-2010-Identifying graph-structured activation patterns in networks
Author: James Sharpnack, Aarti Singh
Abstract: We consider the problem of identifying an activation pattern in a complex, large-scale network that is embedded in very noisy measurements. This problem is relevant to several applications, such as identifying traces of a biochemical spread by a sensor network, expression levels of genes, and anomalous activity or congestion in the Internet. Extracting such patterns is a challenging task especially if the network is large (pattern is very high-dimensional) and the noise is so excessive that it masks the activity at any single node. However, typically there are statistical dependencies in the network activation process that can be leveraged to fuse the measurements of multiple nodes and enable reliable extraction of high-dimensional noisy patterns. In this paper, we analyze an estimator based on the graph Laplacian eigenbasis, and establish the limits of mean square error recovery of noisy patterns arising from a probabilistic (Gaussian or Ising) model based on an arbitrary graph structure. We consider both deterministic and probabilistic network evolution models, and our results indicate that by leveraging the network interaction structure, it is possible to consistently recover high-dimensional patterns even when the noise variance increases with network size. 1
Author: Matthias Hein, Thomas Bühler
Abstract: Many problems in machine learning and statistics can be formulated as (generalized) eigenproblems. In terms of the associated optimization problem, computing linear eigenvectors amounts to finding critical points of a quadratic function subject to quadratic constraints. In this paper we show that a certain class of constrained optimization problems with nonquadratic objective and constraints can be understood as nonlinear eigenproblems. We derive a generalization of the inverse power method which is guaranteed to converge to a nonlinear eigenvector. We apply the inverse power method to 1-spectral clustering and sparse PCA which can naturally be formulated as nonlinear eigenproblems. In both applications we achieve state-of-the-art results in terms of solution quality and runtime. Moving beyond the standard eigenproblem should be useful also in many other applications and our inverse power method can be easily adapted to new problems. 1
6 0.6043669 254 nips-2010-Stability Approach to Regularization Selection (StARS) for High Dimensional Graphical Models
7 0.5913291 85 nips-2010-Exact learning curves for Gaussian process regression on large random graphs
8 0.56542003 162 nips-2010-Link Discovery using Graph Feature Tracking
9 0.54932582 190 nips-2010-On the Convexity of Latent Social Network Inference
10 0.43906796 63 nips-2010-Distributed Dual Averaging In Networks
11 0.41685107 234 nips-2010-Segmentation as Maximum-Weight Independent Set
12 0.40649033 289 nips-2010-b-Bit Minwise Hashing for Estimating Three-Way Similarities
13 0.40309274 181 nips-2010-Network Flow Algorithms for Structured Sparsity
14 0.39778063 150 nips-2010-Learning concept graphs from text with stick-breaking priors
15 0.38948417 87 nips-2010-Extended Bayesian Information Criteria for Gaussian Graphical Models
16 0.38020974 200 nips-2010-Over-complete representations on recurrent neural networks can support persistent percepts
17 0.37906507 44 nips-2010-Brain covariance selection: better individual functional connectivity models using population prior
18 0.37389377 230 nips-2010-Robust Clustering as Ensembles of Affinity Relations
19 0.36824429 255 nips-2010-Static Analysis of Binary Executables Using Structural SVMs
20 0.35665494 148 nips-2010-Learning Networks of Stochastic Differential Equations
topicId topicWeight
[(13, 0.529), (17, 0.012), (27, 0.046), (30, 0.037), (35, 0.016), (45, 0.128), (50, 0.019), (52, 0.027), (60, 0.016), (77, 0.059), (90, 0.025)]
simIndex simValue paperId paperTitle
same-paper 1 0.89360476 259 nips-2010-Subgraph Detection Using Eigenvector L1 Norms
Author: Benjamin Miller, Nadya Bliss, Patrick J. Wolfe
Abstract: When working with network datasets, the theoretical framework of detection theory for Euclidean vector spaces no longer applies. Nevertheless, it is desirable to determine the detectability of small, anomalous graphs embedded into background networks with known statistical properties. Casting the problem of subgraph detection in a signal processing context, this article provides a framework and empirical results that elucidate a “detection theory” for graph-valued data. Its focus is the detection of anomalies in unweighted, undirected graphs through L1 properties of the eigenvectors of the graph’s so-called modularity matrix. This metric is observed to have relatively low variance for certain categories of randomly-generated graphs, and to reveal the presence of an anomalous subgraph with reasonable reliability when the anomaly is not well-correlated with stronger portions of the background graph. An analysis of subgraphs in real network datasets confirms the efficacy of this approach. 1
2 0.8355059 192 nips-2010-Online Classification with Specificity Constraints
Author: Andrey Bernstein, Shie Mannor, Nahum Shimkin
Abstract: We consider the online binary classification problem, where we are given m classifiers. At each stage, the classifiers map the input to the probability that the input belongs to the positive class. An online classification meta-algorithm is an algorithm that combines the outputs of the classifiers in order to attain a certain goal, without having prior knowledge on the form and statistics of the input, and without prior knowledge on the performance of the given classifiers. In this paper, we use sensitivity and specificity as the performance metrics of the meta-algorithm. In particular, our goal is to design an algorithm that satisfies the following two properties (asymptotically): (i) its average false positive rate (fp-rate) is under some given threshold; and (ii) its average true positive rate (tp-rate) is not worse than the tp-rate of the best convex combination of the m given classifiers that satisfies the fp-rate constraint, in hindsight. We show that this problem is in fact a special case of the regret minimization problem with constraints, and therefore the above goal is not attainable. Hence, we pose a relaxed goal and propose a corresponding practical online learning meta-algorithm that attains it. In the case of two classifiers, we show that this algorithm takes a very simple form. To our best knowledge, this is the first algorithm that addresses the problem of the average tp-rate maximization under average fp-rate constraints in the online setting. 1
3 0.80831999 45 nips-2010-CUR from a Sparse Optimization Viewpoint
Author: Jacob Bien, Ya Xu, Michael W. Mahoney
Abstract: The CUR decomposition provides an approximation of a matrix X that has low reconstruction error and that is sparse in the sense that the resulting approximation lies in the span of only a few columns of X. In this regard, it appears to be similar to many sparse PCA methods. However, CUR takes a randomized algorithmic approach, whereas most sparse PCA methods are framed as convex optimization problems. In this paper, we try to understand CUR from a sparse optimization viewpoint. We show that CUR is implicitly optimizing a sparse regression objective and, furthermore, cannot be directly cast as a sparse PCA method. We also observe that the sparsity attained by CUR possesses an interesting structure, which leads us to formulate a sparse PCA method that achieves a CUR-like sparsity.
4 0.77238953 146 nips-2010-Learning Multiple Tasks using Manifold Regularization
Author: Arvind Agarwal, Samuel Gerber, Hal Daume
Abstract: We present a novel method for multitask learning (MTL) based on manifold regularization: assume that all task parameters lie on a manifold. This is the generalization of a common assumption made in the existing literature: task parameters share a common linear subspace. One proposed method uses the projection distance from the manifold to regularize the task parameters. The manifold structure and the task parameters are learned using an alternating optimization framework. When the manifold structure is fixed, our method decomposes across tasks which can be learnt independently. An approximation of the manifold regularization scheme is presented that preserves the convexity of the single task learning problem, and makes the proposed MTL framework efficient and easy to implement. We show the efficacy of our method on several datasets. 1
5 0.7555626 284 nips-2010-Variational bounds for mixed-data factor analysis
Author: Mohammad E. Khan, Guillaume Bouchard, Kevin P. Murphy, Benjamin M. Marlin
Abstract: We propose a new variational EM algorithm for fitting factor analysis models with mixed continuous and categorical observations. The algorithm is based on a simple quadratic bound to the log-sum-exp function. In the special case of fully observed binary data, the bound we propose is significantly faster than previous variational methods. We show that EM is significantly more robust in the presence of missing data compared to treating the latent factors as parameters, which is the approach used by exponential family PCA and other related matrix-factorization methods. A further benefit of the variational approach is that it can easily be extended to the case of mixtures of factor analyzers, as we show. We present results on synthetic and real data sets demonstrating several desirable properties of our proposed method. 1
6 0.75381958 221 nips-2010-Random Projections for $k$-means Clustering
7 0.73764604 261 nips-2010-Supervised Clustering
8 0.63853735 262 nips-2010-Switched Latent Force Models for Movement Segmentation
9 0.63222712 136 nips-2010-Large-Scale Matrix Factorization with Missing Data under Additional Constraints
10 0.55945474 89 nips-2010-Factorized Latent Spaces with Structured Sparsity
11 0.55631566 210 nips-2010-Practical Large-Scale Optimization for Max-norm Regularization
12 0.54481822 226 nips-2010-Repeated Games against Budgeted Adversaries
13 0.5331443 110 nips-2010-Guaranteed Rank Minimization via Singular Value Projection
14 0.52632403 30 nips-2010-An Inverse Power Method for Nonlinear Eigenproblems with Applications in 1-Spectral Clustering and Sparse PCA
15 0.52236795 166 nips-2010-Minimum Average Cost Clustering
16 0.51604617 246 nips-2010-Sparse Coding for Learning Interpretable Spatio-Temporal Primitives
17 0.51508808 117 nips-2010-Identifying graph-structured activation patterns in networks
18 0.51063257 182 nips-2010-New Adaptive Algorithms for Online Classification
19 0.50610298 196 nips-2010-Online Markov Decision Processes under Bandit Feedback
20 0.50398695 265 nips-2010-The LASSO risk: asymptotic results and real world examples