nips nips2012 nips2012-132 knowledge-graph by maker-knowledge-mining

132 nips-2012-Fiedler Random Fields: A Large-Scale Spectral Approach to Statistical Network Modeling


Source: pdf

Author: Antonino Freno, Mikaela Keller, Marc Tommasi

Abstract: Statistical models for networks have typically been committed to strong prior assumptions concerning the form of the modeled distributions. Moreover, the vast majority of currently available models are explicitly designed for capturing some specific graph properties (such as power-law degree distributions), which makes them unsuitable for application to domains where the behavior of the target quantities is not known a priori. The key contribution of this paper is twofold. First, we introduce the Fiedler delta statistic, based on the Laplacian spectrum of graphs, which makes it possible to dispense with any parametric assumption concerning the modeled network properties. Second, we use the defined statistic to develop the Fiedler random field model, which allows for efficient estimation of edge distributions over large-scale random networks. After analyzing the dependence structure involved in Fiedler random fields, we estimate them over several real-world networks, showing that they achieve a much higher modeling accuracy than other well-known statistical approaches.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Antonino Freno. Abstract: Statistical models for networks have typically been committed to strong prior assumptions concerning the form of the modeled distributions. [sent-5, score-0.171]

2 Moreover, the vast majority of currently available models are explicitly designed for capturing some specific graph properties (such as power-law degree distributions), which makes them unsuitable for application to domains where the behavior of the target quantities is not known a priori. [sent-6, score-0.152]

3 First, we introduce the Fiedler delta statistic, based on the Laplacian spectrum of graphs, which makes it possible to dispense with any parametric assumption concerning the modeled network properties. [sent-8, score-0.279]

4 Second, we use the defined statistic to develop the Fiedler random field model, which allows for efficient estimation of edge distributions over large-scale random networks. [sent-9, score-0.089]

5 After analyzing the dependence structure involved in Fiedler random fields, we estimate them over several real-world networks, showing that they achieve a much higher modeling accuracy than other well-known statistical approaches. [sent-10, score-0.13]

6 1 Introduction. Arising from domains as diverse as bioinformatics and web mining, large-scale data exhibiting network structure are becoming increasingly available. [sent-11, score-0.097]

7 Recent studies, especially targeted at social network modeling, have focused on random graph models of those networks. [sent-13, score-0.201]

8 In the simplest form, a random network is a configuration of binary random variables Xuv such that the value of Xuv stands for the presence or absence of a link between nodes u and v in the network. [sent-14, score-0.169]

9 The general idea underlying random graph modeling is that network configurations are generated by a stochastic process governed by specific probability laws, so that different models correspond to different families of distributions over graphs. [sent-15, score-0.222]

10 The simplest random graph model is the Erdős-Rényi (ER) model [1], which assumes that the probability of observing a link between two nodes in a given graph is constant for any pair of nodes in that graph, and is independent of which other edges are being observed. [sent-16, score-0.426]
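The ER independence assumption above is simple enough to sketch in a few lines of Python (numpy assumed; the function name and parameters are illustrative, not from the paper):

```python
import numpy as np

def erdos_renyi(n, p, rng):
    """Sample an undirected G(n, p) graph: every edge appears
    independently with the same probability p (no self-loops)."""
    A = np.triu((rng.random((n, n)) < p).astype(float), k=1)
    return A + A.T                 # symmetrize the upper triangle

rng = np.random.default_rng(0)
A = erdos_renyi(100, 0.1, rng)     # 100 nodes, edge probability 0.1
```

The returned adjacency matrix is symmetric with zero diagonal, and its empirical edge density concentrates around p.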

11 In preferential attachment models [2], the probability of linking to any specified node in a graph is proportional to the degree of the node in the graph, leading to “rich get richer” effects. [sent-17, score-0.266]

12 An attempt to model potentially complex dependencies between graph edges in the form of Gibbs-Boltzmann distributions is made by exponential random graph (ERG) models [4], which subsume the ER model as a special case. [sent-19, score-0.242]

13–14 Finally, a recent attempt at modeling real networks through a stochastic generative process is made by Kronecker graphs [5], which try to capture phenomena such as heavy-tailed degree distributions and shrinking diameter properties while paying attention to the temporal dynamics of network growth. (Affiliation footnote: ∗ Université Charles de Gaulle – Lille 3, Domaine Universitaire du Pont de Bois – BP 60149, 59653 Villeneuve d'Ascq, France.) [sent-20, score-0.1] [sent-21, score-0.185]

15 While some of these models behave better than others in terms of computational tractability, one basic limitation affecting all of them is a sort of parametric assumption concerning the probability laws underlying the observed network properties. [sent-22, score-0.206]

16 In other words, currently available models of network structure assume that the shape of the probability distribution generating the network is known a priori. [sent-23, score-0.177]

17 Clearly, in order for such models to deliver accurate estimates of the distributions at hand, their prior assumptions concerning the behavior of the target quantities must be satisfied by the given data. [sent-26, score-0.084]

18 To date, the knowledge we have concerning large-scale real-world networks does not allow us to assess whether any particular parametric assumption captures in depth the target generative process, although some observed network properties may happen to be modeled fairly well. [sent-28, score-0.271]

19 On the one hand, we take a first step toward nonparametric modeling of random networks by developing a novel network statistic, which we call the Fiedler delta statistic. [sent-30, score-0.271]

20 The Fiedler delta function makes it possible to model different graph properties at once in an extremely compact form. [sent-31, score-0.212]

21 This statistic is based on the spectral analysis of the graph, and in particular on the smallest non-zero eigenvalue of the Laplacian matrix, which is known as the Fiedler value [6, 7]. [sent-32, score-0.131]

22 On the other hand, we use the Fiedler delta statistic to define a Boltzmann distribution over graphs, leading to the Fiedler random field (FRF) model. [sent-33, score-0.141]

23 Roughly speaking, for each binary edge variable Xuv, potentials in an FRF are functions of the difference induced in the Fiedler value by flipping the value of Xuv, where the spectral decomposition is restricted to a suitable subgraph incident to nodes u, v. [sent-34, score-0.279]

24 The intuition is that the information encapsulated in the Fiedler delta for Xuv gives a measure of the role of Xuv in determining the algebraic connectivity of its neighborhood. [sent-35, score-0.137]

25 Section 2 reviews some theoretical background concerning the Laplacian spectrum of graphs. [sent-39, score-0.084]

26 To avoid unwarranted prior assumptions concerning the statistical behavior of the Fiedler delta, potentials are modeled by non-linear functions, which we estimate from data by minimizing a contrastive divergence objective. [sent-42, score-0.228]

27 2 Graphs, Laplacians, and eigenvalues. Let G = (V, E) be an undirected graph with n nodes. [sent-46, score-0.164]

28 In the following we assume that the graph is unweighted with adjacency matrix A. [sent-47, score-0.121]

29 The degree du of a node u ∈ V is defined as the number of connections of u to other nodes, that is du = |{v: {u, v} ∈ E}|. [sent-48, score-0.091]

30 Accordingly, the degree matrix D of a graph G corresponds to the diagonal matrix with the vertex degrees d1, …, dn on its diagonal. [sent-49, score-0.152]

31 The main tools exploited by the random graph model proposed here are the graph Laplacian matrices. [sent-53, score-0.242]

32 Different graph Laplacians have been defined in the literature. [sent-54, score-0.121]

33 In this work, we use consistently the unnormalized graph Laplacian, given by L = D − A. [sent-55, score-0.14]

34 The unnormalized graph Laplacian L of an undirected graph G has the following properties: (i) L is symmetric and positive semi-definite; (ii) the smallest eigenvalue of L is 0; (iii) L has n non-negative, real-valued eigenvalues 0 = λ1 ≤ λ2 ≤ … ≤ λn. [sent-57, score-0.365]
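These Laplacian properties can be checked numerically; a minimal sketch with numpy (the example graph is illustrative, not from the paper):

```python
import numpy as np

# Adjacency matrix of a small undirected graph: a path on 4 nodes.
A = np.array([[0., 1., 0., 0.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., 0., 1., 0.]])

D = np.diag(A.sum(axis=1))       # degree matrix
L = D - A                        # unnormalized graph Laplacian

eigvals = np.linalg.eigvalsh(L)  # real, sorted ascending (L is symmetric)
assert abs(eigvals[0]) < 1e-9    # (ii) the smallest eigenvalue is 0
assert np.all(eigvals > -1e-9)   # (i) positive semi-definite
```

For this connected path, the second eigenvalue is strictly positive, matching the Fiedler-eigenvalue discussion that follows.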

35 If the graph has a single connected component, then M(0, G) = 1, and the second smallest eigenvalue λ2(G) > 0 is called, in this case, the Fiedler eigenvalue. [sent-62, score-0.182]

36–37 The Fiedler eigenvalue provides insight into several graph properties: when there is a nontrivial spectral gap, i.e. λ2(G) is clearly separated from 0, the graph has good expansion properties, stronger connectivity, and rapid convergence of random walks in the graph. [sent-63, score-0.184] [sent-65, score-0.121]

38–39 For example, it is known that λ2(G) ≤ µ(G), where µ(G) is the edge connectivity of the graph (i.e. the size of the smallest edge cut whose removal makes the graph disconnected [7]). [sent-66, score-0.183] [sent-68, score-0.178]

40 Notice that if the graph has more than one connected component, then λ2(G) will also be equal to zero, so that λ2(G) = 0 indicates that the graph is not connected. [sent-69, score-0.242]

41 Without loss of generality, we abuse the term Fiedler eigenvalue to denote the smallest eigenvalue different from zero, regardless of the number of connected components. [sent-70, score-0.104]

42 For any pair of nodes u and v in a graph G = (V, E), we define two corresponding graphs G⁺uv and G⁻uv in the following way: G⁺uv = (V, E ∪ {{u, v}}), and G⁻uv = (V, E \ {{u, v}}). [sent-72, score-1.218]

43 Clearly, we have that either G⁺uv = G or G⁻uv = G. [sent-73, score-0.472]

44 A basic property concerning the Laplacian eigenvalues of G⁺uv and G⁻uv is the following [7, 8, 9]: Lemma 1. [sent-74, score-0.577]

45 If G⁺uv and G⁻uv are two graphs with n nodes, such that {u, v} ⊆ V, G⁺uv = (V, E ∪ {{u, v}}), and G⁻uv = (V, E \ {{u, v}}), then we have that: (i) Σi=1..n [λi(G⁺uv) − λi(G⁻uv)] = 2; (ii) λi(G⁻uv) ≤ λi(G⁺uv) for any i such that 1 ≤ i ≤ n. [sent-75, score-1.962]
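Lemma 1 can be verified numerically on a small example; a sketch assuming numpy, with an arbitrary 5-node graph that is not from the paper:

```python
import numpy as np

def laplacian(A):
    """Unnormalized graph Laplacian L = D - A."""
    return np.diag(A.sum(axis=1)) - A

# G-_uv: a 5-node graph without the edge {0, 1}; G+_uv: the same graph with it.
A_minus = np.zeros((5, 5))
for u, v in [(1, 2), (2, 3), (3, 4), (0, 4)]:
    A_minus[u, v] = A_minus[v, u] = 1.0
A_plus = A_minus.copy()
A_plus[0, 1] = A_plus[1, 0] = 1.0

lam_minus = np.linalg.eigvalsh(laplacian(A_minus))  # sorted eigenvalues
lam_plus = np.linalg.eigvalsh(laplacian(A_plus))

# (i) the eigenvalue sums (traces) differ by exactly 2
print(np.isclose((lam_plus - lam_minus).sum(), 2.0))
# (ii) adding an edge can only increase each eigenvalue
print(np.all(lam_minus <= lam_plus + 1e-9))
```

Part (i) follows because the Laplacian trace equals the sum of degrees, which grows by 2 when one edge is added; part (ii) holds because the single-edge Laplacian added to L is positive semi-definite.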

46 1 Probability distribution. Using the notions reviewed above, we define the Fiedler delta function ∆λ2 in the following way: Definition 1. [sent-83, score-0.091]

47 Then, ∆λ2(u, v, G) = λk+1(G⁺uv) − λk+1(G⁻uv) if Xuv = 1, and ∆λ2(u, v, G) = λk+1(G⁻uv) − λk+1(G⁺uv) otherwise (Equation 1). In other words, ∆λ2(u, v, G) is the variation in the Fiedler eigenvalue of the graph Laplacian that would result from flipping the value of Xuv in G. [sent-85, score-1.108]
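A hedged sketch of the Fiedler delta computation of Equation (1), using a dense eigendecomposition (the helper names are illustrative; the paper restricts the computation to subgraphs Guv rather than the full graph):

```python
import numpy as np

def fiedler_value(A, tol=1e-9):
    """Smallest non-zero Laplacian eigenvalue; following the paper, the
    term is extended to disconnected graphs as well."""
    L = np.diag(A.sum(axis=1)) - A          # unnormalized Laplacian
    eig = np.linalg.eigvalsh(L)
    return eig[eig > tol][0]

def fiedler_delta(A, u, v):
    """Variation in the Fiedler eigenvalue caused by flipping X_uv."""
    A_plus, A_minus = A.copy(), A.copy()
    A_plus[u, v] = A_plus[v, u] = 1.0       # G+_uv: edge forced in
    A_minus[u, v] = A_minus[v, u] = 0.0     # G-_uv: edge forced out
    if A[u, v] == 1:                        # X_uv = 1
        return fiedler_value(A_plus) - fiedler_value(A_minus)
    return fiedler_value(A_minus) - fiedler_value(A_plus)

# Example: the effect of the edge {0, 1} in a 4-cycle.
C4 = np.zeros((4, 4))
for a, b in [(0, 1), (1, 2), (2, 3), (3, 0)]:
    C4[a, b] = C4[b, a] = 1.0
delta = fiedler_delta(C4, 0, 1)             # in [0, 2] by Proposition 2
```

For the 4-cycle, removing one edge leaves a 4-node path, so the delta equals λ2(C4) − λ2(P4) = 2 − (2 − √2) = √2.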

48 Concerning the range of the Fiedler delta function, we can easily prove the following proposition: Proposition 2. [sent-86, score-0.091]

49 For any graph G = (V, E) and any pair of nodes {u, v} such that Xuv = 1, we have that 0 ≤ ∆λ2 (u, v, G) ≤ 2. [sent-87, score-0.2]

50 The proposition follows straightforwardly from Lemma 1, given that ∆λ2(u, v, G) = λk+1(G) − λk+1(G⁻uv). [sent-90, score-0.255]

51 Given a graph G = (V, E), for each (unordered) pair of nodes {u, v} such that u = v, we take Xuv to denote a binary random variable such that Xuv = 1 if {u, v} ∈ E, and Xuv = 0 otherwise. [sent-92, score-0.2]

52 We also say that a subgraph GS of G with edge set ES is incident to Xuv if {u, v} ⊆ ⋃e∈ES e. [sent-94, score-0.138]

53 Given a graph G, let XG denote the set of random variables defined on G, i. [sent-96, score-0.121]

54 For any Xuv ∈ XG , let Guv be a subgraph of G which is incident to Xuv and ϕuv be a two-place real-valued function with parameter vector θ. [sent-99, score-0.099]

55 In other words, a FRF is a Gibbs-Boltzmann distribution over graphs, with potential functions defined for each node pair {u, v} along with some neighboring subgraph Guv . [sent-101, score-0.148]

56 In particular, in order to model the dependence of each variable Xuv on Guv , potentials take as argument both the value of Xuv and the Fiedler delta corresponding to {u, v} in Guv . [sent-102, score-0.187]

57 The idea is to treat the Fiedler delta statistic as a (real-valued) random variable defined over subgraph configurations, and to exploit this random variable as a compact representation of those configurations. [sent-103, score-0.215]

58 This means that the dependence structure of a FRF is fixed by the particular choice of subgraphs Guv , so that the set XGuv \ {Xuv } makes Xuv independent of XG \ XGuv . [sent-104, score-0.126]

59 First, how do we fix the subgraph Guv for each pair of nodes {u, v}? [sent-106, score-0.153]

60 Second, how do we choose a shape for the potential functions, so as to fully exploit the information contained in the Fiedler delta, while avoiding unwarranted assumptions concerning their parametric form? [sent-107, score-0.151]

61 Third, how does the Fiedler delta statistic behave with respect to the Markov dependence property for random graphs? [sent-108, score-0.193]

62 2 Dependence structure. We first recall the definition of Markov dependence for random graphs [10]. [sent-115, score-0.143]

63 A random graph G is said to be a Markov graph (or to have a Markov dependence structure) if, for any pair of variables Xuv and Xwz in G such that {u, v} ∩ {w, z} = ∅, we have that P (Xuv | Xwz , N (Xuv )) = P (Xuv | N (Xuv )). [sent-118, score-0.313]

64 3, we say that the dependence structure of a random graph G is non-Markovian if, for disjoint pairs of nodes {u, v} and {w, z}, it does not imply that P (Xuv | Xwz , N (Xuv )) = P (Xuv | N (Xuv )), i. [sent-120, score-0.27]

65 Consider a graph G = (V, E) such that V = {u, v, w, z} and E = {{u, v}, {v, w}, {w, z}, {u, z}}. [sent-126, score-0.121]

66 3 can be straightforwardly generalized to the dependence between two variables Xuv and Xwz in circuits/paths of arbitrary size n, since the expression used for the Fiedler eigenvalues of such graphs holds for any n. [sent-131, score-0.147]

67 This fact suggests that FRFs make it possible to model edge correlations at virtually any distance within G, provided that each subgraph Guv is chosen in such a way as to encompass the relevant correlation. [sent-132, score-0.113]

68 The tests indicated the importance of avoiding limiting assumptions concerning the form of the potentials, which motivated us to model them by a feed-forward multilayer perceptron (MLP), due to its well-known capability of approximating functions of arbitrary shape [12]. [sent-139, score-0.12]

69 If we want our learning objective to be usable in the large-scale setting, then it is not feasible to sum over all node pairs {u, v} in the network, since the number of such pairs grows quadratically with |V|. [sent-151, score-0.088]

70 In this respect, a straightforward approach for scaling to very large networks consists in sampling n objects from the set of all possible pairs of nodes, taking care that the sample contains a good balance between linked and unlinked pairs. [sent-152, score-0.135]

71 In this way, we can empirically tune the k hyperparameter in order to trade off the informativeness of Guv against the tractability of its spectral decomposition, where it is known that the complexity of computing ∆λ2(u, v, Guv) is cubic in the number of nodes in Guv [17]. [sent-160, score-0.08]

72 Algorithm 1 SampleSubgraph: Sampling a neighboring subgraph for a given node pair Input: Undirected graph G = (V, E); node pair {u, v}; number k of snowball waves. [sent-161, score-0.352]
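A minimal sketch of the snowball-wave idea behind Algorithm 1 (SampleSubgraph); the paper's exact procedure may differ, and the adjacency representation and function name here are assumptions:

```python
def sample_subgraph(adj, u, v, k):
    """Nodes reached from {u, v} by k snowball waves, where adj maps
    each node to the set of its neighbors."""
    nodes = {u, v}
    frontier = {u, v}
    for _ in range(k):
        # Next wave: all unseen neighbors of the current frontier.
        frontier = set().union(*(adj[n] for n in frontier)) - nodes
        nodes |= frontier
    return nodes

# A path graph 0-1-2-3-4-5 as an adjacency dictionary.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
nodes = sample_subgraph(adj, 2, 3, 1)   # one wave around the pair {2, 3}
```

With one wave around {2, 3} on the path, the sampled node set is {1, 2, 3, 4}; larger k widens the neighborhood and hence the cost of the cubic spectral decomposition mentioned above.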

73 Some basic network statistics are reported in Table 1. [sent-178, score-0.08]

74 In the first kind of experiments, given a random network G = (V, E), our goal is to measure the accuracy of FRFs at estimating the conditional distribution of variables Xuv given the configuration of neighboring subgraphs Guv of G. [sent-180, score-0.183]

75 This can be seen as a link prediction problem where only local information (given by Guv ) can be used for predicting the presence of a link {u, v}. [sent-181, score-0.076]

76 At the same time, we want to understand whether the overall network size (in terms of the number of nodes) has an impact on the number of training examples that will be necessary for FRFs to converge to stable prediction accuracy. [sent-182, score-0.118]

77 Let us sample our training set D by first drawing n node pairs from V in such a way that linked and unlinked pairs from G are equally represented in D, and then extracting the corresponding subgraphs Gui ,vi by Algorithm 1 using one snowball wave. [sent-188, score-0.213]

78 Prediction accuracy is measured by averaging the recognition accuracy for linked and unlinked pairs in T respectively (where |T | = 10, 000). [sent-198, score-0.11]

79 Interestingly, the number of training examples required for the accuracy curve to stabilize does not seem to depend at all on the overall network size. [sent-201, score-0.099]

80 Indeed, fastest convergence is achieved for the average-sized and the second largest networks. (Figure 1: Prediction accuracy of FRFs on the arXiv networks for a growing training set size.) [sent-202, score-0.097]

81 That is, we want to ascertain whether the effort of modeling the conditional independence structure of the overall network through the FRF formalism is justified by a suitable gain in prediction accuracy with respect to statistical models that do not focus explicitly on such dependence structure. [sent-220, score-0.244]

82 Interestingly, the degree distribution of WS networks can be expressed in closed form in terms of two parameters δ and β, related to the average degree distribution and a network rewiring process respectively [18]. [sent-223, score-0.205]

83 The parameters of both the WS and the BA model can be estimated by standard maximum-likelihood approaches and then used to predict conditional edge distributions, exploiting information from the degrees observed in the given subgraphs [20, 21]. [sent-225, score-0.096]

84 Since neither the BA nor the WS model shows relevant improvements over simple random guessing, this result clearly suggests that exploiting the dependence structure involved in network edge configurations is crucial for accurately predicting the presence or absence of links. [sent-234, score-0.209]

85 General network statistics are also reported, where CCG and DG stand for average clustering coefficient and network diameter respectively. [sent-236, score-0.16]

86 A second group of experiments is aimed at assessing whether the FRFs learned on the arXiv networks can be considered as plausible models of the degree distribution (DD) and the clustering coefficient distribution (CC) observed in each network [15]. [sent-258, score-0.202]

87 To this aim, we use the estimated FRF models to generate artificial graphs of various size, using Gibbs sampling, and then we compare the DD and CC observed in the artificial graphs with those estimated on the whole networks. [sent-259, score-0.148]

88 The distance in DD and CC between the artificial graphs on the one hand and the corresponding real network on the other hand is measured using the Kolmogorov-Smirnov D-statistic, following a common use in graph mining research [15]. [sent-263, score-0.311]
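The two-sample Kolmogorov-Smirnov D-statistic used here can be computed directly; a self-contained numpy sketch (a stand-in for a library routine such as scipy.stats.ks_2samp):

```python
import numpy as np

def ks_d_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov D-statistic: the largest absolute
    gap between the two empirical CDFs."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    grid = np.concatenate([x, y])                      # evaluation points
    cdf_x = np.searchsorted(x, grid, side="right") / len(x)
    cdf_y = np.searchsorted(y, grid, side="right") / len(y)
    return float(np.abs(cdf_x - cdf_y).max())
```

Applied to the degree (or clustering-coefficient) values of a generated graph versus the real network, a D-statistic near 0 indicates closely matching distributions, while 1 indicates complete separation.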

89 Values are averaged over 100 samples for each considered graph size, where the standard deviation is typically in the order of 10−2 . [sent-266, score-0.121]

90 These results are particularly encouraging, since they show how the nonparametric approach motivating the FRF model makes it possible to accurately estimate network properties (such as DD) that are not aimed for explicitly in the model design. [sent-270, score-0.124]

91 This suggests that the Fiedler delta statistic is a promising direction for building generative models capable of capturing different network properties through a unified approach. [sent-271, score-0.221]

92–93 Figure 2: D-statistic values for DD and CC on the CondMat (a–b) and HepTh (c–d) networks, plotted against the artificial graph size (40–160 nodes).

94 5 Conclusions and future work The main motivation inspiring this work was the observation that statistical modeling of networks cries for genuinely nonparametric estimation, because of the inaccuracy often resulting from unwarranted parametric assumptions. [sent-299, score-0.188]

95 In this respect, we showed how the Fiedler delta statistic offers a powerful building block for designing a nonparametric estimator, which we developed in the form of the FRF model. [sent-300, score-0.157]

96 Since here we only applied FRFs to collaboration networks, which are typically scale-free, an important option for future work is to assess the flexibility of FRFs in modeling networks from different families. [sent-301, score-0.122]

97 In the second place, since we only addressed in a heuristic way the problem of learning the dependence structure of FRFs, a stimulating direction for further research consists in designing clever techniques for learning the structure of FRFs, e. [sent-302, score-0.086]

98 Finally, we would like to assess the possibility of modeling networks through mixtures of FRFs, so as to fit different network regions (with possibly conflicting properties) through specialized components of the mixture. [sent-305, score-0.184]

99 Weigt, “On the properties of small-world network models,” The European Physical Journal B, vol. [sent-420, score-0.08]

100 Vicsek, “Evolution of the social network of scientific collaborations,” Physica A, vol. [sent-441, score-0.08]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('xuv', 0.595), ('fiedler', 0.388), ('guv', 0.316), ('frfs', 0.291), ('uv', 0.236), ('frf', 0.194), ('vuv', 0.134), ('graph', 0.121), ('ws', 0.11), ('xg', 0.099), ('delta', 0.091), ('ba', 0.088), ('concerning', 0.084), ('network', 0.08), ('mlp', 0.079), ('subgraph', 0.074), ('graphs', 0.074), ('xwz', 0.064), ('networks', 0.063), ('cc', 0.063), ('nodes', 0.06), ('laplacian', 0.058), ('dd', 0.058), ('subgraphs', 0.057), ('euv', 0.054), ('dependence', 0.052), ('statistic', 0.05), ('artificial', 0.049), ('condmat', 0.049), ('hepth', 0.049), ('xguv', 0.049), ('potentials', 0.044), ('eigenvalue', 0.043), ('barab', 0.043), ('unwarranted', 0.043), ('contrastive', 0.039), ('edge', 0.039), ('astroph', 0.036), ('hepph', 0.036), ('snowball', 0.036), ('unlinked', 0.036), ('vn', 0.032), ('erg', 0.032), ('gun', 0.032), ('degree', 0.031), ('attachment', 0.03), ('link', 0.029), ('aimed', 0.028), ('preferential', 0.028), ('node', 0.028), ('er', 0.027), ('neighboring', 0.027), ('incident', 0.025), ('cd', 0.025), ('ccg', 0.024), ('committed', 0.024), ('faloutsos', 0.024), ('grqc', 0.024), ('laplacians', 0.024), ('samplesubgraph', 0.024), ('xun', 0.024), ('parametric', 0.024), ('algebraic', 0.023), ('connectivity', 0.023), ('undirected', 0.022), ('involved', 0.021), ('genuinely', 0.021), ('modeling', 0.021), ('eigenvalues', 0.021), ('spectral', 0.02), ('pairs', 0.02), ('capabilities', 0.02), ('want', 0.02), ('assess', 0.02), ('pair', 0.019), ('proposition', 0.019), ('accuracy', 0.019), ('unnormalized', 0.019), ('hand', 0.018), ('divergence', 0.018), ('prediction', 0.018), ('ascq', 0.018), ('erd', 0.018), ('laws', 0.018), ('collaboration', 0.018), ('smallest', 0.018), ('suitable', 0.017), ('gurations', 0.017), ('structure', 0.017), ('nonparametric', 0.016), ('arxiv', 0.016), ('multilayer', 0.016), ('guessing', 0.016), ('du', 0.016), ('linked', 0.016), ('leskovec', 0.016), ('lille', 0.016), ('nyi', 0.016), ('multiplicity', 0.015), ('growing', 0.015)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0 132 nips-2012-Fiedler Random Fields: A Large-Scale Spectral Approach to Statistical Network Modeling

Author: Antonino Freno, Mikaela Keller, Marc Tommasi

Abstract: Statistical models for networks have typically been committed to strong prior assumptions concerning the form of the modeled distributions. Moreover, the vast majority of currently available models are explicitly designed for capturing some specific graph properties (such as power-law degree distributions), which makes them unsuitable for application to domains where the behavior of the target quantities is not known a priori. The key contribution of this paper is twofold. First, we introduce the Fiedler delta statistic, based on the Laplacian spectrum of graphs, which makes it possible to dispense with any parametric assumption concerning the modeled network properties. Second, we use the defined statistic to develop the Fiedler random field model, which allows for efficient estimation of edge distributions over large-scale random networks. After analyzing the dependence structure involved in Fiedler random fields, we estimate them over several real-world networks, showing that they achieve a much higher modeling accuracy than other well-known statistical approaches.

2 0.078467496 172 nips-2012-Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs

Author: Anima Anandkumar, Ragupathyraj Valluvan

Abstract: Graphical model selection refers to the problem of estimating the unknown graph structure given observations at the nodes in the model. We consider a challenging instance of this problem when some of the nodes are latent or hidden. We characterize conditions for tractable graph estimation and develop efficient methods with provable guarantees. We consider the class of Ising models Markov on locally tree-like graphs, which are in the regime of correlation decay. We propose an efficient method for graph estimation, and establish its structural consistency when the number of samples n scales as n = Ω(θmin^(−δη(η+1)−2) log p), where θmin is the minimum edge potential, δ is the depth (i.e., distance from a hidden node to the nearest observed nodes), and η is a parameter which depends on the minimum and maximum node and edge potentials in the Ising model. The proposed method is practical to implement and provides flexibility to control the number of latent variables and the cycle lengths in the output graph. We also present necessary conditions for graph estimation by any method and show that our method nearly matches the lower bound on sample requirements. Keywords: Graphical model selection, latent variables, quartet methods, locally tree-like graphs.

3 0.072734855 337 nips-2012-The Lovász ϑ function, SVMs and finding large dense subgraphs

Author: Vinay Jethava, Anders Martinsson, Chiranjib Bhattacharyya, Devdatt Dubhashi

Abstract: The Lovász ϑ function of a graph, a fundamental tool in combinatorial optimization and approximation algorithms, is computed by solving an SDP. In this paper we establish that the Lovász ϑ function is equivalent to a kernel learning problem related to one-class SVM. This interesting connection opens up many opportunities bridging graph theoretic algorithms and machine learning. We show that there exist graphs, which we call SVM−ϑ graphs, on which the Lovász ϑ function can be approximated well by a one-class SVM. This leads to novel use of SVM techniques for solving algorithmic problems in large graphs, e.g. identifying a planted clique of size Θ(√n) in a random graph G(n, 1/2). A classic approach for this problem involves computing the ϑ function; however, it is not scalable due to the SDP computation. We show that the random graph with a planted clique is an example of an SVM−ϑ graph. As a consequence, an SVM-based approach easily identifies the clique in large graphs and is competitive with the state-of-the-art. We introduce the notion of common orthogonal labelling and show that it can be computed by solving a Multiple Kernel learning problem. It is further shown that such a labelling is extremely useful in identifying a large common dense subgraph in multiple graphs, which is known to be a computationally difficult problem. The proposed algorithm achieves an order of magnitude scalability compared to state-of-the-art methods.

4 0.067423359 6 nips-2012-A Convex Formulation for Learning Scale-Free Networks via Submodular Relaxation

Author: Aaron Defazio, Tibério S. Caetano

Abstract: A key problem in statistics and machine learning is the determination of network structure from data. We consider the case where the structure of the graph to be reconstructed is known to be scale-free. We show that in such cases it is natural to formulate structured sparsity inducing priors using submodular functions, and we use their Lovász extension to obtain a convex relaxation. For tractable classes such as Gaussian graphical models, this leads to a convex optimization problem that can be efficiently solved. We show that our method results in an improvement in the accuracy of reconstructed networks for synthetic data. We also show how our prior encourages scale-free reconstructions on a bioinformatics dataset.

5 0.059459217 326 nips-2012-Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses

Author: Po-ling Loh, Martin J. Wainwright

Abstract: We investigate a curious relationship between the structure of a discrete graphical model and the support of the inverse of a generalized covariance matrix. We show that for certain graph structures, the support of the inverse covariance matrix of indicator variables on the vertices of a graph reflects the conditional independence structure of the graph. Our work extends results that have previously been established only in the context of multivariate Gaussian graphical models, thereby addressing an open question about the significance of the inverse covariance matrix of a non-Gaussian distribution. Based on our population-level results, we show how the graphical Lasso may be used to recover the edge structure of certain classes of discrete graphical models, and present simulations to verify our theoretical results. 1

6 0.058845937 69 nips-2012-Clustering Sparse Graphs

7 0.05754089 199 nips-2012-Link Prediction in Graphs with Autoregressive Features

8 0.05616251 346 nips-2012-Topology Constraints in Graphical Models

9 0.054304924 180 nips-2012-Learning Mixtures of Tree Graphical Models

10 0.053725161 186 nips-2012-Learning as MAP Inference in Discrete Graphical Models

11 0.050067667 298 nips-2012-Scalable Inference of Overlapping Communities

12 0.044083651 10 nips-2012-A Linear Time Active Learning Algorithm for Link Classification

13 0.042943411 147 nips-2012-Graphical Models via Generalized Linear Models

14 0.042052854 309 nips-2012-Semi-supervised Eigenvectors for Locally-biased Learning

15 0.04009654 335 nips-2012-The Bethe Partition Function of Log-supermodular Graphical Models

16 0.03918764 157 nips-2012-Identification of Recurrent Patterns in the Activation of Brain Networks

17 0.038777251 191 nips-2012-Learning the Architecture of Sum-Product Networks Using Clustering on Variables

18 0.037771512 287 nips-2012-Random function priors for exchangeable arrays with applications to graphs and relational data

19 0.035270885 253 nips-2012-On Triangular versus Edge Representations --- Towards Scalable Modeling of Networks

20 0.03455738 106 nips-2012-Dynamical And-Or Graph Learning for Object Shape Modeling and Detection


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.098), (1, 0.034), (2, -0.001), (3, -0.039), (4, -0.03), (5, 0.009), (6, -0.019), (7, -0.043), (8, -0.113), (9, 0.054), (10, 0.005), (11, -0.003), (12, 0.01), (13, 0.009), (14, 0.018), (15, -0.021), (16, 0.035), (17, 0.011), (18, 0.013), (19, -0.012), (20, 0.009), (21, -0.014), (22, 0.005), (23, -0.023), (24, 0.022), (25, -0.014), (26, -0.029), (27, 0.055), (28, -0.022), (29, -0.054), (30, -0.057), (31, 0.074), (32, -0.051), (33, 0.067), (34, 0.018), (35, -0.124), (36, 0.049), (37, -0.009), (38, 0.029), (39, 0.02), (40, -0.04), (41, -0.059), (42, -0.01), (43, -0.055), (44, 0.001), (45, 0.078), (46, 0.023), (47, 0.039), (48, -0.06), (49, 0.003)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.92186415 132 nips-2012-Fiedler Random Fields: A Large-Scale Spectral Approach to Statistical Network Modeling

Author: Antonino Freno, Mikaela Keller, Marc Tommasi

Abstract: Statistical models for networks have typically been committed to strong prior assumptions concerning the form of the modeled distributions. Moreover, the vast majority of currently available models are explicitly designed for capturing some specific graph properties (such as power-law degree distributions), which makes them unsuitable for application to domains where the behavior of the target quantities is not known a priori. The key contribution of this paper is twofold. First, we introduce the Fiedler delta statistic, based on the Laplacian spectrum of graphs, which allows us to dispense with any parametric assumption concerning the modeled network properties. Second, we use the defined statistic to develop the Fiedler random field model, which allows for efficient estimation of edge distributions over large-scale random networks. After analyzing the dependence structure involved in Fiedler random fields, we estimate them over several real-world networks, showing that they achieve a much higher modeling accuracy than other well-known statistical approaches.
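The spectral quantity behind this abstract can be illustrated concretely: the Fiedler value is the second-smallest eigenvalue of the graph Laplacian, and a "Fiedler delta" measures how it changes when a single edge is toggled. The following is a minimal sketch in Python/NumPy, assuming a dense symmetric 0/1 adjacency matrix; the function names and the edge-toggling interface are illustrative, not the authors' implementation:

```python
import numpy as np

def laplacian(adj):
    """Graph Laplacian L = D - A for a symmetric 0/1 adjacency matrix."""
    return np.diag(adj.sum(axis=1)) - adj

def fiedler_value(adj):
    """Second-smallest Laplacian eigenvalue (algebraic connectivity).
    eigvalsh returns eigenvalues in ascending order, so index 1 is it."""
    return np.linalg.eigvalsh(laplacian(adj))[1]

def fiedler_delta(adj, i, j):
    """Change in the Fiedler value when edge (i, j) is toggled."""
    flipped = adj.copy()
    flipped[i, j] = flipped[j, i] = 1 - flipped[i, j]
    return fiedler_value(flipped) - fiedler_value(adj)

# Path graph 0-1-2-3: adding edge (0, 3) closes it into a 4-cycle,
# which is better connected, so the delta is positive.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
```

For the path graph above, the Fiedler value is 2 − √2 ≈ 0.586, and toggling (0, 3) raises it to 2, so the delta is √2; the dense eigendecomposition here is O(N³) and is for illustration only, not the paper's large-scale estimation procedure.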

2 0.75153524 346 nips-2012-Topology Constraints in Graphical Models

Author: Marcelo Fiori, Pablo Musé, Guillermo Sapiro

Abstract: Graphical models are a very useful tool to describe and understand natural phenomena, from gene expression to climate change and social interactions. The topological structure of these graphs/networks is a fundamental part of the analysis, and in many cases the main goal of the study. However, little work has been done on incorporating prior topological knowledge into the estimation of the underlying graphical models from sample data. In this work we propose extensions to the basic joint regression model for network estimation, which explicitly incorporate graph-topological constraints into the corresponding optimization approach. The first proposed extension includes an eigenvector centrality constraint, thereby promoting this important prior topological property. The second developed extension promotes the formation of certain motifs, triangle-shaped ones in particular, which are known to exist for example in genetic regulatory networks. The presentation of the underlying formulations, which serve as examples of the introduction of topological constraints in network estimation, is complemented with examples in diverse datasets demonstrating the importance of incorporating such critical prior knowledge.
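As a concrete reference point for the eigenvector-centrality constraint this abstract mentions: eigenvector centrality is just the leading eigenvector of the adjacency matrix. A hedged sketch via power iteration follows (the paper embeds this quantity as a constraint in a joint regression, which is not reproduced here; the graph and function name are illustrative assumptions):

```python
import numpy as np

def eigenvector_centrality(adj, iters=200):
    """Approximate the leading eigenvector of a symmetric nonnegative
    adjacency matrix by power iteration; entries rank vertices by
    centrality (converges on connected non-bipartite graphs)."""
    x = np.ones(adj.shape[0])
    for _ in range(iters):
        x = adj @ x
        x /= np.linalg.norm(x)
    return x

# Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2:
# vertex 2 sits in the triangle AND holds the pendant, so it should
# receive the highest centrality; the pendant the lowest.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
```

Starting from a positive vector keeps all iterates nonnegative, so the result is the Perron eigenvector up to normalization.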

3 0.72786719 253 nips-2012-On Triangular versus Edge Representations --- Towards Scalable Modeling of Networks

Author: Qirong Ho, Junming Yin, Eric P. Xing

Abstract: In this paper, we argue for representing networks as a bag of triangular motifs, particularly for important network problems that current model-based approaches handle poorly due to computational bottlenecks incurred by using edge representations. Such approaches require both 1-edges and 0-edges (missing edges) to be provided as input, and as a consequence, approximate inference algorithms for these models usually require Ω(N²) time per iteration, precluding their application to larger real-world networks. In contrast, triangular modeling requires less computation, while providing equivalent or better inference quality. A triangular motif is a vertex triple containing 2 or 3 edges, and the number of such motifs is Θ(Σᵢ Dᵢ²) (where Dᵢ is the degree of vertex i), which is much smaller than N² for low-maximum-degree networks. Using this representation, we develop a novel mixed-membership network model and approximate inference algorithm suitable for large networks with low max-degree. For networks with high maximum degree, the triangular motifs can be naturally subsampled in a node-centric fashion, allowing for much faster inference at a small cost in accuracy. Empirically, we demonstrate that our approach, when compared to that of an edge-based model, has faster runtime and improved accuracy for mixed-membership community detection. We conclude with a large-scale demonstration on an N ≈ 280,000-node network, which is infeasible for network models with Ω(N²) inference cost.
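To make the motif representation in this abstract concrete: a triangular motif is any vertex triple spanning exactly 2 or 3 edges. A brute-force sketch follows, using O(N³) enumeration purely for clarity; the paper's point is that only Θ(Σᵢ Dᵢ²) such triples exist, which a neighbor-pair enumeration would exploit. The function name and test graph are illustrative assumptions:

```python
import itertools
import numpy as np

def triangular_motifs(adj):
    """Return (#2-edge triples, #3-edge triples) over all vertex triples
    of a symmetric 0/1 adjacency matrix. Brute-force O(N^3) enumeration,
    for illustration only."""
    n = adj.shape[0]
    two_edge = three_edge = 0
    for i, j, k in itertools.combinations(range(n), 3):
        e = adj[i, j] + adj[i, k] + adj[j, k]
        if e == 2:
            two_edge += 1    # open wedge: exactly two of the three edges
        elif e == 3:
            three_edge += 1  # closed triangle
    return two_edge, three_edge

# Triangle 0-1-2 plus the edge 2-3: one closed triangle (0,1,2)
# and two open wedges, (0,2,3) and (1,2,3).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
```

Triples with 0 or 1 edges are discarded, which is exactly why the motif bag stays small on sparse, low-degree networks.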

4 0.68179572 337 nips-2012-The Lovász ϑ function, SVMs and finding large dense subgraphs

Author: Vinay Jethava, Anders Martinsson, Chiranjib Bhattacharyya, Devdatt Dubhashi

Abstract: The Lovász ϑ function of a graph, a fundamental tool in combinatorial optimization and approximation algorithms, is computed by solving an SDP. In this paper we establish that the Lovász ϑ function is equivalent to a kernel learning problem related to one-class SVM. This interesting connection opens up many opportunities bridging graph-theoretic algorithms and machine learning. We show that there exist graphs, which we call SVM−ϑ graphs, on which the Lovász ϑ function can be approximated well by a one-class SVM. This leads to a novel use of SVM techniques for solving algorithmic problems in large graphs, e.g. identifying a planted clique of size Θ(√n) in a random graph G(n, 1/2). A classic approach for this problem involves computing the ϑ function; however, it is not scalable due to the SDP computation. We show that the random graph with a planted clique is an example of an SVM−ϑ graph. As a consequence, an SVM-based approach easily identifies the clique in large graphs and is competitive with the state of the art. We introduce the notion of common orthogonal labelling and show that it can be computed by solving a multiple kernel learning problem. It is further shown that such a labelling is extremely useful in identifying a large common dense subgraph in multiple graphs, which is known to be a computationally difficult problem. The proposed algorithm achieves an order of magnitude scalability compared to state-of-the-art methods.

5 0.61755353 10 nips-2012-A Linear Time Active Learning Algorithm for Link Classification

Author: Nicolò Cesa-bianchi, Claudio Gentile, Fabio Vitale, Giovanni Zappella

Abstract: We present very efficient active learning algorithms for link classification in signed networks. Our algorithms are motivated by a stochastic model in which edge labels are obtained through perturbations of an initial sign assignment consistent with a two-clustering of the nodes. We provide a theoretical analysis within this model, showing that we can achieve an optimal (to within a constant factor) number of mistakes on any graph G = (V, E) such that |E| = Ω(|V|^(3/2)) by querying O(|V|^(3/2)) edge labels. More generally, we show an algorithm that achieves optimality to within a factor of O(k) by querying at most order of |V| + (|V|/k)^(3/2) edge labels. The running time of this algorithm is at most of order |E| + |V| log |V|.

6 0.60813904 287 nips-2012-Random function priors for exchangeable arrays with applications to graphs and relational data

7 0.54995012 199 nips-2012-Link Prediction in Graphs with Autoregressive Features

8 0.5490306 157 nips-2012-Identification of Recurrent Patterns in the Activation of Brain Networks

9 0.53660482 172 nips-2012-Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs

10 0.52763498 298 nips-2012-Scalable Inference of Overlapping Communities

11 0.52306187 309 nips-2012-Semi-supervised Eigenvectors for Locally-biased Learning

12 0.50874251 327 nips-2012-Structured Learning of Gaussian Graphical Models

13 0.49376264 180 nips-2012-Learning Mixtures of Tree Graphical Models

14 0.4652009 135 nips-2012-Forging The Graphs: A Low Rank and Positive Semidefinite Graph Learning Approach

15 0.46449721 6 nips-2012-A Convex Formulation for Learning Scale-Free Networks via Submodular Relaxation

16 0.46260986 177 nips-2012-Learning Invariant Representations of Molecules for Atomization Energy Prediction

17 0.45211896 186 nips-2012-Learning as MAP Inference in Discrete Graphical Models

18 0.44808617 196 nips-2012-Learning with Partially Absorbing Random Walks

19 0.44407624 345 nips-2012-Topic-Partitioned Multinetwork Embeddings

20 0.43551466 326 nips-2012-Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.039), (14, 0.012), (21, 0.023), (23, 0.343), (36, 0.016), (38, 0.085), (39, 0.016), (42, 0.022), (53, 0.016), (54, 0.016), (55, 0.034), (74, 0.066), (76, 0.111), (80, 0.069), (92, 0.035)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.696311 132 nips-2012-Fiedler Random Fields: A Large-Scale Spectral Approach to Statistical Network Modeling

Author: Antonino Freno, Mikaela Keller, Marc Tommasi

Abstract: Statistical models for networks have typically been committed to strong prior assumptions concerning the form of the modeled distributions. Moreover, the vast majority of currently available models are explicitly designed for capturing some specific graph properties (such as power-law degree distributions), which makes them unsuitable for application to domains where the behavior of the target quantities is not known a priori. The key contribution of this paper is twofold. First, we introduce the Fiedler delta statistic, based on the Laplacian spectrum of graphs, which allows us to dispense with any parametric assumption concerning the modeled network properties. Second, we use the defined statistic to develop the Fiedler random field model, which allows for efficient estimation of edge distributions over large-scale random networks. After analyzing the dependence structure involved in Fiedler random fields, we estimate them over several real-world networks, showing that they achieve a much higher modeling accuracy than other well-known statistical approaches.

2 0.66333026 214 nips-2012-Minimizing Sparse High-Order Energies by Submodular Vertex-Cover

Author: Andrew Delong, Olga Veksler, Anton Osokin, Yuri Boykov

Abstract: Inference in high-order graphical models has become important in recent years. Several approaches are based, for example, on generalized message-passing, or on transformation to a pairwise model with extra ‘auxiliary’ variables. We focus on a special case where a much more efficient transformation is possible. Instead of adding variables, we transform the original problem into a comparatively small instance of submodular vertex-cover. These vertex-cover instances can then be attacked by existing algorithms (e.g. belief propagation, QPBO), where they often run 4–15 times faster and find better solutions than when applied to the original problem. We evaluate our approach on synthetic data, then we show applications within a fast hierarchical clustering and model-fitting framework.

3 0.59393156 41 nips-2012-Ancestor Sampling for Particle Gibbs

Author: Fredrik Lindsten, Thomas Schön, Michael I. Jordan

Abstract: We present a novel method in the family of particle MCMC methods that we refer to as particle Gibbs with ancestor sampling (PG-AS). Similarly to the existing PG with backward simulation (PG-BS) procedure, we use backward sampling to (considerably) improve the mixing of the PG kernel. Instead of using separate forward and backward sweeps as in PG-BS, however, we achieve the same effect in a single forward sweep. We apply the PG-AS framework to the challenging class of non-Markovian state-space models. We develop a truncation strategy of these models that is applicable in principle to any backward-simulation-based method, but which is particularly well suited to the PG-AS framework. In particular, as we show in a simulation study, PG-AS can yield an order-of-magnitude improved accuracy relative to PG-BS due to its robustness to the truncation error. Several application examples are discussed, including Rao-Blackwellized particle smoothing and inference in degenerate state-space models.

4 0.45857397 274 nips-2012-Priors for Diversity in Generative Latent Variable Models

Author: James T. Kwok, Ryan P. Adams

Abstract: Probabilistic latent variable models are one of the cornerstones of machine learning. They offer a convenient and coherent way to specify prior distributions over unobserved structure in data, so that these unknown properties can be inferred via posterior inference. Such models are useful for exploratory analysis and visualization, for building density models of data, and for providing features that can be used for later discriminative tasks. A significant limitation of these models, however, is that draws from the prior are often highly redundant due to i.i.d. assumptions on internal parameters. For example, there is no preference in the prior of a mixture model to make components non-overlapping, or in a topic model to ensure that co-occurring words only appear in a small number of topics. In this work, we revisit these independence assumptions for probabilistic latent variable models, replacing the underlying i.i.d. prior with a determinantal point process (DPP). The DPP allows us to specify a preference for diversity in our latent variables using a positive definite kernel function. Using a kernel between probability distributions, we are able to define a DPP on probability measures. We show how to perform MAP inference with DPP priors in latent Dirichlet allocation and in mixture models, leading to better intuition for the latent variable representation and quantitatively improved unsupervised feature extraction, without compromising the generative aspects of the model.

5 0.45789889 210 nips-2012-Memorability of Image Regions

Author: Aditya Khosla, Jianxiong Xiao, Antonio Torralba, Aude Oliva

Abstract: While long-term human visual memory can store a remarkable amount of visual information, it tends to degrade over time. Recent works have shown that image memorability is an intrinsic property of an image that can be reliably estimated using state-of-the-art image features and machine learning algorithms. However, the class of features and image information that is forgotten has not been explored yet. In this work, we propose a probabilistic framework that models how and which local regions from an image may be forgotten using a data-driven approach that combines local and global image features. The model automatically discovers memorability maps of individual images without any human annotation. We incorporate multiple image region attributes in our algorithm, leading to improved memorability prediction of images as compared to previous works.

6 0.45651934 172 nips-2012-Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs

7 0.4564721 168 nips-2012-Kernel Latent SVM for Visual Recognition

8 0.45642522 324 nips-2012-Stochastic Gradient Descent with Only One Projection

9 0.45391053 188 nips-2012-Learning from Distributions via Support Measure Machines

10 0.45285183 260 nips-2012-Online Sum-Product Computation Over Trees

11 0.45279106 197 nips-2012-Learning with Recursive Perceptual Representations

12 0.45204195 316 nips-2012-Small-Variance Asymptotics for Exponential Family Dirichlet Process Mixture Models

13 0.45190805 83 nips-2012-Controlled Recognition Bounds for Visual Learning and Exploration

14 0.45185825 193 nips-2012-Learning to Align from Scratch

15 0.45183009 355 nips-2012-Truncation-free Online Variational Inference for Bayesian Nonparametric Models

16 0.45152563 229 nips-2012-Multimodal Learning with Deep Boltzmann Machines

17 0.45085216 92 nips-2012-Deep Representations and Codes for Image Auto-Annotation

18 0.45023131 176 nips-2012-Learning Image Descriptors with the Boosting-Trick

19 0.45012134 209 nips-2012-Max-Margin Structured Output Regression for Spatio-Temporal Action Localization

20 0.44956306 354 nips-2012-Truly Nonparametric Online Variational Inference for Hierarchical Dirichlet Processes