nips nips2012 nips2012-253 knowledge-graph by maker-knowledge-mining

253 nips-2012-On Triangular versus Edge Representations --- Towards Scalable Modeling of Networks


Source: pdf

Author: Qirong Ho, Junming Yin, Eric P. Xing

Abstract: In this paper, we argue for representing networks as a bag of triangular motifs, particularly for important network problems that current model-based approaches handle poorly due to computational bottlenecks incurred by using edge representations. Such approaches require both 1-edges and 0-edges (missing edges) to be provided as input, and as a consequence, approximate inference algorithms for these models usually require Ω(N^2) time per iteration, precluding their application to larger real-world networks. In contrast, triangular modeling requires less computation, while providing equivalent or better inference quality. A triangular motif is a vertex triple containing 2 or 3 edges, and the number of such motifs is Θ(∑_i D_i^2) (where D_i is the degree of vertex i), which is much smaller than N^2 for low-maximum-degree networks. Using this representation, we develop a novel mixed-membership network model and approximate inference algorithm suitable for large networks with low max-degree. For networks with high maximum degree, the triangular motifs can be naturally subsampled in a node-centric fashion, allowing for much faster inference at a small cost in accuracy. Empirically, we demonstrate that our approach, when compared to that of an edge-based model, has faster runtime and improved accuracy for mixed-membership community detection. We conclude with a large-scale demonstration on an N ≈ 280,000-node network, which is infeasible for network models with Ω(N^2) inference cost.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 In this paper, we argue for representing networks as a bag of triangular motifs, particularly for important network problems that current model-based approaches handle poorly due to computational bottlenecks incurred by using edge representations. [sent-8, score-0.547]

2 In contrast, triangular modeling requires less computation, while providing equivalent or better inference quality. [sent-10, score-0.344]

3 A triangular motif is a vertex triple containing 2 or 3 edges, and the number of such motifs is Θ(∑_i D_i^2) (where D_i is the degree of vertex i), which is much smaller than N^2 for low-maximum-degree networks. [sent-11, score-0.933]

4 Using this representation, we develop a novel mixed-membership network model and approximate inference algorithm suitable for large networks with low max-degree. [sent-12, score-0.263]

5 For networks with high maximum degree, the triangular motifs can be naturally subsampled in a node-centric fashion, allowing for much faster inference at a small cost in accuracy. [sent-13, score-0.754]

6 Empirically, we demonstrate that our approach, when compared to that of an edge-based model, has faster runtime and improved accuracy for mixed-membership community detection. [sent-14, score-0.277]

7 We conclude with a large-scale demonstration on an N ≈ 280,000-node network, which is infeasible for network models with Ω(N^2) inference cost. [sent-15, score-0.205]

8 The fundamental difference between link prediction and community detection is that the first is concerned with link outcomes on pairs of vertices, for which providing links as input is intuitive. [sent-18, score-0.278]

9 However, the second task is about discovering the community memberships of individual vertices, and links are in fact no longer the only sensible representation. [sent-19, score-0.204]

10 By representing the input network as a bag of triangular motifs — by which we mean vertex triples with 2 or 3 edges — one can design novel models for mixed-membership community detection that outperform models based on the adjacency matrix representation. [sent-20, score-1.134]

11 In the traditional edge representation, if N is the number of vertices, then the adjacency matrix has size Θ(N^2); thus, any network analysis algorithm that touches every element must have Ω(N^2) runtime complexity. [sent-22, score-0.272]

12 For probabilistic network models, this statement applies to the cost of approximate inference. (Figure 1: Four types of triangular motifs: (a) full-triangle; (b) 2-triangle; (c) 1-triangle; (d) empty-triangle.) [sent-23, score-0.423]

13 For mixed-membership community detection, we only focus on full-triangles and 2-triangles. [sent-24, score-0.179]

14 With an inference cost of Ω(N^2), even modestly large networks with only ~10,000 vertices are infeasible, to say nothing of modern social networks with millions of vertices or more. [sent-29, score-0.532]

15 On the other hand, it can be shown that the number of 2- and 3-edge triangular motifs is upper bounded by Θ(∑_i D_i^2), where D_i is the degree of vertex i. [sent-30, score-0.745]
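To make this bound concrete, here is a small sketch (our own illustration, not the authors' code; the adjacency-set representation and names are assumptions). It enumerates vertex triples by wedge center, so every 2-triangle is found once at its center and every full triangle is found three times, and the total number of wedges, open or closed, is exactly ∑_i D_i(D_i − 1)/2:

```python
from itertools import combinations

def count_motifs(adj):
    """Count 2-triangles (open wedges) and full triangles in an undirected
    graph given as {vertex: set-of-neighbors}. Each wedge is enumerated at
    its center vertex; a closed wedge is seen once per triangle vertex."""
    wedges, closed = 0, 0
    for i, nbrs in adj.items():
        for j, k in combinations(sorted(nbrs), 2):
            if k in adj[j]:
                closed += 1      # full triangle, counted 3 times in total
            else:
                wedges += 1      # 2-triangle centered at i
    return wedges, closed // 3

# Toy graph: triangle {0,1,2} plus a pendant edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
wedges, tris = count_motifs(adj)
# Lemma-style budget: sum_i D_i*(D_i - 1)/2 wedges in total.
bound = sum(len(n) * (len(n) - 1) // 2 for n in adj.values())
```

On the toy graph this yields 2 two-triangles and 1 full triangle against a wedge budget of 5, consistent with the Θ(∑_i D_i^2) scaling.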

16 For networks with low maximum degree, this quantity is much smaller than N^2, allowing us to construct more parsimonious models with faster inference algorithms. [sent-31, score-0.179]

17 Moreover, for networks with high maximum degree, one can subsample Θ(Nδ^2) of these triangular motifs in a node-centric fashion, where δ is a user-chosen parameter. [sent-32, score-0.671]

18 Specifically, we assign triangular motifs to nodes in a natural manner, and then subsample motifs only from nodes with too many of them. [sent-33, score-0.808]

19 As we will show, a triangular representation does not preserve all information found in an edge representation. [sent-39, score-0.372]

20 When it comes to networks, triangular motifs (Figure 1) are already of significant interest in biology [13], social science [19, 9, 10, 16], and data mining [21, 18, 8]. [sent-48, score-0.589]

21 In particular, 2- and 3-edge triangular motifs are central to the notion of transitivity in the social sciences — if we observe edges A-B and B-C, does A have an edge to C as well? [sent-49, score-0.648]

22 In fact, the ratio of 3-edge triangles to connected vertex triples (i.e. 2- and 3-edge triangular motifs) is precisely the definition of the network clustering coefficient [16], which is a popular measure of cluster strength. [sent-53, score-0.348] [sent-55, score-0.441]

24 In the following sections, we begin by characterizing the triangular motifs, following which we develop a mixed-membership model and inference algorithm based on these motifs. [sent-56, score-0.344]

25 Our model, which we call MMTM or the Mixed-Membership Triangular Model, performs mixed-membership community detection, assigning each vertex i to a mixture of communities. [sent-57, score-0.308]

26 ) can be combined by fusing their latent spaces, resulting in a multi-view model — for example, [14, 5] model both text and network data from the same mixed-membership vectors. [sent-61, score-0.178]

27 After developing our model and inference algorithm, we present simulated experiments comparing them on a variety of network types to an adjacency-matrix-based model (MMSB) and its inference algorithm. [sent-63, score-0.208]

28 These experiments will show that triangular mixed-membership modeling results in both faster inference and more accurate mixed-membership recovery. [sent-64, score-0.364]

29 We conclude by demonstrating our model/algorithm on a network with N ≈ 280,000 nodes and ~2,300,000 edges, which is far too large for Ω(N^2) inference algorithms such as variational MMSB [1] and the Gibbs sampling MMSB inference algorithm we developed for our experiments. [sent-65, score-0.208]

30 Most of the ideas presented here also generalize to directed networks, though the analysis is more involved since directed networks can generate more motifs than undirected ones. [sent-67, score-0.384]

31 Now, define a triangular motif Eijk involving vertices i < j < k to be the type of subgraph over these 3 vertices. [sent-69, score-0.513]

32 There are 4 basic classes of triangular motifs (Figure 1), distinguished by their number of 1-edges: full-triangle ∆3 (three 1-edges), 2-triangle ∆2 (two 1-edges), 1-triangle ∆1 (one 1-edge), and empty-triangle ∆0 (no 1-edges). [sent-70, score-0.536]

33 Since the full-triangle and 2-triangle classes are regarded as the basic structural elements of most networks [19, 13, 9, 10, 16], we naturally expect them to characterize most of the community structure in networks (cf. [sent-76, score-0.375]

34 In particular, the ∆3 and ∆2 triangular motifs preserve almost all 1-edges from the original network: every 1-edge appears in some triangular motif ∆2 , ∆3 , except for isolated 1-edges (i.e. connected components of size 2), which are less interesting from a large-scale community detection perspective. [sent-78, score-0.923] [sent-80, score-0.22]

36 For real networks, which have far more 0- than 1-edges, focusing only on ∆3 and ∆2 greatly reduces the number of triangular motifs, via the following lemma: [sent-82, score-0.301]

37 Lemma 1. The total number of ∆3's and ∆2's is upper bounded by ∑_i (1/2)·D_i(D_i − 1) = Θ(∑_i D_i^2), where D_i is the degree of vertex i. [sent-83, score-0.209]

38 It is easy to see that each ∆2 is accounted for by exactly one Ti , where i is the center vertex of the ∆2 , and that each ∆3 is accounted for by three sets Ti , Tj and Tk , one for each vertex in the full-triangle. [sent-88, score-0.258]

39 For networks with low maximum degree D, Θ(∑_i D_i^2) = Θ(N·D^2) is typically much smaller than Θ(N^2), allowing triangular models to scale to larger networks than edge-based models. [sent-90, score-0.595]

40 A full-triangle ∆3 associated with vertices i, j and k shall appear in the final subsample only if it has been subsampled from at least one of Ti , Tj and Tk . [sent-92, score-0.249]

41 To obtain the set of all subsampled triangles ∆2 and ∆3 , we simply take the union of subsampled triangles from each Ti , discarding those full-triangles duplicated in the subsamples. [sent-93, score-0.516]
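A minimal sketch of this node-centric δ-subsampling follows (our own illustration, not the paper's exact procedure; `delta_subsample` and the per-vertex cap are assumptions). Each vertex owns the set T_i of wedges centered at it, over-full vertices are downsampled uniformly, and full triangles carry a canonical key so the final set union discards copies kept by several of their three owners:

```python
import random
from itertools import combinations

def delta_subsample(adj, delta, seed=0):
    """Node-centric subsampling: vertex i owns the wedges centered at i; if
    it owns more than delta*(delta-1)/2 motifs (the count for a
    degree-delta vertex), keep a uniform sample. Full triangles get a
    canonical sorted key so the union removes duplicates."""
    rng = random.Random(seed)
    cap = delta * (delta - 1) // 2
    kept = set()
    for i, nbrs in adj.items():
        owned = []
        for j, k in combinations(sorted(nbrs), 2):
            if k in adj[j]:                            # full triangle
                owned.append((tuple(sorted((i, j, k))), "full"))
            else:                                      # 2-triangle centered at i
                owned.append(((i, j, k), "open"))
        if len(owned) > cap:
            owned = rng.sample(owned, cap)             # uniform downsample
        kept.update(owned)
    return kept

adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
sub = delta_subsample(adj, delta=3)   # cap of 3: nothing is dropped here
```

With δ = 3 the toy graph keeps all three motifs (the triangle is stored once despite being owned by three vertices); with δ = 2 only the over-full vertex 2 is downsampled.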

42 Although this node-centric subsampling does not preserve all properties of a network, such as the distribution of node degrees, it approximately preserves the local cluster properties of each vertex, thus capturing most of the community structure in networks. [sent-94, score-0.298]

43 Specifically, the “local” clustering coefficient (LCC) of each vertex i, defined as the ratio of #(∆3 ) touching i to #(∆3 , ∆2 ) touching i, is well-preserved. [sent-95, score-0.195]

44 This follows from subsampling the ∆3 and ∆2 ’s at i uniformly at random, though the LCC has a small upwards bias since each ∆3 may also be sampled by the other two vertices j and k. [sent-96, score-0.223]
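As a sketch of the LCC definition used in the text (ours, not the authors' code), one can compute it directly from the motifs: count full triangles and 2-triangles touching vertex i, counting each full triangle only once even though it appears as a closed wedge at all three of its vertices:

```python
from itertools import combinations

def lcc(adj, i):
    """Local clustering coefficient of vertex i, per the text's definition:
    #(full triangles touching i) / #(full + 2-edge triangles touching i)."""
    full = open_ = 0
    for c, nbrs in adj.items():                  # c is the wedge center
        for j, k in combinations(sorted(nbrs), 2):
            if k in adj[j]:
                # closed wedge: count the triangle only at its smallest vertex
                if c == min(c, j, k) and i in (c, j, k):
                    full += 1
            elif i in (c, j, k):
                open_ += 1                       # 2-triangle touching i
    return full / (full + open_) if full + open_ else 0.0

adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
```

On this toy graph, vertex 0 touches one full triangle and one 2-triangle (LCC = 1/2), while the pendant vertex 3 touches no full triangle (LCC = 0).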

45 Hence, we expect community detection based on the subsampled triangles to be nearly as accurate as with the original set of triangles — which our experiments will show. [sent-97, score-0.679]

46 We note that other subsampling strategies [11, 22] preserve various network properties, such as degree distribution, diameter, and inter-node random walk times. [sent-98, score-0.321]

47 In our triangular model, the main property of interest is the distribution over ∆3 and ∆2 , analogous to how latent factor models and MMSB model distributions over 0- and 1-edges. [sent-99, score-0.375]

48 In contrast, 0/1-edge subsampling for MMSB and latent factor models is difficult: most networks have Θ(N^2) 0-edges but only o(N^2) 1-edges, thus sampling o(N^2) 0/1-edges leads to high variance in their distribution. [sent-103, score-0.264]

49 (Section 3: Mixed-Membership Triangular Model.) Given a network, now represented by triangular motifs ∆3 and ∆2 , our goal is to perform community detection for each network vertex i, in the same sense as what an MMSB model would enable. [sent-104, score-1.007]

50 Under an MMSB, each vertex i is assigned to a mixture over communities, as opposed to traditional single-membership community detection, which assigns each vertex to exactly one community. [sent-105, score-0.437]

51 Following a design principle similar to the one underlying the MMSB models, we now present a new mixed-membership network model built on the more parsimonious triangular representation. [sent-107, score-0.469]

52 We assume K latent communities, and that each vertex takes a distribution (i.e., a mixed membership) over them. [sent-113, score-0.185]

53 More specifically, each vertex i is associated with a community mixed-membership vector θi restricted to the (K − 1)-simplex ∆^(K−1). [sent-118, score-0.308]

54 This mixed-membership vector θi is used to generate community indicators si,jk ∈ {1, . . . , K}, each of which represents the community chosen by vertex i when it is forming a triangle with vertices j and k. [sent-119, score-0.179] [sent-122, score-0.497]

56 The probability of observing a triangular motif Eijk depends on the community-triplet si,jk , sj,ik , sk,ij , and a tensor of multinomial parameters B. [sent-123, score-0.378]

57 Then, Bxyz ∈ ∆^3 (the 3-simplex) represents the probabilities of generating the 4 triangular motifs among vertices i, j and k. [sent-128, score-0.432]

58 In detail, Bxyz,1 is the probability of the 2-triangle whose center vertex has community x, and analogously for Bxyz,2 and community y, and for Bxyz,3 and community z; Bxyz,4 is the probability of the full-triangle. [sent-129, score-0.666]

59 – Generate the triangular motif Eijk based on Bxyz and the ordered values of si,jk , sj,ik , sk,ij ; see Table 1 for the exact conditional probabilities. [sent-138, score-0.36]
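The generative steps above can be sketched as follows (our own toy instantiation; the sizes, hyperparameters, and variable names are illustrative assumptions, and the exact conditional table is in the paper's Table 1). Each vertex draws a community for the triple from its mixed-membership vector, then the motif outcome is drawn from the multinomial indexed by that community triplet:

```python
import numpy as np

def generate_motif(theta_i, theta_j, theta_k, B, rng):
    """One draw of the MMTM generative process for a vertex triple (i,j,k):
    each vertex picks a community for this triple from its mixed-membership
    vector, then the motif outcome is drawn from the multinomial B[x,y,z]."""
    x = rng.choice(len(theta_i), p=theta_i)   # s_{i,jk}
    y = rng.choice(len(theta_j), p=theta_j)   # s_{j,ik}
    z = rng.choice(len(theta_k), p=theta_k)   # s_{k,ij}
    # B[x,y,z] holds 4 probabilities: 2-triangle centered at i, at j, at k,
    # and the full triangle.
    motif = rng.choice(4, p=B[x, y, z])
    return (x, y, z), motif

K = 2
rng = np.random.default_rng(0)
theta = rng.dirichlet(np.full(K, 0.5), size=3)   # memberships of i, j, k
B = rng.dirichlet(np.ones(4), size=(K, K, K))    # each B[x,y,z] in the 3-simplex
communities, motif = generate_motif(theta[0], theta[1], theta[2], B, rng)
```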

60 Thus, only the community indices s need to be sampled. [sent-141, score-0.179]

61 It is possible to generate a set of triangles that does not correspond to a network. [sent-144, score-0.201]

62 where s−ijk is the set of all community memberships except for si,jk , sj,ik , sk,ij , and si,−jk is the set of all community memberships of vertex i except for si,jk . [sent-153, score-0.537]

63 (Section 5: Comparing Mixed-Membership Network Models on Synthetic Networks.) For a mixed-membership network model to be useful, it must recover some meaningful notion of mixed community membership for each vertex. [sent-156, score-0.393]

64 The precise definition of network community has been a subject of much debate, and various notions of community [1, 15, 17, 12, 6] have been proposed under different motivations. [sent-157, score-0.48]

65 Our MMTM, too, conveys another notion of community based on membership in full triangles ∆3 and 2-triangles ∆2 , which are key aspects of network clustering coefficients. [sent-158, score-0.585]

66 We shall also demonstrate that MMTM leads to faster inference, particularly when δ-subsampling triangles (as described in Section 2). [sent-161, score-0.245]

67 Intuitively, we expect the mixed-membership recovery of our inference algorithm to depend on (a) the degree distribution of the network, and (b) the “degree limit” δ used in subsampling the network; performance should increase as the number of vertices i having degree Di ≤ δ goes up. [sent-162, score-0.445]

68 In particular, our experiments will demonstrate that subsampling yields good performance even when the network contains a few vertices with very large degree Di (a characteristic of many real-world networks). [sent-163, score-0.425]

69 Synthetic networks. We compared our MMTM to MMSB [1] on multiple synthetic networks, evaluating them according to how well their inference algorithms recovered the vertex mixed-membership vectors θi . [sent-164, score-0.341]

70 Specifically, we generated M = 60,000 1-edges as follows: (a) pick a vertex i with probability proportional to its degree; (b) randomly pick a destination community k from θi ; (c) find the set Vk of all vertices v such that θvk is the largest element of θv (i.e. the vertices that mostly belong to community k); (d) within Vk , pick the destination vertex j with probability proportional to its degree. [sent-179, score-0.493] [sent-181, score-0.475]
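Steps (a)-(d) can be sketched as follows (a hypothetical re-implementation with made-up sizes and names, not the authors' generator):

```python
import numpy as np

def generate_edges(theta, degrees, M, rng):
    """Generate M undirected 1-edges per steps (a)-(d): source i picked
    proportional to degree, destination community k drawn from theta_i,
    destination j picked proportional to degree within V_k, the set of
    vertices whose largest membership is community k."""
    N, K = theta.shape
    p_deg = degrees / degrees.sum()
    dominant = theta.argmax(axis=1)            # each vertex's main community
    edges = set()
    while len(edges) < M:
        i = rng.choice(N, p=p_deg)             # (a)
        k = rng.choice(K, p=theta[i])          # (b)
        Vk = np.flatnonzero(dominant == k)     # (c)
        w = degrees[Vk] / degrees[Vk].sum()
        j = Vk[rng.choice(len(Vk), p=w)]       # (d)
        if i != j:
            edges.add((min(i, j), max(i, j)))
    return edges

rng = np.random.default_rng(1)
theta = np.array([[0.9, 0.1], [0.8, 0.2], [0.7, 0.3],
                  [0.2, 0.8], [0.1, 0.9], [0.3, 0.7]])
degrees = np.ones(6)
edges = generate_edges(theta, degrees, M=4, rng=rng)
```

Rejecting self-loops and storing edges as sorted pairs keeps the output a simple undirected edge set.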

72 Assign each group to a (different) dominant community k ∈ {1, . . . , K}. [sent-192, score-0.208]

73 In addition, we generated a fourth “pure membership” network under the MMSB model, using pure θi ’s with full membership in the dominant community. [sent-205, score-0.246]

74 To investigate the effect of δ-subsampling triangles (Section 2), we repeated every MMTM experiment under four different values of δ: 20, 15, 10 and 5. [sent-214, score-0.201]

75 The triangles were subsampled prior to running the Gibbs sampler, and they remained fixed during inference. [sent-215, score-0.258]

76 The mixed-membership vectors θi (πi in the original paper) and blockmatrix B were integrated out, and we Gibbs sampled each edge (i, j)’s associated community indicators zi→j , zi←j in a block fashion. [sent-218, score-0.205]

77 MMTM also requires less runtime for all but the biased scale-free network, which has a much larger maximum degree than the others (Table 2). [sent-225, score-0.189]

78 The runtime benefit is most noticeable on the biased scale-free network, underscoring the need to subsample real-world networks with high maximum degree. [sent-227, score-0.244]

79 We hypothesize MMSB’s poorer performance on networks of this size (N = 4,000) results from having Θ(N^2) latent variables, while noting that the literature has only considered smaller N < 1,000 networks [1]. [sent-228, score-0.252]

80 Compared to MMTM, having many latent variables not only increases runtime per iteration, but also the number of iterations required for convergence, since the latent variable state space grows exponentially with the number of latent variables. [sent-229, score-0.246]

81 In support of this, we have observed... (As explained in Section 2, we first need to preprocess the network adjacency list into the ∆3 , ∆2 triangle representation.) [sent-230, score-0.226]

82 The generative parameters for the N = 4,000 network are identical to our earlier experiment, while the parameters for the other three network sizes were adjusted to maintain the same average degree. [sent-236, score-0.262]

83 The δ-subsampled MMTM experiments show linear runtime dependence on N, which is expected since the number of subsampled triangles is O(Nδ^2). [sent-239, score-0.336]

84 To demonstrate this, we ran the MMTM Gibbs sampler with δ = 20 on the SNAP Stanford Web Graph, containing N = 281,903 vertices (webpages), 2,312,497 1-edges, and approximately 4 billion 2- and 3-edge triangles ∆3 , ∆2 , which we reduced to 11,353,778 via δ = 20 subsampling. [sent-244, score-0.39]

85 Note that the vast majority of triangles are associated with exceptionally high-degree vertices, which make up a small fraction of the network. [sent-245, score-0.201]

86 By using δ-subsampling, we limited the number of triangles that come from such vertices, thus making the network feasible for MMTM. [sent-246, score-0.323]

87 Every vertex i is displayed as a circle ci , whose size is proportional to the network degree of i. [sent-253, score-0.373]

88 Figure 4: Per-iteration runtimes for MMSB, MMTM and MMTM with δ-subsampling, on synthetic networks with N ranging from 1,000 to 40,000, but with constant average degree (x-axis: number of vertices, ×10^4). [sent-267, score-0.254]

89 Most high-degree vertices (large circles) are found at the pentagon’s corners, leading to the intuitive conclusion that the five communities are centered on hub webpages with many links. [sent-269, score-0.228]

90 Finally, if we focus on the sets of vertices near each corner, we see that the green and red sets have distinct degree (i. [sent-271, score-0.228]

91 However, the bag-of-network-motifs idea extends beyond triangles — one could easily consider subgraphs over 4 or more vertices, as in [13]. [sent-275, score-0.233]

92 As with triangular motifs, it is algorithmically infeasible to consider all possible subgraphs; rather, we must focus our attention on a meaningful subset of them. [sent-276, score-0.323]

93 Nevertheless, higher order motifs could be more suited for particular tasks, thus meriting their investigation. [sent-277, score-0.235]

94 In modeling terms, we have applied triangular motifs to a generative mixed-membership setting, which is suitable for visualization but not necessarily for attribute prediction. [sent-278, score-0.554]

95 Recent developments in constrained learning of generative models [23, 24] have yielded significant improvements in predictive accuracy, and these techniques are also applicable to mixed-membership triangular modeling. [sent-279, score-0.337]

96 To summarize, we have introduced the bag-of-triangles representation as a parsimonious alternative to the network adjacency matrix, and developed a model (MMTM) and inference algorithm for mixed-membership community detection in networks. [sent-281, score-0.495]

97 Compared to mixed-membership models that use the adjacency matrix (exemplified by MMSB), our model features a much smaller latent variable space, leading to faster inference and better performance at mixed-membership recovery. [sent-282, score-0.183]

98 When combined with triangle subsampling, our model and inference algorithm scale easily to networks with 100,000s of vertices, which are completely infeasible for Θ(N^2) adjacency-matrix-based models; the adjacency matrix might not even fit in memory, to say nothing of runtime. [sent-283, score-0.285]

99 With triangle separators, we expect triangle models to scale to networks with millions of vertices and more. [sent-286, score-0.363]

100 Fast counting of triangles in large real networks without counting: Algorithms and laws. [sent-446, score-0.299]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('mmtm', 0.613), ('mmsb', 0.445), ('triangular', 0.301), ('motifs', 0.235), ('triangles', 0.201), ('community', 0.179), ('eijk', 0.144), ('vertices', 0.131), ('vertex', 0.129), ('network', 0.122), ('networks', 0.098), ('subsampling', 0.092), ('degree', 0.08), ('runtime', 0.078), ('communities', 0.068), ('membership', 0.065), ('motif', 0.059), ('di', 0.058), ('triangle', 0.058), ('subsampled', 0.057), ('latent', 0.056), ('bxyz', 0.048), ('mixedmembership', 0.046), ('adjacency', 0.046), ('inference', 0.043), ('detection', 0.041), ('sampler', 0.04), ('ti', 0.039), ('separators', 0.037), ('subsample', 0.037), ('pentagon', 0.036), ('corners', 0.032), ('subgraphs', 0.032), ('jk', 0.031), ('biased', 0.031), ('social', 0.031), ('pure', 0.03), ('gibbs', 0.029), ('webpages', 0.029), ('dominant', 0.029), ('link', 0.029), ('ho', 0.029), ('blockmodel', 0.028), ('transitivity', 0.028), ('edges', 0.027), ('preserve', 0.027), ('mixed', 0.027), ('edge', 0.026), ('blocked', 0.026), ('circle', 0.025), ('synthetic', 0.025), ('memberships', 0.025), ('ijk', 0.025), ('shall', 0.024), ('graphlet', 0.024), ('junming', 0.024), ('lcc', 0.024), ('shervashidze', 0.024), ('simmel', 0.024), ('touching', 0.024), ('acm', 0.022), ('subgraph', 0.022), ('mining', 0.022), ('infeasible', 0.022), ('pittsburgh', 0.021), ('lane', 0.021), ('qirong', 0.021), ('position', 0.021), ('triplet', 0.021), ('faster', 0.02), ('grammatical', 0.02), ('circles', 0.019), ('dirichlet', 0.019), ('collapsed', 0.019), ('recovery', 0.019), ('pick', 0.018), ('triples', 0.018), ('destination', 0.018), ('models', 0.018), ('clustering', 0.018), ('tensor', 0.018), ('representation', 0.018), ('ran', 0.018), ('generative', 0.018), ('hyperparameters', 0.018), ('topic', 0.017), ('undirected', 0.017), ('si', 0.017), ('directed', 0.017), ('ci', 0.017), ('discrete', 0.017), ('superpixels', 0.017), ('admixture', 0.017), ('green', 0.017), ('color', 0.017), ('mellon', 0.016), ('carnegie', 0.016), ('vk', 0.016), ('ahmed', 0.016), 
('leskovec', 0.016)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000001 253 nips-2012-On Triangular versus Edge Representations --- Towards Scalable Modeling of Networks

Author: Qirong Ho, Junming Yin, Eric P. Xing


2 0.29723245 298 nips-2012-Scalable Inference of Overlapping Communities

Author: Prem Gopalan, Sean Gerrish, Michael Freedman, David M. Blei, David M. Mimno

Abstract: We develop a scalable algorithm for posterior inference of overlapping communities in large networks. Our algorithm is based on stochastic variational inference in the mixed-membership stochastic blockmodel (MMSB). It naturally interleaves subsampling the network with estimating its community structure. We apply our algorithm on ten large, real-world networks with up to 60,000 nodes. It converges several orders of magnitude faster than the state-of-the-art algorithm for MMSB, finds hundreds of communities in large real-world networks, and detects the true communities in 280 benchmark networks with equal or better accuracy compared to other scalable algorithms. 1

3 0.16215727 346 nips-2012-Topology Constraints in Graphical Models

Author: Marcelo Fiori, Pablo Musé, Guillermo Sapiro

Abstract: Graphical models are a very useful tool to describe and understand natural phenomena, from gene expression to climate change and social interactions. The topological structure of these graphs/networks is a fundamental part of the analysis, and in many cases the main goal of the study. However, little work has been done on incorporating prior topological knowledge onto the estimation of the underlying graphical models from sample data. In this work we propose extensions to the basic joint regression model for network estimation, which explicitly incorporate graph-topological constraints into the corresponding optimization approach. The first proposed extension includes an eigenvector centrality constraint, thereby promoting this important prior topological property. The second developed extension promotes the formation of certain motifs, triangle-shaped ones in particular, which are known to exist for example in genetic regulatory networks. The presentation of the underlying formulations, which serve as examples of the introduction of topological constraints in network estimation, is complemented with examples in diverse datasets demonstrating the importance of incorporating such critical prior knowledge. 1

4 0.086485222 260 nips-2012-Online Sum-Product Computation Over Trees

Author: Mark Herbster, Stephen Pasteris, Fabio Vitale

Abstract: We consider the problem of performing efficient sum-product computations in an online setting over a tree. A natural application of our methods is to compute the marginal distribution at a vertex in a tree-structured Markov random field. Belief propagation can be used to solve this problem, but requires time linear in the size of the tree, and is therefore too slow in an online setting where we are continuously receiving new data and computing individual marginals. With our method we aim to update the data and compute marginals in time that is no more than logarithmic in the size of the tree, and is often significantly less. We accomplish this via a hierarchical covering structure that caches previous local sum-product computations. Our contribution is three-fold: we i) give a linear time algorithm to find an optimal hierarchical cover of a tree; ii) give a sum-productlike algorithm to efficiently compute marginals with respect to this cover; and iii) apply “i” and “ii” to find an efficient algorithm with a regret bound for the online allocation problem in a multi-task setting. 1

5 0.067727968 194 nips-2012-Learning to Discover Social Circles in Ego Networks

Author: Jure Leskovec, Julian J. Mcauley

Abstract: Our personal social networks are big and cluttered, and currently there is no good way to organize them. Social networking sites allow users to manually categorize their friends into social circles (e.g. ‘circles’ on Google+, and ‘lists’ on Facebook and Twitter), however they are laborious to construct and must be updated whenever a user’s network grows. We define a novel machine learning task of identifying users’ social circles. We pose the problem as a node clustering problem on a user’s ego-network, a network of connections between her friends. We develop a model for detecting circles that combines network structure as well as user profile information. For each circle we learn its members and the circle-specific user profile similarity metric. Modeling node membership to multiple circles allows us to detect overlapping as well as hierarchically nested circles. Experiments show that our model accurately identifies circles on a diverse set of data from Facebook, Google+, and Twitter for all of which we obtain hand-labeled ground-truth. 1

6 0.05627704 68 nips-2012-Clustering Aggregation as Maximum-Weight Independent Set

7 0.05422258 137 nips-2012-From Deformations to Parts: Motion-based Segmentation of 3D Objects

8 0.049180467 126 nips-2012-FastEx: Hash Clustering with Exponential Families

9 0.048545171 186 nips-2012-Learning as MAP Inference in Discrete Graphical Models

10 0.047376022 337 nips-2012-The Lovász ϑ function, SVMs and finding large dense subgraphs

11 0.046904799 6 nips-2012-A Convex Formulation for Learning Scale-Free Networks via Submodular Relaxation

12 0.044620186 82 nips-2012-Continuous Relaxations for Discrete Hamiltonian Monte Carlo

13 0.04452559 77 nips-2012-Complex Inference in Neural Circuits with Probabilistic Population Codes and Topic Models

14 0.044442777 172 nips-2012-Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs

15 0.042455304 326 nips-2012-Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses

16 0.040326711 158 nips-2012-ImageNet Classification with Deep Convolutional Neural Networks

17 0.040207323 15 nips-2012-A Polylog Pivot Steps Simplex Algorithm for Classification

18 0.040031247 199 nips-2012-Link Prediction in Graphs with Autoregressive Features

19 0.039657705 19 nips-2012-A Spectral Algorithm for Latent Dirichlet Allocation

20 0.039429296 287 nips-2012-Random function priors for exchangeable arrays with applications to graphs and relational data


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.119), (1, 0.045), (2, -0.033), (3, -0.026), (4, -0.085), (5, -0.022), (6, -0.019), (7, -0.065), (8, -0.073), (9, 0.066), (10, 0.035), (11, 0.014), (12, 0.022), (13, -0.0), (14, 0.02), (15, 0.041), (16, -0.024), (17, -0.01), (18, -0.023), (19, 0.009), (20, -0.022), (21, -0.008), (22, 0.074), (23, -0.044), (24, -0.044), (25, -0.08), (26, -0.048), (27, 0.102), (28, -0.08), (29, -0.101), (30, -0.047), (31, 0.039), (32, -0.176), (33, 0.023), (34, 0.065), (35, -0.167), (36, 0.119), (37, 0.019), (38, -0.009), (39, -0.051), (40, -0.05), (41, -0.048), (42, 0.002), (43, -0.192), (44, -0.018), (45, 0.077), (46, 0.269), (47, -0.11), (48, -0.074), (49, -0.116)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.93780875 253 nips-2012-On Triangular versus Edge Representations --- Towards Scalable Modeling of Networks

Author: Qirong Ho, Junming Yin, Eric P. Xing

Abstract: In this paper, we argue for representing networks as a bag of triangular motifs, particularly for important network problems that current model-based approaches handle poorly due to computational bottlenecks incurred by using edge representations. Such approaches require both 1-edges and 0-edges (missing edges) to be provided as input, and as a consequence, approximate inference algorithms for these models usually require Ω(N²) time per iteration, precluding their application to larger real-world networks. In contrast, triangular modeling requires less computation, while providing equivalent or better inference quality. A triangular motif is a vertex triple containing 2 or 3 edges, and the number of such motifs is Θ(Σᵢ Dᵢ²) (where Dᵢ is the degree of vertex i), which is much smaller than N² for low-maximum-degree networks. Using this representation, we develop a novel mixed-membership network model and approximate inference algorithm suitable for large networks with low max-degree. For networks with high maximum degree, the triangular motifs can be naturally subsampled in a node-centric fashion, allowing for much faster inference at a small cost in accuracy. Empirically, we demonstrate that our approach, when compared to that of an edge-based model, has faster runtime and improved accuracy for mixed-membership community detection. We conclude with a large-scale demonstration on an N ≈ 280,000-node network, which is infeasible for network models with Ω(N²) inference cost. 1
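The triangular-motif representation described in the abstract can be illustrated directly: every vertex triple with 2 or 3 edges is a wedge centered at some vertex, so the motifs can be enumerated by pairing each vertex's neighbors, which also explains the Θ(Σᵢ Dᵢ²) count. A small sketch on a hypothetical toy graph (not code from the paper):

```python
from itertools import combinations

def triangular_motifs(adj):
    """adj: {vertex: set(neighbors)}. Yield each (triple, n_edges) once."""
    seen = set()
    for v, nbrs in adj.items():
        # Every pair of v's neighbors gives a wedge centered at v.
        for a, b in combinations(sorted(nbrs), 2):
            triple = tuple(sorted((v, a, b)))
            if triple in seen:
                continue  # a closed triangle is reachable from 3 centers
            seen.add(triple)
            yield triple, 3 if b in adj[a] else 2

# Toy undirected graph: edges 1-2, 1-3, 2-3, 2-4.
adj = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2}, 4: {2}}
motifs = dict(triangular_motifs(adj))
# (1,2,3) is a closed triangle; (1,2,4) and (2,3,4) are open wedges.
```

Vertex triples with 0 or 1 edges never appear, which is precisely how the representation avoids the Ω(N²) cost of enumerating 0-edges.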

2 0.76673257 298 nips-2012-Scalable Inference of Overlapping Communities

Author: Prem Gopalan, Sean Gerrish, Michael Freedman, David M. Blei, David M. Mimno

Abstract: We develop a scalable algorithm for posterior inference of overlapping communities in large networks. Our algorithm is based on stochastic variational inference in the mixed-membership stochastic blockmodel (MMSB). It naturally interleaves subsampling the network with estimating its community structure. We apply our algorithm on ten large, real-world networks with up to 60,000 nodes. It converges several orders of magnitude faster than the state-of-the-art algorithm for MMSB, finds hundreds of communities in large real-world networks, and detects the true communities in 280 benchmark networks with equal or better accuracy compared to other scalable algorithms. 1

3 0.63570064 346 nips-2012-Topology Constraints in Graphical Models

Author: Marcelo Fiori, Pablo Musé, Guillermo Sapiro

Abstract: Graphical models are a very useful tool to describe and understand natural phenomena, from gene expression to climate change and social interactions. The topological structure of these graphs/networks is a fundamental part of the analysis, and in many cases the main goal of the study. However, little work has been done on incorporating prior topological knowledge onto the estimation of the underlying graphical models from sample data. In this work we propose extensions to the basic joint regression model for network estimation, which explicitly incorporate graph-topological constraints into the corresponding optimization approach. The first proposed extension includes an eigenvector centrality constraint, thereby promoting this important prior topological property. The second developed extension promotes the formation of certain motifs, triangle-shaped ones in particular, which are known to exist for example in genetic regulatory networks. The presentation of the underlying formulations, which serve as examples of the introduction of topological constraints in network estimation, is complemented with examples in diverse datasets demonstrating the importance of incorporating such critical prior knowledge. 1

4 0.61631244 132 nips-2012-Fiedler Random Fields: A Large-Scale Spectral Approach to Statistical Network Modeling

Author: Antonino Freno, Mikaela Keller, Marc Tommasi

Abstract: Statistical models for networks have been typically committed to strong prior assumptions concerning the form of the modeled distributions. Moreover, the vast majority of currently available models are explicitly designed for capturing some specific graph properties (such as power-law degree distributions), which makes them unsuitable for application to domains where the behavior of the target quantities is not known a priori. The key contribution of this paper is twofold. First, we introduce the Fiedler delta statistic, based on the Laplacian spectrum of graphs, which allows us to dispense with any parametric assumption concerning the modeled network properties. Second, we use the defined statistic to develop the Fiedler random field model, which allows for efficient estimation of edge distributions over large-scale random networks. After analyzing the dependence structure involved in Fiedler random fields, we estimate them over several real-world networks, showing that they achieve a much higher modeling accuracy than other well-known statistical approaches.

5 0.5842222 194 nips-2012-Learning to Discover Social Circles in Ego Networks

Author: Jure Leskovec, Julian J. Mcauley

Abstract: Our personal social networks are big and cluttered, and currently there is no good way to organize them. Social networking sites allow users to manually categorize their friends into social circles (e.g. ‘circles’ on Google+, and ‘lists’ on Facebook and Twitter); however, they are laborious to construct and must be updated whenever a user’s network grows. We define a novel machine learning task of identifying users’ social circles. We pose the problem as a node clustering problem on a user’s ego-network, a network of connections between her friends. We develop a model for detecting circles that combines network structure as well as user profile information. For each circle we learn its members and the circle-specific user profile similarity metric. Modeling node membership to multiple circles allows us to detect overlapping as well as hierarchically nested circles. Experiments show that our model accurately identifies circles on a diverse set of data from Facebook, Google+, and Twitter, for all of which we obtain hand-labeled ground-truth. 1

6 0.5410623 345 nips-2012-Topic-Partitioned Multinetwork Embeddings

7 0.44253501 137 nips-2012-From Deformations to Parts: Motion-based Segmentation of 3D Objects

8 0.43097687 309 nips-2012-Semi-supervised Eigenvectors for Locally-biased Learning

9 0.3845199 337 nips-2012-The Lovász ϑ function, SVMs and finding large dense subgraphs

10 0.37797722 10 nips-2012-A Linear Time Active Learning Algorithm for Link Classification

11 0.36986747 157 nips-2012-Identification of Recurrent Patterns in the Activation of Brain Networks

12 0.35608673 327 nips-2012-Structured Learning of Gaussian Graphical Models

13 0.34221524 103 nips-2012-Distributed Probabilistic Learning for Camera Networks with Missing Data

14 0.3386251 302 nips-2012-Scaling MPE Inference for Constrained Continuous Markov Random Fields with Consensus Optimization

15 0.33498541 59 nips-2012-Bayesian nonparametric models for bipartite graphs

16 0.32601902 154 nips-2012-How They Vote: Issue-Adjusted Models of Legislative Behavior

17 0.32231745 165 nips-2012-Iterative ranking from pair-wise comparisons

18 0.31938204 182 nips-2012-Learning Networks of Heterogeneous Influence

19 0.30896029 287 nips-2012-Random function priors for exchangeable arrays with applications to graphs and relational data

20 0.30237702 152 nips-2012-Homeostatic plasticity in Bayesian spiking networks as Expectation Maximization with posterior constraints


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.033), (11, 0.012), (21, 0.038), (38, 0.074), (39, 0.013), (42, 0.02), (53, 0.013), (54, 0.014), (55, 0.03), (59, 0.3), (68, 0.013), (74, 0.083), (76, 0.109), (80, 0.088), (92, 0.052)]
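As with the LSI coordinates, the pairs above are sparse (topicId, topicWeight) entries from the LDA model; most of the topic ids are simply absent. A sketch of a sparse cosine similarity in which only shared topic ids contribute to the dot product; `paper_a` is truncated from the vector above, and `paper_b`'s weights are hypothetical:

```python
import math

def sparse_cosine(u, v):
    """u, v: lists of (topicId, weight) pairs; zero entries are omitted."""
    du, dv = dict(u), dict(v)
    dot = sum(w * dv[t] for t, w in du.items() if t in dv)
    nu = math.sqrt(sum(w * w for w in du.values()))
    nv = math.sqrt(sum(w * w for w in dv.values()))
    return dot / (nu * nv) if nu and nv else 0.0

paper_a = [(0, 0.033), (21, 0.038), (59, 0.3), (76, 0.109)]  # truncated from above
paper_b = [(21, 0.050), (59, 0.25), (80, 0.120)]             # hypothetical
sim = sparse_cosine(paper_a, paper_b)
```

Because only the overlapping topic ids (21 and 59 here) enter the dot product, the cost is proportional to the number of nonzero topics rather than the full topic count.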

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.73924184 253 nips-2012-On Triangular versus Edge Representations --- Towards Scalable Modeling of Networks

Author: Qirong Ho, Junming Yin, Eric P. Xing

Abstract: In this paper, we argue for representing networks as a bag of triangular motifs, particularly for important network problems that current model-based approaches handle poorly due to computational bottlenecks incurred by using edge representations. Such approaches require both 1-edges and 0-edges (missing edges) to be provided as input, and as a consequence, approximate inference algorithms for these models usually require Ω(N²) time per iteration, precluding their application to larger real-world networks. In contrast, triangular modeling requires less computation, while providing equivalent or better inference quality. A triangular motif is a vertex triple containing 2 or 3 edges, and the number of such motifs is Θ(Σᵢ Dᵢ²) (where Dᵢ is the degree of vertex i), which is much smaller than N² for low-maximum-degree networks. Using this representation, we develop a novel mixed-membership network model and approximate inference algorithm suitable for large networks with low max-degree. For networks with high maximum degree, the triangular motifs can be naturally subsampled in a node-centric fashion, allowing for much faster inference at a small cost in accuracy. Empirically, we demonstrate that our approach, when compared to that of an edge-based model, has faster runtime and improved accuracy for mixed-membership community detection. We conclude with a large-scale demonstration on an N ≈ 280,000-node network, which is infeasible for network models with Ω(N²) inference cost. 1

2 0.69843137 235 nips-2012-Natural Images, Gaussian Mixtures and Dead Leaves

Author: Daniel Zoran, Yair Weiss

Abstract: Simple Gaussian Mixture Models (GMMs) learned from pixels of natural image patches have been recently shown to be surprisingly strong performers in modeling the statistics of natural images. Here we provide an in-depth analysis of this simple yet rich model. We show that such a GMM model is able to compete with even the most successful models of natural images in log-likelihood scores, denoising performance and sample quality. We provide an analysis of what such a model learns from natural images as a function of number of mixture components including covariance structure, contrast variation and intricate structures such as textures, boundaries and more. Finally, we show that the salient properties of the GMM learned from natural images can be derived from a simplified Dead Leaves model which explicitly models occlusion, explaining its surprising success relative to other models. 1 GMMs and natural image statistics models Many models for the statistics of natural image patches have been suggested in recent years. Finding good models for natural images is important to many different research areas - computer vision, biological vision and neuroscience among others. Recently, there has been a growing interest in comparing different aspects of models for natural images such as log-likelihood and multi-information reduction performance, and much progress has been achieved [1,2, 3,4,5, 6]. Out of these results there is one which is particularly interesting: simple, unconstrained Gaussian Mixture Models (GMMs) with a relatively small number of mixture components learned from image patches are extraordinarily good in modeling image statistics [6, 4]. This is a surprising result due to the simplicity of GMMs and their ubiquity. Another surprising aspect of this result is that many of the current models may be thought of as GMMs with an exponential or infinite number of components, having different constraints on the covariance structure of the mixture components.
In this work we study the nature of GMMs learned from natural image patches. We start with a thorough comparison to some popular and cutting-edge image models. We show that indeed, GMMs are excellent performers in modeling natural image patches. We then analyze what properties of natural images these GMMs capture, their dependence on the number of components in the mixture and their relation to the structure of the world around us. Finally, we show that the learned GMM suggests a strong connection between natural image statistics and a simple variant of the dead leaves model [7, 8], explicitly modeling occlusions and explaining some of the success of GMMs in modeling natural images. 1
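The dead-leaves connection above can be illustrated with a minimal 1-D variant: layer random constant-intensity "leaves" front to back, so deeper leaves only fill pixels not yet occluded, producing the piecewise-constant, occlusion-boundary structure the abstract alludes to. All sizes and distributions here are illustrative assumptions, not the paper's model:

```python
import random

def dead_leaves_1d(n=64, n_leaves=40, max_len=12, seed=0):
    """Generate a 1-D dead-leaves signal of n pixels in [0, 1)."""
    rng = random.Random(seed)
    signal = [None] * n
    for _ in range(n_leaves):
        start = rng.randrange(n)
        length = rng.randrange(1, max_len + 1)
        value = rng.random()            # one flat intensity per leaf
        for i in range(start, min(start + length, n)):
            if signal[i] is None:       # front leaves occlude back leaves
                signal[i] = value
    return [0.0 if x is None else x for x in signal]

sig = dead_leaves_1d()
# Piecewise-constant segments with sharp occlusion boundaries, a 1-D
# caricature of the patch statistics the GMM ends up capturing.
```

Each pixel takes one of at most `n_leaves` flat values (plus the background fill), so the signal is piecewise constant by construction.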

3 0.63073891 265 nips-2012-Parametric Local Metric Learning for Nearest Neighbor Classification

Author: Jun Wang, Alexandros Kalousis, Adam Woznica

Abstract: We study the problem of learning local metrics for nearest neighbor classification. Most previous works on local metric learning learn a number of local unrelated metrics. While this “independence” approach delivers increased flexibility, its downside is the considerable risk of overfitting. We present a new parametric local metric learning method in which we learn a smooth metric matrix function over the data manifold. Using an approximation error bound of the metric matrix function, we learn local metrics as linear combinations of basis metrics defined on anchor points over different regions of the instance space. We constrain the metric matrix function by imposing on the linear combinations manifold regularization, which makes the learned metric matrix function vary smoothly along the geodesics of the data manifold. Our metric learning method has excellent performance both in terms of predictive power and scalability. We experimented with several large-scale classification problems, with tens of thousands of instances, and compared it with several state-of-the-art metric learning methods, both global and local, as well as to SVM with automatic kernel selection, all of which it outperforms in a significant manner. 1

4 0.59657991 272 nips-2012-Practical Bayesian Optimization of Machine Learning Algorithms

Author: Jasper Snoek, Hugo Larochelle, Ryan P. Adams

Abstract: The use of machine learning algorithms frequently involves careful tuning of learning parameters and model hyperparameters. Unfortunately, this tuning is often a “black art” requiring expert experience, rules of thumb, or sometimes brute-force search. There is therefore great appeal for automatic approaches that can optimize the performance of any given learning algorithm to the problem at hand. In this work, we consider this problem through the framework of Bayesian optimization, in which a learning algorithm’s generalization performance is modeled as a sample from a Gaussian process (GP). We show that certain choices for the nature of the GP, such as the type of kernel and the treatment of its hyperparameters, can play a crucial role in obtaining a good optimizer that can achieve expert-level performance. We describe new algorithms that take into account the variable cost (duration) of learning algorithm experiments and that can leverage the presence of multiple cores for parallel experimentation. We show that these proposed algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization for many algorithms including latent Dirichlet allocation, structured SVMs and convolutional neural networks. 1

5 0.59000158 294 nips-2012-Repulsive Mixtures

Author: Francesca Petralia, Vinayak Rao, David B. Dunson

Abstract: Discrete mixtures are used routinely in broad sweeping applications ranging from unsupervised settings to fully supervised multi-task learning. Indeed, finite mixtures and infinite mixtures, relying on Dirichlet processes and modifications, have become a standard tool. One important issue that arises in using discrete mixtures is low separation in the components; in particular, different components can be introduced that are very similar and hence redundant. Such redundancy leads to too many clusters that are too similar, degrading performance in unsupervised learning and leading to computational problems and an unnecessarily complex model in supervised settings. Redundancy can arise in the absence of a penalty on components placed close together even when a Bayesian approach is used to learn the number of components. To solve this problem, we propose a novel prior that generates components from a repulsive process, automatically penalizing redundant components. We characterize this repulsive prior theoretically and propose a Markov chain Monte Carlo sampling algorithm for posterior computation. The methods are illustrated using synthetic examples and an iris data set. Key Words: Bayesian nonparametrics; Dirichlet process; Gaussian mixture model; Model-based clustering; Repulsive point process; Well separated mixture. 1

6 0.58175802 338 nips-2012-The Perturbed Variation

7 0.56182027 298 nips-2012-Scalable Inference of Overlapping Communities

8 0.52716452 274 nips-2012-Priors for Diversity in Generative Latent Variable Models

9 0.52660578 168 nips-2012-Kernel Latent SVM for Visual Recognition

10 0.51944631 210 nips-2012-Memorability of Image Regions

11 0.51908767 197 nips-2012-Learning with Recursive Perceptual Representations

12 0.51898718 229 nips-2012-Multimodal Learning with Deep Boltzmann Machines

13 0.5177896 193 nips-2012-Learning to Align from Scratch

14 0.51469916 8 nips-2012-A Generative Model for Parts-based Object Segmentation

15 0.5136835 101 nips-2012-Discriminatively Trained Sparse Code Gradients for Contour Detection

16 0.51354063 260 nips-2012-Online Sum-Product Computation Over Trees

17 0.5133177 176 nips-2012-Learning Image Descriptors with the Boosting-Trick

18 0.5123049 3 nips-2012-A Bayesian Approach for Policy Learning from Trajectory Preference Queries

19 0.51150757 92 nips-2012-Deep Representations and Codes for Image Auto-Annotation

20 0.51051891 201 nips-2012-Localizing 3D cuboids in single-view images