nips nips2012 nips2012-59 knowledge-graph by maker-knowledge-mining

59 nips-2012-Bayesian nonparametric models for bipartite graphs


Source: pdf

Author: Francois Caron

Abstract: We develop a novel Bayesian nonparametric model for random bipartite graphs. The model is based on the theory of completely random measures and is able to handle a potentially infinite number of nodes. We show that the model has appealing properties and in particular it may exhibit a power-law behavior. We derive a posterior characterization, a generative process for network growth, and a simple Gibbs sampler for posterior simulation. Our model is shown to be well fitted to several real-world social networks. 1

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Bayesian nonparametric models for bipartite graphs. François Caron, INRIA, IMB - University of Bordeaux, Talence, France. [sent-1, score-0.203]

2 Abstract We develop a novel Bayesian nonparametric model for random bipartite graphs. [sent-3, score-0.179]

3 We derive a posterior characterization, a generative process for network growth, and a simple Gibbs sampler for posterior simulation. [sent-6, score-0.221]

4 In this article, we shall focus on bipartite networks, also known as two-mode, affiliation, or collaboration networks [16, 17]. [sent-10, score-0.201]

5 In bipartite networks, items are divided into two different types A and B, and only connections between items of different types are allowed. [sent-11, score-0.219]

6 Following the readers-books example, we will refer to items of type A as readers and items of type B as books. [sent-13, score-0.336]

7 An example of a bipartite graph is shown in Figure 1(b). [sent-14, score-0.165]

8 An important summarizing quantity of a bipartite graph is the degree distribution of readers (resp. books). [sent-15, score-0.619]

9 The degree of a vertex in a network is the number of edges connected to that vertex. [sent-17, score-0.202]

10 A bipartite graph can be represented by a set of binary variables (zij ) where zij = 1 if reader i has read book j, 0 otherwise. [sent-19, score-0.653]

11 In many situations, the number of available books may be very large and potentially unknown. [sent-20, score-0.392]

12 In this case, a Bayesian nonparametric (BNP) approach can be sensible, by assuming that the pool of books is infinite. [sent-21, score-0.426]

13 To formalize this framework, it will then be convenient to represent the bipartite graph by a collection of atomic measures Zi , i = 1, . [sent-22, score-0.196]

14 , n with Z_i = Σ_{j=1}^∞ z_ij δ_{θ_j} (1), where {θ_j} is the set of books and typically Z_i only has a finite set of non-zero z_ij, corresponding to the books reader i has read. [sent-25, score-1.226]
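
The atomic-measure representation of Eq. (1) is also convenient computationally. The sketch below is not from the paper and uses illustrative names; it simply stores each Z_i through its finitely many non-zero atoms, i.e. the set of books reader i has read.

```python
# Minimal sketch of the atomic-measure representation in Eq. (1): each Z_i is
# encoded by the support of its non-zero atoms, i.e. the books reader i read.
from collections import defaultdict

def to_atomic_measures(edges):
    """edges: iterable of (reader i, book j) pairs with z_ij = 1."""
    Z = defaultdict(set)          # Z[i] = {j : z_ij = 1}, the support of Z_i
    for i, j in edges:
        Z[i].add(j)
    return Z

# Toy example: reader 0 read books 2 and 5, reader 1 read book 2.
Z = to_atomic_measures([(0, 2), (0, 5), (1, 2)])
```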

15 The so-called Indian Buffet Process (IBP) is a simple generative process for the conditional distribution of Z_i given Z_1, . . . , Z_{i-1}. [sent-27, score-0.155]

16 Such a process can be constructed by considering that the binary measures Z_i are i.i.d. [sent-31, score-0.093]

17 from some random measure drawn from a beta process [19, 10]. [sent-34, score-0.132]

18 Teh and Görür [18] proposed a three-parameter extension of the IBP, named the stable IBP, which enables modeling a power-law behavior for the degree distribution of books. [sent-36, score-0.274]

19 Although more flexible, the stable IBP still induces a Poissonian distribution for the degree of readers. [sent-37, score-0.25]

20 In this paper, we propose a novel Bayesian nonparametric model for bipartite graphs that addresses some of the limitations of the stable IBP, while retaining computational tractability. [sent-38, score-0.261]

21 We assume that each book j is assigned a positive popularity parameter wj > 0. [sent-39, score-0.422]

22 Similarly, each reader i is assigned a positive parameter γi which represents its ability to read books. [sent-41, score-0.217]

23 The higher γ_i, the more books reader i is willing to read. [sent-42, score-0.516]

24 Given the weights wj and γi , reader i reads book j with probability 1 − exp(−γi wj ). [sent-43, score-0.796]
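
As a quick illustration of this link function, the following sketch (not from the paper; parameter values are made up) samples a finite truncation of the graph given weights w_j and reader parameters γ_i.

```python
import numpy as np

def sample_bipartite_graph(w, gamma, rng=None):
    """Sample z_ij ~ Bernoulli(1 - exp(-gamma_i * w_j)) for a finite truncation.

    w:     length-K array of book popularity parameters w_j > 0
    gamma: length-n array of reader parameters gamma_i > 0
    """
    rng = np.random.default_rng() if rng is None else rng
    p = 1.0 - np.exp(-np.outer(gamma, w))      # n x K matrix of link probabilities
    return (rng.random(p.shape) < p).astype(int)

# Toy usage with hypothetical values.
Z = sample_bipartite_graph(w=np.array([0.5, 2.0, 0.1]), gamma=np.array([1.0, 3.0]))
```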

25 We will consider that the weights w_j and/or γ_i are the points of a Poisson process with a given Lévy measure. [sent-44, score-0.444]

26 We show that depending on the choice of the Lévy measure, a power-law behavior can be obtained for the degree distribution of books and/or readers. [sent-45, score-0.710]

27 Moreover, using a set of suitably chosen latent variables, we can derive a generative process for network growth, and an efficient Gibbs sampler for approximate inference. [sent-46, score-0.238]

28 We provide illustrations of the fit of the proposed model on several real-world bipartite social networks. [sent-47, score-0.206]

29 1 Statistical Model Completely Random Measures We first provide a brief overview of completely random measures (CRM) [12, 13] before describing the BNP model for bipartite graphs in Section 2. [sent-50, score-0.221]

30 A CRM can be decomposed into a sum of three independent parts: a non-random measure, a countable collection of atoms with fixed locations, and a countable collection of atoms with random masses at random locations. [sent-60, score-0.183]

31 The mean measure Λ of this Poisson process is known as the Lévy measure. [sent-68, score-0.185]

32 We will assume in the following that the Lévy measure decomposes as a product of two non-atomic densities, i.e. [sent-69, score-0.123]

33 The locations are i.i.d. from h, while the masses are distributed according to a Poisson process over R+ with mean intensity λ. [sent-75, score-0.173]

34 We will further assume that the total mass G(Θ) = Σ_{j=1}^∞ w_j is positive and finite with probability one, which is guaranteed if the following conditions are satisfied: ∫_0^∞ λ(w) dw = ∞ and ∫_0^∞ (1 − exp(−w)) λ(w) dw < ∞ (2), and note g(x) its probability density function evaluated at x. [sent-76, score-0.280]

35 We will refer to λ as the Lévy intensity in the following, and to h as the base density of G, and write G ∼ CRM(λ, h). [sent-77, score-0.144]

36 The generalized gamma process (GGP) includes the gamma process (σ = 0), the inverse Gaussian process (σ = 0.5) and the stable process (τ = 0) as special cases. [sent-79, score-0.120]
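
For concreteness, a common parameterization of the GGP intensity is λ(w) = α/Γ(1−σ) w^{−1−σ} e^{−τw}; the extract does not spell this out, so the sketch below should be read as an assumption. It checks the conditions in (2) numerically for one hypothetical setting of (α, σ, τ).

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma as gamma_fn

def ggp_levy_intensity(w, alpha, sigma, tau):
    """Assumed GGP intensity: lambda(w) = alpha/Gamma(1-sigma) * w^(-1-sigma) * exp(-tau*w)."""
    return alpha / gamma_fn(1.0 - sigma) * w ** (-1.0 - sigma) * np.exp(-tau * w)

# Numerical check of the second condition in (2): int_0^inf (1 - e^{-w}) lambda(w) dw < inf.
alpha, sigma, tau = 2.0, 0.3, 1.0
val, _ = quad(lambda w: (1.0 - np.exp(-w)) * ggp_levy_intensity(w, alpha, sigma, tau), 0.0, np.inf)
print(val)  # finite, whereas int_0^inf lambda(w) dw itself diverges near w = 0
```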

37 2 A Bayesian nonparametric model for bipartite graphs Let G ∼ CRM(λ, h) where λ satisfies conditions (2). [sent-84, score-0.203]

38 A draw G takes the form G = Σ_{j=1}^∞ w_j δ_{θ_j} (6), where {θ_j} is the set of books and {w_j} the set of popularity parameters of books. [sent-85, score-0.702]

39 , n, consider the latent exponential process V_i = Σ_{j=1}^∞ v_ij δ_{θ_j} (7), defined for j = 1, . [sent-89, score-0.195]

40 , ∞ by vij |wj ∼ Exp(wj γi ) where Exp(a) denotes the exponential distribution of rate a. [sent-92, score-0.103]

41 We then define the binary process Z_i conditionally on V_i by Z_i = Σ_{j=1}^∞ z_ij δ_{θ_j}, with z_ij = 1 if v_ij < 1 and z_ij = 0 otherwise (8). By integrating out the latent variables v_ij we clearly have p(z_ij = 1 | w_j, γ_i) = 1 − exp(−γ_i w_j). [sent-94, score-1.059]

42 Hence, the total mass Z_i(Θ) = Σ_{j=1}^∞ z_ij, which corresponds to the total number of books read by reader i, is finite with probability one and admits a Poisson(ψ_λ(γ_i)) distribution, where ψ_λ(z) is defined in Equation (3), while the locations θ*_j are i.i.d. [sent-99, score-0.792]

43 As an example, for the gamma process we have Z_i(Θ) ∼ Poisson(α log(1 + γ_i/τ)). [sent-104, score-0.212]
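
A small worked example of this Poisson mean, with hypothetical parameter values:

```python
import numpy as np

# For the gamma process, E[Z_i(Theta)] = psi_lambda(gamma_i) = alpha * log(1 + gamma_i / tau).
alpha, tau, gamma_i = 5.0, 1.0, 2.0          # made-up values for illustration
expected_books = alpha * np.log(1.0 + gamma_i / tau)
print(expected_books)                        # ~5.49 books read on average by reader i
```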

44 It will be useful in the following to introduce a censored version of the latent process V_i, defined by U_i = Σ_{j=1}^∞ u_ij δ_{θ_j} (9), where u_ij = min(v_ij, 1), for i = 1, . [sent-105, score-0.705]

45 5 a Gibbs sampler for our model, that both rely on the introduction of these latent variables. [sent-131, score-0.094]

46 We write K_i = Z_i(Θ) = Σ_{j=1}^∞ z_ij for the degree of reader i (number of books read by reader i) and m_j = Σ_{i=1}^n Z_i({θ_j}) = Σ_{i=1}^n z_ij for the degree of book j (number of people having read book j). [sent-136, score-1.849]
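
When the graph is held as an n x K binary matrix, these degrees are simply row and column sums; a short sketch with illustrative names:

```python
import numpy as np

def degrees(Z):
    """Z: n x K binary matrix with Z[i, j] = z_ij.
    Returns reader degrees K_i (books read by reader i) and book degrees m_j."""
    K_i = Z.sum(axis=1)   # row sums
    m_j = Z.sum(axis=0)   # column sums
    return K_i, m_j

K_i, m_j = degrees(np.array([[1, 0, 1],
                             [1, 1, 0]]))
# K_i = [2, 2], m_j = [2, 1, 1]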

47 P(Z_1, . . . , Z_n, U_1, . . . , U_n | G) = Π_{i=1}^n Π_{j=1}^K (γ_i w_j)^{z_ij} exp(−γ_i w_j u_ij) · Π_{i=1}^n exp(−γ_i G(Θ\{θ_1, . . . , θ_K})) [sent-143, score-0.955]

48 = [Π_{i=1}^n γ_i^{K_i}] [Π_{j=1}^K w_j^{m_j} exp(−w_j Σ_{i=1}^n γ_i (u_ij − 1))] exp(−G(Θ) Σ_{i=1}^n γ_i) (10). In the case where γ_i = γ, it is possible to derive P(Z_1, . [sent-146, score-0.955]

49 , U_n) = exp(−ψ_λ(Σ_{i=1}^n γ_i)) Π_{j=1}^K h(θ_j) κ(m_j, Σ_{i=1}^n γ_i u_ij) (11), where ψ_λ and κ are respectively the Laplace exponent ψ_λ(t) = ∫_0^∞ (1 − e^{−tw}) λ(w) dw and κ(m, u) = ∫_0^∞ w^m e^{−uw} λ(w) dw. [sent-160, score-0.467]
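
Assuming the definitions above and the standard GGP intensity λ(w) = α/Γ(1−σ) w^{−1−σ} e^{−τw} (an assumption, since the extract does not give it explicitly), both ψ_λ and κ have closed forms; a sketch:

```python
import numpy as np
from scipy.special import gammaln

def ggp_psi(t, alpha, sigma, tau):
    """psi_lambda(t) = int_0^inf (1 - e^{-tw}) lambda(w) dw for the assumed GGP intensity;
    reduces to alpha * log(1 + t/tau) in the gamma-process limit sigma -> 0."""
    if sigma == 0.0:
        return alpha * np.log1p(t / tau)
    return alpha / sigma * ((t + tau) ** sigma - tau ** sigma)

def ggp_log_kappa(m, u, alpha, sigma, tau):
    """log kappa(m, u) = log int_0^inf w^m e^{-uw} lambda(w) dw for the assumed GGP intensity:
    log(alpha) + lgamma(m - sigma) - lgamma(1 - sigma) - (m - sigma) * log(u + tau)."""
    return (np.log(alpha) + gammaln(m - sigma) - gammaln(1.0 - sigma)
            - (m - sigma) * np.log(u + tau))
```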

50 Proposition 3 The conditional distribution of G given the latent processes U1 , . [sent-165, score-0.123]

51 Un can be expressed as G = G* + Σ_{j=1}^K w_j δ_{θ_j} (12), where G* and (w_j) are mutually independent, with G* ∼ CRM(λ*, h), λ*(w) = λ(w) exp(−w Σ_{i=1}^n γ_i) (13), and the masses have density P(w_j | rest) = λ(w_j) w_j^{m_j} exp(−w_j Σ_{i=1}^n γ_i u_ij) / κ(m_j, Σ_{i=1}^n γ_i u_ij) (14). Proof. [sent-168, score-0.744]

52 In the case of the GGP, G* is still a GGP with parameters (α* = α, σ* = σ, τ* = τ + Σ_{i=1}^n γ_i), while the w_j's are conditionally gamma distributed, i.e. [sent-170, score-0.457]

53 w_j | rest ∼ Gamma(m_j − σ, τ + Σ_{i=1}^n γ_i u_ij). Corollary 4 The predictive distribution of Z_{n+1} given the latent processes U_1, . [sent-172, score-0.799]

54 , U_n is given by Z_{n+1} = Z*_{n+1} + Σ_{j=1}^K z_{n+1,j} δ_{θ_j}, where the z_{n+1,j} are independent of Z*_{n+1} with z_{n+1,j} ∼ Ber(1 − κ(m_j, τ + γ_{n+1} + Σ_{i=1}^n γ_i u_ij) / κ(m_j, τ + Σ_{i=1}^n γ_i u_ij)), where Ber is the Bernoulli distribution and Z*_{n+1} is a homogeneous Poisson process over Θ with intensity measure ψ_{λ*}(γ_{n+1}) h(θ). [sent-175, score-0.756]

55 Finally, we consider the distribution of u_{n+1,j} | z_{n+1,j} = 1, u_{1:n,j}. [sent-177, score-0.318]

56 This is given by p(u_{n+1,j} | z_{n+1,j} = 1, u_{1:n,j}) ∝ κ(m_j + 1, u_{n+1,j} γ_{n+1} + Σ_{i=1}^n γ_i u_ij) 1_{u_{n+1,j} ∈ [0,1]} (15). In the supplementary material, we show how to sample from this distribution by the inverse cdf method for the GGP. [sent-178, score-0.347]

57 Figure 1 (panels (a) and (b)): Illustration of the generative process described in Section 2. [sent-188, score-0.111]

58 4 A generative process In this section we describe the generative process for Zi given (U1 , . [sent-191, score-0.222]

59 This reinforcement process, where popular books will be more likely to be picked, is reminiscent of the generative process for the beta-Bernoulli process, popularized under the name of the Indian buffet process [8]. [sent-195, score-0.632]

60 Let xij = − log(uij ) ≥ 0 be latent positive scores. [sent-196, score-0.08]

61 Consider a set of n readers who successively enter a library with an infinite number of books. [sent-197, score-0.284]

62 The first reader picks a number K_1 ∼ Poisson(ψ_λ(γ_1)) of books. [sent-202, score-0.124]

63 Now consider that reader i enters the library, and knows about the books read by previous readers and their scores. [sent-204, score-0.871]

64 Let K be the total number of books chosen by the previous i − 1 readers, and mj the number of times each of the K books has been read. [sent-205, score-0.906]

65 This generative process is illustrated in Figure 1, together with the underlying bipartite graph. [sent-211, score-0.276]

66 Figure 2 shows draws from this generative process with a GGP with parameters γ_i = 2 for all i, τ = 1, and different values of α and σ. [sent-212, score-0.111]

67 5 Gibbs sampling From the results derived in Proposition 3, a Gibbs sampler can be easily derived to approximate the posterior distribution P (G, U |Z). [sent-214, score-0.082]

68 For i = 1, . . . , n and j = 1, . . . , K, set u_ij = 1 if z_ij = 0; otherwise sample u_ij | z_ij, w_j, γ_i ∼ rExp(γ_i w_j, 1), where rExp(λ, a) is the right-truncated exponential distribution with pdf λ exp(−λx)/(1 − exp(−λa)) 1_{x∈[0,a]}, from which we can sample exactly. [sent-223, score-1.332]
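
The exact draw from this truncated exponential follows directly from its inverse cdf; a short sketch (not from the paper):

```python
import numpy as np

def sample_rexp(lam, a, size=None, rng=None):
    """Exact inverse-cdf sample from rExp(lam, a), the exponential of rate lam
    right-truncated to [0, a]: F^{-1}(u) = -log(1 - u*(1 - exp(-lam*a))) / lam."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.random(size)
    return -np.log1p(-u * (1.0 - np.exp(-lam * a))) / lam

# Gibbs step for u_ij when z_ij = 1: u_ij ~ rExp(gamma_i * w_j, 1).
# u_ij = sample_rexp(gamma_i * w_j, 1.0)
```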

69 For j = 1, . . . , K, sample w_j | U, γ ∼ Gamma(m_j − σ, τ + Σ_{i=1}^n γ_i u_ij), and the total mass G*(Θ) follows a distribution g*(w) ∝ g(w) exp(−w Σ_{i=1}^n γ_i), where g(w) is the distribution of G(Θ). [sent-227, score-0.793]
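
Putting the weight update together, a single conditional sweep over the instantiated books might look as follows; this is a sketch under the same assumed GGP parameterization, with illustrative variable names.

```python
import numpy as np

def gibbs_update_weights(Z, U, gamma, sigma, tau, rng=None):
    """Sample w_j | U, gamma ~ Gamma(m_j - sigma, tau + sum_i gamma_i u_ij) for the
    K books that have been read at least once (so m_j - sigma > 0).

    Z: n x K binary matrix; U: n x K censored latents with u_ij = 1 whenever z_ij = 0.
    """
    rng = np.random.default_rng() if rng is None else rng
    m = Z.sum(axis=0)                              # book degrees m_j
    rate = tau + (gamma[:, None] * U).sum(axis=0)  # tau + sum_i gamma_i u_ij
    return rng.gamma(shape=m - sigma, scale=1.0 / rate)
```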

70 In the case of the GGP, g ∗ (w) is an exponentially tilted stable distribution for which exact samplers exist [4]. [sent-228, score-0.113]

71 In the particular case of the gamma process, we have the simple update G*(Θ) ∼ Gamma(α, τ + Σ_{i=1}^n γ_i). [sent-229, score-0.15]

72 Figure 2: Realisations from the generative process of Section 2. [sent-233, score-0.111]

73 We may also go a step further and consider that there is an infinite number of readers with weights γi associated to a given CRM Γ ∼ CRM(λγ , hγ ) and a measurable space of readers Θ. [sent-237, score-0.547]

74 This provides a lot of flexibility in modelling the distribution of the degree of readers, allowing in particular a power-law behavior to be obtained, as shown in Section 5. [sent-239, score-0.192]

75 We focus here on the case where Γ is drawn from a generalized gamma process of parameters (α_γ, σ_γ, τ_γ) for simplicity. [sent-240, score-0.212]

76 , n,   K γi |G, U ∼ Gamma  K wj uij + G∗ (Θ) zij − σγ , τ + j=1 j=1 K j=1 and Γ∗ ∼ CRM(λ∗ , hγ ) with λ∗ (γ) = λγ (γ) exp −γ γ γ wj + G∗ (Θ) . [sent-244, score-1.064]

77 For j = 1, . . . , K, w_j | U, Γ ∼ Gamma(m_j − σ, τ + Σ_{i=1}^n γ_i u_ij + Γ*(Θ)), and G* ∼ CRM(λ*, h) with λ*(w) = λ(w) exp(−w (Σ_{i=1}^n γ_i + Γ*(Θ))). [sent-248, score-0.747]

78 For the scale parameter α of the GGP, we can assign a gamma prior α ∼ Gamma(a_α, b_α) and update it with α | γ ∼ Gamma(a_α + K, b_α + ψ_λ(Σ_{i=1}^n γ_i + Γ*(Θ))) using a Metropolis-Hastings step. [sent-250, score-0.15]

79 The total number of books read by n readers is O(nσ ). [sent-254, score-0.747]

80 Moreover, for σ > 0, the degree distribution follows a power law: asymptotically, the proportion of books read by m readers is O(m^{−1−σ}) (details in the supplementary material). [sent-255, score-0.968]
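
A quick empirical check of this power-law claim on a simulated or real network only needs the book degrees m_j; a small sketch:

```python
import numpy as np

def degree_proportions(m_j):
    """Empirical proportion of books read by exactly m readers.  Under the model with
    sigma > 0 this should decay roughly like m^(-1-sigma) for large m, i.e. a straight
    line on a log-log plot of proportion versus degree."""
    m_j = np.asarray(m_j)
    counts = np.bincount(m_j[m_j > 0])
    return counts / counts.sum()
```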

81 However, in our case, a similar behavior can be obtained for the degree distribution of readers when assigning a GGP to it, while it will always be Poisson for the stable IBP. [sent-257, score-0.536]

82 The stable beta process [18] is a particular case of our construction, obtained by setting weights γ_i = γ and Lévy measure λ(w) = (α Γ(1 + c) / (Γ(1 − σ) Γ(c + σ))) γ (1 − e^{−γw})^{−σ−1} e^{−γw(c+σ)} (16). The proof is obtained by a change of variable from the Lévy measure of the stable beta process. [sent-259, score-0.522]

83 In a more general setting, we may want γi to depend on a set of metadata associated to reader i. [sent-265, score-0.124]

84 Inference for latent factor models is described in supplementary material. [sent-266, score-0.082]

85 5 Illustrations on real-world social networks We now consider estimating the parameters of our model and evaluating its predictive performance on six bipartite social networks of various sizes. [sent-267, score-0.271]

86 The dataset ‘Boards’ contains information about members of the boards of Norwegian companies sitting on the same board in August 2011. [sent-269, score-0.188]

87 ‘Forum’ is a forum network about web users contributing to the same forums. [sent-270, score-0.127]

88 ‘Books’ concerns data collected from the Book-Crossing community about users providing ratings on books, where we extracted the bipartite network from the ratings. [sent-271, score-0.208]

89 ‘Movielens100k’ contains information about users rating particular movies, from which we extracted the bipartite network. [sent-273, score-0.175]

90 Data for the forum and citation datasets can be downloaded from http://toreopsahl. [sent-312, score-0.099]

91 Each dataset is split into a training set containing 3/4 of the readers and a test set with the remaining readers. [sent-326, score-0.262]

92 Poissonian degree distribution for readers and power-law degree distribution for books. [sent-334, score-0.646]

93 Both methods perform better only on the ‘Boards’ dataset, where the Poisson assumption on the number of people sitting on the same board makes sense. [sent-335, score-0.127]

94 Figures 3 and 4 give the empirical degree distributions of the test network and a draw from the estimated models, for the IMDB dataset and the Books dataset. [sent-339, score-0.224]

95 It is clearly seen that the four models are able to capture the power-law behavior of the degree distribution of actors (Figure 3(e-h)) or books (Figure 4(e-h)). [sent-340, score-0.664]

96 However, only IG and GGP are able to capture the power-law behavior of the degree distribution of movies (Figure 3(a-d)) or readers (Figure 4(a-d)). [sent-341, score-0.499]

97 Random variate generation for exponentially and polynomially tilted stable distributions. [sent-360, score-0.09]

98 Infinite latent feature models and the Indian buffet process. [sent-393, score-0.12]

99 Nonparametric bayes estimators based on beta processes in models for life history data. [sent-403, score-0.075]

100 Random graphs with arbitrary degree distributions and their applications. [sent-438, score-0.193]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('ggp', 0.422), ('books', 0.392), ('uij', 0.295), ('wj', 0.28), ('readers', 0.262), ('crm', 0.179), ('ibp', 0.172), ('degree', 0.169), ('zij', 0.159), ('gamma', 0.15), ('bipartite', 0.145), ('ig', 0.135), ('zn', 0.132), ('reader', 0.124), ('mj', 0.122), ('un', 0.112), ('book', 0.112), ('poisson', 0.11), ('vy', 0.102), ('read', 0.093), ('zi', 0.088), ('vij', 0.08), ('masses', 0.069), ('buffet', 0.067), ('dw', 0.067), ('imdb', 0.065), ('board', 0.064), ('forum', 0.064), ('process', 0.062), ('indian', 0.061), ('boards', 0.06), ('stable', 0.058), ('actors', 0.056), ('sg', 0.056), ('latent', 0.053), ('exp', 0.05), ('generative', 0.049), ('beta', 0.049), ('bnp', 0.049), ('gs', 0.044), ('sitting', 0.042), ('intensity', 0.042), ('sampler', 0.041), ('items', 0.037), ('poissonian', 0.037), ('rexp', 0.037), ('ukj', 0.037), ('downloaded', 0.035), ('nonparametric', 0.034), ('gibbs', 0.034), ('social', 0.034), ('ber', 0.034), ('ths', 0.033), ('network', 0.033), ('crms', 0.032), ('palm', 0.032), ('tilted', 0.032), ('measures', 0.031), ('popularity', 0.03), ('grif', 0.03), ('citations', 0.03), ('users', 0.03), ('supplementary', 0.029), ('networks', 0.029), ('atoms', 0.029), ('countable', 0.028), ('caron', 0.028), ('tw', 0.028), ('ui', 0.027), ('conditionally', 0.027), ('illustrations', 0.027), ('collaboration', 0.027), ('xij', 0.027), ('material', 0.026), ('processes', 0.026), ('ou', 0.025), ('graphs', 0.024), ('behavior', 0.024), ('ki', 0.024), ('locations', 0.024), ('distribution', 0.023), ('measurable', 0.023), ('formula', 0.023), ('newman', 0.022), ('sensible', 0.022), ('reading', 0.022), ('dataset', 0.022), ('successively', 0.022), ('people', 0.021), ('movies', 0.021), ('growth', 0.021), ('measure', 0.021), ('completely', 0.021), ('conditional', 0.021), ('graph', 0.02), ('proposition', 0.019), ('bayesian', 0.019), ('posterior', 0.018), ('movie', 0.018), ('homogeneous', 0.018)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000001 59 nips-2012-Bayesian nonparametric models for bipartite graphs

Author: Francois Caron

Abstract: We develop a novel Bayesian nonparametric model for random bipartite graphs. The model is based on the theory of completely random measures and is able to handle a potentially infinite number of nodes. We show that the model has appealing properties and in particular it may exhibit a power-law behavior. We derive a posterior characterization, a generative process for network growth, and a simple Gibbs sampler for posterior simulation. Our model is shown to be well fitted to several real-world social networks. 1

2 0.16452946 60 nips-2012-Bayesian nonparametric models for ranked data

Author: Francois Caron, Yee W. Teh

Abstract: We develop a Bayesian nonparametric extension of the popular Plackett-Luce choice model that can handle an infinite number of choice items. Our framework is based on the theory of random atomic measures, with the prior specified by a gamma process. We derive a posterior characterization and a simple and effective Gibbs sampler for posterior simulation. We develop a time-varying extension of our model, and apply it to the New York Times lists of weekly bestselling books. 1

3 0.1252206 315 nips-2012-Slice sampling normalized kernel-weighted completely random measure mixture models

Author: Nicholas Foti, Sinead Williamson

Abstract: A number of dependent nonparametric processes have been proposed to model non-stationary data with unknown latent dimensionality. However, the inference algorithms are often slow and unwieldy, and are in general highly specific to a given model formulation. In this paper, we describe a large class of dependent nonparametric processes, including several existing models, and present a slice sampler that allows efficient inference across this class of models. 1

4 0.11650282 287 nips-2012-Random function priors for exchangeable arrays with applications to graphs and relational data

Author: James Lloyd, Peter Orbanz, Zoubin Ghahramani, Daniel M. Roy

Abstract: A fundamental problem in the analysis of structured relational data like graphs, networks, databases, and matrices is to extract a summary of the common structure underlying relations between individual entities. Relational data are typically encoded in the form of arrays; invariance to the ordering of rows and columns corresponds to exchangeable arrays. Results in probability theory due to Aldous, Hoover and Kallenberg show that exchangeable arrays can be represented in terms of a random measurable function which constitutes the natural model parameter in a Bayesian model. We obtain a flexible yet simple Bayesian nonparametric model by placing a Gaussian process prior on the parameter function. Efficient inference utilises elliptical slice sampling combined with a random sparse approximation to the Gaussian process. We demonstrate applications of the model to network data and clarify its relation to models in the literature, several of which emerge as special cases. 1

5 0.098225035 244 nips-2012-Nonconvex Penalization Using Laplace Exponents and Concave Conjugates

Author: Zhihua Zhang, Bojun Tu

Abstract: In this paper we study sparsity-inducing nonconvex penalty functions using Lévy processes. We define such a penalty as the Laplace exponent of a subordinator. Accordingly, we propose a novel approach for the construction of sparsity-inducing nonconvex penalties. Particularly, we show that the nonconvex logarithmic (LOG) and exponential (EXP) penalty functions are the Laplace exponents of Gamma and compound Poisson subordinators, respectively. Additionally, we explore the concave conjugate of nonconvex penalties. We find that the LOG and EXP penalties are the concave conjugates of negative Kullback-Leibler (KL) distance functions. Furthermore, the relationship between these two penalties is due to the asymmetry of the KL distance. 1

6 0.089341417 73 nips-2012-Coding efficiency and detectability of rate fluctuations with non-Poisson neuronal firing

7 0.084728517 165 nips-2012-Iterative ranking from pair-wise comparisons

8 0.08426322 47 nips-2012-Augment-and-Conquer Negative Binomial Processes

9 0.080900289 355 nips-2012-Truncation-free Online Variational Inference for Bayesian Nonparametric Models

10 0.076891527 75 nips-2012-Collaborative Ranking With 17 Parameters

11 0.075874202 52 nips-2012-Bayesian Nonparametric Modeling of Suicide Attempts

12 0.066124268 111 nips-2012-Efficient Sampling for Bipartite Matching Problems

13 0.066002019 246 nips-2012-Nonparametric Max-Margin Matrix Factorization for Collaborative Prediction

14 0.061721724 138 nips-2012-Fully Bayesian inference for neural models with negative-binomial spiking

15 0.058429606 342 nips-2012-The variational hierarchical EM algorithm for clustering hidden Markov models

16 0.050765183 6 nips-2012-A Convex Formulation for Learning Scale-Free Networks via Submodular Relaxation

17 0.048826184 316 nips-2012-Small-Variance Asymptotics for Exponential Family Dirichlet Process Mixture Models

18 0.048212569 274 nips-2012-Priors for Diversity in Generative Latent Variable Models

19 0.048011806 131 nips-2012-Feature Clustering for Accelerating Parallel Coordinate Descent

20 0.046081383 166 nips-2012-Joint Modeling of a Matrix with Associated Text via Latent Binary Features


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.12), (1, 0.046), (2, 0.008), (3, 0.005), (4, -0.139), (5, -0.035), (6, 0.003), (7, 0.038), (8, 0.062), (9, 0.074), (10, -0.039), (11, 0.054), (12, -0.048), (13, -0.047), (14, 0.056), (15, -0.032), (16, 0.009), (17, -0.067), (18, 0.004), (19, -0.002), (20, 0.051), (21, -0.045), (22, 0.06), (23, -0.053), (24, -0.035), (25, 0.036), (26, 0.006), (27, -0.052), (28, -0.092), (29, -0.017), (30, -0.116), (31, 0.09), (32, 0.002), (33, 0.046), (34, 0.058), (35, 0.031), (36, 0.09), (37, 0.005), (38, 0.015), (39, -0.013), (40, -0.212), (41, 0.033), (42, 0.007), (43, -0.077), (44, -0.069), (45, 0.007), (46, -0.047), (47, -0.081), (48, 0.027), (49, 0.042)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.94104421 59 nips-2012-Bayesian nonparametric models for bipartite graphs

Author: Francois Caron

Abstract: We develop a novel Bayesian nonparametric model for random bipartite graphs. The model is based on the theory of completely random measures and is able to handle a potentially infinite number of nodes. We show that the model has appealing properties and in particular it may exhibit a power-law behavior. We derive a posterior characterization, a generative process for network growth, and a simple Gibbs sampler for posterior simulation. Our model is shown to be well fitted to several real-world social networks. 1

2 0.69622737 60 nips-2012-Bayesian nonparametric models for ranked data

Author: Francois Caron, Yee W. Teh

Abstract: We develop a Bayesian nonparametric extension of the popular Plackett-Luce choice model that can handle an infinite number of choice items. Our framework is based on the theory of random atomic measures, with the prior specified by a gamma process. We derive a posterior characterization and a simple and effective Gibbs sampler for posterior simulation. We develop a time-varying extension of our model, and apply it to the New York Times lists of weekly bestselling books. 1

3 0.64949393 315 nips-2012-Slice sampling normalized kernel-weighted completely random measure mixture models

Author: Nicholas Foti, Sinead Williamson

Abstract: A number of dependent nonparametric processes have been proposed to model non-stationary data with unknown latent dimensionality. However, the inference algorithms are often slow and unwieldy, and are in general highly specific to a given model formulation. In this paper, we describe a large class of dependent nonparametric processes, including several existing models, and present a slice sampler that allows efficient inference across this class of models. 1

4 0.64931637 244 nips-2012-Nonconvex Penalization Using Laplace Exponents and Concave Conjugates

Author: Zhihua Zhang, Bojun Tu

Abstract: In this paper we study sparsity-inducing nonconvex penalty functions using Lévy processes. We define such a penalty as the Laplace exponent of a subordinator. Accordingly, we propose a novel approach for the construction of sparsity-inducing nonconvex penalties. Particularly, we show that the nonconvex logarithmic (LOG) and exponential (EXP) penalty functions are the Laplace exponents of Gamma and compound Poisson subordinators, respectively. Additionally, we explore the concave conjugate of nonconvex penalties. We find that the LOG and EXP penalties are the concave conjugates of negative Kullback-Leibler (KL) distance functions. Furthermore, the relationship between these two penalties is due to the asymmetry of the KL distance. 1

5 0.64541876 47 nips-2012-Augment-and-Conquer Negative Binomial Processes

Author: Mingyuan Zhou, Lawrence Carin

Abstract: By developing data augmentation methods unique to the negative binomial (NB) distribution, we unite seemingly disjoint count and mixture models under the NB process framework. We develop fundamental properties of the models and derive efficient Gibbs sampling inference. We show that the gamma-NB process can be reduced to the hierarchical Dirichlet process with normalization, highlighting its unique theoretical, structural and computational advantages. A variety of NB processes with distinct sharing mechanisms are constructed and applied to topic modeling, with connections to existing algorithms, showing the importance of inferring both the NB dispersion and probability parameters. 1

6 0.54654348 219 nips-2012-Modelling Reciprocating Relationships with Hawkes Processes

7 0.53524101 287 nips-2012-Random function priors for exchangeable arrays with applications to graphs and relational data

8 0.52743715 89 nips-2012-Coupling Nonparametric Mixtures via Latent Dirichlet Processes

9 0.43996054 246 nips-2012-Nonparametric Max-Margin Matrix Factorization for Collaborative Prediction

10 0.42648757 73 nips-2012-Coding efficiency and detectability of rate fluctuations with non-Poisson neuronal firing

11 0.40150803 54 nips-2012-Bayesian Probabilistic Co-Subspace Addition

12 0.39192548 107 nips-2012-Effective Split-Merge Monte Carlo Methods for Nonparametric Models of Sequential Data

13 0.38927367 52 nips-2012-Bayesian Nonparametric Modeling of Suicide Attempts

14 0.38699013 111 nips-2012-Efficient Sampling for Bipartite Matching Problems

15 0.37529206 138 nips-2012-Fully Bayesian inference for neural models with negative-binomial spiking

16 0.37287247 165 nips-2012-Iterative ranking from pair-wise comparisons

17 0.36811751 299 nips-2012-Scalable imputation of genetic data with a discrete fragmentation-coagulation process

18 0.3607592 57 nips-2012-Bayesian estimation of discrete entropy with mixtures of stick-breaking priors

19 0.35799378 253 nips-2012-On Triangular versus Edge Representations --- Towards Scalable Modeling of Networks

20 0.35417429 223 nips-2012-Multi-criteria Anomaly Detection using Pareto Depth Analysis


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.022), (21, 0.039), (38, 0.092), (39, 0.035), (42, 0.012), (54, 0.014), (55, 0.06), (74, 0.063), (76, 0.123), (80, 0.099), (84, 0.311), (92, 0.025)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.7409206 59 nips-2012-Bayesian nonparametric models for bipartite graphs

Author: Francois Caron

Abstract: We develop a novel Bayesian nonparametric model for random bipartite graphs. The model is based on the theory of completely random measures and is able to handle a potentially infinite number of nodes. We show that the model has appealing properties and in particular it may exhibit a power-law behavior. We derive a posterior characterization, a generative process for network growth, and a simple Gibbs sampler for posterior simulation. Our model is shown to be well fitted to several real-world social networks. 1

2 0.73648536 234 nips-2012-Multiresolution analysis on the symmetric group

Author: Risi Kondor, Walter Dempsey

Abstract: There is no generally accepted way to define wavelets on permutations. We address this issue by introducing the notion of coset based multiresolution analysis (CMRA) on the symmetric group, find the corresponding wavelet functions, and describe a fast wavelet transform for sparse signals. We discuss potential applications in ranking, sparse approximation, and multi-object tracking. 1

3 0.72085303 102 nips-2012-Distributed Non-Stochastic Experts

Author: Varun Kanade, Zhenming Liu, Bozidar Radunovic

Abstract: We consider the online distributed non-stochastic experts problem, where the distributed system consists of one coordinator node that is connected to k sites, and the sites are required to communicate with each other via the coordinator. At each time-step t, one of the k site nodes has to pick an expert from the set {1, . . . , n}, and the same site receives information about payoffs of all experts for that round. The goal of the distributed system is to minimize regret at time horizon T, while simultaneously keeping communication to a minimum. The two extreme solutions to this problem are: (i) Full communication: This essentially simulates the non-distributed setting to obtain the optimal O(√(log(n)T)) regret bound at the cost of T communication. (ii) No communication: Each site runs an independent copy: the regret is O(√(log(n)kT)) and the communication is 0. This paper shows the difficulty of simultaneously achieving regret asymptotically better than √(kT) and communication better than T. We give a novel algorithm that for an oblivious adversary achieves a non-trivial trade-off: regret O(√(k^{5(1+ε)/6} T)) and communication O(T/k^ε), for any value of ε ∈ (0, 1/5). We also consider a variant of the model, where the coordinator picks the expert. In this model, we show that the label-efficient forecaster of Cesa-Bianchi et al. (2005) already gives us a strategy that is near optimal in the regret vs communication trade-off. 1

4 0.68704969 51 nips-2012-Bayesian Hierarchical Reinforcement Learning

Author: Feng Cao, Soumya Ray

Abstract: We describe an approach to incorporating Bayesian priors in the MAXQ framework for hierarchical reinforcement learning (HRL). We define priors on the primitive environment model and on task pseudo-rewards. Since models for composite tasks can be complex, we use a mixed model-based/model-free learning approach to find an optimal hierarchical policy. We show empirically that (i) our approach results in improved convergence over non-Bayesian baselines, (ii) using both task hierarchies and Bayesian priors is better than either alone, (iii) taking advantage of the task hierarchy reduces the computational cost of Bayesian reinforcement learning and (iv) in this framework, task pseudo-rewards can be learned instead of being manually specified, leading to hierarchically optimal rather than recursively optimal policies. 1

5 0.60795152 264 nips-2012-Optimal kernel choice for large-scale two-sample tests

Author: Arthur Gretton, Dino Sejdinovic, Heiko Strathmann, Sivaraman Balakrishnan, Massimiliano Pontil, Kenji Fukumizu, Bharath K. Sriperumbudur

Abstract: Given samples from distributions p and q, a two-sample test determines whether to reject the null hypothesis that p = q, based on the value of a test statistic measuring the distance between the samples. One choice of test statistic is the maximum mean discrepancy (MMD), which is a distance between embeddings of the probability distributions in a reproducing kernel Hilbert space. The kernel used in obtaining these embeddings is critical in ensuring the test has high power, and correctly distinguishes unlike distributions with high probability. A means of parameter selection for the two-sample test based on the MMD is proposed. For a given test level (an upper bound on the probability of making a Type I error), the kernel is chosen so as to maximize the test power, and minimize the probability of making a Type II error. The test statistic, test threshold, and optimization over the kernel parameters are obtained with cost linear in the sample size. These properties make the kernel selection and test procedures suited to data streams, where the observations cannot all be stored in memory. In experiments, the new kernel selection approach yields a more powerful test than earlier kernel selection heuristics.

6 0.53547561 215 nips-2012-Minimizing Uncertainty in Pipelines

7 0.53341973 197 nips-2012-Learning with Recursive Perceptual Representations

8 0.53330863 168 nips-2012-Kernel Latent SVM for Visual Recognition

9 0.53129566 193 nips-2012-Learning to Align from Scratch

10 0.53092456 172 nips-2012-Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs

11 0.5307582 188 nips-2012-Learning from Distributions via Support Measure Machines

12 0.52958757 210 nips-2012-Memorability of Image Regions

13 0.5286299 211 nips-2012-Meta-Gaussian Information Bottleneck

14 0.52825838 232 nips-2012-Multiplicative Forests for Continuous-Time Processes

15 0.52817541 355 nips-2012-Truncation-free Online Variational Inference for Bayesian Nonparametric Models

16 0.52765298 279 nips-2012-Projection Retrieval for Classification

17 0.52742434 274 nips-2012-Priors for Diversity in Generative Latent Variable Models

18 0.52690983 229 nips-2012-Multimodal Learning with Deep Boltzmann Machines

19 0.52571279 83 nips-2012-Controlled Recognition Bounds for Visual Learning and Exploration

20 0.52459341 104 nips-2012-Dual-Space Analysis of the Sparse Linear Model