nips nips2012 nips2012-192 knowledge-graph by maker-knowledge-mining

192 nips-2012-Learning the Dependency Structure of Latent Factors


Source: pdf

Author: Yunlong He, Yanjun Qi, Koray Kavukcuoglu, Haesun Park

Abstract: In this paper, we study latent factor models with dependency structure in the latent space. We propose a general learning framework which induces sparsity on the undirected graphical model imposed on the vector of latent factors. A novel latent factor model, SLFA, is then proposed as a matrix factorization problem with a special regularization term that encourages collaborative reconstruction. The main benefit (novelty) of the model is that we can simultaneously learn the lower-dimensional representation for data and model the pairwise relationships between latent factors explicitly. An on-line learning algorithm is devised to make the model feasible for large-scale learning problems. Experimental results on two synthetic and two real-world data sets demonstrate that the pairwise relationships and latent factors learned by our model provide a more structured way of exploring high-dimensional data, and the learned representations achieve state-of-the-art classification performance.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 com Abstract In this paper, we study latent factor models with dependency structure in the latent space. [sent-6, score-0.518]

2 We propose a general learning framework which induces sparsity on the undirected graphical model imposed on the vector of latent factors. [sent-7, score-0.303]

3 A novel latent factor model SLFA is then proposed as a matrix factorization problem with a special regularization term that encourages collaborative reconstruction. [sent-8, score-0.435]

4 The main benefit (novelty) of the model is that we can simultaneously learn the lower-dimensional representation for data and model the pairwise relationships between latent factors explicitly. [sent-9, score-0.522]

5 To enable the efficient processing of large data collections, latent factor models (LFMs) have been proposed to find concise descriptions of the members of a data collection. [sent-13, score-0.305]

6 A random vector x ∈ RM is assumed to be generated by a linear combination of a set of basis vectors, i. [sent-14, score-0.104]

7 , BK ] stores the set of unknown basis vectors and “factor” si (i ∈ {1, . [sent-19, score-0.158]

8 In this paper, we consider the problem of learning the hidden dependency structure of latent factors in complex data sets. [sent-24, score-0.408]

9 Our goal includes two main aspects: (1) to learn the interpretable lower-dimensional representations hidden in a set of data samples, and (2) to simultaneously model the pairwise interactions of latent factors. [sent-25, score-0.417]

10 The statistical structure captured by LFM methods such as Principal Component Analysis (PCA) is limited in interpretability, due to their anti-correlation assumption on the latent factors. [sent-27, score-0.222]

11 For example, when a face image is represented as a linear superposition of PCA bases with uncorrelated coefficients learned by PCA, there exist complex cancellations between the basis images [14]. [sent-28, score-0.231]

12 Methods that theoretically assume independence of components like ICA [10] or sparse coding [15] fail to generate independent representations in practice. [sent-29, score-0.103]

13 Instead of imposing this unrealistic assumption, more recent works [18, 25, 27] propose to allow correlated latent factors, which has been shown to be helpful in obtaining better performance on various tasks. [sent-33, score-0.24]

14 Particularly, the sparse structure of the latent factor network is often preferred but has never been explicitly explored in the learning process [2, 8, 23]. [sent-37, score-0.34]

15 For example, when mining the enormous on-line news-text documents, a method discovering semantically meaningful latent topics and a concise graph connecting the topics will greatly assist intelligent browsing, organizing and accessing of these documents. [sent-38, score-0.4]

16 The main contribution in this paper is a general LFM method that models the pairwise relationships between latent factors by sparse graphical models. [sent-39, score-0.573]

17 By introducing a generalized Tikhonov regularization, we enforce that the interactions between latent factors influence the learning of both the latent factors and the basis vectors. [sent-40, score-0.867]

18 As a result, we learn meaningful latent factors and simultaneously obtain a graph where the nodes represent hidden groups and the edges represent their pairwise relationships. [sent-41, score-0.516]

19 This graphical representation helps us analyze collections of complex data samples in a much more structured and organized way. [sent-42, score-0.09]

20 The latent representations of data samples obtained from our model capture deeper signals hidden in the data, which produce useful features for discriminative tasks and in-depth analysis, e. [sent-43, score-0.273]

21 To learn the hidden factors for generating x, the natural parameter η is assumed to be represented by a linear combination of basis vectors, i. [sent-51, score-0.291]

22 Let G = (V, E) denote a graph with K nodes, corresponding to the K latent factors {s1 , . [sent-59, score-0.393]

23 Since θij = 0 indicates that latent factor si and latent factor sj are conditionally independent given the other latent factors, the graph G presents an illustrative view of the statistical dependencies between latent factors. [sent-63, score-1.093]
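
For reference, the sparse Gaussian graphical model prior on s that is used as the default setting later can be written in the following standard form (a reconstruction from the surrounding text, not a verbatim copy of the paper's Eq. (5)):

```latex
p(s \mid \mu, \Theta) \;\propto\; \sqrt{\det \Theta}\;
\exp\!\Big( -\tfrac{1}{2}\,(s-\mu)^{\top} \Theta \,(s-\mu) \Big),
\qquad
\theta_{ij} = 0 \;\Longleftrightarrow\; s_i \perp\!\!\!\perp s_j \mid s_{\text{rest}} .
```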

24 With such a hierarchical and flexible model, there would be significant risk of over-fitting, especially when we consider all possible interactions between K latent factors. [sent-64, score-0.245]

25 , x(N ) }, the Maximum a Posteriori (MAP) estimates of the basis matrix B, the latent factors in S = [s(1) , . [sent-74, score-0.5]

26 , s(N)] and the parameters {µ, Θ} of the latent factor network are therefore the solution of the following problem: min_{B,S,Θ} (1/N) Σ_i {−log h(x(i)) + A(Bs(i)) − s(i)ᵀBᵀT(x(i))} − (1/N) µᵀS1_N + (1/(2N)) tr(SᵀΘS) + (ρ/2) ‖Θ‖₁, subject to ‖B_k‖ ≤ 1, k = 1, ..., K. [sent-77, score-0.275]

27 Subproblem (9) is easy to solve for real-valued s(i) but generally hard when the latent factors only admit discrete values. [sent-89, score-0.359]

28 Subproblem (11) minimizes the sum of a differentiable convex function and an L1 regularization term, for which a few recently developed methods, such as variants of ADMM [6], can be very efficient. [sent-91, score-0.123]

29 We instantiate Eq. (8) for the case when x follows a multivariate normal distribution and s follows a sparse Gaussian graphical model (SGGM). [sent-94, score-0.125]

30 We name our model under this default setting as “structured latent factor analysis” (SLFA) and compare 1 it to related works. [sent-95, score-0.275]

31 Assume p(x|η) = (2π)^{−M/2} exp(−‖x − η‖²/(2σ²)) and s ∼ N(µ, Φ⁻¹), with sparse precision matrix Φ (inverse covariance). [sent-96, score-0.19]
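
Under these Gaussian assumptions, the MAP problem specializes (up to constants, and taking µ = 0 for brevity) to roughly the following regularized matrix factorization; this is an assumed instantiation, the exact scaling in the paper may differ, and the log-determinant term comes from the normalizer of the Gaussian prior, which is what makes the Φ update a graphical-lasso-type problem:

```latex
\min_{B,\,S,\,\Phi}\;
\frac{1}{2\sigma^{2}N}\,\lVert X - BS \rVert_F^{2}
\;+\; \frac{1}{2N}\,\operatorname{tr}\!\left(S^{\top}\Phi S\right)
\;-\; \frac{1}{2}\log\det\Phi
\;+\; \frac{\rho}{2}\,\lVert \Phi \rVert_{1}
\qquad \text{s.t. } \lVert B_k \rVert_2 \le 1,\; k=1,\dots,K .
```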

32 If Φi,j > 0, minimizing the objective function will prevent si and sj from being simultaneously large, and we say the i-th factor and the j-th factor are negatively related. [sent-110, score-0.244]

33 If Φi,j < 0, the solution is likely to have si and sj of the same sign, and we say the i-th factor and the j-th factor are positively related. [sent-111, score-0.215]

34 If Φi,j = 0, the regularization doesn’t induce interaction between si and sj in the objective function. [sent-112, score-0.155]

35 Therefore, this regularization term makes SLFA produce a collaborative reconstruction based on the conditional dependencies between latent factors. [sent-113, score-0.299]
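
A minimal Python sketch of this effect (toy data; the sizes, values, and helper name are illustrative assumptions, not the authors' code), using the Gaussian closed-form factor estimate that appears later in Algorithm 1:

```python
import numpy as np

# Toy illustration: how the sign of the off-diagonal entry Phi[0, 1] couples
# two factors in the closed-form estimate s = (B^T B + sigma^2 * Phi)^{-1} B^T x.
rng = np.random.default_rng(0)
M, sigma2 = 20, 0.5
B = rng.normal(size=(M, 2))
B /= np.linalg.norm(B, axis=0)                  # unit-norm basis columns
x = B @ np.array([1.0, 1.0]) + 0.05 * rng.normal(size=M)

def solve_s(off_diag):
    Phi = np.array([[1.0, off_diag], [off_diag, 1.0]])
    return np.linalg.solve(B.T @ B + sigma2 * Phi, B.T @ x)

print("Phi01 > 0 (negatively related):", solve_s(+0.8))   # discourages co-activation
print("Phi01 < 0 (positively related):", solve_s(-0.8))   # rewards same-sign factors
print("Phi01 = 0 (no interaction):    ", solve_s(0.0))
```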

36 On one hand, the collaborative nature makes SLFA capture deeper statistical structure hidden in the data set, compared to the matrix factorization problem with the Tikhonov regularization ‖S‖²_F or sparse coding with a sparsity-inducing regularization such as ‖S‖₁. [sent-114, score-0.333]

37 On the other hand, SLFA encourages sparse interactions, which is very different from previous works such as the correlated topic model [2] and the latent Gaussian model [18], where the latent factors are densely related. [sent-115, score-0.748]

38 As summarized in Algorithm 1, at each iteration we randomly fetch a mini-batch of observations and simultaneously compute their latent factor vectors s. [sent-121, score-0.275]

39 Then the latent factor vectors are used to update the basis matrix B in a stochastic gradient descent fashion, with projections onto the constraint set. [sent-122, score-0.439]

40 , x(N)], initial guess of basis matrix B, initial precision matrix Φ = I, number of iterations T, parameters σ² and ρ, step-size γ, mini-batch size N. [sent-128, score-0.266]

41 – Compute the latent factor vectors S_batch = (BᵀB + σ²Φ)⁻¹BᵀX_batch. [sent-133, score-0.298]

42 – Update the basis matrix B using a gradient descent step: B ← B − (γ/N)[B S_batch − X_batch] S_batchᵀ. [sent-134, score-0.141]

43 – Solve the subproblem (13) to update the sparse inverse covariance matrix Φ using all available latent factor vectors in S. [sent-138, score-0.498]
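
A hedged Python sketch of this online loop under the default Gaussian setting; the function name, the mini-batch handling, the refit frequency for Φ, the treatment of µ (ignored here), and the use of scikit-learn's graphical_lasso in place of subproblem (13) are assumptions for illustration rather than the authors' exact Algorithm 1:

```python
import numpy as np
from sklearn.covariance import graphical_lasso

def slfa_online(X, K, sigma2=0.5, rho=0.1, gamma=0.01, T=200, batch=64, seed=0):
    """Sketch of the online SLFA loop (Gaussian setting); constants may differ."""
    rng = np.random.default_rng(seed)
    M, N = X.shape
    B = rng.normal(size=(M, K))
    B /= np.maximum(np.linalg.norm(B, axis=0), 1.0)   # project each ||B_k|| <= 1
    Phi = np.eye(K)                                    # initial precision matrix
    S = np.zeros((K, N))
    for t in range(T):
        idx = rng.choice(N, size=batch, replace=False)
        Xb = X[:, idx]
        # closed-form latent factors for the Gaussian case
        Sb = np.linalg.solve(B.T @ B + sigma2 * Phi, B.T @ Xb)
        S[:, idx] = Sb
        # stochastic gradient step on B, then project columns back to the ball
        B -= (gamma / batch) * (B @ Sb - Xb) @ Sb.T
        B /= np.maximum(np.linalg.norm(B, axis=0), 1.0)
        # periodically refit the sparse precision matrix of the latent factors
        if (t + 1) % 10 == 0:
            emp_cov = np.cov(S, bias=True) + 1e-6 * np.eye(K)
            _, Phi = graphical_lasso(emp_cov, alpha=rho)
    return B, S, Phi
```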

44 A large ρ will result in a diagonal precision matrix Φ, indicating that the latent factors are conditionally independent. [sent-140, score-0.484]

45 Alternatively, for visual analysis of latent factors, we can select multiple values of ρ to obtain Φ with desired sparsity. [sent-150, score-0.222]

46 Relationship to Sparse Gaussian Graphical Model: We can also see SLFA as a generalization of the sparse Gaussian graphical model. [sent-151, score-0.125]

47 In this sense, SLFA could be seen as the sparse Gaussian graphical model of s = Wx, i. [sent-155, score-0.125]

48 A few recent efforts [3, 24] also combined the SGGM model with latent factor models. [sent-158, score-0.275]

49 Unlike our SLFA, these methods still aim at modeling the interactions between the original features and do not consider interactions in the latent factor space. [sent-160, score-0.365]

50 Instead, SLFA is a hierarchical model and the learned pairwise relationships are on the latent factor level. [sent-161, score-0.397]

51 SLFA, however, can dramatically reduce the problem to learning a 50 × 50 sparse precision matrix and the corresponding graph of 50 nodes. [sent-163, score-0.224]

52 Intuitively, sparse-coding-based works (such as [7]) try to remove the redundancy in the representation of data, while SLFA encourages a (sparse) collaborative reconstruction of the data from the latent bases. [sent-167, score-0.383]

53 [12] proposed a method that can learn latent factors with a given tree structure. [sent-169, score-0.381]

54 (14), but uses a different regularization term, which imposes overlapped group sparsity on the factors. [sent-173, score-0.088]

55 In contrast, SLFA can learn a more general graphical structure among latent factors and does not assume that a data sample maps to a sparse combination of basis vectors. [sent-174, score-0.61]

56 The SLFA model has a similar hierarchy to the correlated topic model [2] and the latent Gaussian model [18]. [sent-175, score-0.275]

57 Besides the key difference of sparsity, SLFA directly uses the precision matrix to learn latent factor networks, while the other two works learn the covariance matrix by Bayesian methods. [sent-176, score-0.501]

58 1 Synthetic Data I: Four Different Graphical Relationships The first experiment uses randomly generated synthetic data with different graphical structures of latent factors. [sent-179, score-0.304]

59 It aims to test whether SLFA can find the true latent factors and the true relationships among them, and to study the effect of the parameter ρ on the results. [sent-180, score-0.807]

60 We use four special cases of Sparse Gaussian Graphical Model to generate the latent factors. [sent-181, score-0.242]

61 The underlying graph is either a ring, a grid, a tree, or a random sparse graph, as shown in Figure 1. [sent-182, score-0.099]
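
A small assumed generator for this kind of synthetic data (the ring case is shown; the graph size, edge strength, and noise level are illustrative choices, not the paper's exact settings):

```python
import numpy as np

def ring_precision(K, strength=0.4):
    """Ring-structured precision matrix; strength < 0.5 keeps it positive definite."""
    Phi = np.eye(K)
    for i in range(K):
        j = (i + 1) % K
        Phi[i, j] = Phi[j, i] = strength        # one edge per neighbouring pair
    return Phi

def sample_synthetic(K=20, M=100, N=5000, noise=0.1, seed=0):
    """Draw latent factors from the SGGM and emit observations x = B s + noise."""
    rng = np.random.default_rng(seed)
    Phi_true = ring_precision(K)
    S = rng.multivariate_normal(np.zeros(K), np.linalg.inv(Phi_true), size=N).T
    B_true = rng.normal(size=(M, K))
    B_true /= np.linalg.norm(B_true, axis=0)    # unit-norm true bases
    X = B_true @ S + noise * rng.normal(size=(M, N))
    return X, B_true, Phi_true
```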

62 (Figure 1, panels (g) and (h): F-score versus −log2(ρ) for the tree and random graphs.) Figure 1: Recovering structured latent factors from data. [sent-199, score-0.389]

63 On the upper row are four different underlying graphical models of latent factors. [sent-200, score-0.302]

64 A red edge means the two latent factors are positively related (Φ∗ij < 0); a blue edge means the two latent factors are negatively related (Φ∗ij > 0). [sent-201, score-0.81]

65 We compare SLFA to four other methods for learning the basis matrix B and the precision matrix Φ from the data. [sent-221, score-0.286]

66 The first one is NMF, where we learn a nonnegative basis B from the data and then learn the sparse precision matrix Φ for the corresponding factor vectors (no nonnegativity constraint on the factors) by SGGM. [sent-222, score-0.414]

67 The second one is an ideal case where we have the "oracle" of the true basis B∗: after fitting the data to the true basis, we learn the sparse precision matrix Φ by SGGM. [sent-223, score-0.468]

68 In all cases except the oracle method, we have a non-convex problem, so after we obtain the learned basis vectors we use the Hungarian algorithm to align them with the true basis vectors based on cosine similarity. [sent-226, score-0.333]

69 We compute the precision and recall rates for recovering the relationship between latent factors by comparing the learned Φ with the true precision matrix Φ∗ . [sent-227, score-0.691]
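
A sketch, under assumed conventions, of this evaluation step: Hungarian alignment of the learned bases to the true ones by cosine similarity, followed by precision/recall/F-score on the off-diagonal support of the aligned Φ (the tolerance and helper name are hypothetical, not from the paper):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_and_score(B_hat, Phi_hat, B_true, Phi_true, tol=1e-3):
    """Align learned bases to the true ones, then score the recovered support."""
    Bn_hat = B_hat / np.linalg.norm(B_hat, axis=0)
    Bn_true = B_true / np.linalg.norm(B_true, axis=0)
    cos_sim = np.abs(Bn_true.T @ Bn_hat)             # K x K similarity matrix
    _, col = linear_sum_assignment(-cos_sim)         # maximise total similarity
    Phi_aligned = Phi_hat[np.ix_(col, col)]          # permute to match true order
    est = np.abs(Phi_aligned) > tol
    true = np.abs(Phi_true) > tol
    off = ~np.eye(len(Phi_true), dtype=bool)         # ignore the diagonal
    tp = np.sum(est & true & off)
    prec = tp / max(np.sum(est & off), 1)
    rec = tp / max(np.sum(true & off), 1)
    f1 = 2 * prec * rec / max(prec + rec, 1e-12)
    return prec, rec, f1
```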

70 We can see that for all four cases, our proposed method SLFA is as good as the “oracle” method at recovering the pairwise relationship between latent factors. [sent-232, score-0.352]

71 NMF most probably fails to find the right basis since it does not consider any higher-level information about the interactions between basis elements; hence SGGM cannot find meaningful relationships between the factors obtained from NMF. [sent-233, score-0.406]

72 Since the latent factors have dense interactions in the L2 version of SLFA, combining it with SGGM post-processing improves the performance significantly; however, it still performs worse than SLFA. [sent-235, score-0.382]

73 This experiment also confirms that the idea of performing an integrated learning of the bases together with a regularized precision matrix is essential for recovering the true structure in the data. [sent-236, score-0.231]

74 2 Synthetic Data II: Parts-based Images The second experiment also utilizes a simulated data set based on images to compare SLFA with popular latent factor models. [sent-238, score-0.311]

75 For Φ(i, j) > 0, Bi and Bj are negatively related (exclusive); for Φ(i, j) < 0, Bi and Bj are positively related (supportive). [sent-254, score-0.092]

76 Each image is essentially a linear combination of five latent parts shown in Figure 2a. [sent-255, score-0.222]

77 Given 37 basis images, we first randomly select one of the five big circles as the body of the “bugs”. [sent-256, score-0.136]

78 Each body shape is associated with four positions where the legs of the bug are located. [sent-257, score-0.16]

79 We combine the selected five latent parts with random coefficients that are sampled from the uniform distribution and multiplied by −1 with probability 0. [sent-260, score-0.222]

80 Finally, we add a randomly selected basis with small random coefficients, plus Gaussian random noise, to the image to introduce noise and confusion into the data set. [sent-262, score-0.104]

81 The generating process (Figure 2b) indicates a positive relationship between one type of body and its associated legs, as well as a negative relationship between the circle and square located at the same position. [sent-264, score-0.108]

82 Using SLFA and two other baseline algorithms, PCA and NMF, we learn a set of latent bases and compare the results of the three methods in Figure 2e. [sent-265, score-0.302]

83 We can see that the basis images generated by SLFA are almost exactly the same as the true latent bases. [sent-266, score-0.386]

84 This is due to the fact that SLFA accounts for the sparse interaction between factors in the joint optimization problem and encourages collaborative reconstruction. [sent-267, score-0.305]

85 The NMF basis (shown in the supplementary material due to space considerations) in this case also turns out to be similar to the true basis; however, one can still observe that many components contain mixed structures, since NMF cannot capture the true data generation process. [sent-268, score-0.152]

86 More importantly, SLFA provides the convenience of analyzing the relationship between the bases using the precision matrix Φ. [sent-271, score-0.221]

87 In Figure 2d, we analyze the relational structure learned in the precision matrix Φ. [sent-272, score-0.158]

88 The most negatively related (exclusive) pairs (the (i, j) entries with the largest positive values in Φ) are circular and square legs, which conforms fully to the generation process, since only one of them is chosen for any given location. [sent-273, score-0.115]

89 Accordingly, the most positively related pairs are a body shape and one of its associated legs since every bug has a body and four legs with fixed positions. [sent-274, score-0.303]
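
A small assumed helper for this kind of inspection, listing the most "exclusive" (largest positive off-diagonal of Φ) and most "supportive" (most negative off-diagonal) pairs of bases; the function name and arguments are illustrative:

```python
import numpy as np

def top_pairs(Phi, names, k=5):
    """Return the most supportive and most exclusive factor pairs from Phi."""
    iu = np.triu_indices(Phi.shape[0], k=1)          # upper-triangular pairs
    vals = Phi[iu]
    order = np.argsort(vals)
    pair = lambda t: (names[iu[0][t]], names[iu[1][t]], vals[t])
    supportive = [pair(t) for t in order[:k]]        # most negative entries
    exclusive = [pair(t) for t in order[::-1][:k]]   # most positive entries
    return supportive, exclusive
```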

90 In Figure 3, we plot a graph of topics (stand-alone topics removed) with positive interactions between each other and present the top 5 keywords for each topic. [sent-280, score-0.193]

91 Each edge corresponds to a negative element in the sparse precision matrix Φ. [sent-308, score-0.19]

92 This data set contains the gene expression values of 8,141 genes for 295 breast cancer tumor samples. [sent-324, score-0.126]

93 Lasso-overlapped-group, which is a logistic regression approach with the graph-guided sparsity enforced, uses a known biological network as the graphical (overlapped group) regularization on the lasso regression. [sent-327, score-0.182]

94 Gene groups, which usually correspond to biological processes or pathways, exhibit diverse pairwise dependency relationships with each other. [sent-336, score-0.139]

95 SLFA discovers these relationships while learning the latent representation of each data sample at the same time. [sent-337, score-0.263]

96 The learned structural information and latent gene groups are also confirmed by the biological function analysis in the supplementary document. [sent-339, score-0.332]

97 5 Conclusion In this paper we have introduced a novel structured latent factor model that simultaneously learns latent factors and their pairwise relationships. [sent-340, score-0.737]

98 The learned sparse interactions between latent factors are crucial for understanding complex data sets and for visually analyzing them. [sent-342, score-0.502]

99 The SLFA model is also a hierarchical extension of the sparse Gaussian graphical model: it generalizes the precision matrix from the original variable space to the latent factor space and optimizes the bases together with the precision matrix simultaneously. [sent-343, score-0.583]

100 : Exponential family sparse coding with applications to self-taught learning. [sent-412, score-0.103]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('slfa', 0.841), ('latent', 0.222), ('sggm', 0.184), ('factors', 0.137), ('basis', 0.104), ('precision', 0.088), ('bs', 0.086), ('nmf', 0.082), ('subproblem', 0.078), ('legs', 0.067), ('sparse', 0.065), ('graphical', 0.06), ('bases', 0.058), ('bic', 0.058), ('topics', 0.057), ('factor', 0.053), ('pairwise', 0.048), ('negatively', 0.048), ('gene', 0.048), ('pca', 0.047), ('bk', 0.047), ('xbatch', 0.046), ('interaction', 0.045), ('regularization', 0.045), ('positively', 0.044), ('relationships', 0.041), ('rho', 0.041), ('bug', 0.041), ('coding', 0.038), ('relationship', 0.038), ('matrix', 0.037), ('images', 0.036), ('topic', 0.035), ('exclusive', 0.035), ('cancer', 0.035), ('sj', 0.034), ('graph', 0.034), ('learned', 0.033), ('det', 0.032), ('collaborative', 0.032), ('body', 0.032), ('glasso', 0.031), ('tikhonov', 0.031), ('doesn', 0.031), ('bugs', 0.031), ('lfm', 0.031), ('obd', 0.031), ('sbatch', 0.031), ('yanjun', 0.031), ('si', 0.031), ('structured', 0.03), ('concise', 0.03), ('biological', 0.029), ('jenatton', 0.029), ('microarray', 0.029), ('ij', 0.028), ('hidden', 0.028), ('scaled', 0.027), ('lowerdimensional', 0.027), ('lasso', 0.027), ('totally', 0.027), ('encourages', 0.026), ('gaussian', 0.026), ('schmidt', 0.025), ('simultaneously', 0.025), ('obs', 0.025), ('koray', 0.025), ('arxiv', 0.025), ('kronecker', 0.024), ('true', 0.024), ('recovering', 0.024), ('deeper', 0.023), ('tumor', 0.023), ('georgia', 0.023), ('tr', 0.023), ('vectors', 0.023), ('interactions', 0.023), ('learn', 0.022), ('overlapped', 0.022), ('nec', 0.022), ('ring', 0.022), ('oracle', 0.022), ('svm', 0.022), ('synthetic', 0.022), ('bj', 0.022), ('sparsity', 0.021), ('dependency', 0.021), ('score', 0.021), ('factorization', 0.02), ('four', 0.02), ('bi', 0.02), ('controller', 0.02), ('breast', 0.02), ('raina', 0.02), ('covariance', 0.02), ('hsieh', 0.019), ('preprint', 0.018), ('correlated', 0.018), ('ss', 0.018), ('labs', 0.018)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999976 192 nips-2012-Learning the Dependency Structure of Latent Factors

Author: Yunlong He, Yanjun Qi, Koray Kavukcuoglu, Haesun Park

Abstract: In this paper, we study latent factor models with dependency structure in the latent space. We propose a general learning framework which induces sparsity on the undirected graphical model imposed on the vector of latent factors. A novel latent factor model SLFA is then proposed as a matrix factorization problem with a special regularization term that encourages collaborative reconstruction. The main benefit (novelty) of the model is that we can simultaneously learn the lowerdimensional representation for data and model the pairwise relationships between latent factors explicitly. An on-line learning algorithm is devised to make the model feasible for large-scale learning problems. Experimental results on two synthetic data and two real-world data sets demonstrate that pairwise relationships and latent factors learned by our model provide a more structured way of exploring high-dimensional data, and the learned representations achieve the state-of-the-art classification performance. 1

2 0.12149269 172 nips-2012-Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs

Author: Anima Anandkumar, Ragupathyraj Valluvan

Abstract: Graphical model selection refers to the problem of estimating the unknown graph structure given observations at the nodes in the model. We consider a challenging instance of this problem when some of the nodes are latent or hidden. We characterize conditions for tractable graph estimation and develop efficient methods with provable guarantees. We consider the class of Ising models Markov on locally tree-like graphs, which are in the regime of correlation decay. We propose an efficient method for graph estimation, and establish its structural consistency −δη(η+1)−2 when the number of samples n scales as n = Ω(θmin log p), where θmin is the minimum edge potential, δ is the depth (i.e., distance from a hidden node to the nearest observed nodes), and η is a parameter which depends on the minimum and maximum node and edge potentials in the Ising model. The proposed method is practical to implement and provides flexibility to control the number of latent variables and the cycle lengths in the output graph. We also present necessary conditions for graph estimation by any method and show that our method nearly matches the lower bound on sample requirements. Keywords: Graphical model selection, latent variables, quartet methods, locally tree-like graphs. 1

3 0.085782617 124 nips-2012-Factorial LDA: Sparse Multi-Dimensional Text Models

Author: Michael Paul, Mark Dredze

Abstract: Latent variable models can be enriched with a multi-dimensional structure to consider the many latent factors in a text corpus, such as topic, author perspective and sentiment. We introduce factorial LDA, a multi-dimensional model in which a document is influenced by K different factors, and each word token depends on a K-dimensional vector of latent variables. Our model incorporates structured word priors and learns a sparse product of factors. Experiments on research abstracts show that our model can learn latent factors such as research topic, scientific discipline, and focus (methods vs. applications). Our modeling improvements reduce test perplexity and improve human interpretability of the discovered factors. 1

4 0.077814169 19 nips-2012-A Spectral Algorithm for Latent Dirichlet Allocation

Author: Anima Anandkumar, Yi-kai Liu, Daniel J. Hsu, Dean P. Foster, Sham M. Kakade

Abstract: Topic modeling is a generalization of clustering that posits that observations (words in a document) are generated by multiple latent factors (topics), as opposed to just one. This increased representational power comes at the cost of a more challenging unsupervised learning problem of estimating the topic-word distributions when only words are observed, and the topics are hidden. This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of topic models, including Latent Dirichlet Allocation (LDA). For LDA, the procedure correctly recovers both the topic-word distributions and the parameters of the Dirichlet prior over the topic mixtures, using only trigram statistics (i.e., third order moments, which may be estimated with documents containing just three words). The method, called Excess Correlation Analysis, is based on a spectral decomposition of low-order moments via two singular value decompositions (SVDs). Moreover, the algorithm is scalable, since the SVDs are carried out only on k × k matrices, where k is the number of latent factors (topics) and is typically much smaller than the dimension of the observation (word) space. 1

5 0.075414129 168 nips-2012-Kernel Latent SVM for Visual Recognition

Author: Weilong Yang, Yang Wang, Arash Vahdat, Greg Mori

Abstract: Latent SVMs (LSVMs) are a class of powerful tools that have been successfully applied to many applications in computer vision. However, a limitation of LSVMs is that they rely on linear models. For many computer vision tasks, linear models are suboptimal and nonlinear models learned with kernels typically perform much better. Therefore it is desirable to develop the kernel version of LSVM. In this paper, we propose kernel latent SVM (KLSVM) – a new learning framework that combines latent SVMs and kernel methods. We develop an iterative training algorithm to learn the model parameters. We demonstrate the effectiveness of KLSVM using three different applications in visual recognition. Our KLSVM formulation is very general and can be applied to solve a wide range of applications in computer vision and machine learning. 1

6 0.074677527 70 nips-2012-Clustering by Nonnegative Matrix Factorization Using Graph Random Walk

7 0.073114716 326 nips-2012-Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses

8 0.072515294 138 nips-2012-Fully Bayesian inference for neural models with negative-binomial spiking

9 0.068493821 246 nips-2012-Nonparametric Max-Margin Matrix Factorization for Collaborative Prediction

10 0.066130735 166 nips-2012-Joint Modeling of a Matrix with Associated Text via Latent Binary Features

11 0.065424494 312 nips-2012-Simultaneously Leveraging Output and Task Structures for Multiple-Output Regression

12 0.064340688 164 nips-2012-Iterative Thresholding Algorithm for Sparse Inverse Covariance Estimation

13 0.063432336 274 nips-2012-Priors for Diversity in Generative Latent Variable Models

14 0.063089669 147 nips-2012-Graphical Models via Generalized Linear Models

15 0.061987057 327 nips-2012-Structured Learning of Gaussian Graphical Models

16 0.061614972 6 nips-2012-A Convex Formulation for Learning Scale-Free Networks via Submodular Relaxation

17 0.060493052 89 nips-2012-Coupling Nonparametric Mixtures via Latent Dirichlet Processes

18 0.05869738 77 nips-2012-Complex Inference in Neural Circuits with Probabilistic Population Codes and Topic Models

19 0.057244264 104 nips-2012-Dual-Space Analysis of the Sparse Linear Model

20 0.055971142 105 nips-2012-Dynamic Pruning of Factor Graphs for Maximum Marginal Prediction


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.175), (1, 0.059), (2, -0.029), (3, -0.034), (4, -0.059), (5, 0.009), (6, 0.0), (7, -0.066), (8, 0.001), (9, -0.008), (10, 0.03), (11, 0.05), (12, 0.008), (13, 0.04), (14, 0.036), (15, 0.058), (16, 0.091), (17, 0.051), (18, 0.023), (19, 0.017), (20, -0.011), (21, -0.001), (22, -0.039), (23, -0.052), (24, 0.016), (25, -0.017), (26, 0.057), (27, -0.059), (28, -0.006), (29, 0.055), (30, -0.066), (31, 0.015), (32, 0.013), (33, 0.003), (34, -0.008), (35, -0.041), (36, 0.01), (37, 0.06), (38, 0.099), (39, -0.083), (40, 0.012), (41, -0.03), (42, 0.054), (43, -0.032), (44, -0.007), (45, 0.104), (46, -0.07), (47, 0.058), (48, 0.006), (49, 0.005)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.93741351 192 nips-2012-Learning the Dependency Structure of Latent Factors

Author: Yunlong He, Yanjun Qi, Koray Kavukcuoglu, Haesun Park

Abstract: In this paper, we study latent factor models with dependency structure in the latent space. We propose a general learning framework which induces sparsity on the undirected graphical model imposed on the vector of latent factors. A novel latent factor model SLFA is then proposed as a matrix factorization problem with a special regularization term that encourages collaborative reconstruction. The main benefit (novelty) of the model is that we can simultaneously learn the lowerdimensional representation for data and model the pairwise relationships between latent factors explicitly. An on-line learning algorithm is devised to make the model feasible for large-scale learning problems. Experimental results on two synthetic data and two real-world data sets demonstrate that pairwise relationships and latent factors learned by our model provide a more structured way of exploring high-dimensional data, and the learned representations achieve the state-of-the-art classification performance. 1

2 0.67376643 166 nips-2012-Joint Modeling of a Matrix with Associated Text via Latent Binary Features

Author: Xianxing Zhang, Lawrence Carin

Abstract: A new methodology is developed for joint analysis of a matrix and accompanying documents, with the documents associated with the matrix rows/columns. The documents are modeled with a focused topic model, inferring interpretable latent binary features for each document. A new matrix decomposition is developed, with latent binary features associated with the rows/columns, and with imposition of a low-rank constraint. The matrix decomposition and topic model are coupled by sharing the latent binary feature vectors associated with each. The model is applied to roll-call data, with the associated documents defined by the legislation. Advantages of the proposed model are demonstrated for prediction of votes on a new piece of legislation, based only on the observed text of legislation. The coupling of the text and legislation is also shown to yield insight into the properties of the matrix decomposition for roll-call data. 1

3 0.66379458 54 nips-2012-Bayesian Probabilistic Co-Subspace Addition

Author: Lei Shi

Abstract: For modeling data matrices, this paper introduces Probabilistic Co-Subspace Addition (PCSA) model by simultaneously capturing the dependent structures among both rows and columns. Briefly, PCSA assumes that each entry of a matrix is generated by the additive combination of the linear mappings of two low-dimensional features, which distribute in the row-wise and column-wise latent subspaces respectively. In consequence, PCSA captures the dependencies among entries intricately, and is able to handle non-Gaussian and heteroscedastic densities. By formulating the posterior updating into the task of solving Sylvester equations, we propose an efficient variational inference algorithm. Furthermore, PCSA is extended to tackling and filling missing values, to adapting model sparseness, and to modelling tensor data. In comparison with several state-of-art methods, experiments demonstrate the effectiveness and efficiency of Bayesian (sparse) PCSA on modeling matrix (tensor) data and filling missing values.

4 0.65936321 172 nips-2012-Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs

Author: Anima Anandkumar, Ragupathyraj Valluvan

Abstract: Graphical model selection refers to the problem of estimating the unknown graph structure given observations at the nodes in the model. We consider a challenging instance of this problem when some of the nodes are latent or hidden. We characterize conditions for tractable graph estimation and develop efficient methods with provable guarantees. We consider the class of Ising models Markov on locally tree-like graphs, which are in the regime of correlation decay. We propose an efficient method for graph estimation, and establish its structural consistency −δη(η+1)−2 when the number of samples n scales as n = Ω(θmin log p), where θmin is the minimum edge potential, δ is the depth (i.e., distance from a hidden node to the nearest observed nodes), and η is a parameter which depends on the minimum and maximum node and edge potentials in the Ising model. The proposed method is practical to implement and provides flexibility to control the number of latent variables and the cycle lengths in the output graph. We also present necessary conditions for graph estimation by any method and show that our method nearly matches the lower bound on sample requirements. Keywords: Graphical model selection, latent variables, quartet methods, locally tree-like graphs. 1

5 0.64164543 22 nips-2012-A latent factor model for highly multi-relational data

Author: Rodolphe Jenatton, Nicolas L. Roux, Antoine Bordes, Guillaume R. Obozinski

Abstract: Many data such as social networks, movie preferences or knowledge bases are multi-relational, in that they describe multiple relations between entities. While there is a large body of work focused on modeling these data, modeling these multiple types of relations jointly remains challenging. Further, existing approaches tend to breakdown when the number of these types grows. In this paper, we propose a method for modeling large multi-relational datasets, with possibly thousands of relations. Our model is based on a bilinear structure, which captures various orders of interaction of the data, and also shares sparse latent factors across different relations. We illustrate the performance of our approach on standard tensor-factorization datasets where we attain, or outperform, state-of-the-art results. Finally, a NLP application demonstrates our scalability and the ability of our model to learn efficient and semantically meaningful verb representations. 1

6 0.63606215 124 nips-2012-Factorial LDA: Sparse Multi-Dimensional Text Models

7 0.63026083 19 nips-2012-A Spectral Algorithm for Latent Dirichlet Allocation

8 0.62329072 246 nips-2012-Nonparametric Max-Margin Matrix Factorization for Collaborative Prediction

9 0.59680748 287 nips-2012-Random function priors for exchangeable arrays with applications to graphs and relational data

10 0.58704954 365 nips-2012-Why MCA? Nonlinear sparse coding with spike-and-slab prior for neurally plausible image encoding

11 0.58702004 52 nips-2012-Bayesian Nonparametric Modeling of Suicide Attempts

12 0.55489618 220 nips-2012-Monte Carlo Methods for Maximum Margin Supervised Topic Models

13 0.55170625 312 nips-2012-Simultaneously Leveraging Output and Task Structures for Multiple-Output Regression

14 0.54626483 326 nips-2012-Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses

15 0.54509372 327 nips-2012-Structured Learning of Gaussian Graphical Models

16 0.54426783 180 nips-2012-Learning Mixtures of Tree Graphical Models

17 0.53672045 89 nips-2012-Coupling Nonparametric Mixtures via Latent Dirichlet Processes

18 0.5349921 317 nips-2012-Smooth-projected Neighborhood Pursuit for High-dimensional Nonparanormal Graph Estimation

19 0.53163248 346 nips-2012-Topology Constraints in Graphical Models

20 0.52648145 66 nips-2012-Causal discovery with scale-mixture model for spatiotemporal variance dependencies


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.356), (17, 0.017), (21, 0.03), (36, 0.013), (38, 0.11), (42, 0.019), (54, 0.024), (55, 0.022), (74, 0.047), (76, 0.136), (80, 0.076), (92, 0.045)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.95385814 124 nips-2012-Factorial LDA: Sparse Multi-Dimensional Text Models

Author: Michael Paul, Mark Dredze

Abstract: Latent variable models can be enriched with a multi-dimensional structure to consider the many latent factors in a text corpus, such as topic, author perspective and sentiment. We introduce factorial LDA, a multi-dimensional model in which a document is influenced by K different factors, and each word token depends on a K-dimensional vector of latent variables. Our model incorporates structured word priors and learns a sparse product of factors. Experiments on research abstracts show that our model can learn latent factors such as research topic, scientific discipline, and focus (methods vs. applications). Our modeling improvements reduce test perplexity and improve human interpretability of the discovered factors. 1

2 0.92304689 191 nips-2012-Learning the Architecture of Sum-Product Networks Using Clustering on Variables

Author: Aaron Dennis, Dan Ventura

Abstract: The sum-product network (SPN) is a recently-proposed deep model consisting of a network of sum and product nodes, and has been shown to be competitive with state-of-the-art deep models on certain difficult tasks such as image completion. Designing an SPN network architecture that is suitable for the task at hand is an open question. We propose an algorithm for learning the SPN architecture from data. The idea is to cluster variables (as opposed to data instances) in order to identify variable subsets that strongly interact with one another. Nodes in the SPN network are then allocated towards explaining these interactions. Experimental evidence shows that learning the SPN architecture significantly improves its performance compared to using a previously-proposed static architecture. 1

3 0.88369435 233 nips-2012-Multiresolution Gaussian Processes

Author: David B. Dunson, Emily B. Fox

Abstract: We propose a multiresolution Gaussian process to capture long-range, nonMarkovian dependencies while allowing for abrupt changes and non-stationarity. The multiresolution GP hierarchically couples a collection of smooth GPs, each defined over an element of a random nested partition. Long-range dependencies are captured by the top-level GP while the partition points define the abrupt changes. Due to the inherent conjugacy of the GPs, one can analytically marginalize the GPs and compute the marginal likelihood of the observations given the partition tree. This property allows for efficient inference of the partition itself, for which we employ graph-theoretic techniques. We apply the multiresolution GP to the analysis of magnetoencephalography (MEG) recordings of brain activity.

4 0.87949806 270 nips-2012-Phoneme Classification using Constrained Variational Gaussian Process Dynamical System

Author: Hyunsin Park, Sungrack Yun, Sanghyuk Park, Jongmin Kim, Chang D. Yoo

Abstract: For phoneme classification, this paper describes an acoustic model based on the variational Gaussian process dynamical system (VGPDS). The nonlinear and nonparametric acoustic model is adopted to overcome the limitations of classical hidden Markov models (HMMs) in modeling speech. The Gaussian process prior on the dynamics and emission functions respectively enable the complex dynamic structure and long-range dependency of speech to be better represented than that by an HMM. In addition, a variance constraint in the VGPDS is introduced to eliminate the sparse approximation error in the kernel matrix. The effectiveness of the proposed model is demonstrated with three experimental results, including parameter estimation and classification performance, on the synthetic and benchmark datasets. 1

same-paper 5 0.87210667 192 nips-2012-Learning the Dependency Structure of Latent Factors

Author: Yunlong He, Yanjun Qi, Koray Kavukcuoglu, Haesun Park

Abstract: In this paper, we study latent factor models with dependency structure in the latent space. We propose a general learning framework which induces sparsity on the undirected graphical model imposed on the vector of latent factors. A novel latent factor model SLFA is then proposed as a matrix factorization problem with a special regularization term that encourages collaborative reconstruction. The main benefit (novelty) of the model is that we can simultaneously learn the lowerdimensional representation for data and model the pairwise relationships between latent factors explicitly. An on-line learning algorithm is devised to make the model feasible for large-scale learning problems. Experimental results on two synthetic data and two real-world data sets demonstrate that pairwise relationships and latent factors learned by our model provide a more structured way of exploring high-dimensional data, and the learned representations achieve the state-of-the-art classification performance. 1

6 0.86990565 282 nips-2012-Proximal Newton-type methods for convex optimization

7 0.82706177 7 nips-2012-A Divide-and-Conquer Method for Sparse Inverse Covariance Estimation

8 0.78050435 12 nips-2012-A Neural Autoregressive Topic Model

9 0.77355498 332 nips-2012-Symmetric Correspondence Topic Models for Multilingual Text Analysis

10 0.73291957 342 nips-2012-The variational hierarchical EM algorithm for clustering hidden Markov models

11 0.68910807 354 nips-2012-Truly Nonparametric Online Variational Inference for Hierarchical Dirichlet Processes

12 0.67502248 19 nips-2012-A Spectral Algorithm for Latent Dirichlet Allocation

13 0.67299652 47 nips-2012-Augment-and-Conquer Negative Binomial Processes

14 0.6706292 78 nips-2012-Compressive Sensing MRI with Wavelet Tree Sparsity

15 0.66313016 166 nips-2012-Joint Modeling of a Matrix with Associated Text via Latent Binary Features

16 0.66190588 72 nips-2012-Cocktail Party Processing via Structured Prediction

17 0.66150957 69 nips-2012-Clustering Sparse Graphs

18 0.66099 104 nips-2012-Dual-Space Analysis of the Sparse Linear Model

19 0.65734577 172 nips-2012-Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs

20 0.65720654 150 nips-2012-Hierarchical spike coding of sound