nips nips2012 nips2012-352 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Han Liu, Fang Han, Cun-hui Zhang
Abstract: We advocate the use of a new distribution family—the transelliptical—for robust inference of high dimensional graphical models. The transelliptical family is an extension of the nonparanormal family proposed by Liu et al. (2009). Just as the nonparanormal extends the normal by transforming the variables using univariate functions, the transelliptical extends the elliptical family in the same way. We propose a nonparametric rank-based regularization estimator which achieves the parametric rates of convergence for both graph recovery and parameter estimation. Such a result suggests that the extra robustness and flexibility obtained by the semiparametric transelliptical modeling incurs almost no efficiency loss. We also discuss the relationship between this work with the transelliptical component analysis proposed by Han and Liu (2012). 1
Reference: text
sentIndex sentText sentNum sentScore
1 edu Abstract We advocate the use of a new distribution family—the transelliptical—for robust inference of high dimensional graphical models. [sent-5, score-0.176]
2 The transelliptical family is an extension of the nonparanormal family proposed by Liu et al. [sent-6, score-1.146]
3 Just as the nonparanormal extends the normal by transforming the variables using univariate functions, the transelliptical extends the elliptical family in the same way. [sent-8, score-1.44]
4 We propose a nonparametric rank-based regularization estimator which achieves the parametric rates of convergence for both graph recovery and parameter estimation. [sent-9, score-0.289]
5 Such a result suggests that the extra robustness and flexibility obtained by the semiparametric transelliptical modeling incurs almost no efficiency loss. [sent-10, score-0.762]
6 We also discuss the relationship between this work with the transelliptical component analysis proposed by Han and Liu (2012). [sent-11, score-0.741]
7 1 Introduction We consider the problem of learning high dimensional graphical models. [sent-12, score-0.115]
8 , Xd )T can be represented as an undirected graph denoted by G = (V, E), where V contains nodes corresponding to the d variables in X, and the edge set E describes the conditional independence relationship among X1 , . [sent-16, score-0.266]
9 / Most graph estimation methods rely on the Gaussian graphical models, in which the random vector X is assumed to be Gaussian: X ∼ Nd (µ, Σ). [sent-23, score-0.232]
10 Under this assumption, the graph G is encoded by the precision matrix Θ := Σ−1 . [sent-24, score-0.146]
11 More specifically, no edge connects Xj and Xk if and only if Θjk = 0. [sent-25, score-0.058]
12 In high dimensions where d n, [21] propose a neighborhood pursuit approach for estimating Gaussian graphical models by solving a collection of sparse regression problems using the Lasso [25, 3]. [sent-28, score-0.115]
13 [15, 14, 24] maximize the non-concave penalized likelihood to obtain an estimator with less bias than the traditional L1 -regularized estimator. [sent-31, score-0.086]
14 More recently, [29, 2] propose the graphical Dantzig selector and CLIME, which can be solved by linear programming and possess more favorable theoretical properties than the penalized likelihood approach. [sent-33, score-0.146]
15 1 Besides Gaussian models, [18] propose a semiparametric procedure named nonparanormal SKEP TIC which extends the Gaussian family to the more flexible semiparametric Gaussian copula family. [sent-34, score-0.512]
16 , fd , such that the transformed data f (X) := (f1 (X1 ), . [sent-38, score-0.27]
17 In another line of research, [26] extends the Gaussian graphical models to the elliptical graphical models. [sent-44, score-0.521]
18 However, for elliptical distributions, only the generalized partial correlation graph can be reliably estimated. [sent-45, score-0.533]
19 These graphs only represent the conditional uncorrelatedness, but conditional independence, among variables. [sent-46, score-0.125]
20 Therefore, by extending the Gaussian to the elliptical family, the gain in modeling flexibility is traded off with a loss in the strength of inference. [sent-47, score-0.267]
21 In a related work, [9] provide a latent variable interpretation of the generalized partial correlation graph for multivariate t-distributions. [sent-48, score-0.359]
22 However, the theoretical properties of their estimator is unknown. [sent-50, score-0.055]
23 In this paper, we introduce a new distribution family named transelliptical graphical model. [sent-51, score-0.95]
24 The transelliptical distribution is a generalization of the nonparanormal distribution proposed by [18]. [sent-53, score-1.02]
25 By mimicking how the nonparanormal extends the normal family, the transelliptical extends the elliptical family in the same way. [sent-54, score-1.412]
26 The transelliptical family contains the nonparanomral family and elliptical family. [sent-55, score-1.127]
27 To infer the graph structure, a rankbased procedure using the Kendall’s tau statistic is proposed. [sent-56, score-0.309]
28 We show such a procedure is adaptive over the transelliptical family: the procedure by default delivers a conditional uncorrelated graphs among certain latent variables; however, if the true distribution is the nonparanormal, the procedure automatically delivers the conditional independence graph. [sent-57, score-1.131]
29 Theoretically, even though the transelliptical family is much larger than the nonparanormal family, the same parametric rates of convergence for graph recovery and parameter estimation can be established. [sent-59, score-1.259]
30 These results suggest that the transelliptical graphical model can be used routinely as a replacement of the nonparanormal models. [sent-60, score-1.109]
31 An equivalent definition of an elliptical distribution is that its characteristic function can be written as exp(itT µ)φ(tT Σt), where φ is a properly-defined characteristic function which has a one-to-one mapping with ξ in Definition 2. [sent-69, score-0.309]
32 An elliptical distribution does not necessarily have a density. [sent-72, score-0.273]
33 We introduce the transelliptical graphical models in analogy to the nonparanormal graphical models [19, 18]. [sent-91, score-1.204]
34 The key concept is transelliptical distribution which is also introduced in [12]. [sent-92, score-0.728]
35 However, the definition of transelliptical distribution in this paper is slightly more restrictive than that in [12] due to the complication of graphical modeling. [sent-93, score-0.868]
36 More specifically, let R+ := {Σ ∈ Rd×d : ΣT = Σ, diag(Σ) = 1, Σ d 0}, (3) we define the transelliptical distribution as follows: Definition 3. [sent-94, score-0.728]
37 , fd ), if there exists a set of monotone univariate functions f1 , . [sent-102, score-0.341]
38 , fd and a nonnegative random variable ξ satisfying P(ξ = 0) = 0, such that (f1 (X1 ), . [sent-105, score-0.287]
39 d (4) 1 Here, Σ is called latent generalized correlation matrix . [sent-109, score-0.246]
40 We then discuss the relationship between the transelliptical family with the nonparanormal family, which is defined as follows: Definition 3. [sent-110, score-1.096]
41 , fd (Xd ))T ∼ Nd (0, Σ), where Σ ∈ R+ is called latent correlation matrix. [sent-124, score-0.434]
42 2, we see the transelliptical is a strict extension of the nonparanormal. [sent-127, score-0.705]
43 Both families assume there exits a set of univariate transformations such that the transformed data follow a base distribution: the nonparanormal exploits a normal base distribution; while the transelliptical exploits an elliptical base distribution. [sent-128, score-1.38]
44 In the nonparanormal, Σ is the correlation matrix for the latent normal, therefore it is called latent correlation matrix; In the transelliptical, Σ is the generalized correlation matrix for the latent elliptical distribution, therefore it is called latent generalized correlation matrix. [sent-129, score-1.07]
45 , fd ) where Σ ∈ R+ d is the latent generalized correlation matrix. [sent-134, score-0.487]
46 We define Θ := Σ−1 to be the latent generalized concentration matrix. [sent-136, score-0.164]
47 We define the latent generalized partial correlation matrix Γ as Γjk := −Θjk / Θjj · Θkk . [sent-138, score-0.271]
48 Let diag(A) be the matrix A with off-diagonal elements replaced by zero and A1/2 be the squared root matrix of A. [sent-139, score-0.058]
49 (5) Therefore, Γ has the same nonzero pattern as Σ−1 . [sent-141, score-0.052]
50 We then define a undirected graph G = (V, E): the vertex set V contains nodes corresponding to the d variables in X, and the edge set E satisfies (Xj , Xk ) ∈ E if and only if Γjk = 0 for j, k = 1, . [sent-142, score-0.148]
51 R+ (G) d (6) R+ d Given a graph G, we define to be the set containing all the Σ ∈ with zero entries at the positions specified by the graph G. [sent-146, score-0.234]
52 The transelliptical graphical model induced by G is defined as: Definition 3. [sent-147, score-0.82]
53 The transelliptical graphical model induced by a graph G, denoted by P(G), is defined to be the set of distributions: P(G) := all the transelliptical distributions T Ed (Σ, ξ; f1 , . [sent-149, score-1.68]
54 (7) d In the rest of this section, we prove some properties of the transelliptical family and discuss the interpretation of the meaning of the graph G. [sent-153, score-0.925]
55 This graph is called latent generalized partial correlation graph. [sent-154, score-0.359]
56 First, we show the transelliptical family is closed under marginalization and conditioning. [sent-155, score-0.791]
57 The marginal and the conditional distributions of (X1 , X2 )T given the remaining variables are still transellpitical. [sent-166, score-0.089]
58 18 of [8], the marginal distribution of (Z1 , Z2 )T and the conditional distribution of (Z1 , Z2 )T given the remaining Z3 , . [sent-180, score-0.117]
59 To see the conditional case, since X has continuous marginals and f1 , . [sent-185, score-0.054]
60 , fd are monotone, the distribution of (X1 , X2 )T conditional on X\{1,2} is the same as conditional on Z\{1,2} . [sent-188, score-0.401]
61 Combined with the fact that Z1 = f1 (X1 ), Z2 = f2 (X2 ), we know that (X1 , X2 )T | X\{1,2} follows a transelliptical distribution. [sent-189, score-0.705]
62 From (5), we see the matrices Γ and Θ have the same nonzero pattern, therefore, they encode the same graph G. [sent-190, score-0.151]
63 The next lemma shows that, if the second moment of X exists, the absence of an edge in the graph G is equivalent to the pairwise conditional uncorrelatedness of two corresponding latent variables. [sent-195, score-0.5]
64 , fd ) with Eξ 2 < ∞, and Zj := fj (Xj ) for j = 1, . [sent-204, score-0.31]
65 Therefore, the latent generalized correlation matrix Σ is the generalized correlation matrix of the latent variable Z. [sent-218, score-0.492]
66 It suffices to prove that, for elliptical distributions with Eξ 2 < ∞, the generalized partial correlation matrix Γ as defined in (5) encodes the conditional uncorrelatedness among the variables. [sent-219, score-0.665]
67 We say C separates A and B in the graph G if any path from a node in A to a node in B goes through at least one node in C. [sent-225, score-0.14]
68 The next lemma implies the equivalence between the pairwise and global conditional uncorrelatedness of the latent variables for the transelliptical graphical models. [sent-227, score-1.155]
69 This lemma connects the graph theory with probability theory. [sent-228, score-0.179]
70 , fd ) be any element of the transelliptical graphical model P(G) satisfying Eξ 2 < ∞. [sent-234, score-1.107]
71 Then C separates A and B in G if and only if ZA and ZB are conditional uncorrelated given ZC . [sent-242, score-0.111]
72 It then suffices to show the pairwise conditional uncorrelatedness implies the global conditional uncorrelatedness for the elliptical family. [sent-245, score-0.676]
73 Compared with the nonparanormal graphical model, the transelliptical graphical model gains a lot on modeling flexibility, but at the price of inferring a weaker notion of graphs: a missing edge in the graph only represents the conditional uncorrelatedness of the latent variables. [sent-248, score-1.63]
74 The next lemma shows that we do not lose any thing compared with the nonparanormal graphical model. [sent-249, score-0.446]
75 , fd ) be a member of the transelliptical graphical model P(G). [sent-257, score-1.09]
76 If X is also nonparanormal, the graph G encodes the conditional independence relationship of X (In other words, the distribution of X is Markov to G). [sent-258, score-0.238]
77 4 Rank-based Regularization Estimator In this section, we propose a nonparametric rank-based regularization estimator which achieves the optimal parametric rates of convergence for both graph recovery and parameter estimation. [sent-259, score-0.289]
78 The main idea of our procedure is to treat the marginal transformation functions fj and the generating variable ξ as nuisance parameters, and exploit the nonparametric Kendall’s tau statistic to directly estimate the latent generalized correlation matrix Σ. [sent-260, score-0.512]
79 The obtained correlation matrix estimate is then plugged into the CLIME procedure to estimate the sparse latent generalized concentration matrix Θ. [sent-261, score-0.335]
80 From the previous discussion, we know the graph G is encoded by the nonzero pattern of Θ. [sent-262, score-0.169]
81 We then get a graph estimator by thresholding the estimated Θ. [sent-263, score-0.188]
82 1 The Kendall’s tau Statistic and its Invariance Property Let x1 , . [sent-265, score-0.132]
83 Our task is to estimate the latent generalized concentration matrix Θ := Σ−1 . [sent-272, score-0.193]
84 The Kendall’s tau is defined as: 2 τjk = xi − xi , (8) sign xi − xi j j k k n(n − 1) 1≤i 0 is a tuning parameter. [sent-273, score-0.152]
85 Once Θ is obtained, we can apply an additional thresholding step to estimate the graph G. [sent-276, score-0.133]
86 For this, we define a graph estimator G = (V, E), in which an edge (j, k) ∈ E if Θjk ≥ γ. [sent-277, score-0.203]
87 Compared with the original CLIME, the extra cost of our rank-based procedure is the computation of S, which requires us to evaluate d(d − 1)/2 pairwise Kendal’s tau statistics. [sent-279, score-0.201]
88 A naive implementation of the Kendall’s tau requires O(n2 ) computation. [sent-280, score-0.132]
89 However, efficient algorithm based on sorting and balanced binary trees has been developed to calculate the Kendall’s tau statistic with a computational complexity O(n log n) [4]. [sent-281, score-0.167]
90 Unlike our work, they focus on the more restrictive nonparanromal family and discuss several rank-based procedures using the normal-score, Spearman’s rho, and Kendall’s tau. [sent-286, score-0.147]
91 Unlike our results, they advocate the use of the Spearman’s rho and normal-score correlation coefficients. [sent-287, score-0.183]
92 Their main concern is that, within the more restrictive nonparanormal family, the Spearman’s rho and normal-score correlations are slightly easier to compute and have smaller asymptotic variance. [sent-288, score-0.351]
93 In constrast to their results, the new insight obtained from this current paper is that we advocate the usage of the Kendall’s tau due to its invariance property within the much larger transelliptical family. [sent-289, score-0.913]
94 In fact, we can show that the Spearman’s rho is not invariant within the transelliptical family unless the true distribution is nonparanormal. [sent-290, score-0.871]
95 5 Asymptotic Properties We analyze the theoretical properties of the rank-based regularization estimator proposed in Section 4. [sent-292, score-0.073]
96 This result suggests that the transelliptical graphical model can be used as a safe replacement of the Gaussian graphical models, the nonparanormal graphical models, and the elliptical graphical models. [sent-295, score-1.704]
97 , fd ) with Σ ∈ R+ and Θ := Σ−1 ∈ Sd (q, S, M ) with d 0 ≤ q < 1. [sent-306, score-0.27]
98 (12) Let G be the graph estimator defined in Section 4. [sent-309, score-0.172]
99 If we further assume Θ ∈ Sd (0, s, M ) and minj,k:|Θjk |=0 |Θjk | ≥ 2γ, then (Graph recovery) P G = G ≥ 1 − o(1), (13) where G is the graph determined by the nonzero pattern of Θ. [sent-311, score-0.169]
100 The difference between the rank-based CLIME and the original CLIME is that we replace the Pearson correlation coefficient matrix R by the Kendall’s tau matrix S. [sent-313, score-0.278]
wordName wordTfidf (topN-words)
[('transelliptical', 0.705), ('fd', 0.27), ('nonparanormal', 0.269), ('elliptical', 0.25), ('jk', 0.231), ('ecd', 0.149), ('uncorrelatedness', 0.148), ('tau', 0.132), ('clime', 0.127), ('graph', 0.117), ('kendall', 0.116), ('graphical', 0.115), ('xd', 0.106), ('correlation', 0.088), ('family', 0.086), ('latent', 0.076), ('rho', 0.057), ('estimator', 0.055), ('spearman', 0.055), ('conditional', 0.054), ('generalized', 0.053), ('monotone', 0.043), ('zd', 0.042), ('extends', 0.041), ('recovery', 0.04), ('sd', 0.04), ('zj', 0.04), ('fj', 0.04), ('advocate', 0.038), ('lq', 0.037), ('sjk', 0.037), ('diag', 0.035), ('semiparametric', 0.035), ('statistic', 0.035), ('lemma', 0.035), ('concentration', 0.035), ('delivers', 0.034), ('nonzero', 0.034), ('uncorrelated', 0.034), ('han', 0.032), ('rd', 0.031), ('penalized', 0.031), ('edge', 0.031), ('matrix', 0.029), ('univariate', 0.028), ('nition', 0.028), ('au', 0.027), ('connects', 0.027), ('xj', 0.027), ('thing', 0.027), ('independence', 0.025), ('partial', 0.025), ('restrictive', 0.025), ('procedure', 0.025), ('ces', 0.024), ('parametric', 0.024), ('separates', 0.023), ('gaussian', 0.023), ('distribution', 0.023), ('cv', 0.023), ('pairwise', 0.022), ('sin', 0.022), ('extra', 0.022), ('named', 0.021), ('tuning', 0.02), ('replacement', 0.02), ('normal', 0.02), ('xk', 0.02), ('denoted', 0.02), ('base', 0.019), ('exibility', 0.019), ('relationship', 0.019), ('invariance', 0.019), ('constrast', 0.019), ('tic', 0.019), ('nonparanromal', 0.019), ('rates', 0.018), ('regularization', 0.018), ('pattern', 0.018), ('discussions', 0.018), ('characteristic', 0.018), ('distributions', 0.018), ('liu', 0.017), ('exploits', 0.017), ('exits', 0.017), ('piscataway', 0.017), ('traded', 0.017), ('fhan', 0.017), ('marginal', 0.017), ('nonparametric', 0.017), ('satisfying', 0.017), ('discuss', 0.017), ('moment', 0.017), ('graphs', 0.017), ('nj', 0.017), ('thresholding', 0.016), ('hanliu', 0.016), ('irrepresentable', 0.016), ('rjk', 0.016), ('fang', 0.016)]
simIndex simValue paperId paperTitle
same-paper 1 1.0 352 nips-2012-Transelliptical Graphical Models
Author: Han Liu, Fang Han, Cun-hui Zhang
Abstract: We advocate the use of a new distribution family—the transelliptical—for robust inference of high dimensional graphical models. The transelliptical family is an extension of the nonparanormal family proposed by Liu et al. (2009). Just as the nonparanormal extends the normal by transforming the variables using univariate functions, the transelliptical extends the elliptical family in the same way. We propose a nonparametric rank-based regularization estimator which achieves the parametric rates of convergence for both graph recovery and parameter estimation. Such a result suggests that the extra robustness and flexibility obtained by the semiparametric transelliptical modeling incurs almost no efficiency loss. We also discuss the relationship between this work with the transelliptical component analysis proposed by Han and Liu (2012). 1
2 0.57594806 351 nips-2012-Transelliptical Component Analysis
Author: Fang Han, Han Liu
Abstract: We propose a high dimensional semiparametric scale-invariant principle component analysis, named TCA, by utilize the natural connection between the elliptical distribution family and the principal component analysis. Elliptical distribution family includes many well-known multivariate distributions like multivariate Gaussian, t and logistic and it is extended to the meta-elliptical by Fang et.al (2002) using the copula techniques. In this paper we extend the meta-elliptical distribution family to a even larger family, called transelliptical. We prove that TCA can obtain a near-optimal s log d/n estimation consistency rate in recovering the leading eigenvector of the latent generalized correlation matrix under the transelliptical distribution family, even if the distributions are very heavy-tailed, have infinite second moments, do not have densities and possess arbitrarily continuous marginal distributions. A feature selection result with explicit rate is also provided. TCA is further implemented in both numerical simulations and largescale stock data to illustrate its empirical usefulness. Both theories and experiments confirm that TCA can achieve model flexibility, estimation accuracy and robustness at almost no cost. 1
3 0.25271073 317 nips-2012-Smooth-projected Neighborhood Pursuit for High-dimensional Nonparanormal Graph Estimation
Author: Tuo Zhao, Kathryn Roeder, Han Liu
Abstract: We introduce a new learning algorithm, named smooth-projected neighborhood pursuit, for estimating high dimensional undirected graphs. In particularly, we focus on the nonparanormal graphical model and provide theoretical guarantees for graph estimation consistency. In addition to new computational and theoretical analysis, we also provide an alternative view to analyze the tradeoff between computational efficiency and statistical error under a smoothing optimization framework. Numerical results on both synthetic and real datasets are provided to support our theory. 1
4 0.11131287 326 nips-2012-Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses
Author: Po-ling Loh, Martin J. Wainwright
Abstract: We investigate a curious relationship between the structure of a discrete graphical model and the support of the inverse of a generalized covariance matrix. We show that for certain graph structures, the support of the inverse covariance matrix of indicator variables on the vertices of a graph reflects the conditional independence structure of the graph. Our work extends results that have previously been established only in the context of multivariate Gaussian graphical models, thereby addressing an open question about the significance of the inverse covariance matrix of a non-Gaussian distribution. Based on our population-level results, we show how the graphical Lasso may be used to recover the edge structure of certain classes of discrete graphical models, and present simulations to verify our theoretical results. 1
5 0.086474366 172 nips-2012-Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs
Author: Anima Anandkumar, Ragupathyraj Valluvan
Abstract: Graphical model selection refers to the problem of estimating the unknown graph structure given observations at the nodes in the model. We consider a challenging instance of this problem when some of the nodes are latent or hidden. We characterize conditions for tractable graph estimation and develop efficient methods with provable guarantees. We consider the class of Ising models Markov on locally tree-like graphs, which are in the regime of correlation decay. We propose an efficient method for graph estimation, and establish its structural consistency −δη(η+1)−2 when the number of samples n scales as n = Ω(θmin log p), where θmin is the minimum edge potential, δ is the depth (i.e., distance from a hidden node to the nearest observed nodes), and η is a parameter which depends on the minimum and maximum node and edge potentials in the Ising model. The proposed method is practical to implement and provides flexibility to control the number of latent variables and the cycle lengths in the output graph. We also present necessary conditions for graph estimation by any method and show that our method nearly matches the lower bound on sample requirements. Keywords: Graphical model selection, latent variables, quartet methods, locally tree-like graphs. 1
6 0.085643426 147 nips-2012-Graphical Models via Generalized Linear Models
7 0.072578445 202 nips-2012-Locally Uniform Comparison Image Descriptor
8 0.068528354 335 nips-2012-The Bethe Partition Function of Log-supermodular Graphical Models
9 0.066626072 180 nips-2012-Learning Mixtures of Tree Graphical Models
10 0.055777732 189 nips-2012-Learning from the Wisdom of Crowds by Minimax Entropy
11 0.055612557 247 nips-2012-Nonparametric Reduced Rank Regression
12 0.055358615 354 nips-2012-Truly Nonparametric Online Variational Inference for Hierarchical Dirichlet Processes
13 0.053899523 211 nips-2012-Meta-Gaussian Information Bottleneck
14 0.053095549 248 nips-2012-Nonparanormal Belief Propagation (NPNBP)
15 0.05217829 310 nips-2012-Semiparametric Principal Component Analysis
16 0.049146023 206 nips-2012-Majorization for CRFs and Latent Likelihoods
17 0.047352538 286 nips-2012-Random Utility Theory for Social Choice
18 0.046946403 199 nips-2012-Link Prediction in Graphs with Autoregressive Features
19 0.046416115 287 nips-2012-Random function priors for exchangeable arrays with applications to graphs and relational data
20 0.044594251 186 nips-2012-Learning as MAP Inference in Discrete Graphical Models
topicId topicWeight
[(0, 0.125), (1, 0.068), (2, 0.077), (3, -0.077), (4, -0.062), (5, 0.03), (6, 0.04), (7, -0.094), (8, -0.074), (9, 0.051), (10, -0.021), (11, -0.022), (12, 0.045), (13, -0.017), (14, -0.04), (15, -0.051), (16, 0.438), (17, -0.005), (18, -0.089), (19, 0.12), (20, 0.004), (21, -0.288), (22, -0.04), (23, 0.062), (24, -0.005), (25, -0.256), (26, -0.375), (27, -0.06), (28, 0.008), (29, 0.057), (30, 0.032), (31, 0.002), (32, -0.058), (33, -0.04), (34, -0.187), (35, -0.003), (36, -0.039), (37, -0.058), (38, -0.047), (39, 0.01), (40, -0.02), (41, 0.075), (42, 0.036), (43, 0.033), (44, -0.078), (45, -0.067), (46, -0.005), (47, -0.02), (48, -0.02), (49, 0.056)]
simIndex simValue paperId paperTitle
same-paper 1 0.93247455 352 nips-2012-Transelliptical Graphical Models
Author: Han Liu, Fang Han, Cun-hui Zhang
Abstract: We advocate the use of a new distribution family—the transelliptical—for robust inference of high dimensional graphical models. The transelliptical family is an extension of the nonparanormal family proposed by Liu et al. (2009). Just as the nonparanormal extends the normal by transforming the variables using univariate functions, the transelliptical extends the elliptical family in the same way. We propose a nonparametric rank-based regularization estimator which achieves the parametric rates of convergence for both graph recovery and parameter estimation. Such a result suggests that the extra robustness and flexibility obtained by the semiparametric transelliptical modeling incurs almost no efficiency loss. We also discuss the relationship between this work with the transelliptical component analysis proposed by Han and Liu (2012). 1
2 0.90977997 351 nips-2012-Transelliptical Component Analysis
Author: Fang Han, Han Liu
Abstract: We propose a high dimensional semiparametric scale-invariant principle component analysis, named TCA, by utilize the natural connection between the elliptical distribution family and the principal component analysis. Elliptical distribution family includes many well-known multivariate distributions like multivariate Gaussian, t and logistic and it is extended to the meta-elliptical by Fang et.al (2002) using the copula techniques. In this paper we extend the meta-elliptical distribution family to a even larger family, called transelliptical. We prove that TCA can obtain a near-optimal s log d/n estimation consistency rate in recovering the leading eigenvector of the latent generalized correlation matrix under the transelliptical distribution family, even if the distributions are very heavy-tailed, have infinite second moments, do not have densities and possess arbitrarily continuous marginal distributions. A feature selection result with explicit rate is also provided. TCA is further implemented in both numerical simulations and largescale stock data to illustrate its empirical usefulness. Both theories and experiments confirm that TCA can achieve model flexibility, estimation accuracy and robustness at almost no cost. 1
3 0.66871548 317 nips-2012-Smooth-projected Neighborhood Pursuit for High-dimensional Nonparanormal Graph Estimation
Author: Tuo Zhao, Kathryn Roeder, Han Liu
Abstract: We introduce a new learning algorithm, named smooth-projected neighborhood pursuit, for estimating high dimensional undirected graphs. In particularly, we focus on the nonparanormal graphical model and provide theoretical guarantees for graph estimation consistency. In addition to new computational and theoretical analysis, we also provide an alternative view to analyze the tradeoff between computational efficiency and statistical error under a smoothing optimization framework. Numerical results on both synthetic and real datasets are provided to support our theory. 1
4 0.57025218 211 nips-2012-Meta-Gaussian Information Bottleneck
Author: Melanie Rey, Volker Roth
Abstract: We present a reformulation of the information bottleneck (IB) problem in terms of copula, using the equivalence between mutual information and negative copula entropy. Focusing on the Gaussian copula we extend the analytical IB solution available for the multivariate Gaussian case to distributions with a Gaussian dependence structure but arbitrary marginal densities, also called meta-Gaussian distributions. This opens new possibles applications of IB to continuous data and provides a solution more robust to outliers. 1
5 0.34514648 248 nips-2012-Nonparanormal Belief Propagation (NPNBP)
Author: Gal Elidan, Cobi Cario
Abstract: The empirical success of the belief propagation approximate inference algorithm has inspired numerous theoretical and algorithmic advances. Yet, for continuous non-Gaussian domains performing belief propagation remains a challenging task: recent innovations such as nonparametric or kernel belief propagation, while useful, come with a substantial computational cost and offer little theoretical guarantees, even for tree structured models. In this work we present Nonparanormal BP for performing efficient inference on distributions parameterized by a Gaussian copulas network and any univariate marginals. For tree structured networks, our approach is guaranteed to be exact for this powerful class of non-Gaussian models. Importantly, the method is as efficient as standard Gaussian BP, and its convergence properties do not depend on the complexity of the univariate marginals, even when a nonparametric representation is used. 1
6 0.33267236 309 nips-2012-Semi-supervised Eigenvectors for Locally-biased Learning
7 0.33004701 202 nips-2012-Locally Uniform Comparison Image Descriptor
8 0.29049069 326 nips-2012-Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses
9 0.28358412 128 nips-2012-Fast Resampling Weighted v-Statistics
10 0.27308932 287 nips-2012-Random function priors for exchangeable arrays with applications to graphs and relational data
11 0.26696229 189 nips-2012-Learning from the Wisdom of Crowds by Minimax Entropy
12 0.2659941 147 nips-2012-Graphical Models via Generalized Linear Models
13 0.2572498 172 nips-2012-Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs
14 0.24801108 123 nips-2012-Exponential Concentration for Mutual Information Estimation with Application to Forests
15 0.23880848 286 nips-2012-Random Utility Theory for Social Choice
16 0.23477158 132 nips-2012-Fiedler Random Fields: A Large-Scale Spectral Approach to Statistical Network Modeling
17 0.21323502 346 nips-2012-Topology Constraints in Graphical Models
18 0.20972545 335 nips-2012-The Bethe Partition Function of Log-supermodular Graphical Models
19 0.2094053 254 nips-2012-On the Sample Complexity of Robust PCA
20 0.20823376 192 nips-2012-Learning the Dependency Structure of Latent Factors
topicId topicWeight
[(0, 0.032), (21, 0.02), (22, 0.059), (24, 0.044), (38, 0.102), (39, 0.268), (42, 0.026), (55, 0.02), (74, 0.038), (76, 0.169), (80, 0.07), (92, 0.033)]
simIndex simValue paperId paperTitle
1 0.89065915 351 nips-2012-Transelliptical Component Analysis
Author: Fang Han, Han Liu
Abstract: We propose a high dimensional semiparametric scale-invariant principle component analysis, named TCA, by utilize the natural connection between the elliptical distribution family and the principal component analysis. Elliptical distribution family includes many well-known multivariate distributions like multivariate Gaussian, t and logistic and it is extended to the meta-elliptical by Fang et.al (2002) using the copula techniques. In this paper we extend the meta-elliptical distribution family to a even larger family, called transelliptical. We prove that TCA can obtain a near-optimal s log d/n estimation consistency rate in recovering the leading eigenvector of the latent generalized correlation matrix under the transelliptical distribution family, even if the distributions are very heavy-tailed, have infinite second moments, do not have densities and possess arbitrarily continuous marginal distributions. A feature selection result with explicit rate is also provided. TCA is further implemented in both numerical simulations and largescale stock data to illustrate its empirical usefulness. Both theories and experiments confirm that TCA can achieve model flexibility, estimation accuracy and robustness at almost no cost. 1
2 0.86193842 248 nips-2012-Nonparanormal Belief Propagation (NPNBP)
Author: Gal Elidan, Cobi Cario
Abstract: The empirical success of the belief propagation approximate inference algorithm has inspired numerous theoretical and algorithmic advances. Yet, for continuous non-Gaussian domains performing belief propagation remains a challenging task: recent innovations such as nonparametric or kernel belief propagation, while useful, come with a substantial computational cost and offer little theoretical guarantees, even for tree structured models. In this work we present Nonparanormal BP for performing efficient inference on distributions parameterized by a Gaussian copulas network and any univariate marginals. For tree structured networks, our approach is guaranteed to be exact for this powerful class of non-Gaussian models. Importantly, the method is as efficient as standard Gaussian BP, and its convergence properties do not depend on the complexity of the univariate marginals, even when a nonparametric representation is used. 1
3 0.85611641 106 nips-2012-Dynamical And-Or Graph Learning for Object Shape Modeling and Detection
Author: Xiaolong Wang, Liang Lin
Abstract: This paper studies a novel discriminative part-based model to represent and recognize object shapes with an “And-Or graph”. We define this model consisting of three layers: the leaf-nodes with collaborative edges for localizing local parts, the or-nodes specifying the switch of leaf-nodes, and the root-node encoding the global verification. A discriminative learning algorithm, extended from the CCCP [23], is proposed to train the model in a dynamical manner: the model structure (e.g., the configuration of the leaf-nodes associated with the or-nodes) is automatically determined with optimizing the multi-layer parameters during the iteration. The advantages of our method are two-fold. (i) The And-Or graph model enables us to handle well large intra-class variance and background clutters for object shape detection from images. (ii) The proposed learning algorithm is able to obtain the And-Or graph representation without requiring elaborate supervision and initialization. We validate the proposed method on several challenging databases (e.g., INRIA-Horse, ETHZ-Shape, and UIUC-People), and it outperforms the state-of-the-arts approaches. 1
4 0.84141511 166 nips-2012-Joint Modeling of a Matrix with Associated Text via Latent Binary Features
Author: Xianxing Zhang, Lawrence Carin
Abstract: A new methodology is developed for joint analysis of a matrix and accompanying documents, with the documents associated with the matrix rows/columns. The documents are modeled with a focused topic model, inferring interpretable latent binary features for each document. A new matrix decomposition is developed, with latent binary features associated with the rows/columns, and with imposition of a low-rank constraint. The matrix decomposition and topic model are coupled by sharing the latent binary feature vectors associated with each. The model is applied to roll-call data, with the associated documents defined by the legislation. Advantages of the proposed model are demonstrated for prediction of votes on a new piece of legislation, based only on the observed text of legislation. The coupling of the text and legislation is also shown to yield insight into the properties of the matrix decomposition for roll-call data. 1
5 0.82651424 249 nips-2012-Nyström Method vs Random Fourier Features: A Theoretical and Empirical Comparison
Author: Tianbao Yang, Yu-feng Li, Mehrdad Mahdavi, Rong Jin, Zhi-Hua Zhou
Abstract: Both random Fourier features and the Nystr¨ m method have been successfully o applied to efficient kernel learning. In this work, we investigate the fundamental difference between these two approaches, and how the difference could affect their generalization performances. Unlike approaches based on random Fourier features where the basis functions (i.e., cosine and sine functions) are sampled from a distribution independent from the training data, basis functions used by the Nystr¨ m method are randomly sampled from the training examples and are o therefore data dependent. By exploring this difference, we show that when there is a large gap in the eigen-spectrum of the kernel matrix, approaches based on the Nystr¨ m method can yield impressively better generalization error bound than o random Fourier features based approach. We empirically verify our theoretical findings on a wide range of large data sets. 1
6 0.81753367 323 nips-2012-Statistical Consistency of Ranking Methods in A Rank-Differentiable Probability Space
same-paper 7 0.80377585 352 nips-2012-Transelliptical Graphical Models
8 0.74110544 310 nips-2012-Semiparametric Principal Component Analysis
9 0.67953616 317 nips-2012-Smooth-projected Neighborhood Pursuit for High-dimensional Nonparanormal Graph Estimation
10 0.66954273 47 nips-2012-Augment-and-Conquer Negative Binomial Processes
11 0.66195762 315 nips-2012-Slice sampling normalized kernel-weighted completely random measure mixture models
12 0.65958548 163 nips-2012-Isotropic Hashing
13 0.65018207 363 nips-2012-Wavelet based multi-scale shape features on arbitrary surfaces for cortical thickness discrimination
14 0.64906108 354 nips-2012-Truly Nonparametric Online Variational Inference for Hierarchical Dirichlet Processes
15 0.64812648 74 nips-2012-Collaborative Gaussian Processes for Preference Learning
16 0.64477742 75 nips-2012-Collaborative Ranking With 17 Parameters
17 0.6423499 154 nips-2012-How They Vote: Issue-Adjusted Models of Legislative Behavior
18 0.64056271 355 nips-2012-Truncation-free Online Variational Inference for Bayesian Nonparametric Models
19 0.63972896 144 nips-2012-Gradient-based kernel method for feature extraction and variable selection
20 0.63654315 346 nips-2012-Topology Constraints in Graphical Models