nips nips2012 nips2012-351 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Fang Han, Han Liu
Abstract: We propose a high dimensional semiparametric scale-invariant principal component analysis, named TCA, by utilizing the natural connection between the elliptical distribution family and principal component analysis. The elliptical distribution family includes many well-known multivariate distributions, like the multivariate Gaussian, t and logistic, and it was extended to the meta-elliptical family by Fang et al. (2002) using copula techniques. In this paper we extend the meta-elliptical distribution family to an even larger family, called the transelliptical. We prove that TCA can obtain a near-optimal s·√(log d/n) estimation consistency rate in recovering the leading eigenvector of the latent generalized correlation matrix under the transelliptical distribution family, even if the distributions are very heavy-tailed, have infinite second moments, do not have densities and possess arbitrarily continuous marginal distributions. A feature selection result with an explicit rate is also provided. TCA is further implemented in both numerical simulations and large-scale stock data to illustrate its empirical usefulness. Both theories and experiments confirm that TCA can achieve model flexibility, estimation accuracy and robustness at almost no cost. 1
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract: We propose a high dimensional semiparametric scale-invariant principal component analysis, named TCA, by utilizing the natural connection between the elliptical distribution family and principal component analysis. [sent-3, score-0.715]
2 The elliptical distribution family includes many well-known multivariate distributions, like the multivariate Gaussian, t and logistic, and it was extended to the meta-elliptical family by Fang et al. [sent-4, score-0.265]
3 In this paper we extend the meta-elliptical distribution family to an even larger family, called the transelliptical. [sent-6, score-0.065]
4 Given n i.i.d. realizations of a random vector X ∈ Rd with population covariance matrix Σ and correlation matrix Σ0, Principal Component Analysis (PCA) aims at recovering the top m leading eigenvectors u1, … [sent-16, score-0.466]
5 In practice, Σ is unknown, and the top m leading eigenvectors û1, … [sent-20, score-0.172]
6 …, ûm of the Pearson sample covariance matrix are obtained as the estimators. [sent-23, score-0.069]
7 However, because PCA is well known to be scale-variant, meaning that changing the measurement scale of the variables changes the estimators, PCA conducted on the sample correlation matrix is also common in the literature [2]. [sent-24, score-0.247]
8 It aims at recovering the top m leading eigenvectors θ1, … [sent-25, score-0.207]
9 …, θm of Σ0 using the top m leading eigenvectors θ̂1, … [sent-28, score-0.172]
10 Because Σ0 is scale-invariant, we call the PCA aiming at recovering the eigenvectors of Σ0 the scale-invariant PCA. [sent-32, score-0.146]
11 In high dimensional settings, when d scales with n, it has been discussed in [14] that û1 and θ̂1 are generally not consistent estimators of u1 and θ1. [sent-33, score-0.065]
12 The elliptical distributions are of special interest in Principal Component Analysis. [sent-45, score-0.447]
13 The study of elliptical distributions and their extensions has recently been advanced in statistics by [4]. [sent-46, score-0.447]
14 The elliptical distributions can be characterized by their stochastic representations [5]. [sent-47, score-0.447]
15 The elliptical distribution family includes a variety of well-known multivariate distributions: the multivariate Gaussian, multivariate Cauchy, Student's t, logistic, Kotz, and symmetric Pearson type-II and type-VII distributions. [sent-54, score-0.275]
16 [4] introduce the term meta-elliptical distribution in extending the continuous elliptical distributions whose densities exist to a wider class of distributions that still possess densities. [sent-56, score-0.65]
17 The construction of the meta-elliptical distributions is based on the copula technique, which was initially introduced by [25]. [sent-57, score-0.128]
18 In particular, when the latent elliptical distribution is the multivariate Gaussian, we have the meta-Gaussian or the nonparanormal distributions introduced by [16] and [19]. [sent-58, score-0.604]
19 The elliptical distribution is of special interest in Principal Component Analysis (PCA). [sent-59, score-0.412]
20 It has been shown in a variety of works [27, 11, 22, 12, 24] that PCA conducted on elliptical distributions shares a number of good properties enjoyed by PCA conducted on the Gaussian distribution. [sent-60, score-0.479]
21 In particular, [11] show that with regard to a range of hypotheses relevant to PCA, tests based on a multivariate Gaussian assumption have identical power for all elliptical distributions, even those without second moments. [sent-61, score-0.517]
22 In this paper, a new high dimensional scale-invariant principal component analysis approach is proposed, named Transelliptical Component Analysis (TCA). [sent-63, score-0.099]
23 First, to achieve both estimation accuracy and model flexibility, we build the model of TCA on the transelliptical distributions. [sent-64, score-0.307]
24 , Xd)T is said to follow a transelliptical distribution if there exists a set of univariate strictly monotone functions f = {fj}_{j=1}^d such that f(X) := (f1(X1), … [sent-68, score-0.373]
25 …, fd(Xd))T follows a continuous elliptical distribution with parameters µ = 0 and Σ0 = [Σ0jk] ⪰ 0. [sent-71, score-0.679]
26 Transelliptical distributions do not necessarily possess densities and are strict extensions of the meta-elliptical distributions defined in [4]. [sent-73, score-0.228]
27 TCA aims at recovering the top m leading eigenvectors θ1, … [sent-74, score-0.207]
28 We prove that no matter how the generating variable ξ changes and even when the marginal distributions are arbitrary continuous distributions, Kendall's tau correlation matrix approximates Σ0 at the parametric rate OP(√(log d/n)). [sent-79, score-0.495]
29 This key observation makes Kendall's tau a better estimator than the Pearson sample correlation matrix over a much larger distribution family than the Gaussian. [sent-80, score-0.451]
30 Thirdly, in terms of methodology and theory, we analyze the general case in which X follows a transelliptical distribution and θ1 is sparse. [sent-81, score-0.332]
31 We obtain the TCA estimator θ1∗ of θ1 utilizing the Kendall's tau correlation matrix. [sent-83, score-0.341]
32 We prove that TCA can obtain a fast convergence rate in terms of parameter estimation, namely sin ∠(θ1, θ1∗) = OP(s·√(log d/n)), where θ1∗ is the estimator TCA obtains. [sent-84, score-0.131]
33 Let MI· and M·J be the submatrix of M with rows in I and all columns, and the submatrix of M with columns in J and all rows. [sent-91, score-0.06]
34 Elliptical and Transelliptical Distributions. This section is devoted to a brief discussion of elliptical and transelliptical distributions. [sent-104, score-0.721]
35 A random vector X = (X1, …, Xd)T is said to be continuous if its marginal distribution functions are all continuous. [sent-108, score-0.147]
36 1 Elliptical Distributions. In this section we first provide a definition of the elliptical distributions, following [5]. [sent-111, score-0.447]
37 A random variable in R with a continuous marginal distribution function does not necessarily possess a density. [sent-119, score-0.179]
38 A well-known example is the Cantor distribution, whose support is the Cantor set. [sent-120, score-0.072]
39 We define Σ0 = [Σ0jk] with Σ0jk = Σjk/√(Σjj·Σkk) to be the generalized correlation matrix of Z. [sent-150, score-0.186]
40 Σ0 is the correlation matrix of Z when Z's second moment exists, and it still reflects the rank dependency even when Z has an infinite second moment [13]. [sent-151, score-0.41]
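In code, the generalized correlation matrix is just the usual rescaling of Σ to unit diagonal. A minimal sketch in Python (numpy assumed; the function name is ours):

```python
import numpy as np

def generalized_correlation(Sigma):
    """Sigma0_jk = Sigma_jk / sqrt(Sigma_jj * Sigma_kk)."""
    d = np.sqrt(np.diag(Sigma))
    return Sigma / np.outer(d, d)
```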
41 2 Transelliptical Distributions. To extend the elliptical distribution, we first define two sets of symmetric matrices: R_d^+ = {Σ ∈ Rd×d : ΣT = Σ, diag(Σ) = 1, Σ ≻ 0} and R_d = {Σ ∈ Rd×d : ΣT = Σ, diag(Σ) = 1, Σ ⪰ 0}. [sent-154, score-0.387]
42 A random vector X = (X1, …, Xd)T with continuous marginal distribution functions F1, … [sent-160, score-0.106]
43 …, Fd whose density exists is said to follow a meta-elliptical distribution if and only if there exists a continuous elliptically distributed random vector Z ∼ ECd(0, Σ0, g) with marginal distribution function Qg and Σ0 ∈ R_d^+, such that (Qg⁻¹(F1(X1)), …, Qg⁻¹(Fd(Xd)))T =d Z. [sent-163, score-0.309]
44 In this paper, we generalize the meta-elliptical distribution family to a broader class, named the transelliptical. [sent-167, score-0.092]
45 The transelliptical distributions do not assume that densities exist for either X or Z and are therefore strict extensions of the meta-elliptical distributions. [sent-168, score-0.402]
46 A random vector X = (X1, …, Xd)T is said to follow a transelliptical distribution if and only if there exists a set of strictly monotone functions f = {fj}_{j=1}^d and a latent continuous elliptically distributed random vector Z ∼ ECd(0, Σ0, ξ) with Σ0 ∈ R_d, such that (f1(X1), …, fd(Xd))T =d Z. [sent-174, score-0.584]
47 We then write X ∼ T Ed(Σ0, ξ; f1, …, fd) and call Σ0 the latent generalized correlation matrix. [sent-181, score-0.386]
48 If X follows a meta-elliptical distribution, in other words, X possesses a density and has continuous marginal distributions F1, … [sent-184, score-0.141]
49 In summary: transelliptical ⊃ meta-elliptical = meta-elliptical copula ⊃ elliptical* ⊃ elliptical copula, and transelliptical ⊃ meta-Gaussian = nonparanormal. [sent-199, score-1.069]
50 Here elliptical* represents the elliptical distributions that are continuous and possess densities. [sent-200, score-0.543]
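To make the construction above concrete, here is a minimal sampling sketch in Python (numpy assumed; the particular monotone transforms and the multivariate-t generating variable are illustrative choices, not taken from the paper):

```python
import numpy as np

def sample_transelliptical(n, Sigma0, df=3.0, seed=0):
    """Draw n samples X whose latent vector Z is multivariate t_df(0, Sigma0);
    each margin is then pushed through a strictly monotone map, which is all
    the transelliptical definition requires (maps here are illustrative)."""
    rng = np.random.default_rng(seed)
    d = Sigma0.shape[0]
    # Multivariate t via a Gaussian scale mixture: Z = G / sqrt(W / df).
    # (Cholesky requires Sigma0 to be strictly positive definite.)
    G = rng.standard_normal((n, d)) @ np.linalg.cholesky(Sigma0).T
    W = rng.chisquare(df, size=(n, 1))
    Z = G / np.sqrt(W / df)
    # Strictly monotone marginal distortions f_j^{-1}.
    X = np.empty_like(Z)
    X[:, 0::2] = np.sinh(Z[:, 0::2])
    X[:, 1::2] = np.sign(Z[:, 1::2]) * np.abs(Z[:, 1::2]) ** 3
    return X
```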
51 2 Latent Correlation Matrix Estimation for Transelliptical Distributions. We first study the correlation and covariance matrices of elliptical distributions. [sent-202, score-0.528]
52 Given Z ∼ ECd(µ, Σ, ξ) with rank(Σ) = q and finite second moments, and with Σ0 the generalized correlation matrix of Z, we have E(Z) = µ, Var(Z) = (E(ξ²)/q)·Σ, and Cor(Z) = Σ0. [sent-206, score-0.186]
53 When the random vector is elliptically distributed with finite second moment, the sample mean and correlation matrices are element-wise consistent estimators of µ and Σ0. [sent-207, score-0.307]
54 However, the elliptical distributions are generally very heavy-tailed (multivariate t or Cauchy distributions, for example), making the Pearson sample correlation matrix a poor estimator. [sent-208, score-0.693]
55 When the distribution family is extended to the transelliptical, the Pearson sample correlation matrix is generally no longer an element-wise consistent estimator of Σ0. [sent-209, score-0.289]
56 TCA is a two-stage method for estimating the leading eigenvectors of Σ0. [sent-213, score-0.172]
57 First, we estimate the Kendall's tau correlation matrix R. [sent-214, score-0.348]
58 1 Rank-based Measures of Associations. The main idea of TCA is to exploit the Kendall's tau statistic to estimate the generalized correlation matrix Σ0 efficiently and robustly. [sent-217, score-0.385]
59 Let X = (X1, …, Xd)T be a d-dimensional random vector with marginal distributions F1, … [sent-221, score-0.093]
60 The population Spearman's rho and Kendall's tau correlation coefficients are given by ρ(Xj, Xk) = Corr(Fj(Xj), Fk(Xk)) and τ(Xj, Xk) = P((Xj − X̃j)(Xk − X̃k) > 0) − P((Xj − X̃j)(Xk − X̃k) < 0), where (X̃j, X̃k) is an independent copy of (Xj, Xk). [sent-225, score-0.303]
61 In particular, for Kendall's tau, we have the following theorem, which states an explicit relationship between τjk and Σ0jk given X ∼ T Ed(Σ0, ξ; f1, …, [sent-226, score-0.112]
62 fd), no matter what the generating variable ξ is. [sent-229, score-0.245]
63 For X ∼ T Ed(Σ0, ξ; f1, …, fd) transelliptically distributed, we have Σ0jk = sin( (π/2)·τ(Xj, Xk) ). [sent-236, score-0.292]
64 Theorem 22 in [5] builds the result only for a one-sample statistic and cannot be generalized to statistics of multiple samples, like the Kendall's tau or Spearman's rho. [sent-243, score-0.236]
65 On the other hand, when X ∼ T Ed(Σ0, ξ; f1, …, [sent-248, score-0.112]
66 fd) with ξ =d 1, [10] proves that ρ(Xj, Xk) = 3·(arcsin Σ0jk)/π − 4·((arcsin Σ0jk)/π)³. [sent-251, score-0.551]
67 We consider the following rank-based statistic: τ̂jk = (2/(n(n−1))) · Σ_{i<i′} sign((xij − xi′j)(xik − xi′k)), if j ≠ k. [sent-259, score-0.112]
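A minimal sketch of the statistic above and of the resulting plug-in estimator R, with Rjk = sin(π·τ̂jk/2) as suggested by the theorem (Python with numpy; the function names are ours, and scipy.stats.kendalltau would give the same value in the absence of ties):

```python
import numpy as np

def kendall_tau_pair(x, y):
    """tau_hat = 2/(n(n-1)) * sum_{i<i'} sign((x_i - x_i')(y_i - y_i'))."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    dx = np.sign(x[:, None] - x[None, :])
    dy = np.sign(y[:, None] - y[None, :])
    # The full double sum counts each unordered pair twice and the diagonal
    # contributes zero, so dividing by n(n-1) matches the 2/(n(n-1)) factor.
    return float((dx * dy).sum()) / (n * (n - 1))

def kendall_correlation_matrix(X):
    """Entry-wise plug-in estimate R_jk = sin(pi * tau_hat_jk / 2) of Sigma0."""
    n, d = X.shape
    R = np.eye(d)
    for j in range(d):
        for k in range(j + 1, d):
            R[j, k] = R[k, j] = np.sin(0.5 * np.pi * kendall_tau_pair(X[:, j], X[:, k]))
    return R
```

Since the sine transform is applied entry-wise, R is not guaranteed to be positive semidefinite in finite samples; projecting it onto the positive semidefinite cone is a common practical remedy.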
68 θ1, …, θd ∈ Sd−1 are the corresponding eigenvectors of λ1, …, λd. [sent-264, score-0.111]
69 (3.4), where B0(s) := {v ∈ Rd : ‖v‖0 ≤ s} and R is the estimated Kendall's tau correlation matrix. [sent-269, score-0.074]
70 2 TCA Algorithm. Generally, we can plug the Kendall's tau correlation matrix R into any sparse PCA algorithm listed above. [sent-273, score-0.377]
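As one concrete instance, a minimal truncated power iteration in the spirit of TPower (a sketch under our own naming, not the authors' released code) that takes R and a sparsity level s:

```python
import numpy as np

def truncated_power_method(R, s, n_iter=200, seed=0):
    """Approximate the s-sparse leading eigenvector of R: power iterations
    with hard truncation to the s largest-magnitude entries."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(R.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        v = R @ v
        v[np.argsort(np.abs(v))[:-s]] = 0.0  # zero all but the top s entries
        v /= np.linalg.norm(v)
    return v
```

The TCA estimate of θ1 is then obtained by, e.g., truncated_power_method(kendall_correlation_matrix(X), s).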
71 We use the iterative deflation method to learn the first k leading eigenvectors instead of only the first one, following the discussions of [21, 15, 28, 29]. [sent-281, score-0.061]
72 In detail, deflating a matrix Γ ∈ Rd×d by a vector v ∈ Rd yields a new matrix Γ′ := (I − vvT)Γ(I − vvT). [sent-282, score-0.138]
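In code, the deflation step and the resulting loop over the first k eigenvectors look as follows (a sketch reusing truncated_power_method from above):

```python
import numpy as np

def deflate(Gamma, v):
    """Gamma' = (I - v v^T) Gamma (I - v v^T), projecting out direction v."""
    P = np.eye(len(v)) - np.outer(v, v)
    return P @ Gamma @ P

def tca_top_k(R, s, k):
    """First k sparse leading eigenvectors by iterative deflation."""
    vs = []
    for _ in range(k):
        v = truncated_power_method(R, s)
        vs.append(v)
        R = deflate(R, v)
    return np.column_stack(vs)
```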
73 1 Rank-based Correlation Matrix Estimation. This section is devoted to the concentration of the Kendall sample correlation matrix R around the Pearson correlation matrix Σ0. [sent-287, score-0.399]
74 For X ∼ T Ed(Σ0, ξ; f1, …, fd), letting R be the Kendall's tau correlation matrix, we have, with probability at least 1 − d−5/2, ‖R − Σ0‖max ≤ 3π·√(log d/n). [sent-297, score-0.522]
75 The theorem can be proved by realizing that τ̂jk is an unbiased estimator of τ(Xj, Xk) and is a U-statistic of size 2. [sent-301, score-0.064]
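An illustrative Monte Carlo check of the max-norm bound on Gaussian data (where Σ0 is the true correlation matrix and the marginal transforms are the identity), reusing kendall_correlation_matrix from the sketch above:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, rho = 500, 50, 0.5
Sigma0 = rho * np.ones((d, d)) + (1 - rho) * np.eye(d)  # equicorrelation
X = rng.multivariate_normal(np.zeros(d), Sigma0, size=n)
R = kendall_correlation_matrix(X)
print(np.abs(R - Sigma0).max(), "<=", 3 * np.pi * np.sqrt(np.log(d) / n))
```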
76 We assume that the model Md(Σ0, ξ, s; f) holds, and the next theorem provides an upper bound on the angle between the estimated leading eigenvector θ1∗ and the true leading eigenvector θ1. [sent-306, score-0.212]
77 For any two vectors v1 ∈ Sd−1 and v2 ∈ Sd−1, letting |sin ∠(v1, v2)| = √(1 − (v1T v2)²), we have P( |sin ∠(θ1∗, θ1)| ≤ (6π/(λ1 − λ2))·s·√(log d/n) ) ≥ 1 − d−5/2. [sent-311, score-0.074]
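The loss appearing in the theorem is straightforward to compute for unit vectors (a sketch):

```python
import numpy as np

def sin_angle(v1, v2):
    """|sin(angle(v1, v2))| = sqrt(1 - (v1^T v2)^2) for unit-norm v1, v2."""
    c = float(np.dot(v1, v2))
    return np.sqrt(max(0.0, 1.0 - c * c))
```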
78 When (n, d) goes to infinity, the two leading eigenvalues λ1 and λ2 will typically go to infinity and will at least stay away from zero. [sent-317, score-0.061]
79 1 Numerical Simulations. In the simulation study we randomly sample n data points from a certain transelliptical distribution T Ed(Σ0, ξ; f1, …, fd). [sent-333, score-0.332]
80 To determine the transelliptical distribution, we first derive Σ0 in the following way: a covariance matrix Σ is first synthesized through its eigenvalue decomposition, where the first two eigenvalues are given and the corresponding eigenvectors are pre-specified to be sparse. [sent-338, score-0.463]
81 … = ωd = 1, and the first two leading eigenvectors of Σ, u1 and u2, are sparse, with the first s = 10 entries of u1 and the next s = 10 entries of u2 being nonzero, i.e. [sent-342, score-0.201]
82 The generalized correlation matrix Σ0 is generated from Σ, with λ1 = 4, λ2 = 2. [sent-347, score-0.186]
83 …, λd ≤ 1, and the top two leading eigenvectors sparse: θ1j = −1/√10 for 1 ≤ j ≤ 10 and 0 otherwise, and θ2j = −1/√10 for 11 ≤ j ≤ 20 and 0 otherwise. [sent-351, score-0.172]
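A sketch of the synthesis scheme just described (Python with numpy; how the remaining d − 2 eigenvectors are filled in is left open by the text, so a random orthonormal complement is assumed here):

```python
import numpy as np

def synthesize_sigma0(d=100, s=10, lam1=4.0, lam2=2.0, seed=0):
    """Correlation matrix whose top two eigenvectors are s-sparse."""
    rng = np.random.default_rng(seed)
    u1 = np.zeros(d); u1[:s] = -1.0 / np.sqrt(s)
    u2 = np.zeros(d); u2[s:2 * s] = -1.0 / np.sqrt(s)
    # Complete u1, u2 to an orthonormal basis via QR of a random matrix.
    A = rng.standard_normal((d, d))
    A[:, 0], A[:, 1] = u1, u2
    Q, _ = np.linalg.qr(A)
    w = np.ones(d); w[0], w[1] = lam1, lam2
    Sigma = (Q * w) @ Q.T          # Q diag(w) Q^T
    # Rescale to unit diagonal to obtain the generalized correlation matrix.
    D = np.sqrt(np.diag(Sigma))
    return Sigma / np.outer(D, D)
```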
84 X follows a multivariate-t distribution with m degrees of freedom, mean 0 and covariance matrix Σ0 (Example 2. [sent-382, score-0.07]
85 {f1, …, fd} = {h1, h2, h3, h4, h5, h1, h2, h3, h4, h5, …}. [sent-393, score-0.219]
86 This is equivalent to saying that X is transelliptically distributed with the latent elliptical distribution Z ∼ M td(3, 0, Σ0). [sent-397, score-0.5]
87 (B) Proportions of successful matches of the market trend using only the stocks in Ak and Bk. [sent-524, score-0.151]
88 We collected the daily closing prices for J = 452 stocks that were consistently in the S&P 500 index from January 1, 2003 through January 1, 2008. [sent-532, score-0.117]
89 Let St = [St,j] denote the vector of closing prices on day t, with St,j the closing price of stock j. [sent-534, score-0.089]
90 We wish to evaluate the ability of using only k stocks to represent the trend of the whole stock market. [sent-535, score-0.175]
91 To this end, we run Kendall and Pearson on St and obtain the leading eigenvectors θKendall and θPearson using the tuning parameter k ∈ N. [sent-536, score-0.208]
92 We then let TtW, TtAk and TtBk denote the trend of the whole market, the Ak stocks and the Bk stocks on the t-th day compared with the (t−1)-th day, i.e. [sent-538, score-0.203]
93 In this way, we can calculate the proportions of successful matches of the market trend using the stocks in Ak and Bk as ρAk := (1/T)·Σt I(TtW = TtAk) and ρBk := (1/T)·Σt I(TtW = TtBk). [sent-540, score-0.151]
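In code, given a T × J matrix of daily stock returns and an index set of selected stocks, the matching proportion is (a sketch; defining the "trend" as the sign of the equal-weighted mean return is our assumption):

```python
import numpy as np

def trend_match_proportion(returns, idx):
    """Fraction of days on which the trend (sign of the mean return) of the
    stocks in idx agrees with the trend of the whole market."""
    whole = np.sign(returns.mean(axis=1))           # T_t^W
    subset = np.sign(returns[:, idx].mean(axis=1))  # e.g. T_t^{A_k}
    return float(np.mean(whole == subset))
```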
94 It can be observed from Figure 1 (B) that Kendall summarizes the trend of the whole stock market consistently better than Pearson. [sent-543, score-0.123]
95 Principal component analysis in sensory analysis: covariance or correlation matrix? [sent-557, score-0.177]
96 Meta-elliptical copulas and their use in frequency analysis of multivariate hydrological data. [sent-581, score-0.07]
97 TCA: Transelliptical principal component analysis for high dimensional non-Gaussian data. [sent-592, score-0.161]
98 Generalized power method for sparse principal component analysis. [sent-627, score-0.154]
99 Augmented sparse principal component analysis for high dimensional data. [sent-667, score-0.19]
100 Minimax rates of estimation for sparse PCA in high dimensions. [sent-686, score-0.121]
wordName wordTfidf (topN-words)
[('elliptical', 0.387), ('tca', 0.364), ('kendall', 0.35), ('transelliptical', 0.307), ('pearson', 0.239), ('fd', 0.219), ('ecd', 0.19), ('latpearson', 0.18), ('tau', 0.162), ('correlation', 0.141), ('jk', 0.112), ('eigenvectors', 0.111), ('elliptically', 0.111), ('rd', 0.096), ('pca', 0.092), ('principal', 0.089), ('bk', 0.086), ('xk', 0.085), ('stocks', 0.081), ('supp', 0.081), ('xd', 0.081), ('ak', 0.077), ('multivariate', 0.07), ('fang', 0.069), ('copula', 0.068), ('tpower', 0.063), ('leading', 0.061), ('distributions', 0.06), ('xj', 0.057), ('rstly', 0.055), ('arcsin', 0.054), ('stt', 0.054), ('ttak', 0.054), ('ttw', 0.054), ('stock', 0.053), ('fj', 0.05), ('arxiv', 0.048), ('continuous', 0.048), ('possess', 0.048), ('contamination', 0.048), ('md', 0.047), ('matrix', 0.045), ('said', 0.041), ('trend', 0.041), ('family', 0.04), ('utilize', 0.039), ('dt', 0.039), ('estimator', 0.038), ('sin', 0.037), ('statistic', 0.037), ('dimensional', 0.036), ('preprint', 0.036), ('closing', 0.036), ('zd', 0.036), ('nonparanormal', 0.036), ('component', 0.036), ('cantor', 0.036), ('earson', 0.036), ('kotz', 0.036), ('transelliptically', 0.036), ('ttbk', 0.036), ('au', 0.035), ('rq', 0.035), ('spearman', 0.035), ('densities', 0.035), ('recovering', 0.035), ('sd', 0.034), ('mij', 0.034), ('proposition', 0.033), ('eigenvector', 0.033), ('marginal', 0.033), ('card', 0.032), ('literatures', 0.032), ('contoured', 0.032), ('submatrix', 0.03), ('axis', 0.03), ('scheme', 0.029), ('ation', 0.029), ('fpr', 0.029), ('market', 0.029), ('sparse', 0.029), ('estimators', 0.029), ('realizations', 0.028), ('rate', 0.028), ('han', 0.027), ('op', 0.027), ('devoted', 0.027), ('named', 0.027), ('realizing', 0.026), ('aat', 0.026), ('latent', 0.026), ('distributed', 0.026), ('generating', 0.026), ('necessarily', 0.025), ('distribution', 0.025), ('secondly', 0.024), ('theorem', 0.024), ('cauchy', 0.024), ('um', 0.024), ('vv', 0.024)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999952 351 nips-2012-Transelliptical Component Analysis
Author: Fang Han, Han Liu
Abstract: We propose a high dimensional semiparametric scale-invariant principal component analysis, named TCA, by utilizing the natural connection between the elliptical distribution family and principal component analysis. The elliptical distribution family includes many well-known multivariate distributions, like the multivariate Gaussian, t and logistic, and it was extended to the meta-elliptical family by Fang et al. (2002) using copula techniques. In this paper we extend the meta-elliptical distribution family to an even larger family, called the transelliptical. We prove that TCA can obtain a near-optimal s·√(log d/n) estimation consistency rate in recovering the leading eigenvector of the latent generalized correlation matrix under the transelliptical distribution family, even if the distributions are very heavy-tailed, have infinite second moments, do not have densities and possess arbitrarily continuous marginal distributions. A feature selection result with an explicit rate is also provided. TCA is further implemented in both numerical simulations and large-scale stock data to illustrate its empirical usefulness. Both theories and experiments confirm that TCA can achieve model flexibility, estimation accuracy and robustness at almost no cost. 1
2 0.57594806 352 nips-2012-Transelliptical Graphical Models
Author: Han Liu, Fang Han, Cun-hui Zhang
Abstract: We advocate the use of a new distribution family—the transelliptical—for robust inference of high dimensional graphical models. The transelliptical family is an extension of the nonparanormal family proposed by Liu et al. (2009). Just as the nonparanormal extends the normal by transforming the variables using univariate functions, the transelliptical extends the elliptical family in the same way. We propose a nonparametric rank-based regularization estimator which achieves the parametric rates of convergence for both graph recovery and parameter estimation. Such a result suggests that the extra robustness and flexibility obtained by the semiparametric transelliptical modeling incurs almost no efficiency loss. We also discuss the relationship between this work with the transelliptical component analysis proposed by Han and Liu (2012). 1
3 0.14785698 202 nips-2012-Locally Uniform Comparison Image Descriptor
Author: Andrew Ziegler, Eric Christiansen, David Kriegman, Serge J. Belongie
Abstract: Keypoint matching between pairs of images using popular descriptors like SIFT or a faster variant called SURF is at the heart of many computer vision algorithms including recognition, mosaicing, and structure from motion. However, SIFT and SURF do not perform well for real-time or mobile applications. As an alternative very fast binary descriptors like BRIEF and related methods use pairwise comparisons of pixel intensities in an image patch. We present an analysis of BRIEF and related approaches revealing that they are hashing schemes on the ordinal correlation metric Kendall’s tau. Here, we introduce Locally Uniform Comparison Image Descriptor (LUCID), a simple description method based on linear time permutation distances between the ordering of RGB values of two image patches. LUCID is computable in linear time with respect to the number of pixels and does not require floating point computation. 1
4 0.13286375 317 nips-2012-Smooth-projected Neighborhood Pursuit for High-dimensional Nonparanormal Graph Estimation
Author: Tuo Zhao, Kathryn Roeder, Han Liu
Abstract: We introduce a new learning algorithm, named smooth-projected neighborhood pursuit, for estimating high dimensional undirected graphs. In particular, we focus on the nonparanormal graphical model and provide theoretical guarantees for graph estimation consistency. In addition to new computational and theoretical analysis, we also provide an alternative view to analyze the tradeoff between computational efficiency and statistical error under a smoothing optimization framework. Numerical results on both synthetic and real datasets are provided to support our theory. 1
5 0.10851222 211 nips-2012-Meta-Gaussian Information Bottleneck
Author: Melanie Rey, Volker Roth
Abstract: We present a reformulation of the information bottleneck (IB) problem in terms of copula, using the equivalence between mutual information and negative copula entropy. Focusing on the Gaussian copula, we extend the analytical IB solution available for the multivariate Gaussian case to distributions with a Gaussian dependence structure but arbitrary marginal densities, also called meta-Gaussian distributions. This opens new possible applications of IB to continuous data and provides a solution more robust to outliers. 1
6 0.087656438 120 nips-2012-Exact and Stable Recovery of Sequences of Signals with Sparse Increments via Differential 1-Minimization
7 0.080742002 300 nips-2012-Scalable nonconvex inexact proximal splitting
8 0.075194478 310 nips-2012-Semiparametric Principal Component Analysis
9 0.072996929 254 nips-2012-On the Sample Complexity of Robust PCA
10 0.072506994 309 nips-2012-Semi-supervised Eigenvectors for Locally-biased Learning
11 0.06560427 286 nips-2012-Random Utility Theory for Social Choice
12 0.064320676 247 nips-2012-Nonparametric Reduced Rank Regression
13 0.055776335 117 nips-2012-Ensemble weighted kernel estimators for multivariate entropy estimation
14 0.055287205 248 nips-2012-Nonparanormal Belief Propagation (NPNBP)
15 0.054008003 282 nips-2012-Proximal Newton-type methods for convex optimization
16 0.051316213 134 nips-2012-Finite Sample Convergence Rates of Zero-Order Stochastic Optimization Methods
17 0.050893668 326 nips-2012-Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses
18 0.050064042 82 nips-2012-Continuous Relaxations for Discrete Hamiltonian Monte Carlo
19 0.0497967 290 nips-2012-Recovery of Sparse Probability Measures via Convex Programming
20 0.049308769 27 nips-2012-A quasi-Newton proximal splitting method
topicId topicWeight
[(0, 0.147), (1, 0.064), (2, 0.087), (3, -0.072), (4, -0.019), (5, 0.063), (6, 0.061), (7, -0.058), (8, 0.003), (9, 0.026), (10, -0.038), (11, -0.03), (12, 0.034), (13, -0.052), (14, -0.062), (15, -0.046), (16, 0.422), (17, -0.022), (18, -0.068), (19, 0.094), (20, -0.02), (21, -0.27), (22, -0.029), (23, 0.078), (24, 0.003), (25, -0.256), (26, -0.383), (27, -0.058), (28, 0.026), (29, 0.075), (30, 0.043), (31, 0.003), (32, -0.033), (33, -0.041), (34, -0.199), (35, 0.013), (36, -0.054), (37, -0.067), (38, -0.046), (39, 0.012), (40, -0.016), (41, 0.068), (42, 0.047), (43, 0.008), (44, -0.072), (45, -0.058), (46, 0.027), (47, -0.065), (48, -0.028), (49, 0.041)]
simIndex simValue paperId paperTitle
same-paper 1 0.93948126 351 nips-2012-Transelliptical Component Analysis
Author: Fang Han, Han Liu
Abstract: We propose a high dimensional semiparametric scale-invariant principal component analysis, named TCA, by utilizing the natural connection between the elliptical distribution family and principal component analysis. The elliptical distribution family includes many well-known multivariate distributions, like the multivariate Gaussian, t and logistic, and it was extended to the meta-elliptical family by Fang et al. (2002) using copula techniques. In this paper we extend the meta-elliptical distribution family to an even larger family, called the transelliptical. We prove that TCA can obtain a near-optimal s·√(log d/n) estimation consistency rate in recovering the leading eigenvector of the latent generalized correlation matrix under the transelliptical distribution family, even if the distributions are very heavy-tailed, have infinite second moments, do not have densities and possess arbitrarily continuous marginal distributions. A feature selection result with an explicit rate is also provided. TCA is further implemented in both numerical simulations and large-scale stock data to illustrate its empirical usefulness. Both theories and experiments confirm that TCA can achieve model flexibility, estimation accuracy and robustness at almost no cost. 1
2 0.89940387 352 nips-2012-Transelliptical Graphical Models
Author: Han Liu, Fang Han, Cun-hui Zhang
Abstract: We advocate the use of a new distribution family—the transelliptical—for robust inference of high dimensional graphical models. The transelliptical family is an extension of the nonparanormal family proposed by Liu et al. (2009). Just as the nonparanormal extends the normal by transforming the variables using univariate functions, the transelliptical extends the elliptical family in the same way. We propose a nonparametric rank-based regularization estimator which achieves the parametric rates of convergence for both graph recovery and parameter estimation. Such a result suggests that the extra robustness and flexibility obtained by the semiparametric transelliptical modeling incurs almost no efficiency loss. We also discuss the relationship between this work with the transelliptical component analysis proposed by Han and Liu (2012). 1
3 0.63047409 317 nips-2012-Smooth-projected Neighborhood Pursuit for High-dimensional Nonparanormal Graph Estimation
Author: Tuo Zhao, Kathryn Roeder, Han Liu
Abstract: We introduce a new learning algorithm, named smooth-projected neighborhood pursuit, for estimating high dimensional undirected graphs. In particular, we focus on the nonparanormal graphical model and provide theoretical guarantees for graph estimation consistency. In addition to new computational and theoretical analysis, we also provide an alternative view to analyze the tradeoff between computational efficiency and statistical error under a smoothing optimization framework. Numerical results on both synthetic and real datasets are provided to support our theory. 1
4 0.58384496 211 nips-2012-Meta-Gaussian Information Bottleneck
Author: Melanie Rey, Volker Roth
Abstract: We present a reformulation of the information bottleneck (IB) problem in terms of copula, using the equivalence between mutual information and negative copula entropy. Focusing on the Gaussian copula, we extend the analytical IB solution available for the multivariate Gaussian case to distributions with a Gaussian dependence structure but arbitrary marginal densities, also called meta-Gaussian distributions. This opens new possible applications of IB to continuous data and provides a solution more robust to outliers. 1
5 0.37338316 202 nips-2012-Locally Uniform Comparison Image Descriptor
Author: Andrew Ziegler, Eric Christiansen, David Kriegman, Serge J. Belongie
Abstract: Keypoint matching between pairs of images using popular descriptors like SIFT or a faster variant called SURF is at the heart of many computer vision algorithms including recognition, mosaicing, and structure from motion. However, SIFT and SURF do not perform well for real-time or mobile applications. As an alternative very fast binary descriptors like BRIEF and related methods use pairwise comparisons of pixel intensities in an image patch. We present an analysis of BRIEF and related approaches revealing that they are hashing schemes on the ordinal correlation metric Kendall’s tau. Here, we introduce Locally Uniform Comparison Image Descriptor (LUCID), a simple description method based on linear time permutation distances between the ordering of RGB values of two image patches. LUCID is computable in linear time with respect to the number of pixels and does not require floating point computation. 1
6 0.33095184 309 nips-2012-Semi-supervised Eigenvectors for Locally-biased Learning
7 0.32839456 248 nips-2012-Nonparanormal Belief Propagation (NPNBP)
8 0.31016761 128 nips-2012-Fast Resampling Weighted v-Statistics
9 0.29055414 189 nips-2012-Learning from the Wisdom of Crowds by Minimax Entropy
10 0.27992353 286 nips-2012-Random Utility Theory for Social Choice
11 0.2743623 123 nips-2012-Exponential Concentration for Mutual Information Estimation with Application to Forests
12 0.25885257 254 nips-2012-On the Sample Complexity of Robust PCA
13 0.25080183 287 nips-2012-Random function priors for exchangeable arrays with applications to graphs and relational data
14 0.24703351 268 nips-2012-Perfect Dimensionality Recovery by Variational Bayesian PCA
15 0.24025005 85 nips-2012-Convergence and Energy Landscape for Cheeger Cut Clustering
16 0.23080395 37 nips-2012-Affine Independent Variational Inference
17 0.22584786 247 nips-2012-Nonparametric Reduced Rank Regression
18 0.22475463 144 nips-2012-Gradient-based kernel method for feature extraction and variable selection
19 0.21719912 95 nips-2012-Density-Difference Estimation
20 0.21694411 326 nips-2012-Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses
topicId topicWeight
[(0, 0.042), (14, 0.01), (21, 0.019), (24, 0.01), (38, 0.108), (39, 0.424), (42, 0.038), (54, 0.013), (55, 0.024), (74, 0.025), (76, 0.137), (80, 0.036), (92, 0.03)]
simIndex simValue paperId paperTitle
same-paper 1 0.78251058 351 nips-2012-Transelliptical Component Analysis
Author: Fang Han, Han Liu
Abstract: We propose a high dimensional semiparametric scale-invariant principal component analysis, named TCA, by utilizing the natural connection between the elliptical distribution family and principal component analysis. The elliptical distribution family includes many well-known multivariate distributions, like the multivariate Gaussian, t and logistic, and it was extended to the meta-elliptical family by Fang et al. (2002) using copula techniques. In this paper we extend the meta-elliptical distribution family to an even larger family, called the transelliptical. We prove that TCA can obtain a near-optimal s·√(log d/n) estimation consistency rate in recovering the leading eigenvector of the latent generalized correlation matrix under the transelliptical distribution family, even if the distributions are very heavy-tailed, have infinite second moments, do not have densities and possess arbitrarily continuous marginal distributions. A feature selection result with an explicit rate is also provided. TCA is further implemented in both numerical simulations and large-scale stock data to illustrate its empirical usefulness. Both theories and experiments confirm that TCA can achieve model flexibility, estimation accuracy and robustness at almost no cost. 1
2 0.73415458 248 nips-2012-Nonparanormal Belief Propagation (NPNBP)
Author: Gal Elidan, Cobi Cario
Abstract: The empirical success of the belief propagation approximate inference algorithm has inspired numerous theoretical and algorithmic advances. Yet, for continuous non-Gaussian domains performing belief propagation remains a challenging task: recent innovations such as nonparametric or kernel belief propagation, while useful, come with a substantial computational cost and offer few theoretical guarantees, even for tree structured models. In this work we present Nonparanormal BP for performing efficient inference on distributions parameterized by a Gaussian copula network and any univariate marginals. For tree structured networks, our approach is guaranteed to be exact for this powerful class of non-Gaussian models. Importantly, the method is as efficient as standard Gaussian BP, and its convergence properties do not depend on the complexity of the univariate marginals, even when a nonparametric representation is used. 1
3 0.73234147 106 nips-2012-Dynamical And-Or Graph Learning for Object Shape Modeling and Detection
Author: Xiaolong Wang, Liang Lin
Abstract: This paper studies a novel discriminative part-based model to represent and recognize object shapes with an “And-Or graph”. We define this model as consisting of three layers: the leaf-nodes with collaborative edges for localizing local parts, the or-nodes specifying the switch of leaf-nodes, and the root-node encoding the global verification. A discriminative learning algorithm, extended from the CCCP [23], is proposed to train the model in a dynamical manner: the model structure (e.g., the configuration of the leaf-nodes associated with the or-nodes) is automatically determined while optimizing the multi-layer parameters during the iteration. The advantages of our method are two-fold. (i) The And-Or graph model enables us to handle well large intra-class variance and background clutters for object shape detection from images. (ii) The proposed learning algorithm is able to obtain the And-Or graph representation without requiring elaborate supervision and initialization. We validate the proposed method on several challenging databases (e.g., INRIA-Horse, ETHZ-Shape, and UIUC-People), and it outperforms the state-of-the-art approaches. 1
4 0.71807188 166 nips-2012-Joint Modeling of a Matrix with Associated Text via Latent Binary Features
Author: Xianxing Zhang, Lawrence Carin
Abstract: A new methodology is developed for joint analysis of a matrix and accompanying documents, with the documents associated with the matrix rows/columns. The documents are modeled with a focused topic model, inferring interpretable latent binary features for each document. A new matrix decomposition is developed, with latent binary features associated with the rows/columns, and with imposition of a low-rank constraint. The matrix decomposition and topic model are coupled by sharing the latent binary feature vectors associated with each. The model is applied to roll-call data, with the associated documents defined by the legislation. Advantages of the proposed model are demonstrated for prediction of votes on a new piece of legislation, based only on the observed text of legislation. The coupling of the text and legislation is also shown to yield insight into the properties of the matrix decomposition for roll-call data. 1
5 0.66802883 249 nips-2012-Nyström Method vs Random Fourier Features: A Theoretical and Empirical Comparison
Author: Tianbao Yang, Yu-feng Li, Mehrdad Mahdavi, Rong Jin, Zhi-Hua Zhou
Abstract: Both random Fourier features and the Nyström method have been successfully applied to efficient kernel learning. In this work, we investigate the fundamental difference between these two approaches, and how the difference could affect their generalization performances. Unlike approaches based on random Fourier features, where the basis functions (i.e., cosine and sine functions) are sampled from a distribution independent of the training data, basis functions used by the Nyström method are randomly sampled from the training examples and are therefore data dependent. By exploring this difference, we show that when there is a large gap in the eigen-spectrum of the kernel matrix, approaches based on the Nyström method can yield an impressively better generalization error bound than the random Fourier features based approach. We empirically verify our theoretical findings on a wide range of large data sets. 1
6 0.65592241 323 nips-2012-Statistical Consistency of Ranking Methods in A Rank-Differentiable Probability Space
7 0.63255858 352 nips-2012-Transelliptical Graphical Models
8 0.548455 310 nips-2012-Semiparametric Principal Component Analysis
9 0.48087102 154 nips-2012-How They Vote: Issue-Adjusted Models of Legislative Behavior
10 0.47742248 47 nips-2012-Augment-and-Conquer Negative Binomial Processes
11 0.46539992 317 nips-2012-Smooth-projected Neighborhood Pursuit for High-dimensional Nonparanormal Graph Estimation
12 0.46343866 163 nips-2012-Isotropic Hashing
13 0.45317581 363 nips-2012-Wavelet based multi-scale shape features on arbitrary surfaces for cortical thickness discrimination
14 0.44927189 75 nips-2012-Collaborative Ranking With 17 Parameters
15 0.44908541 354 nips-2012-Truly Nonparametric Online Variational Inference for Hierarchical Dirichlet Processes
16 0.44810948 315 nips-2012-Slice sampling normalized kernel-weighted completely random measure mixture models
17 0.44773102 216 nips-2012-Mirror Descent Meets Fixed Share (and feels no regret)
18 0.4388653 309 nips-2012-Semi-supervised Eigenvectors for Locally-biased Learning
19 0.43653923 335 nips-2012-The Bethe Partition Function of Log-supermodular Graphical Models
20 0.43641829 74 nips-2012-Collaborative Gaussian Processes for Preference Learning