jmlr jmlr2013 jmlr2013-102 jmlr2013-102-reference knowledge-graph by maker-knowledge-mining

102 jmlr-2013-Sparse Matrix Inversion with Scaled Lasso


Source: pdf

Author: Tingni Sun, Cun-Hui Zhang

Abstract: We propose a new method of learning a sparse nonnegative-definite target matrix. Our primary example of the target matrix is the inverse of a population covariance or correlation matrix. The algorithm first estimates each column of the target matrix by the scaled Lasso and then adjusts the matrix estimator to be symmetric. The penalty level of the scaled Lasso for each column is completely determined by the data via convex minimization, without using cross-validation. We prove that this scaled Lasso method guarantees the fastest proven rate of convergence in the spectrum norm under conditions of a weaker form than those in the existing analyses of other ℓ1-regularized algorithms, and has a faster guaranteed rate of convergence when the ratio of the ℓ1 and spectrum norms of the target inverse matrix diverges to infinity. A simulation study demonstrates the computational feasibility and superb performance of the proposed method. Our analysis also provides new performance bounds for the Lasso and scaled Lasso that guarantee higher concentration of the error at a smaller threshold level than previous analyses, and allow the use of the union bound in column-by-column applications of the scaled Lasso without an adjustment of the penalty level. In addition, the least squares estimator after the scaled Lasso selection is considered and proven to satisfy performance bounds similar to those of the scaled Lasso.

Keywords: precision matrix, concentration matrix, inverse matrix, graphical model, scaled Lasso, linear regression, spectrum norm
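The abstract outlines a column-by-column procedure: regress each variable on the others with the scaled Lasso (which estimates the coefficient vector and the noise level jointly, so the penalty level needs no cross-validation), convert each regression into one column of the precision matrix, and then symmetrize. The sketch below is a minimal Python illustration of that idea, not the authors' implementation: it assumes scikit-learn's coordinate-descent Lasso for the inner step, the illustrative penalty level sqrt(2 log(p)/n), and a simple averaging symmetrization in place of the paper's exact adjustment; the function names `scaled_lasso` and `precision_via_scaled_lasso` are hypothetical.

```python
import numpy as np
from sklearn.linear_model import Lasso

def scaled_lasso(X, y, lam0, n_iter=20, tol=1e-6):
    """Alternating minimization of ||y - Xb||^2/(2*n*sigma) + sigma/2 + lam0*||b||_1.

    For fixed sigma this is an ordinary Lasso with penalty sigma*lam0; for fixed b
    the optimal sigma is ||y - Xb|| / sqrt(n). Iterating the two steps is the usual
    way the scaled Lasso is computed (a sketch, not the paper's code).
    """
    n = X.shape[0]
    sigma = max(np.std(y), 1e-10)   # crude initial noise level
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        fit = Lasso(alpha=sigma * lam0, fit_intercept=False).fit(X, y)
        beta = fit.coef_
        sigma_new = max(np.linalg.norm(y - X @ beta) / np.sqrt(n), 1e-10)
        if abs(sigma_new - sigma) < tol:
            sigma = sigma_new
            break
        sigma = sigma_new
    return beta, sigma

def precision_via_scaled_lasso(X, lam0=None):
    """Estimate the precision matrix column by column, then symmetrize."""
    n, p = X.shape
    if lam0 is None:
        lam0 = np.sqrt(2.0 * np.log(p) / n)   # illustrative universal penalty level
    Omega = np.zeros((p, p))
    for j in range(p):
        idx = [k for k in range(p) if k != j]
        beta, sigma = scaled_lasso(X[:, idx], X[:, j], lam0)
        # Standard neighborhood-regression identities:
        # Omega_jj = 1/sigma_j^2,  Omega_{-j,j} = -beta_j / sigma_j^2.
        Omega[j, j] = 1.0 / sigma**2
        Omega[idx, j] = -beta / sigma**2
    # Averaging with the transpose is used here as a stand-in for the
    # paper's symmetrization step.
    return (Omega + Omega.T) / 2.0
```

Because the noise level is estimated within each regression, the same data-driven penalty level lam0 can be reused for every column, which is the point the abstract makes about avoiding cross-validation.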


reference text

F. Abramovich and V. Grinshtein. MAP model selection in Gaussian regression. Electronic Journal of Statistics, 4:932–949, 2010.

A. Antoniadis. Comments on: ℓ1-penalization for mixture regression models. Test, 19(2):257–258, 2010.

O. Banerjee, L. El Ghaoui, and A. d'Aspremont. Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. Journal of Machine Learning Research, 9:485–516, 2008.

P. Bickel, Y. Ritov, and A. Tsybakov. Simultaneous analysis of Lasso and Dantzig selector. Annals of Statistics, 37(4):1705–1732, 2009.

L. Birgé and P. Massart. Gaussian model selection. Journal of the European Mathematical Society, 3:203–268, 2001.

L. Birgé and P. Massart. Minimal penalties for Gaussian model selection. Probability Theory and Related Fields, 138:33–73, 2007.

C. Borell. The Brunn-Minkowski inequality in Gaussian space. Inventiones Mathematicae, 30:207–216, 1975.

F. Bunea, A. Tsybakov, and M. H. Wegkamp. Aggregation for Gaussian regression. Annals of Statistics, 35:1674–1697, 2007.

T. Cai, L. Wang, and G. Xu. Shifting inequality and recovery of sparse signals. IEEE Transactions on Signal Processing, 58:1300–1308, 2010.

T. Cai, W. Liu, and X. Luo. A constrained ℓ1 minimization approach to sparse precision matrix estimation. Journal of the American Statistical Association, 106:594–607, 2011.

E. J. Candès and T. Tao. Decoding by linear programming. IEEE Transactions on Information Theory, 51:4203–4215, 2005.

E. J. Candès and T. Tao. The Dantzig selector: statistical estimation when p is much larger than n (with discussion). Annals of Statistics, 35:2313–2404, 2007.

D. L. Donoho and I. Johnstone. Minimax risk over ℓp-balls for ℓq-error. Probability Theory and Related Fields, 99:277–303, 1994.

J. Friedman, T. Hastie, and R. Tibshirani. Sparse inverse covariance estimation with the graphical Lasso. Biostatistics, 9:432–441, 2008.

P. J. Huber and E. M. Ronchetti. Robust Statistics, pages 172–175. Wiley, second edition, 2009.

V. Koltchinskii. The Dantzig selector and sparsity oracle inequalities. Bernoulli, 15:799–828, 2009.

C. Lam and J. Fan. Sparsistency and rates of convergence in large covariance matrices estimation. Annals of Statistics, 37:4254–4278, 2009.

N. Meinshausen and P. Bühlmann. High-dimensional graphs and variable selection with the Lasso. Annals of Statistics, 34:1436–1462, 2006.

S. Negahban, P. Ravikumar, M. J. Wainwright, and B. Yu. A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers. Statistical Science, 27:538–557, 2012.

G. Raskutti, M. J. Wainwright, and B. Yu. Minimax rates of estimation for high-dimensional linear regression over ℓq-balls. IEEE Transactions on Information Theory, 57:6976–6994, 2011.

P. Ravikumar, M. J. Wainwright, G. Raskutti, and B. Yu. Model selection in Gaussian graphical models: High-dimensional consistency of ℓ1-regularized MLE. In Advances in Neural Information Processing Systems (NIPS), 21, 2008.

G. Rocha, P. Zhao, and B. Yu. A path following algorithm for sparse pseudo-likelihood inverse covariance estimation (SPLICE). Technical report, University of California, Berkeley, 2008.

A. J. Rothman, P. J. Bickel, E. Levina, and J. Zhu. Sparse permutation invariant covariance estimation. Electronic Journal of Statistics, 2:494–515, 2008.

T. Sun and C.-H. Zhang. Scaled sparse linear regression. Biometrika, 99:879–898, 2012.

S. van de Geer. The deterministic Lasso. Technical Report 140, ETH Zurich, Switzerland, 2007.

S. van de Geer and P. Bühlmann. On the conditions used to prove oracle results for the Lasso. Electronic Journal of Statistics, 3:1360–1392, 2009.

S. Yang and E. D. Kolaczyk. Target detection via network filtering. IEEE Transactions on Information Theory, 56(5):2502–2515, 2010.

F. Ye and C.-H. Zhang. Rate minimaxity of the Lasso and Dantzig selector for the ℓq loss in ℓr balls. Journal of Machine Learning Research, 11:3481–3502, 2010.

M. Yuan. Sparse inverse covariance matrix estimation via linear programming. Journal of Machine Learning Research, 11:2261–2286, 2010.

M. Yuan and Y. Lin. Model selection and estimation in the Gaussian graphical model. Biometrika, 94(1):19–35, 2007.

C.-H. Zhang. Nearly unbiased variable selection under minimax concave penalty. Annals of Statistics, 38:894–942, 2010.

C.-H. Zhang and J. Huang. The sparsity and bias of the Lasso selection in high-dimensional linear regression. Annals of Statistics, 36(4):1567–1594, 2008.

C.-H. Zhang and T. Zhang. A general theory of concave regularization for high dimensional sparse estimation problems. Statistical Science, 27(4):576–593, 2012.

T. Zhang. Some sharp performance bounds for least squares regression with L1 regularization. Annals of Statistics, 37(5A):2109–2144, 2009.