nips nips2008 nips2008-99 nips2008-99-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Guillaume R. Obozinski, Martin J. Wainwright, Michael I. Jordan
Abstract: We study the behavior of block ℓ1/ℓ2 regularization for multivariate regression, where a K-dimensional response vector is regressed upon a fixed set of p covariates. The problem of support union recovery is to recover the subset of covariates that are active in at least one of the regression problems. Studying this problem under high-dimensional scaling (where the problem parameters as well as the sample size n tend to infinity simultaneously), our main result is to show that exact recovery is possible once the order parameter given by θ_{ℓ1/ℓ2}(n, p, s) := n/[2ψ(B*) log(p − s)] exceeds a critical threshold. Here n is the sample size, p is the ambient dimension of the regression model, s is the size of the union of supports, and ψ(B*) is a sparsity-overlap function that measures a combination of the sparsities and overlaps of the K regression coefficient vectors that constitute the model. This sparsity-overlap function reveals that block ℓ1/ℓ2 regularization for multivariate regression never harms performance relative to a naive ℓ1-approach, and can yield substantial improvements in sample complexity (up to a factor of K) when the regression vectors are suitably orthogonal relative to the design. We complement our theoretical results with simulations that demonstrate the sharpness of the result, even for relatively small problems.
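The two quantities in the abstract are simple to evaluate numerically. The following Python sketch (not from the paper; the problem sizes and the value of ψ(B*) are hypothetical stand-ins) illustrates the block ℓ1/ℓ2 penalty on a p × K coefficient matrix and the order parameter θ_{ℓ1/ℓ2}(n, p, s).

```python
import numpy as np

def block_l1_l2_penalty(B):
    """Block l1/l2 norm of a p x K coefficient matrix: the sum over
    covariates (rows) of the Euclidean norms of their K-dimensional
    coefficient vectors."""
    return np.sum(np.linalg.norm(B, axis=1))

def order_parameter(n, p, s, psi_B):
    """theta_{l1/l2}(n, p, s) = n / [2 * psi(B*) * log(p - s)],
    the rescaled sample size from the abstract; exact support union
    recovery is predicted once it exceeds a critical threshold."""
    return n / (2.0 * psi_B * np.log(p - s))

# Hypothetical problem sizes (illustration only).
n, p, s = 500, 1000, 10
psi_B = 2.0  # assumed value of the sparsity-overlap function psi(B*)
print(order_parameter(n, p, s, psi_B))
```

As the abstract notes, configurations with smaller ψ(B*) (e.g., regression vectors that are suitably orthogonal relative to the design) yield a larger order parameter at a fixed sample size, so fewer samples are needed to cross the recovery threshold.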
[1] F. Bach. Consistency of the group Lasso and multiple kernel learning. Technical report, INRIA Département d'Informatique, École Normale Supérieure, 2008.
[2] F. Bach, G. Lanckriet, and M. Jordan. Multiple kernel learning, conic duality, and the SMO algorithm. In Proc. Int. Conf. Machine Learning (ICML). Morgan Kaufmann, 2004.
[3] D. Donoho, M. Elad, and V. M. Temlyakov. Stable recovery of sparse overcomplete representations in the presence of noise. IEEE Trans. Info Theory, 52(1):6–18, January 2006.
[4] H. Liu and J. Zhang. On the ℓ1−ℓq regularized regression. Technical Report arXiv:0802.1517v1, Carnegie Mellon University, 2008.
[5] L. Meier, S. van de Geer, and P. Bühlmann. The group lasso for logistic regression. Technical report, Mathematics Department, Swiss Federal Institute of Technology Zürich, 2007.
[6] Y. Nardi and A. Rinaldo. On the asymptotic properties of the group lasso estimator for linear models. Electronic Journal of Statistics, 2:605–633, 2008.
[7] G. Obozinski, B. Taskar, and M. Jordan. Joint covariate selection and joint subspace selection for multiple classification problems. Statistics and Computing, 2009. To appear.
[8] M. Pontil and C. A. Micchelli. Learning the kernel function via regularization. Journal of Machine Learning Research, 6:1099–1125, 2005.
[9] P. Ravikumar, J. Lafferty, H. Liu, and L. Wasserman. SpAM: sparse additive models. In Neural Info. Proc. Systems (NIPS) 21, Vancouver, Canada, December 2007.
[10] J. A. Tropp. Just relax: Convex programming methods for identifying sparse signals in noise. IEEE Trans. Info Theory, 52(3):1030–1051, March 2006.
[11] M. J. Wainwright. Sharp thresholds for high-dimensional and noisy recovery of sparsity using ℓ1-constrained quadratic programs. Technical Report 709, Department of Statistics, UC Berkeley, 2006.
[12] M. Yuan and Y. Lin. Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society, Series B, 68(1):49–67, 2006.
[13] P. Zhao, G. Rocha, and B. Yu. Grouped and hierarchical model selection through composite absolute penalties. Technical report, Statistics Department, UC Berkeley, 2007.
[14] P. Zhao and B. Yu. Model selection with the Lasso. Journal of Machine Learning Research, 7:2541–2563, 2006.