
138 nips-2009-Learning with Compressible Priors


Source: pdf

Author: Volkan Cevher

Abstract: We describe a set of probability distributions, dubbed compressible priors, whose independent and identically distributed (iid) realizations result in p-compressible signals. A signal x ∈ R^N is called p-compressible with magnitude R if its sorted coefficients exhibit a power-law decay as |x|_(i) ≤ R · i^(−d), where the decay rate d is equal to 1/p. p-compressible signals live close to K-sparse signals (K ≪ N) in the ℓ_r-norm (r > p) since their best K-sparse approximation error decreases as O(R · K^(1/r−1/p)). We show that the membership of the generalized Pareto, Student's t, log-normal, Fréchet, and log-logistic distributions in the set of compressible priors depends only on the distribution parameters and is independent of N. In contrast, we demonstrate that the membership of the generalized Gaussian distribution (GGD) depends both on the signal dimension and the GGD parameters: the expected decay rate of N-sample iid realizations from the GGD with shape parameter q is given by 1/[q log(N/q)]. As stylized examples, we show via experiments that the wavelet coefficients of natural images are 1.67-compressible whereas their pixel gradients are 0.95 log(N/0.95)-compressible, on average. We also leverage the connections between compressible priors and sparse signals to develop new iterative re-weighted sparse signal recovery algorithms that outperform standard ℓ_1-norm minimization. Finally, we describe how to learn the hyperparameters of compressible priors in underdetermined regression problems by exploiting the geometry of their order statistics during signal recovery.
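For illustration, the following is a minimal sketch (not from the paper; it assumes NumPy, and the Student's t degrees of freedom, the fitting window, and the choice of r are illustrative) that draws iid samples from a heavy-tailed prior, estimates the decay rate d of the sorted magnitudes by a log-log fit, and compares the best K-term approximation error in the ℓ_r norm against the K^(1/r−1/p) rate quoted in the abstract.

# Minimal sketch (assumptions: NumPy is available; the Student's t degrees of
# freedom, the fitting window, and r below are illustrative, not the paper's).
import numpy as np

rng = np.random.default_rng(0)
N = 2 ** 16
x = rng.standard_t(df=1.5, size=N)        # heavy-tailed iid realization

# Sorted magnitudes |x|_(1) >= |x|_(2) >= ... >= |x|_(N).
mags = np.sort(np.abs(x))[::-1]

# Estimate the decay rate d from a log-log least-squares fit over the largest
# coefficients, where the power-law tail dominates (the very top few order
# statistics are noisy, so they are skipped).
idx = np.arange(1, N + 1)
fit_lo, fit_hi = 10, N // 10
slope, _ = np.polyfit(np.log(idx[fit_lo:fit_hi]), np.log(mags[fit_lo:fit_hi]), 1)
d_hat = -slope
p_hat = 1.0 / d_hat
print(f"estimated decay rate d ~ {d_hat:.2f}  (p ~ {p_hat:.2f})")

# Best K-term approximation error in the l_r norm (r > p): drop the K largest
# magnitudes and measure the l_r norm of the remainder, which should shrink
# roughly like K^(1/r - 1/p).
r = 2
for K in (100, 1000, 10000):
    err = np.sum(mags[K:] ** r) ** (1.0 / r)
    print(f"K={K:6d}  tail l_{r} error = {err:8.2f}  K^(1/r-1/p) = {K ** (1.0/r - 1.0/p_hat):.4f}")

With these (assumed) settings, the fitted decay rate should land roughly near 1/df ≈ 0.67, i.e., p ≈ 1.5, and the tail error should shrink in rough proportion to K^(1/2 − 1/p).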


reference text

[1] E. J. Candès. Compressive sampling. In Proc. International Congress of Mathematicians, volume 3, pages 1433–1452, Madrid, Spain, 2006.

[2] M. E. Tipping. Sparse Bayesian learning and the relevance vector machine. The Journal of Machine Learning Research, 1:211–244, 2001.

[3] D. P. Wipf and B. D. Rao. Sparse Bayesian learning for basis selection. IEEE Transactions on Signal Processing, 52(8):2153–2164, 2004.

[4] T. Blumensath and M. E. Davies. Sampling theorems for signals from the union of linear subspaces. IEEE Trans. Info. Theory, 2009.

[5] A. Cohen, W. Dahmen, and R. DeVore. Compressed sensing and best k-term approximation. Journal of the American Mathematical Society, 22(1):211–231, 2009.

[6] I. F. Gorodnitsky, J. S. George, and B. D. Rao. Neuromagnetic source imaging with FOCUSS: a recursive weighted minimum norm algorithm. Electroenceph. and Clin. Neurophys., 95(4):231–251, 1995.

[7] E. J. Candès, M. B. Wakin, and S. P. Boyd. Enhancing sparsity by reweighted ℓ1 minimization. Journal of Fourier Analysis and Applications, 14(5):877–905, 2008.

[8] D. P. Wipf and S. Nagarajan. Iterative reweighted ℓ1 and ℓ2 methods for finding sparse solutions. In SPARS09, Rennes, France, 2009.

[9] S. S. Chen, D. L. Donoho, and M. A. Saunders. Atomic decomposition by basis pursuit. SIAM Review, pages 129–159, 2001.

[10] H. A. David and H. N. Nagaraja. Order Statistics. Wiley-Interscience, 2004.

[11] S. Mallat. A Wavelet Tour of Signal Processing. Academic Press, 1999.

[12] H. Choi and R. G. Baraniuk. Wavelet statistical models and Besov spaces. Lecture Notes in Statistics, pages 9–30, 2003.

[13] T. K. Nayak. Multivariate Lomax distribution: properties and usefulness in reliability theory. Journal of Applied Probability, pages 170–177, 1987.

[14] V. Cevher. Compressible priors. IEEE Trans. on Information Theory, in preparation, 2010.

[15] R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58(1):267–288, 1996.

[16] E. T. Hale, W. Yin, and Y. Zhang. Fixed-point continuation for ℓ1-minimization: Methodology and convergence. SIAM Journal on Optimization, 19:1107, 2008.

[17] P. J. Garrigues. Sparse Coding Models of Natural Images: Algorithms for Efficient Inference and Learning of Higher-Order Structure. PhD thesis, EECS Department, University of California, Berkeley, May 2009.

[18] M. J. Wainwright and E. P. Simoncelli. Scale mixtures of Gaussians and the statistics of natural images. In NIPS, 2000.

[19] D. Wipf and S. Nagarajan. A new view of automatic relevance determination. In NIPS, volume 20, 2008.

[20] I. Daubechies, R. DeVore, M. Fornasier, and S. Gunturk. Iteratively re-weighted least squares minimization for sparse recovery. Commun. Pure Appl. Math, 2009.

[21] R. Chartrand and W. Yin. Iteratively reweighted algorithms for compressive sensing. In ICASSP, pages 3869–3872, 2008.

[22] M. A. Fischler and R. C. Bolles. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6):381–395, 1981.

[23] E. J. Candès and D. L. Donoho. Curvelets and curvilinear integrals. Journal of Approximation Theory, 113(1):59–90, 2001.

[24] E. van den Berg and M. P. Friedlander. Probing the Pareto frontier for basis pursuit solutions. SIAM Journal on Scientific Computing, 31(2):890–912, 2008.