
A quasi-Newton proximal splitting method


Source: pdf

Authors: Stephen Becker, Jalal Fadili

Abstract: A new result in convex analysis on the calculation of proximity operators in certain scaled norms is derived. We describe efficient implementations of the proximity calculation for a useful class of functions; the implementations exploit the piecewise-linear nature of the dual problem. The second part of the paper applies this result to accelerate convex minimization problems and leads to an elegant quasi-Newton method. The optimization method compares favorably against state-of-the-art alternatives. The algorithm has extensive applications in signal processing, sparse recovery, machine learning, and classification.
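As background for the abstract, the following is a minimal sketch of the objects involved; it is not taken from the paper itself, and the symbols V, f, and h are illustrative notation, not the paper's. The proximity operator of a convex function h, and its variant in a scaled norm defined by a symmetric positive-definite matrix V, are

\[
\operatorname{prox}_h(x) = \arg\min_{y}\; h(y) + \tfrac{1}{2}\|y - x\|_2^2,
\qquad
\operatorname{prox}_h^{V}(x) = \arg\min_{y}\; h(y) + \tfrac{1}{2}\|y - x\|_V^2,
\quad \|z\|_V^2 = \langle z, Vz \rangle .
\]

A scaled forward-backward (proximal gradient) step for minimizing a composite objective f + h, with f smooth and h admitting an inexpensive prox, then takes the form

\[
x_{k+1} = \operatorname{prox}_{h}^{V_k}\!\bigl(x_k - V_k^{-1}\,\nabla f(x_k)\bigr),
\]

where V_k plays the role of a quasi-Newton approximation of the Hessian of f. The paper's first result, as described in the abstract, concerns evaluating such scaled proximity operators efficiently for a useful class of functions; the exact structure of the scaling used in the paper is not specified in the abstract.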


reference text

[1] H. H. Bauschke and P. L. Combettes. Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer-Verlag, New York, 2011.

[2] R. H. Byrd, P. Lu, J. Nocedal, and C. Zhu. A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Comput., 16(5):1190–1208, 1995.

[3] P. L. Combettes and J. C. Pesquet. Proximal splitting methods in signal processing. In H. H. Bauschke, R. S. Burachik, P. L. Combettes, V. Elser, D. R. Luke, and H. Wolkowicz, editors, Fixed-Point Algorithms for Inverse Problems in Science and Engineering, pages 185–212. Springer-Verlag, New York, 2011.

[4] E. G. Birgin, J. M. Martínez, and M. Raydan. Nonmonotone spectral projected gradient methods on convex sets. SIAM J. Optim., 10(4):1196–1211, 2000.

[5] S. Wright, R. Nowak, and M. Figueiredo. Sparse reconstruction by separable approximation. IEEE Transactions on Signal Processing, 57(7):2479–2493, 2009.

[6] J. Barzilai and J. Borwein. Two-point step size gradient methods. IMA J. Numer. Anal., 8:141–148, 1988.

[7] A. Beck and M. Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci., 2(1):183–202, 2009.

[8] B. O’Donoghue and E. Candès. Adaptive restart for accelerated gradient schemes. Preprint: arXiv:1204.3982, 2012.

[9] T. Pock and A. Chambolle. Diagonal preconditioning for first order primal-dual algorithms in convex optimization. In ICCV, 2011.

[10] G. H.-G. Chen and R. T. Rockafellar. Convergence rates in forward–backward splitting. SIAM J. Optim., 7(2):421–444, 1997.

[11] C. Zhu, R. H. Byrd, P. Lu, and J. Nocedal. Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization. ACM Trans. Math. Software, 23(4):550–560, 1997.

[12] J. L. Morales and J. Nocedal. Remark on “Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound constrained optimization”. ACM Trans. Math. Softw., 38(1):7:1–7:4, 2011.

[13] W. W. Hager and H. Zhang. A new active set algorithm for box constrained optimization. SIAM J. Optim., 17:526–557, 2006.

[14] G. Andrew and J. Gao. Scalable training of l1-regularized log-linear models. In ICML, 2007.

[15] M. Schmidt, G. Fung, and R. Rosales. Fast optimization methods for l1 regularization: A comparative study and two new approaches. In European Conference on Machine Learning, 2007.

[16] Z. Wen, W. Yin, D. Goldfarb, and Y. Zhang. A fast algorithm for sparse reconstruction based on shrinkage, subspace optimization and continuation. SIAM J. Sci. Comput., 32(4):1832–1857, 2010.

[17] T. Goldstein and S. Setzer. High-order methods for basis pursuit. Technical report, CAM-UCLA, 2011.

[18] J. Yu, S.V.N. Vishwanathan, S. Guenter, and N. Schraudolph. A quasi-Newton approach to nonsmooth convex optimization problems in machine learning. J. Machine Learning Research, 11:1145–1200, 2010.

[19] M. Schmidt, E. van den Berg, M. Friedlander, and K. Murphy. Optimizing costly functions with simple constraints: A limited-memory projected quasi-Newton algorithm. In AISTATS, 2009.

[20] M. Schmidt, D. Kim, and S. Sra. Projected Newton-type methods in machine learning. In S. Sra, S. Nowozin, and S. Wright, editors, Optimization for Machine Learning. MIT Press, 2011.

[21] J. D. Lee, Y. Sun, and M. A. Saunders. Proximal Newton-type methods for minimizing convex objective functions in composite form. Preprint: arXiv:1206.1623, 2012.

[22] J.-J. Moreau. Fonctions convexes duales et points proximaux dans un espace hilbertien. CRAS Sér. A Math., 255:2897–2899, 1962.

[23] R. Griesse and D. A. Lorenz. A semismooth Newton method for Tikhonov functionals with sparsity constraints. Inverse Problems, 24(3):035007, 2008.

[24] C. Broyden. Quasi-Newton methods and their application to function minimization. Math. Comp., 21:577–593, 1967.

[25] N. Gould. Seminal papers in nonlinear optimization. In An introduction to algorithms for continuous optimization. Oxford University Computing Laboratory, 2006. http://www.numerical.rl.ac.uk/nimg/course/lectures/paper/paper.pdf.

[26] J. Nocedal and S. Wright. Numerical Optimization. Springer, 2nd edition, 2006.

[27] I. Dhillon, D. Kim, and S. Sra. Tackling box-constrained optimization via a new projected quasi-Newton approach. SIAM J. Sci. Comput., 32(6):3548–3563, 2010.

[28] R. Fletcher. On the Barzilai-Borwein method. In L. Qi, K. L. Teo, X. Yang, P. M. Pardalos, and D. W. Hearn, editors, Optimization and Control with Applications, volume 96 of Applied Optimization, pages 235–256. Springer US, 2005.

[29] H. Raguet, J. Fadili, and G. Peyré. Generalized forward-backward splitting. Technical report, Preprint Hal-00613637, 2011.

[30] W. Hackbusch. A sparse matrix arithmetic based on H-matrices. Part I: Introduction to H-matrices. Computing, 62:89–108, 1999.

[31] R. T. Rockafellar. Convex Analysis. Princeton University Press, 1970.