
181 nips-2010-Network Flow Algorithms for Structured Sparsity


Source: pdf

Author: Julien Mairal, Rodolphe Jenatton, Francis R. Bach, Guillaume R. Obozinski

Abstract: We consider a class of learning problems that involve a structured sparsity-inducing norm defined as the sum of ℓ∞-norms over groups of variables. Whereas much effort has been put into developing fast optimization methods when the groups are disjoint or embedded in a specific hierarchical structure, we address here the case of general overlapping groups. To this end, we show that the corresponding optimization problem is related to network flow optimization. More precisely, the proximal problem associated with the norm we consider is dual to a quadratic min-cost flow problem. We propose an efficient procedure which computes its solution exactly in polynomial time. Our algorithm scales up to millions of variables, and opens up a whole new range of applications for structured sparse models. We present several experiments on image and video data, demonstrating the applicability and scalability of our approach for various problems.
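
For concreteness, the objects described in the abstract can be written out. The display below is a minimal sketch reconstructed from the abstract alone: \mathcal{G} denotes the set of (possibly overlapping) groups, w_g the coefficients of w indexed by a group g, and the positive weights \eta_g are an illustrative convention rather than notation taken from the paper.

    \Omega(w) \;=\; \sum_{g \in \mathcal{G}} \eta_g \, \|w_g\|_\infty,
    \qquad \|w_g\|_\infty \;=\; \max_{j \in g} |w_j|

    \mathrm{prox}_{\lambda\Omega}(u) \;=\;
    \operatorname*{arg\,min}_{v \in \mathbb{R}^p} \;
    \tfrac{1}{2}\,\|u - v\|_2^2 \;+\; \lambda\,\Omega(v)

It is this proximal problem, the key step of proximal methods such as [3, 4], whose dual the abstract identifies as a quadratic min-cost flow problem solvable exactly in polynomial time.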


reference text

[1] P. Bickel, Y. Ritov, and A. Tsybakov. Simultaneous analysis of Lasso and Dantzig selector. Ann. Stat., 37(4):1705–1732, 2009.

[2] B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. Least angle regression. Ann. Stat., 32(2):407–499, 2004.

[3] Y. Nesterov. Gradient methods for minimizing composite objective function. Technical report, Center for Operations Research and Econometrics (CORE), Catholic University of Louvain, 2007.

[4] A. Beck and M. Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imag. Sci., 2(1):183–202, 2009.

[5] R. Jenatton, J-Y. Audibert, and F. Bach. Structured variable selection with sparsity-inducing norms. Technical report, 2009. Preprint arXiv:0904.3523v1.

[6] L. Jacob, G. Obozinski, and J.-P. Vert. Group Lasso with overlap and graph Lasso. In Proc. ICML, 2009.

[7] P. Zhao, G. Rocha, and B. Yu. The composite absolute penalties family for grouped and hierarchical variable selection. Ann. Stat., 37(6A):3468–3497, 2009.

[8] R. G. Baraniuk, V. Cevher, M. Duarte, and C. Hegde. Model-based compressive sensing. IEEE T. Inform. Theory, 2010. To appear.

[9] V. Cevher, M. F. Duarte, C. Hegde, and R. G. Baraniuk. Sparse signal recovery using Markov random fields. In Adv. NIPS, 2008.

[10] J. Huang, T. Zhang, and D. Metaxas. Learning with structured sparsity. In Proc. ICML, 2009.

[11] S. Kim and E. P. Xing. Tree-guided group lasso for multi-task regression with structured sparsity. In Proc. ICML, 2010.

[12] R. Jenatton, J. Mairal, G. Obozinski, and F. Bach. Proximal methods for sparse hierarchical dictionary learning. In Proc. ICML, 2010.

[13] M. Yuan and Y. Lin. Model selection and estimation in regression with grouped variables. J. Roy. Stat. Soc. B, 68:49–67, 2006.

[14] G. Obozinski, B. Taskar, and M. I. Jordan. Joint covariate selection and joint subspace selection for multiple classification problems. Stat. Comput., 20(2):231–252, 2010.

[15] F. Bach. Exploring large feature spaces with hierarchical multiple kernel learning. In Adv. NIPS, 2008.

[16] P. L. Combettes and J.-C. Pesquet. Proximal splitting methods in signal processing. In Fixed-Point Algorithms for Inverse Problems in Science and Engineering. Springer, 2010.

[17] J. Mairal, R. Jenatton, G. Obozinski, and F. Bach. Network flow algorithms for structured sparsity. Technical report, 2010. Preprint arXiv:1008.5209v1.

[18] D. S. Hochbaum and S. P. Hong. About strongly polynomial time algorithms for quadratic optimization over submodular constraints. Math. Program., 69(1):269–309, 1995.

[19] P. Brucker. An O(n) algorithm for quadratic knapsack problems. Oper. Res. Lett., 3:163–166, 1984.

[20] G. Gallo, M. D. Grigoriadis, and R. E. Tarjan. A fast parametric maximum flow algorithm and applications. SIAM J. Comput., 18:30–55, 1989.

[21] M. Babenko and A. V. Goldberg. Experimental evaluation of a parametric flow algorithm. Technical report, Microsoft Research, 2006. MSR-TR-2006-77.

[22] H. Groenevelt. Two algorithms for maximizing a separable concave function over a polymatroid feasible region. Eur. J. Oper. Res., pages 227–236, 1991.

[23] F. Bach. Structured sparsity-inducing norms through submodular functions. In Adv. NIPS, 2010.

[24] A. V. Goldberg and R. E. Tarjan. A new approach to the maximum flow problem. In Proc. ACM Symposium on Theory of Computing, 1986.

[25] L. R. Ford and D. R. Fulkerson. Maximal flow through a network. Canadian J. Math., 8(3):399–404, 1956.

[26] B. V. Cherkassky and A. V. Goldberg. On implementing the push-relabel method for the maximum flow problem. Algorithmica, 19(4):390–410, 1997.

[27] S. Negahban, P. Ravikumar, M. J. Wainwright, and B. Yu. A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers. In Adv. NIPS, 2009.

[28] J. M. Borwein and A. S. Lewis. Convex analysis and nonlinear optimization. Springer, 2006.

[29] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma. Robust face recognition via sparse representation. IEEE T. Pattern Anal., pages 210–227, 2008.

[30] P. Sprechmann, I. Ramirez, G. Sapiro, and Y. C. Eldar. Collaborative hierarchical sparse modeling. Technical report, 2010. Preprint arXiv:1003.0400v1.