
40 nips-2007-Bundle Methods for Machine Learning


Source: pdf

Author: Quoc V. Le, Alex J. Smola, S. V. N. Vishwanathan

Abstract: We present a globally convergent method for regularized risk minimization problems. Our method applies to Support Vector estimation, regression, Gaussian Processes, and any other regularized risk minimization setting which leads to a convex optimization problem. SVMPerf can be shown to be a special case of our approach. In addition to the unified framework we present tight convergence bounds, which show that our algorithm converges in O(1/ε) steps to precision ε for general convex problems and in O(log(1/ε)) steps for continuously differentiable problems. We demonstrate in experiments the performance of our approach.

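The abstract describes a cutting-plane (bundle) approach to regularized risk minimization with the stated O(1/ε) and O(log(1/ε)) rates. As a purely illustrative aid, the following is a minimal sketch of a generic bundle method of this kind: it builds a piecewise-linear lower bound on the empirical risk from subgradients and repeatedly minimizes the regularized model. It is not the authors' implementation; the names (bundle_method, hinge_risk), the SciPy SLSQP inner solve, and the toy hinge-loss data are assumptions made for the example.

# Illustrative sketch of a generic bundle / cutting-plane method for
# regularized risk minimization; not the authors' implementation.
# Names (bundle_method, hinge_risk) and the toy data are assumptions.
import numpy as np
from scipy.optimize import minimize

def bundle_method(risk_and_subgrad, dim, lam=0.1, eps=1e-3, max_iter=100):
    """Minimize J(w) = (lam/2)*||w||^2 + R_emp(w) via a cutting-plane model of R_emp."""
    w = np.zeros(dim)
    cuts_a, cuts_b = [], []            # each cut: R_emp(w) >= a . w + b
    best_obj = np.inf
    for _ in range(max_iter):
        r, a = risk_and_subgrad(w)                 # R_emp(w) and one subgradient
        best_obj = min(best_obj, 0.5 * lam * w @ w + r)
        cuts_a.append(a)
        cuts_b.append(r - a @ w)                   # offset so the cut is tight at w
        A, b = np.array(cuts_a), np.array(cuts_b)

        # Inner problem: min_{w, xi} (lam/2)*||w||^2 + xi  s.t.  xi >= A w + b.
        # Solved here with SciPy's SLSQP for brevity; a dedicated QP (or its dual)
        # would normally be used.
        def model(z):
            return 0.5 * lam * z[:-1] @ z[:-1] + z[-1]
        cons = [{"type": "ineq",
                 "fun": lambda z, i=i: z[-1] - A[i] @ z[:-1] - b[i]}
                for i in range(len(b))]
        z0 = np.append(w, float(max(b.max(), 0.0)))
        res = minimize(model, z0, constraints=cons, method="SLSQP")
        w, lower_bound = res.x[:-1], res.fun       # model value lower-bounds the optimum

        if best_obj - lower_bound < eps:           # gap bounds the suboptimality
            break
    return w

# Toy usage: average hinge loss on synthetic data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sign(X @ rng.normal(size=5) + 0.1 * rng.normal(size=200))

def hinge_risk(w):
    margins = 1.0 - y * (X @ w)
    active = margins > 0
    risk = margins[active].sum() / len(y)
    subgrad = -(y[active, None] * X[active]).sum(axis=0) / len(y)
    return risk, subgrad

w_opt = bundle_method(hinge_risk, dim=5, lam=0.1)

In this sketch the iteration stops once the gap between the best regularized risk seen so far and the value of the piecewise-linear model falls below eps; that gap is the quantity the abstract's convergence bounds control.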

reference text

[1] Y. Altun, A. J. Smola, and T. Hofmann. Exponential families for conditional random fields. In UAI, pages 2–9, 2004.

[2] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.

[3] J. Hiriart-Urruty and C. Lemaréchal. Convex Analysis and Minimization Algorithms. Springer, 1993.

[4] T. Joachims. A support vector method for multivariate performance measures. In ICML, pages 377–384, 2005.

[5] T. Joachims. Training linear SVMs in linear time. In KDD, 2006.

[6] S.-I. Lee, V. Ganapathi, and D. Koller. Efficient structure learning of Markov networks using L1-regularization. In NIPS, pages 817–824, 2007.

[7] A. Nedich and D. P. Bertsekas. Convergence rate of incremental subgradient algorithms. In Stochastic Optimization: Algorithms and Applications, pages 263–304. 2000.

[8] J. Nocedal and S. J. Wright. Numerical Optimization. Springer, 1999.

[9] N. Ratliff, J. Bagnell, and M. Zinkevich. (Online) subgradient methods for structured prediction. In Proc. of AISTATS, 2007.

[10] R. T. Rockafellar. Convex Analysis. Princeton University Press, 1970.

[11] S. Shalev-Shwartz and Y. Singer. Online learning meets optimization in the dual. In COLT, 2006.

[12] S. Shalev-Shwartz, Y. Singer, and N. Srebro. Pegasos: Primal estimated sub-gradient solver for SVM. In ICML, 2007.

[13] A. J. Smola, S. V. N. Vishwanathan, and Q. V. Le. Bundle methods for machine learning. JMLR, 2008. In preparation.

[14] B. Taskar, C. Guestrin, and D. Koller. Max-margin Markov networks. In NIPS, pages 25–32, 2004.

[15] C. H. Teo, Q. Le, A. Smola, and S. V. N. Vishwanathan. A scalable modular convex solver for regularized risk minimization. In KDD, 2007.

[16] I. Tsochantaridis, T. Joachims, T. Hofmann, and Y. Altun. Large margin methods for structured and interdependent output variables. JMLR, 6:1453–1484, 2005.

[17] V. Vapnik. The Nature of Statistical Learning Theory. Springer, 1995.

[18] T. Zhang. Sequential greedy approximation for certain convex optimization problems. IEEE Trans. Information Theory, 49(3):682–691, 2003.