nips nips2005 nips2005-184 nips2005-184-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ben Taskar, Simon Lacoste-Julian, Michael I. Jordan
Abstract: We present a simple and scalable algorithm for large-margin estimation of structured models, including an important class of Markov networks and combinatorial models. We formulate the estimation problem as a convex-concave saddle-point problem and apply the extragradient method, yielding an algorithm with linear convergence using simple gradient and projection calculations. The projection step can be solved using combinatorial algorithms for min-cost quadratic flow. This makes the approach an efficient alternative to formulations based on reductions to a quadratic program (QP). We present experiments on two very different structured prediction tasks: 3D image segmentation and word alignment, illustrating the favorable scaling properties of our algorithm. 1
[1] Y. Altun, I. Tsochantaridis, and T. Hofmann. Hidden Markov support vector machines. In Proc. ICML, 2003.
[2] D. Anguelov, B. Taskar, V. Chatalbashev, D. Koller, D. Gupta, G. Heitz, and A. Ng. Discriminative learning of Markov random fields for segmentation of 3d scan data. In CVPR, 2005.
[3] P. Baldi, J. Cheng, and A. Vullo. Large-scale prediction of disulphide bond connectivity. In Proc. NIPS, 2004.
[4] P. Bartlett, M. Collins, B. Taskar, and D. McAllester. Exponentiated gradient algorithms for large-margin structured classification. In NIPS, 2004.
[5] S. Belongie, J. Malik, and J. Puzicha. Shape matching and object recognition using shape contexts. IEEE Trans. Pattern Anal. Mach. Intell., 24, 2002.
[6] M. Collins. Discriminative training methods for hidden Markov models: Theory and experiments with perceptron algorithms. In Proc. EMNLP, 2002.
[7] D. M. Greig, B. T. Porteous, and A. H. Seheult. Exact maximum a posteriori estimation for binary images. J. R. Statist. Soc. B, 51, 1989.
[8] F. Guerriero and P. Tseng. Implementation and test of auction methods for solving generalized network flow problems with separable convex cost. Journal of Optimization Theory and Applications, 115(1):113–144, October 2002.
[9] B.S. He and L. Z. Liao. Improvements of some projection methods for monotone nonlinear variational inequalities. JOTA, 112:111:128, 2002.
[10] M. Jerrum and A. Sinclair. Polynomial-time approximation algorithms for the Ising model. SIAM J. Comput., 22, 1993.
[11] G. M. Korpelevich. The extragradient method for finding saddle points and other problems. Ekonomika i Matematicheskie Metody, 12:747:756, 1976.
[12] S. Kumar and M. Hebert. Discriminative fields for modeling spatial dependencies in natural images. In NIPS, 2003.
[13] J. Lafferty, A. McCallum, and F. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In ICML, 2001.
[14] E. Matusov, R. Zens, and H. Ney. Symmetric word alignments for statistical machine translation. In Proc. COLING, 2004.
[15] R. Mihalcea and T. Pedersen. An evaluation exercise for word alignment. In Proceedings of the HLT-NAACL 2003 Workshop, Building and Using parallel Texts: Data Driven Machine Translation and Beyond, pages 1–6, Edmonton, Alberta, Canada, 2003.
[16] F. Och and H. Ney. A systematic comparison of various statistical alignment models. Computational Linguistics, 29(1), 2003.
[17] A. Schrijver. Combinatorial Optimization: Polyhedra and Efficiency. Springer, 2003.
[18] B. Taskar. Learning Structured Prediction Models: A Large Margin Approach. PhD thesis, Stanford University, 2004.
[19] B. Taskar, V. Chatalbashev, D. Koller, and C. Guestrin. Learning structured prediction models: a large margin approach. In ICML, 2005.
[20] B. Taskar, C. Guestrin, and D. Koller. Max margin Markov networks. In NIPS, 2003.
[21] B. Taskar, S. Lacoste-Julien, and M. Jordan. Structured prediction, dual extragradient and Bregman projections. Technical report, UC Berkeley Statistics Department, 2005.
[22] B. Taskar, S. Lacoste-Julien, and D. Klein. A discriminative matching approach to word alignment. In EMNLP, 2005.
[23] L. G. Valiant. The complexity of computing the permanent. Theoretical Computer Science, 8:189–201, 1979.