nips nips2008 nips2008-239 nips2008-239-reference knowledge-graph by maker-knowledge-mining

239 nips-2008-Tighter Bounds for Structured Estimation

Source: pdf

Author: Olivier Chapelle, Chuong B. Do, Choon H. Teo, Quoc V. Le, Alex J. Smola

Abstract: Large-margin structured estimation methods minimize a convex upper bound of loss functions. While they allow for efﬁcient optimization algorithms, these convex formulations are not tight and sacriﬁce the ability to accurately model the true loss. We present tighter non-convex bounds based on generalizing the notion of a ramp loss from binary classiﬁcation to structured estimation. We show that a small modiﬁcation of existing optimization algorithms sufﬁces to solve this modiﬁed problem. On structured prediction tasks such as protein sequence alignment and web page ranking, our algorithm leads to improved accuracy. 1

reference text

[1] K. Crammer, and Y. Singer. On the Learnability and Design of Output Codes for Multiclass Problems. In COLT 2000, pages 35–46. Morgan Kaufmann, 2000.

[2] R. Collobert, F.H. Sinz, J. Weston, and L. Bottou. Trading convexity for scalability. In ICML 2006, pages 201–208. ACM, 2006.

[3] C. B. Do, S. S. Gross, and S. Batzoglou. CONTRAlign: discriminative training for protein sequence alignment. In RECOMB, pages 160–174, 2006.

[4] C. B. Do, D. A. Woods, and S. Batzoglou. CONTRAfold: RNA secondary structure prediction without physics-based models. Bioinformatics, 22(14):e90–e98, 2006.

[5] S. R. Eddy. Non-coding RNA genes and the modern RNA world. Nature Reviews Genetics, 2(12):919– 929, 2001.

[6] Y. Freund, R. Iyer, R.E. Schapire, and Y. Singer. An efﬁcient boosting algorithm for combining preferences. In ICML 1998, pages 170–178., 1998.

[7] S. Grifﬁths-Jones, S. Moxon, M. Marshall, A. Khanna, S. R. Eddy, and A. Bateman. Rfam: annotating non-coding RNAs in complete genomes. Nucl. Acids Res., 33:D121–D124, 2005.

[8] R. Herbrich, T. Graepel, and K. Obermayer. Large margin rank boundaries for ordinal regression. In Advances in Large Margin Classiﬁers, pages 115–132, 2000. MIT Press.

[9] T. Hoang. DC optimization: Theory, methods, and applications. In R. Horst and P. Pardalos, editors, Handbook of Global Optimization, Kluwer.

[10] T. Joachims. Optimizing search engines using clickthrough data. In KDD. ACM, 2002.

[11] T. Joachims, T. Galor, and R. Elber. Learning to align sequences: A maximum-margin approach. In New Algorithms for Macromolecular Simulation, LNCS 49, 57–68. Springer, 2005.

[12] Q. Le and A.J. Smola. Direct optimization of ranking measures. NICTA-TR, 2007.

[13] T.-Y. Liu, J. Xu, T. Qin, W. Xiong, and H. Li. Letor: Benchmark dataset for research on learning to rank for information retrieval. In LR4IR, 2007.

[14] J. Pei and N. V. Grishin. MUMMALS: multiple sequence alignment improved by using hidden Markov models with local structural information. Nucl. Acids Res., 34(16):4364–4374, 2006.

[15] N. Ratliff, J. Bagnell, and M. Zinkevich. (online) subgradient methods for structured prediction. In AISTATS, 2007.

[16] B. Rost. Twilight zone of protein sequence alignments. Protein Eng., 12(2):85–94, 1999.

[17] S. Shalev-Shwartz, Y. Singer, and N. Srebro. Pegasos: Primal estimated sub-gradient solver for svm. In Proc. Intl. Conf. Machine Learning, 2007.

[18] B. Taskar, C. Guestrin, and D. Koller. Max-margin Markov networks. In NIPS 16, pages 25–32, 2004. MIT Press.

[19] C.H. Teo, Q. Le, A.J. Smola, and S.V.N. Vishwanathan. A scalable modular convex solver for regularized risk minimization. In KDD. ACM, 2007.

[20] I. Tsochantaridis, T. Joachims, T. Hofmann, and Y. Altun. Large margin methods for structured and interdependent output variables. J. Mach. Learn. Res., 6:1453–1484, 2005.

[21] M. Weimer, A. Karatzoglou, Q. Le, and A. Smola. Coﬁ rank - maximum margin matrix factorization for collaborative ranking. In NIPS 20. MIT Press, 2008.

[22] A.L. Yuille and A. Rangarajan. The concave-convex procedure. Neural Computation, 15:915–936, 2003.

[23] A. Nedic and D. P. Bertsekas. Incremental subgradient methods for nondifferentiable optimization. Siam J. on Optimization, 12:109–138, 2001. 8