nips nips2006 nips2006-195 nips2006-195-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Samuel S. Gross, Olga Russakovsky, Chuong B. Do, Serafim Batzoglou
Abstract: We consider the problem of training a conditional random field (CRF) to maximize per-label predictive accuracy on a training set, an approach motivated by the principle of empirical risk minimization. We give a gradient-based procedure for minimizing an arbitrarily accurate approximation of the empirical risk under a Hamming loss function. In experiments with both simulated and real data, our optimization procedure gives significantly better testing performance than several current approaches for CRF training, especially in situations of high label noise.
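The abstract does not say which approximation of the Hamming loss is used, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes each label position has a "margin" (a hypothetical quantity: the model's score for the correct label minus its best score for an incorrect label) and replaces the 0/1 error indicator with a sigmoid whose `sharpness` parameter controls how closely the surrogate tracks the exact per-label error rate. The function names and the choice of a sigmoid are illustrative assumptions, not the authors' stated method, and no CRF inference is performed here.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def hamming_risk(margins):
    """Exact per-label error rate: a label counts as wrong when its margin is <= 0."""
    return sum(1 for m in margins if m <= 0) / len(margins)


def smoothed_hamming_risk(margins, sharpness):
    """Differentiable surrogate: each 0/1 indicator becomes sigmoid(-sharpness * margin)."""
    return sum(sigmoid(-sharpness * m) for m in margins) / len(margins)


def smoothed_risk_gradient(margins, sharpness):
    """Partial derivative of the surrogate with respect to each margin."""
    n = len(margins)
    grads = []
    for m in margins:
        s = sigmoid(-sharpness * m)
        # d/dm sigmoid(-k*m) = -k * s * (1 - s)
        grads.append(-sharpness * s * (1.0 - s) / n)
    return grads


# Hypothetical margins for four label positions (assumed values, not data from the paper).
margins = [2.0, -0.5, 1.0, -1.5]
exact = hamming_risk(margins)
loose = smoothed_hamming_risk(margins, sharpness=1.0)
tight = smoothed_hamming_risk(margins, sharpness=50.0)
grads = smoothed_risk_gradient(margins, sharpness=1.0)
```

Raising `sharpness` brings the surrogate closer to the exact error rate while keeping it differentiable, which is what makes a gradient-based optimizer applicable. In an actual CRF the margins would depend on the model parameters through inference, and the chain rule would carry these gradients back to the parameters.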
[1] J. Lafferty, A. McCallum, and F. Pereira. Conditional random fields: probabilistic models for segmenting and labeling sequence data. In ICML, 2001.
[2] V. Vapnik. Statistical Learning Theory. Wiley, 1998.
[3] C. B. Do, M. S. P. Mahabhashyam, M. Brudno, and S. Batzoglou. ProbCons: probabilistic consistency-based multiple sequence alignment. Genome Research, 15(2):330–340, 2005.
[4] C. B. Do, D. A. Woods, and S. Batzoglou. CONTRAfold: RNA secondary structure prediction without physics-based models. Bioinformatics, 22(14):e90–e98, 2006.
[5] P. Liang, B. Taskar, and D. Klein. Alignment by agreement. In HLT-NAACL, 2006.
[6] J. Nocedal and S. J. Wright. Numerical Optimization. Springer, 1999.
[7] S. Kakade, Y. W. Teh, and S. Roweis. An alternate objective function for Markovian fields. In ICML, 2002.
[8] Y. Altun, M. Johnson, and T. Hofmann. Investigating loss functions and optimization methods for discriminative learning of label sequences. In EMNLP, 2003.
[9] B. Taskar, C. Guestrin, and D. Koller. Max-margin Markov networks. In NIPS, 2003.
[10] I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun. Support vector machine learning for interdependent and structured output spaces. In ICML, 2004.
[11] J. Suzuki, E. McDermott, and H. Isozaki. Training conditional random fields with multivariate evaluation measures. In ACL, 2006.
[12] G. Grumbling, V. Strelets, and The FlyBase Consortium. FlyBase: anatomical data, images and queries. Nucleic Acids Research, 34:D484–D488, 2006.
[13] J. Platt. Using sparseness and analytic QP to speed training of support vector machines. In NIPS, 1999.
[14] M. Jansche. Maximum expected F-measure training of logistic regression models. In EMNLP, 2005.
[15] F. J. Och. Minimum error rate training in statistical machine translation. In ACL, 2003.