22 nips-2007-Agreement-Based Learning

Source: pdf

Author: Percy Liang, Dan Klein, Michael I. Jordan

Abstract: The learning of probabilistic models with many hidden variables and non-decomposable dependencies is an important and challenging problem. In contrast to traditional approaches based on approximate inference in a single intractable model, our approach is to train a set of tractable submodels by encouraging them to agree on the hidden variables. This allows us to capture non-decomposable aspects of the data while still maintaining tractability. We propose an objective function for our approach, derive EM-style algorithms for parameter estimation, and demonstrate their effectiveness on three challenging real-world learning tasks.

reference text

[1] J. Besag. Statistical analysis of non-lattice data. The Statistician, 24:179–195, 1975.

[2] P. F. Brown, S. A. D. Pietra, V. J. D. Pietra, and R. L. Mercer. The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19:263–311, 1993.

[3] G. Hinton. Products of experts. In International Conference on Artificial Neural Networks, 1999.

[4] V. Jojic, N. Jojic, C. Meek, D. Geiger, A. Siepel, D. Haussler, and D. Heckerman. Efficient approximations for learning phylogenetic HMM models from data. Bioinformatics, 20:161–168, 2004.

[5] D. Klein and C. D. Manning. Corpus-based induction of syntactic structure: Models of dependency and constituency. In Association for Computational Linguistics (ACL), 2004.

[6] P. Liang, B. Taskar, and D. Klein. Alignment by agreement. In Human Language Technology and North American Association for Computational Linguistics (HLT/NAACL), 2006.

[7] B. Lindsay. Composite likelihood methods. Contemporary Mathematics, 80:221–239, 1988.

[8] H. Ney and S. Vogel. HMM-based word alignment in statistical translation. In International Conference on Computational Linguistics (COLING), 1996.

[9] A. Siepel and D. Haussler. Combining phylogenetic and hidden Markov models in biosequence analysis. Journal of Computational Biology, 11:413–428, 2004.

[10] C. Sutton and A. McCallum. Piecewise training of undirected models. In Uncertainty in Artificial Intelligence (UAI), 2005.

[11] C. Sutton and A. McCallum. Piecewise pseudolikelihood for efficient CRF training. In International Conference on Machine Learning (ICML), 2007.

[12] M. Wainwright and M. I. Jordan. Graphical models, exponential families, and variational inference. Technical report, Department of Statistics, University of California at Berkeley, 2003.