
33 acl-2012-Automatic Event Extraction with Structured Preference Modeling


Source: pdf

Author: Wei Lu ; Dan Roth

Abstract: This paper presents a novel sequence labeling model based on latent-variable semi-Markov conditional random fields for jointly extracting argument roles of events from texts. The model takes in coarse mention and type information and predicts argument roles for a given event template. This paper addresses the event extraction problem in a primarily unsupervised setting, where no labeled training instances are available. Our key contribution is a novel learning framework called structured preference modeling (PM), which allows arbitrary preferences to be assigned to certain structures during the learning procedure. We establish and discuss connections between this framework and other existing works. We show empirically that the structured preferences are crucial to the success of our task. Our model, trained without annotated data and with a small number of structured preferences, yields performance competitive with some baseline supervised approaches.


reference text

T. Berg-Kirkpatrick, A. Bouchard-Côté, J. DeNero, and D. Klein. 2010. Painless unsupervised learning with features. In Proc. of HLT-NAACL’10, pages 582–590.
J. Besag. 1975. Statistical analysis of non-lattice data. The Statistician, pages 179–195.
M. Chang, L. Ratinov, and D. Roth. 2007. Guiding semi-supervision with constraint-driven learning. In Proc. of ACL’07, pages 280–287.
M. Chang, D. Goldwasser, D. Roth, and V. Srikumar. 2010a. Discriminative learning over constrained latent representations. In Proc. of NAACL’10, 6.
M. Chang, V. Srikumar, D. Goldwasser, and D. Roth. 2010b. Structured output learning with indirect supervision. In Proc. ICML’10.
K. Ganchev, J. Graça, J. Gillenwater, and B. Taskar. 2010. Posterior regularization for structured latent variable models. The Journal of Machine Learning Research (JMLR), 11:2001–2049.
A. Haghighi and D. Klein. 2006. Prototype-driven learning for sequence models. In Proc. of HLT-NAACL’06, pages 320–327.
J. D. Lafferty, A. McCallum, and F. C. N. Pereira. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proc. of ICML’01, pages 282–289.
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. 1998. Gradient-based learning applied to document recognition. Proc. of the IEEE, pages 2278–2324.
D.C. Liu and J. Nocedal. 1989. On the limited memory BFGS method for large scale optimization. Mathematical Programming, 45(1):503–528.
D. Okanohara, Y. Miyao, Y. Tsuruoka, and J. Tsujii. 2006. Improving the scalability of semi-Markov conditional random fields for named entity recognition. In Proc. of ACL’06, pages 465–472.
H. Poon, C. Cherry, and K. Toutanova. 2009. Unsupervised morphological segmentation with log-linear models. In Proc. of HLT-NAACL’09, pages 209–217.
L. Ratinov and D. Roth. 2009. Design challenges and misconceptions in named entity recognition. In Proc. of CoNLL’09, pages 147–155.
L. Ratinov, D. Roth, D. Downey, and M. Anderson. 2011. Local and global algorithms for disambiguation to Wikipedia. In Proc. of ACL-HLT’11, pages 1375–1384.
D. Roth and W. Yih. 2005. Integer linear programming inference for conditional random fields. In Proc. of ICML’05, pages 736–743.
R. Samdani, M. Chang, and D. Roth. 2012. Unified expectation maximization. In Proc. NAACL’12.
S. Sarawagi and W.W. Cohen. 2004. Semi-Markov conditional random fields for information extraction. NIPS’04, pages 1185–1192.
N.A. Smith and J. Eisner. 2005a. Contrastive estimation: Training log-linear models on unlabeled data. In Proc. of ACL’05, pages 354–362.
N.A. Smith and J. Eisner. 2005b. Guiding unsupervised grammar induction using contrastive estimation. In Proc. of IJCAI Workshop on Grammatical Inference Applications, pages 73–82.
J. Strötgen and M. Gertz. 2010. HeidelTime: High quality rule-based extraction and normalization of temporal expressions. In Proc. of SemEval’10, pages 321–324.