184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans

Source: pdf

Author: Fei Huang ; Alexander Yates

Abstract: Most supervised language processing systems show a significant drop-off in performance when they are tested on text that comes from a domain significantly different from the domain of the training data. Semantic role labeling techniques are typically trained on newswire text, and in tests their performance on fiction is as much as 19% worse than their performance on newswire text. We investigate techniques for building open-domain semantic role labeling systems that approach the ideal of a train-once, use-anywhere system. We leverage recently-developed techniques for learning representations of text using latent-variable language models, and extend these techniques to ones that provide the kinds of features that are useful for semantic role labeling. In experiments, our novel system reduces error by 16% relative to the previous state of the art on out-of-domain text.

reference text

Omri Abend, Roi Reichart, and Ari Rappoport. 2009. Unsupervised argument identification for semantic role labeling. In Proceedings of the ACL.
Michiel Bacchiani, Michael Riley, Brian Roark, and Richard Sproat. 2006. MAP adaptation of stochastic grammars. Computer Speech and Language, 20(1):41–68.
Shai Ben-David, John Blitzer, Koby Crammer, and Fernando Pereira. 2007. Analysis of representations for domain adaptation. In Advances in Neural Information Processing Systems 20, Cambridge, MA. MIT Press.
Shai Ben-David, John Blitzer, Koby Crammer, Alex Kulesza, Fernando Pereira, and Jenn Wortman. 2009. A theory of learning from different domains. Machine Learning, (to appear).
John Blitzer, Ryan McDonald, and Fernando Pereira. 2006. Domain adaptation with structural correspondence learning. In EMNLP.
Xavier Carreras and Lluís Màrquez. 2003. Phrase recognition by filtering and ranking with perceptrons. In Proceedings of RANLP-2003.
Xavier Carreras and Lluís Màrquez. 2004. Introduction to the CoNLL-2004 shared task: Semantic role labeling. In Proceedings of the Conference on Natural Language Learning (CoNLL).
Xavier Carreras and Lluís Màrquez. 2005. Introduction to the CoNLL-2005 shared task: Semantic role labeling. In Proceedings of the Conference on Natural Language Learning (CoNLL).
Trevor Cohn and Phil Blunsom. 2005. Semantic role labelling with tree conditional random fields. In Proceedings of CoNLL.
Arthur Dempster, Nan Laird, and Donald Rubin. 1977. Likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39(1):1–38.
Koen Deschacht and Marie-Francine Moens. 2009. Semi-supervised semantic role labeling using the latent words language model. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).
D. Downey, M. Broadhead, and O. Etzioni. 2007a. Locating complex named entities in web text. In Procs. of the 20th International Joint Conference on Artificial Intelligence (IJCAI 2007).
Doug Downey, Stefan Schoenmackers, and Oren Etzioni. 2007b. Sparse information extraction: Unsupervised language models to the rescue. In ACL.
G. Escudero, L. Màrquez, and G. Rigau. 2000. An empirical study of the domain dependence of supervised word sense disambiguation systems. In EMNLP/VLC.
Hagen Fürstenau and Mirella Lapata. 2009a. Graph alignment for semi-supervised semantic role labeling. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 11–20.
Hagen Fürstenau and Mirella Lapata. 2009b. Semi-supervised semantic role labeling. In Proceedings of the 12th Conference of the European Chapter of the ACL, pages 220–228.
Daniel Gildea and Daniel Jurafsky. 2002. Automatic labeling of semantic roles. Computational Linguistics, 28(3):245–288.
Daniel Gildea. 2001. Corpus Variation and Parser Performance. In Conference on Empirical Methods in Natural Language Processing.
Trond Grenager and Christopher D. Manning. 2006. Unsupervised discovery of a statistical verb lexicon. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
Fei Huang and Alexander Yates. 2009. Distributional representations for handling sparsity in supervised sequence labeling. In Proceedings of the Annual Meeting of the Association for Computational Linguistics.
H. Kucera and W.N. Francis. 1967. Computational Analysis of Present-Day American English. Brown University Press.
J. Lafferty, Andrew McCallum, and Fernando Pereira. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the International Conference on Machine Learning.
David McClosky, Eugene Charniak, and Mark Johnson. 2006. Reranking and self-training for parser adaptation. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the ACL, pages 337–344.
Martha Palmer, Dan Gildea, and Paul Kingsbury. 2005. The Proposition Bank: A corpus annotated with semantic roles. Computational Linguistics Journal, 31(1).
Sameer Pradhan, Kadri Hacioglu, Wayne Ward, James H. Martin, and Daniel Jurafsky. 2005. Semantic role chunking combining complementary syntactic views. In Proc. of the Annual Conference on Computational Natural Language Learning (CoNLL).
Sameer Pradhan, Wayne Ward, and James H. Martin. 2007. Towards robust semantic role labeling. In Proceedings of NAACL-HLT, pages 556–563.
Vasin Punyakanok, Dan Roth, and Wen-tau Yih. 2008. The importance of syntactic parsing and inference in semantic role labeling. Computational Linguistics, 34(2):257–287.
Lawrence R. Rabiner. 1989. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257–285.
Robert S. Swier and Suzanne Stevenson. 2004. Unsupervised semantic role labelling. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, pages 95–102.
Kristina Toutanova, Aria Haghighi, and Christopher D. Manning. 2008. A global joint model for semantic role labeling. Computational Linguistics, 34(2):161–191.
Jason Weston, Frederic Ratle, and Ronan Collobert. 2008. Deep learning via semi-supervised embedding. In Proceedings of the 25th International Conference on Machine Learning.