
269 acl-2011-Scaling up Automatic Cross-Lingual Semantic Role Annotation


Source: pdf

Author: Lonneke van der Plas ; Paola Merlo ; James Henderson

Abstract: Broad-coverage semantic annotations for training statistical learners are only available for a handful of languages. Previous approaches to cross-lingual transfer of semantic annotations have addressed this problem with encouraging results on a small scale. In this paper, we scale up previous efforts by using an automatic approach to semantic annotation that does not rely on a semantic ontology for the target language. Moreover, we improve the quality of the transferred semantic annotations by using a joint syntactic-semantic parser that learns the correlations between syntax and semantics of the target language and smooths out the errors from automatic transfer. We reach labelled F-measures for predicates and arguments that are only 4 and 9 percentage points lower, respectively, than the upper bound from manual annotations.


reference text

A. Abeillé, L. Clément, and F. Toussenel. 2003. Building a treebank for French. In Treebanks: Building and Using Parsed Corpora. Kluwer Academic Publishers.
P. Annesi and R. Basili. 2010. Cross-lingual alignment of FrameNet annotations through Hidden Markov Models. In Proceedings of CICLing.
R. Basili, D. De Cao, D. Croce, B. Coppola, and A. Moschitti. 2009. Computational Linguistics and Intelligent Text Processing, chapter Cross-Language Frame Semantics Transfer in Bilingual Corpora, pages 332–345. Springer Berlin / Heidelberg.
M.-H. Candito, B. Crabbé, P. Denis, and F. Guérin. 2009. Analyse syntaxique du français : des constituants aux dépendances. In Proceedings of la Conférence sur le Traitement Automatique des Langues Naturelles (TALN'09), Senlis, France.
B. Dorr. 1994. Machine translation divergences: A formal description and proposed solution. Computational Linguistics, 20(4):597–633.
C. J. Fillmore, R. Johnson, and M.R.L. Petruck. 2003. Background to FrameNet. International journal of lexicography, 16.3:235–250.
P. Fung, Z. Wu, Y. Yang, and D. Wu. 2007. Learning bilingual semantic frames: Shallow semantic parsing vs. semantic role projection. In 11th Conference on Theoretical and Methodological Issues in Machine Translation (TMI 2007).
J. Henderson, P. Merlo, G. Musillo, and I. Titov. 2008. A latent variable model of synchronous parsing for syntactic and semantic dependencies. In Proceedings of CONLL 2008, pages 178–182.
R. Hwa, P. Resnik, A. Weinberg, and O. Kolak. 2002. Evaluating translational correspondence using annotation projection. In Proceedings of the 40th Annual Meeting of the ACL.
R. Hwa, P. Resnik, A. Weinberg, C. Cabezas, and O. Kolak. 2005. Bootstrapping parsers via syntactic projection across parallel texts. Natural language engineering, 11:311–325.
R. Johansson and P. Nugues. 2006. A FrameNet-based semantic role labeler for Swedish. In Proceedings of the annual Meeting of the Association for Computational Linguistics (ACL).
P. Koehn. 2003. Europarl: A multilingual corpus for evaluation of machine translation.
J. Lang and M. Lapata. 2010. Unsupervised induction of semantic roles. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 939–947, Los Angeles, California, June. Association for Computational Linguistics.
M. Marcus, B. Santorini, and M.A. Marcinkiewicz. 1993. Building a large annotated corpus of English: the Penn Treebank. Comp. Ling., 19:313–330.
P. Merlo and L. van der Plas. 2009. Abstraction and generalisation in semantic role labels: PropBank, VerbNet or both? In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pages 288–296, Suntec, Singapore.
A. Meyers. 2007. Annotation guidelines for NomBank - noun argument structure for PropBank. Technical report, New York University.
P. Monachesi, G. Stevens, and J. Trapman. 2007. Adding semantic role annotation to a corpus of written Dutch. In Proceedings of the Linguistic Annotation Workshop (LAW), pages 77–84, Prague, Czech Republic.
F. J. Och and H. Ney. 2003. A systematic comparison of various statistical alignment models. Computational Linguistics, 29:19–51.
Sebastian Padó and Mirella Lapata. 2009. Cross-lingual annotation projection of semantic roles. Journal of Artificial Intelligence Research, 36:307–340.
S. Padó and G. Pitel. 2007. Annotation précise du français en sémantique de rôles par projection cross-linguistique. In Proceedings of TALN.
S. Padó. 2007. Cross-lingual Annotation Projection Models for Role-Semantic Information. Ph.D. thesis, Saarland University.
M. Palmer, D. Gildea, and P. Kingsbury. 2005. The Proposition Bank: An annotated corpus of semantic roles. Computational Linguistics, 31:71–105.
I. Titov and J. Henderson. 2007. A latent variable model for generative dependency parsing. In Proceedings of the International Conference on Parsing Technologies (IWPT-07), pages 144–155, Prague, Czech Republic.
I. Titov, J. Henderson, P. Merlo, and G. Musillo. 2009. Online graph planarisation for synchronous parsing of semantic and syntactic dependencies. In Proceedings of the twenty-first international joint conference on artificial intelligence (IJCAI-09), Pasadena, California, July.
L. van der Plas, T. Samardžić, and P. Merlo. 2010. Cross-lingual validity of PropBank in the manual annotation of French. In Proceedings of the 4th Linguistic Annotation Workshop (The LAW IV), Uppsala, Sweden.
D. Wu and P. Fung. 2009a. Can semantic role labeling improve SMT? In Proceedings of the Annual Conference of European Association of Machine Translation.
D. Wu and P. Fung. 2009b. Semantic roles for SMT: A hybrid two-pass model. In Proceedings of the Joint Conference of the North American Chapter of ACL/Human Language Technology.
D. Yarowsky, G. Ngai, and R. Wicentowski. 2001. Inducing multilingual text analysis tools via robust projection across aligned corpora. In Proceedings of the International Conference on Human Language Technology (HLT).