145 emnlp-2011-Unsupervised Semantic Role Induction with Graph Partitioning


Source: pdf

Author: Joel Lang ; Mirella Lapata

Abstract: In this paper we present a method for unsupervised semantic role induction which we formalize as a graph partitioning problem. Argument instances of a verb are represented as vertices in a graph whose edge weights quantify their role-semantic similarity. Graph partitioning is realized with an algorithm that iteratively assigns vertices to clusters based on the cluster assignments of neighboring vertices. Our method is algorithmically and conceptually simple, especially with respect to how problem-specific knowledge is incorporated into the model. Experimental results on the CoNLL 2008 benchmark dataset demonstrate that our model is competitive with other unsupervised approaches in terms of F1 whilst attaining significantly higher cluster purity.
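
The abstract does not spell out the update rule, so the following is only a minimal sketch of the general idea: a vertex repeatedly adopts the cluster label that its neighbors support with the most edge weight. The function name `propagate_labels`, the `weights` dictionary layout, the fixed visiting order, and the tie-breaking rule are all assumptions made for illustration; this is generic label propagation, not the authors' algorithm.

```python
from collections import defaultdict


def propagate_labels(weights, num_vertices, max_iters=20):
    """Cluster vertices by repeatedly adopting the best-supported neighboring label.

    weights: dict mapping (u, v) vertex pairs to a non-negative similarity.
    Returns a list whose entry i is the cluster label of vertex i.
    (Hypothetical helper; the update rule is an assumption, not taken from the paper.)
    """
    # Build a symmetric adjacency list from the edge-weight dictionary.
    neighbors = defaultdict(list)
    for (u, v), w in weights.items():
        neighbors[u].append((v, w))
        neighbors[v].append((u, w))

    # Start with every vertex in its own singleton cluster.
    labels = list(range(num_vertices))

    for _ in range(max_iters):
        changed = False
        # Visit vertices in a fixed order so the sketch is deterministic.
        for vertex in range(num_vertices):
            # Sum the edge weight contributed by each neighboring cluster.
            scores = defaultdict(float)
            for other, w in neighbors[vertex]:
                scores[labels[other]] += w
            if not scores:
                continue  # an isolated vertex keeps its own label
            # Pick the highest-scoring label; break ties by the smaller label.
            best = min(scores, key=lambda lab: (-scores[lab], lab))
            if best != labels[vertex]:
                labels[vertex] = best
                changed = True
        if not changed:
            break  # no vertex changed cluster, so the assignment is stable
    return labels


# Toy graph: two tightly connected triples joined by one weak edge.
toy_weights = {
    (0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0,
    (3, 4): 1.0, (3, 5): 1.0, (4, 5): 1.0,
    (2, 3): 0.1,
}
toy_labels = propagate_labels(toy_weights, 6)
```

On the toy graph, the weak edge between vertices 2 and 3 is outweighed by the strong edges inside each triple, so the two triples end up in separate clusters. In the setting the abstract describes, the vertices would be argument instances of one verb and the edge weights would come from a role-semantic similarity function, which is where problem-specific knowledge would enter.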


reference text

O. Abend and A. Rappoport. 2010. Fully unsupervised core-adjunct argument classification. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 226–236, Uppsala, Sweden.
O. Abend, R. Reichart, and A. Rappoport. 2009. Unsupervised Argument Identification for Semantic Role Labeling. In Proceedings of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, pages 28–36, Singapore.
S. Abney. 2007. Semisupervised Learning for Computational Linguistics. Chapman & Hall/CRC.
A. Alexandrescu and K. Kirchhoff. 2009. Graph-based learning for statistical machine translation. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 119–127, Boulder, Colorado.
C. Biemann. 2006. Chinese Whispers: an efficient graph clustering algorithm and its application to natural language processing problems. In Proceedings of TextGraphs: the First Workshop on Graph Based Methods for Natural Language Processing, pages 73–80, New York City.
C. Bishop. 2006. Pattern Recognition and Machine Learning. Springer.
Y. Boykov, O. Veksler, and R. Zabih. 2001. Fast Approximate Energy Minimization via Graph Cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11):1222–1239.
T. Brants and A. Franz. 2006. Web 1T 5-gram Version 1. Linguistic Data Consortium, Philadelphia.
Z. Chen and H. Ji. 2010. Graph-based clustering for computational linguistics: A survey. In Proceedings of TextGraphs-5 - 2010 Workshop on Graph-based Methods for Natural Language Processing, pages 1–9, Uppsala, Sweden.
C. Christodoulopoulos, S. Goldwater, and M. Steedman. 2010. Two decades of unsupervised POS induction: How far have we come? In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 575–584, Cambridge, MA.
D. Dowty. 1991. Thematic Proto-Roles and Argument Selection. Language, 67(3):547–619.
H. Fürstenau and M. Lapata. 2009. Graph Alignment for Semi-Supervised Semantic Role Labeling. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 11–20, Singapore.
D. Gildea and D. Jurafsky. 2002. Automatic Labeling of Semantic Roles. Computational Linguistics, 28(3):245–288.
A. Gordon and R. Swanson. 2007. Generalizing Semantic Role Annotations Across Syntactically Similar Verbs. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, pages 192–199, Prague, Czech Republic.
T. Grenager and C. Manning. 2006. Unsupervised Discovery of a Statistical Verb Lexicon. In Proceedings of the Conference on Empirical Methods on Natural Language Processing, pages 1–8, Sydney, Australia.
G. Haffari and A. Sarkar. 2007. Analysis of Semi-Supervised Learning with the Yarowsky Algorithm. In Proceedings of the 23rd Conference on Uncertainty in Artificial Intelligence, Vancouver, BC.
K. Kipper, H. T. Dang, and M. Palmer. 2000. Class-Based Construction of a Verb Lexicon. In Proceedings of the 17th AAAI Conference on Artificial Intelligence, pages 691–696. AAAI Press / The MIT Press.
I. Klapaftis and S. Manandhar. 2010. Word sense induction & disambiguation using hierarchical random graphs. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 745–755, Cambridge, MA.
J. Lang and M. Lapata. 2010. Unsupervised Induction of Semantic Roles. In Proceedings of the 11th Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 939–947, Los Angeles, California.
J. Lang and M. Lapata. 2011. Unsupervised Semantic Role Induction via Split-Merge Clustering. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, Portland, Oregon. To appear.
L. Màrquez, X. Carreras, K. Litkowski, and S. Stevenson. 2008. Semantic Role Labeling: an Introduction to the Special Issue. Computational Linguistics, 34(2):145–159.
G. Melli, Y. Wang, Y. Liu, M. M. Kashani, Z. Shi, B. Gu, A. Sarkar, and F. Popowich. 2005. Description of SQUASH, the SFU Question Answering Summary Handler for the DUC-2005 Summarization Task. In Proceedings of the Human Language Technology Conference and the Conference on Empirical Methods in Natural Language Processing Document Understanding Workshop, Vancouver, Canada.
J. Nivre, J. Hall, J. Nilsson, G. Eryigit, A. Chanev, S. Kübler, S. Marinov, and E. Marsi. 2007. MaltParser: A Language-independent System for Data-driven Dependency Parsing. Natural Language Engineering, 13(2):95–135.
S. Padó and M. Lapata. 2009. Cross-lingual Annotation Projection of Semantic Roles. Journal of Artificial Intelligence Research, 36:307–340.
M. Palmer, D. Gildea, and P. Kingsbury. 2005. The Proposition Bank: An Annotated Corpus of Semantic Roles. Computational Linguistics, 31(1):71–106.
S. Pradhan, W. Ward, and J. Martin. 2008. Towards Robust Semantic Role Labeling. Computational Linguistics, 34(2):289–310.
D. Ravichandran, P. Pantel, and E. Hovy. 2005. Randomized Algorithms and NLP: Using Locality Sensitive Hash Function for High Speed Noun Clustering. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pages 622–629, Ann Arbor, Michigan.
J. Ruppenhofer, M. Ellsworth, M. Petruck, C. Johnson, and J. Scheffczyk. 2006. FrameNet II: Extended Theory and Practice, version 1.3. Technical report, International Computer Science Institute, Berkeley, CA, USA.
S. Schaeffer. 2007. Graph clustering. Computer Science Review, 1(1):27–64.
D. Shen and M. Lapata. 2007. Using Semantic Roles to Improve Question Answering. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and the Conference on Computational Natural Language Learning, pages 12–21, Prague, Czech Republic.
A. Subramanya, S. Petrov, and F. Pereira. 2010. Efficient graph-based semi-supervised learning of structured tagging models. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 167–176, Cambridge, MA.
M. Surdeanu, S. Harabagiu, J. Williams, and P. Aarseth. 2003. Using Predicate-Argument Structures for Information Extraction. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, pages 8–15, Sapporo, Japan.
M. Surdeanu, R. Johansson, A. Meyers, and L. Màrquez. 2008. The CoNLL-2008 Shared Task on Joint Parsing of Syntactic and Semantic Dependencies. In Proceedings of the 12th CoNLL, pages 159–177, Manchester, England.
R. Swier and S. Stevenson. 2004. Unsupervised Semantic Role Labelling. In Proceedings of the Conference on Empirical Methods on Natural Language Processing, pages 95–102, Barcelona, Spain.
P. Talukdar and F. Pereira. 2010. Experiments in graph-based semi-supervised learning methods for class-instance acquisition. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 1473–1481, Uppsala, Sweden.
P. Talukdar. 2010. Graph-Based Weakly Supervised Methods for Information Extraction & Integration. Ph.D. thesis, CIS Department, University of Pennsylvania.
P. D. Turney and P. Pantel. 2010. From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research, 37:141–188.
M. Wainwright and M. Jordan. 2008. Graphical Models, Exponential Families, and Variational Inference. Foundations and Trends in Machine Learning, 1(1-2):1–305.
D. Wu and P. Fung. 2009. Semantic Roles for SMT: A Hybrid Two-Pass Model. In Proceedings of North American Annual Meeting of the Association for Computational Linguistics HLT 2009: Short Papers, pages 13–16, Boulder, Colorado.
D. Yarowsky. 1995. Unsupervised Word Sense Disambiguation Rivaling Supervised Methods. In Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics, pages 189–196, Cambridge, MA.
X. Zhu, Z. Ghahramani, and J. Lafferty. 2003. Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions. In Proceedings of the 20th International Conference on Machine Learning, Washington, DC.