Authors: Katja Markert; Yufang Hou; Michael Strube
Abstract: Previous work on classifying information status (Nissim, 2006; Rahman and Ng, 2011) is restricted to coarse-grained classification and focuses on conversational dialogue. We here introduce the task of classifying fine-grained information status and work on written text. We add a fine-grained information status layer to the Wall Street Journal portion of the OntoNotes corpus. We claim that the information status of a mention depends not only on the mention itself but also on other mentions in the vicinity, and we solve the task by collectively classifying the information status of all mentions. Our approach strongly outperforms reimplementations of previous work.
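The claim that a mention's information status also depends on nearby mentions can be illustrated with a toy iterative-classification loop, one common form of collective classification. This is only a sketch and not the authors' model: the function name `collective_classify`, the labels `old` and `new`, the local scores, and the `compat` bonus table are all hypothetical values chosen for illustration.

```python
def collective_classify(local, neighbours, compat, rounds=10):
    """Toy iterative classification (illustrative only).

    local      -- list of dicts: per-mention score for each label
    neighbours -- list of lists: indices of mentions in the vicinity
    compat     -- dict mapping (label, neighbour_label) to a bonus score
    """
    # Start from the purely local decision for every mention.
    labels = [max(scores, key=scores.get) for scores in local]

    for _ in range(rounds):
        changed = False
        for i, scores in enumerate(local):
            # Score each label by local evidence plus compatibility
            # with the labels currently assigned to the neighbours.
            totals = {
                lab: scores[lab]
                + sum(compat.get((lab, labels[j]), 0.0) for j in neighbours[i])
                for lab in scores
            }
            best = max(totals, key=totals.get)
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:  # stop once no label changes in a full pass
            break
    return labels


# Hypothetical example: mention 0 is confidently "old"; mention 1 is
# ambiguous on its own but sits next to mention 0.
local = [{"old": 0.9, "new": 0.1}, {"old": 0.45, "new": 0.55}]
neighbours = [[1], [0]]
compat = {("old", "old"): 0.3}  # assumed bonus for adjacent "old" mentions

print(collective_classify(local, neighbours, compat))
```

With an empty `compat` table the loop reduces to independent per-mention classification, which is the kind of baseline the abstract contrasts with the collective approach.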
References:
Ron Artstein and Massimo Poesio. 2008. Inter-coder agreement for computational linguistics. Computational Linguistics, 34(4):555–596.
Regina Barzilay and Mirella Lapata. 2008. Modeling local coherence: An entity-based approach. Computational Linguistics, 34(1):1–34.
Betty J. Birner and Gregory Ward. 1998. Information Status and Noncanonical Word Order in English. John Benjamins, Amsterdam, The Netherlands.
Aoife Cahill and Arndt Riester. 2009. Incorporating information status into generation ranking. In Proceedings of the Joint Conference of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing, Singapore, 2–7 August 2009, pages 817–825.
Jie Cai, Éva Mújdricza-Maydt, and Michael Strube. 2011. Unrestricted coreference resolution via global hypergraph partitioning. In Proceedings of the Shared Task of the 15th Conference on Computational Natural Language Learning, Portland, Oreg., 23–24 June 2011, pages 56–60.
Jean Carletta. 1996. Assessing agreement on classification tasks: The kappa statistic. Computational Linguistics, 22(2):249–254.
Michael Collins and Nigel Duffy. 2001. Convolution kernels for natural language. In Advances in Neural Information Processing Systems 14, Vancouver, B.C., Canada, 3–8 December 2001, pages 625–632, Cambridge, Mass. MIT Press.
Pascal Denis and Jason Baldridge. 2007. Joint determination of anaphoricity and coreference resolution using integer programming. In Proceedings of Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, Rochester, N.Y., 22–27 April 2007, pages 236–243.
Katja Filippova and Michael Strube. 2007. Generating constituent order in German clauses. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, 23–30 June 2007, pages 320–327.
Claire Gardent and Hélène Manuélian. 2005. Création d’un corpus annoté pour le traitement des descriptions définies. Traitement Automatique des Langues, 46(1):115–140.
Barbara J. Grosz, Aravind K. Joshi, and Scott Weinstein. 1995. Centering: A framework for modeling the local coherence of discourse. Computational Linguistics, 21(2):203–225.
Iørn Korzen and Matthias Buch-Kromann. 2011. Anaphoric relations in the Copenhagen dependency treebanks. In S. Dipper and H. Zinsmeister, editors, Corpus-based Investigations of Pragmatic and Discourse Phenomena, volume 3 of Bochumer Linguistische Arbeitsberichte, pages 83–98. University of Bochum, Bochum, Germany.
Ivana Kruijff-Korbayová and Mark Steedman. 2003. Discourse and information structure. Journal of Logic, Language and Information. Special Issue on Discourse and Information Structure, 12(3):149–259.
Knud Lambrecht. 1994. Information Structure and Sentence Form. Cambridge, U.K.: Cambridge University Press.
Qing Lu and Lise Getoor. 2003. Link-based classification. In Proceedings of the 20th International Conference on Machine Learning, Washington, D.C., 21–24 August 2003, pages 496–503.
Sofus A. Macskassy and Foster Provost. 2007. Classification in networked data: A toolkit and a univariate case study. Journal of Machine Learning Research, 8:935–983.
Katja Markert and Malvina Nissim. 2005. Comparing knowledge sources for nominal anaphora resolution. Computational Linguistics, 31(3):367–401.
Josef Meyer and Robert Dale. 2002. Mining a corpus to support associative anaphora resolution. In Proceedings of the 4th International Conference on Discourse Anaphora and Anaphor Resolution, Lisbon, Portugal, 18–20 September 2002.
Natalia M. Modjeska, Katja Markert, and Malvina Nissim. 2003. Using the web in machine learning for other-anaphora resolution. In Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, Sapporo, Japan, 11–12 July 2003, pages 176–183.
Ani Nenkova, Jason Brenier, Anubha Kothari, Sasha Calhoun, Laura Whitton, David Beaver, and Dan Jurafsky. 2007. To memorize or to predict: Prominence labeling in conversational speech. In Proceedings of Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, Rochester, N.Y., 22–27 April 2007, pages 9–16.
Vincent Ng. 2009. Graph-cut-based anaphoricity determination for coreference resolution. In Proceedings of Human Language Technologies 2009: The Conference of the North American Chapter of the Association for Computational Linguistics, Boulder, Col., 31 May – 5 June 2009, pages 575–583.
Malvina Nissim, Shipara Dingare, Jean Carletta, and Mark Steedman. 2004. An annotation scheme for information status in dialogue. In Proceedings of the 4th International Conference on Language Resources and Evaluation, Lisbon, Portugal, 26–28 May 2004, pages 1023–1026.
Malvina Nissim. 2006. Learning information status of discourse entities. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, Sydney, Australia, 22–23 July 2006, pages 94–102.
Bo Pang and Lillian Lee. 2004. A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, Barcelona, Spain, 21–26 July 2004, pages 272–279.
Massimo Poesio, Rahul Mehta, Axel Maroudas, and Janet Hitzeman. 2004. Learning to resolve bridging references. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, Barcelona, Spain, 21–26 July 2004, pages 143–150.
Massimo Poesio. 2004. The MATE/GNOME proposals for anaphoric annotation, revisited. In Proceedings of the 5th SIGdial Workshop on Discourse and Dialogue, Cambridge, Mass., 30 April – 1 May 2004, pages 154–162.
Scott Prevost. 1996. An information structural approach to spoken language generation. In Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics, Santa Cruz, Cal., 24–27 June 1996, pages 294–301.
Ellen F. Prince. 1981. Towards a taxonomy of given-new information. In P. Cole, editor, Radical Pragmatics, pages 223–255. Academic Press, New York, N.Y.
Ellen F. Prince. 1992. The ZPG letter: Subjects, definiteness, and information-status. In W.C. Mann and S.A. Thompson, editors, Discourse Description. Diverse Linguistic Analyses of a Fund-Raising Text, pages 295–325. John Benjamins, Amsterdam.
Altaf Rahman and Vincent Ng. 2011. Learning the information status of noun phrases in spoken dialogues. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, Edinburgh, Scotland, U.K., 27–29 July 2011, pages 1069–1080.
Arndt Riester, David Lorenz, and Nina Seemann. 2010. A recursive annotation scheme for referential information status. In Proceedings of the 7th International Conference on Language Resources and Evaluation, La Valetta, Malta, 17–23 May 2010, pages 717–722.
Julia Ritz, Stefanie Dipper, and Michael Götze. 2008. Annotation of information structure: An evaluation across different types of texts. In Proceedings of the 6th International Conference on Language Resources and Evaluation, Marrakech, Morocco, 26 May – 1 June 2008, pages 2137–2142.
Ryohei Sasano and Sadao Kurohashi. 2009. A probabilistic model for associative anaphora resolution. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6–7 August 2009, pages 1455–1464.
Advaith Siddharthan, Ani Nenkova, and Kathleen McKeown. 2011. Information status distinctions and referring expressions: An empirical study of references to people in news summaries. Computational Linguistics, 37(4):811–842.
Swapna Somasundaran, Galileo Namata, Janyce Wiebe, and Lise Getoor. 2009. Supervised and unsupervised methods in employing discourse relations for improving opinion polarity classification. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6–7 August 2009.
Ben Taskar, Pieter Abbeel, and Daphne Koller. 2002. Discriminative probabilistic models for relational data. In Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence, Edmonton, Alberta, Canada, 1–4 August 2002, pages 485–492.
Renata Vieira and Massimo Poesio. 2000. An empirically-based system for processing definite descriptions. Computational Linguistics, 26(4):539–593.
Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, Mohammed El-Bachouti, Robert Belvin, and Ann Houston. 2011. OntoNotes release 4.0. LDC2011T03, Philadelphia, Penn.: Linguistic Data Consortium.
Yiming Yang, Seán Slattery, and Rayid Ghani. 2002. A study of approaches to hypertext categorization. Journal of Intelligent Information Systems, 18(2–3):219–241.
Guodong Zhou and Fang Kong. 2009. Global learning of noun phrase anaphoricity in coreference resolution via label propagation. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6–7 August 2009, pages 978–986.