
296 acl-2013-Recognizing Identical Events with Graph Kernels


Source: pdf

Author: Goran Glavaš ; Jan Šnajder

Abstract: Identifying news stories that discuss the same real-world events is important for news tracking and retrieval. Most existing approaches rely on the traditional vector space model. We propose an approach for recognizing identical real-world events based on a structured, event-oriented document representation. We structure documents as graphs of event mentions and use graph kernels to measure the similarity between document pairs. Our experiments indicate that the proposed graph-based approach can outperform the traditional vector space model, and is especially suitable for distinguishing between topically similar, yet non-identical events.
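The abstract does not say which graph kernel is used, so the following is only a minimal sketch of one common choice: a truncated random-walk kernel computed on the tensor product of two labeled graphs. The graph encoding (a dict of node labels plus a set of directed edges), the exact-label matching rule, and the names product_graph, walk_kernel, decay, and max_len are all assumptions made for illustration, not details taken from the paper.

```python
from itertools import product


def product_graph(g1, g2):
    """Build the tensor product of two labeled directed graphs.

    Each graph is a pair (labels, edges): labels maps node id -> label
    (here, a hypothetical event-mention lemma) and edges is a set of
    (source, target) pairs. Assumption: two nodes match only if their
    labels are identical.
    """
    labels1, edges1 = g1
    labels2, edges2 = g2
    # Product nodes are pairs of nodes that carry the same label.
    nodes = [(u, v) for u, v in product(labels1, labels2)
             if labels1[u] == labels2[v]]
    index = {node: i for i, node in enumerate(nodes)}
    adj = [[0] * len(nodes) for _ in nodes]
    # A product edge exists when both graphs have the corresponding edge.
    for u1, v1 in edges1:
        for u2, v2 in edges2:
            a, b = (u1, u2), (v1, v2)
            if a in index and b in index:
                adj[index[a]][index[b]] = 1
    return nodes, adj


def walk_kernel(g1, g2, decay=0.5, max_len=3):
    """Sum decay**k * (number of common walks of length k), k = 0..max_len."""
    nodes, adj = product_graph(g1, g2)
    n = len(nodes)
    counts = [1.0] * n      # walks of length 0 ending at each product node
    total = float(n)
    weight = 1.0
    for _ in range(max_len):
        weight *= decay
        # Extend every walk by one edge of the product graph.
        counts = [sum(counts[i] * adj[i][j] for i in range(n))
                  for j in range(n)]
        total += weight * sum(counts)
    return total


# Toy documents: event mentions as nodes, "happens before" as edges.
doc_a = ({0: "attack", 1: "kill", 2: "arrest"}, {(0, 1), (1, 2)})
doc_b = ({0: "attack", 1: "kill", 2: "trial"}, {(0, 1), (1, 2)})
doc_c = ({0: "election", 1: "vote"}, {(0, 1)})

sim_aa = walk_kernel(doc_a, doc_a)
sim_ab = walk_kernel(doc_a, doc_b)
sim_ac = walk_kernel(doc_a, doc_c)
```

In this toy setting the score falls as the two documents share fewer matching mentions and relations, which is the kind of structural signal a bag-of-words vector space model does not capture directly. A realistic system would need softer node matching and a normalized kernel; both are left out here.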


reference text

ACE. 2005. Evaluation of the detection and recognition of ACE: Entities, values, temporal expressions, relations, and events.
James Allan. 2002. Topic Detection and Tracking: Event-based Information Organization, volume 12. Kluwer Academic Pub.
James Allen. 1983. Maintaining knowledge about temporal intervals. Communications of the ACM, 26(11):832–843.
Martin Atkinson and Erik Van der Goot. 2009. Near real time information mining in multilingual news. In Proceedings of the 18th International Conference on World Wide Web, pages 1153–1154. ACM.
Cosmin Adrian Bejan and Sanda Harabagiu. 2008. A linguistic resource for discovering event structures and resolving event coreference. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008).
Cosmin Adrian Bejan and Sanda Harabagiu. 2010. Unsupervised event coreference resolution with rich linguistic features. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 1412–1422. Association for Computational Linguistics.
Steven Bethard. 2008. Finding Event, Temporal and Causal Structure in Text: A Machine Learning Approach. Ph.D. thesis, University of Colorado at Boulder.
Karsten Michael Borgwardt. 2007. Graph Kernels. Ph.D. thesis, Ludwig-Maximilians-Universität München.
Thorsten Brants, Francine Chen, and Ayman Farahat. 2003. A system for new event detection. In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 330–337. ACM.
Thomas Gärtner, Peter Flach, and Stefan Wrobel. 2003. On graph kernels: Hardness results and efficient alternatives. In Learning Theory and Kernel Machines, pages 129–143. Springer.
Goran Glavaš and Jan Šnajder. 2013. Exploring coreference uncertainty of generically extracted event mentions. In Proceedings of 14th International Conference on Intelligent Text Processing and Computational Linguistics, pages 408–422. Springer.
Richard Hammack, Wilfried Imrich, and Sandi Klavžar. 2011. Handbook of Product Graphs. Discrete Mathematics and Its Applications. CRC Press.
Vasileios Hatzivassiloglou, Luis Gravano, and Ankineedu Maganti. 2000. An investigation of linguistic features and clustering algorithms for topical document clustering. In Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 224–231. ACM.
Giridhar Kumaran and James Allan. 2004. Text classification and named entities for new event detection. In Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 297–304. ACM.
Giridhar Kumaran and James Allan. 2005. Using names and topics for new event detection. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 121–128. Association for Computational Linguistics.
Heeyoung Lee, Marta Recasens, Angel Chang, Mihai Surdeanu, and Dan Jurafsky. 2012. Joint entity and event coreference resolution across documents. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 489–500. Association for Computational Linguistics.
Pierre Mahé, Nobuhisa Ueda, Tatsuya Akutsu, Jean-Luc Perret, and Jean-Philippe Vert. 2005. Graph kernels for molecular structure-activity relationship analysis with support vector machines. Journal of Chemical Information and Modeling, 45(4):939–951.
Juha Makkonen, Helena Ahonen-Myka, and Marko Salmenkivi. 2004. Simple semantics in topic detection and tracking. Information Retrieval, 7(3):347–368.
Sauro Menchetti, Fabrizio Costa, and Paolo Frasconi. 2005. Weighted decomposition kernels. In Proceedings of the 22nd International Conference on Machine Learning, pages 585–592. ACM.
James Pustejovsky, José Castano, Robert Ingria, Roser Sauri, Robert Gaizauskas, Andrea Setzer, Graham Katz, and Dragomir Radev. 2003a. TimeML: Robust specification of event and temporal expressions in text. New Directions in Question Answering, 3:28–34.
James Pustejovsky, Patrick Hanks, Roser Sauri, Andrew See, Robert Gaizauskas, Andrea Setzer, Dragomir Radev, Beth Sundheim, David Day, Lisa Ferro, et al. 2003b. The TimeBank corpus. In Corpus Linguistics, volume 2003, page 40.
Gerard Salton, Anita Wong, and Chung-Shu Yang. 1975. A vector space model for automatic indexing. Communications of the ACM, 18(11):613–620.
Ralf Steinberger, Bruno Pouliquen, and Erik Van Der Goot. 2009. An introduction to the European Media Monitor family of applications. In Proceedings of the Information Access in a Multilingual World-Proceedings of the SIGIR 2009 Workshop, pages 1–8.
Marc Verhagen, Robert Gaizauskas, Frank Schilder, Mark Hepple, Graham Katz, and James Pustejovsky. 2007. Semeval-2007 Task 15: TempEval temporal relation identification. In Proceedings of the 4th International Workshop on Semantic Evaluations, pages 75–80. Association for Computational Linguistics.
Marc Verhagen, Roser Sauri, Tommaso Caselli, and James Pustejovsky. 2010. Semeval-2010 Task 13: TempEval-2. In Proceedings of the 5th International Workshop on Semantic Evaluation, pages 57–62. Association for Computational Linguistics.
Charles Wayne. 2000. Multilingual topic detection and tracking: Successful research enabled by corpora and evaluation. In Proceedings of the Second International Conference on Language Resources and Evaluation Conference (LREC 2000), volume 2000, pages 1487–1494.
Yiming Yang, Jaime G Carbonell, Ralf D Brown, Thomas Pierce, Brian T Archibald, and Xin Liu. 1999. Learning approaches for detecting and tracking news events. Intelligent Systems and their Applications, IEEE, 14(4):32–43.
Alexander Yeh. 2000. More accurate tests for the statistical significance of result differences. In Proceedings of the 18th Conference on Computational Linguistics, pages 947–953. Association for Computational Linguistics.