acl acl2011 acl2011-121 acl2011-121-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Edward Benson ; Aria Haghighi ; Regina Barzilay
Abstract: We present a novel method for record extraction from social streams such as Twitter. Unlike typical extraction setups, these environments are characterized by short, one-sentence messages with heavily colloquial speech. To further complicate matters, individual messages may not express the full relation to be uncovered, as is often assumed in extraction tasks. We develop a graphical model that addresses these problems by learning a latent set of records and a record-message alignment simultaneously; the output of our model is a set of canonical records, the values of which are consistent with aligned messages. We demonstrate that our approach is able to accurately induce event records from Twitter messages, evaluated against events from a local city guide. Our method achieves significant error reduction over baseline methods.
Eugene Agichtein and Luis Gravano. 2000. Snowball: Extracting relations from large plain-text collections. In Proceedings of DL.
Razvan C. Bunescu and Raymond J. Mooney. 2007. Learning to extract relations from the web using minimal supervision. In Proceedings of the ACL.
J Eisenstein, B O’Connor, and N Smith. ... 2010. A latent variable model for geographic lexical variation. Proceedings of the 2010 . . . , Jan.
Takaaki Hasegawa, Satoshi Sekine, and Ralph Grishman. 2004. Discovering relations among named entities from large corpora. In Proceedings of ACL.
Robert W. Irving, Paul Leather, and Dan Gusfield. 1987. An efficient algorithm for the optimal stable marriage. J. ACM, 34:532–543, July.
John Lafferty, Andrew McCallum, and Fernando Pereira. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of International Conference of Machine Learning (ICML), pages 282–289.
P. Liang and D. Klein. 2007. Structured Bayesian nonparametric models with variational inference (tutorial). In Association for Computational Linguistics (ACL).
Gideon S. Mann and David Yarowsky. 2005. Multi-field information extraction and cross-document fusion. In Proceedings of the ACL.
Mike Mintz, Steven Bills, Rion Snow, and Dan Jurafsky. 2009a. Distant supervision for relation extraction without labeled data. In Proceedings of ACL/IJCNLP.
Mike Mintz, Steven Bills, Rion Snow, and Daniel Jurafsky. 2009b. Distant supervision for relation extraction without labeled data. In Proceedings of the ACL, pages 1003–1011.
A Ritter, C Cherry, and B Dolan. 2010. Unsupervised modeling of twitter conversations. Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 172–180.
Yusuke Shinyama and Satoshi Sekine. 2006. Preemptive information extraction using unrestricted relation discovery. In Proceedings of HLT/NAACL.
Roman Yangarber, Ralph Grishman, Pasi Tapanainen, and Silja Huttunen. 2000. Automatic acquisition of domain knowledge for information extraction. In Proceedings of COLING.
Limin Yao, Sebastian Riedel, and Andrew McCallum. 2010a. Collective cross-document relation extraction without labelled data. In Proceedings of the EMNLP, pages 1013–1023.
Limin Yao, Sebastian Riedel, and Andrew McCallum. 2010b. Cross-document relation extraction without labelled data. In Proceedings of EMNLP.
Jun Zhu, Zaiqing Nie, Xiaojing Liu, Bo Zhang, and Ji-Rong Wen. 2009. StatSnowball: a statistical approach to extracting entity relationships. In Proceedings of WWW.