
111 acl-2010-Extracting Sequences from the Web

Source: pdf

Author: Anthony Fader ; Stephen Soderland ; Oren Etzioni

Abstract: Classical Information Extraction (IE) systems fill slots in domain-specific frames. This paper reports on SEQ, a novel open IE system that leverages a domain-independent frame to extract ordered sequences such as presidents of the United States or the most common causes of death in the U.S. SEQ leverages regularities about sequences to extract a coherent set of sequences from Web text. SEQ nearly doubles the area under the precision-recall curve compared to an extractor that does not exploit these regularities.

reference text

Michele Banko and Oren Etzioni. 2008. The tradeoffs between open and traditional relation extraction. In Proceedings of ACL-08: HLT, pages 28–36.

Michele Banko, Michael J. Cafarella, Stephen Soderland, Matthew Broadhead, and Oren Etzioni. 2007. Open information extraction from the web. In IJCAI, pages 2670–2676.

H. Chieu, H. Ng, and Y. Lee. 2003. Closing the gap: Learning-based information extraction rivaling knowledge-engineering methods. In ACL, pages 216–223.

William W. Cohen, Matthew Hurst, and Lee S. Jensen. 2002. A flexible learning system for wrapping tables and lists in HTML documents. In International World Wide Web Conference, pages 232–241.

Doug Downey, Oren Etzioni, and Stephen Soderland. 2005. A probabilistic model of redundancy in information extraction. In IJCAI, pages 1034–1041.

O. Etzioni, M. Cafarella, D. Downey, A. Popescu, T. Shaked, S. Soderland, D. Weld, and A. Yates. 2004. Methods for domain-independent information extraction from the Web: An experimental comparison. In Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI-2004), pages 391–398.

Oren Etzioni, Michael Cafarella, Doug Downey, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, and Alexander Yates. 2005. Unsupervised named-entity extraction from the web: An experimental study. Artificial Intelligence, 165:91–134.

D. Freitag. 2000. Machine learning for information extraction in informal domains. Machine Learning, 39(2-3):169–202.

Marti A. Hearst. 1992. Automatic acquisition of hyponyms from large text corpora. In COLING, pages 539–545.

Satoshi Sekine. 2006. On-demand information extraction. In Proceedings of the COLING/ACL on Main conference poster sessions, pages 731–738, Morristown, NJ, USA. Association for Computational Linguistics.