
57 emnlp-2011-Extreme Extraction - Machine Reading in a Week



Author: Marjorie Freedman ; Lance Ramshaw ; Elizabeth Boschee ; Ryan Gabbard ; Gary Kratkiewicz ; Nicolas Ward ; Ralph Weischedel

Abstract: We report on empirical results in extreme extraction. The task is extreme in that (1) development is limited to one week from receipt of the ontology specifying the target concepts and relations, and (2) relatively little training data is assumed. We are able to surpass human recall and achieve an F1 of 0.51 on a question-answering task with less than 50 hours of effort, using a hybrid approach that mixes active learning, bootstrapping, and limited (5 hours) manual rule writing. We compare the performance of three systems: extraction with handwritten rules, bootstrapped extraction, and a combination of the two. We show that while the recall of the handwritten rules surpasses that of the learned system, the learned system is able to improve the overall recall and F1.


reference text

E. Agichtein and L. Gravano. Snowball: Extracting relations from large plain-text collections. In Proceedings of the ACM Conference on Digital Libraries, pp. 85-94, 2000.

A. Blum and T. Mitchell. Combining Labeled and Unlabeled Data with Co-Training. In Proceedings of the 1998 Conference on Computational Learning Theory, July 1998.

E. Boschee, V. Punyakanok, and R. Weischedel. An Exploratory Study Towards 'Machines that Learn to Read'. In Proceedings of AAAI BICA Fall Symposium, November 2008.

J. Chen, D. Ji, C. Tan, and Z. Niu. Relation extraction using label propagation based semi-supervised learning. COLING-ACL 2006, pp. 129-136, July 2006.

M. Freedman, E. Loper, E. Boschee, and R. Weischedel. Empirical Studies in Learning to Read. In Proceedings of NAACL 2010 Workshop on Formalisms and Methodology for Learning by Reading, pp. 61-69, June 2010.

W. Li and A. McCallum. Rapid development of Hindi named entity recognition using conditional random fields and feature induction. Transactions on Asian Language Information Processing (TALIP), Volume 2, Issue 3, September 2003.

R. Grishman and B. Sundheim. Message Understanding Conference-6: A Brief History