
Bootstrapping via Graph Propagation (ACL 2012)



Authors: Max Whitney; Anoop Sarkar

Abstract: Bootstrapping a classifier from a small set of seed rules can be viewed as the propagation of labels between examples via the features they share. This paper introduces a novel variant of the Yarowsky algorithm based on this view. It is a bootstrapping learning method that uses a graph propagation algorithm with a well-defined objective function. The experimental results show that the proposed bootstrapping algorithm matches or exceeds state-of-the-art performance on several different natural language data sets.
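The view described in the abstract, labels spreading between examples through shared features, can be illustrated with a small sketch. This is not the algorithm of the paper: it is a plain averaging scheme on a bipartite example-feature graph, chosen only to make the idea concrete. The function names (`propagate`, `average`), the toy word-sense data, and the fixed iteration count are all assumptions introduced for illustration.

```python
from collections import defaultdict


def average(dists, labels):
    """Element-wise mean of a non-empty list of label distributions."""
    return {y: sum(d[y] for d in dists) / len(dists) for y in labels}


def propagate(examples, seeds, labels, iterations=10):
    """Toy label propagation on a bipartite example-feature graph.

    examples: dict mapping example id -> set of features
    seeds:    dict mapping feature -> label (the seed rules)
    Returns a dict mapping example id -> label distribution.
    """
    uniform = {y: 1.0 / len(labels) for y in labels}
    # Seed features start as one-hot distributions and stay clamped.
    feat_dist = {
        f: {y: 1.0 if y == label else 0.0 for y in labels}
        for f, label in seeds.items()
    }
    ex_dist = {}
    for _ in range(iterations):
        # Step 1: each example averages the distributions of its known features.
        for x, feats in examples.items():
            known = [feat_dist[f] for f in feats if f in feat_dist]
            ex_dist[x] = average(known, labels) if known else dict(uniform)
        # Step 2: each non-seed feature averages the examples it occurs in.
        by_feature = defaultdict(list)
        for x, feats in examples.items():
            for f in feats:
                by_feature[f].append(ex_dist[x])
        for f, dists in by_feature.items():
            if f not in seeds:
                feat_dist[f] = average(dists, labels)
    return ex_dist


# Hypothetical toy data: two senses of "plant", two seed rules.
labels = ["living", "factory"]
seeds = {"life": "living", "manufacturing": "factory"}
examples = {
    "e1": {"life", "animal"},
    "e2": {"manufacturing", "equipment"},
    "e3": {"animal", "species"},    # no seed feature, reached via "animal"
    "e4": {"equipment", "worker"},  # no seed feature, reached via "equipment"
}
result = propagate(examples, seeds, labels)
for x in sorted(result):
    print(x, max(result[x], key=result[x].get))
```

In this toy run, `e3` and `e4` contain no seed feature, yet each receives a label because it shares a feature with an example that a seed rule covers.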


reference text

S. Abney. 2004. Understanding the Yarowsky algorithm. Computational Linguistics, 30(3).

Eugene Agichtein and Luis Gravano. 2000. Snowball: Extracting relations from large plain-text collections. In Proceedings of the Fifth ACM International Conference on Digital Libraries, DL ’00.

A. Blum and S. Chawla. 2001. Learning from labeled and unlabeled data using graph mincuts. In Proc. 19th International Conference on Machine Learning (ICML-2001).

A. Blum and T. Mitchell. 1998. Combining labeled and unlabeled data with co-training. In Proceedings of Computational Learning Theory.

Michael Collins and Yoram Singer. 1999. Unsupervised models for named entity classification. In EMNLP 1999: Proceedings of the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, pages 100–110.

Michael Collins. 1997. Three generative, lexicalised models for statistical parsing. In Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics, pages 16–23, Madrid, Spain, July. Association for Computational Linguistics.

Hal Daume. 2011. Seeding, transduction, out-of-sample error and the Microsoft approach... Blog post at http://nlpers.blogspot.com/2011/04/seedingtransduction-out-of-sample.html, April 6.

Jason Eisner and Damianos Karakos. 2005. Bootstrapping without the boot. In Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, pages 395–402, Vancouver, British Columbia, Canada, October. Association for Computational Linguistics.

Gholamreza Haffari and Anoop Sarkar. 2007. Analysis of semi-supervised learning with the Yarowsky algorithm. In UAI 2007, Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence, Vancouver, BC, Canada, pages 159–166.

Aria Haghighi and Dan Klein. 2006a. Prototype-driven grammar induction. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, pages 881–888, Sydney, Australia, July. Association for Computational Linguistics.

Aria Haghighi and Dan Klein. 2006b. Prototype-driven learning for sequence models. In Proceedings of the Human Language Technology Conference of the NAACL, Main Conference, pages 320–327, New York City, USA, June. Association for Computational Linguistics.

Marti A. Hearst. 1992. Automatic acquisition of hyponyms from large text corpora. In Proceedings of the 14th conference on Computational linguistics - Volume 2, COLING ’92, pages 539–545, Stroudsburg, PA, USA. Association for Computational Linguistics.

John D. Lafferty, Andrew McCallum, and Fernando C. N. Pereira. 2001. Conditional random fields: probabilistic models for segmenting and labeling sequence data. In Proceedings of the Eighteenth International Conference on Machine Learning, ICML ’01, pages 282–289, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.

K. Nigam, A. McCallum, S. Thrun, and T. Mitchell. 2000. Text classification from labeled and unlabeled documents using EM. Machine Learning, 30(3).

Ellen Riloff and Jessica Shepherd. 1997. A corpus-based approach for building semantic lexicons. In Proceedings of the Second Conference on Empirical Methods in Natural Language Processing, pages 117–124.

H. J. Scudder. 1965. Probability of error of some adaptive pattern-recognition machines. IEEE Transactions on Information Theory, 11:363–371.

Amarnag Subramanya, Slav Petrov, and Fernando Pereira. 2010. Efficient graph-based semi-supervised learning of structured tagging models. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 167–176, Cambridge, MA, October. Association for Computational Linguistics.

Partha Pratim Talukdar. 2010. Graph-based weakly-supervised methods for information extraction & integration. Ph.D. thesis, University of Pennsylvania. Software: https://github.com/parthatalukdar/junto.

X. Zhu, Z. Ghahramani, and J. Lafferty. 2003. Semi-supervised learning using Gaussian fields and harmonic functions. In Proceedings of International Conference on Machine Learning.

David Yarowsky. 1995. Unsupervised word sense disambiguation rivaling supervised methods. In Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics, pages 189–196, Cambridge, Massachusetts, USA, June. Association for Computational Linguistics.