
17 emnlp-2011-Active Learning with Amazon Mechanical Turk


Source: pdf

Author: Florian Laws; Christian Scheible; Hinrich Schütze

Abstract: Supervised classification needs large amounts of annotated training data that is expensive to create. Two approaches that reduce the cost of annotation are active learning and crowdsourcing. However, these two approaches have not been combined successfully to date. We evaluate the utility of active learning in crowdsourcing on two tasks, named entity recognition and sentiment detection, and show that active learning outperforms random selection of annotation examples in a noisy crowdsourcing scenario.
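
To make the setup concrete, below is a minimal sketch of pool-based active learning with uncertainty sampling, compared against random selection, under simulated noisy crowd labels. It assumes scikit-learn and NumPy; the synthetic dataset, the noisy_label helper, the 10% flip probability, and the batch/round sizes are all illustrative assumptions, not the paper's actual experimental setup (the paper works on named entity recognition and sentiment detection with real Mechanical Turk annotations).

    # Sketch: uncertainty sampling vs. random selection with noisy labels.
    # Dataset, noise model, and hyperparameters are illustrative only.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X, y_true = make_classification(n_samples=2000, n_features=20, random_state=0)

    def noisy_label(i, flip_prob=0.1):
        # Simulate a crowd worker: flip the gold label with probability flip_prob.
        # (Hypothetical noise model; the paper uses real Mechanical Turk workers.)
        y = y_true[i]
        return 1 - y if rng.random() < flip_prob else y

    def run(strategy, seed_size=20, batch=10, rounds=20):
        labeled = list(range(seed_size))
        labels = [noisy_label(i) for i in labeled]
        pool = set(range(seed_size, len(X)))
        clf = LogisticRegression(max_iter=1000)
        for _ in range(rounds):
            clf.fit(X[labeled], labels)
            candidates = np.array(sorted(pool))
            if strategy == "uncertainty":
                # Query the pool examples whose posterior is closest to 0.5,
                # i.e. the ones the current model is least certain about.
                probs = clf.predict_proba(X[candidates])[:, 1]
                picks = candidates[np.argsort(np.abs(probs - 0.5))[:batch]]
            else:
                # Baseline: draw the next batch uniformly at random.
                picks = rng.choice(candidates, size=batch, replace=False)
            for i in picks:
                labeled.append(i)
                labels.append(noisy_label(i))
                pool.remove(i)
        clf.fit(X[labeled], labels)  # refit on the final labeled set
        return clf.score(X, y_true)  # accuracy against the gold labels

    print("uncertainty:", run("uncertainty"))
    print("random     :", run("random"))

Logistic regression stands in here for a generic probabilistic classifier; the comparison of interest is the selection strategy, with both conditions receiving the same noisy labeling process.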


Reference text

Jason Baldridge and Alexis Palmer. 2009. How well does active learning actually work? Time-based evaluation of cost-reduction strategies for language documentation. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 296–305.
Anthony Brew, Derek Greene, and Pádraig Cunningham. 2010. Using crowdsourcing and active learning to track sentiment in online media. In Proceedings of the 19th European Conference on Artificial Intelligence (ECAI 2010), pages 145–150.
Klaus Brinker. 2003. Incorporating diversity in active learning with support vector machines. In Proceedings of the Twentieth International Conference on Machine Learning (ICML 2003), pages 59–66.
Chris Callison-Burch. 2009. Fast, cheap, and creative: evaluating translation quality using Amazon's Mechanical Turk. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 286–295.
Bob Carpenter and Massimo Poesio. 2010. Models of data annotation. Tutorial at the Seventh International Conference on Language Resources and Evaluation (LREC 2010).
Nancy Chinchor, David D. Lewis, and Lynette Hirschman. 1993. Evaluating message understanding systems: an analysis of the Third Message Understanding Conference (MUC-3). Computational Linguistics, 19(3):409–449.
Pinar Donmez and Jaime G. Carbonell. 2008. Proactive learning: cost-sensitive active learning with multiple imperfect oracles. In Proceedings of the 17th ACM Conference on Information and Knowledge Management, pages 619–628.
Pinar Donmez, Jaime G. Carbonell, and Jeff Schneider. 2009. Efficiently learning the accuracy of labeling sources for selective sampling. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 259–268.
Tim Finin, William Murnane, Anand Karandikar, Nicholas Keller, Justin Martineau, and Mark Dredze. 2010. Annotating named entities in Twitter data with crowdsourcing. In Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, pages 80–88.
Ben Hachey, Beatrice Alex, and Markus Becker. 2005. Investigating the effects of selective sampling on the annotation task. In Proceedings of the 9th Conference on Computational Natural Language Learning (CoNLL 2005), pages 144–151.
Robbie Haertel, Paul Felt, Eric K. Ringger, and Kevin Seppi. 2010. Parallel active learning: eliminating wait time with minimal staleness. In Proceedings of the NAACL HLT 2010 Workshop on Active Learning for Natural Language Processing, pages 33–41.
Panagiotis G. Ipeirotis, Foster Provost, and Jing Wang. 2010. Quality management on Amazon Mechanical Turk. In Proceedings of the ACM SIGKDD Workshop on Human Computation (HCOMP 2010).
Nolan Lawson, Kevin Eustice, Mike Perkowitz, and Meliha Yetisgen-Yildiz. 2010. Annotating large email datasets for named entity recognition with Mechanical Turk. In Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, pages 71–79.
David D. Lewis and William A. Gale. 1994. A sequential algorithm for training text classifiers. In Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 3–12.
Christopher Manning and Dan Klein. 2003. Optimization, maxent models, and conditional estimation without magic. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology: Tutorials - Volume 5, pages 8–8.
Eric W. Noreen. 1989. Computer-Intensive Methods for Testing Hypotheses: An Introduction. Wiley.
Miles Osborne and Jason Baldridge. 2004. Ensemble-based active learning for parse selection. In Susan Dumais, Daniel Marcu, and Salim Roukos, editors, HLT-NAACL 2004: Main Proceedings, pages 89–96.
Sebastian Padó. 2006. User's guide to sigf: Significance testing by approximate randomisation.
Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up? Sentiment classification using machine learning techniques. In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 79–86.
Ines Rehbein, Josef Ruppenhofer, and Alexis Palmer. 2010. Bringing active learning to life. In Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), pages 949–957.
Eric Ringger, Peter McClanahan, Robbie Haertel, George Busby, Marc Carmen, James Carroll, Kevin Seppi, and Deryle Lonsdale. 2007. Active learning for part-of-speech tagging: accelerating corpus annotation. In Proceedings of the Linguistic Annotation Workshop at ACL 2007, pages 101–108.
Andrew Schein and Lyle Ungar. 2007. Active learning for logistic regression: an evaluation. Machine Learning, 68(3):235–265.
Burr Settles, Mark Craven, and Lewis Friedland. 2008. Active learning with real annotation costs. In Proceedings of the NIPS Workshop on Cost-Sensitive Learning, pages 1069–1078.
Rion Snow, Brendan O'Connor, Daniel Jurafsky, and Andrew Ng. 2008. Cheap and fast – but is it good? Evaluating non-expert annotations for natural language tasks. In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pages 254–263.
Erik F. Tjong Kim Sang and Fien De Meulder. 2003. Introduction to the CoNLL-2003 shared task: language-independent named entity recognition. In Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL (CoNLL 2003), pages 142–147.
Katrin Tomanek, Joachim Wermter, and Udo Hahn. 2007. An approach to text corpus construction which cuts annotation costs and maintains reusability of annotated data. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 486–495.
Simon Tong and Daphne Koller. 2002. Support vector machine active learning with applications to text classification. The Journal of Machine Learning Research, 2:45–66.
Robert Voyer, Valerie Nygaard, Will Fitzgerald, and Hannah Copperman. 2010. A hybrid model for annotating named entity training corpora. In Proceedings of the Fourth Linguistic Annotation Workshop, pages 243–246.
Theresa Wilson, Janyce Wiebe, and Paul Hoffmann. 2005. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 347–354.