110 emnlp-2011-Ranking Human and Machine Summarization Systems


Source: pdf

Author: Peter Rankel ; John Conroy ; Eric Slud ; Dianne O'Leary

Abstract: The Text Analysis Conference (TAC) ranks summarization systems by their average score over a collection of document sets. We investigate the statistical appropriateness of this score and propose an alternative that better distinguishes between human and machine evaluation systems.


reference text

Peter J. Bickel and Jian-Jian Ren. 2001. The Bootstrap in Hypothesis Testing. In State of the Art in Statistics and Probability Theory, Festschrift for Willem R. van Zwet, volume 36 of Lecture Notes–Monograph Series, pages 91–112. Institute of Mathematical Statistics.

John M. Conroy and Hoa Trang Dang. 2008. Mind the Gap: Dangers of Divorcing Evaluations of Summary Content from Linguistic Quality. In Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1, COLING ’08, pages 145–152, Stroudsburg, PA, USA. Association for Computational Linguistics.

Hoa T. Dang and Karolina Owczarzak. 2008. Overview of the TAC 2008 Update Summarization Task. In Proceedings of the 1st Text Analysis Conference (TAC), Gaithersburg, Maryland, USA.

B. Efron and R. J. Tibshirani. 1993. An Introduction to the Bootstrap. Chapman & Hall, New York.

Chin-Yew Lin and Eduard Hovy. 2003. Automatic Evaluation of Summaries Using N-gram Co-Occurrence Statistics. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics, Edmonton, Alberta.

National Institute of Standards and Technology. 2010. Text Analysis Conference, http://www.nist.gov/tac.

Ani Nenkova, Rebecca Passonneau, and Kathleen McKeown. 2007. The Pyramid Method: Incorporating Human Content Selection Variation in Summarization Evaluation. ACM Transactions on Speech and Language Processing, 4(2).

Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: a Method for Automatic Evaluation of Machine Translation. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, ACL ’02, pages 311–318, Stroudsburg, PA, USA. Association for Computational Linguistics.

R. H. Randles and D. A. Wolfe. 1979. Introduction to the Theory of Nonparametric Statistics. Wiley Series in Probability and Mathematical Statistics. Wiley.

Alan Turing. 1950. Computing Machinery and Intelligence. Mind, 59(236):433–460.