
60 acl-2013-Automatic Coupling of Answer Extraction and Information Retrieval

Source: pdf

Author: Xuchen Yao ; Benjamin Van Durme ; Peter Clark

Abstract: Information Retrieval (IR) and Answer Extraction are often designed as isolated or loosely connected components in Question Answering (QA), with repeated overengineering on IR, and not necessarily performance gain for QA. We propose to tightly integrate them by coupling automatically learned features for answer extraction to a shallow-structured IR model. Our method is very quick to implement, and significantly improves IR for QA (measured in Mean Average Precision and Mean Reciprocal Rank) by 10%-20% against an uncoupled retrieval baseline in both document and passage retrieval, which further leads to a downstream 20% improvement in QA F1.
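
The abstract reports retrieval gains in Mean Average Precision (MAP) and Mean Reciprocal Rank (MRR). As a reminder of what those two numbers measure, here is a minimal sketch, not the authors' evaluation code. It assumes each query's result list is given as booleans in rank order (True means the item at that rank is relevant) and that every relevant item appears somewhere in the list; the function names are illustrative.

```python
from typing import Sequence


def reciprocal_rank(relevance: Sequence[bool]) -> float:
    """Return 1/rank of the first relevant item, or 0.0 if none is relevant."""
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def average_precision(relevance: Sequence[bool]) -> float:
    """Average the precision values taken at each relevant rank.

    Assumption: all relevant items are in the list, so the number of hits
    equals the total number of relevant items for the query.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_reciprocal_rank(runs: Sequence[Sequence[bool]]) -> float:
    """Mean of reciprocal_rank over all queries."""
    return sum(reciprocal_rank(r) for r in runs) / len(runs)


def mean_average_precision(runs: Sequence[Sequence[bool]]) -> float:
    """Mean of average_precision over all queries."""
    return sum(average_precision(r) for r in runs) / len(runs)
```

MRR rewards placing the first relevant document or passage near the top, while MAP also accounts for where the remaining relevant items fall in the ranking.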

reference text

Arvind Agarwal, Hema Raghavan, Karthik Subbian, Prem Melville, Richard D. Lawrence, David C. Gondek, and James Fan. 2012. Learning to rank for robust question answering. In Proceedings of the 21st ACM international conference on Information and knowledge management, CIKM ’12, pages 833–842, New York, NY, USA. ACM.
Ron Artstein and Massimo Poesio. 2008. Inter-Coder Agreement for Computational Linguistics. Computational Linguistics, 34(4):555–596.
M.W. Bilotti and E. Nyberg. 2008. Improving text retrieval precision and answer accuracy in question answering systems. In Coling 2008: Proceedings of the 2nd workshop on Information Retrieval for Question Answering, pages 1–8.
M.W. Bilotti, P. Ogilvie, J. Callan, and E. Nyberg. 2007. Structured retrieval for question answering. In Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 351–358. ACM.
M.W. Bilotti, J. Elsas, J. Carbonell, and E. Nyberg. 2010. Rank learning for factoid question answering with linguistic and semantic constraints. In Proceedings of the 19th ACM international conference on Information and knowledge management, pages 459–468. ACM.
Steven Bird and Edward Loper. 2004. NLTK: The natural language toolkit. In The Companion Volume to the Proceedings of 42nd Annual Meeting of the Association for Computational Linguistics, pages 214–217, Barcelona, Spain, July.
Hang Cui, Renxu Sun, Keya Li, Min-Yen Kan, and Tat-Seng Chua. 2005. Question answering passage retrieval using dependency relations. In Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR ’05, pages 400–407, New York, NY, USA. ACM.
Mark A. Greenwood, editor. 2008. Coling 2008: Proceedings of the 2nd workshop on Information Retrieval for Question Answering. Coling 2008 Organizing Committee, Manchester, UK, August.
Michael Kaisser. 2012. Answer Sentence Retrieval by Matching Dependency Paths acquired from Question/Answer Sentence Pairs. In EACL, pages 88–98.
Dan Klein and Christopher D. Manning. 2003. Accurate Unlexicalized Parsing. In Proc. the 41st Annual Meeting of the Association for Computational Linguistics.
Klaus H. Krippendorff. 2004. Content Analysis: An Introduction to Its Methodology. Sage Publications, Inc, 2nd edition.
John D. Lafferty, Andrew McCallum, and Fernando C. N. Pereira. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the Eighteenth International Conference on Machine Learning, ICML ’01, pages 282–289, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.
J. Lin and B. Katz. 2006. Building a reusable test collection for question answering. Journal of the American Society for Information Science and Technology, 57(7):851–861.
D. Lin and P. Pantel. 2001. Discovery of inference rules for question-answering. Natural Language Engineering, 7(4):343–360.
Jimmy Lin. 2007. An exploration of the principles underlying redundancy-based factoid question answering. ACM Trans. Inf. Syst., 25(2), April.
P. Ogilvie. 2010. Retrieval using Document Structure and Annotations. Ph.D. thesis, Carnegie Mellon University.
Christopher Pinchak, Davood Rafiei, and Dekang Lin. 2009. Answer typing for information retrieval. In Proceedings of the 18th ACM conference on Information and knowledge management, CIKM ’09, pages 1955–1958, New York, NY, USA. ACM.
John Prager, Eric Brown, Anni Coden, and Dragomir Radev. 2000. Question-answering by predictive annotation. In Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR ’00, pages 184–191, New York, NY, USA. ACM.
J. Prager, J. Chu-Carroll, E. Brown, and K. Czuba. 2006. Question answering by predictive annotation. Advances in Open Domain Question Answering, pages 307–347.
L. Ratinov and D. Roth. 2009. Design challenges and misconceptions in named entity recognition. In CoNLL, 6.
Tetsuya Sakai, Hideki Shima, Noriko Kando, Ruihua Song, Chuan-Jie Lin, Teruko Mitamura, Miho Sugimito, and Cheng-Wei Lee. 2010. Overview of the ntcir-7 aclia ir4qa task. In Proceedings of NTCIR-8 Workshop Meeting, Tokyo, Japan.
D. Shen and M. Lapata. 2007. Using semantic roles to improve question answering. In Proceedings of EMNLP-CoNLL, pages 12–21.
M.D. Smucker, J. Allan, and B. Carterette. 2007. A comparison of statistical significance tests for information retrieval evaluation. In Proceedings of the sixteenth ACM conference on Conference on information and knowledge management, pages 623–632. ACM.
T. Strohman, D. Metzler, H. Turtle, and W.B. Croft. 2005. Indri: A language model-based search engine for complex queries. In Proceedings of the International Conference on Intelligent Analysis, volume 2, pages 2–6. Citeseer.
Xuchen Yao, Benjamin Van Durme, Peter Clark, and Chris Callison-Burch. 2013. Answer Extraction as Sequence Tagging with Tree Edit Distance. In Proceedings of NAACL 2013.
Xian Zhang, Yu Hao, Xiaoyan Zhu, Ming Li, and David R. Cheriton. 2007. Information distance from a question to an answer. In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD ’07, pages 874–883, New York, NY, USA. ACM.
L. Zhao and J. Callan. 2008. A generative retrieval model for structured documents. In Proceedings of the 17th ACM conference on Information and knowledge management, pages 1163–1172. ACM.