acl acl2010 acl2010-106 acl2010-106-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Tingxu Yan ; Tamsin Maxwell ; Dawei Song ; Yuexian Hou ; Peng Zhang
Abstract: Bag-of-words approaches to information retrieval (IR) are effective but assume independence between words. The Hyperspace Analogue to Language (HAL) is a cognitively motivated and validated semantic space model that captures statistical dependencies between words by considering their co-occurrences in a surrounding window of text. HAL has been successfully applied to query expansion in IR, but has several limitations, including high processing cost and use of distributional statistics that do not exploit syntax. In this paper, we pursue two methods for incorporating syntactic-semantic information from textual ‘events’ into HAL. We build the HAL space directly from events to investigate whether processing costs can be reduced through more careful definition of word co-occurrence, and improve the quality of the pseudo-relevance feedback by applying event information as a constraint during HAL construction. Both methods significantly improve performance results in comparison with original HAL, and interpolation of HAL and relevance model expansion outperforms either method alone.
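To make the window-based co-occurrence idea concrete, here is a minimal sketch of a HAL-style space built from a token list. It is illustrative only and is not the authors' implementation: the function names `build_hal` and `symmetric_vector`, the default window size, and the linear distance weighting (`window - dist + 1`) are assumptions chosen for the example, and the event-based variants described in the abstract are not modelled.

```python
from collections import defaultdict


def build_hal(tokens, window=5):
    """Build a HAL-style co-occurrence table (hypothetical helper).

    hal[w][c] accumulates a weight each time c occurs before w
    within `window` tokens; nearer neighbours get larger weights.
    """
    hal = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        # Look back at up to `window` preceding tokens.
        for dist in range(1, window + 1):
            j = i - dist
            if j < 0:
                break
            # Assumed linear weighting: adjacent words score `window`,
            # the farthest word inside the window scores 1.
            hal[word][tokens[j]] += window - dist + 1
    return hal


def symmetric_vector(hal, word):
    """Merge preceding and following context weights for `word`,
    giving a direction-insensitive vector (hypothetical helper)."""
    vec = defaultdict(float)
    # Row: words that precede `word`.
    for ctx, weight in hal.get(word, {}).items():
        vec[ctx] += weight
    # Column: words that `word` precedes.
    for other, row in hal.items():
        if word in row:
            vec[other] += row[word]
    return dict(vec)


if __name__ == "__main__":
    tokens = "the horse raced past the barn".split()
    hal = build_hal(tokens, window=2)
    print(symmetric_vector(hal, "raced"))
```

The nested loop visits every token once per window position, so the cost grows with corpus length times window size. That is the processing cost the abstract refers to when it proposes a more careful definition of word co-occurrence.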
References:
Bach E. The Algebra of Events. 1986. Linguistics and Philosophy, 9(1): pp. 5–16.
Bai J., Song D., Bruza P., Nie J.-Y. and Cao G. Query Expansion using Term Relationships in Language Models for Information Retrieval. 2005. In: Proceedings of the 14th International ACM Conference on Information and Knowledge Management, pp. 688–695.
Bruza P. and Song D. Inferring Query Models by Computing Information Flow. 2002. In: Proceedings of the 11th International ACM Conference on Information and Knowledge Management, pp. 206–269.
Deerwester S., Dumais S., Furnas G., Landauer T. and Harshman R. Indexing by Latent Semantic Analysis. 1990. Journal of the American Society for Information Science, 41(6): pp. 391–407.
Gao J., Nie J., Wu G. and Cao G. Dependence Language Model for Information Retrieval. 2004. In: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 170–177.
Harris Z. Mathematical Structures of Language. 1968. Wiley, New York.
Johansson R. and Nugues P. Dependency-based Syntactic-semantic Analysis with PropBank and NomBank. 2008. In: CoNLL ’08: Proceedings of the Twelfth Conference on Computational Natural Language Learning, pp. 183–187.
Landauer T., Foltz P. and Laham D. Introduction to Latent Semantic Analysis. 1998. Discourse Processes, 25: pp. 259–284.
Lavrenko V. A Generative Theory of Relevance. 2004. PhD thesis, University of Massachusetts, Amherst.
Lavrenko V. and Croft W. B. Relevance Based Language Models. 2001. In: SIGIR ’01: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 120–127, ACM, New York, NY, USA.
Lin D. and Pantel P. DIRT - Discovery of Inference Rules from Text. 2001. In: KDD ’01: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 323–328, New York, NY, USA.
Lund K. and Burgess C. Producing High-dimensional Semantic Spaces from Lexical Co-occurrence. 1996. Behavior Research Methods, Instruments & Computers, 28: pp. 203–208.
Metzler D. and Croft W. B. A Markov Random Field Model for Term Dependencies. 2005. In: SIGIR ’05: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 472–479, ACM, New York, NY, USA.
Metzler D. and Croft W. B. Latent Concept Expansion using Markov Random Fields. 2007. In: SIGIR ’07: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 311–318, ACM, New York, NY, USA.
Pado S. and Lapata M. Dependency-Based Construction of Semantic Space Models. 2007. Computational Linguistics, 33: pp. 161–199.
Shen D. and Lapata M. Using Semantic Roles to Improve Question Answering. 2007. In: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 12–21.
Sleator D. D. and Temperley D. Parsing English with a Link Grammar. 1991. Technical Report CMU-CS-91-196, Department of Computer Science, Carnegie Mellon University.
Smeaton A. F., O’Donnell R. and Kelledy F. Indexing Structures Derived from Syntax in TREC-3: System Description. 1995. In: The Third Text REtrieval Conference (TREC-3), pp. 55–67.
Song F. and Croft W. B. A General Language Model for Information Retrieval. 1999. In: CIKM ’99: Proceedings of the Eighth International Conference on Information and Knowledge Management, pp. 316–321, ACM, New York, NY, USA.