nips nips2008 nips2008-73 nips2008-73-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Kevyn Collins-thompson
Abstract: Query expansion is a long-studied approach for improving retrieval effectiveness by enhancing the user’s original query with additional related words. Current algorithms for automatic query expansion can often improve retrieval accuracy on average, but are not robust: that is, they are highly unstable and have poor worst-case performance for individual queries. To address this problem, we introduce a novel formulation of query expansion as a convex optimization problem over a word graph. The model combines initial weights from a baseline feedback algorithm with edge weights based on word similarity, and integrates simple constraints to enforce set-based criteria such as aspect balance, aspect coverage, and term centrality. Results across multiple standard test collections show consistent and significant reductions in the number and magnitude of expansion failures, while retaining the strong positive gains of the baseline algorithm. Our approach does not assume a particular retrieval model, making it applicable to a broad class of existing expansion algorithms. 1
[1] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[2] C. Buckley. Why current IR engines fail. In Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2004), pages 584–585, 2004.
[3] K. Collins-Thompson and J. Callan. Query expansion using random walk models. In Proc. of the 14th International Conf. on Information and Knowledge Management (CIKM 2005), pages 704–711, 2005.
[4] K. Collins-Thompson and J. Callan. Estimation and use of uncertainty in pseudo-relevance feedback. In Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2007), pages 303–310, 2007.
[5] V. Lavrenko. A Generative Theory of Relevance. PhD thesis, Univ. of Massachusetts, Amherst, 2004.
[6] M. S. Lobo, M. Fazel, and S. Boyd. Portfolio optimization with linear and fixed transaction costs. Annals of Operations Research, 152(1):376–394, 2007.
[7] H. M. Markowitz. Portfolio selection. Journal of Finance, 7(1):77–91, 1952.
[8] D. Metzler and W. B. Croft. Combining the language model and inference network approaches to retrieval. Information Processing and Management, 40(5):735–750, 2004.
[9] J. M. Ponte and W. B. Croft. A language modeling approach to information retrieval. In Proc. of the 1998 ACM SIGIR Conference on Research and Development in Information Retrieval, pages 275–281, 1998.
[10] P. Ravikumar and J. Lafferty. Quadratic programming relaxations for metric labeling and markov random field map estimation. In Proceedings of the 23rd International Conference on Machine Learning (ICML 2006), pages 737–744, 2006.
[11] J. Teevan, S. T. Dumais, and E. Horvitz. Personalizing search via automated analysis of interests and activities. In Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2005), pages 449–456, New York, NY, USA, 2005. ACM.
[12] J. Xu and W. B. Croft. Query expansion using local and global document analysis. In Proceedings of the 1996 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 4–11, 1996.