256 acl-2011-Query Weighting for Ranking Model Adaptation

Source: pdf

Author: Peng Cai ; Wei Gao ; Aoying Zhou ; Kam-Fai Wong

Abstract: We propose to directly measure the importance of queries in the source domain to the target domain where no rank labels of documents are available, which is referred to as query weighting. Query weighting is a key step in ranking model adaptation. As the learning object of ranking algorithms is divided by query instances, we argue that it’s more reasonable to conduct importance weighting at the query level than at the document level. We present two query weighting schemes. The first compresses the query into a query feature vector, which aggregates all document instances in the same query, and then conducts query weighting based on the query feature vector. This method can efficiently estimate query importance by compressing query data, but the potential risk is information loss resulting from the compression. The second measures the similarity between the source query and each target query, and then combines these fine-grained similarity values for its importance estimation. Adaptation experiments on the LETOR3.0 data set demonstrate that query weighting significantly outperforms document instance weighting methods.
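The two schemes in the abstract can be illustrated with a toy sketch. The abstract does not say how documents are aggregated or how similarity is measured, so everything concrete here is an assumption made for illustration: the per-feature mean as the aggregation, cosine similarity as the similarity measure, a plain average as the combination step, and all function names. It is not the estimator used in the paper.

```python
import math


def query_feature_vector(docs):
    """Compress one query (a list of document feature vectors) into a
    single vector. Assumption: per-feature mean as the aggregation."""
    n = len(docs)
    dim = len(docs[0])
    return [sum(d[j] for d in docs) / n for j in range(dim)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def weight_by_compression(source_query, target_queries):
    """Scheme 1 (sketch): compress every query to one vector, then score
    the source query against the centroid of the target query vectors."""
    src_vec = query_feature_vector(source_query)
    tgt_vecs = [query_feature_vector(q) for q in target_queries]
    centroid = query_feature_vector(tgt_vecs)  # mean of the target vectors
    return cosine(src_vec, centroid)


def query_similarity(query_a, query_b):
    """Document-level similarity between two queries, with no compression.
    Assumption: mean cosine over all cross-query document pairs."""
    sims = [cosine(da, db) for da in query_a for db in query_b]
    return sum(sims) / len(sims)


def weight_by_pairwise_similarity(source_query, target_queries):
    """Scheme 2 (sketch): compare the source query with each target query
    separately, then combine the similarity values by averaging."""
    sims = [query_similarity(source_query, q) for q in target_queries]
    return sum(sims) / len(sims)


# Hypothetical toy data: each query is a list of 2-feature document vectors.
targets = [[[1.0, 0.0], [0.9, 0.1]], [[0.8, 0.2], [1.0, 0.1]]]
near_query = [[1.0, 0.1], [0.9, 0.0]]  # resembles the target queries
far_query = [[0.0, 1.0], [0.1, 0.9]]   # does not

near_w1 = weight_by_compression(near_query, targets)
far_w1 = weight_by_compression(far_query, targets)
near_w2 = weight_by_pairwise_similarity(near_query, targets)
far_w2 = weight_by_pairwise_similarity(far_query, targets)
```

In this sketch the first function needs only one comparison per source query, while the second visits every document pair. That mirrors the trade-off the abstract describes between efficiency and the information lost in compression.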

reference text

Ricardo A. Baeza-Yates and Berthier Ribeiro-Neto. 1999. Modern Information Retrieval.
Somnath Banerjee, Avinava Dubey, Jinesh Machchhar, and Soumen Chakrabarti. 2009. Efficient and accurate local learning for ranking. In SIGIR workshop: Learning to rank for information retrieval, pages 1–8.
Shai Ben-David, John Blitzer, Koby Crammer, Alex Kulesza, Fernando Pereira, and Jennifer Wortman Vaughan. 2010. A theory of learning from different domains. Machine Learning, 79(1-2):151–175.
John Blitzer, Ryan Mcdonald, and Fernando Pereira. 2006. Domain adaptation with structural correspondence learning. In Proceedings of EMNLP.
C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, and G. Hullender. 2005. Learning to rank using gradient descent. In Proceedings of ICML, pages 89–96.
Zhe Cao, Tao Qin, Tie-Yan Liu, Ming-Feng Tsai, and Hang Li. 2007. Learning to rank: from pairwise approach to listwise approach. In Proceedings of ICML, pages 129–136.
Depin Chen, Jun Yan, Gang Wang, Yan Xiong, Weiguo Fan, and Zheng Chen. 2008a. Transrank: A novel algorithm for transfer of rank learning. In Proceedings of ICDM Workshops, pages 106–115.
Keke Chen, Rongqing Lu, C.K. Wong, Gordon Sun, Larry Heck, and Belle Tseng. 2008b. Trada: Tree based ranking function adaptation. In Proceedings of CIKM.
Depin Chen, Yan Xiong, Jun Yan, Gui-Rong Xue, Gang Wang, and Zheng Chen. 2010. Knowledge transfer for cross domain learning to rank. Information Retrieval, 13(3):236–253.
Hal Daumé III and Daniel Marcu. 2006. Domain adaptation for statistical classifiers. Journal of Artificial Intelligence Research, 26(1):101–126.
Y. Freund, R. Iyer, R. Schapire, and Y. Singer. 2004. An efficient boosting algorithm for combining preferences. Journal of Machine Learning Research, 4:933–969.
Jianfeng Gao, Qiang Wu, Chris Burges, Krysta Svore, Yi Su, Nazan Khan, Shalin Shah, and Hongyan Zhou. 2009. Model adaptation via model interpolation and boosting for web search ranking. In Proceedings of EMNLP.
Wei Gao, Peng Cai, Kam Fai Wong, and Aoying Zhou. 2010. Learning to rank only using training data from related domain. In Proceedings of SIGIR, pages 162–169.
Xiubo Geng, Tie-Yan Liu, Tao Qin, Andrew Arnold, Hang Li, and Heung-Yeung Shum. 2008. Query dependent ranking using k-nearest neighbor. In Proceedings of SIGIR, pages 115–122.
Bo Geng, Linjun Yang, Chao Xu, and Xian-Sheng Hua. 2009. Ranking model adaptation for domain-specific search. In Proceedings of CIKM.
R. Herbrich, T. Graepel, and K. Obermayer. 2000. Large Margin Rank Boundaries for Ordinal Regression. MIT Press, Cambridge.
Jiayuan Huang, Alexander J. Smola, Arthur Gretton, Karsten M. Borgwardt, and Bernhard Schölkopf. 2007. Correcting sample selection bias by unlabeled data. In Proceedings of NIPS, pages 601–608.
Jing Jiang and ChengXiang Zhai. 2007. Instance weighting for domain adaptation in NLP. In Proceedings of ACL.
Thorsten Joachims. 2002. Optimizing search engines using clickthrough data. In Proceedings of SIGKDD, pages 133–142.
Maurice Kendall. 1970. Rank Correlation Methods. Griffin.
Tie-Yan Liu. 2009. Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, 3(3):225–331.
Sinno Jialin Pan, Ivor W. Tsang, James T. Kwok, and Qiang Yang. 2009. Domain adaptation via transfer component analysis. In Proceedings of IJCAI, pages 1187–1192.
John C. Platt. 1999. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In Advances in Large Margin Classifiers, pages 61–74. MIT Press.
Tao Qin, Tie-Yan Liu, Jun Xu, and Hang Li. 2010. Letor: A benchmark collection for research on learning to rank for information retrieval. Information Retrieval, 13(4):346–374.
S. Shalev-Shwartz, Y. Singer, and N. Srebro. 2007. Pegasos: Primal estimated sub-gradient solver for SVM. In Proceedings of the 24th International Conference on Machine Learning, pages 807–814.
Hidetoshi Shimodaira. 2000. Improving predictive inference under covariate shift by weighting the log-likelihood function. Journal of Statistical Planning and Inference, 90:227–244.
Masashi Sugiyama, Shinichi Nakajima, Hisashi Kashima, Paul von Bünau, and Motoaki Kawanabe. 2008. Direct importance estimation with model selection and its application to covariate shift adaptation. In Proceedings of NIPS, pages 1433–1440.
Ellen M. Voorhees. 2003. Overview of TREC 2003. In Proceedings of TREC-2003, pages 1–13.
Ellen M. Voorhees. 2004. Overview of TREC 2004. In Proceedings of TREC-2004, pages 1–12.
Bo Wang, Jie Tang, Wei Fan, Songcan Chen, Zi Yang, and Yanzhu Liu. 2009. Heterogeneous cross domain ranking in latent space. In Proceedings of CIKM.
Y. Yue, T. Finley, F. Radlinski, and T. Joachims. 2007. A support vector method for optimizing average precision. In Proceedings of SIGIR, pages 271–278.
Bianca Zadrozny. 2004. Learning and evaluating classifiers under sample selection bias. In Proceedings of ICML, pages 325–332.