Authors: Jim C. Huang, Brendan J. Frey
Abstract: Ranking is at the heart of many information retrieval applications. Unlike standard regression or classification, in which we predict outputs independently, in ranking we are interested in predicting structured outputs, so that misranking one object can significantly affect whether we correctly rank the other objects. In practice, the problem of ranking involves a large number of objects to be ranked, and either approximate structured prediction methods are required or assumptions of independence between object scores must be made in order to make the problem tractable. We present a probabilistic method for learning to rank using the graphical modelling framework of cumulative distribution networks (CDNs), in which we can take into account the structure inherent to the problem of ranking by modelling the joint cumulative distribution functions (CDFs) over multiple pairwise preferences. We apply our framework to the problem of document retrieval on the OHSUMED benchmark dataset. We show that the RankNet, ListNet and ListMLE probabilistic models can be viewed as particular instances of CDNs and that our proposed framework allows for the exploration of a broad class of flexible structured loss functionals for learning to rank.
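As a rough sketch of the modelling idea (my notation, not necessarily the authors' exact formulation), a CDN factors a joint CDF over binary pairwise-preference events into a product of local functions,

  F(\pi; \theta) = \prod_{c} \phi_c(\pi_c; \theta), \qquad 0 \le \phi_c \le 1, \quad \phi_c \ \text{monotone non-decreasing in each argument},

and learning minimizes the negative log-CDF,

  \mathcal{L}(\theta) = -\log F(\pi; \theta) = -\sum_{c} \log \phi_c(\pi_c; \theta).

If, as an illustrative assumption, each local function covers a single observed preference "document i is ranked above document j" and is taken to be the logistic CDF of a score difference, \phi_{ij}(\theta) = \sigma\big(s_i(\theta) - s_j(\theta)\big), where the per-document scores s_i(\theta) are model outputs assumed for this sketch, then the objective reduces to the familiar RankNet-style pairwise logistic loss \sum_{(i,j)} \log\big(1 + \exp(-(s_i(\theta) - s_j(\theta)))\big). The structured losses referred to in the abstract arise when local functions couple several preferences at once rather than treating each pair independently.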
[1] R. Baeza-Yates and B. Ribeiro-Neto. Modern information retrieval. Addison-Wesley, 1999.
[2] C.J.C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton and G. Hullender. Learning to rank using gradient descent. In Proceedings of the Twenty-Second International Conference on Machine Learning (ICML), 2005.
[3] C.J.C. Burges, R. Ragno and Q.V. Le. Learning to rank with nonsmooth cost functions. In Proceedings of the Nineteenth Annual Conference on Neural Information Processing Systems (NIPS), 2007.
[4] Z. Cao, T. Qin, T.Y. Liu, M.F. Tsai and H. Li. Learning to rank: from pairwise approach to listwise approach. In Proceedings of the Twenty-Fourth International Conference on Machine Learning (ICML), 2007.
[5] J.C. Huang and B.J. Frey. Cumulative distribution networks and the derivative-sum-product algorithm. In Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI), 2008.
[6] K. Järvelin and J. Kekäläinen. Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems 20(4), 2002.
[7] T. Joachims. A support vector method for multivariate performance measures. In Proceedings of the Twenty-Second International Conference on Machine Learning (ICML), 2005.
[8] T.Y. Liu, J. Xu, T. Qin, W. Xiong and H. Li. LETOR: Benchmark dataset for research on learning to rank for information retrieval. In Proceedings of the SIGIR 2007 Workshop on Learning to Rank for Information Retrieval (LR4IR), 2007.
[9] J. I. Marden. Analyzing and modeling rank data. CRC Press, 1995.
[10] E.A. Nadaraya. On estimating regression. Theory of Probability and its Applications 9(1), pp. 141-142, 1964.
[11] S.E. Robertson. Overview of the Okapi projects. Journal of Documentation 53(1), pp. 3-7, 1997.
[12] G.S. Watson. Smooth regression analysis. Sankhyā: The Indian Journal of Statistics, Series A 26, pp. 359-372, 1964.
[13] F. Xia, T.Y. Liu, J. Wang, W. Zhang and H. Li. Listwise approach to learning to rank - theory and algorithm. In Proceedings of the Twenty-Fifth International Conference on Machine Learning (ICML), 2008.
[14] C. Zhai and J. Lafferty. A study of smoothing methods for language models applied to ad hoc information retrieval. In Proceedings of the Twenty-Fourth Annual International ACM SIGIR Conference (SIGIR), 2001.