nips nips2006 nips2006-119 nips2006-119-reference knowledge-graph by maker-knowledge-mining

119 nips-2006-Learning to Rank with Nonsmooth Cost Functions

Source: pdf

Author: Christopher J. Burges, Robert Ragno, Quoc V. Le

Abstract: The quality measures used in information retrieval are particularly difﬁcult to optimize directly, since they depend on the model scores only through the sorted order of the documents returned for a given query. Thus, the derivatives of the cost with respect to the model parameters are either zero, or are undeﬁned. In this paper, we propose a class of simple, ﬂexible algorithms, called LambdaRank, which avoids these difﬁculties by working with implicit cost functions. We describe LambdaRank using neural network models, although the idea applies to any differentiable function class. We give necessary and sufﬁcient conditions for the resulting implicit cost function to be convex, and we show that the general method has a simple mechanical interpretation. We demonstrate signiﬁcantly improved accuracy, over a state-of-the-art ranking algorithm, on several datasets. We also show that LambdaRank provides a method for signiﬁcantly speeding up the training phase of that ranking algorithm. Although this paper is directed towards ranking, the proposed method can be extended to any non-smooth and multivariate cost functions. 1

reference text

[1] C. Buckley and E. Voorhees. Evaluating evaluation measure stability. In SIGIR, pages 33–40, 2000.

[2] C.J.C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, and G. Hullender. Learning to Rank using Gradient Descent. In ICML 22, Bonn, Germany, 2005.

[3] C. Cortes and M. Mohri. Conﬁdence Intervals for the Area Under the ROC Curve. In NIPS 18. MIT Press, 2005.

[4] Y. Freund, R. Iyer, R.E. Schapire, and Y. Singer. An efﬁcient boosting algorithm for combining preferences. Journal of Machine Learning Research, 4:933–969, 2003.

[5] J. Friedman, T. Hastie, and R. Tibshirani. Additive logistic regression: A statistical view of boosting. The Annals of Statistics, 28(2):337–374, 2000.

[6] K. Jarvelin and J. Kekalainen. IR evaluation methods for retrieving highly relevant documents. In SIGIR 23. ACM, 2000.

[7] T. Joachims. A support vector method for multivariate performance measures. In ICML 22, 2005.

[8] I. Matveeva, C. Burges, T. Burkard, A. Lauscius, and L. Wong. High accuracy retrieval with multiple nested rankers. In SIGIR, 2006.

[9] I. Newton. Philosophiae Naturalis Principia Mathematica. The Royal Society, 1687.

[10] S. Robertson and H. Zaragoza. On rank-based effectiveness measures and optimisation. Technical Report MSR-TR-2006-61, Microsoft Research, 2006.

[11] M. Spivak. Calculus on Manifolds. Addison-Wesley, 1965.

[12] B. Taskar, V. Chatalbashev, D. Koller, and C. Guestrin. Learning structured prediciton models: A large margin approach. In ICML 22, Bonn, Germany, 2005.

[13] I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun. Support vector machine learning for interdependent and structured output spaces. In ICML 24, 2004.

[14] E.M. Voorhees. Overview of the TREC 2001/2002 Question Answering Track. In TREC, 2001,2002.

[15] L. Yan, R. Dodlier, M.C. Mozer, and R. Wolniewicz. Optimizing Classiﬁer Performance via an Approximation to the Wilcoxon-Mann-Whitney Statistic. In ICML 20, 2003.