112 nips-2007-Learning Monotonic Transformations for Classification

Source: pdf

Author: Andrew Howard, Tony Jebara

Abstract: A discriminative method is proposed for learning monotonic transformations of the training data while jointly estimating a large-margin classifier. In many domains such as document classification, image histogram classification and gene microarray experiments, fixed monotonic transformations can be useful as a preprocessing step. However, most classifiers only explore these transformations through manual trial and error or via prior domain knowledge. The proposed method learns monotonic transformations automatically while training a large-margin classifier without any prior knowledge of the domain. A monotonic piecewise linear function is learned which transforms data for subsequent processing by a linear hyperplane classifier. Two algorithmic implementations of the method are formalized. The first solves a convergent alternating sequence of quadratic and linear programs until it obtains a locally optimal solution. An improved algorithm is then derived using a convex semidefinite relaxation that overcomes initialization issues in the greedy optimization problem. The effectiveness of these learned transformations on synthetic problems, text data and image data is demonstrated.
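As an illustration of what a "monotonic piecewise linear function" can look like, here is a minimal sketch in Python. The abstract does not give the paper's parameterization, so everything here is an assumption: the function is built as a non-negative combination of truncated ramps between user-chosen knots, and the names `ramp`, `monotone_pwl`, `knots` and `weights` are hypothetical. The sketch only evaluates a fixed transformation; it does not learn the weights or train a classifier.

```python
def ramp(x, lo, hi):
    """Truncated ramp: 0 at or below lo, 1 at or above hi, linear in between."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def monotone_pwl(x, knots, weights):
    """Evaluate a non-decreasing piecewise linear function at x.

    knots   -- strictly increasing breakpoints (assumed, chosen by the user)
    weights -- one non-negative increment per interval between knots
    """
    if len(weights) != len(knots) - 1:
        raise ValueError("need exactly one weight per knot interval")
    if any(b <= a for a, b in zip(knots, knots[1:])):
        raise ValueError("knots must be strictly increasing")
    if any(w < 0 for w in weights):
        raise ValueError("negative weights would break monotonicity")
    # Each ramp is non-decreasing in x, so a non-negative sum of ramps is too.
    return sum(w * ramp(x, knots[j], knots[j + 1]) for j, w in enumerate(weights))


# Illustrative values only: three intervals whose increments sum to 1.
knots = [0.0, 1.0, 2.0, 4.0]
weights = [0.5, 0.3, 0.2]
values = [monotone_pwl(x / 10.0, knots, weights) for x in range(-10, 51)]
```

Because every weight is constrained to be non-negative, monotonicity holds by construction, whatever values an optimizer assigns to the weights. In a learning setting the weights would be the free variables; here they are fixed by hand.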

reference text

[1] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.

[2] M. Brown, W. Grundy, D. Lin, N. Cristianini, C. Sugnet, M. Ares Jr., and D. Haussler. Support vector machine classification of microarray gene expression data, 1999.

[3] O. Chapelle, P. Haffner, and V.N. Vapnik. Support vector machines for histogram-based classification. IEEE Transactions on Neural Networks, 10:1055–1064, 1999.

[4] C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20(3):273–297, 1995.

[5] M. Hein and O. Bousquet. Hilbertian metrics and positive definite kernels on probability measures. In Proceedings of Artificial Intelligence and Statistics, 2005.

[6] T. Jebara, R. Kondor, and A. Howard. Probability product kernels. Journal of Machine Learning Research, 5:819–844, 2004.

[7] G. Lanckriet, N. Cristianini, P. Bartlett, L. El Ghaoui, and M. I. Jordan. Learning the kernel matrix with semidefinite programming. Journal of Machine Learning Research, 5:27–72, 2004.

[8] J.B. Lasserre. Convergent LMI relaxations for nonconvex quadratic programs. In Proceedings of 39th IEEE Conference on Decision and Control, 2000.

[9] B. Moghaddam and M.H. Yang. Sex with support vector machines. In Todd K. Leen, Thomas G. Dietterich, and Volker Tresp, editors, Advances in Neural Information Processing Systems 13, pages 960–966. MIT Press, 2000.

[10] T. Robertson, F.T. Wright, and R.L. Dykstra. Order Restricted Statistical Inference. Wiley, 1988.