nips nips2007 nips2007-112 nips2007-112-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Andrew Howard, Tony Jebara
Abstract: A discriminative method is proposed for learning monotonic transformations of the training data while jointly estimating a large-margin classifier. In many domains such as document classification, image histogram classification and gene microarray experiments, fixed monotonic transformations can be useful as a preprocessing step. However, most classifiers only explore these transformations through manual trial and error or via prior domain knowledge. The proposed method learns monotonic transformations automatically while training a large-margin classifier without any prior knowledge of the domain. A monotonic piecewise linear function is learned which transforms data for subsequent processing by a linear hyperplane classifier. Two algorithmic implementations of the method are formalized. The first solves a convergent alternating sequence of quadratic and linear programs until it obtains a locally optimal solution. An improved algorithm is then derived using a convex semidefinite relaxation that overcomes initialization issues in the greedy optimization problem. The effectiveness of these learned transformations on synthetic problems, text data and image data is demonstrated.
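To make the phrase "monotonic piecewise linear function" concrete, the following is a minimal sketch of one possible parameterization: a nonnegative combination of truncated ramp functions placed between consecutive knots. The abstract does not specify the parameterization, so the names `ramp`, `monotone_pwl`, `knots`, and `weights`, as well as the example values, are illustrative assumptions rather than the authors' implementation. The sketch only evaluates a fixed transformation; it does not learn the weights or train the classifier.

```python
def ramp(x, lo, hi):
    """Truncated ramp: 0 at or below lo, 1 at or above hi, linear in between."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def monotone_pwl(x, knots, weights):
    """Evaluate a nondecreasing piecewise linear map at a scalar x.

    knots   : strictly increasing breakpoints z_0 < z_1 < ... < z_K (assumed)
    weights : K nonnegative coefficients, one per interval (assumed)

    Nonnegative weights on nondecreasing ramps give a nondecreasing sum.
    """
    if len(weights) != len(knots) - 1:
        raise ValueError("need exactly one weight per knot interval")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be nonnegative to keep the map monotone")
    if any(b <= a for a, b in zip(knots, knots[1:])):
        raise ValueError("knots must be strictly increasing")
    return sum(w * ramp(x, lo, hi)
               for w, lo, hi in zip(weights, knots, knots[1:]))


# Hypothetical example: three intervals on [0, 1] with a steep initial slope,
# which compresses large feature values in a way loosely similar to a root.
knots = [0.0, 0.25, 0.5, 1.0]
weights = [0.6, 0.3, 0.1]
inputs = [0.0, 0.125, 0.25, 0.5, 1.0]
values = [monotone_pwl(x, knots, weights) for x in inputs]
```

Under this parameterization, monotonicity reduces to simple nonnegativity constraints on `weights`, which are linear constraints of the kind a linear or quadratic program can handle.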
[1] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[2] M. Brown, W. Grundy, D. Lin, N. Cristianini, C. Sugnet, M. Ares Jr., and D. Haussler. Support vector machine classification of microarray gene expression data, 1999.
[3] O. Chapelle, P. Haffner, and V.N. Vapnik. Support vector machines for histogram-based classification. IEEE Transactions on Neural Networks, 10:1055–1064, 1999.
[4] C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20(3):273–297, 1995.
[5] M. Hein and O. Bousquet. Hilbertian metrics and positive definite kernels on probability measures. In Proceedings of Artificial Intelligence and Statistics, 2005.
[6] T. Jebara, R. Kondor, and A. Howard. Probability product kernels. Journal of Machine Learning Research, 5:819–844, 2004.
[7] G. Lanckriet, N. Cristianini, P. Bartlett, L. El Ghaoui, and M. I. Jordan. Learning the kernel matrix with semidefinite programming. Journal of Machine Learning Research, 5:27–72, 2004.
[8] J.B. Lasserre. Convergent LMI relaxations for nonconvex quadratic programs. In Proceedings of the 39th IEEE Conference on Decision and Control, 2000.
[9] B. Moghaddam and M.H. Yang. Sex with support vector machines. In Todd K. Leen, Thomas G. Dietterich, and Volker Tresp, editors, Advances in Neural Information Processing Systems 13, pages 960–966. MIT Press, 2000.
[10] T. Robertson, F.T. Wright, and R.L. Dykstra. Order Restricted Statistical Inference. Wiley, 1988.