Authors: Sören Sonnenburg, Gunnar Rätsch, Christin Schäfer
Abstract: While classical kernel-based learning algorithms are based on a single kernel, in practice it is often desirable to use multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for classification, leading to a convex quadratically constrained quadratic program. We show that it can be rewritten as a semi-infinite linear program that can be efficiently solved by recycling standard SVM implementations. Moreover, we generalize the formulation and our method to a larger class of problems, including regression and one-class classification. Experimental results show that the proposed algorithm helps with automatic model selection, improves the interpretability of the learning result, and scales to hundreds of thousands of examples or hundreds of kernels to be combined.
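The abstract's key computational claim can be made concrete: for fixed kernel weights, the inner maximization over the SVM dual variables is an ordinary SVM trained on the combined kernel, and the weights are then updated by a small restricted linear program over the constraints generated so far (column generation on the semi-infinite LP). The sketch below is a minimal illustration of that loop under stated assumptions, not the authors' implementation: the function name mkl_silp is hypothetical, and scikit-learn's SVC (with a precomputed kernel) plus scipy.optimize.linprog stand in for the paper's SVM solver and LP master problem.

import numpy as np
from scipy.optimize import linprog
from sklearn.svm import SVC

def mkl_silp(kernels, y, C=1.0, max_iter=50, eps=1e-3):
    """Illustrative sketch of MKL via column generation on the SILP.
    Assumptions: kernels is a list of (n, n) Gram matrices; y has labels in {-1, +1}."""
    K = np.asarray(kernels)                  # shape (p, n, n)
    p, n, _ = K.shape
    beta = np.full(p, 1.0 / p)               # kernel weights, start uniform on the simplex
    rows, theta = [], None                   # generated LP constraints and current LP value
    for _ in range(max_iter):
        # Inner problem: a standard SVM with the combined kernel sum_k beta_k K_k.
        svm = SVC(C=C, kernel='precomputed').fit(np.tensordot(beta, K, axes=1), y)
        a = np.zeros(n)
        a[svm.support_] = svm.dual_coef_.ravel()          # a_i = alpha_i * y_i
        # Per-kernel pieces S_k(alpha) = sum_i alpha_i - 0.5 * (a^T K_k a).
        S = np.array([np.abs(a).sum() - 0.5 * a @ Kk @ a for Kk in K])
        if theta is not None and abs(beta @ S - theta) <= eps * max(abs(theta), 1e-12):
            break                                          # new constraint (nearly) satisfied
        rows.append(S)
        # Restricted master LP: min theta  s.t.  theta >= S^t . beta for each
        # generated alpha^t, beta >= 0, sum_k beta_k = 1.
        A_ub = np.hstack([np.array(rows), -np.ones((len(rows), 1))])
        c = np.r_[np.zeros(p), 1.0]                        # minimize theta
        res = linprog(c, A_ub=A_ub, b_ub=np.zeros(len(rows)),
                      A_eq=[[1.0] * p + [0.0]], b_eq=[1.0],
                      bounds=[(0, None)] * p + [(None, None)])
        beta, theta = res.x[:p], res.x[-1]
    return beta

Each SVM call contributes one linear constraint theta >= sum_k beta_k S_k(alpha^t), so the master problem grows by a single row per iteration; this is what allows standard SVM implementations to be "recycled", as the abstract puts it.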
[1] Francis R. Bach, Gert R. G. Lanckriet, and Michael I. Jordan. Multiple kernel learning, conic duality, and the SMO algorithm. In Proceedings of the Twenty-First International Conference on Machine Learning. ACM Press, 2004.
[2] Kristin P. Bennett, Michinari Momma, and Mark J. Embrechts. MARK: A boosting algorithm for heterogeneous kernel models. In Proceedings of KDD, pages 24–31, 2002.
[3] Jinbo Bi, Tong Zhang, and Kristin P. Bennett. Column-generation boosting methods for mixture of kernels. In Proceedings of KDD, pages 521–526, 2004.
[4] O. Chapelle, V. Vapnik, O. Bousquet, and S. Mukherjee. Choosing multiple parameters for support vector machines. Machine Learning, 46(1-3):131–159, 2002.
[5] Y. Grandvalet and S. Canu. Adaptive scaling for feature selection in SVMs. In Advances in Neural Information Processing Systems, 2002.
[6] R. Hettich and K.O. Kortanek. Semi-infinite programming: Theory, methods and applications. SIAM Review, 35(3):380–429, September 1993.
[7] G.R.G. Lanckriet, T. De Bie, N. Cristianini, M.I. Jordan, and W.S. Noble. A statistical framework for genomic data fusion. Bioinformatics, 2004.
[8] R. Meir and G. Rätsch. An introduction to boosting and leveraging. In S. Mendelson and A. Smola, editors, Proc. of the First Machine Learning Summer School in Canberra, LNCS, pages 119–184. Springer, 2003. In press.
[9] C.S. Ong, A.J. Smola, and R.C. Williamson. Hyperkernels. In Advances in Neural Information Processing Systems, volume 15, pages 495–502, 2003.
[10] J. Platt. Fast training of support vector machines using sequential minimal optimization. In B. Schölkopf, C.J.C. Burges, and A.J. Smola, editors, Advances in Kernel Methods — Support Vector Learning, pages 185–208, Cambridge, MA, 1999. MIT Press.
[11] G. Rätsch. Robust Boosting via Convex Optimization. PhD thesis, University of Potsdam, Computer Science Dept., August-Bebel-Str. 89, 14482 Potsdam, Germany, 2001.
[12] G. Rätsch, A. Demiriz, and K. Bennett. Sparse regression ensembles in infinite and finite hypothesis spaces. Machine Learning, 48(1-3):193–221, 2002. Special Issue on New Methods for Model Selection and Model Combination. Also NeuroCOLT2 Technical Report NC-TR-2000-085.
[13] G. Rätsch, S. Sonnenburg, and C. Schäfer. Learning interpretable SVMs for biological sequence classification. BMC Bioinformatics, 7(Suppl 1):S9, February 2006. Special issue from the NIPS workshop on New Problems and Methods in Computational Biology, Whistler, Canada, 18 December 2004.
[14] G. Rätsch and M.K. Warmuth. Marginal boosting. NeuroCOLT2 Technical Report 97, Royal Holloway College, London, July 2001.
[15] B. Schölkopf and A.J. Smola. Learning with Kernels. MIT Press, Cambridge, MA, 2002.
[16] S. Sonnenburg, G. Rätsch, and C. Schäfer. Learning interpretable SVMs for biological sequence classification. In RECOMB 2005, LNBI 3500, pages 389–407. Springer-Verlag Berlin Heidelberg, 2005.
[17] S. Sonnenburg, G. Rätsch, C. Schäfer, and B. Schölkopf. Large scale multiple kernel learning. Journal of Machine Learning Research, 7:1531–1565, 2006.