nips nips2009 nips2009-92 nips2009-92-reference knowledge-graph by maker-knowledge-mining

92 nips-2009-Fast Graph Laplacian Regularized Kernel Learning via Semidefinite–Quadratic–Linear Programming

Source: pdf

Author: Xiao-ming Wu, Anthony M. So, Zhenguo Li, Shuo-yen R. Li

Abstract: Kernel learning is a powerful framework for nonlinear data modeling. Using the kernel trick, a number of problems have been formulated as semideﬁnite programs (SDPs). These include Maximum Variance Unfolding (MVU) (Weinberger et al., 2004) in nonlinear dimensionality reduction, and Pairwise Constraint Propagation (PCP) (Li et al., 2008) in constrained clustering. Although in theory SDPs can be efﬁciently solved, the high computational complexity incurred in numerically processing the huge linear matrix inequality constraints has rendered the SDP approach unscalable. In this paper, we show that a large class of kernel learning problems can be reformulated as semideﬁnite-quadratic-linear programs (SQLPs), which only contain a simple positive semideﬁnite constraint, a second-order cone constraint and a number of linear constraints. These constraints are much easier to process numerically, and the gain in speedup over previous approaches is at least of the order m2.5 , where m is the matrix dimension. Experimental results are also presented to show the superb computational efﬁciency of our approach.

reference text

Ben-Tal, A., & Nemirovski, A. (2001). Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications. MPS–SIAM Series on Optimization. Philadelphia, Pennsylvania: Society for Industrial and Applied Mathematics. Borchers, B. (1999). CSDP, a C Library for Semideﬁnite Programming. Optimization Methods and Software, 11/12, 613–623. Boyd, S., & Vandenberghe, L. (2004). Convex Optimization. Cambridge: Cambridge University Press. Available online at http://www.stanford.edu/˜boyd/cvxbook/. Chapelle, O., & Vapnik, V. (2000). Model Selection for Support Vector Machines. In S. A. Solla, T. K. Leen and K.-R. M¨ ller (Eds.), Advances in Neural Information Processing Systems 12: Proceedings of the 1999 u Conference, 230–236. Cambridge, Massachusetts: The MIT Press. 8 Globerson, A., & Roweis, S. (2007). Visualizing Pairwise Similarity via Semideﬁnite Programming. Proceedings of the 11th International Conference on Artiﬁcial Intelligence and Statistics (pp. 139–146). Golub, G. H., & Loan, C. F. V. (1996). Matrix Computations. Baltimore, Maryland: The Johns Hopkins University Press. Third edition. Graepel, T. (2002). Kernel Matrix Completion by Semideﬁnite Programming. Proceedings of the 12th International Conference on Artiﬁcial Neural Networks (pp. 694–699). Springer–Verlag. Kulis, B., Sustik, M. A., & Dhillon, I. S. (2009). Low–Rank Kernel Learning with Bregman Matrix Divergences. The Journal of Machine Learning Research, 10, 341–376. Lanckriet, G. R. G., Cristianini, N., Bartlett, P., El Ghaoui, L., & Jordan, M. I. (2004). Learning the Kernel Matrix with Semideﬁnite Programming. The Journal of Machine Learning Research, 5, 27–72. Li, Z., & Liu, J. (2009). Constrained Clustering by Spectral Kernel Learning. To appear in the Proceedings of the 12th IEEE International Conference on Computer Vision. Li, Z., Liu, J., & Tang, X. (2008). Pairwise Constraint Propagation by Semideﬁnite Programming for Semi– Supervised Classiﬁcation. Proceedings of the 25th International Conference on Machine Learning (pp. 576–583). Li, Z., Liu, J., & Tang, X. (2009). Constrained Clustering via Spectral Regularization. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2009 (pp. 421–428). Sch¨ lkopf, B., & Smola, A. J. (2002). Learning with Kernels: Support Vector Machines, Regularization, o Optimization, and Beyond. Cambridge, Massachusetts: The MIT Press. Sch¨ lkopf, B., Smola, A. J., & M¨ ller, K.-R. (1998). Nonlinear Component Analysis as a Kernel Eigenvalue o u Problem. Neural Computation, 10, 1299–1319. Sha, F., & Saul, L. K. (2005). Analysis and Extension of Spectral Methods for Nonlinear Dimensionality Reduction. Proceedings of the 22nd International Conference on Machine Learning (pp. 784–791). Shawe-Taylor, J., & Cristianini, N. (2004). Kernel Methods for Pattern Analysis. Cambridge: Cambridge University Press. Singer, A. (2008). A Remark on Global Positioning from Local Distances. Proceedings of the National Academy of Sciences, 105, 9507–9511. So, A. M.-C. (2007). A Semideﬁnite Programming Approach to the Graph Realization Problem: Theory, Applications and Extensions. Doctoral dissertation, Stanford University. Song, L., Smola, A., Borgwardt, K., & Gretton, A. (2008). Colored Maximum Variance Unfolding. In J. C. Platt, D. Koller, Y. Singer and S. Roweis (Eds.), Advances in Neural Information Processing Systems 20: Proceedings of the 2007 Conference, 1385–1392. Cambridge, Massachusetts: The MIT Press. Toh, K. C., T¨ t¨ nc¨ , R. H., & Todd, M. J. (2006). On the Implementation and Usage of SDPT3 — A MATLAB uu u Software Package for Semideﬁnite–Quadratic–Linear Programming, Version 4.0. User’s Guide. T¨ t¨ nc¨ , R. H., Toh, K. C., & Todd, M. J. (2003). Solving Semideﬁnite–Quadratic–Linear Programs using uu u SDPT3. Mathematical Programming, 95, 189–217. Vapnik, V. N. (2000). The Nature of Statistical Learning Theory. Statistics for Engineering and Information Science. New York: Springer–Verlag. Second edition. Weinberger, K. Q., Packer, B. D., & Saul, L. K. (2005). Nonlinear Dimensionality Reduction by Semideﬁnite Programming and Kernel Matrix Factorization. Proceedings of the 10th International Workshop on Artiﬁcial Intelligence and Statistics (pp. 381–388). Weinberger, K. Q., Sha, F., & Saul, L. K. (2004). Learning a Kernel Matrix for Nonlinear Dimensionality Reduction. Proceedings of the 21st International Conference on Machine Learning (pp. 85–92). Weinberger, K. Q., Sha, F., Zhu, Q., & Saul, L. K. (2007). Graph Laplacian Regularization for Large–Scale Semideﬁnite Programming. Advances in Neural Information Processing Systems 19: Proceedings of the 2006 Conference (pp. 1489–1496). Cambridge, Massachusetts: The MIT Press. Zhou, D., Bousquet, O., Lal, T. N., Weston, J., & Sch¨ lkopf, B. (2004). Learning with Local and Global o Consistency. Advances in Neural Information Processing Systems 16: Proceedings of the 2003 Conference (pp. 595–602). Cambridge, Massachusetts: The MIT Press. 9