nips nips2009 nips2009-92 nips2009-92-reference knowledge-graph by maker-knowledge-mining

92 nips-2009-Fast Graph Laplacian Regularized Kernel Learning via Semidefinite–Quadratic–Linear Programming


Source: pdf

Author: Xiao-ming Wu, Anthony M. So, Zhenguo Li, Shuo-yen R. Li

Abstract: Kernel learning is a powerful framework for nonlinear data modeling. Using the kernel trick, a number of problems have been formulated as semidefinite programs (SDPs). These include Maximum Variance Unfolding (MVU) (Weinberger et al., 2004) in nonlinear dimensionality reduction, and Pairwise Constraint Propagation (PCP) (Li et al., 2008) in constrained clustering. Although in theory SDPs can be efficiently solved, the high computational complexity incurred in numerically processing the huge linear matrix inequality constraints has rendered the SDP approach unscalable. In this paper, we show that a large class of kernel learning problems can be reformulated as semidefinite-quadratic-linear programs (SQLPs), which only contain a simple positive semidefinite constraint, a second-order cone constraint and a number of linear constraints. These constraints are much easier to process numerically, and the gain in speedup over previous approaches is at least of the order m2.5 , where m is the matrix dimension. Experimental results are also presented to show the superb computational efficiency of our approach.


reference text

Ben-Tal, A., & Nemirovski, A. (2001). Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications. MPS–SIAM Series on Optimization. Philadelphia, Pennsylvania: Society for Industrial and Applied Mathematics. Borchers, B. (1999). CSDP, a C Library for Semidefinite Programming. Optimization Methods and Software, 11/12, 613–623. Boyd, S., & Vandenberghe, L. (2004). Convex Optimization. Cambridge: Cambridge University Press. Available online at http://www.stanford.edu/˜boyd/cvxbook/. Chapelle, O., & Vapnik, V. (2000). Model Selection for Support Vector Machines. In S. A. Solla, T. K. Leen and K.-R. M¨ ller (Eds.), Advances in Neural Information Processing Systems 12: Proceedings of the 1999 u Conference, 230–236. Cambridge, Massachusetts: The MIT Press. 8 Globerson, A., & Roweis, S. (2007). Visualizing Pairwise Similarity via Semidefinite Programming. Proceedings of the 11th International Conference on Artificial Intelligence and Statistics (pp. 139–146). Golub, G. H., & Loan, C. F. V. (1996). Matrix Computations. Baltimore, Maryland: The Johns Hopkins University Press. Third edition. Graepel, T. (2002). Kernel Matrix Completion by Semidefinite Programming. Proceedings of the 12th International Conference on Artificial Neural Networks (pp. 694–699). Springer–Verlag. Kulis, B., Sustik, M. A., & Dhillon, I. S. (2009). Low–Rank Kernel Learning with Bregman Matrix Divergences. The Journal of Machine Learning Research, 10, 341–376. Lanckriet, G. R. G., Cristianini, N., Bartlett, P., El Ghaoui, L., & Jordan, M. I. (2004). Learning the Kernel Matrix with Semidefinite Programming. The Journal of Machine Learning Research, 5, 27–72. Li, Z., & Liu, J. (2009). Constrained Clustering by Spectral Kernel Learning. To appear in the Proceedings of the 12th IEEE International Conference on Computer Vision. Li, Z., Liu, J., & Tang, X. (2008). Pairwise Constraint Propagation by Semidefinite Programming for Semi– Supervised Classification. Proceedings of the 25th International Conference on Machine Learning (pp. 576–583). Li, Z., Liu, J., & Tang, X. (2009). Constrained Clustering via Spectral Regularization. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2009 (pp. 421–428). Sch¨ lkopf, B., & Smola, A. J. (2002). Learning with Kernels: Support Vector Machines, Regularization, o Optimization, and Beyond. Cambridge, Massachusetts: The MIT Press. Sch¨ lkopf, B., Smola, A. J., & M¨ ller, K.-R. (1998). Nonlinear Component Analysis as a Kernel Eigenvalue o u Problem. Neural Computation, 10, 1299–1319. Sha, F., & Saul, L. K. (2005). Analysis and Extension of Spectral Methods for Nonlinear Dimensionality Reduction. Proceedings of the 22nd International Conference on Machine Learning (pp. 784–791). Shawe-Taylor, J., & Cristianini, N. (2004). Kernel Methods for Pattern Analysis. Cambridge: Cambridge University Press. Singer, A. (2008). A Remark on Global Positioning from Local Distances. Proceedings of the National Academy of Sciences, 105, 9507–9511. So, A. M.-C. (2007). A Semidefinite Programming Approach to the Graph Realization Problem: Theory, Applications and Extensions. Doctoral dissertation, Stanford University. Song, L., Smola, A., Borgwardt, K., & Gretton, A. (2008). Colored Maximum Variance Unfolding. In J. C. Platt, D. Koller, Y. Singer and S. Roweis (Eds.), Advances in Neural Information Processing Systems 20: Proceedings of the 2007 Conference, 1385–1392. Cambridge, Massachusetts: The MIT Press. Toh, K. C., T¨ t¨ nc¨ , R. H., & Todd, M. J. (2006). On the Implementation and Usage of SDPT3 — A MATLAB uu u Software Package for Semidefinite–Quadratic–Linear Programming, Version 4.0. User’s Guide. T¨ t¨ nc¨ , R. H., Toh, K. C., & Todd, M. J. (2003). Solving Semidefinite–Quadratic–Linear Programs using uu u SDPT3. Mathematical Programming, 95, 189–217. Vapnik, V. N. (2000). The Nature of Statistical Learning Theory. Statistics for Engineering and Information Science. New York: Springer–Verlag. Second edition. Weinberger, K. Q., Packer, B. D., & Saul, L. K. (2005). Nonlinear Dimensionality Reduction by Semidefinite Programming and Kernel Matrix Factorization. Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics (pp. 381–388). Weinberger, K. Q., Sha, F., & Saul, L. K. (2004). Learning a Kernel Matrix for Nonlinear Dimensionality Reduction. Proceedings of the 21st International Conference on Machine Learning (pp. 85–92). Weinberger, K. Q., Sha, F., Zhu, Q., & Saul, L. K. (2007). Graph Laplacian Regularization for Large–Scale Semidefinite Programming. Advances in Neural Information Processing Systems 19: Proceedings of the 2006 Conference (pp. 1489–1496). Cambridge, Massachusetts: The MIT Press. Zhou, D., Bousquet, O., Lal, T. N., Weston, J., & Sch¨ lkopf, B. (2004). Learning with Local and Global o Consistency. Advances in Neural Information Processing Systems 16: Proceedings of the 2003 Conference (pp. 595–602). Cambridge, Massachusetts: The MIT Press. 9