
83 nips-2006-Generalized Maximum Margin Clustering and Unsupervised Kernel Learning


Source: pdf

Author: Hamed Valizadegan, Rong Jin

Abstract: Maximum margin clustering was recently proposed and has shown promising performance in recent studies [1, 2]. It extends the theory of support vector machines to unsupervised learning. Despite its good performance, maximum margin clustering has three major problems that limit its usefulness in real-world applications. First, it is computationally expensive and difficult to scale to large datasets, because the number of parameters in maximum margin clustering is quadratic in the number of examples. Second, it requires data preprocessing to ensure that any clustering boundary passes through the origin, which makes it unsuitable for clustering unbalanced datasets. Third, it is sensitive to the choice of kernel function, and it requires an external procedure to determine appropriate values for the kernel parameters. In this paper, we propose a "generalized maximum margin clustering" framework that addresses these three problems simultaneously. The new framework generalizes the maximum margin clustering algorithm by allowing arbitrary clustering boundaries, including those that do not pass through the origin. It significantly improves computational efficiency by reducing the number of parameters. Furthermore, the new framework is able to automatically determine an appropriate kernel matrix without any labeled data. Finally, we show a formal connection between maximum margin clustering and spectral clustering. We demonstrate the efficiency of the generalized maximum margin clustering algorithm using both synthetic datasets and real datasets from the UCI repository.
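To make the three limitations concrete, here is a minimal sketch of the maximum margin clustering problem of [1], in simplified notation; \phi denotes the feature map, C the soft-margin penalty, and \ell the class-balance bound, all standard SVM quantities rather than symbols taken from this paper:

\min_{y \in \{-1,+1\}^n} \; \min_{w,\, \xi \ge 0} \;\; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad y_i\, w^{\top}\phi(x_i) \ge 1 - \xi_i, \qquad -\ell \le \sum_{i=1}^{n} y_i \le \ell .

Because the SVM offset b is omitted, the decision boundary w^{\top}\phi(x) = 0 must pass through the origin (the second limitation above), and relaxing the rank-one matrix y y^{\top} to a positive semidefinite matrix M with \mathrm{diag}(M) = \mathbf{1} turns the problem into a semidefinite program over the O(n^2) entries of M, which is the source of the quadratic parameter count (the first limitation).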


reference text

[1] L. Xu, J. Neufeld, B. Larson, and D. Schuurmans. Maximum margin clustering. In Advances in Neural Information Processing Systems 17, 2004.

[2] L. Xu and D. Schuurmans. Unsupervised and semi-supervised multi-class support vector machines. In Proceedings of the 20th National Conference on Artificial Intelligence (AAAI-05), 2005.

[3] J. Hartigan and M. Wong. A k-means clustering algorithm. Appl. Statist., 28:100–108, 1979.

[4] R. A. Redner and H. F. Walker. Mixture densities, maximum likelihood and the EM algorithm. SIAM Review, 26:195–239, 1984.

[5] C. Ding, X. He, H. Zha, M. Gu, and H. Simon. A min-max cut algorithm for graph partitioning and data clustering. In Proc. IEEE Int’l Conf. Data Mining, 2001.

[6] F. R. Bach and M. I. Jordan. Learning spectral clustering. In Advances in Neural Information Processing Systems 16, 2004.

[7] R. Jin, C. Ding, and F. Kang. A probabilistic approach for optimizing spectral clustering. In Advances in Neural Information Processing Systems 18, 2006.

[8] J. Shi and J. Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888–905, 2000.

[9] A. Ng, M. Jordan, and Y. Weiss. On spectral clustering: Analysis and an algorithm. In Advances in Neural Information Processing Systems 14, 2001.

[10] L. Zelnik-Manor and P. Perona. Self-tuning spectral clustering. In Advances in Neural Information Processing Systems 17, pages 1601–1608, 2005.

[11] N. Cristianini, J. Shawe-Taylor, A. Elisseeff, and J. S. Kandola. On kernel-target alignment. In Advances in Neural Information Processing Systems 14, pages 367–373, 2001.

[12] X. Zhu, J. Kandola, Z. Ghahramani, and J. Lafferty. Nonparametric transforms of graph kernels for semi-supervised learning. In Advances in Neural Information Processing Systems 17, pages 1641–1648, 2005.

[13] G. R. G. Lanckriet, N. Cristianini, P. L. Bartlett, L. El Ghaoui, and M. I. Jordan. Learning the kernel matrix with semidefinite programming. Journal of Machine Learning Research, 5:27–72, 2004.

[14] C. J. C. Burges and D. J. Crisp. Uniqueness theorems for kernel methods. Neurocomputing, 55(1-2):187–220, 2003.

[15] F. R. K. Chung. Spectral Graph Theory. American Mathematical Society, 1997.