
115 nips-2004-Maximum Margin Clustering


Source: pdf

Author: Linli Xu, James Neufeld, Bryce Larson, Dale Schuurmans

Abstract: We propose a new method for clustering based on finding maximum margin hyperplanes through data. By reformulating the problem in terms of the implied equivalence relation matrix, we can pose the problem as a convex integer program. Although this still yields a difficult computational problem, the hard-clustering constraints can be relaxed to a soft-clustering formulation which can be feasibly solved with a semidefinite program. Since our clustering technique only depends on the data through the kernel matrix, we can easily achieve nonlinear clusterings in the same manner as spectral clustering. Experimental results show that our maximum margin clustering technique often obtains more accurate results than conventional clustering methods. The real benefit of our approach, however, is that it leads naturally to a semi-supervised training method for support vector machines. By maximizing the margin simultaneously on labeled and unlabeled training data, we achieve state-of-the-art performance using a single, integrated learning principle.
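
To make the relaxation concrete, below is a minimal sketch of the two-class, hard-margin semidefinite relaxation the abstract describes, written with the cvxpy modeling library. M stands in for the relaxed equivalence relation matrix y y^T, and a Schur-complement constraint bounds the dual SVM objective; the function name max_margin_cluster, the balance parameter ell, and the leading-eigenvector rounding step are illustrative assumptions, not the authors' exact procedure.

    import numpy as np
    import cvxpy as cp

    def max_margin_cluster(K, ell):
        """Sketch of the hard-margin SDP relaxation (two clusters).

        K   : (n, n) kernel matrix of the data.
        ell : class-balance bound on |sum_i y_i| (hypothetical parameter).
        """
        n = K.shape[0]
        e = np.ones(n)
        M = cp.Variable((n, n), PSD=True)   # relaxation of the label matrix y y^T
        nu = cp.Variable(n, nonneg=True)    # multipliers enforcing alpha >= 0
        t = cp.Variable()                   # upper bound on the dual SVM objective
        G = cp.multiply(K, M)               # elementwise product K o M
        # Schur complement: [[G, e+nu], [(e+nu)^T, t]] >= 0  <=>  t bounds the
        # value of max_{alpha >= 0} 2 e^T alpha - alpha^T G alpha.
        Z = cp.vstack([
            cp.hstack([G, cp.reshape(e + nu, (n, 1))]),
            cp.hstack([cp.reshape(e + nu, (1, n)), cp.reshape(t, (1, 1))]),
        ])
        constraints = [
            cp.diag(M) == 1,          # M_ii = y_i^2 = 1
            cp.abs(M @ e) <= ell,     # relaxed class-balance constraint
            0.5 * (Z + Z.T) >> 0,     # symmetrized PSD constraint
        ]
        cp.Problem(cp.Minimize(t), constraints).solve(solver=cp.SCS)
        # Round the soft solution: sign of the leading eigenvector of M.
        _, V = np.linalg.eigh(M.value)
        return np.sign(V[:, -1])

Because the relaxation has on the order of n^2 variables, this sketch is only practical for small n; it is meant to show the structure of the semidefinite program, not an efficient implementation.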


reference text

[1] A. Ben-Hur, D. Horn, H. Siegelmann, and V. Vapnik. Support vector clustering. Journal of Machine Learning Research, 2, 2001.

[2] K. Bennett and A. Demiriz. Semi-supervised support vector machines. In Advances in Neural Information Processing Systems 11 (NIPS-98), 1998.

[3] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge U. Press, 2004.

[4] C. Chennubhotla and A. Jepson. Eigencuts: half-lives of eigenflows for spectral clustering. In Advances in Neural Information Processing Systems 15 (NIPS-02), 2002.

[5] K. Crammer and Y. Singer. On the algorithmic implementation of multiclass kernel-based vector machines. Journal of Machine Learning Research, 2, 2001.

[6] T. De Bie and N. Cristianini. Convex methods for transduction. In Advances in Neural Information Processing Systems 16 (NIPS-03), 2003.

[7] C. Helmberg. Semidefinite programming for combinatorial optimization. Technical Report ZR-00-34, Konrad-Zuse-Zentrum Berlin, 2000.

[8] T. Joachims. Transductive inference for text classification using support vector machines. In International Conference on Machine Learning (ICML-99), 1999.

[9] Y. Kluger, R. Basri, J. Chang, and M. Gerstein. Spectral biclustering of microarray cancer data: co-clustering genes and conditions. Genome Research, 13, 2003.

[10] G. Lanckriet, N. Cristianini, P. Bartlett, L. El Ghaoui, and M. Jordan. Learning the kernel matrix with semidefinite programming. Journal of Machine Learning Research, 5, 2004.

[11] M. Laurent and S. Poljak. On a positive semidefinite relaxation of the cut polytope. Linear Algebra and its Applications, 223/224, 1995.

[12] N. Bansal, A. Blum, and S. Chawla. Correlation clustering. In Symposium on Foundations of Computer Science (FOCS-02), 2002.

[13] N. Cristianini, J. Shawe-Taylor, and J. Kandola. Spectral kernel methods for clustering. In Advances in Neural Information Processing Systems 14 (NIPS-01), 2001.

[14] A. Ng, M. Jordan, and Y. Weiss. On spectral clustering: analysis and an algorithm. In Advances in Neural Information Processing Systems 14 (NIPS-01), 2001.

[15] B. Schölkopf and A. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, 2002.

[16] J. Shi and J. Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8), 2000.

[17] Y. Weiss. Segmentation using eigenvectors: a unifying view. In International Conference on Computer Vision (ICCV-99), 1999.

[18] X. Zhu, Z. Ghahramani, and J. Lafferty. Semi-supervised learning using Gaussian fields and harmonic functions. In International Conference on Machine Learning (ICML-03), 2003.