nips nips2007 nips2007-46 nips2007-46-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ohad Shamir, Naftali Tishby
Abstract: Over the past few years, the notion of stability in data clustering has received growing attention as a cluster validation criterion in a sample-based framework. However, recent work has shown that as the sample size increases, any clustering model will usually become asymptotically stable. This led to the conclusion that stability is lacking as a theoretical and practical tool. The discrepancy between this conclusion and the success of stability in practice has remained an open question, which we attempt to address. Our theoretical approach is that stability, as used by cluster validation algorithms, is similar in certain respects to measures of generalization in a model-selection framework. In such cases, the model chosen governs the convergence rate of generalization bounds. By arguing that these rates are more important than the sample size, we are led to the prediction that stability-based cluster validation algorithms should not degrade with increasing sample size, despite the asymptotic universal stability. This prediction is substantiated by a theoretical analysis as well as some empirical results. We conclude that stability remains a meaningful cluster validation criterion over finite samples.
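As a concrete illustration of the kind of procedure the abstract refers to, the following is a minimal sketch of stability-based cluster validation in Python, assuming k-means as the base clustering algorithm and label agreement (adjusted Rand index) between clusterings of overlapping subsamples as the stability measure, in the spirit of [9]. The function name stability_score and all parameter defaults are illustrative assumptions, not the authors' exact protocol.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import adjusted_rand_score

    def stability_score(X, k, n_pairs=20, subsample=0.8, seed=None):
        """Illustrative stability measure: average label agreement of
        k-means solutions computed on pairs of random subsamples."""
        rng = np.random.default_rng(seed)
        n = len(X)
        m = int(subsample * n)  # subsample size; large enough that pairs overlap
        scores = []
        for _ in range(n_pairs):
            idx_a = rng.choice(n, size=m, replace=False)
            idx_b = rng.choice(n, size=m, replace=False)
            labels_a = KMeans(n_clusters=k, n_init=10).fit_predict(X[idx_a])
            labels_b = KMeans(n_clusters=k, n_init=10).fit_predict(X[idx_b])
            # Compare the two clusterings on the points both subsamples share.
            common = np.intersect1d(idx_a, idx_b)
            pos_a = {v: i for i, v in enumerate(idx_a)}
            pos_b = {v: i for i, v in enumerate(idx_b)}
            la = [labels_a[pos_a[c]] for c in common]
            lb = [labels_b[pos_b[c]] for c in common]
            # The adjusted Rand index is invariant to permutations of the
            # cluster labels, so no explicit label-matching step is needed.
            scores.append(adjusted_rand_score(la, lb))
        return float(np.mean(scores))

Model selection then amounts to picking the number of clusters k with the highest score; the abstract's claim is that such a criterion remains informative as the sample size grows, even though every fixed clustering model becomes asymptotically stable.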
[1] Shai Ben-David. A framework for statistical clustering with constant time approximation algorithms for k-median clustering. In Proceedings of COLT 2004, pages 415–426.
[2] Shai Ben-David, Ulrike von Luxburg, and Dávid Pál. A sober look at clustering stability. In Proceedings of COLT 2006, pages 5–19.
[3] Olivier Bousquet and André Elisseeff. Stability and generalization. Journal of Machine Learning Research, 2:499–526, 2002.
[4] Joachim M. Buhmann and Marcus Held. Model selection in clustering by uniform convergence bounds. In Advances in Neural Information Processing Systems 12, pages 216–222, 1999.
[5] Andrea Caponnetto and Alexander Rakhlin. Stability properties of empirical risk minimization over Donsker classes. Journal of Machine Learning Research, 7:2565–2583, 2006.
[6] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning. Springer, 2001.
[7] Wassily Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13–30, March 1963.
[8] Samuel Kutin and Partha Niyogi. Almost-everywhere algorithmic stability and generalization error. In Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence (UAI), pages 275–282, 2002.
[9] Tilman Lange, Volker Roth, Mikio L. Braun, and Joachim M. Buhmann. Stability-based validation of clustering solutions. Neural Computation, 16(6):1299–1323, June 2004.
[10] David A. McAllester. PAC-Bayesian stochastic model selection. Machine Learning, 51(1):5–21, 2003.
[11] Colin McDiarmid. On the method of bounded differences. In Surveys in Combinatorics, volume 141 of London Mathematical Society Lecture Note Series, pages 148–188. Cambridge University Press, 1989.
[12] Alexander Rakhlin and Andrea Caponnetto. Stability of k-means clustering. In Advances in Neural Information Processing Systems 19. MIT Press, Cambridge, MA, 2007.
[13] Jianbo Shi and Jitendra Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888–905, 2000.
[14] Ulrike von Luxburg and Shai Ben-David. Towards a statistical theory of clustering. Technical report, PASCAL workshop on clustering, London, 2005.