
Optimal Scoring for Unsupervised Learning (NIPS 2009, paper 182)



Authors: Zhihua Zhang, Guang Dai

Abstract: We are often interested in casting classification and clustering problems in a regression framework, because regression makes it feasible to obtain desirable statistical properties by imposing penalty criteria. In this paper we illustrate optimal scoring, which was originally proposed for performing Fisher linear discriminant analysis by regression, in the setting of unsupervised learning. In particular, we devise a novel clustering algorithm that we call optimal discriminant clustering. We relate our algorithm to existing unsupervised learning algorithms such as spectral clustering, discriminative clustering, and sparse principal component analysis. Experimental results on a collection of benchmark datasets validate the effectiveness of the optimal discriminant clustering algorithm.
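The regression view of clustering described in the abstract can be made concrete with a small sketch. The code below is an illustrative approximation only, not the authors' published algorithm: the function name optimal_scoring_clustering, the ridge penalty standing in for the paper's penalty criteria, the normalized-indicator class scores, and the k-means reassignment step are all assumptions made for illustration. It uses NumPy and scikit-learn's KMeans.

import numpy as np
from sklearn.cluster import KMeans

def optimal_scoring_clustering(X, n_clusters, ridge=1e-2, n_iter=20, seed=0):
    # Hypothetical sketch: alternate between (1) ridge-penalized least-squares
    # regression of class scores onto the centered features and (2) reassigning
    # clusters by k-means in the resulting discriminant space. This illustrates
    # the regression view of clustering, not the paper's exact optimal
    # discriminant clustering algorithm.
    n, d = X.shape
    rng = np.random.default_rng(seed)
    labels = rng.integers(n_clusters, size=n)   # random initial assignment
    Xc = X - X.mean(axis=0)                     # center the features
    for _ in range(n_iter):
        # Indicator matrix Y (n x k) for the current assignment.
        Y = np.zeros((n, n_clusters))
        Y[np.arange(n), labels] = 1.0
        # Simple class scores: column-normalized indicators.
        scores = Y / np.sqrt(Y.sum(axis=0, keepdims=True).clip(min=1.0))
        # Ridge regression: B = (Xc'Xc + ridge*I)^{-1} Xc' scores.
        B = np.linalg.solve(Xc.T @ Xc + ridge * np.eye(d), Xc.T @ scores)
        Z = Xc @ B                              # project into discriminant space
        new_labels = KMeans(n_clusters=n_clusters, n_init=10,
                            random_state=seed).fit_predict(Z)
        if np.array_equal(new_labels, labels):  # assignments stabilized
            break
        labels = new_labels
    return labels

For example, labels = optimal_scoring_clustering(X, n_clusters=3) on an (n, d) data matrix X returns one cluster label per row; the alternating structure mirrors how a scored indicator matrix and a penalized regression fit can be optimized jointly.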


References

[1] C. M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.

[2] L. Clemmensen, T. Hastie, and B. Ersbøll. Sparse discriminant analysis. Technical report, June 2008.

[3] F. De la Torre and T. Kanade. Discriminative cluster analysis. In The 23rd International Conference on Machine Learning, 2006.

[4] R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. John Wiley and Sons, New York, second edition, 2001.

[5] T. Hastie, A. Buja, and R. Tibshirani. Penalized discriminant analysis. The Annals of Statistics, 23(1):73–102, 1995.

[6] T. Hastie, R. Tibshirani, and A. Buja. Flexible discriminant analysis by optimal scoring. Journal of the American Statistical Association, 89(428):1255–1270, 1994.

[7] A. Y. Ng, M. I. Jordan, and Y. Weiss. On spectral clustering: analysis and an algorithm. In Advances in Neural Information Processing Systems 14, 2002.

[8] C. H. Park and H. Park. A relationship between linear discriminant analysis and the generalized minimum squared error solution. SIAM Journal on Matrix Analysis and Applications, 27(2):474–492, 2005.

[9] J. Shi and J. Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888–905, 2000.

[10] R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58:267–288, 1996.

[11] M. Wu and B. Schölkopf. A local learning approach for clustering. In Advances in Neural Information Processing Systems 19, 2007.

[12] J. Ye. Least squares linear discriminant analysis. In The 24th International Conference on Machine Learning, 2007.

[13] J. Ye, Z. Zhao, and M. Wu. Discriminative k-means for clustering. In Advances in Neural Information Processing Systems 20, 2008.

[14] Z. Zhang, G. Dai, and M. I. Jordan. A flexible and efficient algorithm for regularized Fisher discriminant analysis. In The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), 2009.

[15] Z. Zhang and M. I. Jordan. Multiway spectral clustering: A margin-based perspective. Statistical Science, 23(3):383–403, 2008.

[16] H. Zou and T. Hastie. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society, Series B, 67:301–320, 2005.

[17] H. Zou, T. Hastie, and R. Tibshirani. Sparse principal component analysis. Journal of Computational and Graphical Statistics, 15:265–286, 2006.