
27 nips-2000-Automatic Choice of Dimensionality for PCA


Source: pdf

Author: Thomas P. Minka

Abstract: A central issue in principal component analysis (PCA) is choosing the number of principal components to be retained. By interpreting PCA as density estimation, we show how to use Bayesian model selection to estimate the true dimensionality of the data. The resulting estimate is simple to compute yet guaranteed to pick the correct dimensionality, given enough data. The estimate involves an integral over the Stiefel manifold of k-frames, which is difficult to compute exactly. But after choosing an appropriate parameterization and applying Laplace's method, an accurate and practical estimator is obtained. In simulations, it is convincingly better than cross-validation and other proposed algorithms, plus it runs much faster.
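The paper's estimator applies Laplace's method to an integral over the Stiefel manifold, which is more involved than can be shown here. As a rough illustration of the same idea — scoring each candidate dimensionality k by an approximation to the model evidence and picking the best — the sketch below uses the closed-form maximum likelihood of probabilistic PCA (Tipping & Bishop [15]) with a BIC-style penalty. The function `choose_k_bic` and its parameter count are a simplified stand-in, not the paper's Laplace formula.

```python
import numpy as np

def ppca_loglik(eigvals, k, d):
    # ML log-likelihood per observation of probabilistic PCA with k
    # components, written in terms of the (descending) eigenvalues of
    # the sample covariance. The discarded eigenvalues are averaged
    # into the isotropic noise variance sigma^2.
    sigma2 = eigvals[k:].mean()
    return -0.5 * (d * np.log(2 * np.pi)
                   + np.sum(np.log(eigvals[:k]))
                   + (d - k) * np.log(sigma2)
                   + d)

def choose_k_bic(X):
    # Score each k by N * loglik - (m/2) log N and return the argmax.
    N, d = X.shape
    S = np.cov(X, rowvar=False, bias=True)
    eigvals = np.sort(np.linalg.eigvalsh(S))[::-1]
    scores = []
    for k in range(1, d):
        # Rough parameter count: a k-frame on the Stiefel manifold
        # (dk - k(k+1)/2), k eigenvalues, the noise variance, and the mean.
        m = d * k - k * (k + 1) // 2 + k + 1 + d
        scores.append(N * ppca_loglik(eigvals, k, d) - 0.5 * m * np.log(N))
    return int(np.argmax(scores)) + 1
```

With well-separated signal eigenvalues this recovers the true latent dimension; the paper shows the Laplace estimator does so more reliably than BIC-style penalties in harder regimes.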


reference text

[1] C. Bishop. Bayesian PCA. In Neural Information Processing Systems 11, pages 382-388, 1998.

[2] C. Bregler and S. M. Omohundro. Surface learning with applications to lipreading. In NIPS, pages 43-50, 1994.

[3] R. Everson and S. Roberts. Inferring the eigenvalues of covariance matrices from limited, noisy data. IEEE Trans Signal Processing, 48(7):2083-2091, 2000. http://www.robots.ox.ac.uk/~sjrob/Pubs/spectrum.ps.gz.

[4] K. Fukunaga and D. Olsen. An algorithm for finding intrinsic dimensionality of data. IEEE Trans Computers, 20(2):176-183, 1971.

[5] Z. Ghahramani and M. Beal. Variational inference for Bayesian mixtures of factor analysers. In Neural Information Processing Systems 12, 1999.

[6] Z. Ghahramani and G. Hinton. The EM algorithm for mixtures of factor analyzers. Technical Report CRG-TR-96-1, University of Toronto, 1996. http://www.gatsby.ucl.ac.uk/~zoubin/papers.html.

[7] A. James. Normal multivariate analysis and the orthogonal group. Annals of Mathematical Statistics, 25(1):40-75, 1954.

[8] R. E. Kass and A. E. Raftery. Bayes factors and model uncertainty. Technical Report 254, University of Washington, 1993. http://www.stat.washington.edu/tech.reports/tr254.ps.

[9] D. J. C. MacKay. Probable networks and plausible predictions - a review of practical Bayesian methods for supervised neural networks. Network: Computation in Neural Systems, 6:469-505, 1995. http://wol.ra.phy.cam.ac.uk/mackay/abstracts/network.html.

[10] T. Minka. Automatic choice of dimensionality for PCA. Technical Report 514, MIT Media Lab Vision and Modeling Group, 1999. ftp://whitechapel.media.mit.edu/pub/tech-reports/TR-514ABSTRACT.html.

[11] B. Moghaddam, T. Jebara, and A. Pentland. Bayesian modeling of facial similarity. In Neural Information Processing Systems 11, pages 910-916, 1998.

[12] B. Moghaddam and A. Pentland. Probabilistic visual learning for object representation. IEEE Trans Pattern Analysis and Machine Intelligence, 19(7):696-710, 1997.

[13] J. J. Rajan and P. J. W. Rayner. Model order selection for the singular value decomposition and the discrete Karhunen-Loeve transform using a Bayesian approach. IEE Vision, Image and Signal Processing, 144(2):116-123, 1997.

[14] M. E. Tipping and C. M. Bishop. Mixtures of probabilistic principal component analysers. Neural Computation, 11(2):443-482, 1999. http://citeseer.nj.nec.com/362314.html.

[15] M. E. Tipping and C. M. Bishop. Probabilistic principal component analysis. J Royal Statistical Society B, 61(3), 1999.