nips nips2008 nips2008-216 nips2008-216-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Cédric Archambeau, Francis R. Bach
Abstract: We present a generative model for performing sparse probabilistic projections, which includes sparse principal component analysis and sparse canonical correlation analysis as special cases. Sparsity is enforced by means of automatic relevance determination or by imposing appropriate prior distributions, such as generalised hyperbolic distributions. We derive a variational Expectation-Maximisation algorithm for the estimation of the hyperparameters and show that our novel probabilistic approach compares favourably to existing techniques. We illustrate how the proposed method can be applied in the context of cryptanalysis as a preprocessing tool for the construction of template attacks.
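The abstract describes sparsity enforced through automatic relevance determination (ARD) within a probabilistic projection model fitted by variational EM. As a hedged illustration only, the sketch below implements a much simpler relative of that idea: EM for probabilistic PCA [18] with an ARD prior on the columns of the loading matrix, which drives unneeded latent directions towards zero. It is not the paper's algorithm (no generalised hyperbolic priors, no CCA case), and the function name ard_ppca and the variables W, alpha, sigma2 are illustrative assumptions.

# Minimal sketch (not the paper's exact algorithm): EM for probabilistic PCA [18]
# with an ARD prior on the columns of the loading matrix W, so that irrelevant
# latent directions are pruned. Names (ard_ppca, W, alpha, sigma2) are illustrative.
import numpy as np

def ard_ppca(X, k, n_iter=200, seed=0):
    """X: (N, d) data matrix; k: initial number of latent components."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    mu = X.mean(axis=0)
    Xc = X - mu                                  # centre the data
    W = rng.standard_normal((d, k))              # loading matrix
    sigma2 = 1.0                                 # isotropic noise variance
    alpha = np.ones(k)                           # ARD precisions, one per column of W
    for _ in range(n_iter):
        # E-step: posterior moments of the latent variables z_n
        M = W.T @ W + sigma2 * np.eye(k)
        Minv = np.linalg.inv(M)
        Ez = Xc @ W @ Minv                       # (N, k) posterior means
        Ezz = N * sigma2 * Minv + Ez.T @ Ez      # summed second moments
        # M-step: MAP update of W with the ARD penalty sigma2 * diag(alpha)
        W = np.linalg.solve(Ezz + sigma2 * np.diag(alpha), Ez.T @ Xc).T
        # Standard probabilistic PCA update of the noise variance
        sigma2 = (np.sum(Xc ** 2) - 2 * np.sum(Ez * (Xc @ W))
                  + np.trace(Ezz @ W.T @ W)) / (N * d)
        # ARD update: precisions of unused columns grow, effectively pruning them
        alpha = d / (np.sum(W ** 2, axis=0) + 1e-10)
    return W, sigma2, alpha, mu

Calling ard_ppca(X, k) on an (N, d) data matrix returns the fitted loadings together with the ARD precisions; columns whose precision alpha_j has grown very large contribute negligibly and can be discarded, which is the pruning effect the abstract attributes to ARD.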
[1] C. Archambeau, E. Peeters, F.-X. Standaert, and J.-J. Quisquater. Template attacks in principal subspaces. In L. Goubin and M. Matsui, editors, 8th International Workshop on Cryptographic Hardware and Embedded Systems (CHES), volume 4249 of Lecture Notes in Computer Science, pages 1–14. Springer, 2006.
[2] F. Bach and M. I. Jordan. A probabilistic interpretation of canonical correlation analysis. Technical Report 688, Department of Statistics, University of California, Berkeley, 2005.
[3] O. Barndorff-Nielsen and R. Stelzer. Absolute moments of generalized hyperbolic distributions and approximate scaling of normal inverse Gaussian Lévy processes. Scandinavian Journal of Statistics, 32(4):617–637, 2005.
[4] P. J. Brown and J. E. Griffin. Bayesian adaptive lassos with non-convex penalization. Technical Report CRiSM 07-02, Department of Statistics, University of Warwick, 2007.
[5] F. Caron and A. Doucet. Sparse Bayesian nonparametric regression. In 25th International Conference on Machine Learning (ICML). ACM, 2008.
[6] A. d’Aspremont, L. El Ghaoui, M. I. Jordan, and G. R. G. Lanckriet. A direct formulation for sparse PCA using semidefinite programming. SIAM Review, 49(3):434–448, 2007.
[7] J. Fan and R. Li. Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96:1348–1360, 2001.
[8] A. C. Faul and M. E. Tipping. Analysis of sparse Bayesian learning. In T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems 14 (NIPS), pages 383–389. The MIT Press, 2002.
[9] Z. Ghahramani and G. E. Hinton. The EM algorithm for mixtures of factor analyzers. Technical Report CRG-TR-96-1, Department of Computer Science, University of Toronto, 1996.
[10] D. Hardoon and J. Shawe-Taylor. Sparse canonical correlation analysis. Technical report, PASCAL EPrints, 2007.
[11] W. Hu. Calibration of multivariate generalized hyperbolic distributions using the EM algorithm, with applications in risk management, portfolio optimization and portfolio credit risk. PhD thesis, Florida State University, United States of America, 2005.
[12] B. Jørgensen. Statistical Properties of the Generalized Inverse Gaussian Distribution. Springer-Verlag, 1982.
[13] A. Klami and S. Kaski. Local dependent components. In Z. Ghahramani, editor, 24th International Conference on Machine Learning (ICML), pages 425–432. Omnipress, 2007.
[14] D. J. C. MacKay. Bayesian methods for backpropagation networks. In E. Domany, J. L. van Hemmen, and K. Schulten, editors, Models of Neural Networks III, pages 211–254. Springer, 1994.
[15] R. M. Neal and G. E. Hinton. A view of the EM algorithm that justifies incremental, sparse, and other variants. In M. I. Jordan, editor, Learning in Graphical Models, pages 355–368. The MIT press, 1998.
[16] C. D. Sigg and J. M. Buhmann. Expectation-maximization for sparse and non-negative PCA. In 25th International Conference on Machine Learning (ICML). ACM, 2008.
[17] R. Tibshirani. Regression shrinkage and selection via the LASSO. Journal of the Royal Statistical Society B, 58:267–288, 1996.
[18] M. E. Tipping and C. M. Bishop. Probabilistic principal component analysis. Journal of the Royal Statistical Society B, 61:611–622, 1999.
[19] D. Torres, D. Turnbull, B. K. Sriperumbudur, L. Barrington, and G. Lanckriet. Finding musically meaningful words using sparse CCA. In NIPS Workshop on Music, Brain and Cognition, 2007.
[20] H. Zou, T. Hastie, and R. Tibshirani. Sparse principal component analysis. Journal of Computational and Graphical Statistics, 15(2):265–286, 2006.