NIPS 2004, paper 198: reference knowledge graph
Title: Unsupervised Variational Bayesian Learning of Nonlinear Models
Source: pdf
Author: Antti Honkela, Harri Valpola
Abstract: In this paper we present a framework for using multi-layer perceptron (MLP) networks in nonlinear generative models trained by variational Bayesian learning. The nonlinearity is handled by linearizing it using a Gauss–Hermite quadrature at the hidden neurons. This yields an accurate approximation even in cases of large posterior variance. The method can be used to derive nonlinear counterparts for linear algorithms such as factor analysis, independent component/factor analysis and state-space models. This is demonstrated with a nonlinear factor analysis experiment in which even 20 sources can be estimated from a real-world speech data set.
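To make the quadrature idea concrete, the sketch below (not the authors' code) shows how Gauss–Hermite quadrature can propagate a Gaussian posterior through a hidden-neuron nonlinearity, estimating the mean and variance of phi(x) for x ~ N(mu, sigma^2). The choice of tanh as the activation, the five-point rule, and the example values are illustrative assumptions.

```python
import numpy as np

def gauss_hermite_moments(mu, var, phi=np.tanh, n_points=5):
    """Approximate E[phi(x)] and Var[phi(x)] for x ~ N(mu, var)
    using Gauss-Hermite quadrature.

    The change of variables x = mu + sqrt(2*var) * t turns
    int phi(x) N(x; mu, var) dx into (1/sqrt(pi)) * sum_i w_i phi(x_i)
    with the standard Gauss-Hermite nodes t_i and weights w_i.
    """
    t, w = np.polynomial.hermite.hermgauss(n_points)  # nodes and weights
    x = mu + np.sqrt(2.0 * var) * t                   # transformed evaluation points
    w = w / np.sqrt(np.pi)                            # normalize weights to sum to 1
    mean = np.sum(w * phi(x))                         # E[phi(x)]
    second = np.sum(w * phi(x) ** 2)                  # E[phi(x)^2]
    return mean, second - mean ** 2                   # output mean and variance

# Example: a hidden neuron with large posterior variance, where a plain
# first-order Taylor linearization phi(x) ~ phi(mu) + phi'(mu) (x - mu)
# overestimates the output variance.
mu, var = 0.5, 4.0
m_q, v_q = gauss_hermite_moments(mu, var)

# Taylor (point-linearization) reference for comparison.
m_t = np.tanh(mu)
v_t = (1.0 - np.tanh(mu) ** 2) ** 2 * var

print(f"quadrature: mean={m_q:.4f}, var={v_q:.4f}")
print(f"taylor:     mean={m_t:.4f}, var={v_t:.4f}")
```

With five nodes the rule integrates polynomials up to degree nine exactly, and because tanh saturates it keeps the propagated variance bounded, which is exactly the large-posterior-variance regime where a first-order Taylor linearization breaks down.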
[1] A. Hyvärinen, J. Karhunen, and E. Oja. Independent Component Analysis. J. Wiley, 2001.
[2] G. E. Hinton and D. van Camp. Keeping neural networks simple by minimizing the description length of the weights. In Proc. of the 6th Ann. ACM Conf. on Computational Learning Theory, pp. 5–13, Santa Cruz, CA, USA, 1993.
[3] P. Sykacek and S. Roberts. Adaptive classification by variational Kalman filtering. In Advances in Neural Information Processing Systems 15, pp. 753–760. MIT Press, 2003.
[4] S. Haykin. Neural Networks – A Comprehensive Foundation, 2nd ed. Prentice-Hall, 1999.
[5] D. Barber and C. Bishop. Ensemble learning for multi-layer networks. In Advances in Neural Information Processing Systems 10, pp. 395–401. MIT Press, 1998.
[6] H. Lappalainen and A. Honkela. Bayesian nonlinear independent component analysis by multi-layer perceptrons. In M. Girolami, ed., Advances in Independent Component Analysis, pp. 93–121. Springer-Verlag, Berlin, 2000.
[7] H. Valpola, E. Oja, A. Ilin, A. Honkela, and J. Karhunen. Nonlinear blind source separation by variational Bayesian learning. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, E86-A(3):532–541, 2003.
[8] H. Attias. Independent factor analysis. Neural Computation, 11(4):803–851, 1999.
[9] H. Valpola and J. Karhunen. An unsupervised ensemble learning method for nonlinear dynamic state-space models. Neural Computation, 14(11):2647–2692, 2002.
[10] H. Attias. A variational Bayesian framework for graphical models. In Advances in Neural Information Processing Systems 12, pp. 209–215. MIT Press, 2000.
[11] Z. Ghahramani and M. Beal. Propagation algorithms for variational Bayesian learning. In Advances in Neural Information Processing Systems 13, pp. 507–513. MIT Press, 2001.
[12] A. Honkela. Approximating nonlinear transformations of probability distributions for nonlinear independent component analysis. In Proc. 2004 IEEE Int. Joint Conf. on Neural Networks (IJCNN 2004), pp. 2169–2174, Budapest, Hungary, 2004.
[13] S. Julier and J. K. Uhlmann. A general method for approximating nonlinear transformations of probability distributions. Technical report, Robotics Research Group, Department of Engineering Science, University of Oxford, 1996.
[14] F. Curbera. Delayed curse of dimension for Gaussian integration. Journal of Complexity, 16(2):474–506, 2000.