Source: pdf
Author: Tommi S. Jaakkola, Yuan Qi
Abstract: Bayesian inference has become increasingly important in statistical machine learning. Exact Bayesian calculations are often not feasible in practice, however. A number of approximate Bayesian methods have been proposed to make such calculations practical, among them the variational Bayesian (VB) approach. The VB approach, while useful, can nevertheless suffer from slow convergence to the approximate solution. To address this problem, we propose Parameter-eXpanded Variational Bayesian (PX-VB) methods to speed up VB. The new algorithm is inspired by parameter-expanded expectation maximization (PX-EM) and parameter-expanded data augmentation (PX-DA). Similar to PX-EM and PX-DA, PX-VB expands the model with auxiliary variables to reduce the coupling between variables in the original model. We analyze the convergence rates of VB and PX-VB and demonstrate the superior convergence rates of PX-VB in variational probit regression and automatic relevance determination.
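To make the setting concrete, below is a minimal sketch of the baseline that PX-VB accelerates: standard coordinate-ascent VB for Bayesian probit regression. This is not the paper's PX-VB algorithm; it only marks (in a comment) where a parameter-expanded rescaling step would be inserted. The model assumed here is z_i = x_i^T w + eps_i with unit-variance Gaussian noise, y_i = sign(z_i), and a Gaussian prior w ~ N(0, v I); the function name vb_probit and the argument names are illustrative choices, not taken from the paper.

    # Sketch only: plain VB coordinate ascent for probit regression,
    # the slowly converging baseline that PX-VB is designed to speed up.
    import numpy as np
    from scipy.stats import norm

    def vb_probit(X, y, prior_var=1.0, n_iters=100, tol=1e-6):
        """X: (n, d) design matrix; y: (n,) labels in {-1, +1}.
        Returns mean and covariance of the Gaussian factor q(w)."""
        n, d = X.shape
        # q(w) = N(m, V); with unit noise the covariance depends only on X and the prior.
        V = np.linalg.inv(X.T @ X + np.eye(d) / prior_var)
        m = np.zeros(d)
        for _ in range(n_iters):
            # Update q(z_i): a truncated Gaussian; only its mean is needed below.
            mu = X @ m
            Ez = mu + y * norm.pdf(mu) / np.clip(norm.cdf(y * mu), 1e-12, None)
            # Update q(w): Gaussian posterior given the expected latent targets.
            m_new = V @ (X.T @ Ez)
            # (A PX-VB variant would add an auxiliary-parameter rescaling step here
            #  to reduce the coupling between q(w) and q(z); not shown.)
            if np.max(np.abs(m_new - m)) < tol:
                m = m_new
                break
            m = m_new
        return m, V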
Beal, M. (2003). Variational algorithms for approximate Bayesian inference. Doctoral dissertation, Gatsby Computational Neuroscience Unit, University College London.
Bishop, C. M., & Tipping, M. E. (2000). Variational relevance vector machines. 16th UAI.
Jordan, M. I., Ghahramani, Z., Jaakkola, T. S., & Saul, L. K. (1998). An introduction to variational methods for graphical models. Learning in Graphical Models. http://www.ai.mit.edu/~tommi/papers.html
Liu, C., Rubin, D. B., & Wu, Y. N. (1998). Parameter expansion to accelerate EM: the PX-EM algorithm. Biometrika, 85, 755–770.
Liu, J. S., & Wu, Y. N. (1999). Parameter expansion for data augmentation. Journal of the American Statistical Association, 94, 1264–1274.
Luo, Z. Q., & Tseng, P. (1992). On the convergence of the coordinate descent method for convex differentiable minimization. Journal of Optimization Theory and Applications, 72, 7–35.
MacKay, D. J. (1992). Bayesian interpolation. Neural Computation, 4, 415–447.
Salakhutdinov, R., Roweis, S. T., & Ghahramani, Z. (2003). Optimization with EM and Expectation-Conjugate-Gradient. Proceedings of the International Conference on Machine Learning.
Tipping, M. E. (2000). The relevance vector machine. NIPS (pp. 652–658). The MIT Press.
van Dyk, D. A., & Meng, X. L. (2001). The art of data augmentation (with discussion). Journal of Computational and Graphical Statistics, 10, 1–111.