nips nips2012 nips2012-318 nips2012-318-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ben Calderhead, Mátyás A. Sustik
Abstract: One of the enduring challenges in Markov chain Monte Carlo methodology is the development of proposal mechanisms that make moves distant from the current point, are accepted with high probability, and incur low computational cost. The recent introduction of locally adaptive MCMC methods based on the natural underlying Riemannian geometry of such models goes some way to alleviating these problems for certain classes of models for which the metric tensor is analytically tractable; however, computational efficiency is not assured, due to the necessity of potentially high-dimensional matrix operations at each iteration. In this paper we firstly investigate a sampling-based approach for approximating the metric tensor and suggest a valid MCMC algorithm that extends the applicability of Riemannian Manifold MCMC methods to statistical models that do not admit an analytically computable metric tensor. Secondly, we show how the approximation scheme we consider naturally motivates the use of ℓ1 regularisation to improve estimates and obtain a sparse approximate inverse of the metric, which enables stable and sparse approximations of the local geometry to be made. We demonstrate the application of this algorithm for inferring the parameters of a realistic system of ordinary differential equations using a biologically motivated robust Student-t error model, for which the Expected Fisher Information is analytically intractable.
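As a rough illustration of the two ideas above, the Python sketch below (a minimal sketch, not the authors' implementation) estimates the expected Fisher information at a parameter value theta by Monte Carlo, drawing pseudo-data from the model and averaging outer products of score vectors, and then applies an ℓ1-penalised graphical lasso fit to obtain a sparse approximate inverse of the metric. The callables sample_data and score, the sample count n_samples, the penalty alpha and the jitter term are all illustrative assumptions; the paper's own solver is GLASSOFAST [20], for which scikit-learn's graphical_lasso is substituted here as a stand-in.

    import numpy as np
    from sklearn.covariance import graphical_lasso  # ell_1-penalised precision estimate

    def approximate_metric(theta, sample_data, score, n_samples=1000, alpha=0.01, rng=None):
        # Monte Carlo estimate of the expected Fisher information at theta,
        #   G(theta) = E_y[ s(y; theta) s(y; theta)^T ],  s = grad_theta log p(y | theta),
        # followed by an ell_1-regularised (graphical lasso) sparse approximate inverse.
        # sample_data(theta, rng) and score(y, theta) are hypothetical user-supplied
        # callables: the first draws pseudo-data from the model, the second returns
        # the score vector. Neither is part of the original paper's code.
        rng = np.random.default_rng() if rng is None else rng
        d = len(theta)
        G = np.zeros((d, d))
        for _ in range(n_samples):
            y = sample_data(theta, rng)   # y ~ p(y | theta)
            s = score(y, theta)           # gradient of the log-likelihood at theta
            G += np.outer(s, s)
        G /= n_samples                    # empirical Fisher information (PSD)
        G += 1e-8 * np.eye(d)             # small jitter for numerical stability
        # Graphical lasso returns a smoothed covariance and a sparse, positive-definite
        # precision; larger alpha gives sparser precision, stabilising the estimate
        # when n_samples is small relative to the dimension d.
        G_smoothed, G_inv_sparse = graphical_lasso(G, alpha=alpha)
        return G_smoothed, G_inv_sparse

The sparse precision matrix returned by such a routine would then play the role of the (inverse) metric tensor in the Riemannian manifold MCMC proposals of [1].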
[1] M. Girolami and B. Calderhead, Riemann Manifold Langevin and Hamiltonian Monte Carlo Methods (with discussion), Journal of the Royal Statistical Society: Series B, 73:123-214, 2011
[2] H. Haario, E. Saksman and J. Tamminen, An Adaptive Metropolis Algorithm, Bernoulli, 7(2):223-242, 2001
[3] G. Roberts and J. Rosenthal, Examples of Adaptive MCMC, Journal of Computational and Graphical Statistics, 18(2):349-367, 2009
[4] G. Roberts and O. Stramer, Langevin diffusions and Metropolis-Hastings algorithms, Methodol. Comput. Appl. Probab., 4, 337-358, 2003
[5] H. Tjelmeland and B. Hegstad, Mode Jumping Proposals in MCMC, Scandinavian Journal of Statistics, 28(1), 2001
[6] S. Amari and H. Nagaoka, Methods of Information Geometry, Oxford University Press, 2000
[7] Y. Qi and T. Minka, Hessian-based Markov Chain Monte-Carlo algorithms, 1st Cape Cod Workshop Monte Carlo Methods, 2002
[8] A. Honkela, T. Raiko, M. Kuusela, M. Tornio and J. Karhunen, Approximate Riemannian conjugate gradient learning for fixed-form variational Bayes, JMLR, 11:3235-3268, 2010
[9] M. K. Murray and J. W. Rice, Differential Geometry and Statistics, Chapman and Hall, 1993
[10] C. R. Rao, Information and accuracy attainable in the estimation of statistical parameters, Bull. Calc. Math. Soc., 37:81-91, 1945
[11] H. Jeffreys, Theory of Probability, 1st ed. The Clarendon Press, Oxford, 1939
[12] J. Kent, Time reversible diffusions, Adv. Appl. Probab., 10:819-835, 1978
[13] J. Besag, P. Green, D. Higdon, and K. Mengersen, Bayesian Computation and Stochastic Systems, Statistical Science, 10(1):3-41, 1995
[14] A. Doucet, P. Jacob and A. Johansen, Discussion of Riemann Manifold Langevin and Hamiltonian Monte Carlo Methods, Journal of the Royal Statistical Society: Series B, 73:162, 2011
[15] J. Friedman, T. Hastie and R. Tibshirani, Sparse inverse covariance estimation with the graphical lasso, Biostatistics, 9(3):432-441, 2008
[16] O. Banerjee, L. El Ghaoui and A. d’Aspremont, Model Selection Through Sparse Maximum Likelihood Estimation for Multivariate Gaussian or Binary Data, JMLR, 9(6), 2008
[17] P. Ravikumar, M. J. Wainwright, G. Raskutti and B. Yu, Model selection in Gaussian graphical models: High-dimensional consistency of ℓ1-regularized MLE, NIPS 21, 2008
[18] A. J. Rothman, P. J. Bickel, E. Levina and J. Zhu, Sparse permutation invariant covariance estimation, Electronic Journal of Statistics, 2:494-515, 2008
[19] J. Duchi, S. Gould and D. Koller, Projected Subgradient Methods for Learning Sparse Gaussians, Conference on Uncertainty in Artificial Intelligence, 2008
[20] M. A. Sustik and B. Calderhead, GLASSOFAST: An efficient GLASSO implementation, Technical Report, Computer Science Department, University of Texas at Austin, TR-12-29, 2012
[21] J. C. W. Locke, A. Millar and M. Turner, Modelling genetic networks with noisy and varied experimental data: the circadian clock in Arabidopsis thaliana, J. Theor. Biol. 234:383-393, 2005
[22] B. Calderhead and M. Girolami, Statistical analysis of nonlinear dynamical systems using differential geometric sampling methods, Journal of the Royal Society Interface Focus, 1(6), 2011
[23] R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society: Series B, 58:267-288, 1996