nips2003-155
Source: pdf
Author: Jason Palmer, Bhaskar D. Rao, David P. Wipf
Abstract: Recently, relevance vector machines (RVM) have been fashioned from a sparse Bayesian learning (SBL) framework to perform supervised learning using a weight prior that encourages sparsity of representation. The methodology incorporates an additional set of hyperparameters governing the prior, one for each weight, and then adopts a specific approximation to the full marginalization over all weights and hyperparameters. Despite its empirical success, however, no rigorous motivation for this particular approximation is currently available. To address this issue, we demonstrate that SBL can be recast as the application of a rigorous variational approximation to the full model by expressing the prior in a dual form. This formulation obviates the necessity of assuming any hyperpriors and leads to natural, intuitive explanations of why sparsity is achieved in practice.
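To make the setup in the abstract concrete, the following is a minimal sketch of the standard SBL/RVM model and of the dual-form prior representation the abstract alludes to, written in the usual notation of Tipping [7]; the symbols $t$, $\Phi$, $w$, $\sigma^2$, $\gamma$, and $\varphi$ below follow that literature's conventions and are not quantities defined on this page.

$$ t = \Phi w + \epsilon, \qquad \epsilon \sim \mathcal{N}(0, \sigma^2 I), \qquad p(w \mid \gamma) = \prod_i \mathcal{N}(w_i;\, 0, \gamma_i). $$

SBL integrates out the weights and fits the hyperparameters by type-II maximum likelihood (evidence maximization):

$$ \hat{\gamma} = \arg\max_{\gamma \ge 0} \int p(t \mid w)\, p(w \mid \gamma)\, dw = \arg\max_{\gamma \ge 0} \mathcal{N}\!\left(t;\, 0,\; \sigma^2 I + \Phi\, \mathrm{diag}(\gamma)\, \Phi^{\top}\right). $$

The dual form mentioned in the abstract expresses each marginal of a suitable sparsity-inducing prior as a maximum over scaled Gaussians,

$$ p(w_i) = \max_{\gamma_i \ge 0} \mathcal{N}(w_i;\, 0, \gamma_i)\, \varphi(\gamma_i), $$

so that any fixed $\gamma$ yields a lower bound on the true prior, and hence a rigorous variational lower bound on the full marginal likelihood; maximizing this bound over $\gamma$ recovers the SBL procedure without introducing explicit hyperpriors. In the standard SBL picture, sparsity then arises because many of the fitted $\gamma_i$ are driven to zero, forcing the corresponding posterior weights to zero and pruning their basis functions.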
[1] C. Bishop and M. Tipping, “Variational relevance vector machines,” Proc. 16th Conf. Uncertainty in Artificial Intelligence, pp. 46–53, 2000.
[2] R. Duda, P. Hart, and D. Stork, Pattern Classification, Wiley, New York, 2nd ed., 2001.
[3] A.C. Faul and M.E. Tipping, “Analysis of sparse Bayesian learning,” Advances in Neural Information Processing Systems 14, pp. 383–389, 2002.
[4] M. Girolami, “A variational method for learning sparse and overcomplete representations,” Neural Computation, vol. 13, no. 11, pp. 2517–2532, 2001.
[5] M.I. Jordan, Z. Ghahramani, T. Jaakkola, and L.K. Saul, “An introduction to variational methods for graphical models,” Machine Learning, vol. 37, no. 2, pp. 183–233, 1999.
[6] D.J.C. MacKay, “Bayesian interpolation,” Neural Computation, vol. 4, no. 3, pp. 415–447, 1992.
[7] M.E. Tipping, “Sparse Bayesian learning and the relevance vector machine,” Journal of Machine Learning Research, vol. 1, pp. 211–244, 2001.