nips nips2012 nips2012-283 nips2012-283-reference knowledge-graph by maker-knowledge-mining

283 nips-2012-Putting Bayes to sleep

Source: pdf

Author: Dmitry Adamskiy, Manfred K. Warmuth, Wouter M. Koolen

Abstract: We consider sequential prediction algorithms that are given the predictions from a set of models as inputs. If the nature of the data is changing over time in that different models predict well on different segments of the data, then adaptivity is typically achieved by mixing into the weights in each round a bit of the initial prior (kind of like a weak restart). However, what if the favored models in each segment are from a small subset, i.e. the data is likely to be predicted well by models that predicted well before? Curiously, ﬁtting such “sparse composite models” is achieved by mixing in a bit of all the past posteriors. This self-referential updating method is rather peculiar, but it is efﬁcient and gives superior performance on many natural data sets. Also it is important because it introduces a long-term memory: any model that has done well in the past can be recovered quickly. While Bayesian interpretations can be found for mixing in a bit of the initial prior, no Bayesian interpretation is known for mixing in past posteriors. We build atop the “specialist” framework from the online learning literature to give the Mixing Past Posteriors update a proper Bayesian foundation. We apply our method to a well-studied multitask learning problem and obtain a new intriguing efﬁcient update that achieves a signiﬁcantly better bound. 1

283 nips-2012-Putting Bayes to sleep

reference text