NIPS 2003, Paper 40
Author: Harald Steck, Tommi S. Jaakkola
Abstract: The bootstrap has become a popular method for exploring model (structure) uncertainty. Our experiments with artificial and real-world data demonstrate that the graphs learned from bootstrap samples can be severely biased towards overly complex graphical models. Accounting for this bias is hence essential, e.g., when exploring model uncertainty. We find that this bias is intimately tied to (well-known) spurious dependences induced by the bootstrap. The leading-order bias-correction equals one half of Akaike’s penalty for model complexity. We demonstrate the effect of this simple bias-correction in our experiments. We also relate this bias to the bias of the plug-in estimator for entropy, as well as to the difference between the expected test and training errors of a graphical model, which asymptotically equals Akaike’s penalty (rather than one half).
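The entropy connection in the abstract is easy to check numerically. Below is a minimal sketch in Python (not code from the paper; the alphabet size K, sample size N, trial count, and the helper plugin_entropy are illustrative assumptions): the plug-in entropy estimate of an i.i.d. sample is biased low by roughly (K-1)/(2N) nats (Miller's correction [11]), and a bootstrap resample drawn from that sample roughly doubles the bias relative to the true entropy, illustrating the spurious dependences the paper ties to the bootstrap.

```python
import numpy as np

rng = np.random.default_rng(0)

def plugin_entropy(counts):
    """Plug-in (maximum-likelihood) entropy estimate, in nats."""
    p = counts / counts.sum()
    p = p[p > 0]
    return -np.sum(p * np.log(p))

K, N = 8, 50            # alphabet size and sample size (illustrative choices)
true_H = np.log(K)      # entropy of the uniform source on K symbols

n_trials = 2000
sample_bias, boot_bias = [], []
for _ in range(n_trials):
    # Draw an i.i.d. sample from the true (uniform) distribution.
    sample = rng.integers(0, K, size=N)
    sample_bias.append(plugin_entropy(np.bincount(sample, minlength=K)) - true_H)
    # Bootstrap: resample with replacement from the empirical sample.
    boot = rng.choice(sample, size=N, replace=True)
    boot_bias.append(plugin_entropy(np.bincount(boot, minlength=K)) - true_H)

print(f"true entropy                 : {true_H:.4f} nats")
print(f"mean plug-in bias (sample)   : {np.mean(sample_bias):+.4f}")
print(f"mean plug-in bias (bootstrap): {np.mean(boot_bias):+.4f}  # roughly doubled")
print(f"Miller correction (K-1)/(2N) : {(K - 1) / (2 * N):.4f}")
```

This factor-of-two picture matches the abstract's accounting: correcting the bootstrap estimate calls for one half of Akaike's penalty, while the expected test-minus-training gap corresponds to the full penalty.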
[1] H. Akaike. Information theory and an extension of the maximum likelihood principle. In International Symposium on Information Theory, pp. 267–281, 1973.
[2] A. G. Carlton. On the bias of information estimates. Psychological Bulletin, 71:108–113, 1969.
[3] G. Cooper and E. Herskovits. A Bayesian method for constructing Bayesian belief networks from databases. In UAI, pp. 86–94, 1991.
[4] A. C. Davison and D. V. Hinkley. Bootstrap Methods and Their Application. Cambridge University Press, 1997.
[5] B. Efron and R. J. Tibshirani. An Introduction to the Bootstrap. Chapman & Hall, 1993.
[6] N. Friedman, M. Goldszmidt, and A. Wyner. Data analysis with Bayesian networks: A bootstrap approach. In UAI, pp. 196–205, 1999.
[7] N. Friedman, M. Goldszmidt, and A. Wyner. On the application of the bootstrap for computing confidence measures on features of induced Bayesian networks. In AI & Statistics, pp. 197–202, 1999.
[8] N. Friedman, M. Linial, I. Nachman, and D. Pe’er. Using Bayesian networks to analyze expression data. Journal of Computational Biology, 7:601–620, 2000.
[9] A. J. Hartemink, D. K. Gifford, T. S. Jaakkola, and R. A. Young. Combining location and expression data for principled discovery of genetic regulatory networks. In Pacific Symposium on Biocomputing, 2002.
[10] D. Heckerman, D. Geiger, and D. M. Chickering. Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning, 20:197–243, 1995.
[11] G. A. Miller. Note on the bias of information estimates. In Information Theory in Psychology: Problems and Methods, pp. 95–100, 1955.
[12] D. Pe’er, A. Regev, G. Elidan, and N. Friedman. Inferring subnetworks from perturbed expression profiles. Bioinformatics, 1:1–9, 2001.
[13] D. J. Spiegelhalter, N. G. Best, B. P. Carlin, and A. van der Linde. Bayesian measures of model complexity and fit. J. R. Stat. Soc. B, 64:583–639, 2002.
[14] H. Steck and T. S. Jaakkola. (Semi-)predictive discretization during model selection. AI Memo 2003-002, MIT, 2003.
[15] M. Stone. An asymptotic equivalence of choice of model by cross-validation and Akaike’s criterion. J. R. Stat. Soc. B, 39:44–47, 1977.
[16] J. D. Victor. Asymptotic bias in information estimates and the exponential (Bell) polynomials. Neural Computation, 12:2797–2804, 2000.