NIPS 2002, paper 42: reference knowledge graph by maker-knowledge-mining
Source: pdf
Author: Jürgen Schmidhuber
Abstract: Given is a problem sequence and a probability distribution (the bias) on programs computing solution candidates. We present an optimally fast way of incrementally solving each task in the sequence. Bias shifts are computed by program prefixes that modify the distribution on their suffixes by reusing successful code for previous tasks (stored in non-modifiable memory). No tested program gets more runtime than its probability times the total search time. In illustrative experiments, ours becomes the first general system to learn a universal solver for arbitrary n-disk Towers of Hanoi tasks (minimal solution size 2^n - 1). It demonstrates the advantages of incremental learning by profiting from previously solved, simpler tasks involving samples of a simple context-free language.

1 Brief Introduction to Optimal Universal Search

Consider an asymptotically optimal method for tasks with quickly verifiable solutions:

Method 1.1 (LSEARCH) View the n-th binary string (0, 1, 00, 01, 10, 11, 000, ...) as a potential program for a universal Turing machine. Given some problem, for all n do: every 2^n steps on average execute (if possible) one instruction of the n-th program candidate, until one of the programs has computed a solution.
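Method 1.1 can be illustrated by a minimal sketch, assuming a toy setting where "running" a candidate program simply costs a few steps proportional to its length before it emits its own bitstring as output, and a fast verifier checks that output. The phase structure (in phase k, candidate n receives 2^(k-n) steps) realizes the time share proportional to 2^(-n) per candidate; the enumeration helper and the cost model are illustrative assumptions, not part of the original method.

```python
def nth_binary_string(n):
    """Return the n-th binary string in the order 0, 1, 00, 01, 10, 11, 000, ..."""
    # Binary expansion of n+1 with the leading 1 dropped yields exactly this order.
    return bin(n + 1)[3:]

def lsearch(verify, max_phase=30):
    """Toy LSEARCH: time-share candidates so candidate n gets a 2^(-n) fraction."""
    progress = {}  # candidate index -> instructions executed so far
    for k in range(1, max_phase + 1):
        for n in range(1, k + 1):
            budget = 2 ** (k - n)          # candidate n's share in phase k
            prog = nth_binary_string(n)
            cost = 2 * len(prog) + 1       # hypothetical runtime of this toy program
            progress[n] = progress.get(n, 0) + budget
            # Once the toy program has "halted", test its output with the verifier.
            if progress[n] >= cost and verify(prog):
                return prog
    return None

# Usage: search for a bitstring accepted by a fast verifier.
print(lsearch(lambda s: s == "1011"))  # finds "1011"
```

Because candidate n only ever consumes a 2^(-n) fraction of the elapsed steps, no candidate gets more runtime than its (here uniform-prefix) probability times the total search time, which is the property the abstract states for the general system.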
[1] C. W. Anderson. Learning and Problem Solving with Multilayer Connectionist Systems. PhD thesis, University of Massachusetts, Dept. of Comp. and Inf. Sci., 1986.
[2] N. L. Cramer. A representation for the adaptive generation of simple sequential programs. In J.J. Grefenstette, editor, Proceedings of an International Conference on Genetic Algorithms and Their Applications, Carnegie-Mellon University, July 24-26, 1985, Hillsdale NJ, 1985. Lawrence Erlbaum Associates.
[3] M. Hutter. The fastest and shortest algorithm for all well-defined problems. International Journal of Foundations of Computer Science, 13(3):431–443, 2002.
[4] L.P. Kaelbling, M.L. Littman, and A.W. Moore. Reinforcement learning: a survey. Journal of AI research, 4:237–285, 1996.
[5] P. Langley. Learning to search: from weak methods to domain-specific heuristics. Cognitive Science, 9:217–260, 1985.
[6] L. A. Levin. Universal sequential search problems. Problems of Information Transmission, 9(3):265–266, 1973.
[7] M. Li and P. M. B. Vitányi. An Introduction to Kolmogorov Complexity and its Applications (2nd edition). Springer, 1997.
[8] C. H. Moore and G. C. Leach. FORTH - a language for interactive computing, 1970. http://www.ultratechnology.com.
[9] J. Schmidhuber. Optimal ordered problem solver. Technical Report IDSIA-12-02, arXiv:cs.AI/0207097 v1, IDSIA, Manno-Lugano, Switzerland, July 2002.
[10] J. Schmidhuber, J. Zhao, and M. Wiering. Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement. Machine Learning, 28:105–130, 1997.
[11] R.J. Solomonoff. An application of algorithmic probability to problems in artificial intelligence. In L. N. Kanal and J. F. Lemmer, editors, Uncertainty in Artificial Intelligence, pages 473–491. Elsevier Science Publishers, 1986.