Accelerated Neural Evolution through Cooperatively Coevolved Synapses (JMLR 2008)

Author: Faustino Gomez, Jürgen Schmidhuber, Risto Miikkulainen

Abstract: Many complex control problems require sophisticated solutions that are not amenable to traditional controller design. Not only is it difficult to model real world systems, but often it is unclear what kind of behavior is required to solve the task. Reinforcement learning (RL) approaches have made progress by using direct interaction with the task environment, but have so far not scaled well to large state spaces and environments that are not fully observable. In recent years, neuroevolution, the artificial evolution of neural networks, has had remarkable success in tasks that exhibit these two properties. In this paper, we compare a neuroevolution method called Cooperative Synapse Neuroevolution (CoSyNE), that uses cooperative coevolution at the level of individual synaptic weights, to a broad range of reinforcement learning algorithms on very difficult versions of the pole balancing problem that involve large (continuous) state spaces and hidden state. CoSyNE is shown to be significantly more efficient and powerful than the other methods on these tasks.

Keywords: coevolution, recurrent neural networks, non-linear control, genetic algorithms, experimental comparison

reference text

J. S. Albus. A new approach to manipulator control: The cerebellar model articulation controller (CMAC). Journal of Dynamic Systems, Measurement, and Control, 97(3):220–227, 1975.
C. W. Anderson. Learning to control an inverted pendulum using neural networks. IEEE Control Systems Magazine, 9:31–37, 1989.
C. W. Anderson. Strategy learning with multilayer connectionist representations. Technical Report TR87-509.3, GTE Labs, Waltham, MA, 1987.
L. C. Baird and Andrew W. Moore. Gradient descent reinforcement learning. In Advances in Neural Information Processing Systems 12, 1999.
R. K. Belew, J. McInerney, and N. N. Schraudolph. Evolving networks: Using the genetic algorithm with connectionist learning. In C. G. Langton, C. Taylor, J. D. Farmer, and S. Rasmussen, editors, Proceedings of the Workshop on Artificial Life (ALIFE ’90). Reading, MA: Addison-Wesley, 1991. ISBN 0-201-52570-4.
J. A. Boyan and A. W. Moore. Generalization in reinforcement learning: Safely approximating the value function. In Advances in Neural Information Processing Systems 7, 1995.
B. Bryant and R. Miikkulainen. Neuroevolution of adaptive teams: Learning heterogeneous behavior in homogeneous multi-agent systems. In Congress in Evolutionary Computation, Canberra, Australia, 2003.
P. J. Darwen. Co-Evolutionary Learning by Automatic Modularization with Speciation. PhD thesis, University College, University of New South Wales, November 1996.
R. Eriksson and B. Olsson. Cooperative coevolution in inventory control optimization. In Proceedings of 3rd International Conference on Artificial Neural Networks and Genetic Algorithms, 1997.
F. Gomez and R. Miikkulainen. Incremental evolution of complex general behavior. Adaptive Behavior, 5:317–342, 1997.
F. Gomez, D. Burger, and R. Miikkulainen. A neuroevolution method for dynamic resource allocation on a chip multiprocessor. In Proceedings of the INNS-IEEE International Joint Conference on Neural Networks, pages 2355–2361, Piscataway, NJ, 2001. IEEE.
F. J. Gomez. Robust Nonlinear Control through Neuroevolution. PhD thesis, Department of Computer Sciences, The University of Texas at Austin, August 2003. Technical Report AI-TR-03-303.
F. J. Gomez and R. Miikkulainen. Active guidance for a finless rocket using neuroevolution. In E. Cantú-Paz et al., editor, Proceedings of the Genetic Evolutionary Computation Conference (GECCO-03). Springer-Verlag, Berlin; New York, 2003.
D. Grady. The vision thing: Mainly in the brain. Discover, 14:57–66, June 1993.
U. Grasemann and R. Miikkulainen. Effective image compression using evolved wavelets. In Proceedings of the Genetic Evolutionary Computation Conference (GECCO-05), New York, 2005. ACM. ISBN 1-59593-010-8.
B. Greer, H. Hakonen, R. Lahdelma, and R. Miikkulainen. Numerical optimization with neuroevolution. In Proceedings of the 2002 Congress on Evolutionary Computation (CEC2002), 2002.
F. Gruau, D. Whitley, and L. Pyeatt. A comparison between cellular encoding and direct encoding for genetic neural networks. In J. R. Koza, D. E. Goldberg, D. B. Fogel, and R. L. Riolo, editors, Genetic Programming 1996: Proceedings of the First Annual Conference, pages 81–89, Cambridge, MA, 1996a. MIT Press.
F. Gruau, D. Whitley, and L. Pyeatt. A comparison between cellular encoding and direct encoding for genetic neural networks. Technical Report NC-TR-96-048, NeuroCOLT, 1996b.
G. Grudic. Simulation code for policy gradient reinforcement learning. http://www.cis.upenn.edu/ grudic/PGRLSim/, 2000.
N. Hansen and A. Ostermeier. Completely derandomized self-adaptation in evolution strategies. Evolutionary Computation, 9(2):159–195, 2001.
S. A. Harp, T. Samad, and A. Guha. Towards the genetic synthesis of neural networks. In Proceedings of the Third International Conference on Genetic Algorithms, pages 360–369, 1989.
S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
S. Hochreiter, Y. Bengio, P. Frasconi, and J. Schmidhuber. Gradient flow in recurrent nets: the difficulty of learning long-term dependencies. In S. C. Kremer and J. F. Kolen, editors, A Field Guide to Dynamical Recurrent Neural Networks. IEEE Press, 2001.
J. H. Holland and J. S. Reitman. Cognitive systems based on adaptive algorithms. In D. A. Waterman and F. Hayes-Roth, editors, Pattern-Directed Inference Systems. Academic Press, New York, 1978.
J. Horn, D. E. Goldberg, and K. Deb. Implicit niching in a learning classifier system: Nature’s way. Evolutionary Computation, 2(1):37–66, 1994.
P. Husbands and F. Mill. Simulated co-evolution as the mechanism for emergent planning and scheduling. In R. K. Belew and L. B. Booker, editors, Proceedings of the Fourth International Conference on Genetic Algorithms, pages 264–270. San Francisco, CA: Morgan Kaufmann, 1991. ISBN 1-55860-208-9.
C. Igel. Neuroevolution for reinforcement learning using evolution strategies. In R. Reynolds, H. Abbass, K. C. Tan, B. McKay, D. Essam, and T. Gedeon, editors, Congress on Evolutionary Computation (CEC 2003), volume 4, pages 2588–2595. IEEE, 2003.
J.-S. R. Jang. Self-learning fuzzy controllers based on temporal backpropagation. IEEE Transactions on Neural Networks, 3(5):714–723, September 1992.
T. Jansen and R. P. Wiegand. The cooperative coevolutionary (1+1) EA. Evolutionary Computation, 12(4), 2004.
T. Jansen and R. P. Wiegand. Exploring the explorative advantage of the CC (1+1) EA. In E. Cantú-Paz et al., editor, Proceedings of the Genetic Evolutionary Computation Conference (GECCO-03). Springer-Verlag, Berlin; New York, 2003.
D. Jefferson, R. Collins, C. Cooper, M. Dyer, M. Flowers, R. Korf, C. Taylor, and A. Wang. Evolution as a theme in artificial life: The Genesys/Tracker system. In C. G. Langton, C. Taylor, J. D. Farmer, and S. Rasmussen, editors, Proceedings of the Workshop on Artificial Life (ALIFE ’90). Reading, MA: Addison-Wesley, 1991. ISBN 0-201-52570-4.
H. Kitano. Designing neural networks using genetic algorithms with graph generation system. Complex Systems, 4:461–476, 1990.
J. R. Koza. Genetic Programming. MIT Press, Cambridge, MA, 1991.
L.-J. Lin. Self-improving reactive agents based on reinforcement learning, planning, and teaching. Machine Learning, 8(3):293–321, 1992.
L.-J. Lin and T. M. Mitchell. Memory approaches to reinforcement learning in non-Markovian domains. Technical Report CMU-CS-92-138, Carnegie Mellon University, School of Computer Science, May 1992.
A. Lubberts and R. Miikkulainen. Co-evolving a Go-playing neural network. In Coevolution: Turning Adaptive Algorithms Upon Themselves, Birds-of-a-Feather Workshop, Genetic and Evolutionary Computation Conference (GECCO-2001), 2001.
M. Mandischer. Representation and evolution of neural networks. In R. F. Albrecht, C. R. Reeves, and N. C. Steele, editors, Proceedings of the Conference on Artificial Neural Nets and Genetic Algorithms at Innsbruck, Austria, pages 643–649. Springer-Verlag, 1993.
N. Meuleau, L. Peshkin, K.-E. Kim, and L. P. Kaelbling. Learning finite state controllers for partially observable environments. In 15th International Conference of Uncertainty in AI, 1999.
D. Michie and R. A. Chambers. BOXES: An experiment in adaptive control. In E. Dale and D. Michie, editors, Machine Intelligence. Oliver and Boyd, Edinburgh, UK, 1968.
G. Miller and D. Cliff. Co-evolution of pursuit and evasion I: Biological and game-theoretic foundations. Technical Report CSRP311, School of Cognitive and Computing Sciences, University of Sussex, Brighton, UK, 1994.
D. E. Moriarty. Symbiotic Evolution of Neural Networks in Sequential Decision Tasks. PhD thesis, Department of Computer Sciences, The University of Texas at Austin, 1997. Technical Report UT-AI97-257.
S. Nolfi and D. Parisi. Learning to adapt to changing environments in evolving neural networks. Technical Report 95-15, Institute of Psychology, National Research Council, Rome, Italy, 1995.
L. Panait, S. Luke, and J. F. Harrison. Archive-based cooperative coevolutionary algorithms. In GECCO ’06: Proceedings of the 8th annual conference on Genetic and evolutionary computation, pages 345–352, New York, NY, USA, 2006. ACM Press. ISBN 1-59593-186-4. doi: http://doi.acm.org/10.1145/1143997.1144060.
J. Paredis. Steps towards co-evolutionary classification neural networks. In R. A. Brooks and P. Maes, editors, Proceedings of the Fourth International Workshop on the Synthesis and Simulation of Living Systems (Artificial Life IV), pages 102–108. Cambridge, MA: MIT Press, 1994. ISBN 0-262-52190-3.
J. Paredis. Coevolutionary computation. Artificial Life, 2:355–375, 1995.
A. S. Perez-Bergquist. Applying ESP and region specialists to neuro-evolution for Go. Technical Report CSTR01-24, Department of Computer Sciences, The University of Texas at Austin, 2001.
J. B. Pollack, A. D. Blair, and M. Land. Coevolution of a backgammon player. In C. G. Langton and K. Shimohara, editors, Proceedings of the 5th International Workshop on Artificial Life: Synthesis and Simulation of Living Systems (ALIFE-96). Cambridge, MA: MIT Press, 1996. ISBN 0-262-62111-8.
M. A. Potter and K. A. De Jong. Evolving neural networks with collaborative species. In Proceedings of the 1995 Summer Computer Simulation Conference, 1995.
C. D. Rosin. Coevolutionary Search Among Adversaries. PhD thesis, University of California, San Diego, San Diego, CA, 1997.
J. C. Santamaria, R. S. Sutton, and A. Ram. Experiments with reinforcement learning in problems with continuous state and action spaces. Adaptive Behavior, 6(2):163–218, 1998.
N. Saravanan and D. B. Fogel. Evolving neural control systems. IEEE Expert, pages 23–27, June 1995.
K. O. Stanley. Efficient Evolution of Neural Networks Through Complexification. PhD thesis, Department of Computer Sciences, The University of Texas at Austin, August 2004. Technical Report AI-TR-04-314.
K. O. Stanley and R. Miikkulainen. Evolving neural networks through augmenting topologies. Evolutionary Computation, 10:99–127, 2002.
R. S. Sutton. Generalization in reinforcement learning: Successful examples using sparse coarse coding. In D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo, editors, Advances in Neural Information Processing Systems 8, pages 1038–1044. Cambridge, MA: MIT Press, 1996.
R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, 1998. ISBN 0-262-19398-1.
R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour. Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems 12, volume 12, pages 1057–1063. MIT Press, 2000.
G. Tesauro. Practical issues in temporal difference learning. Machine Learning, 8:257–277, 1992.
H. M. Voigt, J. Born, and I. Santibanez-Koref. Evolutionary structuring of artificial neural networks. Technical report, Technical University Berlin, Bio- and Neuroinformatics Research Group, 1993.
C. J. C. H. Watkins and P. Dayan. Q-learning. Machine Learning, 8(3):279–292, 1992.
P. Werbos. Backpropagation through time: what does it do and how to do it. In Proceedings of IEEE, volume 78, pages 1550–1560, 1990.
B. A. Whitehead and T. D. Choate. Cooperative–competitive genetic evolution of radial basis function centers and widths for time series prediction. IEEE Transactions on Neural Networks, 1995.
S. Whiteson, N. Kohl, R. Miikkulainen, and P. Stone. Evolving keepaway soccer players through task decomposition. In E. Cantú-Paz et al., editor, Proceedings of the Genetic Evolutionary Computation Conference (GECCO-03). Springer-Verlag, Berlin; New York, 2003.
D. Whitley, S. Dominic, R. Das, and Charles W. Anderson. Genetic reinforcement learning for neurocontrol problems. Machine Learning, 13:259–284, 1993.
R. P. Wiegand. An Analysis of Cooperative Coevolutionary Algorithms. PhD thesis, George Mason University, Fall 2003.
R. P. Wiegand, W. C. Liles, and K. A. De Jong. An empirical analysis of collaboration methods in cooperative coevolutionary algorithms. In L. Spector et al., editor, Proceedings of the Genetic and Evolutionary Computation Conference, pages 1235–1242. San Francisco, CA: Morgan Kaufmann, 2001. ISBN 1-55860-774-9. URL citeseer.ist.psu.edu/481900.html.
A. Wieland. Evolving neural network controllers for unstable systems. In Proceedings of the International Joint Conference on Neural Networks (Seattle, WA), pages 667–673. Piscataway, NJ: IEEE, 1991.
D. Wierstra, A. Foerster, J. Peters, and J. Schmidhuber. Solving deep memory POMDPs with recurrent policy gradients. In International Conference on Artificial Neural Networks, 2007.
B. Yamauchi and R. D. Beer. Integrating reactive, sequential, and learning behavior using dynamical neural networks. In D. Cliff, P. Husbands, J.-A. Meyer, and S. W. Wilson, editors, From Animals to Animats 3: Proceedings of the Third International Conference on Simulation of Adaptive Behavior, pages 382–391. Cambridge, MA: MIT Press, 1994. ISBN 0-262-53122-4.
X. Yao. Evolving artificial neural networks. Proceedings of the IEEE, 87(9):1423–1447, 1999.