
37 jmlr-2010-Evolving Static Representations for Task Transfer


Source: pdf

Author: Phillip Verbancsics, Kenneth O. Stanley

Abstract: An important goal for machine learning is to transfer knowledge between tasks. For example, learning to play RoboCup Keepaway should contribute to learning the full game of RoboCup soccer. Previous approaches to transfer in Keepaway have focused on transforming the original representation to fit the new task. In contrast, this paper explores the idea that transfer is most effective if the representation is designed to be the same even across different tasks. To demonstrate this point, a bird’s eye view (BEV) representation is introduced that can represent different tasks on the same two-dimensional map. For example, both the 3 vs. 2 and 4 vs. 3 Keepaway tasks can be represented on the same BEV. Yet the problem is that a raw two-dimensional map is high-dimensional and unstructured. This paper shows how this problem is addressed naturally by an idea from evolutionary computation called indirect encoding, which compresses the representation by exploiting its geometry. The result is that the BEV learns a Keepaway policy that transfers without further learning or manipulation. It also facilitates transferring knowledge learned in a different domain, Knight Joust, into Keepaway. Finally, the indirect encoding of the BEV means that its geometry can be changed without altering the solution. Thus static representations facilitate several kinds of transfer.
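To make the BEV idea concrete, the sketch below rasterizes a Keepaway state onto a fixed two-dimensional grid and generates connection weights as a smooth function of grid geometry. This is a minimal illustration, not the paper's implementation: the field size, grid resolution, single-channel layout, marker values, and the Gaussian weight function are all assumptions made here for clarity (in HyperNEAT the weight-generating function is evolved). The point it demonstrates is that 3 vs. 2 and 4 vs. 3 states produce inputs of identical shape, so no inter-task mapping is needed.

    import numpy as np

    def birds_eye_view(keepers, takers, ball, field=25.0, res=20):
        """Rasterize a Keepaway state onto a fixed res-by-res grid.
        Illustrative sketch: marker values and the single-channel
        layout are assumptions, not the paper's exact BEV setup."""
        grid = np.zeros((res, res))

        def cell(pos):
            # Map continuous field coordinates in [0, field) to a grid cell.
            x, y = pos
            return (min(int(x / field * res), res - 1),
                    min(int(y / field * res), res - 1))

        for p in keepers:
            grid[cell(p)] = 1.0    # teammates
        for p in takers:
            grid[cell(p)] = -1.0   # opponents
        grid[cell(ball)] = 0.5     # ball
        return grid

    def indirect_weights(res=20, sigma=0.3):
        """Toy indirect encoding: each cell-to-cell weight is a function
        of the two cells' coordinates (a fixed Gaussian of their distance
        here, purely as a stand-in for an evolved function). Because the
        pattern is generated from geometry rather than stored weight by
        weight, it can be regenerated at a different resolution."""
        axis = np.linspace(-1.0, 1.0, res)
        xx, yy = np.meshgrid(axis, axis, indexing="ij")
        pts = np.stack([xx.ravel(), yy.ravel()], axis=1)      # (res*res, 2)
        dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
        return np.exp(-dist ** 2 / (2 * sigma ** 2))          # (res^2, res^2)

    # 3 vs. 2 and 4 vs. 3 states yield inputs of identical shape, so the
    # same policy applies to both without any inter-task mapping.
    bev_3v2 = birds_eye_view([(2, 2), (20, 3), (12, 22)],
                             [(10, 10), (14, 12)], ball=(3, 3))
    bev_4v3 = birds_eye_view([(2, 2), (20, 3), (12, 22), (22, 20)],
                             [(10, 10), (14, 12), (8, 16)], ball=(3, 3))
    assert bev_3v2.shape == bev_4v3.shape == (20, 20)

Because the weights come from a coordinate-based function rather than a per-connection table, regenerating them at a new resolution changes the representation's geometry without discarding the solution, which is the sense in which the static representation facilitates transfer.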


Reference text

Timo Aaltonen et al. Measurement of the top quark mass with dilepton events selected using neuroevolution at CDF. Physical Review Letters, 2009.
Lee Altenberg. Evolving better representations through selective genome growth. In Proceedings of the IEEE World Congress on Computational Intelligence, pages 182–187, Piscataway, NJ, 1994. IEEE Press.
Peter J. Angeline, Gregory M. Saunders, and Jordan B. Pollack. An evolutionary algorithm that constructs recurrent neural networks. IEEE Transactions on Neural Networks, 5:54–65, 1994.
Peter J. Bentley and Sanjeev Kumar. Three ways to grow designs: A comparison of embryogenies for an evolutionary design problem. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-1999), pages 35–43, San Francisco, 1999. Morgan Kaufmann.
Luigi Cardamone, Daniele Loiacono, and Pier Luca Lanzi. On-line neuroevolution applied to The Open Racing Car Simulator. In Proceedings of the 2009 IEEE Congress on Evolutionary Computation (IEEE CEC 2009), Piscataway, NJ, USA, 2009. IEEE Press.
Rich Caruana. Multitask learning. Machine Learning, 28(1):41–75, 1997.
Mao Chen, Klaus Dorer, Ehsan Foroughi, Fredrik Heintz, ZhanXiang Huang, Spiros Kapetanakis, Kostas Kostiadis, Johan Kummeneje, Jan Murray, Itsuki Noda, Oliver Obst, Pat Riley, Timo Steffens, Yi Wang, and Xiang Yin. RoboCup Soccer Server: User's Manual. The RoboCup Federation, 4.00 edition, February 2003.
Peter Clark. Machine and Human Learning. Kogan Page, London, 1989.
Jeff Clune, Benjamin E. Beckmann, Charles Ofria, and Robert T. Pennock. Evolving coordinated quadruped gaits with the HyperNEAT generative encoding. In Proceedings of the IEEE Congress on Evolutionary Computation (CEC-2009) Special Section on Evolutionary Robotics, Piscataway, NJ, USA, 2009. IEEE Press.
Ronan Collobert and Jason Weston. A unified architecture for natural language processing: Deep neural networks with multitask learning. In Proceedings of the 25th International Conference on Machine Learning, New York, NY, 2008. ACM Press.
David B. D'Ambrosio and Kenneth O. Stanley. Evolving policy geometry for scalable multiagent learning. In Proceedings of the Ninth International Conference on Autonomous Agents and Multiagent Systems (AAMAS-2010), New York, NY, USA, 2010. ACM Press.
David B. D'Ambrosio and Kenneth O. Stanley. Generative encoding for multiagent learning. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2008), New York, NY, 2008. ACM Press.
Carlos Diuk, Andre Cohen, and Michael L. Littman. An object-oriented representation for efficient reinforcement learning. In ICML '08: Proceedings of the 25th International Conference on Machine Learning, pages 240–247, New York, NY, USA, 2008. ACM. ISBN 978-1-60558-205-4. doi: 10.1145/1390156.1390187.
Sašo Džeroski, Luc De Raedt, and Kurt Driessens. Relational reinforcement learning. Machine Learning, 43(1-2):7–52, April–May 2001.
Jason Gauci and Kenneth O. Stanley. Generating large-scale neural networks through discovering geometric regularities. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2007), New York, NY, 2007. ACM Press.
Jason Gauci and Kenneth O. Stanley. A case study on the critical role of geometric regularity in machine learning. In Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (AAAI-2008), Menlo Park, CA, 2008. AAAI Press.
Jason Gauci and Kenneth O. Stanley. Autonomous evolution of topographic regularities in artificial neural networks. Neural Computation, 2010.
Faustino Gomez and Risto Miikkulainen. Solving non-Markovian control tasks with neuroevolution. In Proceedings of the 16th International Joint Conference on Artificial Intelligence, pages 1356–1361, San Francisco, 1999. Morgan Kaufmann.
Frederic Gruau, Darrell Whitley, and Larry Pyeatt. A comparison between cellular encoding and direct encoding for genetic neural networks. In John R. Koza, David E. Goldberg, David B. Fogel, and Rick L. Riolo, editors, Genetic Programming 1996: Proceedings of the First Annual Conference, pages 81–89, Cambridge, MA, 1996. MIT Press.
Inman Harvey. The Artificial Evolution of Adaptive Behavior. PhD thesis, School of Cognitive and Computing Sciences, University of Sussex, Sussex, 1993.
Gregory S. Hornby and Jordan B. Pollack. Creating high-level components with a generative representation for body-brain evolution. Artificial Life, 8(3), 2002.
Shivaram Kalyanakrishnan, Yaxin Liu, and Peter Stone. Half field offense in RoboCup soccer: A multiagent reinforcement learning case study. In RoboCup 2006: Robot Soccer World Cup X, volume 4434 of Lecture Notes in Computer Science. Springer, Berlin/Heidelberg, 2007.
Hiroaki Kitano, Minoru Asada, Yasuo Kuniyoshi, Itsuki Noda, Eiichi Osawa, and Hitoshi Matsubara. RoboCup: A challenge problem for AI. AI Magazine, 18(1):73–87, Spring 1997.
Jelle R. Kok, Matthijs T. J. Spaan, and Nikos Vlassis. Non-communicative multi-robot coordination in dynamic environments. Robotics and Autonomous Systems, 50(2-3):99–114, February 2005.
Benjamin Kuipers. The spatial semantic hierarchy. Artificial Intelligence, 119:191–233, 2000.
Vadym Kyrylov, Martin Greber, and David Bergman. Multi-criteria optimization of ball passing in simulated soccer. Journal of Multi-Criteria Decision Analysis, 13:103–113, 2005.
Aristid Lindenmayer. Mathematical models for cellular interaction in development, parts I and II. Journal of Theoretical Biology, 18:280–299 and 300–315, 1968.
Alan Mackworth. Agents, bodies, constraints, dynamics, and evolution. AI Magazine, 30(1):7–28, Spring 2009.
Andrew P. Martin. Increasing genomic complexity by gene duplication and the origin of vertebrates. The American Naturalist, 154(2):111–128, 1999.
Jan H. Metzen, Mark Edgington, Yohannes Kassahun, and Frank Kirchner. Performance evaluation of EANT in the RoboCup Keepaway benchmark. In ICMLA '07: Proceedings of the Sixth International Conference on Machine Learning and Applications, pages 342–347, Washington, DC, USA, 2007. IEEE Computer Society. ISBN 0-7695-3069-9. doi: 10.1109/ICMLA.2007.80.
Eduardo Morales. Scaling up reinforcement learning with a relational representation. In Proceedings of the Workshop on Adaptability in Multi-agent Systems (AORC-2003), pages 15–26, Sydney, Australia, January 2003.
Sinno Jialin Pan and Qiang Yang. A survey on transfer learning. Technical Report HKUST-CS08-08, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, November 2008.
Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley-Interscience, April 1994.
Jan Ramon, Kurt Driessens, and Tom Croonenborghs. Transfer learning in reinforcement learning problems through partial policy recycling. In Proceedings of the 18th European Conference on Machine Learning, pages 699–707, Berlin, Germany, 2007. Springer-Verlag.
Gavin A. Rummery and Mahesan Niranjan. On-line Q-learning using connectionist systems. Technical Report CUED/F-INFENG/TR 166, Cambridge University Engineering Department, 1994.
Natarajan Saravanan and David B. Fogel. Evolving neural control systems. IEEE Expert, pages 23–27, June 1995.
Jürgen Schmidhuber. On learning how to learn learning strategies. Technical report (revised), Fakultät für Informatik, Technische Universität München, 1994.
Kenneth O. Stanley. Compositional pattern producing networks: A novel abstraction of development. Genetic Programming and Evolvable Machines Special Issue on Developmental Systems, 8(2):131–162, 2007.
Kenneth O. Stanley and Risto Miikkulainen. Evolving neural networks through augmenting topologies. Evolutionary Computation, 10:99–127, 2002.
Kenneth O. Stanley and Risto Miikkulainen. Competitive coevolution through evolutionary complexification. Journal of Artificial Intelligence Research, 21:63–100, 2004.
Kenneth O. Stanley, Bobby D. Bryant, and Risto Miikkulainen. Real-time neuroevolution in the NERO video game. IEEE Transactions on Evolutionary Computation Special Issue on Evolutionary Computation and Games, 9(6):653–668, 2005.
Kenneth O. Stanley, David B. D'Ambrosio, and Jason Gauci. A hypercube-based indirect encoding for evolving large-scale neural networks. Artificial Life, 15(2), 2009.
Frieder Stolzenburg, Jan Murray, and Karsten Sturm. Multiagent matching algorithms with and without coach. Journal of Decision Systems, 15(2-3):215–240, 2006. Special issue on Decision Support Systems.
Peter Stone and Richard S. Sutton. Scaling reinforcement learning toward RoboCup soccer. In Proceedings of the Eighteenth International Conference on Machine Learning (ICML 2001), pages 537–544, New York, NY, June 2001. ACM.
Peter Stone, Richard S. Sutton, and Satinder Singh. Reinforcement learning in 3 vs. 2 keepaway. In Peter Stone, T. Balch, and G. Kraetzschmar, editors, RoboCup-2000: Robot Soccer World Cup IV, pages 249–258. Springer Verlag, Berlin, 2001.
Peter Stone, Richard S. Sutton, and Gregory Kuhlmann. Reinforcement learning for RoboCup soccer keepaway. Adaptive Behavior, 13(3):165–188, 2005.
Peter Stone, Gregory Kuhlmann, Matthew E. Taylor, and Yaxin Liu. Keepaway soccer: From machine learning testbed to benchmark. In RoboCup-2005: Robot Soccer World Cup IX, pages 93–105. Springer Verlag, 2006.
Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, 1998.
Richard S. Sutton. Learning to predict by the methods of temporal differences. Machine Learning, 3:9–44, 1988.
Richard S. Sutton. Generalization in reinforcement learning: Successful examples using sparse coarse coding. In Advances in Neural Information Processing Systems 8, pages 1038–1044. MIT Press, 1996.
Prasad Tadepalli. Learning to solve problems from exercises. Computational Intelligence, 24(4):257–291, 2008.
Prasad Tadepalli, Robert Givan, and Kurt Driessens. Relational reinforcement learning: An overview. In International Conference on Machine Learning Workshop on Relational Reinforcement Learning, New York, NY, 2004. ACM Press.
Erik Talvitie and Satinder Singh. An experts algorithm for transfer learning. In Proceedings of the Twentieth International Joint Conference on Artificial Intelligence, pages 1065–1070, 2007.
Matthew E. Taylor and Peter Stone. Cross-domain transfer for reinforcement learning. In Proceedings of the 24th International Conference on Machine Learning, pages 879–886, New York, NY, USA, 2007. ACM.
Matthew E. Taylor, Shimon Whiteson, and Peter Stone. Comparing evolutionary and temporal difference methods in a reinforcement learning domain. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2006), pages 1321–1328, New York, NY, July 2006. ACM Press.
Matthew E. Taylor, Peter Stone, and Yaxin Liu. Transfer learning via inter-task mappings for temporal difference learning. Journal of Machine Learning Research, 8:2125–2167, September 2007a.
Matthew E. Taylor, Shimon Whiteson, and Peter Stone. Transfer via inter-task mappings in policy search reinforcement learning. In Proceedings of the Sixth International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2007), New York, NY, May 2007b. ACM Press.
Gerald Tesauro. Practical issues in temporal difference learning. Machine Learning, 8(3-4):257–277, May 1992.
Sebastian Thrun and Tom M. Mitchell. Learning one more thing. Technical report, Carnegie Mellon University, 1994.
Lisa Torrey, Jude W. Shavlik, Trevor Walker, and Richard Maclin. Rule extraction for transfer learning. In Rule Extraction from Support Vector Machines, pages 67–82. Springer-Verlag, Berlin, Germany, 2008a.
Lisa Torrey, Trevor Walker, Richard Maclin, and Jude Shavlik. Advice taking and transfer learning: Naturally inspired extensions to reinforcement learning. In AAAI Fall Symposium on Naturally Inspired AI, Washington, DC, 2008b. AAAI Press.
Alan Turing. The chemical basis of morphogenesis. Philosophical Transactions of the Royal Society of London, Series B, 237:37–72, August 1952.
James D. Watson, Nancy H. Hopkins, Jeffrey W. Roberts, Joan A. Steitz, and Alan M. Weiner. Molecular Biology of the Gene, fourth edition. The Benjamin Cummings Publishing Company, Inc., Menlo Park, CA, 1987.
Shimon Whiteson. Improving reinforcement learning function approximators via neuroevolution. In AAMAS '05: Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems, page 1386, New York, NY, USA, 2005. ACM. ISBN 1-59593-093-0. doi: 10.1145/1082473.1082794.
Shimon Whiteson and Daniel Whiteson. Stochastic optimization for collision selection in high energy physics. In IAAI 2007: Proceedings of the Nineteenth Annual Innovative Applications of Artificial Intelligence Conference, Vancouver, British Columbia, Canada, July 2007. AAAI Press.
Shimon Whiteson, Nate Kohl, Risto Miikkulainen, and Peter Stone. Evolving soccer keepaway players through task decomposition. Machine Learning, 59(1-2):5–30, 2005.
Xin Yao. Evolving artificial neural networks. Proceedings of the IEEE, 87(9):1423–1447, 1999.