
212 nips-2008-Skill Characterization Based on Betweenness


Source: pdf

Author: Özgür Şimşek, Andre S. Barreto

Abstract: We present a characterization of a useful class of skills based on a graphical representation of an agent’s interaction with its environment. Our characterization uses betweenness, a measure of centrality on graphs. It captures and generalizes (at least intuitively) the bottleneck concept, which has inspired many of the existing skill-discovery algorithms. Our characterization may be used directly to form a set of skills suitable for a given task. More importantly, it serves as a useful guide for developing incremental skill-discovery algorithms that do not rely on knowing or representing the interaction graph in its entirety.
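The quantity the abstract builds on is betweenness centrality [1, 2]: for each node, the (fractional) count of shortest paths between other node pairs that pass through it. The sketch below is an illustrative Python example, not the paper's skill-discovery method; it computes unnormalized betweenness with Brandes' algorithm [10] on an invented two-room gridworld whose node names (including the doorway node 'd') are assumptions made here, solely to show that bottleneck states score highest.

```python
from collections import deque

def betweenness(graph):
    """Brandes' algorithm [10] for unnormalized vertex betweenness on an
    unweighted, undirected graph given as {node: set(neighbors)}."""
    bc = {v: 0.0 for v in graph}
    for s in graph:
        # Single-source BFS, counting shortest-path multiplicities.
        pred = {v: [] for v in graph}      # predecessors on shortest paths
        sigma = {v: 0 for v in graph}      # number of shortest paths s -> v
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:            # w discovered for the first time
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1: # shortest path to w goes via v
                    sigma[w] += sigma[v]
                    pred[w].append(v)
        # Accumulate dependencies in reverse BFS order.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in pred[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair is counted twice on an undirected graph.
    return {v: c / 2 for v, c in bc.items()}

# Toy example (invented here): two 2x2 rooms joined by a single doorway 'd'.
rooms = {(r, x, y) for r in (0, 1) for x in (0, 1) for y in (0, 1)}
graph = {v: set() for v in rooms | {'d'}}
for (r, x, y) in rooms:
    for (nx, ny) in ((x + 1, y), (x, y + 1)):
        if (r, nx, ny) in rooms:
            graph[(r, x, y)].add((r, nx, ny))
            graph[(r, nx, ny)].add((r, x, y))
# The doorway links one cell of room 0 to one cell of room 1.
for cell in ((0, 1, 0), (1, 0, 0)):
    graph['d'].add(cell)
    graph[cell].add('d')

scores = betweenness(graph)
print(max(scores, key=scores.get))  # 'd': the doorway/bottleneck state
```

On this toy graph the doorway lies on every inter-room shortest path and therefore receives the largest score, which is the bottleneck intuition the paper formalizes; the paper itself goes further, using betweenness to characterize skills and to guide incremental skill discovery without requiring the full interaction graph.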


reference text

[1] L. C. Freeman. A set of measures of centrality based upon betweenness. Sociometry, 40:35–41, 1977.

[2] L. C. Freeman. Centrality in social networks: Conceptual clarification. Social Networks, 1:215–239, 1979.

[3] A. McGovern and A. G. Barto. Automatic discovery of subgoals in reinforcement learning using diverse density. In Proceedings of the Eighteenth International Conference on Machine Learning, 2001.

[4] M. Stolle and D. Precup. Learning options in reinforcement learning. Lecture Notes in Computer Science, 2371:212–223, 2002.

[5] M. Stolle. Automated discovery of options in reinforcement learning. Master’s thesis, McGill University, 2004.

[6] I. Menache, S. Mannor, and N. Shimkin. Q-Cut - Dynamic discovery of sub-goals in reinforcement learning. In Proceedings of the Thirteenth European Conference on Machine Learning, 2002.

[7] Ö. Şimşek and A. G. Barto. Using relative novelty to identify useful temporal abstractions in reinforcement learning. In Proceedings of the Twenty-First International Conference on Machine Learning, 2004.

[8] S. Mannor, I. Menache, A. Hoze, and U. Klein. Dynamic abstraction in reinforcement learning via clustering. In Proceedings of the Twenty-First International Conference on Machine Learning, 2004.

[9] Ö. Şimşek, A. P. Wolfe, and A. G. Barto. Identifying useful subgoals in reinforcement learning by local graph partitioning. In Proceedings of the Twenty-Second International Conference on Machine Learning, 2005.

[10] U. Brandes. A faster algorithm for betweenness centrality. Journal of Mathematical Sociology, 25(2):163–177, 2001.

[11] T. G. Dietterich. Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13:227–303, 2000.

[12] A. G. Barto, S. Singh, and N. Chentanez. Intrinsically motivated learning of hierarchical collections of skills. In Proceedings of the Third International Conference on Development and Learning, 2004.

[13] S. Singh, A. G. Barto, and N. Chentanez. Intrinsically motivated reinforcement learning. In Advances in Neural Information Processing Systems, 2005.

[14] R. S. Sutton, D. Precup, and S. P. Singh. Between MDPs and Semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112(1-2):181–211, 1999.

[15] D. Precup. Temporal abstraction in reinforcement learning. PhD thesis, University of Massachusetts Amherst, 2000.

[16] A. McGovern, R. S. Sutton, and A. H. Fagg. Roles of macro-actions in accelerating reinforcement learning. In Grace Hopper Celebration of Women in Computing, 1997.

[17] S. Amarel. On representations of problems of reasoning about actions. In Machine Intelligence 3, pages 131–171. Edinburgh University Press, 1968.