
71 nips-2013-Convergence of Monte Carlo Tree Search in Simultaneous Move Games


Source: pdf

Author: Viliam Lisy, Vojta Kovarik, Marc Lanctot, Branislav Bosansky

Abstract: unknown-abstract

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Convergence of Monte Carlo Tree Search in Simultaneous Move Games. Viliam Lisý (1), Vojtěch Kovařík (1), Marc Lanctot (2), Branislav Bošanský (1). (2) Department of Knowledge Engineering, Maastricht University, The Netherlands marc. [sent-1, score-0.258]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('branislav', 0.363), ('fee', 0.363), ('prague', 0.363), ('czech', 0.335), ('netherlands', 0.275), ('ch', 0.258), ('bo', 0.25), ('marc', 0.232), ('games', 0.205), ('engineering', 0.182), ('agent', 0.181), ('simultaneous', 0.163), ('move', 0.13), ('carlo', 0.129), ('monte', 0.127), ('tree', 0.11), ('technology', 0.106), ('center', 0.096), ('search', 0.067), ('technical', 0.057), ('department', 0.057), ('convergence', 0.047), ('knowledge', 0.046), ('computer', 0.03), ('science', 0.028), ('university', 0.024)]
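The page does not say how the simValue scores in the list that follows are derived from these weights. A common choice for tf-idf vectors is cosine similarity, and the same-paper score of almost exactly 1 is consistent with that, but this is an assumption rather than something the source states. A minimal sketch under that assumption, where `cosine_similarity` and the `other` vector are hypothetical names and values introduced only for illustration:

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {word: weight} dicts."""
    # Only words present in both vectors contribute to the dot product.
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# A few of the tf-idf weights listed above for this paper.
paper = {"games": 0.205, "agent": 0.181, "tree": 0.11, "search": 0.067}
# Hypothetical weights for another paper, invented purely for illustration.
other = {"agent": 0.3, "reward": 0.5, "search": 0.1}

self_score = cosine_similarity(paper, paper)    # a vector compared with itself
cross_score = cosine_similarity(paper, other)   # partial overlap in vocabulary
```

Under this reading, a paper compared with itself scores 1 up to floating-point rounding, and papers that share few weighted words score close to 0.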

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999994 71 nips-2013-Convergence of Monte Carlo Tree Search in Simultaneous Move Games

Author: Viliam Lisy, Vojta Kovarik, Marc Lanctot, Branislav Bosansky

Abstract: unknown-abstract

2 0.087638095 129 nips-2013-Generalized Random Utility Models with Multiple Types

Author: Hossein Azari Soufiani, Hansheng Diao, Zhenyu Lai, David C. Parkes

Abstract: We propose a model for demand estimation in multi-agent, differentiated product settings and present an estimation algorithm that uses reversible jump MCMC techniques to classify agents’ types. Our model extends the popular setup in Berry, Levinsohn and Pakes (1995) to allow for the data-driven classification of agents’ types using agent-level data. We focus on applications involving data on agents’ ranking over alternatives, and present theoretical conditions that establish the identifiability of the model and uni-modality of the likelihood/posterior. Results on both real and simulated data provide support for the scalability of our approach.

3 0.075213172 278 nips-2013-Reward Mapping for Transfer in Long-Lived Agents

Author: Xiaoxiao Guo, Satinder Singh, Richard L. Lewis

Abstract: We consider how to transfer knowledge from previous tasks (MDPs) to a current task in long-lived and bounded agents that must solve a sequence of tasks over a finite lifetime. A novel aspect of our transfer approach is that we reuse reward functions. While this may seem counterintuitive, we build on the insight of recent work on the optimal rewards problem that guiding an agent’s behavior with reward functions other than the task-specifying reward function can help overcome computational bounds of the agent. Specifically, we use good guidance reward functions learned on previous tasks in the sequence to incrementally train a reward mapping function that maps task-specifying reward functions into good initial guidance reward functions for subsequent tasks. We demonstrate that our approach can substantially improve the agent’s performance relative to other approaches, including an approach that transfers policies.

4 0.065208182 47 nips-2013-Bayesian Hierarchical Community Discovery

Author: Charles Blundell, Yee Whye Teh

Abstract: We propose an efficient Bayesian nonparametric model for discovering hierarchical community structure in social networks. Our model is a tree-structured mixture of potentially exponentially many stochastic blockmodels. We describe a family of greedy agglomerative model selection algorithms that take just one pass through the data to learn a fully probabilistic, hierarchical community model. In the worst case, our algorithms scale quadratically in the number of vertices of the network, but independent of the number of nested communities. In practice, our algorithms run two orders of magnitude faster than the Infinite Relational Model, achieving comparable or better accuracy.

5 0.064462468 248 nips-2013-Point Based Value Iteration with Optimal Belief Compression for Dec-POMDPs

Author: Liam C. MacDermed, Charles Isbell

Abstract: We present four major results towards solving decentralized partially observable Markov decision problems (DecPOMDPs) culminating in an algorithm that outperforms all existing algorithms on all but one standard infinite-horizon benchmark problems. (1) We give an integer program that solves collaborative Bayesian games (CBGs). The program is notable because its linear relaxation is very often integral. (2) We show that a DecPOMDP with bounded belief can be converted to a POMDP (albeit with actions exponential in the number of beliefs). These actions correspond to strategies of a CBG. (3) We present a method to transform any DecPOMDP into a DecPOMDP with bounded beliefs (the number of beliefs is a free parameter) using optimal (not lossless) belief compression. (4) We show that the combination of these results opens the door for new classes of DecPOMDP algorithms based on previous POMDP algorithms. We choose one such algorithm, point-based value iteration, and modify it to produce the first tractable value iteration method for DecPOMDPs that outperforms existing algorithms.

6 0.058131088 79 nips-2013-DESPOT: Online POMDP Planning with Regularization

7 0.056208715 93 nips-2013-Discriminative Transfer Learning with Tree-based Priors

8 0.047883235 355 nips-2013-Which Space Partitioning Tree to Use for Search?

9 0.035373989 1 nips-2013-(More) Efficient Reinforcement Learning via Posterior Sampling

10 0.035318505 54 nips-2013-Bayesian optimization explains human active search

11 0.034545865 169 nips-2013-Learning to Prune in Metric and Non-Metric Spaces

12 0.031133676 50 nips-2013-Bayesian Mixture Modelling and Inference based Thompson Sampling in Monte-Carlo Tree Search

13 0.029509319 137 nips-2013-High-Dimensional Gaussian Process Bandits

14 0.025930159 16 nips-2013-A message-passing algorithm for multi-agent trajectory planning

15 0.025842689 228 nips-2013-Online Learning of Dynamic Parameters in Social Networks

16 0.025487967 48 nips-2013-Bayesian Inference and Learning in Gaussian Process State-Space Models with Particle MCMC

17 0.025320204 151 nips-2013-Learning Chordal Markov Networks by Constraint Satisfaction

18 0.02395194 191 nips-2013-Minimax Optimal Algorithms for Unconstrained Linear Optimization

19 0.023468733 17 nips-2013-A multi-agent control framework for co-adaptation in brain-computer interfaces

20 0.023320219 164 nips-2013-Learning and using language via recursive pragmatic reasoning about other agents


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.042), (1, -0.027), (2, -0.018), (3, 0.006), (4, 0.021), (5, -0.0), (6, 0.044), (7, -0.01), (8, -0.019), (9, 0.014), (10, -0.001), (11, 0.01), (12, 0.042), (13, 0.028), (14, -0.044), (15, 0.092), (16, 0.008), (17, 0.102), (18, 0.063), (19, -0.009), (20, -0.041), (21, 0.015), (22, -0.073), (23, 0.031), (24, -0.005), (25, -0.018), (26, -0.024), (27, 0.02), (28, 0.058), (29, 0.03), (30, 0.069), (31, 0.002), (32, 0.027), (33, -0.071), (34, 0.019), (35, 0.032), (36, 0.12), (37, -0.004), (38, -0.003), (39, -0.03), (40, 0.001), (41, 0.049), (42, -0.007), (43, 0.01), (44, -0.002), (45, 0.014), (46, -0.023), (47, 0.003), (48, 0.066), (49, 0.038)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.98190963 71 nips-2013-Convergence of Monte Carlo Tree Search in Simultaneous Move Games

Author: Viliam Lisy, Vojta Kovarik, Marc Lanctot, Branislav Bosansky

Abstract: unknown-abstract

2 0.59743965 129 nips-2013-Generalized Random Utility Models with Multiple Types

Author: Hossein Azari Soufiani, Hansheng Diao, Zhenyu Lai, David C. Parkes

Abstract: We propose a model for demand estimation in multi-agent, differentiated product settings and present an estimation algorithm that uses reversible jump MCMC techniques to classify agents’ types. Our model extends the popular setup in Berry, Levinsohn and Pakes (1995) to allow for the data-driven classification of agents’ types using agent-level data. We focus on applications involving data on agents’ ranking over alternatives, and present theoretical conditions that establish the identifiability of the model and uni-modality of the likelihood/posterior. Results on both real and simulated data provide support for the scalability of our approach.

3 0.56141758 248 nips-2013-Point Based Value Iteration with Optimal Belief Compression for Dec-POMDPs

Author: Liam C. MacDermed, Charles Isbell

Abstract: We present four major results towards solving decentralized partially observable Markov decision problems (DecPOMDPs) culminating in an algorithm that outperforms all existing algorithms on all but one standard infinite-horizon benchmark problems. (1) We give an integer program that solves collaborative Bayesian games (CBGs). The program is notable because its linear relaxation is very often integral. (2) We show that a DecPOMDP with bounded belief can be converted to a POMDP (albeit with actions exponential in the number of beliefs). These actions correspond to strategies of a CBG. (3) We present a method to transform any DecPOMDP into a DecPOMDP with bounded beliefs (the number of beliefs is a free parameter) using optimal (not lossless) belief compression. (4) We show that the combination of these results opens the door for new classes of DecPOMDP algorithms based on previous POMDP algorithms. We choose one such algorithm, point-based value iteration, and modify it to produce the first tractable value iteration method for DecPOMDPs that outperforms existing algorithms.

4 0.5164181 16 nips-2013-A message-passing algorithm for multi-agent trajectory planning

Author: Jose Bento, Nate Derbinsky, Javier Alonso-Mora, Jonathan S. Yedidia

Abstract: We describe a novel approach for computing collision-free global trajectories for p agents with specified initial and final configurations, based on an improved version of the alternating direction method of multipliers (ADMM). Compared with existing methods, our approach is naturally parallelizable and allows for incorporating different cost functionals with only minor adjustments. We apply our method to classical challenging instances and observe that its computational requirements scale well with p for several cost functionals. We also show that a specialization of our algorithm can be used for local motion planning by solving the problem of joint optimization in velocity space.

5 0.51456928 278 nips-2013-Reward Mapping for Transfer in Long-Lived Agents

Author: Xiaoxiao Guo, Satinder Singh, Richard L. Lewis

Abstract: We consider how to transfer knowledge from previous tasks (MDPs) to a current task in long-lived and bounded agents that must solve a sequence of tasks over a finite lifetime. A novel aspect of our transfer approach is that we reuse reward functions. While this may seem counterintuitive, we build on the insight of recent work on the optimal rewards problem that guiding an agent’s behavior with reward functions other than the task-specifying reward function can help overcome computational bounds of the agent. Specifically, we use good guidance reward functions learned on previous tasks in the sequence to incrementally train a reward mapping function that maps task-specifying reward functions into good initial guidance reward functions for subsequent tasks. We demonstrate that our approach can substantially improve the agent’s performance relative to other approaches, including an approach that transfers policies.

6 0.48745364 355 nips-2013-Which Space Partitioning Tree to Use for Search?

7 0.43017274 79 nips-2013-DESPOT: Online POMDP Planning with Regularization

8 0.35972294 47 nips-2013-Bayesian Hierarchical Community Discovery

9 0.34824711 228 nips-2013-Online Learning of Dynamic Parameters in Social Networks

10 0.34318966 128 nips-2013-Generalized Method-of-Moments for Rank Aggregation

11 0.3350684 169 nips-2013-Learning to Prune in Metric and Non-Metric Spaces

12 0.33384576 93 nips-2013-Discriminative Transfer Learning with Tree-based Priors

13 0.32144767 164 nips-2013-Learning and using language via recursive pragmatic reasoning about other agents

14 0.31919658 50 nips-2013-Bayesian Mixture Modelling and Inference based Thompson Sampling in Monte-Carlo Tree Search

15 0.29840142 250 nips-2013-Policy Shaping: Integrating Human Feedback with Reinforcement Learning

16 0.28181088 58 nips-2013-Binary to Bushy: Bayesian Hierarchical Clustering with the Beta Coalescent

17 0.27034876 340 nips-2013-Understanding variable importances in forests of randomized trees

18 0.24857195 32 nips-2013-Aggregating Optimistic Planning Trees for Solving Markov Decision Processes

19 0.24762514 82 nips-2013-Decision Jungles: Compact and Rich Models for Classification

20 0.2427921 57 nips-2013-Beyond Pairwise: Provably Fast Algorithms for Approximate $k$-Way Similarity Search


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(22, 0.657), (33, 0.04), (34, 0.09), (56, 0.029)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.81940788 71 nips-2013-Convergence of Monte Carlo Tree Search in Simultaneous Move Games

Author: Viliam Lisy, Vojta Kovarik, Marc Lanctot, Branislav Bosansky

Abstract: unknown-abstract

2 0.42037743 343 nips-2013-Unsupervised Structure Learning of Stochastic And-Or Grammars

Author: Kewei Tu, Maria Pavlovskaia, Song-Chun Zhu

Abstract: Stochastic And-Or grammars compactly represent both compositionality and reconfigurability and have been used to model different types of data such as images and events. We present a unified formalization of stochastic And-Or grammars that is agnostic to the type of the data being modeled, and propose an unsupervised approach to learning the structures as well as the parameters of such grammars. Starting from a trivial initial grammar, our approach iteratively induces compositions and reconfigurations in a unified manner and optimizes the posterior probability of the grammar. In our empirical evaluation, we applied our approach to learning event grammars and image grammars and achieved comparable or better performance than previous approaches.

3 0.22649106 236 nips-2013-Optimal Neural Population Codes for High-dimensional Stimulus Variables

Author: Zhuo Wang, Alan Stocker, Daniel Lee

Abstract: In many neural systems, information about stimulus variables is often represented in a distributed manner by means of a population code. It is generally assumed that the responses of the neural population are tuned to the stimulus statistics, and most prior work has investigated the optimal tuning characteristics of one or a small number of stimulus variables. In this work, we investigate the optimal tuning for diffeomorphic representations of high-dimensional stimuli. We analytically derive the solution that minimizes the L2 reconstruction loss. We compared our solution with other well-known criteria such as maximal mutual information. Our solution suggests that the optimal weights do not necessarily decorrelate the inputs, and the optimal nonlinearity differs from the conventional equalization solution. Results illustrating these optimal representations are shown for some input distributions that may be relevant for understanding the coding of perceptual pathways.

4 0.15322584 129 nips-2013-Generalized Random Utility Models with Multiple Types

Author: Hossein Azari Soufiani, Hansheng Diao, Zhenyu Lai, David C. Parkes

Abstract: We propose a model for demand estimation in multi-agent, differentiated product settings and present an estimation algorithm that uses reversible jump MCMC techniques to classify agents’ types. Our model extends the popular setup in Berry, Levinsohn and Pakes (1995) to allow for the data-driven classification of agents’ types using agent-level data. We focus on applications involving data on agents’ ranking over alternatives, and present theoretical conditions that establish the identifiability of the model and uni-modality of the likelihood/posterior. Results on both real and simulated data provide support for the scalability of our approach.

5 0.15273488 256 nips-2013-Probabilistic Principal Geodesic Analysis

Author: Miaomiao Zhang, P.T. Fletcher

Abstract: Principal geodesic analysis (PGA) is a generalization of principal component analysis (PCA) for dimensionality reduction of data on a Riemannian manifold. Currently PGA is defined as a geometric fit to the data, rather than as a probabilistic model. Inspired by probabilistic PCA, we present a latent variable model for PGA that provides a probabilistic framework for factor analysis on manifolds. To compute maximum likelihood estimates of the parameters in our model, we develop a Monte Carlo Expectation Maximization algorithm, where the expectation is approximated by Hamiltonian Monte Carlo sampling of the latent variables. We demonstrate the ability of our method to recover the ground truth parameters in simulated sphere data, as well as its effectiveness in analyzing shape variability of a corpus callosum data set from human brain images.

6 0.15256348 202 nips-2013-Multiclass Total Variation Clustering

7 0.15235049 38 nips-2013-Approximate Dynamic Programming Finally Performs Well in the Game of Tetris

8 0.15225424 48 nips-2013-Bayesian Inference and Learning in Gaussian Process State-Space Models with Particle MCMC

9 0.15214777 351 nips-2013-What Are the Invariant Occlusive Components of Image Patches? A Probabilistic Generative Approach

10 0.1517543 219 nips-2013-On model selection consistency of penalized M-estimators: a geometric theory

11 0.15149818 210 nips-2013-Noise-Enhanced Associative Memories

12 0.15075816 143 nips-2013-Integrated Non-Factorized Variational Inference

13 0.14962883 122 nips-2013-First-order Decomposition Trees

14 0.14030357 346 nips-2013-Variational Inference for Mahalanobis Distance Metrics in Gaussian Process Regression

15 0.13774997 39 nips-2013-Approximate Gaussian process inference for the drift function in stochastic differential equations

16 0.13540095 347 nips-2013-Variational Planning for Graph-based MDPs

17 0.13345152 348 nips-2013-Variational Policy Search via Trajectory Optimization

18 0.13320573 115 nips-2013-Factorized Asymptotic Bayesian Inference for Latent Feature Models

19 0.13317735 312 nips-2013-Stochastic Gradient Riemannian Langevin Dynamics on the Probability Simplex

20 0.1322666 144 nips-2013-Inverse Density as an Inverse Problem: the Fredholm Equation Approach