nips nips2002 nips2002-152 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Luis E. Ortiz, Michael Kearns
Abstract: We introduce NashProp, an iterative and local message-passing algorithm for computing Nash equilibria in multi-player games represented by arbitrary undirected graphs. We provide a formal analysis and experimental evidence demonstrating that NashProp performs well on large graphical games with many loops, often converging in just a dozen iterations on graphs with hundreds of nodes. NashProp generalizes the tree algorithm of (Kearns et al. 2001), and can be viewed as similar in spirit to belief propagation in probabilistic inference, and thus complements the recent work of (Vickrey and Koller 2002), who explored a junction tree approach. Thus, as for probabilistic inference, we have at least two promising general-purpose approaches to equilibria computation in graphs.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We introduce NashProp, an iterative and local message-passing algorithm for computing Nash equilibria in multi-player games represented by arbitrary undirected graphs. [sent-4, score-0.27]
2 We provide a formal analysis and experimental evidence demonstrating that NashProp performs well on large graphical games with many loops, often converging in just a dozen iterations on graphs with hundreds of nodes. [sent-5, score-0.305]
3 NashProp generalizes the tree algorithm of (Kearns et al. 2001), and can be viewed as similar in spirit to belief propagation in probabilistic inference, and thus complements the recent work of (Vickrey and Koller 2002), who explored a junction tree approach. [sent-7, score-0.267]
4 1 Introduction There has been considerable recent interest in representational and algorithmic issues arising in multi-player game theory. [sent-9, score-0.191]
5 One example is the recent work on graphical games (Kearns et al. 2001), hereafter abbreviated KLS. [sent-10, score-0.197]
6 Here a multi-player game is represented by an undirected graph. [sent-12, score-0.229]
7 The interpretation is that while the global equilibria of the game depend on the actions of all players, individual payoffs for a player are determined solely by his own action and the actions of his immediate neighbors in the graph. [sent-13, score-0.872]
8 Like graphical models in probabilistic inference, graphical games may provide an exponentially more succinct representation than the standard “tabular” or normal form of the game. [sent-14, score-0.311]
9 Also as for probabilistic inference, the problem of computing equilibria on arbitrary graphs is intractable in general, and so it is of interest to identify both natural special topologies permitting fast Nash computations, and good heuristics for general graphs. [sent-15, score-0.306]
10 KLS gave a dynamic programming algorithm for computing Nash equilibria in graphical games in which the underlying graph is a tree, and drew analogies to the polytree algorithm for probabilistic inference (Pearl 1988). [sent-16, score-0.452]
11 Roughly speaking, the first approach to inference on graphs with loops is to take an arbitrary graph and “turn it into a tree” via triangulation, and subsequently run the tree-based algorithm on the resulting junction tree (Lauritzen and Spiegelhalter 1988). [sent-19, score-0.25]
12 This approach has the merit of being guaranteed to perform inference correctly, but the drawback of requiring the computation to be done on the junction tree. [sent-20, score-0.182]
13 On highly loopy graphs, junction tree computations may require exponential time. [sent-21, score-0.29]
14 Belief propagation has the merit of each iteration being quite efficient, but the drawback of having no guarantee of convergence in general (though recent theoretical work has established convergence for certain special cases (Weiss 2000)). [sent-24, score-0.177]
15 In recent work, (Vickrey and Koller 2002) proposed a number of heuristics for equilibria computation in graphical games, including a constraint satisfaction generalization of KLS that essentially provides a junction tree approach for arbitrary graphical games. [sent-25, score-0.567]
16 They also gave promising experimental results for this heuristic on certain loopy graphs that result in manageable junction trees. [sent-26, score-0.295]
17 In this work, we introduce the NashProp algorithm, a different KLS generalization which provides an approach analogous to loopy belief propagation for graphical games. [sent-27, score-0.295]
18 Like belief propagation, NashProp is a local message-passing algorithm that operates directly on the original graph of the game, requiring no triangulation or moralization operations. [sent-28, score-0.201]
19 In the first phase, each player repeatedly sends a table to each of his neighbors: the table that player $u$ sends to neighboring player $v$ in the graph indicates the values $u$ "believes" he can play given a setting of $v$ and the information he has received in tables from his other neighbors, a kind of conditional Nash equilibrium. [sent-31, score-0.863]
20 In the second phase of NashProp, the players attempt to incrementally construct an equilibrium obeying constraints imposed by the tables computed in the first phase. [sent-32, score-0.714]
21 Interestingly, we can provide rather strong theory for the first phase, proving that the tables must always converge, and result in a reduced search space that can never eliminate an equilibrium. [sent-33, score-0.273]
22 When run using a discretization scheme introduced by KLS, the first phase of NashProp will actually converge in time polynomial in the size of the game representation. [sent-34, score-0.5]
23 We also report on a number of controlled experiments with NashProp on loopy graphs, including some that would be difficult via the junction tree approach due to the graph topology. [sent-35, score-0.352]
24 2 Preliminaries The normal or tabular form of an $n$-player, two-action game is defined by a set of $n$ matrices $M_i$ ($1 \le i \le n$), each with $n$ indices. [sent-37, score-0.252]
25 The entry $M_i(\vec{x})$ specifies the payoff to player $i$ when the joint action of the $n$ players is $\vec{x} \in \{0,1\}^n$. [sent-38, score-0.554]
26 The actions 0 and 1 are the pure strategies of each player, while a mixed strategy for player $i$ is given by the probability $p_i \in [0,1]$ that the player will play 0. [sent-40, score-0.586]
27 For any joint mixed strategy, given by a product distribution $\vec{p}$, we define the expected payoff to player $i$ as $M_i(\vec{p}) = E_{\vec{x} \sim \vec{p}}[M_i(\vec{x})]$, where $\vec{x} \sim \vec{p}$ indicates that each $x_j$ is 0 with probability $p_j$ and 1 with probability $1 - p_j$. [sent-41, score-0.815]
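To make this definition concrete, here is a minimal sketch (ours, not the authors' code) that computes a player's expected payoff under a product distribution by enumerating the $2^n$ joint pure actions; the dictionary-based payoff table and all names are illustrative assumptions.

```python
from itertools import product

def expected_payoff(payoff_i, p):
    """Expected payoff M_i(p) to player i under the product distribution p.
    payoff_i: dict mapping each joint action tuple in {0,1}^n to i's payoff.
    p: list of mixing probabilities, p[j] = Pr[player j plays 0]."""
    total = 0.0
    for x in product((0, 1), repeat=len(p)):
        prob = 1.0
        for j, xj in enumerate(x):
            prob *= p[j] if xj == 0 else (1.0 - p[j])
        total += prob * payoff_i[x]
    return total

# Tiny 2-player example: player 0 is paid 1 exactly when both players match.
payoff_0 = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0}
print(expected_payoff(payoff_0, [0.5, 0.5]))  # 0.5
```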
28 A (Nash) equilibrium for the game is a mixed strategy $\vec{p}$ such that for any player $i$, and for any value $p_i' \in [0,1]$, $M_i(\vec{p}[i:p_i']) \le M_i(\vec{p})$, where $\vec{p}[i:p_i']$ denotes $\vec{p}$ with its $i$th component replaced by $p_i'$. [sent-44, score-0.765]
29 In other words, no player can improve their expected payoff by deviating unilaterally from a Nash equilibrium. [sent-46, score-0.461]
30 The classic theorem of (Nash 1951) states that for any game, there exists a Nash equilibrium in the space of joint mixed strategies. [sent-47, score-0.323]
31 An $\epsilon$-Nash equilibrium is a mixed strategy $\vec{p}$ such that for any player $i$, and for any value $p_i' \in [0,1]$, $M_i(\vec{p}[i:p_i']) \le M_i(\vec{p}) + \epsilon$. [sent-49, score-0.574]
32 Thus, no player can improve their expected payoff by more than $\epsilon$ by deviating unilaterally from an $\epsilon$-Nash equilibrium. (Footnote 1: Unlike for inference, moralization may be required for games even on undirected graphs.) [sent-51, score-0.555]
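These equilibrium conditions can be checked directly for small games. The sketch below (an illustration under our own data layout, not the paper's code) tests whether a product distribution is an $\epsilon$-Nash equilibrium; since each player's expected payoff is linear in his own mixing probability, comparing against the two pure deviations suffices.

```python
from itertools import product

def payoff_if_plays(payoff_i, p, i, action):
    """Expected payoff to player i when i plays `action` and the others mix as in p."""
    n = len(p)
    others = [j for j in range(n) if j != i]
    total = 0.0
    for x_others in product((0, 1), repeat=n - 1):
        x = [None] * n
        x[i] = action
        prob = 1.0
        for j, xj in zip(others, x_others):
            x[j] = xj
            prob *= p[j] if xj == 0 else (1.0 - p[j])
        total += prob * payoff_i[tuple(x)]
    return total

def is_eps_nash(payoffs, p, eps):
    """payoffs[i]: joint-action payoff dict for player i; p: mixing probabilities."""
    for i in range(len(p)):
        v0 = payoff_if_plays(payoffs[i], p, i, 0)
        v1 = payoff_if_plays(payoffs[i], p, i, 1)
        current = p[i] * v0 + (1.0 - p[i]) * v1
        if max(v0, v1) > current + eps:  # the best unilateral deviation is pure
            return False
    return True
```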
33 An $n$-player graphical game is a pair $(G, \mathcal{M})$, where $G$ is an undirected graph on $n$ vertices and $\mathcal{M}$ is a set of $n$ matrices $M_u$ (one per vertex $u$) called the local game matrices. [sent-55, score-0.729]
34 Each player is represented by a vertex in $G$, and the interpretation is that each player's payoff is determined solely by the actions in their local neighborhood in $G$. [sent-56, score-0.575]
35 Thus the matrix $M_u$ has an index for each of the neighbors of $u$ and an index for $u$ itself; for a joint action $\vec{x}$ of $u$ and his neighbors, $M_u(\vec{x})$ denotes the payoff to $u$ when he and his neighbors play $\vec{x}$. [sent-57, score-0.559]
36 The expected payoff under a mixed strategy is defined analogously. [sent-58, score-0.368]
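As an illustration of this representation (our own sketch; the class name, index convention, and helpers are assumptions, not the authors' implementation), a graphical game can be stored as neighbor lists plus one local payoff table per player, with the local expected payoff and a best-response test computed over the neighborhood only.

```python
from itertools import product

class GraphicalGame:
    """neighbors[v]: list of v's graph neighbors; local_payoffs[v]: dict mapping
    (v's action,) + tuple of neighbor actions (in neighbors[v] order) to v's payoff."""

    def __init__(self, neighbors, local_payoffs):
        self.neighbors = neighbors
        self.local_payoffs = local_payoffs

    def local_expected_payoff(self, v, own_p, neighbor_ps):
        """Expected payoff to v when v plays 0 w.p. own_p and each neighbor u
        plays 0 w.p. neighbor_ps[u]."""
        nbrs = self.neighbors[v]
        total = 0.0
        for acts in product((0, 1), repeat=len(nbrs) + 1):
            prob = own_p if acts[0] == 0 else 1.0 - own_p
            for u, a in zip(nbrs, acts[1:]):
                q = neighbor_ps[u]
                prob *= q if a == 0 else 1.0 - q
            total += prob * self.local_payoffs[v][acts]
        return total

    def is_best_response(self, v, own_p, neighbor_ps, eps=0.0):
        """True if own_p is an (eps-)best response to the given neighborhood mix."""
        current = self.local_expected_payoff(v, own_p, neighbor_ps)
        best_pure = max(self.local_expected_payoff(v, q, neighbor_ps) for q in (0.0, 1.0))
        return best_pure <= current + eps
```

The later sketches in this section reuse this hypothetical GraphicalGame interface.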
37 3 NashProp: Table-Passing Phase The table-passing phase of NashProp proceeds in a series of rounds. [sent-63, score-0.233]
38 In each round, every node will send a different binary-valued table to each of its neighbors in the graph. [sent-64, score-0.274]
39 Thus, if vertices $u$ and $v$ are neighbors, the table sent from $u$ to $v$ in round $r$ shall be denoted $T^r_{u \to v}$. [sent-65, score-0.252]
40 This table is indexed by the continuum of possible mixed strategies $x, y \in [0,1]$ for players $u$ and $v$, respectively. [sent-67, score-0.19]
41 Intuitively, the binary value $T^r_{u \to v}(x, y)$ indicates player $u$'s (possibly incorrect) "belief" that there exists a (conditional) Nash equilibrium in which $u$ plays $x$ and $v$ plays $y$. [sent-68, score-0.486]
42 As these tables are indexed by continuous values, it is not clear how they can be finitely represented. [sent-72, score-0.252]
43 However, as in KLS, we shall shortly introduce a finite discretization of these tables whose resolution is dependent only on local neighborhood size, yet is sufficient to compute global (approximate) equilibria. [sent-73, score-0.469]
44 For the sake of generality we shall work with the exact tables in the ensuing formal analysis, which will immediately apply to the approximation algorithm as well. [sent-74, score-0.311]
45 For every edge $(u, v)$, the table-passing phase initialization is $T^0_{u \to v}(x, y) = 1$ for all $(x, y)$. [sent-77, score-0.233]
46 Let us denote the neighbors of $u$ other than $v$ (if any) by $u_1, \dots, u_k$. [sent-78, score-0.193]
47 For each round $r \ge 1$, the table entry $T^r_{u \to v}(x, y)$ is assigned the value 1 if and only if there exists a vector of mixed strategies $x_1, \dots, x_k$ (called a witness) for the neighbors $u_1, \dots, u_k$ of $u$ other than $v$ such that (1) $T^{r-1}_{u_i \to u}(x_i, x) = 1$ for all $1 \le i \le k$; and (2) $x$ is a best response of $u$ to the setting in which $v$ plays $y$ and $u_i$ plays $x_i$ for all $i$. [sent-79, score-0.345]
48 If $u$ has no neighbors other than $v$, we define Condition 1 above to hold vacuously. [sent-86, score-0.193]
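A minimal sketch of one synchronous round of this update on a finite grid of mixed strategies, assuming the hypothetical GraphicalGame interface above; the dictionary layout of the tables is our own choice, not the authors' data structure.

```python
from itertools import product

def one_round(game, T, grid):
    """T[(u, v)]: dict {(x, y): 0 or 1}, the binary table u sends to v.
    Returns the tables after one synchronous round of the update."""
    new_T = {}
    for (u, v), table in T.items():
        others = [w for w in game.neighbors[u] if w != v]
        new_table = {}
        for x, y in table:
            ok = 0
            # Search for a witness: a value x_i for each other neighbor u_i such that
            # (1) u_i's current table to u allows (x_i, x), and
            # (2) x is a best response for u when v plays y and each u_i plays x_i.
            # With no other neighbors the empty witness is tried, so Condition 1
            # holds vacuously, matching the text.
            for witness in product(grid, repeat=len(others)):
                if all(T[(w, u)][(xw, x)] for w, xw in zip(others, witness)):
                    nbr_mix = dict(zip(others, witness))
                    nbr_mix[v] = y
                    if game.is_best_response(u, x, nbr_mix):
                        ok = 1
                        break
            new_table[(x, y)] = ok
        new_T[(u, v)] = new_table
    return new_T
```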
49 For the induction, assume for contradiction that for some $r$, there exists a pair of neighboring players $u, v$ and a strategy pair $(x, y)$ such that $T^r_{u \to v}(x, y) = 0$ yet $T^{r+1}_{u \to v}(x, y) = 1$. [sent-94, score-0.201]
50 Since $T^{r+1}_{u \to v}(x, y) = 1$, the definition of the table-passing phase implies that there exists a witness for the neighbors of $u$ other than $v$ satisfying Conditions 1 and 2 at round $r + 1$. [sent-95, score-0.562]
51 Since all tables begin filled with 1 entries, and Lemma 1 states entries can only change from 1 to 0, the table-passing phase must converge: the limit of every table exists. [sent-106, score-0.496]
52 Theorem 2 For all neighboring pairs $(u, v)$ and all $(x, y)$, the limit $T^\infty_{u \to v}(x, y) = \lim_{r \to \infty} T^r_{u \to v}(x, y)$ exists. It is also immediately obvious that the limit tables must all simultaneously balance each other, in the sense of obeying Conditions 1 and 2. [sent-110, score-0.361]
53 If this were not true the tables would be altered by a single round of the table-passing phase. [sent-112, score-0.28]
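In the discretized case this convergence argument can be mirrored directly in code: start with all-1 tables and iterate until a round changes nothing. This sketch reuses the hypothetical one_round function above.

```python
def table_passing(game, grid):
    """Run the table-passing phase until the tables are balanced
    (a further round leaves every entry unchanged)."""
    T = {(u, v): {(x, y): 1 for x in grid for y in grid}
         for u, nbrs in game.neighbors.items() for v in nbrs}
    rounds = 0
    while True:
        new_T = one_round(game, T, grid)  # entries can only flip from 1 to 0
        rounds += 1
        if new_T == T:
            return T, rounds
        T = new_T
```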
54 We next establish that the table-passing phase will never eliminate any global Nash equilibria. [sent-115, score-0.289]
55 Let $\vec{p}$ be any mixed strategy for the entire population of players, and let us use $p_u$ to denote the mixed strategy assigned to player $u$ by $\vec{p}$. [sent-116, score-0.717]
56 We can now establish a strong sense in which the set of balanced limit tables characterizes the Nash equilibria of the global game. [sent-127, score-0.47]
57 We say that $\vec{p}$ is consistent with the limit tables if for every vertex $u$ with neighbors $u_1, \dots, u_k$ we have $T^\infty_{u \to u_i}(p_u, p_{u_i}) = 1$ for each $i$, and the values $\vec{p}$ assigns to the remaining neighbors are a witness to this value. [sent-128, score-0.373]
58 Theorem 4 Let $\{T^\infty\}$ be the balanced limit tables, and let $\vec{p}$ be any global mixed strategy; then $\vec{p}$ is a Nash equilibrium if and only if it is consistent with the balanced limit tables. [sent-131, score-0.196]
59 For the other direction, if $\vec{p}$ is a Nash equilibrium, then for all $u$, $p_u$ is certainly a best response to the strategy of its neighbors. [sent-140, score-0.598]
60 So for consistency with the limit tables, it remains to show that $T^r_{u \to u_i}(p_u, p_{u_i}) = 1$ for every player $u$ and each of its neighbors $u_i$, and for all rounds $r$. [sent-141, score-0.475]
61 Theorem 4 is important because it establishes that the table-passing phase provides us with an alternative — and hopefully vastly reduced — search space for Nash equilibria. [sent-146, score-0.233]
62 Rather than search for equilibria in the space of all mixed strategies, Theorem 4 asserts that we can limit our search to the space of mixed strategies that are consistent with the balanced limit tables, with no fear of missing equilibria. [sent-147, score-0.729]
63 The demand for consistency with the limit tables is a locally stronger demand than merely asking for a player to be playing a best response to its neighborhood. [sent-148, score-0.592]
64 The most obvious case is that in which many of the tables contain a large fraction of 0 entries, since every such entry eliminates all mixed strategies in which the corresponding pair of vertices plays the corresponding pair of values. [sent-153, score-0.568]
65 We shall also see that even when such reduction does not occur, the underlying graphical structure of the game may still yield significant computational benefits in the search for a consistent mixed strategy. [sent-155, score-0.592]
66 Thus far we have assumed that the binary-valued tables have continuous indices $x$ and $y$, and thus it is not clear how they can be finitely represented. [sent-157, score-0.228]
67 The total number of entries in each discretized table is exponential in the maximum neighborhood size, but the payoff matrices for the players are already exponential in this size, so our tables remain polynomial in the size of the graphical game representation. [sent-161, score-0.908]
68 It is easily verified that none of the key results establishing this fact (specifically, Lemmas 2, 3 and 4 of KLS) depend on the underlying graph being a tree, but hold for all graphical games. [sent-163, score-0.176]
69 Precise analogues of all the results of the preceding section can thus be established for the discretized instantiation of the table-passing phase (details omitted). [sent-164, score-0.303]
70 In particular, the table-passing phase will now converge to finite balanced limit tables, and consistency with these tables characterizes $\epsilon$-Nash equilibria. [sent-165, score-0.647]
71 Furthermore, since every round prior to convergence must change at least one entry in one table, the table-passing phase must thus converge in at most as many rounds as the total number of table entries, which is again polynomial in the size of the game representation. [sent-166, score-0.571]
72 Each round of the table-passing phase takes at most a number of computational steps polynomial in the size of the game representation in the worst case (though possibly considerably less), giving a total running time for the table-passing phase that scales polynomially with the size of the game. [sent-167, score-0.551]
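For concreteness, the discretized strategy grid can be built as below; the exact choice of the resolution $\tau$ as a function of $\epsilon$ and the neighborhood size follows KLS and is not reproduced here, and the function name and constant are purely illustrative.

```python
def strategy_grid(tau):
    """The finite grid {0, tau, 2*tau, ..., 1} used to index the tables."""
    steps = int(round(1.0 / tau))
    return [i * tau for i in range(steps + 1)]

grid = strategy_grid(0.25)            # [0.0, 0.25, 0.5, 0.75, 1.0]
entries_per_table = len(grid) ** 2    # each table is indexed by a pair (x, y)
```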
73 We have already suggested that the tables represent a solution space that may be considerably smaller than the set of all mixed strategies. [sent-170, score-0.425]
74 Thus, if $u_1, \dots, u_k$ are all the neighbors of $v$, we define $P_v(x)$ to be 1 if and only if there exists $x_1, \dots, x_k$ (again called a witness to $P_v(x) = 1$) such that $T^\infty_{u_i \to v}(x_i, x) = 1$ for all $i$, and $x$ is a best response to $x_1, \dots, x_k$; otherwise we define $P_v(x)$ to be 0. [sent-174, score-0.374]
75 If $\vec{p}$ is any global mixed strategy, it is easily verified that $\vec{p}$ is consistent with the limit tables [sent-177, score-0.448]
76 if and only if $P_v(p_v) = 1$ for all nodes $v$, with the assignment to the neighbors of $v$ in $\vec{p}$ as a witness. (Footnote 3: We note that the KLS proof that the exact tables must admit a rectilinear representation holds generally, but we cannot bound their complexity here.) [sent-179, score-0.265]
77 The first step of the assignment-passing phase of NashProp is thus the computation of the projection $P_v$ at each vertex $v$, which is again a local computation in the graph. [sent-180, score-0.32]
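A sketch of that first step under the same hypothetical interface: for each vertex, mark the grid values that are supported by some witness allowed by the inbound limit tables, remembering one witness per supported value.

```python
from itertools import product

def projections(game, T, grid):
    """P[v][x] = 1 iff some setting of v's neighbors is allowed by their limit
    tables to v and makes x a best response for v; one witness is kept per value."""
    P, witnesses = {}, {}
    for v, nbrs in game.neighbors.items():
        P[v], witnesses[v] = {}, {}
        for x in grid:
            P[v][x] = 0
            for w in product(grid, repeat=len(nbrs)):
                nbr_mix = dict(zip(nbrs, w))
                if all(T[(u, v)][(xu, x)] for u, xu in zip(nbrs, w)) \
                        and game.is_best_response(v, x, nbr_mix):
                    P[v][x] = 1
                    witnesses[v][x] = nbr_mix
                    break
    return P, witnesses
```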
78 To begin, an initial node $v$ assigns itself a value $p_v$ such that $P_v(p_v) = 1$, and assigns each of its neighbors the values in a witness to $P_v(p_v) = 1$. [sent-186, score-0.193]
79 If there is a partial assignment to the neighbors of $v$, attempt to extend it to a witness to $P_v(p_v) = 1$, and assign any previously unassigned neighbors their values in this witness. [sent-189, score-0.527]
80 If all the neighbors of $v$ have been assigned, make sure the value assigned to $v$ is a best response to their assignment. [sent-190, score-0.193]
81 It is easily verified that if this process succeeds in assigning all vertices, the resulting mixed strategy is consistent with the limit tables and thus a Nash equilibrium (or approximate equilibrium in the discretized case). [sent-192, score-0.484]
82 The difficulty, of course, is that the inductive step of the assignment-passing phase may fail due to cycles in the graph — we may reach a node whose neighbor partial assignment cannot be extended, or whose assigned value is not a best response to its complete neighborhood assignment. [sent-193, score-0.602]
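The sketch below implements a deliberately simplified variant of this second phase: a plain backtracking search over the reduced space, assigning vertices in a fixed order, pruning with the projections, and checking the best-response condition as soon as a vertex's whole neighborhood is assigned. It is not the authors' local assignment-passing formulation, only an illustration of the kind of search the reduced space supports.

```python
def find_equilibrium(game, P, grid, eps=0.0):
    """Backtracking search for an assignment consistent with the projections P."""
    order = list(game.neighbors)
    assignment = {}

    def completed_ok(v):
        # Re-check v and any neighbor whose neighborhood just became fully assigned.
        for u in [v] + game.neighbors[v]:
            if u in assignment and all(w in assignment for w in game.neighbors[u]):
                nbr_mix = {w: assignment[w] for w in game.neighbors[u]}
                if not game.is_best_response(u, assignment[u], nbr_mix, eps):
                    return False
        return True

    def extend(i):
        if i == len(order):
            return True
        v = order[i]
        for x in grid:
            if P[v].get(x, 0):
                assignment[v] = x
                if completed_ok(v) and extend(i + 1):
                    return True
                del assignment[v]  # backtrack
        return False

    return dict(assignment) if extend(0) else None
```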
83 The overall NashProp algorithm thus consists of the (always converging) table-passing phase followed by the backtracking local assignment-passing phase. [sent-196, score-0.3]
84 6 Experimental Results We have implemented the NashProp algorithm (with distinct table-passing and assignment-passing phases) as described, and run a series of controlled experiments on loopy graphs of varying size and topology. [sent-200, score-0.176]
85 Our implementation thus takes both $\epsilon$ and the table resolution $\tau$ as inputs, and attempts to find an $\epsilon$-Nash equilibrium by running NashProp on tables of resolution $\tau$. [sent-202, score-0.358]
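Tying the sketches above together, a hypothetical end-to-end run (all names are our own assumptions, not the authors' implementation) would look like the following.

```python
grid = strategy_grid(0.1)                      # resolution tau = 0.1
T, rounds = table_passing(game, grid)          # phase 1: converging tables
P, _ = projections(game, T, grid)              # per-vertex projection sets
p = find_equilibrium(game, P, grid, eps=0.05)  # phase 2: search the reduced space
```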
86 We first draw attention to Figure 1, in which we provide a visual display of the evolution of the tables computed by the NashProp table-passing phase for a small (3 by 3) grid game. [sent-203, score-0.558]
87 Note that for this game, the table-passing phase constrains the search space tremendously — so much so that the projection sets entirely determine the unique equilibrium, and the assignment-passing phase is superfluous. [sent-204, score-0.511]
88 Figure 1: Visual display of the NashProp table-passing phase after rounds 1, 2, 3 and 8 (where convergence occurs). [sent-208, score-0.353]
89 For the reward functions, each player has a distinct preference for one of his two actions. [sent-210, score-0.299]
90 Excluding the small fraction of trials in which the assignment-passing phase failed to find a solution, even on grid and loopy chord graphs with 100 nodes, we find convergence of both the table and assignment-passing phases in less than a dozen rounds. [sent-221, score-0.701]
91 We next note that there is considerable variation across topologies (and little within) in the amount of work done by the table-passing phase, both in terms of the expected number of rounds to convergence, and the fraction of 0 entries that have been computed at completion. [sent-222, score-0.232]
92 For example, for cycles the amount of work in both senses is at its highest, while for grids with random rewards it is lowest. [sent-223, score-0.293]
93 For grids and chordal cycles, increasing the bias of the payoff matrices generally causes more to be accomplished by the table-passing phase. [sent-224, score-0.51]
94 Figure 2: Plots showing the number of rounds taken by the NashProp table-passing (left) and assignment-passing (right) phases in computing an equilibrium, for a variety of different graph topologies (horizontal axes: number of players, 0 to 100). [sent-249, score-0.439]
95 In these cases little progress is made on the outbound tables — a node has too many neighbors whose combined settings can act as witnesses for a 1 in an outbound table. [sent-255, score-0.523]
96 However, as suggested by the theory, greater progress (and computation) in the table-passing phase pays dividends in the assignment-passing phase, since the search space may have been dramatically reduced. [sent-256, score-0.329]
97 For example, for chordal and grid graphs with biased rewards, the ordering of plots by convergence time is essentially reversed from the table-passing to assignment-passing phases. [sent-257, score-0.547]
98 This suggests that, when it occurs, the additional convergence time in the table-passing phase is worth the investment. [sent-258, score-0.266]
99 However, we again note that even for the least useful table-passing phase (for grids with random rewards), the assignment-passing phase (which thus exploits the graph structure alone) still manages to find an equilibrium rapidly. [sent-259, score-0.736]
100 Weiss, Y. (2000). Correctness of local probability propagation in graphical models with loops. Neural Computation, 12(1):1-41. [sent-290, score-0.198]
wordName wordTfidf (topN-words)
[('nashprop', 0.428), ('nash', 0.291), ('chordal', 0.257), ('player', 0.251), ('phase', 0.233), ('tables', 0.228), ('neighbors', 0.193), ('game', 0.191), ('kls', 0.188), ('mixed', 0.164), ('payoff', 0.149), ('vickrey', 0.12), ('players', 0.119), ('junction', 0.119), ('graphical', 0.114), ('witness', 0.109), ('equilibria', 0.109), ('cycles', 0.104), ('grids', 0.104), ('equilibrium', 0.104), ('loopy', 0.102), ('grid', 0.097), ('rounds', 0.087), ('rewards', 0.085), ('games', 0.083), ('topologies', 0.081), ('graphs', 0.074), ('tree', 0.069), ('vertices', 0.068), ('graph', 0.062), ('koller', 0.058), ('strategy', 0.055), ('shall', 0.054), ('phases', 0.052), ('round', 0.052), ('chords', 0.051), ('outbound', 0.051), ('polytree', 0.051), ('ringofrings', 0.051), ('tablepassing', 0.051), ('discretization', 0.049), ('actions', 0.048), ('preference', 0.048), ('vertex', 0.047), ('table', 0.047), ('search', 0.045), ('response', 0.045), ('rings', 0.045), ('propagation', 0.044), ('strategies', 0.044), ('heuristics', 0.042), ('settings', 0.041), ('nodes', 0.04), ('balanced', 0.04), ('neighborhood', 0.04), ('induction', 0.04), ('local', 0.04), ('undirected', 0.038), ('limit', 0.037), ('established', 0.037), ('tabular', 0.036), ('entry', 0.035), ('belief', 0.035), ('entries', 0.035), ('biased', 0.035), ('dozen', 0.034), ('moralization', 0.034), ('unilaterally', 0.034), ('node', 0.034), ('inference', 0.033), ('convergence', 0.033), ('veri', 0.033), ('considerably', 0.033), ('discretized', 0.033), ('global', 0.032), ('assignment', 0.032), ('consistency', 0.031), ('sent', 0.031), ('trivially', 0.031), ('kearns', 0.03), ('triangulation', 0.03), ('merit', 0.03), ('obeying', 0.03), ('fraction', 0.029), ('immediately', 0.029), ('theorem', 0.028), ('assigned', 0.028), ('backtracking', 0.027), ('deviating', 0.027), ('exists', 0.027), ('converge', 0.027), ('lemma', 0.027), ('resolution', 0.026), ('lauritzen', 0.025), ('matrices', 0.025), ('play', 0.024), ('establish', 0.024), ('indexed', 0.024), ('neighbor', 0.024), ('consistent', 0.024)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 152 nips-2002-Nash Propagation for Loopy Graphical Games
Author: Luis E. Ortiz, Michael Kearns
Abstract: We introduce NashProp, an iterative and local message-passing algorithm for computing Nash equilibria in multi-player games represented by arbitrary undirected graphs. We provide a formal analysis and experimental evidence demonstrating that NashProp performs well on large graphical games with many loops, often converging in just a dozen iterations on graphs with hundreds of nodes. NashProp generalizes the tree algorithm of (Kearns et al. 2001), and can be viewed as similar in spirit to belief propagation in probabilistic inference, and thus complements the recent work of (Vickrey and Koller 2002), who explored a junction tree approach. Thus, as for probabilistic inference, we have at least two promising general-purpose approaches to equilibria computation in graphs.
2 0.32843432 175 nips-2002-Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games
Author: Xiaofeng Wang, Tuomas Sandholm
Abstract: Multiagent learning is a key problem in AI. In the presence of multiple Nash equilibria, even agents with non-conflicting interests may not be able to learn an optimal coordination policy. The problem is exacerbated if the agents do not know the game and independently receive noisy payoffs. So, multiagent reinforcement learning involves two interrelated problems: identifying the game and learning to play. In this paper, we present optimal adaptive learning, the first algorithm that converges to an optimal Nash equilibrium with probability 1 in any team Markov game. We provide a convergence proof, and show that the algorithm’s parameters are easy to set to meet the convergence conditions.
3 0.32231757 78 nips-2002-Efficient Learning Equilibrium
Author: Ronen I. Brafman, Moshe Tennenholtz
Abstract: We introduce efficient learning equilibrium (ELE), a normative approach to learning in non cooperative settings. In ELE, the learning algorithms themselves are required to be in equilibrium. In addition, the learning algorithms arrive at a desired value after polynomial time, and deviations from a prescribed ELE become irrational after polynomial time. We prove the existence of an ELE in the perfect monitoring setting, where the desired value is the expected payoff in a Nash equilibrium. We also show that an ELE does not always exist in the imperfect monitoring case. Yet, it exists in the special case of common-interest games. Finally, we extend our results to general stochastic games. 1
4 0.10814803 94 nips-2002-Fractional Belief Propagation
Author: Wim Wiegerinck, Tom Heskes
Abstract: We consider loopy belief propagation for approximate inference in probabilistic graphical models. A limitation of the standard algorithm is that clique marginals are computed as if there were no loops in the graph. To overcome this limitation, we introduce fractional belief propagation. Fractional belief propagation is formulated in terms of a family of approximate free energies, which includes the Bethe free energy and the naive mean-field free as special cases. Using the linear response correction of the clique marginals, the scale parameters can be tuned. Simulation results illustrate the potential merits of the approach.
5 0.094681427 130 nips-2002-Learning in Zero-Sum Team Markov Games Using Factored Value Functions
Author: Michail G. Lagoudakis, Ronald Parr
Abstract: We present a new method for learning good strategies in zero-sum Markov games in which each side is composed of multiple agents collaborating against an opposing team of agents. Our method requires full observability and communication during learning, but the learned policies can be executed in a distributed manner. The value function is represented as a factored linear architecture and its structure determines the necessary computational resources and communication bandwidth. This approach permits a tradeoff between simple representations with little or no communication between agents and complex, computationally intensive representations with extensive coordination between agents. Thus, we provide a principled means of using approximation to combat the exponential blowup in the joint action space of the participants. The approach is demonstrated with an example that shows the efficiency gains over naive enumeration.
6 0.094490632 189 nips-2002-Stable Fixed Points of Loopy Belief Propagation Are Local Minima of the Bethe Free Energy
7 0.071772099 4 nips-2002-A Differential Semantics for Jointree Algorithms
8 0.070674933 13 nips-2002-A Note on the Representational Incompatibility of Function Approximation and Factored Dynamics
9 0.060805175 203 nips-2002-Using Tarjan's Red Rule for Fast Dependency Tree Construction
10 0.060047302 80 nips-2002-Exact MAP Estimates by (Hyper)tree Agreement
11 0.059945129 124 nips-2002-Learning Graphical Models with Mercer Kernels
12 0.059160255 32 nips-2002-Approximate Inference and Protein-Folding
13 0.057530265 44 nips-2002-Binary Tuning is Optimal for Neural Rate Coding with High Temporal Resolution
14 0.050408259 157 nips-2002-On the Dirichlet Prior and Bayesian Regularization
15 0.047033012 73 nips-2002-Dynamic Bayesian Networks with Deterministic Latent Tables
16 0.04547802 74 nips-2002-Dynamic Structure Super-Resolution
17 0.043014541 82 nips-2002-Exponential Family PCA for Belief Compression in POMDPs
18 0.042623051 181 nips-2002-Self Supervised Boosting
19 0.040488221 77 nips-2002-Effective Dimension and Generalization of Kernel Learning
20 0.039201658 85 nips-2002-Fast Kernels for String and Tree Matching
topicId topicWeight
[(0, -0.143), (1, -0.017), (2, -0.204), (3, -0.031), (4, -0.048), (5, -0.009), (6, -0.005), (7, -0.036), (8, -0.228), (9, -0.29), (10, 0.173), (11, 0.33), (12, 0.201), (13, -0.124), (14, 0.012), (15, 0.027), (16, -0.027), (17, 0.097), (18, -0.113), (19, 0.028), (20, 0.075), (21, 0.071), (22, 0.127), (23, 0.053), (24, 0.078), (25, 0.014), (26, 0.008), (27, -0.011), (28, 0.007), (29, 0.011), (30, -0.061), (31, -0.046), (32, -0.05), (33, 0.029), (34, -0.027), (35, -0.016), (36, -0.012), (37, -0.018), (38, 0.017), (39, 0.008), (40, -0.016), (41, 0.019), (42, -0.055), (43, 0.005), (44, 0.02), (45, 0.026), (46, -0.018), (47, 0.04), (48, 0.0), (49, -0.042)]
simIndex simValue paperId paperTitle
same-paper 1 0.96137929 152 nips-2002-Nash Propagation for Loopy Graphical Games
Author: Luis E. Ortiz, Michael Kearns
Abstract: We introduce NashProp, an iterative and local message-passing algorithm for computing Nash equilibria in multi-player games represented by arbitrary undirected graphs. We provide a formal analysis and experimental evidence demonstrating that NashProp performs well on large graphical games with many loops, often converging in just a dozen iterations on graphs with hundreds of nodes. NashProp generalizes the tree algorithm of (Kearns et al. 2001), and can be viewed as similar in spirit to belief propagation in probabilistic inference, and thus complements the recent work of (Vickrey and Koller 2002), who explored a junction tree approach. Thus, as for probabilistic inference, we have at least two promising general-purpose approaches to equilibria computation in graphs.
2 0.88496619 78 nips-2002-Efficient Learning Equilibrium
Author: Ronen I. Brafman, Moshe Tennenholtz
Abstract: We introduce efficient learning equilibrium (ELE), a normative approach to learning in non cooperative settings. In ELE, the learning algorithms themselves are required to be in equilibrium. In addition, the learning algorithms arrive at a desired value after polynomial time, and deviations from a prescribed ELE become irrational after polynomial time. We prove the existence of an ELE in the perfect monitoring setting, where the desired value is the expected payoff in a Nash equilibrium. We also show that an ELE does not always exist in the imperfect monitoring case. Yet, it exists in the special case of common-interest games. Finally, we extend our results to general stochastic games. 1
3 0.84790874 175 nips-2002-Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games
Author: Xiaofeng Wang, Tuomas Sandholm
Abstract: Multiagent learning is a key problem in AI. In the presence of multiple Nash equilibria, even agents with non-conflicting interests may not be able to learn an optimal coordination policy. The problem is exacerbated if the agents do not know the game and independently receive noisy payoffs. So, multiagent reinforcement learning involves two interrelated problems: identifying the game and learning to play. In this paper, we present optimal adaptive learning, the first algorithm that converges to an optimal Nash equilibrium with probability 1 in any team Markov game. We provide a convergence proof, and show that the algorithm’s parameters are easy to set to meet the convergence conditions.
4 0.40422586 130 nips-2002-Learning in Zero-Sum Team Markov Games Using Factored Value Functions
Author: Michail G. Lagoudakis, Ronald Parr
Abstract: We present a new method for learning good strategies in zero-sum Markov games in which each side is composed of multiple agents collaborating against an opposing team of agents. Our method requires full observability and communication during learning, but the learned policies can be executed in a distributed manner. The value function is represented as a factored linear architecture and its structure determines the necessary computational resources and communication bandwidth. This approach permits a tradeoff between simple representations with little or no communication between agents and complex, computationally intensive representations with extensive coordination between agents. Thus, we provide a principled means of using approximation to combat the exponential blowup in the joint action space of the participants. The approach is demonstrated with an example that shows the efficiency gains over naive enumeration.
5 0.31972545 94 nips-2002-Fractional Belief Propagation
Author: Wim Wiegerinck, Tom Heskes
Abstract: We consider loopy belief propagation for approximate inference in probabilistic graphical models. A limitation of the standard algorithm is that clique marginals are computed as if there were no loops in the graph. To overcome this limitation, we introduce fractional belief propagation. Fractional belief propagation is formulated in terms of a family of approximate free energies, which includes the Bethe free energy and the naive mean-field free as special cases. Using the linear response correction of the clique marginals, the scale parameters can be tuned. Simulation results illustrate the potential merits of the approach.
6 0.31104469 189 nips-2002-Stable Fixed Points of Loopy Belief Propagation Are Local Minima of the Bethe Free Energy
7 0.30257213 4 nips-2002-A Differential Semantics for Jointree Algorithms
8 0.24176967 32 nips-2002-Approximate Inference and Protein-Folding
9 0.23692347 80 nips-2002-Exact MAP Estimates by (Hyper)tree Agreement
10 0.23179229 185 nips-2002-Speeding up the Parti-Game Algorithm
11 0.22673251 203 nips-2002-Using Tarjan's Red Rule for Fast Dependency Tree Construction
12 0.21461117 13 nips-2002-A Note on the Representational Incompatibility of Function Approximation and Factored Dynamics
13 0.19116797 179 nips-2002-Scaling of Probability-Based Optimization Algorithms
14 0.19097988 44 nips-2002-Binary Tuning is Optimal for Neural Rate Coding with High Temporal Resolution
15 0.18728279 157 nips-2002-On the Dirichlet Prior and Bayesian Regularization
16 0.17570983 163 nips-2002-Prediction and Semantic Association
17 0.16702349 172 nips-2002-Recovering Articulated Model Topology from Observed Rigid Motion
18 0.16407998 56 nips-2002-Concentration Inequalities for the Missing Mass and for Histogram Rule Error
19 0.16343296 107 nips-2002-Identity Uncertainty and Citation Matching
20 0.1620791 100 nips-2002-Half-Lives of EigenFlows for Spectral Clustering
topicId topicWeight
[(11, 0.031), (22, 0.246), (23, 0.012), (42, 0.078), (54, 0.11), (55, 0.023), (57, 0.038), (67, 0.012), (68, 0.039), (74, 0.172), (87, 0.015), (92, 0.044), (98, 0.092)]
simIndex simValue paperId paperTitle
same-paper 1 0.84969413 152 nips-2002-Nash Propagation for Loopy Graphical Games
Author: Luis E. Ortiz, Michael Kearns
Abstract: We introduce NashProp, an iterative and local message-passing algorithm for computing Nash equilibria in multi-player games represented by arbitrary undirected graphs. We provide a formal analysis and experimental evidence demonstrating that NashProp performs well on large graphical games with many loops, often converging in just a dozen iterations on graphs with hundreds of nodes. NashProp generalizes the tree algorithm of (Kearns et al. 2001), and can be viewed as similar in spirit to belief propagation in probabilistic inference, and thus complements the recent work of (Vickrey and Koller 2002), who explored a junction tree approach. Thus, as for probabilistic inference, we have at least two promising general-purpose approaches to equilibria computation in graphs.
2 0.80697459 40 nips-2002-Bayesian Models of Inductive Generalization
Author: Neville E. Sanjana, Joshua B. Tenenbaum
Abstract: We argue that human inductive generalization is best explained in a Bayesian framework, rather than by traditional models based on similarity computations. We go beyond previous work on Bayesian concept learning by introducing an unsupervised method for constructing flexible hypothesis spaces, and we propose a version of the Bayesian Occam’s razor that trades off priors and likelihoods to prevent under- or over-generalization in these flexible spaces. We analyze two published data sets on inductive reasoning as well as the results of a new behavioral study that we have carried out.
3 0.67702752 162 nips-2002-Parametric Mixture Models for Multi-Labeled Text
Author: Naonori Ueda, Kazumi Saito
Abstract: We propose probabilistic generative models, called parametric mixture models (PMMs), for multiclass, multi-labeled text categorization problem. Conventionally, the binary classification approach has been employed, in which whether or not text belongs to a category is judged by the binary classifier for every category. In contrast, our approach can simultaneously detect multiple categories of text using PMMs. We derive efficient learning and prediction algorithms for PMMs. We also empirically show that our method could significantly outperform the conventional binary methods when applied to multi-labeled text categorization using real World Wide Web pages. 1
4 0.66872352 175 nips-2002-Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games
Author: Xiaofeng Wang, Tuomas Sandholm
Abstract: Multiagent learning is a key problem in AI. In the presence of multiple Nash equilibria, even agents with non-conflicting interests may not be able to learn an optimal coordination policy. The problem is exacerbated if the agents do not know the game and independently receive noisy payoffs. So, multiagent reinforcement learning involves two interrelated problems: identifying the game and learning to play. In this paper, we present optimal adaptive learning, the first algorithm that converges to an optimal Nash equilibrium with probability 1 in any team Markov game. We provide a convergence proof, and show that the algorithm’s parameters are easy to set to meet the convergence conditions.
5 0.66103333 132 nips-2002-Learning to Detect Natural Image Boundaries Using Brightness and Texture
Author: David R. Martin, Charless C. Fowlkes, Jitendra Malik
Abstract: The goal of this work is to accurately detect and localize boundaries in natural scenes using local image measurements. We formulate features that respond to characteristic changes in brightness and texture associated with natural boundaries. In order to combine the information from these features in an optimal way, a classifier is trained using human labeled images as ground truth. We present precision-recall curves showing that the resulting detector outperforms existing approaches.
6 0.65996873 135 nips-2002-Learning with Multiple Labels
7 0.65832859 74 nips-2002-Dynamic Structure Super-Resolution
8 0.65732431 39 nips-2002-Bayesian Image Super-Resolution
9 0.65388942 78 nips-2002-Efficient Learning Equilibrium
10 0.6516875 90 nips-2002-Feature Selection in Mixture-Based Clustering
11 0.6501171 89 nips-2002-Feature Selection by Maximum Marginal Diversity
12 0.6494835 124 nips-2002-Learning Graphical Models with Mercer Kernels
13 0.64772987 173 nips-2002-Recovering Intrinsic Images from a Single Image
14 0.64761108 163 nips-2002-Prediction and Semantic Association
15 0.64698923 157 nips-2002-On the Dirichlet Prior and Bayesian Regularization
16 0.64565355 122 nips-2002-Learning About Multiple Objects in Images: Factorial Learning without Factorial Search
18 0.64451331 52 nips-2002-Cluster Kernels for Semi-Supervised Learning
19 0.64442343 114 nips-2002-Information Regularization with Partially Labeled Data
20 0.64232022 48 nips-2002-Categorization Under Complexity: A Unified MDL Account of Human Learning of Regular and Irregular Categories