nips nips2007 nips2007-208 knowledge-graph by maker-knowledge-mining

208 nips-2007-TrueSkill Through Time: Revisiting the History of Chess


Source: pdf

Author: Pierre Dangauthier, Ralf Herbrich, Tom Minka, Thore Graepel

Abstract: We extend the Bayesian skill rating system TrueSkill to infer entire time series of skills of players by smoothing through time instead of filtering. The skill of each participating player, say, every year is represented by a latent skill variable which is affected by the relevant game outcomes that year, and coupled with the skill variables of the previous and subsequent year. Inference in the resulting factor graph is carried out by approximate message passing (EP) along the time series of skills. As before the system tracks the uncertainty about player skills, explicitly models draws, can deal with any number of competing entities and can infer individual skills from team results. We extend the system to estimate player-specific draw margins. Based on these models we present an analysis of the skill curves of important players in the history of chess over the past 150 years. Results include plots of players’ lifetime skill development as well as the ability to compare the skills of different players across time. Our results indicate that a) the overall playing strength has increased over the past 150 years, and b) that modelling a player’s ability to force a draw provides significantly better predictive power. 1

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract We extend the Bayesian skill rating system TrueSkill to infer entire time series of skills of players by smoothing through time instead of filtering. [sent-9, score-1.172]

2 The skill of each participating player, say, every year is represented by a latent skill variable which is affected by the relevant game outcomes that year, and coupled with the skill variables of the previous and subsequent year. [sent-10, score-2.062]

3 As before the system tracks the uncertainty about player skills, explicitly models draws, can deal with any number of competing entities and can infer individual skills from team results. [sent-12, score-0.565]

4 Based on these models we present an analysis of the skill curves of important players in the history of chess over the past 150 years. [sent-14, score-1.123]

5 Results include plots of players’ lifetime skill development as well as the ability to compare the skills of different players across time. [sent-15, score-1.041]

6 Our results indicate that a) the overall playing strength has increased over the past 150 years, and b) that modelling a player’s ability to force a draw provides significantly better predictive power. [sent-16, score-0.183]

7 1 Introduction Competitive games and sports can benefit from statistical skill ratings for use in matchmaking as well as for providing criteria for the admission to tournaments. [sent-17, score-0.753]

8 From a historical perspective, skill ratings also provide information about the general development of skill within the discipline or for a particular group of interest. [sent-18, score-1.327]

9 Also, they can give a fascinating narrative about the key players in a given discipline, allowing a glimpse at their rise and fall or their struggle against their contemporaries. [sent-19, score-0.26]

10 In order to provide good estimates of the current skill level of players, skill rating systems have traditionally been designed as filters that combine a new game outcome with knowledge about a player’s skill from the past to obtain a new estimate. [sent-20, score-2.285]

11 In contrast, when taking a historical view we would like to infer the skill of a player at a given point in the past when both their past as well as their future achievements are known. [sent-21, score-0.991]

12 The best known such skill filter based rating system is the Elo system [3] developed by Arpad Elo in 1959 and adopted by the World Chess Federation FIDE in 1970 [4]. [sent-22, score-0.719]

13 Elo models the probability of the game outcome as P(1 wins over 2 | s1, s2) := Φ((s1 − s2)/(√2·β)), where s1 and s2 are the skill ratings of each player, Φ denotes the cumulative density of a zero-mean unit-variance Gaussian and β is the assumed variability of performance around skill. [sent-23, score-0.871]
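A minimal numerical sketch of this win-probability formula; the default β value below is an illustrative assumption, not a calibrated Elo constant.

```python
from math import erf, sqrt

def elo_win_probability(s1: float, s2: float, beta: float = 200.0) -> float:
    """P(player 1 wins over player 2) = Phi((s1 - s2) / (sqrt(2) * beta))."""
    x = (s1 - s2) / (sqrt(2.0) * beta)
    # Phi(x): CDF of a zero-mean unit-variance Gaussian, expressed via erf.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))
```

Equal ratings give probability 0.5, and the two players' win probabilities sum to one (draws are handled separately in the Elo system).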

14 Denote the game outcomes by y = +1 if player 1 wins, y = −1 if player 2 wins and y = 0 if a draw occurs. [sent-24, score-0.854]

15 The TrueSkill rating system [6] improves on the Elo system in a number of ways. [sent-26, score-0.136]

16 TrueSkill’s current belief about a player’s skill is represented by a Gaussian distribution with mean µ and variance σ². [sent-27, score-0.583]

17 As a consequence, TrueSkill does not require a provisional rating period and converges to the true skills of players very quickly. [sent-28, score-0.542]

18 Crucially for its application in the Xbox Live online gaming system (see [6] for details) it can also infer skills from games with more than two participating entities and infers individual players’ skills from the outcomes of team games. [sent-30, score-0.729]

19 As a skill rating and matchmaking system TrueSkill operates as a filter as discussed above. [sent-31, score-0.718]

20 However, due to its fully probabilistic formulation it is possible to extend TrueSkill to perform smoothing on a time series of player skills. [sent-32, score-0.238]

21 In this paper we extend TrueSkill to provide accurate estimates of the past skill levels of players at any point in time taking into account both their past and their future achievements. [sent-33, score-0.893]

22 5 million games of chess played over the last 150 years. [sent-35, score-0.403]

23 In Section 2 we review previous work on historical chess ratings. [sent-37, score-0.343]

24 In Section 3 we present two models for historical ratings through time, one assuming a fixed draw margin and one estimating the draw margin per player per year. [sent-38, score-0.664]

25 5 million games and gain some fascinating chess specific insights from the data. [sent-41, score-0.403]

26 2 Previous Work on Historical Chess Ratings Estimating players’ skills in retrospect allows one to take into account more information and hence can be expected to lead to more precise estimates. [sent-42, score-0.22]

27 The pioneer in this field was Arpad Elo himself, when he encountered the necessity of initializing the skill values of the Elo system when it was first deployed. [sent-43, score-0.609]

28 To that end he fitted a smooth curve to skill estimates from five-year periods; however little is known about the details of his method [3]. [sent-44, score-0.583]

29 Probably best known in the chess community is the Chessmetrics system [8], which aims at improving the Elo scores by attempting to obtain a better fit with the observed data. [sent-45, score-0.292]

30 The first approach to the historical rating problem with a solid statistical foundation was developed by Mark Glickman, chairman of the USCF Rating Committee. [sent-47, score-0.161]

31 Glickman models skills as Gaussian variables whose variances indicate the reliability of the skill estimate, an idea later adopted in the TrueSkill model as well. [sent-49, score-0.82]

32 The second statistically well-founded approach is Rod Edwards’s Edo Historical Chess Ratings [2], which are also based on the Bradley-Terry model but have been applied only to historical games from the 19th century. [sent-52, score-0.163]

33 In order to model skill dynamics, Edwards considers the same player at different times as several distinct players, whose skills are linked together by a set of virtual games which are assumed to end in draws. [sent-53, score-1.146]

34 Although TrueSkill is applicable to the case of multiple team games, we will only consider the two player case for this application to chess. [sent-57, score-0.263]

35 Consider a game such as chess in which a number of, say, N players {1, . [sent-59, score-0.632]

36 Denote the series of game outcomes between two players i and j in year t by y_ij^t(k) ∈ {+1, −1, 0}, where k ∈ {1, … [sent-63, score-0.625]

…, K_ij^t}, where K_ij^t denotes the number of game outcomes available for that pair of players in that year. [sent-66, score-0.458]

38 Furthermore, let y = +1 if player i wins, y = −1 if player j wins and y = 0 in case of a draw. [sent-67, score-0.527]

39 1 Vanilla TrueSkill In the Vanilla TrueSkill system, each player i is assumed to have an unknown skill s_i^t ∈ R at time t. [sent-69, score-1.042]

40 We assume that a game outcome y_ij^t(k) is generated as follows. [sent-70, score-0.272]

41 For each of the two players i and j, performances p_ij^t(k) and p_ji^t(k) are drawn according to p(p_ij^t(k) | s_i^t) = N(p_ij^t(k); s_i^t, β²). [sent-71, score-1.057]

42 The outcome y_ij^t(k) of the game between players i and j is then determined as y_ij^t(k) := +1 if p_ij^t(k) > p_ji^t(k) + ε; −1 if p_ji^t(k) > p_ij^t(k) + ε; 0 if |p_ij^t(k) − p_ji^t(k)| ≤ ε, where the parameter ε > 0 is the draw margin. [sent-72, score-2.391]
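This generative rule can be sketched directly; function name, seed and parameter values are illustrative assumptions.

```python
import random

def sample_outcome(s_i: float, s_j: float, beta: float, eps: float,
                   rng: random.Random = random.Random(0)) -> int:
    """Sample one game outcome between players with skills s_i and s_j."""
    p_ij = rng.gauss(s_i, beta)   # performance of i, drawn around its skill
    p_ji = rng.gauss(s_j, beta)   # performance of j
    if p_ij > p_ji + eps:
        return +1                 # player i wins
    if p_ji > p_ij + eps:
        return -1                 # player j wins
    return 0                      # draw: |p_ij - p_ji| <= eps
```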

43 In order to infer the unknown skills s_i^t, the TrueSkill model assumes a factorising Gaussian prior p(s_i^0) = N(s_i^0; µ0, σ0²) over skills and a Gaussian drift of skills between time steps given by p(s_i^t | s_i^{t−1}) = N(s_i^t; s_i^{t−1}, τ²). [sent-73, score-1.148]

44 µ_W ← µ_W + (σ_W²/c_ij)·v((µ_W − µ_L)/c_ij, ε/c_ij) and µ_L ← µ_L − (σ_L²/c_ij)·v((µ_W − µ_L)/c_ij, ε/c_ij), and σ_W² ← σ_W²·(1 − (σ_W²/c_ij²)·w((µ_W − µ_L)/c_ij, ε/c_ij)) and σ_L² ← σ_L²·(1 − (σ_L²/c_ij²)·w((µ_W − µ_L)/c_ij, ε/c_ij)). [sent-75, score-1.459]

45 For the case of a draw we have the following update equations: µ_i ← µ_i + (σ_i²/c_ij)·ṽ((µ_i − µ_j)/c_ij, ε/c_ij) and σ_i² ← σ_i²·(1 − (σ_i²/c_ij²)·w̃((µ_i − µ_j)/c_ij, ε/c_ij)), and similarly for player j. [sent-77, score-1.076]
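The win/loss update can be sketched as follows, using the moment-matching functions v(t, ε) = N(t − ε)/Φ(t − ε) and w(t, ε) = v·(v + t − ε) from the TrueSkill paper, with c_ij² = 2β² + σ_W² + σ_L². This is a sketch of one assumed-density update, not the authors' implementation.

```python
from math import erf, exp, pi, sqrt

def pdf(x: float) -> float:          # N(x): standard normal density
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def cdf(x: float) -> float:          # Phi(x): standard normal CDF
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def v_win(t: float, eps: float) -> float:
    return pdf(t - eps) / cdf(t - eps)

def w_win(t: float, eps: float) -> float:
    v = v_win(t, eps)
    return v * (v + t - eps)

def update_win(mu_w, sig2_w, mu_l, sig2_l, beta, eps):
    """One win/loss update (W beat L); returns new (mu, sigma^2) pairs."""
    c = sqrt(2.0 * beta ** 2 + sig2_w + sig2_l)
    t, e = (mu_w - mu_l) / c, eps / c
    mu_w += (sig2_w / c) * v_win(t, e)
    mu_l -= (sig2_l / c) * v_win(t, e)
    sig2_w *= 1.0 - (sig2_w / c ** 2) * w_win(t, e)
    sig2_l *= 1.0 - (sig2_l / c ** 2) * w_win(t, e)
    return mu_w, sig2_w, mu_l, sig2_l
```

With equal prior means, the winner's mean moves up, the loser's moves down by the same amount, and both variances shrink; the draw case uses ṽ and w̃ analogously.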

46 In order to approximate the skill parameters µ_i^t and σ_i^t for all players i ∈ {1, … [sent-79, score-0.821]

47 …, T}, the Vanilla TrueSkill algorithm initialises each skill belief with µ_i^0 ← µ0 and σ_i^0 ← σ0. [sent-85, score-0.583]

48 …, T} in order, goes through the game outcomes y_ij^t(k) in random order and updates the skill beliefs according to the equations above. [sent-89, score-0.897]

49 Since no knowledge is assumed about game outcomes within a given year, the results of inference should be independent of the order of games within a year. [sent-93, score-0.324]

50 More concretely, if player A beats player B and player B later turns out to be very strong (i. [sent-96, score-0.714]

51 , as evidenced by him beating very strong player C repeatedly), then Vanilla TrueSkill cannot propagate that information backwards in time to correct player A’s skill estimate upwards. [sent-98, score-1.059]

52 The basic idea is to update repeatedly on the same game outcomes, while making sure that the effect of the previous update on that game outcome is removed before the new effect is added. [sent-100, score-0.43]

53 More specifically, we go through the game outcomes y_ij^t within a year t several times until convergence. [sent-102, score-0.387]
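The remove-then-update step is natural with Gaussians in natural parameters, where dividing out a stored message is just a subtraction; the small helper class below is an illustrative sketch, not the authors' code.

```python
from dataclasses import dataclass

@dataclass
class Gaussian:
    """Gaussian in natural form: pi = 1/sigma^2, tau = mu/sigma^2."""
    pi: float
    tau: float

    def __mul__(self, other: "Gaussian") -> "Gaussian":
        # Multiplying densities adds natural parameters.
        return Gaussian(self.pi + other.pi, self.tau + other.tau)

    def __truediv__(self, other: "Gaussian") -> "Gaussian":
        # Dividing out a previously absorbed message subtracts them.
        return Gaussian(self.pi - other.pi, self.tau - other.tau)

# Removing the stored message for a game restores the cavity belief,
# on which that game's update can then be recomputed and re-absorbed.
belief = Gaussian(pi=2.0, tau=5.0)
stored_msg = Gaussian(pi=0.5, tau=1.0)
cavity = belief / stored_msg
assert (cavity * stored_msg) == belief   # division undoes multiplication
```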

54 The update for a game outcome y_ij^t(k) is performed in the same way as before, but saving the upward messages m_{f(p_ij^t(k), s_i^t)→s_i^t}(s_i^t), which describe the effect of the updated performance p_ij^t(k) on the underlying skill s_i^t. [sent-103, score-1.469]

55 The new downward message serves as the effective prior belief on the performance p_ij^t(k). [sent-106, score-0.22]

56 At convergence, the dependency of the inferred skills on the order of game outcomes vanishes. [sent-107, score-0.44]

57 The first forward pass of TTT is identical to the inference pass of Vanilla TrueSkill, except that the forward messages m_{f(s_i^{t−1}, s_i^t)→s_i^t}(s_i^t) are stored. [sent-111, score-0.202]

58 They represent the influence of skill estimate s_i^{t−1} at time t − 1 on skill estimate s_i^t at time t. [sent-112, score-1.387]

59 In the backward pass, these messages are then used to calculate the new backward messages m_{f(s_i^{t−1}, s_i^t)→s_i^{t−1}}(s_i^{t−1}), which effectively serve as the new prior for time step t − 1: m_{f(s_i^{t−1}, s_i^t)→s_i^{t−1}}(s_i^{t−1}) = ∫_{−∞}^{+∞} f(s_i^{t−1}, s_i^t) · [p(s_i^t) / m_{f(s_i^{t−1}, s_i^t)→s_i^t}(s_i^t)] ds_i^t. [sent-113, score-0.578]
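The forward-backward combination can be sketched as Gaussian smoothing along the chain of yearly skills. As a simplifying assumption for illustration, each year's game evidence is collapsed into a single Gaussian pseudo-observation.

```python
def _multiply(g1, g2):
    """Product of two Gaussian densities, each given as (mu, var)."""
    if g1[1] == float('inf'): return g2
    if g2[1] == float('inf'): return g1
    prec = 1.0 / g1[1] + 1.0 / g2[1]
    return ((g1[0] / g1[1] + g2[0] / g2[1]) / prec, 1.0 / prec)

def smooth(obs, tau2):
    """Gaussian forward-backward smoothing along a chain of yearly skills.

    obs[t] = (mu, var): evidence about the skill in year t; tau2 is the
    drift variance between years. Returns one smoothed (mu, var) per year."""
    n = len(obs)
    flat = (0.0, float('inf'))
    fwd, bwd = [flat] * n, [flat] * n
    for t in range(1, n):           # forward: absorb evidence, add drift
        mu, var = _multiply(fwd[t - 1], obs[t - 1])
        fwd[t] = (mu, var + tau2)
    for t in range(n - 2, -1, -1):  # backward pass, mirrored
        mu, var = _multiply(bwd[t + 1], obs[t + 1])
        bwd[t] = (mu, var + tau2)
    return [_multiply(_multiply(fwd[t], obs[t]), bwd[t]) for t in range(n)]
```

Unlike a pure filter, the smoothed marginal for an early year is pulled toward strong evidence observed in later years, which is exactly the backward flow of information discussed in the text.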

60 This procedure is repeated forward and backward along the time series of skills until convergence. [sent-114, score-0.286]

61 In the left graph there are three types of variables: skills s, performances p, performance differences d. [sent-117, score-0.258]

62 In the TTT-D graphs there are two additional types: draw margins ε and winning thresholds u. [sent-118, score-0.169]

63 3 TTT with Individual Draw Margins (TTT-D) From exploring the data it is known that the probability of draw not only increases markedly through the history of chess, but is also positively correlated with playing skill and even varies considerably across individual players. [sent-121, score-0.713]

64 Suppose each player i at every time-step t is characterised by an unknown skill s_i^t ∈ R and a player-specific draw margin ε_i^t > 0. [sent-123, score-1.187]

65 Again, performances p_ij^t(k) and p_ji^t(k) are drawn according to p(p_ij^t(k) | s_i^t) = N(p_ij^t(k); s_i^t, β²). [sent-124, score-1.061]

66 In this model a game outcome y_ij^t(k) between players i and j at time t is generated as follows: y_ij^t(k) = +1 if p_ij^t(k) > p_ji^t(k) + ε_j^t; −1 if p_ji^t(k) > p_ij^t(k) + ε_i^t; 0 if −ε_i^t ≤ p_ij^t(k) − p_ji^t(k) ≤ ε_j^t. [sent-125, score-2.237]
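The generative sketch for Vanilla TrueSkill carries over, adapted to player-specific margins; names, seed and values are again illustrative.

```python
import random

def sample_outcome_tttd(s_i: float, s_j: float, eps_i: float, eps_j: float,
                        beta: float, rng: random.Random = random.Random(1)) -> int:
    """Sample one game outcome when each player has an own draw margin."""
    p_ij = rng.gauss(s_i, beta)
    p_ji = rng.gauss(s_j, beta)
    if p_ij > p_ji + eps_j:       # i must clear j's draw margin to win
        return +1
    if p_ji > p_ij + eps_i:       # j must clear i's draw margin to win
        return -1
    return 0                      # -eps_i <= p_ij - p_ji <= eps_j
```

A player with a large draw margin is hard to beat outright: an opponent's performance edge must exceed that margin before a win is recorded, which matches the interpretation of the draw-margin belief in the text.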

67 In addition to the Gaussian assumption about player skills as in the Vanilla TrueSkill model of Section 3.1. [sent-126, score-0.458]

68 we assume a factorising Gaussian distribution for the player-specific draw margins p(ε_i^0) = N(ε_i^0; ν0, ς0²) and a Gaussian drift of draw margins between time steps given by p(ε_i^t | ε_i^{t−1}) = N(ε_i^t; ε_i^{t−1}, ς²). [sent-127, score-0.331]

69 The factor graph for the case of win/loss is shown in Figure 1 (centre) and for the case of a draw in Figure 1 (right). [sent-128, score-0.149]

70 Note that the positivity of the player-specific draw margins at each time step t is enforced by a factor I(ε_i^t > 0). [sent-129, score-0.178]

71 Inference in the TTT-D model is again performed by expectation propagation, both within a given year t as well as across years in a forward-backward manner. [sent-130, score-0.177]

72 Note that in this model the current belief about the skill of a player is represented by four numbers: µ_i^t and σ_i^t for the skill, and ν_i^t and ς_i^t for the player-specific draw margin. [sent-131, score-1.511]

73 Players with a high value of ν_i^t can be thought of as having the ability to achieve a draw against strong players, while players with a high value of µ_i^t have the ability to achieve a win. [sent-132, score-0.345]

74 Figure 2: (Left) Distribution over the number of recorded match outcomes played per year in the ChessBase database. [sent-136, score-0.187]

75 (Right) The log-evidence P (y|β, τ ) for the TTT model as a function of the variation of player performance, β, and skill dynamics, τ . [sent-137, score-0.821]

76 4 Experiments and Results Our experiments are based on a data-set of chess match outcomes collected by ChessBase 1 . [sent-139, score-0.358]

77 5 million chess games from 1560 to 2006 played between ≈200,000 unique players. [sent-141, score-0.403]

78 The draw margin was chosen such that the a-priori probability of a draw between two equally skilled players matches the overall draw probability of 30. [sent-149, score-0.597]

79 Moreover, the model has a translational invariance in the skills and a scale invariance in β/σ 0 and τ /σ0 . [sent-151, score-0.22]

80 Interestingly, the log-evidence is neither largest for τ → ∞ (complete de-coupling) nor for τ → 0 (constant skill over life-time), indicating that it is important to model the dynamics of Chess players. [sent-154, score-0.602]

81 In Figure 3 we have plotted the skill evolution of some well-known players of the last 150 years when fitting the TTT model (µ^t and σ^t are shown). [sent-158, score-0.879]

82 In Figure 4 the skill evolution of the same players is plotted when fitting the TTT-D model; the dashed lines show µ_i^t + ε_i^t. [sent-159, score-0.841]

83 Figure 3: Skill evolution of top Chess players with TTT; see text for details. [sent-169, score-0.258]

84 Moreover, there seems to be a noticeable increase in overall skill since the 1980’s. [sent-173, score-0.583]

85 Looking at Figure 4 we see that players have different abilities to force a draw; the strongest player to do so is Boris Spassky (1937–). [sent-174, score-0.514]

86 This ability got stronger after 1975, which explains why the model with a fixed draw margin estimates Spassky’s skill to be larger. [sent-175, score-0.728]

87 Looking at individual players we see that Paul Morphy (1837–1884), “The Pride and Sorrow of Chess”, is particularly strong when comparing his skill to those of his contemporaries in the next 80 years. [sent-176, score-0.821]

88 He is considered to have been the greatest chess master of his time, and this is well supported by our analysis. [sent-177, score-0.266]

89 Also, Fischer is the only one of these players whose εt decreased over time—when he was active, he was known for the large margin by which he won! [sent-181, score-0.276]

90 Finally, Garry Kasparov (1963–) is considered the strongest Chess player of all time. [sent-182, score-0.259]

91 In fact, based on our analysis Kasparov is still considerably stronger than Vladimir Kramnik (1975–) but a contender for the crown of strongest player in the world is Viswanathan Anand (1969–), a former FIDE world champion. [sent-184, score-0.259]

92 5 Conclusion We have extended the Bayesian rating system TrueSkill to provide player ratings through time on a unified scale. [sent-185, score-0.407]

93 In addition, we introduced a new model that tracks player-specific draw margins and thus models the game outcomes even more precisely. [sent-186, score-0.389]

94 The resulting factor graph model for our large ChessBase database of game outcomes has 18. [sent-187, score-0.279]

95 One of the key questions provoked by this work concerns the comparability of skill estimates across different eras of chess history. [sent-192, score-0.849]

96 Edwards [2] points out that we would not be able to detect any skill improvement if two players of equal skill were to learn about a skill-improving breakthrough in chess theory at the same time but would only play against each other. [sent-194, score-1.67]

97 However, this argument does not rule out the possibility that with more players and chess knowledge flowing less perfectly the improvement may be detectable. [sent-195, score-0.504]

98 After all, we do see a marked improvement in the average skill of the top players. [sent-196, score-0.583]

99 In future work, we would like to address the issue of skill calibration across years further, e. [sent-197, score-0.621]

100 , by introducing a latent variable for each year that serves as the prior for new players joining the pool. [sent-199, score-0.311]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('skill', 0.583), ('trueskill', 0.341), ('chess', 0.266), ('players', 0.238), ('player', 0.238), ('st', 0.221), ('skills', 0.22), ('pt', 0.169), ('ttt', 0.154), ('game', 0.128), ('cij', 0.124), ('elo', 0.11), ('draw', 0.107), ('ij', 0.095), ('yij', 0.094), ('outcomes', 0.092), ('games', 0.086), ('rating', 0.084), ('historical', 0.077), ('spassky', 0.076), ('vanilla', 0.075), ('year', 0.073), ('mf', 0.067), ('fischer', 0.062), ('ratings', 0.059), ('chessbase', 0.051), ('kasparov', 0.051), ('wins', 0.051), ('outcome', 0.05), ('ji', 0.048), ('margins', 0.046), ('boris', 0.038), ('edo', 0.038), ('margin', 0.038), ('years', 0.038), ('backward', 0.038), ('erent', 0.037), ('past', 0.036), ('di', 0.034), ('message', 0.033), ('edwards', 0.03), ('million', 0.029), ('messages', 0.029), ('forward', 0.028), ('system', 0.026), ('anand', 0.025), ('arpad', 0.025), ('bobby', 0.025), ('chessmetrics', 0.025), ('discipline', 0.025), ('factorising', 0.025), ('garry', 0.025), ('glicko', 0.025), ('kramnik', 0.025), ('lttt', 0.025), ('matchmaking', 0.025), ('morphy', 0.025), ('viswanathan', 0.025), ('team', 0.025), ('factor', 0.025), ('minka', 0.024), ('playing', 0.023), ('ep', 0.022), ('ectively', 0.022), ('fide', 0.022), ('glickman', 0.022), ('inactivity', 0.022), ('pw', 0.022), ('pij', 0.022), ('dst', 0.022), ('fascinating', 0.022), ('vladimir', 0.022), ('played', 0.022), ('ect', 0.021), ('strongest', 0.021), ('performances', 0.021), ('infer', 0.021), ('participating', 0.02), ('evolution', 0.02), ('dynamics', 0.019), ('gb', 0.019), ('pl', 0.019), ('entities', 0.019), ('microsoft', 0.018), ('downward', 0.018), ('won', 0.018), ('inference', 0.018), ('bayesian', 0.017), ('minutes', 0.017), ('force', 0.017), ('herbrich', 0.017), ('upward', 0.017), ('reliability', 0.017), ('passing', 0.017), ('graph', 0.017), ('database', 0.017), ('tracks', 0.016), ('winning', 0.016), ('pass', 0.016), ('update', 0.016)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000004 208 nips-2007-TrueSkill Through Time: Revisiting the History of Chess

Author: Pierre Dangauthier, Ralf Herbrich, Tom Minka, Thore Graepel

Abstract: We extend the Bayesian skill rating system TrueSkill to infer entire time series of skills of players by smoothing through time instead of filtering. The skill of each participating player, say, every year is represented by a latent skill variable which is affected by the relevant game outcomes that year, and coupled with the skill variables of the previous and subsequent year. Inference in the resulting factor graph is carried out by approximate message passing (EP) along the time series of skills. As before the system tracks the uncertainty about player skills, explicitly models draws, can deal with any number of competing entities and can infer individual skills from team results. We extend the system to estimate player-specific draw margins. Based on these models we present an analysis of the skill curves of important players in the history of chess over the past 150 years. Results include plots of players’ lifetime skill development as well as the ability to compare the skills of different players across time. Our results indicate that a) the overall playing strength has increased over the past 150 years, and b) that modelling a player’s ability to force a draw provides significantly better predictive power. 1

2 0.15652271 165 nips-2007-Regret Minimization in Games with Incomplete Information

Author: Martin Zinkevich, Michael Johanson, Michael Bowling, Carmelo Piccione

Abstract: Extensive games are a powerful model of multiagent decision-making scenarios with incomplete information. Finding a Nash equilibrium for very large instances of these games has received a great deal of recent attention. In this paper, we describe a new technique for solving large games based on regret minimization. In particular, we introduce the notion of counterfactual regret, which exploits the degree of incomplete information in an extensive game. We show how minimizing counterfactual regret minimizes overall regret, and therefore in self-play can be used to compute a Nash equilibrium. We demonstrate this technique in the domain of poker, showing we can solve abstractions of limit Texas Hold’em with as many as 1012 states, two orders of magnitude larger than previous methods. 1

3 0.13189687 176 nips-2007-Sequential Hypothesis Testing under Stochastic Deadlines

Author: Peter Frazier, Angela J. Yu

Abstract: Most models of decision-making in neuroscience assume an infinite horizon, which yields an optimal solution that integrates evidence up to a fixed decision threshold; however, under most experimental as well as naturalistic behavioral settings, the decision has to be made before some finite deadline, which is often experienced as a stochastic quantity, either due to variable external constraints or internal timing uncertainty. In this work, we formulate this problem as sequential hypothesis testing under a stochastic horizon. We use dynamic programming tools to show that, for a large class of deadline distributions, the Bayes-optimal solution requires integrating evidence up to a threshold that declines monotonically over time. We use numerical simulations to illustrate the optimal policy in the special cases of a fixed deadline and one that is drawn from a gamma distribution.

4 0.12827215 55 nips-2007-Computing Robust Counter-Strategies

Author: Michael Johanson, Martin Zinkevich, Michael Bowling

Abstract: Adaptation to other initially unknown agents often requires computing an effective counter-strategy. In the Bayesian paradigm, one must find a good counterstrategy to the inferred posterior of the other agents’ behavior. In the experts paradigm, one may want to choose experts that are good counter-strategies to the other agents’ expected behavior. In this paper we introduce a technique for computing robust counter-strategies for adaptation in multiagent scenarios under a variety of paradigms. The strategies can take advantage of a suspected tendency in the decisions of the other agents, while bounding the worst-case performance when the tendency is not observed. The technique involves solving a modified game, and therefore can make use of recently developed algorithms for solving very large extensive games. We demonstrate the effectiveness of the technique in two-player Texas Hold’em. We show that the computed poker strategies are substantially more robust than best response counter-strategies, while still exploiting a suspected tendency. We also compose the generated strategies in an experts algorithm showing a dramatic improvement in performance over using simple best responses. 1

5 0.12759317 102 nips-2007-Incremental Natural Actor-Critic Algorithms

Author: Shalabh Bhatnagar, Mohammad Ghavamzadeh, Mark Lee, Richard S. Sutton

Abstract: We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning methods are online approximations to policy iteration in which the value-function parameters are estimated using temporal difference learning and the policy parameters are updated by stochastic gradient descent. Methods based on policy gradients in this way are of special interest because of their compatibility with function approximation methods, which are needed to handle large or infinite state spaces. The use of temporal difference learning in this way is of interest because in many applications it dramatically reduces the variance of the gradient estimates. The use of the natural gradient is of interest because it can produce better conditioned parameterizations and has been shown to further reduce variance in some cases. Our results extend prior two-timescale convergence results for actor-critic methods by Konda and Tsitsiklis by using temporal difference learning in the actor and by incorporating natural gradients, and they extend prior empirical studies of natural actor-critic methods by Peters, Vijayakumar and Schaal by providing the first convergence proofs and the first fully incremental algorithms. 1

6 0.092368506 54 nips-2007-Computational Equivalence of Fixed Points and No Regret Algorithms, and Convergence to Equilibria

7 0.077623136 151 nips-2007-Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs

8 0.071075447 86 nips-2007-Exponential Family Predictive Representations of State

9 0.068858735 123 nips-2007-Loop Series and Bethe Variational Bounds in Attractive Graphical Models

10 0.068034396 57 nips-2007-Congruence between model and human attention reveals unique signatures of critical visual events

11 0.065756433 5 nips-2007-A Game-Theoretic Approach to Apprenticeship Learning

12 0.064764179 127 nips-2007-Measuring Neural Synchrony by Message Passing

13 0.062928043 92 nips-2007-Fixing Max-Product: Convergent Message Passing Algorithms for MAP LP-Relaxations

14 0.052399263 197 nips-2007-The Infinite Markov Model

15 0.050383076 199 nips-2007-The Price of Bandit Information for Online Optimization

16 0.046001721 191 nips-2007-Temporal Difference Updating without a Learning Rate

17 0.040087722 19 nips-2007-Active Preference Learning with Discrete Choice Data

18 0.039271899 214 nips-2007-Variational inference for Markov jump processes

19 0.036148995 187 nips-2007-Structured Learning with Approximate Inference

20 0.035537429 168 nips-2007-Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.107), (1, -0.134), (2, 0.024), (3, -0.026), (4, 0.137), (5, -0.046), (6, -0.052), (7, 0.031), (8, 0.022), (9, 0.017), (10, -0.129), (11, -0.101), (12, -0.018), (13, 0.02), (14, 0.078), (15, 0.02), (16, -0.011), (17, 0.061), (18, -0.128), (19, 0.14), (20, -0.019), (21, 0.104), (22, 0.164), (23, 0.051), (24, 0.067), (25, 0.136), (26, -0.054), (27, 0.07), (28, 0.02), (29, -0.093), (30, -0.008), (31, -0.095), (32, 0.017), (33, -0.012), (34, 0.202), (35, -0.081), (36, 0.06), (37, -0.085), (38, 0.076), (39, 0.013), (40, -0.026), (41, -0.228), (42, -0.112), (43, -0.017), (44, -0.004), (45, -0.055), (46, -0.012), (47, -0.015), (48, -0.09), (49, 0.014)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97404617 208 nips-2007-TrueSkill Through Time: Revisiting the History of Chess

Author: Pierre Dangauthier, Ralf Herbrich, Tom Minka, Thore Graepel

Abstract: We extend the Bayesian skill rating system TrueSkill to infer entire time series of skills of players by smoothing through time instead of filtering. The skill of each participating player, say, every year is represented by a latent skill variable which is affected by the relevant game outcomes that year, and coupled with the skill variables of the previous and subsequent year. Inference in the resulting factor graph is carried out by approximate message passing (EP) along the time series of skills. As before the system tracks the uncertainty about player skills, explicitly models draws, can deal with any number of competing entities and can infer individual skills from team results. We extend the system to estimate player-specific draw margins. Based on these models we present an analysis of the skill curves of important players in the history of chess over the past 150 years. Results include plots of players’ lifetime skill development as well as the ability to compare the skills of different players across time. Our results indicate that a) the overall playing strength has increased over the past 150 years, and b) that modelling a player’s ability to force a draw provides significantly better predictive power. 1

2 0.57705039 57 nips-2007-Congruence between model and human attention reveals unique signatures of critical visual events

Author: Robert Peters, Laurent Itti

Abstract: Current computational models of bottom-up and top-down components of attention are predictive of eye movements across a range of stimuli and of simple, fixed visual tasks (such as visual search for a target among distractors). However, to date there exists no computational framework which can reliably mimic human gaze behavior in more complex environments and tasks, such as driving a vehicle through traffic. Here, we develop a hybrid computational/behavioral framework, combining simple models for bottom-up salience and top-down relevance, and looking for changes in the predictive power of these components at different critical event times during 4.7 hours (500,000 video frames) of observers playing car racing and flight combat video games. This approach is motivated by our observation that the predictive strengths of the salience and relevance models exhibit reliable temporal signatures during critical event windows in the task sequence—for example, when the game player directly engages an enemy plane in a flight combat game, the predictive strength of the salience model increases significantly, while that of the relevance model decreases significantly. Our new framework combines these temporal signatures to implement several event detectors. Critically, we find that an event detector based on fused behavioral and stimulus information (in the form of the model’s predictive strength) is much stronger than detectors based on behavioral information alone (eye position) or image information alone (model prediction maps). This approach to event detection, based on eye tracking combined with computational models applied to the visual input, may have useful applications as a less-invasive alternative to other event detection approaches based on neural signatures derived from EEG or fMRI recordings. 1

3 0.5312292 127 nips-2007-Measuring Neural Synchrony by Message Passing

Author: Justin Dauwels, François Vialatte, Tomasz Rutkowski, Andrzej S. Cichocki

Abstract: A novel approach to measure the interdependence of two time series is proposed, referred to as “stochastic event synchrony” (SES); it quantifies the alignment of two point processes by means of the following parameters: time delay, variance of the timing jitter, fraction of “spurious” events, and average similarity of events. SES may be applied to generic one-dimensional and multi-dimensional point processes, however, the paper mainly focusses on point processes in time-frequency domain. The average event similarity is in that case described by two parameters: the average frequency offset between events in the time-frequency plane, and the variance of the frequency offset (“frequency jitter”); SES then consists of five parameters in total. Those parameters quantify the synchrony of oscillatory events, and hence, they provide an alternative to existing synchrony measures that quantify amplitude or phase synchrony. The pairwise alignment of point processes is cast as a statistical inference problem, which is solved by applying the max-product algorithm on a graphical model. The SES parameters are determined from the resulting pairwise alignment by maximum a posteriori (MAP) estimation. The proposed interdependence measure is applied to the problem of detecting anomalies in EEG synchrony of Mild Cognitive Impairment (MCI) patients; the results indicate that SES significantly improves the sensitivity of EEG in detecting MCI.

4 0.51519811 55 nips-2007-Computing Robust Counter-Strategies

Author: Michael Johanson, Martin Zinkevich, Michael Bowling

Abstract: Adaptation to other initially unknown agents often requires computing an effective counter-strategy. In the Bayesian paradigm, one must find a good counterstrategy to the inferred posterior of the other agents’ behavior. In the experts paradigm, one may want to choose experts that are good counter-strategies to the other agents’ expected behavior. In this paper we introduce a technique for computing robust counter-strategies for adaptation in multiagent scenarios under a variety of paradigms. The strategies can take advantage of a suspected tendency in the decisions of the other agents, while bounding the worst-case performance when the tendency is not observed. The technique involves solving a modified game, and therefore can make use of recently developed algorithms for solving very large extensive games. We demonstrate the effectiveness of the technique in two-player Texas Hold’em. We show that the computed poker strategies are substantially more robust than best response counter-strategies, while still exploiting a suspected tendency. We also compose the generated strategies in an experts algorithm showing a dramatic improvement in performance over using simple best responses.

5 0.46891338 165 nips-2007-Regret Minimization in Games with Incomplete Information

Author: Martin Zinkevich, Michael Johanson, Michael Bowling, Carmelo Piccione

Abstract: Extensive games are a powerful model of multiagent decision-making scenarios with incomplete information. Finding a Nash equilibrium for very large instances of these games has received a great deal of recent attention. In this paper, we describe a new technique for solving large games based on regret minimization. In particular, we introduce the notion of counterfactual regret, which exploits the degree of incomplete information in an extensive game. We show how minimizing counterfactual regret minimizes overall regret, and therefore in self-play can be used to compute a Nash equilibrium. We demonstrate this technique in the domain of poker, showing we can solve abstractions of limit Texas Hold’em with as many as 10^12 states, two orders of magnitude larger than previous methods.
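Counterfactual regret minimization builds on the regret-matching rule applied at each information set. The sketch below is only that building block, not CFR itself: plain regret matching in symmetric self-play on one-shot rock-paper-scissors (so no counterfactual values are needed), whose time-averaged strategy approaches the uniform Nash equilibrium. Names such as `self_play_average` are illustrative, not from the paper.

```python
# Regret matching (the per-information-set update that counterfactual
# regret minimization extends to extensive games), run with exact expected
# utilities so the whole computation is deterministic. A sketch only.

PAYOFF = [[0, -1, 1],   # row = my action (R, P, S), column = opponent's
          [1, 0, -1],
          [-1, 1, 0]]

def regret_matching(regrets):
    """Mix actions in proportion to positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    return [p / total for p in positive] if total > 0 else [1 / 3] * 3

def self_play_average(iterations=5000):
    """Time-averaged strategy of regret matching playing against itself."""
    regrets = [1.0, 0.0, 0.0]        # small asymmetric start; from zero nothing moves
    strategy_sum = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        strategy = regret_matching(regrets)
        for a in range(3):
            strategy_sum[a] += strategy[a]
        # expected utility of each pure action versus the current mixture
        util = [sum(PAYOFF[a][b] * strategy[b] for b in range(3)) for a in range(3)]
        ev = sum(strategy[a] * util[a] for a in range(3))
        for a in range(3):
            regrets[a] += util[a] - ev
    total = sum(strategy_sum)
    return [s / total for s in strategy_sum]
```

The current strategy cycles, but its running average approaches (1/3, 1/3, 1/3). CFR carries the same average-regret guarantee into extensive games via counterfactual values, which is what lets the abstract's much larger poker abstractions be solved.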

6 0.43891206 176 nips-2007-Sequential Hypothesis Testing under Stochastic Deadlines

7 0.41398332 191 nips-2007-Temporal Difference Updating without a Learning Rate

8 0.37201449 102 nips-2007-Incremental Natural Actor-Critic Algorithms

9 0.32249114 86 nips-2007-Exponential Family Predictive Representations of State

10 0.31217325 151 nips-2007-Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs

11 0.29078189 123 nips-2007-Loop Series and Bethe Variational Bounds in Attractive Graphical Models

12 0.27725571 197 nips-2007-The Infinite Markov Model

13 0.25837263 52 nips-2007-Competition Adds Complexity

14 0.25387523 214 nips-2007-Variational inference for Markov jump processes

15 0.2448047 19 nips-2007-Active Preference Learning with Discrete Choice Data

16 0.22839986 54 nips-2007-Computational Equivalence of Fixed Points and No Regret Algorithms, and Convergence to Equilibria

17 0.20571834 28 nips-2007-Augmented Functional Time Series Representation and Forecasting with Gaussian Processes

18 0.19960582 187 nips-2007-Structured Learning with Approximate Inference

19 0.19239254 5 nips-2007-A Game-Theoretic Approach to Apprenticeship Learning

20 0.191122 77 nips-2007-Efficient Inference for Distributions on Permutations


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(5, 0.04), (13, 0.054), (16, 0.026), (18, 0.019), (21, 0.042), (31, 0.022), (34, 0.039), (35, 0.028), (42, 0.421), (47, 0.069), (83, 0.048), (85, 0.012), (87, 0.028), (90, 0.045)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.74278849 208 nips-2007-TrueSkill Through Time: Revisiting the History of Chess

Author: Pierre Dangauthier, Ralf Herbrich, Tom Minka, Thore Graepel


2 0.43434229 36 nips-2007-Better than least squares: comparison of objective functions for estimating linear-nonlinear models

Author: Tatyana Sharpee

Abstract: This paper compares a family of methods for characterizing neural feature selectivity with natural stimuli in the framework of the linear-nonlinear model. In this model, the neural firing rate is a nonlinear function of a small number of relevant stimulus components. The relevant stimulus dimensions can be found by maximizing one of the family of objective functions, Rényi divergences of different orders [1, 2]. We show that maximizing one of them, Rényi divergence of order 2, is equivalent to least-square fitting of the linear-nonlinear model to neural data. Next, we derive reconstruction errors in relevant dimensions found by maximizing Rényi divergences of arbitrary order in the asymptotic limit of large spike numbers. We find that the smallest errors are obtained with Rényi divergence of order 1, also known as Kullback-Leibler divergence. This corresponds to finding relevant dimensions by maximizing mutual information [2]. We numerically test how these optimization schemes perform in the regime of low signal-to-noise ratio (small number of spikes and increasing neural noise) for model visual neurons. We find that optimization schemes based on either least square fitting or information maximization perform well even when the number of spikes is small. Information maximization provides slightly, but significantly, better reconstructions than least square fitting. This makes the problem of finding relevant dimensions, together with the problem of lossy compression [3], one of the examples where information-theoretic measures are no more data limited than those derived from least squares.
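The objective family compared in this abstract can be written down directly for discrete distributions. The sketch below is an illustration only, not the paper's estimator: it evaluates the Rényi divergence of order alpha on fixed probability vectors and checks that the alpha → 1 limit recovers the Kullback-Leibler divergence. The function names are assumptions of this sketch.

```python
import math

# Rényi divergence D_alpha(p || q) = log(sum_i p_i^alpha q_i^(1-alpha)) / (alpha - 1)
# for discrete distributions, assuming q > 0 wherever p > 0. The alpha -> 1
# limit is the Kullback-Leibler divergence (mutual-information objective);
# alpha = 2 corresponds to the least-squares objective in the abstract.

def renyi_divergence(p, q, alpha):
    """Rényi divergence of order alpha (alpha != 1) between discrete p, q."""
    s = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return math.log(s) / (alpha - 1.0)

def kl_divergence(p, q):
    """Kullback-Leibler divergence, the alpha -> 1 limit of D_alpha."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

For p = (0.5, 0.5) and q = (0.9, 0.1), the order-2 divergence exceeds the Kullback-Leibler divergence, a small numerical instance of the order-2/least-squares versus order-1/mutual-information comparison the abstract draws.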

3 0.41517386 128 nips-2007-Message Passing for Max-weight Independent Set

Author: Sujay Sanghavi, Devavrat Shah, Alan S. Willsky

Abstract: We investigate the use of message-passing algorithms for the problem of finding the max-weight independent set (MWIS) in a graph. First, we study the performance of loopy max-product belief propagation. We show that, if it converges, the quality of the estimate is closely related to the tightness of an LP relaxation of the MWIS problem. We use this relationship to obtain sufficient conditions for correctness of the estimate. We then develop a modification of max-product – one that converges to an optimal solution of the dual of the MWIS problem. We also develop a simple iterative algorithm for estimating the max-weight independent set from this dual solution. We show that the MWIS estimate obtained using these two algorithms in conjunction is correct when the graph is bipartite and the MWIS is unique. Finally, we show that any problem of MAP estimation for probability distributions over finite domains can be reduced to an MWIS problem. We believe this reduction will yield new insights and algorithms for MAP estimation.
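For reference, the problem these message-passing algorithms target can be stated in a few lines of brute force. The sketch below enumerates all vertex subsets and is therefore exponential in the number of vertices, which is exactly what the paper's loopy max-product approach avoids; function and variable names are illustrative, not from the paper.

```python
from itertools import combinations

# Brute-force max-weight independent set: a reference implementation for
# tiny graphs, defining the quantity the message-passing algorithms
# approximate at scale.

def mwis_brute_force(weights, edges):
    """weights: {vertex: weight}; edges: iterable of (u, v) pairs."""
    edge_set = {frozenset(e) for e in edges}
    vertices = list(weights)
    best_set, best_weight = frozenset(), 0.0
    for r in range(1, len(vertices) + 1):
        for subset in combinations(vertices, r):
            # independent set: no edge joins two chosen vertices
            if any(frozenset(pair) in edge_set for pair in combinations(subset, 2)):
                continue
            w = sum(weights[v] for v in subset)
            if w > best_weight:
                best_set, best_weight = frozenset(subset), w
    return best_set, best_weight
```

Even on a three-vertex path a–b–c the optimum flips between {b} and {a, c} depending on whether the middle weight exceeds the sum of its neighbours; resolving such local conflicts globally is the tension the LP relaxation discussed in the abstract captures.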

4 0.27115971 95 nips-2007-HM-BiTAM: Bilingual Topic Exploration, Word Alignment, and Translation

Author: Bing Zhao, Eric P. Xing

Abstract: We present a novel paradigm for statistical machine translation (SMT), based on a joint modeling of word alignment and the topical aspects underlying bilingual document-pairs, via a hidden Markov Bilingual Topic AdMixture (HM-BiTAM). In this paradigm, parallel sentence-pairs from a parallel document-pair are coupled via a certain semantic-flow, to ensure coherence of topical context in the alignment of mapping words between languages, likelihood-based training of topic-dependent translational lexicons, as well as in the inference of topic representations in each language. The learned HM-BiTAM can not only display topic patterns like methods such as LDA [1], but now for bilingual corpora; it also offers a principled way of inferring optimal translation using document context. Our method integrates the conventional model of HMM — a key component for most of the state-of-the-art SMT systems, with the recently proposed BiTAM model [10]; we report an extensive empirical analysis (in many ways complementary to the description-oriented [10]) of our method in three aspects: bilingual topic representation, word alignment, and translation.

5 0.27001438 86 nips-2007-Exponential Family Predictive Representations of State

Author: David Wingate, Satinder S. Baveja

Abstract: In order to represent state in controlled, partially observable, stochastic dynamical systems, some sort of sufficient statistic for history is necessary. Predictive representations of state (PSRs) capture state as statistics of the future. We introduce a new model of such systems called the “Exponential family PSR,” which defines as state the time-varying parameters of an exponential family distribution which models n sequential observations in the future. This choice of state representation explicitly connects PSRs to state-of-the-art probabilistic modeling, which allows us to take advantage of current efforts in high-dimensional density estimation, and in particular, graphical models and maximum entropy models. We present a parameter learning algorithm based on maximum likelihood, and we show how a variety of current approximate inference methods apply. We evaluate the quality of our model with reinforcement learning by directly evaluating the control performance of the model.

6 0.26744559 140 nips-2007-Neural characterization in partially observed populations of spiking neurons

7 0.2673102 63 nips-2007-Convex Relaxations of Latent Variable Training

8 0.26632124 100 nips-2007-Hippocampal Contributions to Control: The Third Way

9 0.26609594 168 nips-2007-Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods

10 0.26607338 153 nips-2007-People Tracking with the Laplacian Eigenmaps Latent Variable Model

11 0.2659415 93 nips-2007-GRIFT: A graphical model for inferring visual classification features from human data

12 0.26505178 138 nips-2007-Near-Maximum Entropy Models for Binary Neural Representations of Natural Images

13 0.26426542 18 nips-2007-A probabilistic model for generating realistic lip movements from speech

14 0.26406908 34 nips-2007-Bayesian Policy Learning with Trans-Dimensional MCMC

15 0.26406604 84 nips-2007-Expectation Maximization and Posterior Constraints

16 0.26312691 105 nips-2007-Infinite State Bayes-Nets for Structured Domains

17 0.26084974 122 nips-2007-Locality and low-dimensions in the prediction of natural experience from fMRI

18 0.26066273 164 nips-2007-Receptive Fields without Spike-Triggering

19 0.26045999 17 nips-2007-A neural network implementing optimal state estimation based on dynamic spike train decoding

20 0.25998208 73 nips-2007-Distributed Inference for Latent Dirichlet Allocation