nips nips2004 nips2004-122 knowledge-graph by maker-knowledge-mining

122 nips-2004-Modelling Uncertainty in the Game of Go


Source: pdf

Author: David H. Stern, Thore Graepel, David MacKay

Abstract: Go is an ancient oriental game whose complexity has defeated attempts to automate it. We suggest using probability in a Bayesian sense to model the uncertainty arising from the vast complexity of the game tree. We present a simple conditional Markov random field model for predicting the pointwise territory outcome of a game. The topology of the model reflects the spatial structure of the Go board. We describe a version of the Swendsen-Wang process for sampling from the model during learning and apply loopy belief propagation for rapid inference and prediction. The model is trained on several hundred records of professional games. Our experimental results indicate that the model successfully learns to predict territory despite its simplicity. 1

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Abstract Go is an ancient oriental game whose complexity has defeated attempts to automate it. [sent-12, score-0.18]

2 We suggest using probability in a Bayesian sense to model the uncertainty arising from the vast complexity of the game tree. [sent-13, score-0.239]

3 We present a simple conditional Markov random field model for predicting the pointwise territory outcome of a game. [sent-14, score-0.638]

4 We describe a version of the Swendsen-Wang process for sampling from the model during learning and apply loopy belief propagation for rapid inference and prediction. [sent-16, score-0.251]

5 Our experimental results indicate that the model successfully learns to predict territory despite its simplicity. [sent-18, score-0.578]

6 1 Introduction The game of Go originated in China over 4000 years ago. [sent-19, score-0.18]

7 Two players, Black and White, take turns to place stones on the intersections of an N × N grid (usually N = 19 but smaller boards are in use as well). [sent-23, score-0.25]

8 Players place their stones in order to create territory by occupying or surrounding areas of the board. [sent-25, score-0.828]

9 The player with the most territory at the end of the game is the winner. [sent-26, score-0.821]

10 A stone is captured if it has been completely surrounded (in the horizontal and vertical directions) by stones of the opponent’s colour. [sent-27, score-0.299]

11 Stones in a contiguous ‘chain’ have the common fate property: they are captured all together or not at all [1]. [sent-28, score-0.067]

12 The game that emerges from these simple rules has a complexity that defeats attempts to apply minimax search. [sent-29, score-0.18]

13 The best Go programs play only at the level of weak amateur Go players and Go is therefore considered to be a serious AI challenge not unlike Chess in the 1960s. [sent-30, score-0.104]

14 There are two main reasons for this state of affairs: firstly, the high branching factor of Go (typically 200 to 300 potential moves per position) prevents the expansion of a game tree to any useful depth. [sent-31, score-0.254]

15 Go players evaluate positions using visual pattern recognition and qualitative intuitions which are difficult to formalise. [sent-34, score-0.148]

16 Most Go programs rely on a large number of hand-tailored rules and expert knowledge [2]. [sent-35, score-0.067]

17 Schraudolph, Dayan and Sejnowski [3] trained a multi-layer perceptron to evaluate board positions by temporal difference learning. [sent-37, score-0.357]

18 Enzenberger [4] improved on this by structuring the topologies of his neural networks according to the relationships between stones on the board. [sent-38, score-0.25]

19 The authors of [1] made use of the common fate property of chains to construct an efficient graph-based representation of the board. [sent-40, score-0.106]

20 Our starting point is the uncertainty about the future course of the game that arises from the vast complexity of the game tree. [sent-42, score-0.419]

21 Taste lingers, and likewise the influence of a Go stone lingers (even if it appears weak or dead) because of the uncertainty of the effect it may have in the future. [sent-46, score-0.124]

22 We use a probabilistic model that takes the current board position and predicts, for every intersection of the board, whether it will be Black or White territory. [sent-47, score-0.63]

23 Given such a model the score of the game can be predicted and hence an evaluation function produced. [sent-48, score-0.223]

24 2 Models for Predicting Territory Consider the Go board as an undirected graph G = (N , E) with |N| = Nx × Ny nodes n ∈ N representing vertices on the board and edges e ∈ E connecting vertically and horizontally neighbouring points. [sent-50, score-0.702]

25 We can denote a position as the vector c ∈ {Black, White, Empty}N for cn = c(n) and similarly the final territory outcome of the game as s ∈ {+1, −1}N for sn = s(n). [sent-51, score-0.904]
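
As a concrete illustration (not from the paper), the grid graph and the two vectors c and s can be set up as follows; the +1/−1/0 colour encoding and the helper names are our own assumptions.

```python
import numpy as np

N_X, N_Y = 19, 19  # board dimensions Nx, Ny

# Position c: one entry per vertex; here +1 = Black, -1 = White, 0 = Empty.
c = np.zeros((N_X, N_Y), dtype=int)

# Final territory outcome s: +1 = Black territory, -1 = White territory.
s = np.ones((N_X, N_Y), dtype=int)

def grid_edges(nx, ny):
    """Edge set E of the undirected grid graph G = (N, E): each vertex is
    connected to its horizontal and vertical neighbours."""
    edges = []
    for x in range(nx):
        for y in range(ny):
            if x + 1 < nx:
                edges.append(((x, y), (x + 1, y)))
            if y + 1 < ny:
                edges.append(((x, y), (x, y + 1)))
    return edges

E = grid_edges(N_X, N_Y)  # 2 * 19 * 18 = 684 edges on a 19x19 board
```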

26 For convenience we score from the point of view of Black, so elements of s representing Black territory are valued +1 and elements representing White territory are valued −1. [sent-52, score-1.291]

27 Go players will note that we are adopting the Chinese method of scoring empty as well as occupied intersections. [sent-53, score-0.183]

28 The distribution we wish to model is P (s|c), that is, the distribution over final territory outcomes given the current position. [sent-54, score-0.648]

29 • Most importantly, the detailed outcomes provide us with a simple evaluation function for Go positions via the expected score, u(c) := Σs (Σi si ) P (s|c). [sent-56, score-0.284]

30 An alternative (and probably better) evaluation function is given by the probability of winning, which takes the form P (Black wins) = P (Σi si > komi), where komi refers to the winning threshold for Black. [sent-57, score-0.232]
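
A minimal sketch of both quantities, assuming whole-board outcome samples s drawn from P(s|c) (for instance by the Swendsen-Wang sampler of Section 3); the default komi value is purely illustrative.

```python
import numpy as np

def expected_score(samples):
    """u(c): mean over sampled outcomes s of the total score sum_i s_i,
    with s_i = +1 for Black territory and -1 for White."""
    return float(np.mean([s.sum() for s in samples]))

def prob_black_wins(samples, komi=6.5):
    """P(Black wins) = P(sum_i s_i > komi), estimated as the fraction of
    sampled outcomes whose total score exceeds the komi threshold."""
    return float(np.mean([s.sum() > komi for s in samples]))
```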

31 • Connectivity of stones is vital because stones can draw strength from other stones. [sent-58, score-0.5]

32 Connectivity could be measured by the correlation between nodes under the distribution P (s|c). [sent-59, score-0.062]

33 This would allow us to segment the board into ‘groups’ of stones to reduce complexity. [sent-60, score-0.54]
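
A sketch of estimating this connectivity from whole-board samples: the empirical correlation between the territory outcomes at two vertices under P(s|c). The sample-based estimator is our own illustration, not the authors' procedure.

```python
import numpy as np

def territory_correlation(samples, i, j):
    """Empirical correlation between s_i and s_j under P(s|c), estimated from
    whole-board outcome samples; values near +1 suggest the two vertices share
    their fate (e.g. stones in one group), values near -1 an anti-correlation."""
    si = np.array([s[i] for s in samples], dtype=float)
    sj = np.array([s[j] for s in samples], dtype=float)
    return float(np.corrcoef(si, sj)[0, 1])
```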

34 • It would also be useful to observe cases where we have an anti-correlation between nodes in the territory prediction. [sent-61, score-0.64]

35 • The fate of a group of Go stones could be estimated from the distribution P (s|c) by marginalising out the nodes not involved. [sent-63, score-0.379]

36 The way stones exert long range influence can be considered recursive. [sent-64, score-0.25]

37 A stone influences its neighbours, who influence their neighbours and so on. [sent-65, score-0.089]

38 A simple model which exploits this idea is to consider the Go board itself as an undirected graphical model in the form of a Conditional Random Field (CRF) [5]. [sent-66, score-0.29]

39 We factorize the distribution as P (s|c) = (1/Z(c, θ)) ∏f∈F ψf (sf , cf , θf ) = (1/Z(c, θ)) exp( Σf∈F log ψf (sf , cf , θf ) ). (1) [sent-67, score-0.122]

40 The simplest form of this model has one factor for each pair of neighbouring nodes i, j, so ψf (sf , cf , θf ) = ψf (si , sj , ci , cj , θf ). [sent-68, score-0.548]

41 Boltzmann5 For our first model we decompose the factors into ‘coupling’ terms and ‘external field’ terms as follows: P (s|c) = (1/Z(c, θ)) exp( Σ(i,j)∈F { w(ci , cj ) si sj + h(ci ) si + h(cj ) sj } ). (2) This gives a Boltzmann machine whose connections have the grid topology of the board. [sent-69, score-0.312]

42 The couplings between territory-outcome nodes depend on the current board position local to those nodes and the external field at each node is determined by the state of the board at that location. [sent-70, score-0.837]
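
A sketch of the unnormalised log-probability under a Boltzmann5-style model (equation (2)); the lookup tables `w` and `h` stand in for the learnt coupling and external-field parameters and are placeholders, not the authors' values.

```python
def boltzmann5_log_prob_unnorm(s, c, edges, w, h):
    """Sum over factors (i, j) of w(c_i, c_j) s_i s_j + h(c_i) s_i + h(c_j) s_j,
    i.e. log P(s|c) up to the log partition function log Z(c, theta).
    `s` and `c` are arrays indexed by vertex; `w` and `h` are dictionaries
    keyed by vertex colour(s)."""
    total = 0.0
    for i, j in edges:
        total += (w[(c[i], c[j])] * s[i] * s[j]
                  + h[c[i]] * s[i]
                  + h[c[j]] * s[j])
    return total
```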

43 We assume that Go positions with their associated territory positions are symmetric with respect to colour reversal so ψf (si , sj , ci , cj , θ f ) = ψf (−si , −sj , −ci , −cj , θ f ). [sent-71, score-1.131]

44 Pairwise connections are also invariant to direction reversal so ψf (si , sj , ci , cj , θ f ) = ψf (sj , si , cj , ci , θ f ). [sent-72, score-0.691]

45 For example we would expect wchains to take on a large positive value since chains have common fate. [sent-76, score-0.116]

46 BoltzmannLiberties A feature that has particular utility for evaluating Go positions is the number of liberties associated with a chain of stones. [sent-77, score-0.191]

47 A liberty of a chain is an empty vertex adjacent to it. [sent-78, score-0.295]

48 The number of liberties indicates a chain’s safety because the opponent would have to occupy all the liberties to capture the chain. [sent-79, score-0.183]
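
A sketch (ours, not the paper's) of computing the liberty count of the chain containing a given stone by flood fill, with counts capped at 4 to match the "4 or more" bucket used below.

```python
import numpy as np

def chain_liberties(c, start, cap=4):
    """Liberties of the chain of same-coloured stones containing `start`.
    `c` is an Nx x Ny array with +1 = Black, -1 = White, 0 = Empty."""
    colour = c[start]
    assert colour != 0, "start must be an occupied vertex"
    nx, ny = c.shape
    stack, chain, liberties = [start], {start}, set()
    while stack:
        x, y = stack.pop()
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            p = (x + dx, y + dy)
            if 0 <= p[0] < nx and 0 <= p[1] < ny:
                if c[p] == 0:
                    liberties.add(p)      # adjacent empty vertex
                elif c[p] == colour and p not in chain:
                    chain.add(p)
                    stack.append(p)
    return min(len(liberties), cap)       # "4 or more" is a single bucket
```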

49 Our second model takes this information into account: P (s|c) = (1/Z(c, θ)) exp( Σ(i,j)∈F w(ci , cj , si , sj , li , lj ) ), (3) where li is element i of a vector l ∈ {+1, +2, +3, 4 or more}N giving the liberty count of each vertex on the Go board. [sent-80, score-0.562]

50 A group with four or more liberties is considered relatively safe. [sent-81, score-0.077]

51 We trained the two models using board positions from a database of 22,000 games between expert Go players (footnote 1: the GoGoD database, April 2003). [sent-84, score-0.441]

52 The territory outcomes of a subset of these games … [sent-85, score-0.688]

53 Shown are the differences between the running averages and the exact marginals for each of the 361 nodes, plotted as a function of the number of whole-board samples. [sent-91, score-0.108]

54 Each training example comprised a board position c, with its associated territory outcome s. [sent-93, score-0.954]

55 3 Inference Methods It is possible to perform exact inference on the model by variable elimination [6]. [sent-96, score-0.084]

56 Eliminating nodes one diagonal at a time gave an efficient computation. [sent-97, score-0.062]

57 The cost of exact inference was still too high for general use but it was used to compare other inference methods. [sent-98, score-0.097]

58 Sampling The standard method for sampling from a Boltzmann machine is to use Gibbs sampling where each node is updated one at a time, conditional on the others. [sent-99, score-0.108]
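
A sketch of one such Gibbs sweep for the Boltzmann5 energy of equation (2); the neighbour map and the parameter tables `w` and `h` are assumed containers, and the update collects exactly the terms of (2) that involve node i.

```python
import math
import random

def gibbs_sweep(s, c, neighbours, w, h):
    """Resample every territory node s_i in {+1, -1} conditional on the rest.
    For equation (2) the conditional is a logistic function of the local field
    sum_{j in A(i)} [ w(c_i, c_j) s_j + h(c_i) ]."""
    for i in neighbours:
        field = sum(w[(c[i], c[j])] * s[j] + h[c[i]] for j in neighbours[i])
        p_plus = 1.0 / (1.0 + math.exp(-2.0 * field))  # P(s_i = +1 | rest, c)
        s[i] = +1 if random.random() < p_plus else -1
    return s
```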

59 The original Swendsen-Wang algorithm samples from a ferromagnetic Ising model with no external field by adding an additional set of ‘bond’ nodes d, one attached to each factor (edge) in the original graph. [sent-102, score-0.088]

60 Each of these nodes can either be in the state ‘bond’ or ‘no bond’. [sent-103, score-0.09]

61 The new factor potentials ψf (sf , cf , df , θ f ) are chosen such that if a bond exists between a pair of spins then they are forced to be in the same state. [sent-104, score-0.245]

62 Conditional on the bonds, each cluster has an equal probability of having all its spins in the ‘up’ state or all in the ‘down’ state. [sent-105, score-0.095]

63 The new factor potentials ψf (sf , cf , df , θ f ) have the following effect: if the coupling is positive then when the d node is in the ‘bond’ state it forces the two spins to be in the same state; if the coupling is negative the ‘bond’ state forces the two spins to be opposite. [sent-108, score-0.382]
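
A much-simplified sketch of one Swendsen-Wang sweep for a pairwise model with couplings w_ij and no external field (the paper's variant additionally has to handle the external-field terms); negative couplings are handled by "anti-bonds" that force opposite spins, as described above. The edge and vertex containers are our own choices.

```python
import math
import random

def swendsen_wang_sweep(s, edges, w):
    """One sweep: place a bond on each 'satisfied' edge (equal spins for
    w_ij > 0, opposite spins for w_ij < 0) with probability 1 - exp(-2|w_ij|),
    then flip every resulting cluster with probability 1/2."""
    parent = {i: i for i in s}            # union-find over bonded vertices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (i, j) in edges:
        w_ij = w[(i, j)]
        satisfied = (s[i] == s[j]) if w_ij > 0 else (s[i] != s[j])
        if satisfied and random.random() < 1.0 - math.exp(-2.0 * abs(w_ij)):
            parent[find(i)] = find(j)

    flip = {}
    for i in s:                           # flip whole clusters at random; this
        root = find(i)                    # preserves the relative orientation
        if root not in flip:              # inside each cluster, so all bond
            flip[root] = random.random() < 0.5   # constraints stay satisfied
        if flip[root]:
            s[i] = -s[i]
    return s
```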

64 Figure 1 shows that the mixing rate of the sampling process is improved by using Swendsen-Wang, allowing us to find accurate marginals for a single position in a couple of seconds. [sent-110, score-0.115]

65 Loopy Belief Propagation In order to perform very rapid (approximate) inference we used the loopy belief propagation (BP) algorithm [9] and the results are examined in Section 4. [sent-114, score-0.209]

66 This algorithm is similar to an influence function [10], as often used by Go programmers to segment the board into Black and White territory and for this reason is laid out below. [sent-115, score-0.868]

67 For each board vertex j ∈ N , create a data structure called a node containing: 1. [sent-116, score-0.359]

68 A(j), the set of nodes corresponding to the neighbours of vertex j, 2. [sent-117, score-0.171]

69 a set of new messages mnew_ij (sj ) ∈ Mnew , one for each i ∈ A(j), 3. [sent-118, score-0.185]

70 a set of old messages mold_ij (sj ) ∈ Mold , one for each i ∈ A(j), 4. [sent-119, score-0.108]

71 ψ(i,j) (si , sj ) is the factor potential (see (1)) and in the case of Boltzmann5 takes on the form ψ(i,j) (si , sj ) = exp (w(ci , cj )si sj + h(ci )si + h(cj )sj ). [sent-123, score-0.752]

72 Now the probability of each vertex being Black or White territory is found by normalising the beliefs at each node. [sent-124, score-0.647]

73 For example P (sj = Black) = bj (Black)/Z where Z = bj (Black) + bj (White). [sent-125, score-0.183]
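
A compact sketch of the message-passing loop described above, in our own notation; `psi` is assumed to give the pairwise potential for both orderings of each edge, the neighbour relation is assumed symmetric, and damping and scheduling details are omitted.

```python
import numpy as np

def loopy_bp(psi, neighbours, n_iters=50):
    """Loopy BP for binary territory variables s_j in {+1, -1}.
    psi[(i, j)][(si, sj)] is the pairwise factor potential (see equation (1));
    neighbours[j] is the set A(j).  Returns normalised beliefs b_j."""
    states = (+1, -1)
    m = {(i, j): {sj: 1.0 for sj in states}
         for j in neighbours for i in neighbours[j]}
    for _ in range(n_iters):
        new_m = {}
        for (i, j) in m:
            msg = {}
            for sj in states:
                msg[sj] = sum(
                    psi[(i, j)][(si, sj)]
                    * np.prod([m[(k, i)][si] for k in neighbours[i] if k != j])
                    for si in states)
            z = sum(msg.values())
            new_m[(i, j)] = {sj: msg[sj] / z for sj in states}
        m = new_m
    beliefs = {}
    for j in neighbours:
        b = {sj: np.prod([m[(i, j)][sj] for i in neighbours[j]]) for sj in states}
        z = b[+1] + b[-1]
        beliefs[j] = {sj: b[sj] / z for sj in states}
    return beliefs   # beliefs[j][+1] approximates P(s_j = Black territory)
```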

74 The accuracy of the loopy BP approximation appears to be improved by using it during the parameter learning stage in cases where it is to be used in evaluation. [sent-126, score-0.106]

75 This model was trained on 290 positions from expert Go games at move 80. [sent-128, score-0.227]

76 Figure 2: Comparing territory predictions for a Go position from a professional game at move 90; (a) Boltzmann5 (exact inference), (b) Boltzmann5 (loopy BP). [sent-130, score-0.996]

77 The small black and white squares at each vertex represent the average territory prediction at that vertex, from −1 (maximum white square) to +1 (maximum black square). [sent-132, score-1.211]

78 For example wchains corresponds to the correlation between the likely territory outcome of two adjacent vertices in a chain of connected stones. [sent-139, score-0.761]

79 Interestingly, the value of the parameter wempty (corresponding to the coupling between territory predictions of neighbouring vertices in empty space) is 0. [sent-141, score-0.902]

80 Territory Predictions Figure 2 gives examples of territory predictions generated by Boltzmann5. [sent-144, score-0.632]

81 In comparison, Figure 3 shows the prediction of BoltzmannLiberties and a territory prediction from The Many Faces of Go [2]. [sent-145, score-0.578]

82 Go players confirm that the territory predictions produced by the models are reasonable, even around loose groups of Black and White stones. [sent-146, score-0.713]

83 Compare Figures 2 (a) and 3 (a); when liberty counts are included as features, the model can more confidently identify which of the two small chains competing in the bottom right of the board is dead. [sent-147, score-0.406]

84 Comparing Figures 2 (a) and (b), loopy BP appears to give over-confident predictions in the top right of the board, where few stones are present. [sent-148, score-0.594]

85 However, it is a good approximation where many stones are present (bottom left). [sent-149, score-0.25]

86 Comparing Models and Inference Methods Figure 4 shows cross-entropies between model territory predictions and true final territory outcomes for a dataset of expert games. [sent-150, score-1.324]

87 BoltzmannLiberties performs better than Boltzmann5 (when using Swendsen-Wang), with the difference in performance increasing later in the game. Figure 3: (a) BoltzmannLiberties (exact), (b) Many Faces of Go; diagram (a) is produced by exact inference (training was also by Loopy BP). [sent-153, score-0.06]

88 Diagram (b) shows the territory predicted by The Many Faces of Go (MFG) [2]. [sent-154, score-0.578]

89 MFG uses a rule-based expert system and its prediction for each vertex has three possible values: ‘White’, ‘Black’ or ‘unknown/neutral’. [sent-155, score-0.113]

90 …performance increasing later in the game when liberty counts become more useful. [sent-156, score-0.257]

91 A measure of performance of such a model is the likelihood it assigns to professional moves, as measured by Σgames Σmoves log P (move|model). (4) [sent-158, score-0.104]

92 We can obtain a probability over moves by choosing a Gibbs distribution with the negative energy replaced by the evaluation function, P (move|model, w) = exp(β u(c′, w)) / Z(w), (5) where u(c′, w) is the evaluation function evaluated at the board position c′ resulting from a given move. [sent-159, score-0.558]

93 The inverse temperature parameter β determines the degree to which the move made depends on its evaluation. [sent-160, score-0.076]
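
A sketch of turning the evaluation function into the move distribution of equation (5); `evaluate` and `legal_moves` are hypothetical helpers standing in for the territory-based evaluator and a rules engine.

```python
import numpy as np

def move_probabilities(position, legal_moves, evaluate, beta=1.0):
    """P(move | model, w) proportional to exp(beta * u(c')), where c' is the
    position reached by playing the move and beta is the inverse temperature."""
    u = np.array([evaluate(position, move) for move in legal_moves])
    logits = beta * u
    logits -= logits.max()              # subtract the max for numerical stability
    p = np.exp(logits)
    return p / p.sum()                  # normalised by Z(w)
```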

94 The territory predictions from the models Boltzmann5 and BoltzmannLiberties can be combined with the evaluation function of Section 2 to produce position evaluators. [sent-161, score-0.725]

95 6 Conclusions We have presented a probabilistic framework for modelling uncertainty in the game of Go. [sent-162, score-0.253]

96 A simple model which incorporates the spatial structure of a board position can perform well at predicting the territory outcomes of Go games. [sent-163, score-0.988]

97 The models described here could be improved by extracting more features from board positions and increasing the size of the factors (see (1)). [sent-164, score-0.357]

98 Figure 4: Cross entropies (1/N) Σn [sn log ŝn + (1 − sn ) log(1 − ŝn )] between actual and predicted territory outcomes, sn and ŝn , for 327 Go positions (axis labels from the figure: Swendsen-Wang; B5 and BLib at moves 20, 80 and 150). [sent-173, score-0.818]
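
A sketch of the per-position score plotted in Figure 4, assuming the actual outcomes are recoded as s_n ∈ {0, 1} and the predictions ŝ_n are probabilities of Black territory; the clipping constant is our own guard against log(0).

```python
import numpy as np

def territory_cross_entropy(s_true, s_pred, eps=1e-12):
    """(1/N) * sum_n [ s_n log s^_n + (1 - s_n) log(1 - s^_n) ] over the N
    vertices of one position, with s_n the actual outcome (1 = Black,
    0 = White) and s^_n the predicted probability of Black territory."""
    s_true = np.asarray(s_true, dtype=float)
    s_pred = np.clip(np.asarray(s_pred, dtype=float), eps, 1.0 - eps)
    return float(np.mean(s_true * np.log(s_pred)
                         + (1.0 - s_true) * np.log(1.0 - s_pred)))
```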

99 Three board positions were analysed for each game (moves 20, 80 and 150). [sent-175, score-0.537]

100 Temporal difference learning of position evaluation in the game of go. [sent-195, score-0.273]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('territory', 0.578), ('board', 0.29), ('stones', 0.25), ('go', 0.24), ('sj', 0.22), ('game', 0.18), ('mnew', 0.154), ('black', 0.147), ('white', 0.135), ('blib', 0.135), ('boltzmannliberties', 0.116), ('loopy', 0.106), ('si', 0.104), ('empty', 0.102), ('cj', 0.092), ('bond', 0.086), ('bp', 0.082), ('players', 0.081), ('liberties', 0.077), ('liberty', 0.077), ('mold', 0.077), ('swendsen', 0.077), ('wchains', 0.077), ('sf', 0.076), ('ci', 0.076), ('move', 0.076), ('outcomes', 0.07), ('vertex', 0.069), ('positions', 0.067), ('fate', 0.067), ('spins', 0.067), ('nodes', 0.062), ('cf', 0.061), ('bj', 0.061), ('sn', 0.06), ('professional', 0.058), ('wempty', 0.058), ('sy', 0.057), ('url', 0.057), ('predictions', 0.054), ('xy', 0.051), ('coupling', 0.05), ('position', 0.05), ('stone', 0.049), ('chain', 0.047), ('moves', 0.046), ('gibbs', 0.045), ('expert', 0.044), ('evaluation', 0.043), ('graepel', 0.043), ('sampling', 0.042), ('neighbours', 0.04), ('games', 0.04), ('chains', 0.039), ('hstones', 0.039), ('komi', 0.039), ('lingers', 0.039), ('mfg', 0.039), ('taste', 0.039), ('wchain', 0.039), ('belief', 0.039), ('di', 0.037), ('inference', 0.037), ('modelling', 0.037), ('neighbouring', 0.037), ('end', 0.037), ('outcome', 0.036), ('uncertainty', 0.036), ('winter', 0.034), ('faces', 0.032), ('erence', 0.031), ('messages', 0.031), ('thore', 0.031), ('generalisation', 0.031), ('reversal', 0.031), ('df', 0.031), ('ising', 0.031), ('opponent', 0.029), ('couplings', 0.029), ('bonds', 0.029), ('eld', 0.028), ('state', 0.028), ('wang', 0.027), ('propagation', 0.027), ('mackay', 0.027), ('learnt', 0.027), ('external', 0.026), ('japanese', 0.026), ('player', 0.026), ('uence', 0.025), ('elimination', 0.024), ('boltzmann', 0.024), ('conditional', 0.024), ('winning', 0.023), ('exact', 0.023), ('vertices', 0.023), ('programs', 0.023), ('vast', 0.023), ('marginals', 0.023), ('playing', 0.023)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000002 122 nips-2004-Modelling Uncertainty in the Game of Go

Author: David H. Stern, Thore Graepel, David MacKay

Abstract: Go is an ancient oriental game whose complexity has defeated attempts to automate it. We suggest using probability in a Bayesian sense to model the uncertainty arising from the vast complexity of the game tree. We present a simple conditional Markov random field model for predicting the pointwise territory outcome of a game. The topology of the model reflects the spatial structure of the Go board. We describe a version of the Swendsen-Wang process for sampling from the model during learning and apply loopy belief propagation for rapid inference and prediction. The model is trained on several hundred records of professional games. Our experimental results indicate that the model successfully learns to predict territory despite its simplicity. 1

2 0.094357572 203 nips-2004-Validity Estimates for Loopy Belief Propagation on Binary Real-world Networks

Author: Joris M. Mooij, Hilbert J. Kappen

Abstract: We introduce a computationally efficient method to estimate the validity of the BP method as a function of graph topology, the connectivity strength, frustration and network size. We present numerical results that demonstrate the correctness of our estimates for the uniform random model and for a real-world network (“C. Elegans”). Although the method is restricted to pair-wise interactions, no local evidence (zero “biases”) and binary variables, we believe that its predictions correctly capture the limitations of BP for inference and MAP estimation on arbitrary graphical models. Using this approach, we find that BP always performs better than MF. Especially for large networks with broad degree distributions (such as scale-free networks) BP turns out to significantly outperform MF. 1

3 0.091002651 116 nips-2004-Message Errors in Belief Propagation

Author: Alexander T. Ihler, John W. Fisher, Alan S. Willsky

Abstract: Belief propagation (BP) is an increasingly popular method of performing approximate inference on arbitrary graphical models. At times, even further approximations are required, whether from quantization or other simplified message representations or from stochastic approximation methods. Introducing such errors into the BP message computations has the potential to adversely affect the solution obtained. We analyze this effect with respect to a particular measure of message error, and show bounds on the accumulation of errors in the system. This leads both to convergence conditions and error bounds in traditional and approximate BP message passing. 1

4 0.080582827 129 nips-2004-New Criteria and a New Algorithm for Learning in Multi-Agent Systems

Author: Rob Powers, Yoav Shoham

Abstract: We propose a new set of criteria for learning algorithms in multi-agent systems, one that is more stringent and (we argue) better justified than previous proposed criteria. Our criteria, which apply most straightforwardly in repeated games with average rewards, consist of three requirements: (a) against a specified class of opponents (this class is a parameter of the criterion) the algorithm yield a payoff that approaches the payoff of the best response, (b) against other opponents the algorithm’s payoff at least approach (and possibly exceed) the security level payoff (or maximin value), and (c) subject to these requirements, the algorithm achieve a close to optimal payoff in self-play. We furthermore require that these average payoffs be achieved quickly. We then present a novel algorithm, and show that it meets these new criteria for a particular parameter class, the class of stationary opponents. Finally, we show that the algorithm is effective not only in theory, but also empirically. Using a recently introduced comprehensive game theoretic test suite, we show that the algorithm almost universally outperforms previous learning algorithms. 1

5 0.074410729 10 nips-2004-A Probabilistic Model for Online Document Clustering with Application to Novelty Detection

Author: Jian Zhang, Zoubin Ghahramani, Yiming Yang

Abstract: In this paper we propose a probabilistic model for online document clustering. We use non-parametric Dirichlet process prior to model the growing number of clusters, and use a prior of general English language model as the base distribution to handle the generation of novel clusters. Furthermore, cluster uncertainty is modeled with a Bayesian Dirichletmultinomial distribution. We use empirical Bayes method to estimate hyperparameters based on a historical dataset. Our probabilistic model is applied to the novelty detection task in Topic Detection and Tracking (TDT) and compared with existing approaches in the literature. 1

6 0.066397235 161 nips-2004-Self-Tuning Spectral Clustering

7 0.058267035 28 nips-2004-Bayesian inference in spiking neurons

8 0.057178788 63 nips-2004-Expectation Consistent Free Energies for Approximate Inference

9 0.051142946 171 nips-2004-Solitaire: Man Versus Machine

10 0.050988428 130 nips-2004-Newscast EM

11 0.04785677 48 nips-2004-Convergence and No-Regret in Multiagent Learning

12 0.047392391 162 nips-2004-Semi-Markov Conditional Random Fields for Information Extraction

13 0.044769589 198 nips-2004-Unsupervised Variational Bayesian Learning of Nonlinear Models

14 0.044277333 87 nips-2004-Integrating Topics and Syntax

15 0.042631213 37 nips-2004-Co-Training and Expansion: Towards Bridging Theory and Practice

16 0.042560413 9 nips-2004-A Method for Inferring Label Sampling Mechanisms in Semi-Supervised Learning

17 0.041690513 50 nips-2004-Dependent Gaussian Processes

18 0.041330393 61 nips-2004-Efficient Out-of-Sample Extension of Dominant-Set Clusters

19 0.03980732 183 nips-2004-Temporal-Difference Networks

20 0.039474826 123 nips-2004-Multi-agent Cooperation in Diverse Population Games


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.125), (1, -0.018), (2, 0.059), (3, -0.045), (4, 0.009), (5, 0.04), (6, -0.029), (7, 0.104), (8, -0.089), (9, -0.173), (10, 0.056), (11, -0.044), (12, -0.005), (13, 0.03), (14, 0.053), (15, -0.023), (16, 0.022), (17, 0.071), (18, 0.101), (19, 0.056), (20, -0.016), (21, 0.013), (22, 0.019), (23, -0.025), (24, 0.044), (25, -0.012), (26, -0.036), (27, -0.021), (28, -0.087), (29, -0.008), (30, -0.004), (31, 0.019), (32, 0.044), (33, 0.056), (34, 0.029), (35, -0.055), (36, -0.137), (37, 0.129), (38, -0.07), (39, -0.066), (40, -0.028), (41, -0.052), (42, -0.1), (43, 0.016), (44, -0.004), (45, -0.022), (46, -0.033), (47, 0.096), (48, 0.031), (49, 0.038)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95489067 122 nips-2004-Modelling Uncertainty in the Game of Go

Author: David H. Stern, Thore Graepel, David MacKay

Abstract: Go is an ancient oriental game whose complexity has defeated attempts to automate it. We suggest using probability in a Bayesian sense to model the uncertainty arising from the vast complexity of the game tree. We present a simple conditional Markov random field model for predicting the pointwise territory outcome of a game. The topology of the model reflects the spatial structure of the Go board. We describe a version of the Swendsen-Wang process for sampling from the model during learning and apply loopy belief propagation for rapid inference and prediction. The model is trained on several hundred records of professional games. Our experimental results indicate that the model successfully learns to predict territory despite its simplicity. 1

2 0.57635659 203 nips-2004-Validity Estimates for Loopy Belief Propagation on Binary Real-world Networks

Author: Joris M. Mooij, Hilbert J. Kappen

Abstract: We introduce a computationally efficient method to estimate the validity of the BP method as a function of graph topology, the connectivity strength, frustration and network size. We present numerical results that demonstrate the correctness of our estimates for the uniform random model and for a real-world network (“C. Elegans”). Although the method is restricted to pair-wise interactions, no local evidence (zero “biases”) and binary variables, we believe that its predictions correctly capture the limitations of BP for inference and MAP estimation on arbitrary graphical models. Using this approach, we find that BP always performs better than MF. Especially for large networks with broad degree distributions (such as scale-free networks) BP turns out to significantly outperform MF. 1

3 0.57224053 63 nips-2004-Expectation Consistent Free Energies for Approximate Inference

Author: Manfred Opper, Ole Winther

Abstract: We propose a novel a framework for deriving approximations for intractable probabilistic models. This framework is based on a free energy (negative log marginal likelihood) and can be seen as a generalization of adaptive TAP [1, 2, 3] and expectation propagation (EP) [4, 5]. The free energy is constructed from two approximating distributions which encode different aspects of the intractable model such a single node constraints and couplings and are by construction consistent on a chosen set of moments. We test the framework on a difficult benchmark problem with binary variables on fully connected graphs and 2D grid graphs. We find good performance using sets of moments which either specify factorized nodes or a spanning tree on the nodes (structured approximation). Surprisingly, the Bethe approximation gives very inferior results even on grids. 1

4 0.51872575 130 nips-2004-Newscast EM

Author: Wojtek Kowalczyk, Nikos A. Vlassis

Abstract: We propose a gossip-based distributed algorithm for Gaussian mixture learning, Newscast EM. The algorithm operates on network topologies where each node observes a local quantity and can communicate with other nodes in an arbitrary point-to-point fashion. The main difference between Newscast EM and the standard EM algorithm is that the M-step in our case is implemented in a decentralized manner: (random) pairs of nodes repeatedly exchange their local parameter estimates and combine them by (weighted) averaging. We provide theoretical evidence and demonstrate experimentally that, under this protocol, nodes converge exponentially fast to the correct estimates in each M-step of the EM algorithm. 1

5 0.47980943 10 nips-2004-A Probabilistic Model for Online Document Clustering with Application to Novelty Detection

Author: Jian Zhang, Zoubin Ghahramani, Yiming Yang

Abstract: In this paper we propose a probabilistic model for online document clustering. We use non-parametric Dirichlet process prior to model the growing number of clusters, and use a prior of general English language model as the base distribution to handle the generation of novel clusters. Furthermore, cluster uncertainty is modeled with a Bayesian Dirichletmultinomial distribution. We use empirical Bayes method to estimate hyperparameters based on a historical dataset. Our probabilistic model is applied to the novelty detection task in Topic Detection and Tracking (TDT) and compared with existing approaches in the literature. 1

6 0.47791111 171 nips-2004-Solitaire: Man Versus Machine

7 0.46536797 129 nips-2004-New Criteria and a New Algorithm for Learning in Multi-Agent Systems

8 0.45494026 116 nips-2004-Message Errors in Belief Propagation

9 0.4400433 57 nips-2004-Economic Properties of Social Networks

10 0.42322037 43 nips-2004-Conditional Models of Identity Uncertainty with Application to Noun Coreference

11 0.40723151 86 nips-2004-Instance-Specific Bayesian Model Averaging for Classification

12 0.39284706 29 nips-2004-Beat Tracking the Graphical Model Way

13 0.38296542 41 nips-2004-Comparing Beliefs, Surveys, and Random Walks

14 0.37831196 123 nips-2004-Multi-agent Cooperation in Diverse Population Games

15 0.37549067 162 nips-2004-Semi-Markov Conditional Random Fields for Information Extraction

16 0.36407331 108 nips-2004-Markov Networks for Detecting Overalpping Elements in Sequence Data

17 0.35528815 95 nips-2004-Large-Scale Prediction of Disulphide Bond Connectivity

18 0.33838007 55 nips-2004-Distributed Occlusion Reasoning for Tracking with Nonparametric Belief Propagation

19 0.32424748 199 nips-2004-Using Machine Learning to Break Visual Human Interaction Proofs (HIPs)

20 0.31862116 146 nips-2004-Pictorial Structures for Molecular Modeling: Interpreting Density Maps


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(13, 0.073), (15, 0.123), (17, 0.018), (26, 0.046), (31, 0.018), (32, 0.013), (33, 0.136), (35, 0.024), (39, 0.014), (50, 0.018), (82, 0.409), (89, 0.015)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.78179842 122 nips-2004-Modelling Uncertainty in the Game of Go

Author: David H. Stern, Thore Graepel, David MacKay

Abstract: Go is an ancient oriental game whose complexity has defeated attempts to automate it. We suggest using probability in a Bayesian sense to model the uncertainty arising from the vast complexity of the game tree. We present a simple conditional Markov random field model for predicting the pointwise territory outcome of a game. The topology of the model reflects the spatial structure of the Go board. We describe a version of the Swendsen-Wang process for sampling from the model during learning and apply loopy belief propagation for rapid inference and prediction. The model is trained on several hundred records of professional games. Our experimental results indicate that the model successfully learns to predict territory despite its simplicity. 1

2 0.7063483 76 nips-2004-Hierarchical Bayesian Inference in Networks of Spiking Neurons

Author: Rajesh P. Rao

Abstract: There is growing evidence from psychophysical and neurophysiological studies that the brain utilizes Bayesian principles for inference and decision making. An important open question is how Bayesian inference for arbitrary graphical models can be implemented in networks of spiking neurons. In this paper, we show that recurrent networks of noisy integrate-and-fire neurons can perform approximate Bayesian inference for dynamic and hierarchical graphical models. The membrane potential dynamics of neurons is used to implement belief propagation in the log domain. The spiking probability of a neuron is shown to approximate the posterior probability of the preferred state encoded by the neuron, given past inputs. We illustrate the model using two examples: (1) a motion detection network in which the spiking probability of a direction-selective neuron becomes proportional to the posterior probability of motion in a preferred direction, and (2) a two-level hierarchical network that produces attentional effects similar to those observed in visual cortical areas V2 and V4. The hierarchical model offers a new Bayesian interpretation of attentional modulation in V2 and V4. 1

3 0.67417562 188 nips-2004-The Laplacian PDF Distance: A Cost Function for Clustering in a Kernel Feature Space

Author: Robert Jenssen, Deniz Erdogmus, Jose Principe, Torbjørn Eltoft

Abstract: A new distance measure between probability density functions (pdfs) is introduced, which we refer to as the Laplacian pdf distance. The Laplacian pdf distance exhibits a remarkable connection to Mercer kernel based learning theory via the Parzen window technique for density estimation. In a kernel feature space defined by the eigenspectrum of the Laplacian data matrix, this pdf distance is shown to measure the cosine of the angle between cluster mean vectors. The Laplacian data matrix, and hence its eigenspectrum, can be obtained automatically based on the data at hand, by optimal Parzen window selection. We show that the Laplacian pdf distance has an interesting interpretation as a risk function connected to the probability of error. 1

4 0.52691257 28 nips-2004-Bayesian inference in spiking neurons

Author: Sophie Deneve

Abstract: We propose a new interpretation of spiking neurons as Bayesian integrators accumulating evidence over time about events in the external world or the body, and communicating to other neurons their certainties about these events. In this model, spikes signal the occurrence of new information, i.e. what cannot be predicted from the past activity. As a result, firing statistics are close to Poisson, albeit providing a deterministic representation of probabilities. We proceed to develop a theory of Bayesian inference in spiking neural networks, recurrent interactions implementing a variant of belief propagation.

5 0.50431353 194 nips-2004-Theory of localized synfire chain: characteristic propagation speed of stable spike pattern

Author: Kosuke Hamaguchi, Masato Okada, Kazuyuki Aihara

Abstract: Repeated spike patterns have often been taken as evidence for the synfire chain, a phenomenon that a stable spike synchrony propagates through a feedforward network. Inter-spike intervals which represent a repeated spike pattern are influenced by the propagation speed of a spike packet. However, the relation between the propagation speed and network structure is not well understood. While it is apparent that the propagation speed depends on the excitatory synapse strength, it might also be related to spike patterns. We analyze a feedforward network with Mexican-Hattype connectivity (FMH) using the Fokker-Planck equation. We show that both a uniform and a localized spike packet are stable in the FMH in a certain parameter region. We also demonstrate that the propagation speed depends on the distinct firing patterns in the same network.

6 0.48548362 173 nips-2004-Spike-timing Dependent Plasticity and Mutual Information Maximization for a Spiking Neuron Model

7 0.48201185 181 nips-2004-Synergies between Intrinsic and Synaptic Plasticity in Individual Model Neurons

8 0.47747776 148 nips-2004-Probabilistic Computation in Spiking Populations

9 0.47321787 154 nips-2004-Resolving Perceptual Aliasing In The Presence Of Noisy Sensors

10 0.47057444 84 nips-2004-Inference, Attention, and Decision in a Bayesian Neural Architecture

11 0.47020856 112 nips-2004-Maximising Sensitivity in a Spiking Network

12 0.46695748 153 nips-2004-Reducing Spike Train Variability: A Computational Theory Of Spike-Timing Dependent Plasticity

13 0.46255147 118 nips-2004-Methods for Estimating the Computational Power and Generalization Capability of Neural Microcircuits

14 0.46101952 151 nips-2004-Rate- and Phase-coded Autoassociative Memory

15 0.45602438 56 nips-2004-Dynamic Bayesian Networks for Brain-Computer Interfaces

16 0.44341698 178 nips-2004-Support Vector Classification with Input Data Uncertainty

17 0.44206381 131 nips-2004-Non-Local Manifold Tangent Learning

18 0.44186366 189 nips-2004-The Power of Selective Memory: Self-Bounded Learning of Prediction Suffix Trees

19 0.44064566 68 nips-2004-Face Detection --- Efficient and Rank Deficient

20 0.440294 4 nips-2004-A Generalized Bradley-Terry Model: From Group Competition to Individual Skill