nips nips2004 nips2004-74 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Moray Allan, Christopher Williams
Abstract: We describe how we used a data set of chorale harmonisations composed by Johann Sebastian Bach to train Hidden Markov Models. Using a probabilistic framework allows us to create a harmonisation system which learns from examples, and which can compose new harmonisations. We make a quantitative comparison of our system’s harmonisation performance against simpler models, and provide example harmonisations. 1
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We describe how we used a data set of chorale harmonisations composed by Johann Sebastian Bach to train Hidden Markov Models. [sent-11, score-0.674]
2 Using a probabilistic framework allows us to create a harmonisation system which learns from examples, and which can compose new harmonisations. [sent-12, score-0.631]
3 We make a quantitative comparison of our system’s harmonisation performance against simpler models, and provide example harmonisations. [sent-13, score-0.532]
4 1 Introduction. Chorale harmonisation is a traditional part of the theoretical education of Western classical musicians. [sent-14, score-0.513]
5 Given a melody, the task is to create three further lines of music which will sound pleasant when played simultaneously with the original melody. [sent-15, score-0.22]
6 A good chorale harmonisation will show an understanding of the basic ‘rules’ of harmonisation, which codify the aesthetic preferences of the style. [sent-16, score-0.821]
7 Here we approach chorale harmonisation as a machine learning task, in a probabilistic framework. [sent-17, score-0.821]
8 We use example harmonisations to build a model of harmonic processes. [sent-18, score-0.532]
9 Section 2 below gives an overview of the musical background to chorale harmonisation. [sent-20, score-0.494]
10 Section 3 explains how we can create a harmonisation system using Hidden Markov Models. [sent-21, score-0.593]
11 Section 4 examines the system’s performance quantitatively and provides example harmonisations generated by the system. [sent-22, score-0.366]
12 At first chorales had only relatively simple melodic lines, but soon composers began to arrange more complex music to accompany the original tunes. [sent-26, score-0.391]
13 In the pieces by Bach which we use here, the chorale tune is taken generally unchanged in the highest voice, and three other musical parts are created alongside it, supporting it and each other. [sent-27, score-0.545]
14 By the eighteenth century, a complex system of rules had developed, dictating what combinations of notes should be played at the same time or following previous notes. [sent-28, score-0.25]
15 The training and test chorales used here are divided into two sets: one for chorales in ‘major’ keys, and one for chorales in ‘minor’ keys. [sent-32, score-0.777]
16 Major and minor keys are based around different sets of notes, and musical lines in major and minor keys behave differently. [sent-33, score-0.507]
17 The representation we use to model harmonisations divides up chorales into discrete timesteps according to the regular beat underlying their musical rhythm. [sent-34, score-0.862]
18 At each time-step we represent the notes in the various musical parts by counting how far apart they are in terms of all the possible ‘semitone’ notes. [sent-35, score-0.344]
19 Harmonisation Model: HMM for Harmonisation. We construct a Hidden Markov model in which the visible states are melody notes and the hidden states are chords. [sent-37, score-0.712]
20 A sequence of observed events makes up a melody line, and a sequence of hidden events makes up a possible harmonisation for a melody line. [sent-38, score-1.278]
21 We denote the sequence of melody notes as Y and the harmonic motion as C, with yt representing the melody at time t, and ct the harmonic state. [sent-39, score-1.342]
22 Hidden Markov Models are generative models: here we model how a visible melody line is emitted by a hidden sequence of harmonies. [sent-40, score-0.497]
23 This makes sense in musical terms, since we can view a chorale as having an underlying harmonic structure, and the individual notes of the melody line as chosen to be compatible with this harmonic state at each time step. [sent-41, score-1.339]
24 We will create separate models for chorales in major and minor keys, since these groups have different harmonic structures. [sent-42, score-0.559]
25 For our model we divide each chorale into time steps of a single beat, making the assumption that the harmonic state does not change during a beat. [sent-43, score-0.519]
26 We want to create a model which we can use to predict three further notes at each of these time steps, one for each of the three additional musical lines in the harmonisation. [sent-45, score-0.413]
27 Here we represent a choice of notes by a list of pitch intervals. [sent-47, score-0.195]
28 By using intervals in this way we represent the relationship between the added notes and the melody at a given time step, without reference to the absolute pitch of the melody note. [sent-48, score-0.731]
29 These interval sets alone would be harmonically ambiguous, so we disambiguate them using harmonic labels, which are included in the training data set. [sent-49, score-0.196]
30 Adding harmonic labels means that our hidden symbols not only identify a particular chord, but also the harmonic function that the chord is serving. [sent-50, score-0.688]
31 Here (an A major chord) the alto, tenor and bass notes are respectively 4, 9, and 16 semitones below the soprano melody. [sent-52, score-0.34]
32 The harmonic label is ‘T’, labelling this as functionally a ‘tonic’ chord. [sent-53, score-0.166]
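A minimal sketch of this interval representation, assuming MIDI pitch numbers; the function name and the concrete pitches are illustrative, not the authors' actual encoding. It reproduces the A major example above, where the alto, tenor and bass lie 4, 9 and 16 semitones below the soprano.

```python
def chord_to_intervals(soprano, alto, tenor, bass):
    """Encode a chord as semitone intervals below the soprano melody note,
    independent of the melody note's absolute pitch."""
    return (soprano - alto, soprano - tenor, soprano - bass)

# The A major example above: soprano C#5 over A4, E4 and A3
# gives intervals of 4, 9 and 16 semitones (assumed pitches).
assert chord_to_intervals(73, 69, 64, 57) == (4, 9, 16)
```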
33 Our representation of both melody and harmony distinguishes between a note which is continued from the previous beat and a repeated note. [sent-54, score-0.393]
34 We make a first-order Markov assumption concerning the transition probabilities between the hidden states, which represent choices of chord on an individual beat: P (ct | ct−1, ct−2, . . . , c1) = P (ct | ct−1). [sent-55, score-0.373]
35 We make a similar assumption concerning emission probabilities to model how the observed event, a melody note, results from the hidden state, a chord: P (yt | ct, ct−1, . . . , c1, yt−1, . . . , y1) = P (yt | ct). [sent-59, score-0.454]
36 In the Hidden Markov Models used here, the ‘hidden’ states of chords and harmonic symbols are in fact visible in the data during training. [sent-66, score-0.405]
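Because the chord states are observed during training, the transition and emission tables can be fitted by maximum-likelihood counting rather than by EM. A minimal sketch, assuming each training chorale is a list of (melody symbol, chord symbol) pairs per beat; all names and the data layout are illustrative assumptions.

```python
from collections import Counter

def estimate_hmm(chorales):
    """Count-based ML estimates of P(c_t | c_{t-1}) and P(y_t | c_t).

    chorales: list of sequences of (melody_symbol, chord_symbol) pairs.
    """
    trans, trans_tot = Counter(), Counter()
    emit, emit_tot = Counter(), Counter()
    for chorale in chorales:
        prev_chord = None
        for melody_sym, chord_sym in chorale:
            emit[(chord_sym, melody_sym)] += 1
            emit_tot[chord_sym] += 1
            if prev_chord is not None:
                trans[(prev_chord, chord_sym)] += 1
                trans_tot[prev_chord] += 1
            prev_chord = chord_sym
    P_trans = {jk: n / trans_tot[jk[0]] for jk, n in trans.items()}
    P_emit = {cy: n / emit_tot[cy[0]] for cy, n in emit.items()}
    return P_trans, P_emit
```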
37 Using a Hidden Markov Model framework allows us to conduct efficient inference over our harmonisation choices. [sent-70, score-0.536]
38 In this way our harmonisation system will ‘plan’ over an entire harmonisation rather than simply making immediate choices based on the local context. [sent-71, score-1.071]
39 This means, for example, that we can hope to compose appropriate ‘cadences’ to bring our harmonisations to pleasant closes rather than finishing abruptly. [sent-72, score-0.442]
40 Given a new melody line, we can use the Viterbi algorithm to find the most likely state sequence, and thus harmonisation, given our model. [sent-73, score-0.313]
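A log-space Viterbi sketch over the tables estimated above; P_init and the state set are assumed inputs, and missing entries are treated as zero probability. This is illustrative, not the authors' implementation.

```python
import math

def viterbi(melody, states, P_init, P_trans, P_emit):
    """Most likely chord sequence for a melody under the HMM."""
    def lp(p):
        return math.log(p) if p > 0 else float('-inf')

    delta = {c: lp(P_init.get(c, 0.0)) + lp(P_emit.get((c, melody[0]), 0.0))
             for c in states}
    backptr = []
    for y in melody[1:]:
        prev, delta, bp = delta, {}, {}
        for c in states:
            j = max(prev, key=lambda j: prev[j] + lp(P_trans.get((j, c), 0.0)))
            bp[c] = j
            delta[c] = (prev[j] + lp(P_trans.get((j, c), 0.0))
                        + lp(P_emit.get((c, y), 0.0)))
        backptr.append(bp)
    # Trace the best final state back to the start.
    path = [max(delta, key=delta.get)]
    for bp in reversed(backptr):
        path.append(bp[path[-1]])
    return list(reversed(path))
```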
41 We can also provide alternative harmonisations by sampling from the posterior [see 1, p. [sent-74, score-0.366]
42 P (y1, . . . , yt−1, ct−1 = j, ct = k) = αt−1(j) P (ct = k | ct−1 = j). [sent-80, score-0.208]
43 Summing over the previous chord j gives the forward recursion: αt(k) = P (y1, . . . , yt−1, ct = k) = Σj αt−1(j) P (ct = k | ct−1 = j). [sent-84, score-0.208]
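The recursion above can be paired with backward sampling to draw alternative harmonisations from the posterior, as mentioned earlier. A forward-filtering, backward-sampling sketch with per-step normalisation for numerical stability; structure and names are assumptions for illustration.

```python
import random

def sample_harmonisation(melody, states, P_init, P_trans, P_emit):
    """Draw a chord sequence from P(C | Y) by forward filtering then
    sampling backwards through the filtered distributions."""
    alpha = {c: P_init.get(c, 0.0) * P_emit.get((c, melody[0]), 0.0)
             for c in states}
    alphas = []
    for y in melody[1:]:
        z = sum(alpha.values()) or 1.0
        alpha = {c: p / z for c, p in alpha.items()}  # normalise each step
        alphas.append(alpha)
        alpha = {k: sum(alpha[j] * P_trans.get((j, k), 0.0) for j in states)
                    * P_emit.get((k, y), 0.0)
                 for k in states}
    alphas.append(alpha)

    def draw(weights):
        r = random.uniform(0.0, sum(weights.values()))
        for c, w in weights.items():
            r -= w
            if r <= 0:
                return c
        return c

    # Sample the last chord, then each earlier chord given its successor.
    path = [draw(alphas[-1])]
    for alpha in reversed(alphas[:-1]):
        path.append(draw({j: alpha[j] * P_trans.get((j, path[-1]), 0.0)
                          for j in states}))
    return list(reversed(path))
```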
44 HMM for Ornamentation. The chorale harmonisations produced by the Hidden Markov Model described above harmonise the original melody according to beat-long time steps. [sent-94, score-0.971]
45 (Table 1: Comparison of predictive power achieved by different models of harmonic sequences on training and test data sets, in nats.) [sent-95, score-0.579]
46 Chorale harmonisations are not limited to this rhythmic form, so here we add a secondary ornamentation stage which can add passing notes to decorate these harmonisations. [sent-112, score-0.337]
47 Generating a harmonisation and adding the ornamentation as a second stage greatly reduces the number of hidden states in the initial harmonisation model: if we went straight to fully-ornamented hidden states then the data available to us concerning each state would be extremely limited. [sent-113, score-1.583]
48 Moreover, since the passing notes do not change the harmonic structure of a piece but only ornament it, adding these passing notes after first determining the harmonic structure for a chorale is a plausible compositional process. [sent-114, score-1.03]
49 The notes added in this ornamentation stage generally smooth out the movement between notes in a line of music, so we set up the visible states in terms of how much the three harmonising musical lines rise or fall from one time-step to the next. [sent-116, score-0.921]
50 The hidden states describe ornamentation of this motion in terms of the movement made by each part during the time step, relative to its starting pitch. [sent-117, score-0.351]
51 This relative motion is described at a time resolution four times as fine as the harmonic movement. [sent-118, score-0.19]
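An illustrative encoding of the ornamentation HMM's visible states as per-part motion between beats: how far each of the three harmonising lines rises or falls. The hidden state would then refine this motion into four sub-beat steps; all structure and pitches here are assumed for illustration.

```python
def ornamentation_observation(prev_parts, next_parts):
    """Per-part motion in semitones from one beat to the next.
    Each argument is an (alto, tenor, bass) triple of pitches."""
    return tuple(n - p for p, n in zip(prev_parts, next_parts))

# E.g. alto falls 2 semitones, tenor holds, bass rises 5 (assumed pitches):
assert ornamentation_observation((65, 60, 48), (63, 60, 53)) == (-2, 0, 5)
```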
52 4 Results Our training and test data are derived from chorale harmonisations by Johann Sebastian Bach. [sent-122, score-0.704]
53 1 These provide a relatively large set of harmonisations by a single composer, and are long established as a standard reference among music theorists. [sent-123, score-0.479]
54 There are 202 chorales in major keys of which 121 were used for training and 81 used for testing; and 180 chorales in minor keys (split 108/72). [sent-124, score-0.76]
55 Using a probabilistic framework allows us to give quantitative answers to questions about the performance of the harmonisation system. [sent-125, score-0.532]
56 There are many quantities we could compute, but here we will look at how high a probability the model assigns to Bach’s own harmonisations given the respective melody lines. [sent-126, score-0.654]
57 Whereas verbal descriptions of harmonisation performance are unavoidably vague and hard to compare, these figures allow our model’s performance to be directly compared with that of any future probabilistic harmonisation system. [sent-129, score-1.026]
58 Table 1 shows the average negative log probability per symbol of Bach’s chord symbols. (Footnote 1: We used a computer-readable edition of Bach’s chorales downloaded from ftp://i11ftp.) [sent-130, score-0.532]
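The Table 1 figures are average negative log probabilities per symbol, in nats. A minimal sketch of the computation, assuming a model_logprob callable (a hypothetical name) that returns the natural-log probability of a chord sequence given its melody.

```python
def mean_neg_logprob_nats(dataset, model_logprob):
    """Average negative log probability per chord symbol, in nats.
    dataset: iterable of (melody, chords) pairs."""
    total_nats, total_symbols = 0.0, 0
    for melody, chords in dataset:
        total_nats -= model_logprob(melody, chords)
        total_symbols += len(chords)
    return total_nats / total_symbols
```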
59 The Hidden Markov Model here has 5046 hidden chord states and 58 visible melody states. [sent-136, score-0.711]
60 The Hidden Markov Model finds a better fit to the training data than the simpler models: to choose a good chord for a particular beat we need to take into account both the melody note on that beat and the surrounding chords. [sent-137, score-0.65]
61 Even the simplest model of the data, which assumes that each chord is drawn independently, performs worse on the test data than the training data, showing that we are suffering from sparse data. [sent-138, score-0.239]
62 There are many chords, chord to melody note emissions, and especially chord to chord transitions, that are seen in the test data but never occur in the training data. [sent-139, score-0.925]
63 The models’ performance with unseen data could be improved by using a more sophisticated smoothing method, for example taking into account the overall relative frequencies of harmonic symbols when assigning probabilities to unseen chord transitions. [sent-140, score-0.505]
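A sketch of the kind of smoothing suggested here: interpolate the maximum-likelihood transition estimate with the overall unigram frequency of the destination chord, so unseen transitions receive mass in proportion to how common the chord is overall. The interpolation weight is an assumed hyperparameter, not a value from the paper.

```python
def smoothed_transition(c_prev, c_next, P_trans, P_unigram, lam=0.9):
    """Interpolated estimate of P(c_next | c_prev); P_unigram holds the
    overall relative frequencies of the harmonic symbols."""
    return (lam * P_trans.get((c_prev, c_next), 0.0)
            + (1.0 - lam) * P_unigram[c_next])
```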
64 However, this lower performance with unseen test data is not a problem for the task we approach here, of generating new harmonisations, as long as we can learn a large enough vocabulary of events from the training data to be able to find good harmonisations for new chorale melodies. [sent-141, score-0.766]
65 Figures 2 and 3 show the most likely harmonisations under our model for two short chorales. [sent-142, score-0.366]
66 There is an appropriate harmonic movement through the harmonisations, and they come to plausible cadences. [sent-145, score-0.194]
67 The generated harmonisations suffer somewhat from not taking into account the flow of the individual musical lines which we add. [sent-146, score-0.607]
68 There are large jumps, especially in the bass line, more often than is desirable – the bass line suffers most since it has the greatest variance with respect to the soprano melody. [sent-147, score-0.17]
69 This excessive jumping also feeds through to reduce the performance of the ornamentation stage, creating visible states which are unseen in the training data. [sent-148, score-0.308]
70 The model structure means that the most likely harmonisation leaves these states unornamented. [sent-149, score-0.565]
71 5 Relationship to previous work Even while Bach was still composing chorales, music theorists were catching up with musical practice by writing treatises to explain and to teach harmonisation. [sent-156, score-0.299]
72 Two famous examples, Rameau’s Treatise on Harmony [2] and the Gradus ad Parnassum by Fux [3], show how musical style was systematised and formalised into sets of rules. [sent-157, score-0.243]
73 The traditional formulation of harmonisation technique in terms of rules suggests that we might create an automatic harmonisation system by finding as many rules as we can and encoding them as a consistent set of constraints. [sent-158, score-1.222]
74 Pachet and Roy [4] provide a good overview of constraint-based harmonisation systems. [sent-159, score-0.513]
75 Using standard constraint-satisfaction techniques for harmonisation is problematic, since the space and time needs of the solver tend to rise extremely quickly with the length of the piece. [sent-162, score-0.53]
76 In their comparison the ordinary constraint-based system actually performs much better, and they argue that this is because it possesses implicit control knowledge which the system based on the genetic algorithm lacks. [sent-167, score-0.181]
77 Even if they can be made more efficient, these rule-based systems do not perform the full task of our harmonisation system. [sent-168, score-0.513]
78 Rules written by a human penalise undesirable combinations of notes, so that they will be filtered out when the best chord is chosen from all those compatible with the harmony already decided. [sent-174, score-0.29]
79 In contrast, our model learns all its harmonic ‘rules’ from its training data. [sent-175, score-0.196]
80 Ponsford et al. [9] use n-gram Markov models to generate harmonic structures. [sent-177, score-0.183]
81 Unlike in chorale harmonisation, there is no predetermined tune with which the harmonies need to fit. [sent-178, score-0.37]
82 An automatically annotated corpus is used to train Markov models using contexts of different lengths, and the weighted sum of the probabilities assigned by these models is used to predict harmonic movement. [sent-180, score-0.223]
83 Using a Hidden Markov Model rather than simple n-grams allows this kind of template to be included in the model as the visible state of the system: the chorale tunes in our system can be thought of as complex templates for harmonisations. [sent-184, score-0.49]
84 In our system the ‘planning’ ability of Hidden Markov Models, using the combination of chords and harmonic labels encoded in the hidden states, produces cadences which bring the chorale tunes to harmonic closure. [sent-187, score-0.957]
85 First, their system only describes the harmonisation in terms of the harmonic label (e. [sent-193, score-0.724]
86 Secondly, they do not give a quantitative evaluation of the harmonisations produced as in our Table 1. [sent-196, score-0.385]
87 Thirdly, in [12] a Markov model on blocks of chord sequences rather than on individual chords is explored. [sent-197, score-0.297]
88 6 Discussion. Using the framework of probabilistic inference allows us to perform efficient inference to generate new chorale harmonisations, avoiding the computational scaling problems suffered by constraint-based harmonisation systems. [sent-198, score-0.821]
89 An alternative might be to use an Autoregressive Hidden Markov Model [13], which models the transitions between visible states as well as the hidden state transitions modelled by an ordinary Hidden Markov Model. [sent-202, score-0.36]
90 Not all of Bach’s chorale harmonisations are in the same style. [sent-203, score-0.674]
91 Some of his harmonisations are intentionally complex, and others intentionally simple. [sent-204, score-0.424]
92 We could improve our harmonisations by modelling this stylistic variation, either by manually annotating training chorales according to their style or by training a mixture of HMMs. [sent-205, score-0.712]
93 As we only wish to model the hidden harmonic state given the melody, rather than construct a full generative model of the data, Conditional Random Fields (CRFs) [14] provide a related but alternative framework. [sent-206, score-0.326]
94 For example, we might be able to make better predictions by taking into account the current time step’s position within its musical bar. [sent-212, score-0.207]
95 Music theory recognises a hierarchy of stressed beats within a bar, and harmonic movement should correlate with these stresses. [sent-213, score-0.213]
96 Our system described above only considers chords as sets of intervals, and thus does not have a notion of the key of a piece (other than major or minor). [sent-215, score-0.19]
97 In general more interesting harmonies result when musical lines are closer together and their movements are more constrained. [sent-218, score-0.264]
98 Another dimension that could be explored with CRFs would be to take into account the words of the chorales, since Bach’s own harmonisations are affected by the properties of the texts as well as of the melodies. [sent-219, score-0.387]
99 The four-part harmonisation problem: a comparison between genetic algorithms and a rule-based system. [sent-257, score-0.578]
100 HARMONET: A neural net for harmonizing chorales in the style of J. S. Bach. [sent-263, score-0.286]
wordName wordTfidf (topN-words)
[('harmonisation', 0.513), ('harmonisations', 0.366), ('chorale', 0.308), ('melody', 0.268), ('chorales', 0.249), ('chord', 0.209), ('ct', 0.208), ('musical', 0.186), ('harmonic', 0.166), ('notes', 0.158), ('ornamentation', 0.132), ('hidden', 0.115), ('music', 0.113), ('chords', 0.088), ('bach', 0.084), ('markov', 0.083), ('keys', 0.07), ('visible', 0.067), ('genetic', 0.065), ('harmony', 0.064), ('yt', 0.062), ('beat', 0.061), ('ponsford', 0.059), ('bass', 0.058), ('minor', 0.055), ('states', 0.052), ('rules', 0.047), ('state', 0.045), ('system', 0.045), ('cadences', 0.044), ('harmonies', 0.044), ('harmonising', 0.044), ('harmonization', 0.044), ('ln', 0.04), ('johann', 0.038), ('compose', 0.038), ('pleasant', 0.038), ('pitch', 0.037), ('symbol', 0.037), ('style', 0.037), ('major', 0.037), ('nishing', 0.035), ('events', 0.035), ('create', 0.035), ('lines', 0.034), ('pieces', 0.033), ('symbols', 0.032), ('training', 0.03), ('bwv', 0.029), ('fux', 0.029), ('gradus', 0.029), ('harmonise', 0.029), ('hild', 0.029), ('intentionally', 0.029), ('maj', 0.029), ('melodic', 0.029), ('pachet', 0.029), ('semitones', 0.029), ('soprano', 0.029), ('tenor', 0.029), ('wiggins', 0.029), ('movement', 0.028), ('sebastian', 0.028), ('unseen', 0.027), ('passing', 0.027), ('edinburgh', 0.027), ('concerning', 0.026), ('ordinary', 0.026), ('tonic', 0.025), ('tunes', 0.025), ('line', 0.025), ('hmm', 0.025), ('motion', 0.024), ('conduct', 0.023), ('century', 0.023), ('probabilities', 0.023), ('sequence', 0.022), ('emission', 0.022), ('voice', 0.022), ('alto', 0.022), ('automatic', 0.022), ('account', 0.021), ('piece', 0.02), ('master', 0.02), ('famous', 0.02), ('stage', 0.02), ('respective', 0.02), ('beats', 0.019), ('seeing', 0.019), ('crfs', 0.019), ('quantitative', 0.019), ('transitions', 0.019), ('secondly', 0.019), ('evolutionary', 0.019), ('proc', 0.018), ('fields', 0.018), ('tune', 0.018), ('rise', 0.017), ('compatible', 0.017), ('models', 0.017)]
simIndex simValue paperId paperTitle
same-paper 1 1.0 74 nips-2004-Harmonising Chorales by Probabilistic Inference
Author: Moray Allan, Christopher Williams
Abstract: We describe how we used a data set of chorale harmonisations composed by Johann Sebastian Bach to train Hidden Markov Models. Using a probabilistic framework allows us to create a harmonisation system which learns from examples, and which can compose new harmonisations. We make a quantitative comparison of our system’s harmonisation performance against simpler models, and provide example harmonisations. 1
2 0.10025641 29 nips-2004-Beat Tracking the Graphical Model Way
Author: Dustin Lang, Nando D. Freitas
Abstract: We present a graphical model for beat tracking in recorded music. Using a probabilistic graphical model allows us to incorporate local information and global smoothness constraints in a principled manner. We evaluate our model on a set of varied and difficult examples, and achieve impressive results. By using a fast dual-tree algorithm for graphical model inference, our system runs in less time than the duration of the music being processed. 1
3 0.072695658 152 nips-2004-Real-Time Pitch Determination of One or More Voices by Nonnegative Matrix Factorization
Author: Fei Sha, Lawrence K. Saul
Abstract: An auditory “scene”, composed of overlapping acoustic sources, can be viewed as a complex object whose constituent parts are the individual sources. Pitch is known to be an important cue for auditory scene analysis. In this paper, with the goal of building agents that operate in human environments, we describe a real-time system to identify the presence of one or more voices and compute their pitch. The signal processing in the front end is based on instantaneous frequency estimation, a method for tracking the partials of voiced speech, while the pattern-matching in the back end is based on nonnegative matrix factorization, an unsupervised algorithm for learning the parts of complex objects. While supporting a framework to analyze complicated auditory scenes, our system maintains real-time operability and state-of-the-art performance in clean speech.
4 0.048701327 174 nips-2004-Spike Sorting: Bayesian Clustering of Non-Stationary Data
Author: Aharon Bar-hillel, Adam Spiro, Eran Stark
Abstract: Spike sorting involves clustering spike trains recorded by a microelectrode according to the source neuron. It is a complicated problem, which requires a lot of human labor, partly due to the non-stationary nature of the data. We propose an automated technique for the clustering of non-stationary Gaussian sources in a Bayesian framework. At a first search stage, data is divided into short time frames and candidate descriptions of the data as a mixture of Gaussians are computed for each frame. At a second stage transition probabilities between candidate mixtures are computed, and a globally optimal clustering is found as the MAP solution of the resulting probabilistic model. Transition probabilities are computed using local stationarity assumptions and are based on a Gaussian version of the Jensen-Shannon divergence. The method was applied to several recordings. The performance appeared almost indistinguishable from humans in a wide range of scenarios, including movement, merges, and splits of clusters. 1
5 0.04533419 44 nips-2004-Conditional Random Fields for Object Recognition
Author: Ariadna Quattoni, Michael Collins, Trevor Darrell
Abstract: We present a discriminative part-based approach for the recognition of object classes from unsegmented cluttered scenes. Objects are modeled as flexible constellations of parts conditioned on local observations found by an interest operator. For each object class the probability of a given assignment of parts to local features is modeled by a Conditional Random Field (CRF). We propose an extension of the CRF framework that incorporates hidden variables and combines class conditional CRFs into a unified framework for part-based object recognition. The parameters of the CRF are estimated in a maximum likelihood framework and recognition proceeds by finding the most likely class under our model. The main advantage of the proposed CRF framework is that it allows us to relax the assumption of conditional independence of the observed data (i.e. local features) often used in generative approaches, an assumption that might be too restrictive for a considerable number of object classes.
6 0.043224987 126 nips-2004-Nearly Tight Bounds for the Continuum-Armed Bandit Problem
7 0.041716993 124 nips-2004-Multiple Alignment of Continuous Time Series
8 0.041288085 31 nips-2004-Blind One-microphone Speech Separation: A Spectral Learning Approach
9 0.040303826 183 nips-2004-Temporal-Difference Networks
10 0.039951183 110 nips-2004-Matrix Exponential Gradient Updates for On-line Learning and Bregman Projection
11 0.038195781 148 nips-2004-Probabilistic Computation in Spiking Populations
12 0.035746694 206 nips-2004-Worst-Case Analysis of Selective Sampling for Linear-Threshold Algorithms
13 0.035585035 1 nips-2004-A Cost-Shaping LP for Bellman Error Minimization with Performance Guarantees
14 0.035352062 107 nips-2004-Making Latin Manuscripts Searchable using gHMM's
15 0.035041627 56 nips-2004-Dynamic Bayesian Networks for Brain-Computer Interfaces
16 0.032682538 189 nips-2004-The Power of Selective Memory: Self-Bounded Learning of Prediction Suffix Trees
17 0.03243684 162 nips-2004-Semi-Markov Conditional Random Fields for Information Extraction
18 0.032307196 64 nips-2004-Experts in a Markov Decision Process
19 0.032284945 28 nips-2004-Bayesian inference in spiking neurons
20 0.032282222 12 nips-2004-A Temporal Kernel-Based Model for Tracking Hand Movements from Neural Activities
topicId topicWeight
[(0, -0.095), (1, -0.026), (2, 0.055), (3, -0.055), (4, -0.02), (5, -0.034), (6, 0.06), (7, 0.026), (8, 0.01), (9, -0.021), (10, 0.005), (11, -0.002), (12, -0.033), (13, 0.058), (14, 0.053), (15, 0.006), (16, 0.045), (17, -0.058), (18, -0.023), (19, -0.002), (20, -0.056), (21, 0.033), (22, -0.057), (23, -0.021), (24, -0.005), (25, 0.005), (26, -0.03), (27, -0.035), (28, -0.045), (29, -0.028), (30, -0.037), (31, -0.07), (32, -0.042), (33, -0.055), (34, -0.001), (35, -0.104), (36, 0.098), (37, 0.063), (38, -0.117), (39, 0.096), (40, -0.087), (41, -0.028), (42, 0.007), (43, 0.013), (44, 0.013), (45, -0.006), (46, 0.165), (47, 0.172), (48, -0.064), (49, -0.255)]
simIndex simValue paperId paperTitle
same-paper 1 0.92792821 74 nips-2004-Harmonising Chorales by Probabilistic Inference
Author: Moray Allan, Christopher Williams
Abstract: We describe how we used a data set of chorale harmonisations composed by Johann Sebastian Bach to train Hidden Markov Models. Using a probabilistic framework allows us to create a harmonisation system which learns from examples, and which can compose new harmonisations. We make a quantitative comparison of our system’s harmonisation performance against simpler models, and provide example harmonisations. 1
2 0.70577037 29 nips-2004-Beat Tracking the Graphical Model Way
Author: Dustin Lang, Nando D. Freitas
Abstract: We present a graphical model for beat tracking in recorded music. Using a probabilistic graphical model allows us to incorporate local information and global smoothness constraints in a principled manner. We evaluate our model on a set of varied and difficult examples, and achieve impressive results. By using a fast dual-tree algorithm for graphical model inference, our system runs in less time than the duration of the music being processed. 1
3 0.40232891 171 nips-2004-Solitaire: Man Versus Machine
Author: Xiang Yan, Persi Diaconis, Paat Rusmevichientong, Benjamin V. Roy
Abstract: In this paper, we use the rollout method for policy improvement to analyze a version of Klondike solitaire. This version, sometimes called thoughtful solitaire, has all cards revealed to the player, but then follows the usual Klondike rules. A strategy that we establish, using iterated rollouts, wins about twice as many games on average as an expert human player does. 1
4 0.38937435 6 nips-2004-A Hidden Markov Model for de Novo Peptide Sequencing
Author: Bernd Fischer, Volker Roth, Jonas Grossmann, Sacha Baginsky, Wilhelm Gruissem, Franz Roos, Peter Widmayer, Joachim M. Buhmann
Abstract: De novo Sequencing of peptides is a challenging task in proteome research. While there exist reliable DNA-sequencing methods, the highthroughput de novo sequencing of proteins by mass spectrometry is still an open problem. Current approaches suffer from a lack in precision to detect mass peaks in the spectrograms. In this paper we present a novel method for de novo peptide sequencing based on a hidden Markov model. Experiments effectively demonstrate that this new method significantly outperforms standard approaches in matching quality. 1
5 0.36974764 147 nips-2004-Planning for Markov Decision Processes with Sparse Stochasticity
Author: Maxim Likhachev, Sebastian Thrun, Geoffrey J. Gordon
Abstract: Planning algorithms designed for deterministic worlds, such as A* search, usually run much faster than algorithms designed for worlds with uncertain action outcomes, such as value iteration. Real-world planning problems often exhibit uncertainty, which forces us to use the slower algorithms to solve them. Many real-world planning problems exhibit sparse uncertainty: there are long sequences of deterministic actions which accomplish tasks like moving sensor platforms into place, interspersed with a small number of sensing actions which have uncertain outcomes. In this paper we describe a new planning algorithm, called MCP (short for MDP Compression Planning), which combines A* search with value iteration for solving Stochastic Shortest Path problem in MDPs with sparse stochasticity. We present experiments which show that MCP can run substantially faster than competing planners in domains with sparse uncertainty; these experiments are based on a simulation of a ground robot cooperating with a helicopter to fill in a partial map and move to a goal location.
6 0.35559669 107 nips-2004-Making Latin Manuscripts Searchable using gHMM's
7 0.33451015 193 nips-2004-Theories of Access Consciousness
8 0.33321327 108 nips-2004-Markov Networks for Detecting Overalpping Elements in Sequence Data
9 0.33103982 1 nips-2004-A Cost-Shaping LP for Bellman Error Minimization with Performance Guarantees
10 0.31669435 152 nips-2004-Real-Time Pitch Determination of One or More Voices by Nonnegative Matrix Factorization
11 0.30992651 124 nips-2004-Multiple Alignment of Continuous Time Series
12 0.30511117 159 nips-2004-Schema Learning: Experience-Based Construction of Predictive Action Models
13 0.28339449 120 nips-2004-Modeling Conversational Dynamics as a Mixed-Memory Markov Process
14 0.26931784 97 nips-2004-Learning Efficient Auditory Codes Using Spikes Predicts Cochlear Filters
15 0.26611975 43 nips-2004-Conditional Models of Identity Uncertainty with Application to Noun Coreference
16 0.26515156 183 nips-2004-Temporal-Difference Networks
17 0.26471135 75 nips-2004-Heuristics for Ordering Cue Search in Decision Making
18 0.26182196 162 nips-2004-Semi-Markov Conditional Random Fields for Information Extraction
19 0.26067448 56 nips-2004-Dynamic Bayesian Networks for Brain-Computer Interfaces
20 0.2564854 180 nips-2004-Synchronization of neural networks by mutual learning and its application to cryptography
topicId topicWeight
[(13, 0.052), (15, 0.075), (26, 0.044), (31, 0.028), (33, 0.17), (35, 0.019), (39, 0.025), (50, 0.035), (51, 0.011), (52, 0.012), (69, 0.403)]
simIndex simValue paperId paperTitle
1 0.84018159 128 nips-2004-Neural Network Computation by In Vitro Transcriptional Circuits
Author: Jongmin Kim, John Hopfield, Erik Winfree
Abstract: The structural similarity of neural networks and genetic regulatory networks to digital circuits, and hence to each other, was noted from the very beginning of their study [1, 2]. In this work, we propose a simple biochemical system whose architecture mimics that of genetic regulation and whose components allow for in vitro implementation of arbitrary circuits. We use only two enzymes in addition to DNA and RNA molecules: RNA polymerase (RNAP) and ribonuclease (RNase). We develop a rate equation for in vitro transcriptional networks, and derive a correspondence with general neural network rate equations [3]. As proof-of-principle demonstrations, an associative memory task and a feedforward network computation are shown by simulation. A difference between the neural network and biochemical models is also highlighted: global coupling of rate equations through enzyme saturation can lead to global feedback regulation, thus allowing a simple network without explicit mutual inhibition to perform the winner-take-all computation. Thus, the full complexity of the cell is not necessary for biochemical computation: a wide range of functional behaviors can be achieved with a small set of biochemical components. 1
same-paper 2 0.72957623 74 nips-2004-Harmonising Chorales by Probabilistic Inference
Author: Moray Allan, Christopher Williams
Abstract: We describe how we used a data set of chorale harmonisations composed by Johann Sebastian Bach to train Hidden Markov Models. Using a probabilistic framework allows us to create a harmonisation system which learns from examples, and which can compose new harmonisations. We make a quantitative comparison of our system’s harmonisation performance against simpler models, and provide example harmonisations. 1
3 0.66549033 18 nips-2004-Algebraic Set Kernels with Application to Inference Over Local Image Representations
Author: Amnon Shashua, Tamir Hazan
Abstract: This paper presents a general family of algebraic positive definite similarity functions over spaces of matrices with varying column rank. The columns can represent local regions in an image (whereby images have varying number of local parts), images of an image sequence, motion trajectories in a multibody motion, and so forth. The family of set kernels we derive is based on a group invariant tensor product lifting with parameters that can be naturally tuned to provide a cook-book of sorts covering the possible ”wish lists” from similarity measures over sets of varying cardinality. We highlight the strengths of our approach by demonstrating the set kernels for visual recognition of pedestrians using local parts representations. 1
4 0.45908681 29 nips-2004-Beat Tracking the Graphical Model Way
Author: Dustin Lang, Nando D. Freitas
Abstract: We present a graphical model for beat tracking in recorded music. Using a probabilistic graphical model allows us to incorporate local information and global smoothness constraints in a principled manner. We evaluate our model on a set of varied and difficult examples, and achieve impressive results. By using a fast dual-tree algorithm for graphical model inference, our system runs in less time than the duration of the music being processed. 1
5 0.44460282 3 nips-2004-A Feature Selection Algorithm Based on the Global Minimization of a Generalization Error Bound
Author: Dori Peleg, Ron Meir
Abstract: A novel linear feature selection algorithm is presented based on the global minimization of a data-dependent generalization error bound. Feature selection and scaling algorithms often lead to non-convex optimization problems, which in many previous approaches were addressed through gradient descent procedures that can only guarantee convergence to a local minimum. We propose an alternative approach, whereby the global solution of the non-convex optimization problem is derived via an equivalent optimization problem. Moreover, the convex optimization task is reduced to a conic quadratic programming problem for which efficient solvers are available. Highly competitive numerical results on both artificial and real-world data sets are reported. 1
6 0.44378775 69 nips-2004-Fast Rates to Bayes for Kernel Machines
7 0.44354138 44 nips-2004-Conditional Random Fields for Object Recognition
8 0.4430778 45 nips-2004-Confidence Intervals for the Area Under the ROC Curve
9 0.44303668 207 nips-2004-ℓ₀-norm Minimization for Basis Selection
10 0.44264761 204 nips-2004-Variational Minimax Estimation of Discrete Distributions under KL Loss
11 0.44237888 82 nips-2004-Incremental Algorithms for Hierarchical Classification
12 0.44204736 38 nips-2004-Co-Validation: Using Model Disagreement on Unlabeled Data to Validate Classification Algorithms
13 0.44204283 174 nips-2004-Spike Sorting: Bayesian Clustering of Non-Stationary Data
14 0.44203177 77 nips-2004-Hierarchical Clustering of a Mixture Model
15 0.44133914 161 nips-2004-Self-Tuning Spectral Clustering
16 0.44132212 86 nips-2004-Instance-Specific Bayesian Model Averaging for Classification
17 0.44118965 166 nips-2004-Semi-supervised Learning via Gaussian Processes
18 0.44048789 2 nips-2004-A Direct Formulation for Sparse PCA Using Semidefinite Programming
19 0.43926898 72 nips-2004-Generalization Error and Algorithmic Convergence of Median Boosting
20 0.43911505 167 nips-2004-Semi-supervised Learning with Penalized Probabilistic Clustering