
117 nips-2004-Methods Towards Invasive Human Brain Computer Interfaces


Source: pdf

Author: Thomas N. Lal, Thilo Hinterberger, Guido Widman, Michael Schröder, N. J. Hill, Wolfgang Rosenstiel, Christian E. Elger, Niels Birbaumer, Bernhard Schölkopf

Abstract: During the last ten years there has been growing interest in the development of Brain Computer Interfaces (BCIs). The field has mainly been driven by the needs of completely paralyzed patients to communicate. With a few exceptions, most human BCIs are based on extracranial electroencephalography (EEG). However, reported bit rates are still low. One reason for this is the low signal-to-noise ratio of the EEG [16]. We are currently investigating if BCIs based on electrocorticography (ECoG) are a viable alternative. In this paper we present the method and examples of intracranial EEG recordings of three epilepsy patients with electrode grids placed on the motor cortex. The patients were asked to repeatedly imagine movements of two kinds, e.g., tongue or finger movements. We analyze the classifiability of the data using Support Vector Machines (SVMs) [18, 21] and Recursive Channel Elimination (RCE) [11].

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 The field has mainly been driven by the needs of completely paralyzed patients to communicate. [sent-16, score-0.34]

2 With a few exceptions, most human BCIs are based on extracranial electroencephalography (EEG). [sent-17, score-0.236]

3 We are currently investigating if BCIs based on electrocorticography (ECoG) are a viable alternative. [sent-20, score-0.043]

4 In this paper we present the method and examples of intracranial EEG recordings of three epilepsy patients with electrode grids placed on the motor cortex. [sent-21, score-0.821]

5 The patients were asked to repeatedly imagine movements of two kinds, e. [sent-22, score-0.386]

6 1 Introduction: Completely paralyzed patients cannot communicate despite intact cognitive functions. [sent-26, score-0.34]

7 The disease Amyotrophic Lateral Sclerosis (ALS), for example, leads to complete paralysis of the voluntary muscular system caused by the degeneration of the motor neurons. [sent-27, score-0.18]

8 [1, 9] developed a Brain Computer Interface (BCI), called the Thought Translation Device (TTD), which is used by several paralyzed patients. [sent-29, score-0.083]

9 In order to use the interface, patients have to learn to voluntarily regulate their Slow Cortical Potentials (SCP). [sent-30, score-0.327]

10 Not all patients manage Figure 1: The left picture schematically shows the position of the 8x8 electrode grid of patient II. [sent-33, score-0.707]

11 As shown in the right picture the electrodes are connected to the amplifier via cables that are passed through the skull. [sent-35, score-0.241]

12 Most such BCIs require a data collection phase during which the subject repeatedly produces brain states of clearly separable locations. [sent-41, score-0.123]

13 This function can be used in online applications to identify the different brain states produced by the subject. [sent-43, score-0.083]

14 The majority of BCIs is based on extracranial EEG-recordings during imagined limb movements. [sent-44, score-0.201]

15 Movement-related cortical potentials in humans on the basis of electrocorticographical data have also been studied, e. [sent-46, score-0.078]

16 Very recently the first work describing BCIs based on electrocorticographic recordings was published [6, 13]. [sent-49, score-0.174]

17 Successful approaches have been developed using BCIs based on single unit, multiunit or field potentials recordings of primates. [sent-50, score-0.251]

18 taught monkeys to control a cursor on the basis of potentials from 7-30 motor cortex neurons [19]. [sent-52, score-0.312]

19 The BCI developed by [3] enables monkeys to reach and grasp using a robot arm. [sent-53, score-0.054]

20 Their system is based on recordings from frontoparietal cell ensembles. [sent-54, score-0.131]

21 2 Electrocorticography and Epilepsy: All patients presented suffer from a focal epilepsy. [sent-56, score-0.284]

22 The epileptic focus - the part of the brain which is responsible for the seizures - is removed by resection. [sent-57, score-0.252]

23 Prior to surgery, the epileptic focus has to be localized. [sent-58, score-0.086]

24 In some complicated cases, this must be done by placing electrodes onto the surface of the cortex as well as into deeper regions of the brain. [sent-59, score-0.255]

25 The skull over the region of interest is removed, the electrodes are positioned and the incision is sutured. [sent-60, score-0.157]

26 The electrodes are connected to a recording device via cables (cf. [sent-61, score-0.27]

27 Over a period of 5 to 14 days, ECoG is continuously recorded until the patient has had enough seizures to precisely localize the focus [10]. [sent-63, score-0.323]

28 Prior to surgery the parts of the cortex that are covered by the electrodes are identified by the electric stimulation of electrodes. [sent-64, score-0.548]

29 In the current setup, the patients keep the electrode implants for one to two weeks. [sent-65, score-0.427]

30 Furthermore most of the patients cannot concentrate for a long period of time. [sent-68, score-0.284]

31 All three patients had an electrode grid implanted that partly covered the right or the left motor cortex. [sent-71, score-0.754]

32 Table columns: patient, implanted electrodes, task, trials. Patient I: 64-grid right hemisphere, two 4-strip interhemisphere. Patient II: 64-grid right hemisphere. Patient III: 20-grid central, four 16-strips frontal. Task entries: left vs. [sent-72, score-0.559]

33 tongue; trials: 150, 100. 3 Experimental Situation and Data Acquisition: The experiments were performed in the Department of Epileptology of the University of Bonn. [sent-75, score-0.172]

34 We recorded ECoG data from three epileptic patients with a sampling rate of 1000Hz. [sent-76, score-0.411]

35 The electrode grids were placed on the cortex under the dura mater and covered the primary motor and premotor area as well as the fronto-temporal region of either the right or the left hemisphere. [sent-77, score-0.501]

36 Furthermore two of the patients had additional electrodes implanted on other parts of the cortex (cf. [sent-79, score-0.698]

37 The imagery tasks were chosen such that the involved parts of the brain (a) were covered by the electrode grid and (b) were represented spatially separately in the primary motor cortex. [sent-81, score-0.505]

38 The expected well-localized signal in motor-related tasks suggested discrimination tasks using imagination of hand, little finger, or tongue movements. [sent-82, score-0.224]

39 The patients were seated in a bed facing a monitor and were asked to repeatedly imagine two different movements. [sent-83, score-0.386]

40 The 4-second imagination phase started with a cue that was presented in the form of a picture showing either a tongue or a little finger for patients II and III. [sent-85, score-0.599]

41 The cue for patient I was an arrow pointing left or right. [sent-86, score-0.289]

42 The images which were used as a cue are shown in Figure 5. [sent-88, score-0.05]

43 For every trial and every electrode we thus obtained an EEG sequence that consisted of 1500 samples. [sent-91, score-0.191]

44 The concatenated model parameters of the channels together with the descriptor of the imagined task (i. [sent-94, score-0.385]

45 The data of patient III for example, consists of only 100 training points of Figure 2: The patients were asked to repeatedly imagine two different movements that are represented separately at the primary cortex, e. [sent-99, score-0.625]

46 This figure shows two stimuli that were used as a cue for imagery. [sent-102, score-0.05]

47 Recursive Channel Elimination (RCE) [11] treats features that belong to the data of a channel in a consistent way. [sent-116, score-0.152]

48 The evaluation criterion that determines which of the remaining channels will be removed is the mean of the weight vector entries that correspond to a channel’s features. [sent-118, score-0.374]

49 All features of the channel with the smallest mean value are removed from the data. [sent-119, score-0.192]

50 We compare how well SVMs can generalize given the data of different subsets of ECoG-channels: (i) the complete data, i. [sent-125, score-0.047]

51 all channels (ii) the subset of channels suggested by RCE. [sent-127, score-0.668]

52 In this setting we use the list of ranked channels from RCE in the following way: For every l in the range of one to the total number of channels, we calculate a 10-fold cross-validation error on the data of the l best-ranked channels. [sent-128, score-0.461]

53 We use the subset of channels which leads to the lowest error estimate. [sent-129, score-0.366]

54 The underlying assumption used here is that the classification-relevant information is extremely localized and that two correctly chosen channels contain sufficient information for classification purposes. [sent-131, score-0.334]

55 For regularization purposes we use a ridge on the kernel matrix which corresponds to a 2-norm penalty on the slack variables [4]. [sent-134, score-0.057]

56 Figure 3 (axes: time [ms] vs. amplitude [µV], channels C1 to C4): This plot shows ECoG recordings from 4 channels while the patient was imagining movements. [sent-135, score-0.704]

57 The amplitude of the recordings ranges roughly from -100 µV to +100 µV, which is on the order of five to ten times the amplitude measured with extracranial EEG. [sent-137, score-0.281]

58 To evaluate the classification performance of an SVM that is trained on a specific subset of channels we calculate its prediction error on a separate test set. [sent-138, score-0.366]

59 Via 10-fold cross-validation on the training set we estimate all parameters for the different considered subsets (i)-(iv): (i) The ridge is estimated. [sent-140, score-0.104]

60 (iii) We restrict the training set and the test set to the 2 best ranked channels by RCE. [sent-144, score-0.429]

61 The ridge is then estimated on the restricted training set. [sent-145, score-0.057]

62 For (i)-(iv) we obtain 50 test error estimates from the 50 repetitions for each patient. [sent-149, score-0.062]

63 For patient I the error decreases from 38% to 24% when using the channel subsets suggested by RCE. [sent-152, score-0.47]

64 On average, RCE selects channel subsets of size 5. [sent-153, score-0.199]

65 For patient II the number of channels is reduced to one third but the channel selection process does not yield an increased accuracy. [sent-155, score-0.76]

66 The error of 40% can be reduced to 23% for patient III using on average 5 channels selected by RCE. [sent-156, score-0.605]

67 For patients I and III the choice of the best 2 ranked channels leads to a much lower error as well. [sent-157, score-0.745]

68 The direct comparison of the results using the two best-ranked channels to two randomly chosen channels shows how well the RCE ranking method works: for patient III the error drops from chance level for two random channels to 18% using the two best-ranked channels. [sent-158, score-1.416]

69 The reason why there is such a big difference in performance for patient III when comparing (i) and (iii) might be that out of the 84 electrodes, only 20 are located over or close to the motor cortex. [sent-159, score-0.376]

70 In contrast to patient III, the electrodes of patient II are all more or less located close to Table 2: Classification Results. [sent-161, score-0.635]

71 We compare the classification accuracy of SVMs trained on the data of different channel subsets: (i) all ECoG-channels, (ii) the subset determined by Recursive Channel Elimination (RCE), (iii) the subset consisting of the two best ranked channels by RCE and (iv) two randomly drawn channels. [sent-162, score-0.581]

72 Using the two best ranked channels by RCE also yields good results for two patients. [sent-165, score-0.429]

73 SVMs trained on two random channels show performance better than chance only for patient II. [sent-166, score-0.621]

74 Table 2 header: pat (I, II, III); all channels (i): #channels 74, 64, 84; error 0. [sent-167, score-0.366]

75 13 RCE top 2 (iii) error; random 2 (iv) error 0. [sent-183, score-0.064]

76 This explains why data from two randomly drawn channels can yield a classification rate better than chance. [sent-192, score-0.334]

77 Furthermore patient II had the fewest electrodes implanted and thus the chance of randomly choosing an electrode close to an important location is higher than for the other two patients. [sent-193, score-0.716]

78 8 Discussion We recorded ECoG-data from three epilepsy patients during a motor imagery experiment. [sent-194, score-0.577]

79 Although only few data were collected, the following conclusions can be drawn: • The data of all three patients is reasonably well classifiable. [sent-195, score-0.284]

80 This is still high compared to the best error rates from BCI based on extracranial EEG which are as low as 10% (e. [sent-199, score-0.209]

81 5 seconds data from each trial only and that very few training points (100-200) were available. [sent-203, score-0.085]

82 Furthermore, extracranial EEG has been studied and developed for a number of years. [sent-204, score-0.177]

83 RCE successfully identifies subsets of ECoG-channels that lead to good classification performance. [sent-206, score-0.047]

84 • Poor classification rates using two randomly drawn channels and high classification rates using the two best-ranked channels by RCE suggest that classification relevant information is focused on small parts of the cortex and depends on the location of the physiological function. [sent-208, score-0.85]

85 • The best ranked RCE-channels correspond well with the results from the electric stimulation (cf. [sent-209, score-0.233]

86 For instance, it is still an open question whether the patients are able to adjust to a trained classifier and whether the classifying function can be transferred from session to session. [sent-212, score-0.312]

87 Moreover, experiments that are based on tasks different from motor imagery need to Figure 4: Electric stimulation of the implanted electrodes helps to identify the parts of the cortex that are covered by the electrode grid. [sent-213, score-0.862]

88 The red (solid) dots on the left picture mark the motor cortex of patient II as identified by the electric stimulation method. [sent-215, score-0.653]

89 The positions marked with yellow crosses correspond to the epileptic focus. [sent-216, score-0.086]

90 The red points on the right image are the best ranked channels by Recursive Channel Elimination (RCE). [sent-217, score-0.429]

91 The RCE-channels correspond well to the results from the electro stimulation diagnosis. [sent-218, score-0.075]

92 It is quite conceivable that the tasks that have been found to work well for extracranial EEG are not ideal for ECoG. [sent-220, score-0.15]

93 Likewise, it is unclear whether our preprocessing and machine learning methods, originally developed for extracranial EEG data, are well adapted to the different type of data that ECoG delivers. [sent-221, score-0.177]

94 Classifying single trial EEG: Towards brain computer interfacing. [sent-247, score-0.131]

95 Learning to control a brain-machine interface for reaching and grasping by primates. [sent-269, score-0.061]

96 Towards a direct brain interface based on human subdural recordings and wavelet packet analysis. [sent-292, score-0.275]

97 Separability of EEG signals recorded during right and left motor imagery using adaptive autoregressive parameters. [sent-370, score-0.263]

98 Corticomuscular coherence in the 6-15 Hz band: is the cortex involved in the generation of physiologic tremor? [sent-383, score-0.098]

99 Optimal spatial filtering of single trial EEG during imagined hand movement. [sent-389, score-0.099]

100 Event-related desynchronization and movement-related cortical potentials on the ECoG and EEG. [sent-413, score-0.078]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('rce', 0.407), ('channels', 0.334), ('patients', 0.284), ('patient', 0.239), ('eeg', 0.206), ('ecog', 0.193), ('electrodes', 0.157), ('bcis', 0.153), ('channel', 0.152), ('extracranial', 0.15), ('iii', 0.149), ('electrode', 0.143), ('motor', 0.137), ('recordings', 0.131), ('implanted', 0.129), ('tongue', 0.129), ('nger', 0.102), ('cortex', 0.098), ('ranked', 0.095), ('electroencephalography', 0.086), ('epileptic', 0.086), ('brain', 0.083), ('elimination', 0.082), ('interfaces', 0.082), ('iv', 0.078), ('stimulation', 0.075), ('bci', 0.068), ('ii', 0.066), ('epilepsy', 0.064), ('imagination', 0.064), ('surgery', 0.064), ('electric', 0.063), ('interface', 0.061), ('covered', 0.061), ('recursive', 0.059), ('ridge', 0.057), ('biomedical', 0.057), ('paralyzed', 0.056), ('rfe', 0.056), ('birbaumer', 0.051), ('clinical', 0.051), ('hinterberger', 0.051), ('imagery', 0.051), ('imagined', 0.051), ('cue', 0.05), ('potentials', 0.05), ('classi', 0.05), ('trial', 0.048), ('chance', 0.048), ('subsets', 0.047), ('neurophysiology', 0.045), ('svms', 0.044), ('bonn', 0.043), ('cables', 0.043), ('eberhard', 0.043), ('electrocorticographic', 0.043), ('electrocorticography', 0.043), ('epileptology', 0.043), ('invasive', 0.043), ('karls', 0.043), ('multiunit', 0.043), ('seizures', 0.043), ('voluntary', 0.043), ('device', 0.042), ('picture', 0.041), ('recorded', 0.041), ('repeatedly', 0.04), ('removed', 0.04), ('serruya', 0.037), ('braincomputer', 0.037), ('rehabilitation', 0.037), ('ttd', 0.037), ('wolpaw', 0.037), ('tubingen', 0.037), ('seconds', 0.037), ('germany', 0.036), ('selection', 0.035), ('transactions', 0.035), ('autoregressive', 0.034), ('grids', 0.034), ('hemisphere', 0.034), ('imagine', 0.033), ('error', 0.032), ('schr', 0.032), ('imaginary', 0.032), ('little', 0.031), ('parts', 0.03), ('repetitions', 0.03), ('asked', 0.029), ('engineering', 0.029), ('recording', 0.028), ('placed', 0.028), ('classifying', 0.028), ('cortical', 0.028), ('june', 0.028), ('cation', 0.027), 
('developed', 0.027), ('rates', 0.027), ('usa', 0.027), ('monkeys', 0.027)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000005 117 nips-2004-Methods Towards Invasive Human Brain Computer Interfaces

2 0.24659073 20 nips-2004-An Auditory Paradigm for Brain-Computer Interfaces

Author: N. J. Hill, Thomas N. Lal, Karin Bierig, Niels Birbaumer, Bernhard Schölkopf

Abstract: Motivated by the particular problems involved in communicating with “locked-in” paralysed patients, we aim to develop a brain-computer interface that uses auditory stimuli. We describe a paradigm that allows a user to make a binary decision by focusing attention on one of two concurrent auditory stimulus sequences. Using Support Vector Machine classification and Recursive Channel Elimination on the independent components of averaged event-related potentials, we show that an untrained user’s EEG data can be classified with an encouragingly high level of accuracy. This suggests that it is possible for users to modulate EEG signals in a single trial by the conscious direction of attention, well enough to be useful in BCI.

3 0.22400904 56 nips-2004-Dynamic Bayesian Networks for Brain-Computer Interfaces

Author: Pradeep Shenoy, Rajesh P. Rao

Abstract: We describe an approach to building brain-computer interfaces (BCI) based on graphical models for probabilistic inference and learning. We show how a dynamic Bayesian network (DBN) can be used to infer probability distributions over brain- and body-states during planning and execution of actions. The DBN is learned directly from observed data and allows measured signals such as EEG and EMG to be interpreted in terms of internal states such as intent to move, preparatory activity, and movement execution. Unlike traditional classification-based approaches to BCI, the proposed approach (1) allows continuous tracking and prediction of internal states over time, and (2) generates control signals based on an entire probability distribution over states rather than binary yes/no decisions. We present preliminary results of brain- and body-state estimation using simultaneous EEG and EMG signals recorded during a self-paced left/right hand movement task.

4 0.10596009 172 nips-2004-Sparse Coding of Natural Images Using an Overcomplete Set of Limited Capacity Units

Author: Eizaburo Doi, Michael S. Lewicki

Abstract: It has been suggested that the primary goal of the sensory system is to represent input in such a way as to reduce the high degree of redundancy. Given a noisy neural representation, however, solely reducing redundancy is not desirable, since redundancy is the only clue to reduce the effects of noise. Here we propose a model that best balances redundancy reduction and redundant representation. Like previous models, our model accounts for the localized and oriented structure of simple cells, but it also predicts a different organization for the population. With noisy, limited-capacity units, the optimal representation becomes an overcomplete, multi-scale representation, which, compared to previous models, is in closer agreement with physiological data. These results offer a new perspective on the expansion of the number of neurons from retina to V1 and provide a theoretical model of incorporating useful redundancy into efficient neural representations.

5 0.080483198 12 nips-2004-A Temporal Kernel-Based Model for Tracking Hand Movements from Neural Activities

Author: Lavi Shpigelman, Koby Crammer, Rony Paz, Eilon Vaadia, Yoram Singer

Abstract: We devise and experiment with a dynamical kernel-based system for tracking hand movements from neural activity. The state of the system corresponds to the hand location, velocity, and acceleration, while the system’s input are the instantaneous spike rates. The system’s state dynamics is defined as a combination of a linear mapping from the previous estimated state and a kernel-based mapping tailored for modeling neural activities. In contrast to generative models, the activity-to-state mapping is learned using discriminative methods by minimizing a noise-robust loss function. We use this approach to predict hand trajectories on the basis of neural activity in motor cortex of behaving monkeys and find that the proposed approach is more accurate than both a static approach based on support vector regression and the Kalman filter.

6 0.074135937 155 nips-2004-Responding to Modalities with Different Latencies

7 0.0613176 174 nips-2004-Spike Sorting: Bayesian Clustering of Non-Stationary Data

8 0.057363261 139 nips-2004-Optimal Aggregation of Classifiers and Boosting Maps in Functional Magnetic Resonance Imaging

9 0.056060128 51 nips-2004-Detecting Significant Multidimensional Spatial Clusters

10 0.054511622 84 nips-2004-Inference, Attention, and Decision in a Bayesian Neural Architecture

11 0.048154838 34 nips-2004-Breaking SVM Complexity with Cross-Training

12 0.045066427 33 nips-2004-Brain Inspired Reinforcement Learning

13 0.042468801 156 nips-2004-Result Analysis of the NIPS 2003 Feature Selection Challenge

14 0.041885704 75 nips-2004-Heuristics for Ordering Cue Search in Decision Making

15 0.040778641 4 nips-2004-A Generalized Bradley-Terry Model: From Group Competition to Individual Skill

16 0.039523657 3 nips-2004-A Feature Selection Algorithm Based on the Global Minimization of a Generalization Error Bound

17 0.037719063 92 nips-2004-Kernel Methods for Implicit Surface Modeling

18 0.037248787 143 nips-2004-PAC-Bayes Learning of Conjunctions and Classification of Gene-Expression Data

19 0.036933873 144 nips-2004-Parallel Support Vector Machines: The Cascade SVM

20 0.036855504 89 nips-2004-Joint MRI Bias Removal Using Entropy Minimization Across Images


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.13), (1, -0.035), (2, -0.006), (3, -0.032), (4, -0.051), (5, 0.055), (6, 0.245), (7, -0.019), (8, 0.079), (9, -0.096), (10, -0.083), (11, 0.073), (12, -0.058), (13, -0.168), (14, 0.291), (15, -0.032), (16, 0.073), (17, -0.097), (18, -0.142), (19, 0.096), (20, 0.049), (21, -0.094), (22, -0.083), (23, 0.143), (24, 0.002), (25, 0.126), (26, 0.062), (27, 0.141), (28, -0.103), (29, 0.047), (30, -0.046), (31, 0.055), (32, 0.044), (33, 0.051), (34, -0.056), (35, 0.056), (36, -0.04), (37, 0.027), (38, -0.114), (39, -0.029), (40, -0.108), (41, 0.031), (42, -0.015), (43, 0.056), (44, -0.016), (45, 0.024), (46, 0.013), (47, -0.151), (48, -0.054), (49, 0.051)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9483248 117 nips-2004-Methods Towards Invasive Human Brain Computer Interfaces

2 0.82676405 20 nips-2004-An Auditory Paradigm for Brain-Computer Interfaces

Author: N. J. Hill, Thomas N. Lal, Karin Bierig, Niels Birbaumer, Bernhard Schölkopf

Abstract: Motivated by the particular problems involved in communicating with “locked-in” paralysed patients, we aim to develop a brain-computer interface that uses auditory stimuli. We describe a paradigm that allows a user to make a binary decision by focusing attention on one of two concurrent auditory stimulus sequences. Using Support Vector Machine classification and Recursive Channel Elimination on the independent components of averaged event-related potentials, we show that an untrained user’s EEG data can be classified with an encouragingly high level of accuracy. This suggests that it is possible for users to modulate EEG signals in a single trial by the conscious direction of attention, well enough to be useful in BCI.

3 0.81669062 56 nips-2004-Dynamic Bayesian Networks for Brain-Computer Interfaces

Author: Pradeep Shenoy, Rajesh P. Rao

Abstract: We describe an approach to building brain-computer interfaces (BCI) based on graphical models for probabilistic inference and learning. We show how a dynamic Bayesian network (DBN) can be used to infer probability distributions over brain- and body-states during planning and execution of actions. The DBN is learned directly from observed data and allows measured signals such as EEG and EMG to be interpreted in terms of internal states such as intent to move, preparatory activity, and movement execution. Unlike traditional classification-based approaches to BCI, the proposed approach (1) allows continuous tracking and prediction of internal states over time, and (2) generates control signals based on an entire probability distribution over states rather than binary yes/no decisions. We present preliminary results of brain- and body-state estimation using simultaneous EEG and EMG signals recorded during a self-paced left/right hand movement task.

4 0.44662565 12 nips-2004-A Temporal Kernel-Based Model for Tracking Hand Movements from Neural Activities

Author: Lavi Shpigelman, Koby Crammer, Rony Paz, Eilon Vaadia, Yoram Singer

Abstract: We devise and experiment with a dynamical kernel-based system for tracking hand movements from neural activity. The state of the system corresponds to the hand location, velocity, and acceleration, while the system’s input are the instantaneous spike rates. The system’s state dynamics is defined as a combination of a linear mapping from the previous estimated state and a kernel-based mapping tailored for modeling neural activities. In contrast to generative models, the activity-to-state mapping is learned using discriminative methods by minimizing a noise-robust loss function. We use this approach to predict hand trajectories on the basis of neural activity in motor cortex of behaving monkeys and find that the proposed approach is more accurate than both a static approach based on support vector regression and the Kalman filter.

5 0.38376364 155 nips-2004-Responding to Modalities with Different Latencies

Author: Fredrik Bissmarck, Hiroyuki Nakahara, Kenji Doya, Okihide Hikosaka

Abstract: Motor control depends on sensory feedback in multiple modalities with different latencies. In this paper we consider within the framework of reinforcement learning how different sensory modalities can be combined and selected for real-time, optimal movement control. We propose an actor-critic architecture with multiple modules, whose output are combined using a softmax function. We tested our architecture in a simulation of a sequential reaching task. Reaching was initially guided by visual feedback with a long latency. Our learning scheme allowed the agent to utilize the somatosensory feedback with shorter latency when the hand is near the experienced trajectory. In simulations with different latencies for visual and somatosensory feedback, we found that the agent depended more on feedback with shorter latency.

6 0.31554642 172 nips-2004-Sparse Coding of Natural Images Using an Overcomplete Set of Limited Capacity Units

7 0.25497863 139 nips-2004-Optimal Aggregation of Classifiers and Boosting Maps in Functional Magnetic Resonance Imaging

8 0.25396076 109 nips-2004-Mass Meta-analysis in Talairach Space

9 0.25186744 38 nips-2004-Co-Validation: Using Model Disagreement on Unlabeled Data to Validate Classification Algorithms

10 0.23924753 156 nips-2004-Result Analysis of the NIPS 2003 Feature Selection Challenge

11 0.21994253 193 nips-2004-Theories of Access Consciousness

12 0.21826173 106 nips-2004-Machine Learning Applied to Perception: Decision Images for Gender Classification

13 0.20683181 29 nips-2004-Beat Tracking the Graphical Model Way

14 0.19958095 34 nips-2004-Breaking SVM Complexity with Cross-Training

15 0.19717069 120 nips-2004-Modeling Conversational Dynamics as a Mixed-Memory Markov Process

16 0.19455844 75 nips-2004-Heuristics for Ordering Cue Search in Decision Making

17 0.19024709 3 nips-2004-A Feature Selection Algorithm Based on the Global Minimization of a Generalization Error Bound

18 0.18873419 143 nips-2004-PAC-Bayes Learning of Conjunctions and Classification of Gene-Expression Data

19 0.18780874 159 nips-2004-Schema Learning: Experience-Based Construction of Predictive Action Models

20 0.1873682 84 nips-2004-Inference, Attention, and Decision in a Bayesian Neural Architecture


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(13, 0.064), (15, 0.071), (18, 0.437), (26, 0.053), (31, 0.029), (33, 0.147), (35, 0.029), (50, 0.028), (63, 0.029)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.76241821 117 nips-2004-Methods Towards Invasive Human Brain Computer Interfaces

Author: Thomas N. Lal, Thilo Hinterberger, Guido Widman, Michael Schröder, N. J. Hill, Wolfgang Rosenstiel, Christian E. Elger, Niels Birbaumer, Bernhard Schölkopf

Abstract: During the last ten years there has been growing interest in the development of Brain Computer Interfaces (BCIs). The field has mainly been driven by the needs of completely paralyzed patients to communicate. With a few exceptions, most human BCIs are based on extracranial electroencephalography (EEG). However, reported bit rates are still low. One reason for this is the low signal-to-noise ratio of the EEG [16]. We are currently investigating if BCIs based on electrocorticography (ECoG) are a viable alternative. In this paper we present the method and examples of intracranial EEG recordings of three epilepsy patients with electrode grids placed on the motor cortex. The patients were asked to repeatedly imagine movements of two kinds, e.g., tongue or finger movements. We analyze the classifiability of the data using Support Vector Machines (SVMs) [18, 21] and Recursive Channel Elimination (RCE) [11].

2 0.70632583 75 nips-2004-Heuristics for Ordering Cue Search in Decision Making

Author: Peter M. Todd, Anja Dieckmann

Abstract: Simple lexicographic decision heuristics that consider cues one at a time in a particular order and stop searching for cues as soon as a decision can be made have been shown to be both accurate and frugal in their use of information. But much of the simplicity and success of these heuristics comes from using an appropriate cue order. For instance, the Take The Best heuristic uses validity order for cues, which requires considerable computation, potentially undermining the computational advantages of the simple decision mechanism. But many cue orders can achieve good decision performance, and studies of sequential search for data records have proposed a number of simple ordering rules that may be of use in constructing appropriate decision cue orders as well. Here we consider a range of simple cue ordering mechanisms, including tallying, swapping, and move-to-front rules, and show that they can find cue orders that lead to reasonable accuracy and considerable frugality when used with lexicographic decision heuristics.
1 One-Reason Decision Making and Ordered Search
How do we know what information to consider when making a decision? Imagine the problem of deciding which of two objects or options is greater along some criterion, such as which of two cities is larger. We may know various facts about each city, such as whether they have a major sports team or a university or airport. To decide between them, we could weight and sum all the cues we know, or we could use a simpler lexicographic rule to look at one cue at a time in a particular order until we find a cue that discriminates between the options and indicates a choice [1]. Such lexicographic rules are used by people in a variety of decision tasks [2]-[4], and have been shown to be both accurate in their inferences and frugal in the amount of information they consider before making a decision.
For instance, Gigerenzer and colleagues [5] demonstrated the surprising performance of several decision heuristics that stop information search as soon as one discriminating cue is found; because only that cue is used to make the decision, and no integration of information is involved, they called these heuristics “one-reason” decision mechanisms. Given some set of cues that can be looked up to make the decision, these heuristics differ mainly in the search rule that determines the order in which the information is searched. But then the question of what information to consider becomes, how are these search orders determined? Particular cue orders make a difference, as has been shown in research on the Take The Best heuristic (TTB) [6], [7]. TTB consists of three building blocks. (1) Search rule: Search through cues in the order of their validity, a measure of accuracy equal to the proportion of correct decisions made by a cue out of all the times that cue discriminates between pairs of options. (2) Stopping rule: Stop search as soon as one cue is found that discriminates between the two options. (3) Decision rule: Select the option to which the discriminating cue points, that is, the option that has the cue value associated with higher criterion values. The performance of TTB has been tested on several real-world data sets, ranging from professors’ salaries to fish fertility [8], in cross-validation comparisons with other more complex strategies. Across 20 data sets, TTB used on average only a third of the available cues (2.4 out of 7.7), yet still outperformed multiple linear regression in generalization accuracy (71% vs. 68%). The even simpler Minimalist heuristic, which searches through available cues in a random order, was more frugal (using 2.2 cues on average), yet still achieved 65% accuracy. But the fact that the accuracy of Minimalist lagged behind TTB by 6 percentage points indicates that part of the secret of TTB’s success lies in its ordered search. 
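The three building blocks of a one-reason heuristic (search rule, stopping rule, decision rule) can be illustrated with a minimal sketch. The function name `lexicographic_choice` and the encoding of cues as 0/1 values, with 1 assumed to be the value associated with higher criterion values, are illustrative assumptions and not taken from the paper. Passing a validity-sorted order would correspond to TTB, and a random order to Minimalist.

```python
from typing import Optional, Sequence, Tuple


def lexicographic_choice(
    a: Sequence[int], b: Sequence[int], cue_order: Sequence[int]
) -> Tuple[Optional[int], int]:
    """Return (choice, cues_looked_up) for a paired comparison.

    choice is 0 if object a is inferred to be larger, 1 if object b is,
    and None if no cue discriminates. Cue values are assumed binary, with
    1 being the value associated with higher criterion values.
    """
    looked_up = 0
    for cue in cue_order:          # search rule: follow the given cue order
        looked_up += 1
        if a[cue] != b[cue]:       # stopping rule: first discriminating cue
            # decision rule: pick the object the cue points to
            return (0 if a[cue] > b[cue] else 1), looked_up
    return None, looked_up         # no cue discriminates


# Hypothetical objects described by three binary cues.
city_a = (1, 0, 1)
city_b = (1, 1, 0)
print(lexicographic_choice(city_a, city_b, [0, 1, 2]))
print(lexicographic_choice(city_a, city_b, [2, 0, 1]))
```

The second return value is the number of cues looked up, which is the quantity the paper reports as frugality.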
Moreover, in laboratory experiments [3], [4], [9], people using lexicographic decision strategies have been shown to employ cue orders based on the cues’ validities or a combination of validity and discrimination rate (proportion of decision pairs on which a cue discriminates between the two options). Thus, the cue order used by a lexicographic decision mechanism can make a considerable difference in accuracy; the same holds true for frugality, as we will see. But constructing an exact validity order, as used by Take The Best, takes considerable information and computation [10]. If there are N known objects to make decisions over, and C cues known for each object, then each of the C cues must be evaluated for whether it discriminates correctly (counting up R right decisions), incorrectly (W wrong decisions), or does not discriminate between each of the N·(N-1)/2 possible object pairs, yielding C·N·(N-1)/2 checks to perform to gather the information needed to compute cue validities (v = R/(R+W)) in this domain. But a decision maker typically does not know all of the objects to be decided upon, nor even all the cue values for those objects, ahead of time—is there any simpler way to find an accurate and frugal cue order? In this paper, we address this question through simulation-based comparison of a variety of simple cue-order-learning rules. Hope comes from two directions: first, there are many cue orders besides the exact validity ordering that can yield good performance; and second, research in computer science has demonstrated the efficacy of a range of simple ordering rules for a closely related search problem. Consequently, we find that simple mechanisms at the cue-order-learning stage can enable simple mechanisms at the decision stage, such as lexicographic one-reason decision heuristics, to perform well. 
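The cost of computing an exact validity order can be made concrete with a small sketch that visits every object pair for every cue, counts right (R) and wrong (W) discriminations, and returns v = R/(R+W) together with the number of checks performed. The function name `cue_validities`, the toy data, and the handling of pairs with equal criterion values (counted as neither right nor wrong) are assumptions made for illustration.

```python
from itertools import combinations


def cue_validities(objects, criterion):
    """Compute v = R / (R + W) for each cue over all object pairs.

    objects:   list of tuples of binary cue values (1 assumed to indicate
               a higher criterion value)
    criterion: list of criterion values, one per object
    Returns (validities, checks); a validity is None if the cue never
    discriminates.
    """
    n_cues = len(objects[0])
    right = [0] * n_cues
    wrong = [0] * n_cues
    checks = 0
    for i, j in combinations(range(len(objects)), 2):   # N*(N-1)/2 pairs
        for c in range(n_cues):                         # C cues per pair
            checks += 1
            diff = objects[i][c] - objects[j][c]
            if diff == 0 or criterion[i] == criterion[j]:
                continue                                # no discrimination counted
            cue_points_to_i = diff > 0
            i_is_larger = criterion[i] > criterion[j]
            if cue_points_to_i == i_is_larger:
                right[c] += 1
            else:
                wrong[c] += 1
    validities = [r / (r + w) if r + w else None for r, w in zip(right, wrong)]
    return validities, checks


# Toy example: 4 objects, 2 cues, so 2 * 4 * 3 / 2 = 12 checks.
toy_objects = [(1, 1), (1, 0), (0, 1), (0, 0)]
toy_criterion = [40, 30, 20, 10]
print(cue_validities(toy_objects, toy_criterion))
```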
2 Simple approaches to constructing cue search orders
To compare different cue ordering rules, we evaluate the performance of different cue orders when used by a one-reason decision heuristic within a particular well-studied sample domain: large German cities, compared on the criterion of population size using 9 cues ranging from having a university to the presence of an intercity train line [6], [7]. Examining this domain makes it clear that there are many good possible cue orders. When used with one-reason stopping and decision building blocks, the mean accuracy of the 362,880 (9!) cue orders is 70%, equivalent to the performance expected from Minimalist. The accuracy of the validity order, 74.2%, falls toward the upper end of the accuracy range (62-75.8%), but there are still 7421 cue orders that do better than the validity order. The frugality of the search orders ranges from 2.53 cues per decision to 4.67, with a mean of 3.34 corresponding to using Minimalist; TTB has a frugality of 4.23, implying that most orders are more frugal. Thus, there are many accurate and frugal cue orders that could be found; a satisficing decision maker not requiring optimal performance need only land on one. An ordering problem of this kind has been studied in computer science for nearly four decades, and can provide us with a set of potential heuristics to test. Consider the case of a set of data records arranged in a list, each of which will be required during a set of retrievals with a particular probability $p_i$. On each retrieval, a key is given (e.g. a record’s title) and the list is searched from the front to the end until the desired record, matching that key, is found. The goal is to minimize the mean search time for accessing the records in this list, for which the optimal ordering is in decreasing order of $p_i$. But if these retrieval probabilities are not known ahead of time, how can the list be ordered after each successive retrieval to achieve fast access?
This is the problem of self-organizing sequential search [11], [12]. A variety of simple sequential search heuristics have been proposed for this problem, centering on three main approaches: (1) transpose, in which a retrieved record is moved one position closer to the front of the list (i.e., swapping with the record in front of it); (2) move-to-front (MTF), in which a retrieved record is put at the front of the list, and all other records remain in the same relative order; and (3) count, in which a tally is kept of the number of times each record is retrieved, and the list is reordered in decreasing order of this tally after each retrieval. Because count rules require storing additional information, more attention has focused on the memory-free transposition and MTF rules. Analytic and simulation results (reviewed in [12]) have shown that while transposition rules can come closer to the optimal order asymptotically, in the short run MTF rules converge more quickly (as can count rules). This may make MTF (and count) rules more appealing as models of cue order learning by humans facing small numbers of decision trials. Furthermore, MTF rules are more responsive to local structure in the environment (e.g., clumped retrievals over time of a few records), and transposition can result in very poor performance under some circumstances (e.g., when neighboring pairs of “popular” records get trapped at the end of the list by repeatedly swapping places). Note, however, that there are important differences between the self-organizing sequential search problem and the cue-ordering problem we address here. In particular, when a record is sought that matches a particular key, search proceeds until the correct record is found. In contrast, when a decision is made lexicographically and the list of cues is searched through, there is no one “correct” cue to find: each cue may or may not discriminate (allow a decision to be made).
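The three self-organizing list heuristics described above (transpose, move-to-front, count) can be sketched as in-place updates on a Python list. The function names and the tie-handling in the count rule (a stable sort keeps tied records in their previous relative order) are illustrative assumptions rather than details from the cited work.

```python
def transpose(order, item):
    """Move the retrieved item one position closer to the front."""
    i = order.index(item)
    if i > 0:
        order[i - 1], order[i] = order[i], order[i - 1]


def move_to_front(order, item):
    """Put the retrieved item first; all others keep their relative order."""
    order.remove(item)
    order.insert(0, item)


def count_rule(order, counts, item):
    """Tally the retrieval, then reorder by decreasing retrieval count."""
    counts[item] = counts.get(item, 0) + 1
    # list.sort is stable, so tied records keep their previous order
    order.sort(key=lambda record: -counts.get(record, 0))


records = ["a", "b", "c", "d"]
transpose(records, "c")        # 'c' swaps with 'b'
move_to_front(records, "d")    # 'd' jumps to the front
retrievals = {}
count_rule(records, retrievals, "b")
print(records)
```

Only the count rule needs the extra `retrievals` dictionary, which reflects the memory cost the text attributes to count rules.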
Furthermore, once a discriminating cue is found, it may not even make the right decision. Thus, given feedback about whether a decision was right or wrong, a discriminating cue could potentially be moved up or down in the ordered list. This dissociation between making a decision or not (based on the cue discrimination rates), and making a right or wrong decision (based on the cue validities), means that there are two ordering criteria in this problem (frugality and accuracy), as opposed to the single criterion (search time) for records based on their retrieval probability $p_i$. Because record search time corresponds to cue frugality, the heuristics that work well for the self-organizing sequential search task are likely to produce orders that emphasize frugality (reflecting cue discrimination rates) over accuracy in the cue-ordering task. Nonetheless, these heuristics offer a useful starting point for exploring cue-ordering rules.
2.1 The cue-ordering rules
We focus on search order construction processes that are psychologically plausible by being frugal both in terms of information storage and in terms of computation. The decision situation we explore is different from the one assumed by Juslin and Persson [10], who strongly differentiate learning about objects from later making decisions about them. Instead we assume a learning-while-doing situation, consisting of tasks that have to be done repeatedly with feedback after each trial about the adequacy of one’s decision. For instance, we can observe on multiple occasions which of two supermarket checkout lines, the one we have chosen or (more likely) another one, is faster, and associate this outcome with cues including the lines’ lengths and the ages of their respective cashiers. In such situations, decision makers can learn about the differential usefulness of cues for solving the task via the feedback received over time.
We compare several explicitly defined ordering rules that construct cue orders for use by lexicographic decision mechanisms applied to a particular probabilistic inference task: forced choice paired comparison, in which a decision maker has to infer which of two objects, each described by a set of binary cues, is “bigger” on a criterion (just the task for which TTB was formulated). After an inference has been made, feedback is given about whether a decision was right or wrong. Therefore, the order-learning algorithm has information about which cues were looked up, whether a cue discriminated, and whether a discriminating cue led to the right or wrong decision. The rules we propose differ in which pieces of information they use and how they use them. We classify the learning rules based on their memory requirement (high versus low) and their computational requirements in terms of full or partial reordering (see Table 1).
Table 1: Learning rules classified by memory and computational requirements
High memory load, complete reordering:
  Validity: reorders cues based on their current validity
  Tally: reorders cues by number of correct minus incorrect decisions made so far
  Associative/delta rule: reorders cues by learned association strength
High memory load, local reordering:
  Tally swap: moves cue up (down) one position if it has made a correct (incorrect) decision and its tally of correct minus incorrect decisions is ≥ (≤) that of the next higher (lower) cue
Low memory load, local reordering:
  Simple swap: moves cue up one position after correct decision, and down after an incorrect decision
  Move-to-front (2 forms): Take The Last (TTL): moves discriminating cue to front; TTL-correct: moves cue to front only if it correctly discriminates
The validity rule, a type of count rule, is the most demanding of the rules we consider in terms of both memory requirements and computational complexity.
It keeps a count of all discriminations made by a cue so far (in all the times that the cue was looked up) and a separate count of all the correct discriminations. Therefore, memory load is comparatively high. The validity of each cue is determined by dividing its current correct discrimination count by its total discrimination count. Based on these values computed after each decision, the rule reorders the whole set of cues from highest to lowest validity. The tally rule only keeps one count per cue, storing the number of correct decisions made by that cue so far minus the number of incorrect decisions. If a cue discriminates correctly on a given trial, one point is added to its tally, if it leads to an incorrect decision, one point is subtracted. The tally rule is less demanding in terms of memory and computation: Only one count is kept, no division is required. The simple swap rule uses the transposition rather than count approach. This rule has no memory of cue performance other than an ordered list of all cues, and just moves a cue up one position in this list whenever it leads to a correct decision, and down if it leads to an incorrect decision. In other words, a correctly deciding cue swaps positions with its nearest neighbor upwards in the cue order, and an incorrectly deciding cue swaps positions with its nearest neighbor downwards. The tally swap rule is a hybrid of the simple swap rule and the tally rule. It keeps a tally of correct minus incorrect discriminations per cue so far (so memory load is high) but only locally swaps cues: When a cue makes a correct decision and its tally is greater than or equal to that of its upward neighbor, the two cues swap positions. When a cue makes an incorrect decision and its tally is smaller than or equal to that of its downward neighbor, the two cues also swap positions. We also evaluate two types of move-to-front rules. 
First, the Take The Last (TTL) rule moves the last discriminating cue (that is, whichever cue was found to discriminate for the current decision) to the front of the order. This is equivalent to the Take The Last heuristic [6], [7], which uses a memory of cues that discriminated in the past to determine cue search order for subsequent decisions. Second, TTL-correct moves the last discriminating cue to the front of the order only if it correctly discriminated; otherwise, the cue order remains unchanged. This rule thus takes accuracy as well as frugality into account. Finally, we include an associative learning rule that uses the delta rule to update cue weights according to whether they make correct or incorrect discriminations, and then reorders all cues in decreasing order of this weight after each decision. This corresponds to a simple network with nine input units encoding the difference in cue value between the two objects (A and B) being decided on (i.e., $in_i = 1$ if $cue_i(A) > cue_i(B)$, $in_i = -1$ if $cue_i(A) < cue_i(B)$, and $in_i = 0$ if $cue_i(A) = cue_i(B)$ or $cue_i$ was not checked) and with one output unit whose target value encodes the correct decision ($t = 1$ if criterion(A) > criterion(B), otherwise $t = -1$), and with the weights between inputs and output updated according to $\Delta w_i = lr \cdot (t - in_i \cdot w_i) \cdot in_i$ with learning rate $lr = 0.1$. We expect this rule to behave similarly to Oliver’s rule initially (moving a cue to the front of the list by giving it the largest weight when weights are small) and to swap later on (moving cues only a short distance once weights are larger).
3 Simulation Study of Simple Ordering Rules
To test the performance of these order learning rules, we use the German cities data set [6], [7], consisting of the 83 largest-population German cities (those with more than 100,000 inhabitants), described on 9 cues that give some information about population size. Discrimination rate and validity of the cues are negatively correlated (r = -.47).
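Several of the order-learning rules from Section 2.1 can be sketched as in-place updates applied after each decision. This is a hedged illustration, not the authors' implementation: the function names are hypothetical, the tally swap rule is assumed to update the tally before comparing it with the neighbor, and the delta rule assumes inputs coded as -1, 0, or 1 with target +1 or -1 and learning rate 0.1 as stated in the text.

```python
def simple_swap(order, cue, correct):
    """Move cue up one position after a correct decision, down after an incorrect one."""
    i = order.index(cue)
    j = i - 1 if correct else i + 1
    if 0 <= j < len(order):
        order[i], order[j] = order[j], order[i]


def tally_swap(order, tallies, cue, correct):
    """Swap locally, but only if the (updated) tally justifies it."""
    tallies[cue] += 1 if correct else -1      # assumption: update tally first
    i = order.index(cue)
    if correct and i > 0 and tallies[cue] >= tallies[order[i - 1]]:
        order[i - 1], order[i] = order[i], order[i - 1]
    elif not correct and i < len(order) - 1 and tallies[cue] <= tallies[order[i + 1]]:
        order[i + 1], order[i] = order[i], order[i + 1]


def take_the_last(order, cue, correct, only_if_correct=False):
    """TTL: move the discriminating cue to the front; TTL-correct if only_if_correct."""
    if only_if_correct and not correct:
        return
    order.remove(cue)
    order.insert(0, cue)


def delta_rule(order, weights, inputs, target, lr=0.1):
    """Update weights by lr * (t - in_i * w_i) * in_i, then sort by decreasing weight."""
    for cue, x in inputs.items():             # x is -1, 0, or 1
        weights[cue] += lr * (target - x * weights[cue]) * x
    order.sort(key=lambda c: -weights[c])     # stable sort keeps ties in place


cue_order = [0, 1, 2, 3]
simple_swap(cue_order, 2, correct=True)       # cue 2 moves up one position
take_the_last(cue_order, 3, correct=True)     # cue 3 moves to the front
print(cue_order)
```

The validity and tally rules are omitted here; both amount to keeping per-cue counts and re-sorting the whole order after each decision.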
We present results averaged over 10,000 learning trials for each rule, starting from random initial cue orders. Each trial consisted of 100 decisions between randomly selected decision pairs. For each decision, the current cue order was used to look up cues until a discriminating cue was found, which was used to make the decision (employing a one-reason or lexicographic decision strategy). After each decision, the cue order was updated using the particular order-learning rule. We start by considering the cumulative accuracies (i.e., online or amortized performance [12]) of the rules, defined as the total percentage of correct decisions made so far at any point in the learning process. The contrasting measure of offline accuracy (how well the current learned cue order would do if it were applied to the entire test set) will be subsequently reported (see Figure 1). For all but the move-to-front rules, cumulative accuracies soon rise above that of the Minimalist heuristic (proportion correct = .70), which looks up cues in random order and thus serves as a lower benchmark. However, at least throughout the first 100 decisions, cumulative accuracies stay well below the (offline) accuracy that would be achieved by using TTB for all decisions (proportion correct = .74), looking up cues in the true order of their ecological validities. Except for the move-to-front rules, whose cumulative accuracies are very close to Minimalist (mean proportion correct in 100 decisions: TTL: .701; TTL-correct: .704), all learning rules perform on a surprisingly similar level, with less than one percentage point difference in favor of the most demanding rule (i.e., delta rule: .719) compared to the least (i.e., simple swap: .711; for comparison: tally swap: .715; tally: .716; validity learning rule: .719).
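The trial procedure described above (random initial order, a fixed number of decisions between random pairs, one-reason lookup, then an order update) can be sketched as follows. This is a minimal illustration on synthetic data, not the German cities data set: the names `run_trial` and `simple_swap`, the random toy environment, and the choice to guess when no cue discriminates are all assumptions, and the printed numbers carry no empirical meaning.

```python
import random


def simple_swap(order, cue, correct):
    """Move cue up one position after a correct decision, down otherwise."""
    i = order.index(cue)
    j = i - 1 if correct else i + 1
    if 0 <= j < len(order):
        order[i], order[j] = order[j], order[i]


def run_trial(objects, criterion, update, n_decisions=100, seed=0):
    """Run one learning trial; return (cumulative accuracy, final cue order)."""
    rng = random.Random(seed)
    order = list(range(len(objects[0])))
    rng.shuffle(order)                         # random initial cue order
    n_correct = 0
    for _ in range(n_decisions):
        a, b = rng.sample(range(len(objects)), 2)
        # look up cues in the current order until one discriminates
        cue = next((c for c in order if objects[a][c] != objects[b][c]), None)
        if cue is None:
            choice = rng.choice((a, b))        # assumption: guess if nothing discriminates
        else:
            choice = a if objects[a][cue] > objects[b][cue] else b
        correct = criterion[choice] == max(criterion[a], criterion[b])
        n_correct += correct
        if cue is not None:
            update(order, cue, correct)        # learn from feedback
    return n_correct / n_decisions, order


# Synthetic toy environment: 20 objects, 4 binary cues, distinct criterion values.
toy_rng = random.Random(1)
toy_objects = [tuple(toy_rng.randint(0, 1) for _ in range(4)) for _ in range(20)]
toy_criterion = toy_rng.sample(range(1000), 20)
accuracy, learned_order = run_trial(toy_objects, toy_criterion, simple_swap)
print(accuracy, learned_order)
```

Averaging `run_trial` over many seeds would give a cumulative (online) accuracy in the sense used in the text; offline accuracy would instead apply the final order to every pair in the data set.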
Offline accuracies are slightly higher, again with the exception of the move-to-front rules (TTL: .699; TTL-correct: .702; simple swap: .714; tally swap: .719; tally: .721; validity learning rule: .724; delta rule: .725; see Figure 1). In longer runs (10,000 decisions) the validity learning rule is able to converge on TTB’s accuracy, but the tally rule’s performance changes little (to .73).
Figure 1: Mean offline accuracy of order learning rules
Figure 2: Mean offline frugality of order learning rules
All learning rules are, however, more frugal than TTB, and even more frugal than Minimalist, both in terms of online as well as offline frugality. Let us focus on their offline frugality (see Figure 2): On average, the rules look up fewer cues than Minimalist before reaching a decision. There is little difference between the associative rule, the tallying rules and the swapping rules (mean number of cues looked up in 100 decisions: delta rule: 3.20; validity learning rule: 3.21; tally: 3.01; tally swap: 3.04; simple swap: 3.13). Most frugal are the two move-to-front rules (TTL-correct: 2.87; TTL: 2.83). Consistent with this finding, all of the learning rules lead to cue orders that show positive correlations with the discrimination rate cue order (reaching the following values after 100 decisions: validity learning rule: r = .18; tally: r = .29; tally swap: r = .24; simple swap: r = .18; TTL-correct: r = .48; TTL: r = .56). This means that cues that often lead to discriminations are more likely to end up in the first positions of the order. This is especially true for the move-to-front rules. In contrast, the cue orders resulting from all learning rules but the validity learning rule do not correlate or correlate negatively with the validity cue order, and even the correlations of the cue orders resulting from the validity learning rule after 100 decisions only reach an average r = .12.
But why would the discrimination rates of cues exert more of a pull on cue order than validity, even when the validity learning rule is applied? As mentioned earlier, this is what we would expect for the move-to-front rules, but it was unexpected for the other rules. Part of the explanation comes from the fact that in the city data set we used for the simulations, validity and discrimination rate of cues are negatively correlated. Having a low discrimination rate means that a cue has little chance to be used and hence to demonstrate its high validity. Whatever learning rule is used, if such a cue is displaced downward to the lower end of the order by other cues, it may have few chances to escape to the higher ranks where it belongs. The problem is that when a decision pair is finally encountered for which that cue would lead to a correct decision, it is unlikely to be checked because other, more discriminating although less valid, cues are looked up before and already bring about a decision. Thus, because one-reason decision making is intertwined with the learning mechanism and so influences which cues can be learned about, what mainly makes a cue come early in the order is producing a high number of correct decisions and not so much a high ratio of correct discriminations to total discriminations regardless of base rates. This argument indicates that performance may differ in environments where cue validities and discrimination rates correlate positively. We tested the learning rules on one such data set (r=.52) of mammal species life expectancies, predicted from 9 cues. It also differs from the cities environment with a greater difference between TTB’s and Minimalist’s performance (6.5 vs. 4 percentage points). In terms of offline accuracy, the validity learning rule now indeed more closely approaches TTB’s accuracy after 100 decisions (.773 vs. 
.782). The tally rule, in contrast, behaves very much as in the cities environment, reaching an accuracy of .752, halfway between TTB and Minimalist (accuracy = .716). Thus only some learning rules can profit from the positive correlation.
4 Discussion
Most of the simpler cue order learning rules we have proposed do not fall far behind a validity learning rule in accuracy, and although the move-to-front rules cannot beat the accuracy achieved if cues were selected randomly, they compensate for this failure by being highly frugal. Interestingly, the rules that do achieve higher accuracy than Minimalist also beat random cue selection in terms of frugality. On the other hand, all rules, even the delta rule and the validity learning rule, stay below TTB’s accuracy across a relatively high number of decisions. But often it is necessary to make good decisions without much experience. Therefore, learning rules should be preferred that quickly lead to orders with good performance. The relatively complex rules with relatively high memory requirement, i.e., the delta and the validity learning rule, but also the tally learning rule, rise more quickly in accuracy compared to the rules with lower requirements. The tally rule in particular thus represents a good compromise between cost, correctness and psychological plausibility considerations. Remember that the rules based on tallies assume full memory of all correct minus incorrect decisions made by a cue so far. But this does not make the rule implausible, at least from a psychological perspective, even though computer scientists were reluctant to adopt such counting approaches because of their extra memory requirements. There is considerable evidence that people are actually very good at remembering the frequencies of events.
Hasher and Zacks [13] conclude from a wide range of studies that frequencies are encoded in an automatic way, implying that people are sensitive to this information without intention or special effort. Estes [14] pointed out the role frequencies play in decision making as a shortcut for probabilities. Further, the tally rule and the tally swap rule are comparatively simple, not having to keep track of base rates or perform divisions as does the validity rule. From the other side, the simple swap and move to front rules may not be much simpler, because storing a cue order may be about as demanding as storing a set of tallies. We have run experiments (reported elsewhere) in which indeed the tally swap rule best accounts for people’s actual processes of ordering cues. Our goal in this paper was to explore how well simple cue-ordering rules could work in conjunction with lexicographic decision strategies. This is important because it is necessary to take into account the set-up costs of a heuristic in addition to its application costs when considering the mechanism’s overall simplicity. As the example of the validity search order of TTB shows, what is easy to apply may not necessarily be so easy to set up. But simple rules can also be at work in the construction of a heuristic’s building blocks. We have proposed such rules for the construction of one building block, the search order. Simple learning rules inspired by research in computer science can enable a one-reason decision heuristic to perform only slightly worse than if it had full knowledge of cue validities from the very beginning. 
Giving up the assumption of full a priori knowledge for the slight decrease in accuracy seems like a reasonable bargain: Through the addition of learning rules, one-reason decision heuristics might lose some of their appeal to decision theorists who were surprised by the performance of such simple mechanisms compared to more complex algorithms, but they gain psychological plausibility and so become more attractive as explanations for human decision behavior.
References
[1] Fishburn, P.C. (1974). Lexicographic orders, utilities and decision rules: A survey. Management Science, 20, 1442-1471.
[2] Payne, J.W., Bettman, J.R., & Johnson, E.J. (1993). The adaptive decision maker. New York: Cambridge University Press.
[3] Bröder, A. (2000). Assessing the empirical validity of the “Take-The-Best” heuristic as a model of human probabilistic inference. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26 (5), 1332-1346.
[4] Bröder, A. (2003). Decision making with the “adaptive toolbox”: Influence of environmental structure, intelligence, and working memory load. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 611-625.
[5] Gigerenzer, G., Todd, P.M., & The ABC Research Group (1999). Simple heuristics that make us smart. New York: Oxford University Press.
[6] Gigerenzer, G., & Goldstein, D.G. (1996). Reasoning the fast and frugal way: Models of bounded rationality. Psychological Review, 103 (4), 650-669.
[7] Gigerenzer, G., & Goldstein, D.G. (1999). Betting on one good reason: The Take The Best Heuristic. In G. Gigerenzer, P.M. Todd & The ABC Research Group, Simple heuristics that make us smart. New York: Oxford University Press.
[8] Czerlinski, J., Gigerenzer, G., & Goldstein, D.G. (1999). How good are simple heuristics? In G. Gigerenzer, P.M. Todd & The ABC Research Group, Simple heuristics that make us smart. New York: Oxford University Press.
[9] Newell, B.R., & Shanks, D.R. (2003). Take the best or look at the rest? Factors influencing ‘one-reason’ decision making. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 53-65.
[10] Juslin, P., & Persson, M. (2002). PROBabilities from EXemplars (PROBEX): a “lazy” algorithm for probabilistic inference from generic knowledge. Cognitive Science, 26, 563-607.
[11] Rivest, R. (1976). On self-organizing sequential search heuristics. Communications of the ACM, 19(2), 63-67.
[12] Bentley, J.L., & McGeoch, C.C. (1985). Amortized analyses of self-organizing sequential search heuristics. Communications of the ACM, 28(4), 404-411.
[13] Hasher, L., & Zacks, R.T. (1984). Automatic processing of fundamental information: The case of frequency of occurrence. American Psychologist, 39, 1372-1388.
[14] Estes, W.K. (1976). The cognitive side of probability learning. Psychological Review, 83, 37-64.

3 0.48247352 131 nips-2004-Non-Local Manifold Tangent Learning

Author: Yoshua Bengio, Martin Monperrus

Abstract: We claim and present arguments to the effect that a large class of manifold learning algorithms that are essentially local and can be framed as kernel learning algorithms will suffer from the curse of dimensionality, at the dimension of the true underlying manifold. This observation suggests exploring non-local manifold learning algorithms which attempt to discover shared structure in the tangent planes at different positions. A criterion for such an algorithm is proposed and experiments estimating a tangent plane prediction function are presented, showing its advantages with respect to local manifold learning algorithms: it is able to generalize very far from training data (on learning handwritten character image rotations), where a local non-parametric method fails.

4 0.46065557 20 nips-2004-An Auditory Paradigm for Brain-Computer Interfaces

Author: N. J. Hill, Thomas N. Lal, Karin Bierig, Niels Birbaumer, Bernhard Schölkopf

Abstract: Motivated by the particular problems involved in communicating with “locked-in” paralysed patients, we aim to develop a brain-computer interface that uses auditory stimuli. We describe a paradigm that allows a user to make a binary decision by focusing attention on one of two concurrent auditory stimulus sequences. Using Support Vector Machine classification and Recursive Channel Elimination on the independent components of averaged event-related potentials, we show that an untrained user’s EEG data can be classified with an encouragingly high level of accuracy. This suggests that it is possible for users to modulate EEG signals in a single trial by the conscious direction of attention, well enough to be useful in BCI.

5 0.41163856 56 nips-2004-Dynamic Bayesian Networks for Brain-Computer Interfaces

Author: Pradeep Shenoy, Rajesh P. Rao

Abstract: We describe an approach to building brain-computer interfaces (BCI) based on graphical models for probabilistic inference and learning. We show how a dynamic Bayesian network (DBN) can be used to infer probability distributions over brain- and body-states during planning and execution of actions. The DBN is learned directly from observed data and allows measured signals such as EEG and EMG to be interpreted in terms of internal states such as intent to move, preparatory activity, and movement execution. Unlike traditional classification-based approaches to BCI, the proposed approach (1) allows continuous tracking and prediction of internal states over time, and (2) generates control signals based on an entire probability distribution over states rather than binary yes/no decisions. We present preliminary results of brain- and body-state estimation using simultaneous EEG and EMG signals recorded during a self-paced left/right hand movement task.

6 0.39453146 119 nips-2004-Mistake Bounds for Maximum Entropy Discrimination

7 0.38779643 69 nips-2004-Fast Rates to Bayes for Kernel Machines

8 0.3871727 204 nips-2004-Variational Minimax Estimation of Discrete Distributions under KL Loss

9 0.38665786 72 nips-2004-Generalization Error and Algorithmic Convergence of Median Boosting

10 0.38629341 45 nips-2004-Confidence Intervals for the Area Under the ROC Curve

11 0.38570485 86 nips-2004-Instance-Specific Bayesian Model Averaging for Classification

12 0.38535503 174 nips-2004-Spike Sorting: Bayesian Clustering of Non-Stationary Data

13 0.38453612 206 nips-2004-Worst-Case Analysis of Selective Sampling for Linear-Threshold Algorithms

14 0.38372356 102 nips-2004-Learning first-order Markov models for control

15 0.38367712 207 nips-2004-ℓ₀-norm Minimization for Basis Selection

16 0.38364607 189 nips-2004-The Power of Selective Memory: Self-Bounded Learning of Prediction Suffix Trees

17 0.38260049 77 nips-2004-Hierarchical Clustering of a Mixture Model

18 0.38158238 3 nips-2004-A Feature Selection Algorithm Based on the Global Minimization of a Generalization Error Bound

19 0.38152018 103 nips-2004-Limits of Spectral Clustering

20 0.38151973 64 nips-2004-Experts in a Markov Decision Process