NIPS 2003, Paper 188
Title: Training fMRI Classifiers to Detect Cognitive States across Multiple Human Subjects
Authors: Xuerui Wang, Rebecca Hutchinson, Tom M. Mitchell
Abstract: We consider learning to classify cognitive states of human subjects, based on their brain activity observed via functional Magnetic Resonance Imaging (fMRI). This problem is important because such classifiers constitute “virtual sensors” of hidden cognitive states, which may be useful in cognitive science research and clinical applications. In recent work, Mitchell, et al. [6,7,9] have demonstrated the feasibility of training such classifiers for individual human subjects (e.g., to distinguish whether the subject is reading an ambiguous or unambiguous sentence, or whether they are reading a noun or a verb). Here we extend that line of research, exploring how to train classifiers that can be applied across multiple human subjects, including subjects who were not involved in training the classifier. We describe the design of several machine learning approaches to training multiple-subject classifiers, and report experimental results demonstrating the success of these methods in learning cross-subject classifiers for two different fMRI data sets.
1 Introduction

The advent of functional Magnetic Resonance Imaging (fMRI) has made it possible to safely, non-invasively observe correlates of neural activity across the entire human brain at high spatial resolution. A typical fMRI session can produce a three-dimensional image of brain activation once per second, with a spatial resolution of a few millimeters, yielding tens of millions of individual fMRI observations over the course of a twenty-minute session. This fMRI technology holds the potential to revolutionize studies of human cognitive processing, provided we can develop appropriate data analysis methods.

Researchers have now employed fMRI to conduct hundreds of studies that identify which regions of the brain are activated on average when a human performs a particular cognitive task. Typical research publications describe summary statistics of brain activity in various locations, calculated by averaging together fMRI observations collected over multiple time intervals during which the subject responds to repeated stimuli of a particular type. Our interest here is in a different problem: training classifiers to automatically decode the subject’s cognitive state at a single instant or interval in time. If we can reliably train such classifiers, we may be able to use them as “virtual sensors” of hidden cognitive states, to observe previously hidden cognitive processes in the brain.

Whereas the earlier work of Mitchell et al. focussed primarily on training a different classifier for each human subject, our focus in this paper is on training a single classifier that can be used across multiple human subjects, including humans not involved in the training process. This is challenging because different brains have substantially different sizes and shapes, and because different people may generate different brain activation given the same cognitive state. Below we briefly survey related work, describe a range of machine learning approaches to this problem, and present experimental results showing statistically significant cross-subject classifier accuracies for two different fMRI studies.
2 Related Work

Mitchell et al. [6,7,9] describe methods for training classifiers of cognitive states, focussing primarily on training subject-specific classifiers. More specifically, they train classifiers that distinguish among a set of predefined cognitive states, based on a single fMRI image or a fixed window of fMRI images collected relative to the presentation of a particular stimulus. They used several different classifiers, and report that dimensionality reduction methods are essential given the high-dimensional, sparse training data. Wagner et al. [11] report that they have been able to predict whether a verbal experience will be remembered later, based on the magnitude of activity within certain parts of left prefrontal and temporal cortices during that experience. Haxby et al. [2] show that different patterns of fMRI activity are generated when a subject views a photograph of a face versus a house, etc. Work on brain-computer interfaces (e.g., [8]) also seeks to decode observed brain activity (often EEG or direct neural recordings, rather than fMRI), typically for the purpose of controlling external devices.
The learning task is to train classifiers of the form

    ⟨I1, …, In⟩ → CognitiveState

where I1, …, In is a sequence of n fMRI images collected during a contiguous time interval and CognitiveState is the set of cognitive states to be discriminated. We explore a number of classifier training methods, including:
• Gaussian Naive Bayes (GNB)
• Support Vector Machines (SVM)
Classifiers were evaluated using a “leave one subject out” cross-validation procedure, in which each of the m human subjects was used as a test subject while training on the remaining m−1 subjects, and the mean accuracy over these held-out subjects was calculated.
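The GNB classifier and the leave-one-subject-out evaluation can be illustrated with a minimal numpy sketch. This is not the authors' code: the data are synthetic stand-ins for extracted fMRI features, and all function names and parameters (feature count, subject count, class shift) are our own assumptions.

```python
import numpy as np

def gnb_fit(X, y):
    """Gaussian Naive Bayes: per-class feature means/variances plus class priors."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    vars_ = np.array([X[y == c].var(axis=0) + 1e-6 for c in classes])  # variance smoothing
    priors = np.array([np.mean(y == c) for c in classes])
    return classes, means, vars_, priors

def gnb_predict(model, X):
    classes, means, vars_, priors = model
    # log P(c) + sum_j log N(x_j; mu_cj, var_cj), evaluated for every class at once
    ll = -0.5 * (((X[:, None, :] - means) ** 2) / vars_ + np.log(2 * np.pi * vars_)).sum(axis=2)
    return classes[np.argmax(ll + np.log(priors), axis=1)]

def leave_one_subject_out(X, y, subject):
    """Train on m-1 subjects, test on the held-out subject; return mean held-out accuracy."""
    accs = []
    for s in np.unique(subject):
        train, test = subject != s, subject == s
        model = gnb_fit(X[train], y[train])
        accs.append(np.mean(gnb_predict(model, X[test]) == y[test]))
    return float(np.mean(accs))

# Synthetic stand-in: 6 subjects, 40 trials each, 20 features,
# with a class-dependent mean shift shared across subjects.
rng = np.random.default_rng(0)
subject = np.repeat(np.arange(6), 40)
y = np.tile(np.array([0, 1]).repeat(20), 6)
X = rng.normal(size=(240, 20)) + y[:, None] * 1.0
acc = leave_one_subject_out(X, y, subject)
```

Because the synthetic class signal is shared across "subjects", the held-out accuracy is high; real cross-subject fMRI data is far noisier.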
We explored a variety of approaches to reducing the dimensionality of the input feature vector, including methods that select a subset of available features, methods that replace multiple feature values by their mean, and methods that combine both. In the latter two cases, we take means over values found within anatomically defined brain regions (e.g., dorsolateral prefrontal cortex), which are referred to as Regions of Interest, or ROIs. We considered the following feature extraction methods:
• Average. For each ROI, calculate the mean activity over all voxels in the ROI.
• ActiveAvg(n). For each ROI, select the n most active voxels, then calculate the mean of their values. Here the “most active” voxels are those whose activity while performing the task varies the most from their activity when the subject is at rest (see [7] for details).
• Active(n). Select the n most active voxels over the entire brain.
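The three feature extraction methods can be sketched in numpy as follows. The "activity vs. rest" score below is a crude proxy (the paper's precise definition is in [7]), and the toy data, ROI layout, and function names are our own assumptions.

```python
import numpy as np

def average(voxels, roi_labels):
    """Average: mean activity over all voxels in each ROI."""
    rois = np.unique(roi_labels)
    return np.array([voxels[:, roi_labels == r].mean(axis=1) for r in rois]).T

def active_avg(voxels, roi_labels, rest, n):
    """ActiveAvg(n): per ROI, mean of the n voxels deviating most from rest activity."""
    rois = np.unique(roi_labels)
    score = np.abs(voxels.mean(axis=0) - rest.mean(axis=0))  # crude task-vs-rest proxy
    feats = []
    for r in rois:
        idx = np.where(roi_labels == r)[0]
        top = idx[np.argsort(score[idx])[-n:]]        # n most "active" voxels in this ROI
        feats.append(voxels[:, top].mean(axis=1))
    return np.array(feats).T

def active(voxels, rest, n):
    """Active(n): keep the n most active voxels over the entire brain, individually."""
    score = np.abs(voxels.mean(axis=0) - rest.mean(axis=0))
    top = np.argsort(score)[-n:]
    return voxels[:, top]

# Toy data: 10 trials x 100 voxels, 4 ROIs of 25 voxels each, plus matched rest scans.
rng = np.random.default_rng(1)
voxels = rng.normal(size=(10, 100))
rest = rng.normal(size=(10, 100))
roi_labels = np.repeat(np.arange(4), 25)
f_avg = average(voxels, roi_labels)                # one feature per ROI
f_aavg = active_avg(voxels, roi_labels, rest, 5)   # one feature per ROI
f_act = active(voxels, rest, 20)                   # 20 voxel-level features
```

Note how Average and ActiveAvg(n) collapse each ROI to a single feature per timepoint, while Active(n) retains individual voxels.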
3 Registering Data from Multiple Subjects

Given the different sizes and shapes of different brains, it is not possible to directly map the voxels in one brain to those in another. We considered two different methods for producing representations of fMRI data for use across multiple subjects:
• ROI Mapping. Abstract the voxel data in each brain using the Average or ActiveAvg(n) feature extraction method described above. Because each brain contains the same set of anatomically defined ROIs, we can use the resulting representation of average activity per ROI as a canonical representation across subjects.
• Talairach coordinates. The coordinate system of each brain is transformed (geometrically morphed) into the coordinate system of a standard brain (known as the Talairach-Tournoux coordinate system [10]). After this transformation, each brain has the same shape and size, though the transformation is usually imperfect.

ROI Mapping results in just one feature per ROI (we work with at most 35 ROIs per brain) at each timepoint, whereas Talairach coordinates retain the voxel-level resolution (on the order of 15,000 voxels per brain). ROI Mapping reduces noise by averaging voxel activations, whereas the Talairach transformation effectively introduces new noise due to imperfections in the morphing. Notice that both of these transformations require background knowledge about brain anatomy in order to identify anatomical landmarks or ROIs.
4 Case Studies

This section describes two fMRI case studies used for training classifiers (detailed in [7]).

4.1 Sentence versus Picture Study

In this fMRI study [3], thirteen normal subjects performed a sequence of trials. During each trial they were first shown a sentence and a simple picture, then asked whether the sentence correctly described the picture. We used this data set to explore the feasibility of training classifiers to distinguish whether the subject is examining a sentence or a picture during a particular time interval. In half of the trials the picture was presented first, followed by the sentence; we refer to these trials as the PS data set. In the remaining trials the sentence was presented first, followed by the picture; we call these the SP data set. The learning task we consider here is to train a classifier to determine, given a particular 16-image interval of fMRI data, whether the subject was viewing a sentence or a picture during this interval.
4.2 Syntactic Ambiguity Study

In this fMRI study [4], subjects were presented with ambiguous and unambiguous sentences, and were asked to respond to a yes-no question about the content of each sentence. The questions were designed to ensure that the subject was in fact processing the sentence. Five normal subjects participated in this study, which we will refer to as the SA data set. We are interested here in learning a classifier that takes as input an interval of fMRI activity and determines whether the subject was reading an unambiguous or ambiguous sentence. An example ambiguous sentence is “The experienced soldiers warned about the dangers conducted the midnight raid.” An example unambiguous sentence is “The experienced soldiers spoke about the dangers before the midnight raid.” The classifier input is a window of fMRI images, where I1 is the image captured at the time when the sentence is first presented to the subject.
5 Experimental Results

The primary goal of this work is to determine whether and how it is possible to train classifiers of cognitive states across multiple human subjects. We experimented using data from the two case studies described above, measuring the accuracy of classifiers trained for single subjects, as well as those trained for multiple subjects. Note that we might expect the multiple-subject classification accuracies to be lower due to differences among subjects, or higher due to the larger number of training examples available. Table 1 displays the lowest accuracies that are statistically significant at the 95% confidence level, where the expected accuracy due to chance is 0.5. We do not report a confidence interval individually for each accuracy because they are very similar.

Table 1: The lowest accuracies that are significantly better than chance at the 95% level. [Table entries not preserved in this version.]
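The notion of a "lowest significant accuracy" can be approximated with a one-sided binomial test against 50% chance. This is a sketch, not the paper's exact confidence-interval computation, and the test-set size n = 40 below is an arbitrary illustration rather than a value from the studies.

```python
from math import comb

def binom_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more correct by guessing."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def lowest_significant_accuracy(n, alpha=0.05, p=0.5):
    """Smallest accuracy k/n whose one-sided binomial p-value is at most alpha."""
    for k in range(n + 1):
        if binom_tail(n, k, p) <= alpha:
            return k / n
    return 1.0

# With 40 two-class test examples, accuracies at or above this threshold
# are unlikely (p <= 0.05) to arise by guessing at 50%.
thresh = lowest_significant_accuracy(40)
```

Larger test sets lower the threshold toward 0.5, which is why significance thresholds depend on the number of held-out examples.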
Table 2 shows the classifier accuracies for the Sentence versus Picture study, when training across subjects and testing on the subject withheld from the training set. For comparison, it also shows (in parentheses) the average accuracy achieved by classifiers trained and tested on single subjects. All results are highly significant compared to the 50% accuracy expected by chance, demonstrating convincingly the feasibility of training classifiers to distinguish cognitive states in subjects beyond the training set. In fact, the accuracy achieved on the left-out subject by the multiple-subject classifiers is often very close to the average accuracy of the single-subject classifiers, and in several cases it is significantly better. This surprisingly positive result indicates that the accuracy of the multiple-subject classifier, when tested on new subjects outside the training set, is comparable to the average accuracy achieved when training and testing using data from a single subject. Presumably this can be explained by the fact that it is trained using an order of magnitude more training examples, from twelve subjects rather than one. The increase in training set size apparently compensates for the variability among subjects. A second trend apparent in Table 2 is that the accuracies on the SP or PS data sets alone are better than the accuracies on their union (SP+PS).
4 They include the pars opercularis of the inferior frontal gyrus, the pars triangularis of the inferior frontal gyrus, Wernicke’s area, and the superior temporal gyrus.
5 Under cross validation we learn m classifiers, and the accuracy we report is the mean accuracy of these classifiers. The size of the confidence interval we compute is an upper bound on the size of each individual interval.

Table 2: Multiple-subject accuracies in the Sentence versus Picture study (ROI mapping). Numbers in parentheses are the corresponding mean accuracies of single-subject classifiers. [Table entries not preserved in this version.]

Table 3: Multiple-subject accuracies in the Syntactic Ambiguity study (ROI mapping). Numbers in parentheses are the corresponding mean accuracies of single-subject classifiers. To choose n in ActiveAvg(n), we explored all even numbers less than 50, reporting the best. [Table entries not preserved in this version.]

Classifier accuracies for the Syntactic Ambiguity study are shown in Table 3. The accuracies for both single-subject and multiple-subject classifiers are lower than in the first study, perhaps due in part to the smaller number of subjects and training examples. Although we cannot draw strong conclusions from the results of this study, it provides modest additional support for the feasibility of training multiple-subject classifiers using ROI mapping. Note that the accuracies of the multiple-subject classifiers are again comparable to those of single-subject classifiers.
5.2 Talairach Coordinates

Next we explore the Talairach coordinates method for merging data from multiple subjects. One difficulty in utilizing the Talairach transformation here is that slightly different regions of the brain were scanned for different subjects. Figure 1 shows the portions of the brain that were scanned for two of the subjects, along with the intersection of these regions across all five subjects.6

6 We experienced technical difficulties in applying the Talairach transformation software to the Sentence versus Picture study (see [3] for details).

Figure 1: The two leftmost panels show in color the scanned portion of the brain for two subjects (Syntactic Ambiguity study) in Talairach space, in sagittal view. The rightmost panel shows the intersection of these scanned bands across all five subjects.
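Restricting analysis to voxels scanned in every subject, as in the intersection of Figure 1, amounts to AND-ing per-subject boolean masks once all brains share the Talairach coordinate frame. A toy numpy sketch with made-up slab-shaped masks (the grid size and slab offsets are arbitrary):

```python
import numpy as np

def common_mask(masks):
    """Voxels scanned in every subject: logical AND of per-subject boolean masks."""
    out = masks[0].copy()
    for m in masks[1:]:
        out &= m
    return out

# Toy 3D masks for five subjects: each "scanned" a slightly shifted slab of slices,
# mimicking the partially overlapping scanned bands in Figure 1.
shape = (8, 8, 8)
masks = []
for s in range(5):
    m = np.zeros(shape, dtype=bool)
    m[:, :, s // 2 : 6 + s // 2] = True  # 6-slice slab, shifted per subject
    masks.append(m)
mask = common_mask(masks)
```

Only voxels inside the common mask are comparable across subjects, which is why the intersection region in Figure 1 is narrower than any single subject's scanned band.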
The results of training multiple-subject classifiers based on the Talairach coordinates method are shown in Table 4. When using the Talairach method, we found the most effective feature extraction approach was the Active(n) feature selection approach, which chooses the n most active voxels from across the brain. Note that it is not possible to use this feature selection approach with the ROI Mapping method, because the individual voxels from different brains can only be aligned after performing the Talairach transformation.

Table 4: Multiple-subject accuracies in the Syntactic Ambiguity study (Talairach coordinates). Numbers in parentheses are the mean accuracies of single-subject classifiers. For n in Active(n), we explored all even numbers less than 200, reporting the best. [Table entries not preserved in this version.]
Summary and Conclusions

The primary goal of this research was to determine whether it is feasible to use machine learning methods to decode mental states across multiple human subjects. Two methods were explored to train multiple-subject classifiers based on fMRI data. ROI Mapping abstracts fMRI data by using the mean fMRI activity in each of several anatomically defined ROIs, representing different brains in terms of a common set of ROIs. The transformation to Talairach coordinates morphs brains into a standard coordinate frame, retaining the approximate spatial resolution of the original data. Our experiments demonstrate that it is possible to train classifiers to distinguish cognitive states, e.g., whether the subject was viewing a picture or a sentence describing a picture, and to apply these successfully to subjects outside the training set. In many cases, the classification accuracy for subjects outside the training set equalled or exceeded the accuracy achieved by training on data from just the single subject. A second research direction is to develop learning methods that take advantage of data from multiple studies, in contrast to the single-study efforts described here.
simIndex simValue paperId paperTitle
same-paper 1 0.99999988 188 nips-2003-Training fMRI Classifiers to Detect Cognitive States across Multiple Human Subjects
Author: Xuerui Wang, Rebecca Hutchinson, Tom M. Mitchell
Abstract: We consider learning to classify cognitive states of human subjects, based on their brain activity observed via functional Magnetic Resonance Imaging (fMRI). This problem is important because such classifiers constitute “virtual sensors” of hidden cognitive states, which may be useful in cognitive science research and clinical applications. In recent work, Mitchell, et al. [6,7,9] have demonstrated the feasibility of training such classifiers for individual human subjects (e.g., to distinguish whether the subject is reading an ambiguous or unambiguous sentence, or whether they are reading a noun or a verb). Here we extend that line of research, exploring how to train classifiers that can be applied across multiple human subjects, including subjects who were not involved in training the classifier. We describe the design of several machine learning approaches to training multiple-subject classifiers, and report experimental results demonstrating the success of these methods in learning cross-subject classifiers for two different fMRI data sets. 1
2 0.17053135 95 nips-2003-Insights from Machine Learning Applied to Human Visual Classification
Author: Felix A. Wichmann, Arnulf B. Graf
Abstract: We attempt to understand visual classification in humans using both psychophysical and machine learning techniques. Frontal views of human faces were used for a gender classification task. Human subjects classified the faces and their gender judgment, reaction time and confidence rating were recorded. Several hyperplane learning algorithms were used on the same classification task using the Principal Components of the texture and shape representation of the faces. The classification performance of the learning algorithms was estimated using the face database with the true gender of the faces as labels, and also with the gender estimated by the subjects. We then correlated the human responses to the distance of the stimuli to the separating hyperplane of the learning algorithms. Our results suggest that human classification can be modeled by some hyperplane algorithms in the feature space we used. For classification, the brain needs more processing for stimuli close to that hyperplane than for those further away. 1
3 0.11143167 161 nips-2003-Probabilistic Inference in Human Sensorimotor Processing
Author: Konrad P. Körding, Daniel M. Wolpert
Abstract: When we learn a new motor skill, we have to contend with both the variability inherent in our sensors and the task. The sensory uncertainty can be reduced by using information about the distribution of previously experienced tasks. Here we impose a distribution on a novel sensorimotor task and manipulate the variability of the sensory feedback. We show that subjects internally represent both the distribution of the task as well as their sensory uncertainty. Moreover, they combine these two sources of information in a way that is qualitatively predicted by optimal Bayesian processing. We further analyze if the subjects can represent multimodal distributions such as mixtures of Gaussians. The results show that the CNS employs probabilistic models during sensorimotor learning even when the priors are multimodal.
4 0.095124222 52 nips-2003-Different Cortico-Basal Ganglia Loops Specialize in Reward Prediction at Different Time Scales
Author: Saori C. Tanaka, Kenji Doya, Go Okada, Kazutaka Ueda, Yasumasa Okamoto, Shigeto Yamawaki
Abstract: To understand the brain mechanisms involved in reward prediction on different time scales, we developed a Markov decision task that requires prediction of both immediate and future rewards, and analyzed subjects’ brain activities using functional MRI. We estimated the time course of reward prediction and reward prediction error on different time scales from subjects' performance data, and used them as the explanatory variables for SPM analysis. We found topographic maps of different time scales in medial frontal cortex and striatum. The result suggests that different cortico-basal ganglia loops are specialized for reward prediction on different time scales. 1 Intro du ction In our daily life, we make decisions based on the prediction of rewards on different time scales; immediate and long-term effects of an action are often in conflict, and biased evaluation of immediate or future outcome can lead to pathetic behaviors. Lesions in the central serotonergic system result in impulsive behaviors in humans [1], and animals [2, 3], which can be attributed to deficits in reward prediction on a long time scale. Damages in the ventral part of medial frontal cortex (MFC) also cause deficits in decision-making that requires assessment of future outcomes [4-6]. A possible mechanism underlying these observations is that different brain areas are specialized for reward prediction on different time scales, and that the ascending serotonergic system activates those specialized for predictions in longer time scales [7]. The theoretical framework of temporal difference (TD) learning [8] successfully explains reward-predictive activities of the midbrain dopaminergic system as well as those of the cortex and the striatum [9-13]. In TD learning theory, the predicted amount of future reward starting from a state s(t) is formulated as the “value function” V(t) = E[r(t + 1) + γ r(t + 2) + γ 2r(t + 3) + …] (1) and learning is based on the TD error δ(t) = r(t) + γ V(t) – V(t - 1). 
(2) The ‘discount factor’ γ controls the time scale of prediction; while only the immediate reward r(t + 1) is considered with γ = 0, rewards in the longer future are taken into account with γ closer to 1. In order to test the above hypothesis [7], we developed a reinforcement learning task which requires a large value of discount factor for successful performance, and analyzed subjects’ brain activities using functional MRI. In addition to conventional block-design analysis, a novel model-based regression analysis revealed topographic representation of prediction time scale with in the cortico-basal ganglia loops. 2 2.1 Methods Markov Decision Task In the Markov decision task (Fig. 1), markers on the corners of a square present four states, and the subject selects one of two actions by pressing a button (a1 = left button, a2 = right button) (Fig. 1A). The action determines both the amount of reward and the movement of the marker (Fig. 1B). In the REGULAR condition, the next trial is started from the marker position at the end of the previous trial. Therefore, in order to maximize the reward acquired in a long run, the subject has to select an action by taking into account both the immediate reward and the future reward expected from the subsequent state. The optimal behavior is to receive small negative rewards at states s 2, s3, and s4 to obtain a large positive reward at state s1 (Fig. 1C). In the RANDOM condition, next trial is started from a random marker position so that the subject has to consider only immediate reward. Thus, the optimal behavior is to collect a larger reward at each state (Fig. 1D). In the baseline condition (NO condition), the reward is always zero. In order to learn the optimal behaviors, the discount factor γ has to be larger than 0.3425 in REGULAR condition, while it can be arbitrarily small in RANDOM condition. 
2.2 fMRI imaging Eighteen healthy, right-handed volunteers (13 males and 5 females), gave informed consent to take part in the study, with the approval of the ethics and safety committees of ATR and Hiroshima University. A 0 Time 1.0 2.0 2.5 3.0 100 C B +r 2 s2 s1 REGULAR condition s2 -r 1 -r 2 +r 1 s1 100 D RANDOM condition +r 2 s2 s1 -r 1 +r 1 -r 2 -r 1 s4 +r 2 4.0 (s) -r 1 s3 a1 a2 r1 = 20 10 yen r2 = 100 10 yen +r 1 -r 1 s4 -r 1 -r 1 s3 s4 -r 1 s3 Fig. 1. (A) Sequence of stimulus and response events in the Markov decision task. First, one of four squares representing present state turns green (0s). As the fixation point turns green (1s), the subject presses either the right or left button within 1 second. After 1s delay, the green square changes its position (2s), and then a reward for the current action is presented by a number (2.5s) and a bar graph showing cumulative reward during the block is updated (3.0s). One trial takes four seconds. Subjects performed five trials in the NO condition, 32 trials in the RANDOM condition, five trials in the NO condition, and 32 trials in the REGULAR condition in one block. They repeated four blocks; thus, the entire experiment consisted of 312 trials, taking about 20 minutes. (B) The rule of the reward and marker movement. (C) In the REGULAR condition, the optimal behavior is to receive small negative rewards –r 1 (-10, -20, or -30 yen) at states s2, s3, and s4 to obtain a large positive reward +r2 (90, 100, or 110 yen) at state s1. (D) In the RANDOM condition, the next trial is started from random state. Thus, the optimal behavior is to select a larger reward at each state. 
A 1.5-Tesla scanner (Marconi, MAGNEX ECLIPSE, Japan) was used to acquire both structural T1-weighted images (TR = 12 s, TE = 450 ms, flip angle = 20 deg, matrix = 256 × 256, FoV = 256 mm, thickness = 1 mm, slice gap = 0 mm ) and T2*-weighted echo planar images (TR = 4 s, TE = 55 msec, flip angle = 90 deg, 38 transverse slices, matrix = 64 × 64, FoV = 192 mm, thickness = 4 mm, slice gap = 0 mm, slice gap = 0 mm) with blood oxygen level-dependent (BOLD) contrast. 2.3 Data analysis The data were preprocessed and analyzed with SPM99 (Friston et al., 1995; Wellcome Department of Cognitive Neurology, London, UK). The first three volumes of images were discarded to avoid T1 equilibrium effects. The images were realigned to the first image as a reference, spatially normalized with respect to the Montreal Neurological Institute EPI template, and spatially smoothed with a Gaussian kernel (8 mm, full-width at half-maximum). A RANDOM condition action larger reward Fig. 2. The selected action of a representative single subject (solid line) and the group average ratio of selecting optimal action (dashed line) in (A) RANDOM and (B) REGULAR conditions. smaller reward 1 32 64 96 128 96 128 trial REGULAR condition B action optimal nonoptimal 1 32 64 trial Images of parameter estimates for the contrast of interest were created for each subject. These were then used for a second-level group analysis using a one-sample t-test across the subjects (random effects analysis). We conducted two types of analysis. One was block design analysis using three boxcar regressors convolved with a hemodynamic response function as the reference waveform for each condition (RANDOM, REGULAR, and NO). The other was multivariate regression analysis using explanatory variables, representing the time course of the reward prediction V(t) and reward prediction error δ(t) estimated from subjects’ performance data (described below), in addition to three regressors representing the condition of the block. 
2.4 Estimation of predicted reward V(t) and prediction error δ(t)

The time courses of reward prediction V(t) and reward prediction error δ(t) were estimated from each subject's performance data, i.e. state s(t), action a(t), and reward r(t), as follows. If the subject starts from a state s(t) and comes back to the same state after k steps, the expected cumulative reward V(t) should satisfy the consistency condition

V(t) = r(t+1) + γ r(t+2) + … + γ^(k−1) r(t+k) + γ^k V(t).   (3)

Thus, for each time t of the data file, we calculated the weighted sum of the rewards acquired until the subject returned to the same state and estimated the value function for that episode as

V̂(t) = [r(t+1) + γ r(t+2) + … + γ^(k−1) r(t+k)] / (1 − γ^k).   (4)

The estimate of the value function V(t) at time t was given by the average over all previous episodes starting from the same state as at time t:

V(t) = (1/L) Σ_{l=1}^{L} V̂(t_l),   (5)

where {t1, …, tL} are the time indices of visits to the same state as s(t), i.e. s(t1) = … = s(tL) = s(t). The TD error was given by the difference between the actual reward r(t) and the temporal difference of the value function V(t) according to equation (2). Assuming that different brain areas are involved in reward prediction on different time scales, we varied the discount factor γ over 0, 0.3, 0.6, 0.8, 0.9, and 0.99.

Fig. 3. (A) In the REGULAR vs. RANDOM comparison, significant activation was observed in the DLPFC ((x, y, z) = (46, 45, 9), peak t = 4.06; p < 0.001, uncorrected). (B) In the RANDOM vs. REGULAR comparison, significant activation was observed in the lateral OFC ((x, y, z) = (−32, 9, −21), peak t = 4.90; p < 0.001, uncorrected).

3 Results

3.1 Behavioral results

Figure 2 summarizes the learning performance of a representative single subject (solid line) and the group average (dashed line) during fMRI measurement. Fourteen subjects successfully learned to take larger immediate rewards in the RANDOM condition (Fig.
2A) and a large positive reward at s1 after small negative rewards at s2, s3, and s4 in the REGULAR condition (Fig. 2B).

3.2 Block-design analysis

In the REGULAR vs. RANDOM contrast, we observed significant activation in the dorsolateral prefrontal cortex (DLPFC) (Fig. 3A; p < 0.001, uncorrected). In the RANDOM vs. REGULAR contrast, we observed significant activation in the lateral orbitofrontal cortex (lOFC) (Fig. 3B; p < 0.001, uncorrected). The block-design analysis thus suggests differential involvement of neural pathways in reward prediction on long and short time scales. The RANDOM vs. REGULAR result is consistent with previous studies showing that the OFC is involved in reward prediction within a short delay and in reward outcome [14-20].

3.3 Regression analysis

We observed significant correlations with reward prediction V(t) in the MFC and DLPFC (all γ), the ventromedial insula (small γ), and the dorsal striatum, amygdala, hippocampus, and parahippocampal gyrus (large γ) (p < 0.001, uncorrected) (Fig. 4A). We also found significant correlations with reward prediction error δ(t) in the IPC, PMd, and cerebellum (all γ), the ventral striatum (small γ), and the lateral OFC (large γ) (p < 0.001, uncorrected) (Fig. 4B). As we changed the time-scale parameter γ of reward prediction, we found rostro-caudal maps of correlation with V(t) in the MFC with increasing γ.

Fig. 4. Voxels with a significant correlation (p < 0.001, uncorrected) with reward prediction V(t) and prediction error δ(t) are shown in different colors for different settings of the time-scale parameter (γ = 0 in red, γ = 0.3 in orange, γ = 0.6 in yellow, γ = 0.8 in green, γ = 0.9 in cyan, and γ = 0.99 in blue). Voxels correlated with two or more regressors are shown by a mosaic of colors. (A) Significant correlation with reward prediction V(t) was observed in the MFC, DLPFC, dorsal striatum, insula, and hippocampus. Note the anterior-ventral to posterior-dorsal gradient with increasing γ in the MFC.
(B) Significant correlation with reward prediction error δ(t) at γ = 0 was observed in the ventral striatum.

4 Discussion

In the MFC, the anterior and ventral parts were involved in reward prediction V(t) on shorter time scales (0 ≤ γ ≤ 0.6), whereas the posterior and dorsal parts were involved in reward prediction V(t) on longer time scales (0.6 ≤ γ ≤ 0.99). The ventral striatum was involved in reward prediction error δ(t) on the shortest time scale (γ = 0), while the dorsolateral striatum correlated with reward prediction V(t) on longer time scales (0.9 ≤ γ ≤ 0.99). These results are consistent with the topographic organization of fronto-striatal connections: the rostral part of the MFC projects to the ventral striatum, whereas the dorsal and posterior parts of the cingulate cortex project to the dorsolateral striatum [21]. In the MFC and the striatum, no significant difference in activity was observed in the block-design analysis, yet we did find graded maps of activity for different values of γ. A possible reason is that different parts of the MFC and the striatum are concurrently involved in reward prediction on different time scales, regardless of the task context. Activity in the DLPFC and lOFC, which showed significant differences in the block-design analysis (Fig. 3), may instead be regulated according to the demands of the task.

From these results, we propose the following mechanism of reward prediction on different time scales. Parallel cortico-basal ganglia loops are responsible for reward prediction on different time scales. The 'limbic loop' via the ventral striatum specializes in immediate reward prediction, whereas the 'cognitive and motor loop' via the dorsal striatum specializes in future reward prediction. Each loop learns to predict rewards on its specific time scale. To perform an optimal action on a given time scale, the output of the loop with the appropriate time scale is used for actual action selection.
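The value-estimation procedure of Section 2.4 (Eqs. 3–5), which produced the V(t) and δ(t) regressors analyzed above, can be sketched in a few lines. This is a hedged reconstruction from the text: the paper's exact equation (2) for the TD error is not reproduced here, so the standard TD(0) form δ(t) = r(t) + γV(t) − V(t−1) is assumed.

```python
import numpy as np

def estimate_values(states, rewards, gamma):
    """V-hat(t) per Eq. (4): discounted return until the first revisit of
    s(t), divided by (1 - gamma^k); V(t) per Eq. (5): the average of V-hat
    over all visits (up to t) to the same state."""
    T = len(states)
    v_hat = np.full(T, np.nan)
    for t in range(T):
        for k in range(1, T - t):
            if states[t + k] == states[t]:   # first return after k steps
                discounts = gamma ** np.arange(k)
                ret = float(np.dot(discounts, rewards[t + 1:t + k + 1]))
                v_hat[t] = ret / (1.0 - gamma ** k)
                break
    v = np.full(T, np.nan)
    for t in range(T):
        prev = [v_hat[u] for u in range(t + 1)
                if states[u] == states[t] and not np.isnan(v_hat[u])]
        if prev:
            v[t] = float(np.mean(prev))
    return v

def td_error(v, rewards, gamma):
    """delta(t) = r(t) + gamma*V(t) - V(t-1) (assumed TD(0) form)."""
    delta = np.full(len(v), np.nan)
    for t in range(1, len(v)):
        if not (np.isnan(v[t]) or np.isnan(v[t - 1])):
            delta[t] = rewards[t] + gamma * v[t] - v[t - 1]
    return delta
```

Running this once per γ in {0, 0.3, 0.6, 0.8, 0.9, 0.99} yields one V(t) and one δ(t) time course per time scale, which are then entered as regressors in the multivariate analysis.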
Previous studies of brain damage and serotonergic function suggest that the MFC and the dorsal raphe, which are reciprocally connected [22, 23], play an important role in future reward prediction. The cortico-cortical projections from the MFC, or the serotonergic projections from the dorsal raphe to the cortex and the striatum, may be involved in the modulation of these parallel loops.

In the present study, using a novel regression analysis based on subjects' performance data and a reinforcement learning model, we revealed maps of the time scales of reward prediction, which could not be found by conventional block-design analysis. Future studies using this method under pharmacological manipulation of the serotonergic system would clarify the role of serotonin in regulating the time scale of reward prediction.

Acknowledgments

We thank Nicolas Schweighofer, Kazuyuki Samejima, Masahiko Haruno, Hiroshi Imamizu, Satomi Higuchi, Toshinori Yoshioka, and Mitsuo Kawato for helpful discussions and technical advice.

References

[1] Rogers, R.D., et al. (1999) Dissociable deficits in the decision-making cognition of chronic amphetamine abusers, opiate abusers, patients with focal damage to prefrontal cortex, and tryptophan-depleted normal volunteers: evidence for monoaminergic mechanisms. Neuropsychopharmacology 20(4):322-339.

[2] Evenden, J.L. & Ryan, C.N. (1996) The pharmacology of impulsive behaviour in rats: the effects of drugs on response choice with varying delays of reinforcement. Psychopharmacology (Berl) 128(2):161-170.

[3] Mobini, S., et al. (2000) Effects of central 5-hydroxytryptamine depletion on sensitivity to delayed and probabilistic reinforcement. Psychopharmacology (Berl) 152(4):390-397.

[4] Bechara, A., et al. (1994) Insensitivity to future consequences following damage to human prefrontal cortex. Cognition 50(1-3):7-15.

[5] Bechara, A., Tranel, D. & Damasio, H.
(2000) Characterization of the decision-making deficit of patients with ventromedial prefrontal cortex lesions. Brain 123:2189-2202.

[6] Mobini, S., et al. (2002) Effects of lesions of the orbitofrontal cortex on sensitivity to delayed and probabilistic reinforcement. Psychopharmacology (Berl) 160(3):290-298.

[7] Doya, K. (2002) Metalearning and neuromodulation. Neural Netw 15(4-6):495-506.

[8] Sutton, R.S. & Barto, A.G. (1998) Reinforcement Learning. Cambridge, MA: MIT Press.

[9] Houk, J.C., Adams, J.L. & Barto, A.G. (1995) A model of how the basal ganglia generate and use neural signals that predict reinforcement. In J.C. Houk, J.L. Davis & D.G. Beiser (eds.), Models of Information Processing in the Basal Ganglia, pp. 249-270. Cambridge, MA: MIT Press.

[10] Schultz, W., Dayan, P. & Montague, P.R. (1997) A neural substrate of prediction and reward. Science 275(5306):1593-1599.

[11] Doya, K. (2000) Complementary roles of basal ganglia and cerebellum in learning and motor control. Curr Opin Neurobiol 10(6):732-739.

[12] Berns, G.S., et al. (2001) Predictability modulates human brain response to reward. J Neurosci 21(8):2793-2798.

[13] O'Doherty, J.P., et al. (2003) Temporal difference models and reward-related learning in the human brain. Neuron 38(2):329-337.

[14] Koepp, M.J., et al. (1998) Evidence for striatal dopamine release during a video game. Nature 393(6682):266-268.

[15] Rogers, R.D., et al. (1999) Choosing between small, likely rewards and large, unlikely rewards activates inferior and orbital prefrontal cortex. J Neurosci 19(20):9029-9038.

[16] Elliott, R., Friston, K.J. & Dolan, R.J. (2000) Dissociable neural responses in human reward systems. J Neurosci 20(16):6159-6165.

[17] Breiter, H.C., et al. (2001) Functional imaging of neural responses to expectancy and experience of monetary gains and losses. Neuron 30(2):619-639.

[18] Knutson, B., et al. (2001) Anticipation of increasing monetary reward selectively recruits nucleus accumbens.
J Neurosci 21(16):RC159.

[19] O'Doherty, J.P., et al. (2002) Neural responses during anticipation of a primary taste reward. Neuron 33(5):815-826.

[20] Pagnoni, G., et al. (2002) Activity in human ventral striatum locked to errors of reward prediction. Nat Neurosci 5(2):97-98.

[21] Haber, S.N., et al. (1995) The orbital and medial prefrontal circuit through the primate basal ganglia. J Neurosci 15(7 Pt 1):4851-4867.

[22] Celada, P., et al. (2001) Control of dorsal raphe serotonergic neurons by the medial prefrontal cortex: Involvement of serotonin-1A, GABA(A), and glutamate receptors. J Neurosci 21(24):9917-9929.

[23] Martin-Ruiz, R., et al. (2001) Control of serotonergic function in medial prefrontal cortex by serotonin-2A receptors through a glutamate-dependent mechanism. J Neurosci 21(24):9856-9866.
Author: G.C. Littlewort, M.S. Bartlett, I.R. Fasel, J. Chenu, T. Kanda, H. Ishiguro, J.R. Movellan
Abstract: Computer animated agents and robots bring a social dimension to human computer interaction and force us to think in new ways about how computers could be used in daily life. Face to face communication is a real-time process operating at a time scale of less than a second. In this paper we present progress on a perceptual primitive to automatically detect frontal faces in the video stream and code them with respect to 7 dimensions in real time: neutral, anger, disgust, fear, joy, sadness, surprise. The face finder employs a cascade of feature detectors trained with boosting techniques [13, 2]. The expression recognizer employs a novel combination of Adaboost and SVM’s. The generalization performance to new subjects for a 7-way forced choice was 93.3% and 97% correct on two publicly available datasets. The outputs of the classifier change smoothly as a function of time, providing a potentially valuable representation to code facial expression dynamics in a fully automatic and unobtrusive manner. The system was deployed and evaluated for measuring spontaneous facial expressions in the field in an application for automatic assessment of human-robot interaction.
6 0.089530557 90 nips-2003-Increase Information Transfer Rates in BCI by CSP Extension to Multi-class
7 0.078866512 166 nips-2003-Reconstructing MEG Sources with Unknown Correlations
8 0.077579632 147 nips-2003-Online Learning via Global Feedback for Phrase Recognition
9 0.076919131 9 nips-2003-A Kullback-Leibler Divergence Based Kernel for SVM Classification in Multimedia Applications
10 0.07593894 109 nips-2003-Learning a Rare Event Detection Cascade by Direct Feature Selection
11 0.0737582 132 nips-2003-Multiple Instance Learning via Disjunctive Programming Boosting
12 0.067967415 89 nips-2003-Impact of an Energy Normalization Transform on the Performance of the LF-ASD Brain Computer Interface
13 0.064014047 191 nips-2003-Unsupervised Context Sensitive Language Acquisition from a Large Corpus
14 0.061416779 53 nips-2003-Discriminating Deformable Shape Classes
15 0.060273942 160 nips-2003-Prediction on Spike Data Using Kernel Algorithms
16 0.056314182 93 nips-2003-Information Dynamics and Emergent Computation in Recurrent Circuits of Spiking Neurons
17 0.047855139 28 nips-2003-Application of SVMs for Colour Classification and Collision Detection with AIBO Robots
18 0.047747936 179 nips-2003-Sparse Representation and Its Applications in Blind Source Separation
19 0.04726623 182 nips-2003-Subject-Independent Magnetoencephalographic Source Localization by a Multilayer Perceptron
20 0.046374641 8 nips-2003-A Holistic Approach to Compositional Semantics: a connectionist model and robot experiments
topicId topicWeight
[(0, -0.16), (1, -0.011), (2, 0.063), (3, -0.142), (4, -0.093), (5, -0.057), (6, 0.02), (7, -0.048), (8, -0.034), (9, 0.092), (10, 0.103), (11, -0.026), (12, 0.069), (13, -0.051), (14, 0.076), (15, 0.162), (16, -0.099), (17, 0.088), (18, 0.088), (19, -0.118), (20, 0.241), (21, 0.099), (22, -0.025), (23, -0.156), (24, -0.051), (25, 0.1), (26, -0.017), (27, -0.047), (28, 0.023), (29, -0.027), (30, 0.043), (31, -0.009), (32, -0.02), (33, -0.002), (34, 0.039), (35, -0.063), (36, 0.016), (37, -0.04), (38, 0.099), (39, -0.032), (40, 0.095), (41, -0.042), (42, -0.03), (43, -0.046), (44, -0.056), (45, -0.059), (46, 0.016), (47, -0.008), (48, 0.034), (49, 0.02)]
simIndex simValue paperId paperTitle
same-paper 1 0.94745636 188 nips-2003-Training fMRI Classifiers to Detect Cognitive States across Multiple Human Subjects
Author: Xuerui Wang, Rebecca Hutchinson, Tom M. Mitchell
Abstract: We consider learning to classify cognitive states of human subjects, based on their brain activity observed via functional Magnetic Resonance Imaging (fMRI). This problem is important because such classifiers constitute “virtual sensors” of hidden cognitive states, which may be useful in cognitive science research and clinical applications. In recent work, Mitchell, et al. [6,7,9] have demonstrated the feasibility of training such classifiers for individual human subjects (e.g., to distinguish whether the subject is reading an ambiguous or unambiguous sentence, or whether they are reading a noun or a verb). Here we extend that line of research, exploring how to train classifiers that can be applied across multiple human subjects, including subjects who were not involved in training the classifier. We describe the design of several machine learning approaches to training multiple-subject classifiers, and report experimental results demonstrating the success of these methods in learning cross-subject classifiers for two different fMRI data sets. 1
2 0.74065906 90 nips-2003-Increase Information Transfer Rates in BCI by CSP Extension to Multi-class
Author: Guido Dornhege, Benjamin Blankertz, Gabriel Curio, Klaus-Robert Müller
Abstract: Brain-Computer Interfaces (BCI) are an interesting emerging technology that is driven by the motivation to develop an effective communication interface translating human intentions into a control signal for devices like computers or neuroprostheses. If this can be done bypassing the usual human output pathways like peripheral nerves and muscles it can ultimately become a valuable tool for paralyzed patients. Most activity in BCI research is devoted to finding suitable features and algorithms to increase information transfer rates (ITRs). The present paper studies the implications of using more classes, e.g., left vs. right hand vs. foot, for operating a BCI. We contribute by (1) a theoretical study showing under some mild assumptions that it is practically not useful to employ more than three or four classes, (2) two extensions of the common spatial pattern (CSP) algorithm, one interestingly based on simultaneous diagonalization, and (3) controlled EEG experiments that underline our theoretical findings and show excellent improved ITRs. 1
3 0.73610216 95 nips-2003-Insights from Machine Learning Applied to Human Visual Classification
Author: Felix A. Wichmann, Arnulf B. Graf
Abstract: We attempt to understand visual classification in humans using both psychophysical and machine learning techniques. Frontal views of human faces were used for a gender classification task. Human subjects classified the faces and their gender judgment, reaction time and confidence rating were recorded. Several hyperplane learning algorithms were used on the same classification task using the Principal Components of the texture and shape representation of the faces. The classification performance of the learning algorithms was estimated using the face database with the true gender of the faces as labels, and also with the gender estimated by the subjects. We then correlated the human responses to the distance of the stimuli to the separating hyperplane of the learning algorithms. Our results suggest that human classification can be modeled by some hyperplane algorithms in the feature space we used. For classification, the brain needs more processing for stimuli close to that hyperplane than for those further away. 1
4 0.62693775 161 nips-2003-Probabilistic Inference in Human Sensorimotor Processing
Author: Konrad P. Körding, Daniel M. Wolpert
Abstract: When we learn a new motor skill, we have to contend with both the variability inherent in our sensors and the task. The sensory uncertainty can be reduced by using information about the distribution of previously experienced tasks. Here we impose a distribution on a novel sensorimotor task and manipulate the variability of the sensory feedback. We show that subjects internally represent both the distribution of the task as well as their sensory uncertainty. Moreover, they combine these two sources of information in a way that is qualitatively predicted by optimal Bayesian processing. We further analyze if the subjects can represent multimodal distributions such as mixtures of Gaussians. The results show that the CNS employs probabilistic models during sensorimotor learning even when the priors are multimodal.
5 0.56688511 147 nips-2003-Online Learning via Global Feedback for Phrase Recognition
Author: Xavier Carreras, Lluís Màrquez
Abstract: This work presents an architecture based on perceptrons to recognize phrase structures, and an online learning algorithm to train the perceptrons together and dependently. The recognition strategy applies learning in two layers: a filtering layer, which reduces the search space by identifying plausible phrase candidates, and a ranking layer, which recursively builds the optimal phrase structure. We provide a recognition-based feedback rule which reflects to each local function its committed errors from a global point of view, and allows to train them together online as perceptrons. Experimentation on a syntactic parsing problem, the recognition of clause hierarchies, improves state-of-the-art results and evinces the advantages of our global training method over optimizing each function locally and independently. 1
7 0.52558464 89 nips-2003-Impact of an Energy Normalization Transform on the Performance of the LF-ASD Brain Computer Interface
8 0.44428673 178 nips-2003-Sparse Greedy Minimax Probability Machine Classification
9 0.43371406 181 nips-2003-Statistical Debugging of Sampled Programs
10 0.42834979 53 nips-2003-Discriminating Deformable Shape Classes
11 0.42445338 28 nips-2003-Application of SVMs for Colour Classification and Collision Detection with AIBO Robots
12 0.42333904 3 nips-2003-AUC Optimization vs. Error Rate Minimization
13 0.40931112 191 nips-2003-Unsupervised Context Sensitive Language Acquisition from a Large Corpus
14 0.39988032 182 nips-2003-Subject-Independent Magnetoencephalographic Source Localization by a Multilayer Perceptron
15 0.36689851 52 nips-2003-Different Cortico-Basal Ganglia Loops Specialize in Reward Prediction at Different Time Scales
16 0.36293527 109 nips-2003-Learning a Rare Event Detection Cascade by Direct Feature Selection
17 0.35637385 9 nips-2003-A Kullback-Leibler Divergence Based Kernel for SVM Classification in Multimedia Applications
18 0.34461111 166 nips-2003-Reconstructing MEG Sources with Unknown Correlations
19 0.34360662 179 nips-2003-Sparse Representation and Its Applications in Blind Source Separation
20 0.32600492 23 nips-2003-An Infinity-sample Theory for Multi-category Large Margin Classification
topicId topicWeight
[(0, 0.029), (11, 0.021), (29, 0.016), (35, 0.043), (53, 0.118), (64, 0.011), (66, 0.355), (69, 0.011), (71, 0.052), (76, 0.035), (82, 0.011), (85, 0.122), (91, 0.08), (99, 0.014)]
simIndex simValue paperId paperTitle
1 0.9231506 195 nips-2003-When Does Non-Negative Matrix Factorization Give a Correct Decomposition into Parts?
Author: David Donoho, Victoria Stodden
Abstract: We interpret non-negative matrix factorization geometrically, as the problem of finding a simplicial cone which contains a cloud of data points and which is contained in the positive orthant. We show that under certain conditions, basically requiring that some of the data are spread across the faces of the positive orthant, there is a unique such simplicial cone. We give examples of synthetic image articulation databases which obey these conditions; these require separated support and factorial sampling. For such databases there is a generative model in terms of ‘parts’ and NMF correctly identifies the ‘parts’. We show that our theoretical results are predictive of the performance of published NMF code, by running the published algorithms on one of our synthetic image articulation databases. 1
same-paper 2 0.80590647 188 nips-2003-Training fMRI Classifiers to Detect Cognitive States across Multiple Human Subjects
Author: Xuerui Wang, Rebecca Hutchinson, Tom M. Mitchell
Abstract: We consider learning to classify cognitive states of human subjects, based on their brain activity observed via functional Magnetic Resonance Imaging (fMRI). This problem is important because such classifiers constitute “virtual sensors” of hidden cognitive states, which may be useful in cognitive science research and clinical applications. In recent work, Mitchell, et al. [6,7,9] have demonstrated the feasibility of training such classifiers for individual human subjects (e.g., to distinguish whether the subject is reading an ambiguous or unambiguous sentence, or whether they are reading a noun or a verb). Here we extend that line of research, exploring how to train classifiers that can be applied across multiple human subjects, including subjects who were not involved in training the classifier. We describe the design of several machine learning approaches to training multiple-subject classifiers, and report experimental results demonstrating the success of these methods in learning cross-subject classifiers for two different fMRI data sets. 1
3 0.76767117 35 nips-2003-Attractive People: Assembling Loose-Limbed Models using Non-parametric Belief Propagation
Author: Leonid Sigal, Michael Isard, Benjamin H. Sigelman, Michael J. Black
Abstract: The detection and pose estimation of people in images and video is made challenging by the variability of human appearance, the complexity of natural scenes, and the high dimensionality of articulated body models. To cope with these problems we represent the 3D human body as a graphical model in which the relationships between the body parts are represented by conditional probability distributions. We formulate the pose estimation problem as one of probabilistic inference over a graphical model where the random variables correspond to the individual limb parameters (position and orientation). Because the limbs are described by 6-dimensional vectors encoding pose in 3-space, discretization is impractical and the random variables in our model must be continuousvalued. To approximate belief propagation in such a graph we exploit a recently introduced generalization of the particle filter. This framework facilitates the automatic initialization of the body-model from low level cues and is robust to occlusion of body parts and scene clutter. 1
4 0.74365395 98 nips-2003-Kernel Dimensionality Reduction for Supervised Learning
Author: Kenji Fukumizu, Francis R. Bach, Michael I. Jordan
Abstract: We propose a novel method of dimensionality reduction for supervised learning. Given a regression or classification problem in which we wish to predict a variable Y from an explanatory vector X, we treat the problem of dimensionality reduction as that of finding a low-dimensional “effective subspace” of X which retains the statistical relationship between X and Y . We show that this problem can be formulated in terms of conditional independence. To turn this formulation into an optimization problem, we characterize the notion of conditional independence using covariance operators on reproducing kernel Hilbert spaces; this allows us to derive a contrast function for estimation of the effective subspace. Unlike many conventional methods, the proposed method requires neither assumptions on the marginal distribution of X, nor a parametric model of the conditional distribution of Y . 1
5 0.56623584 191 nips-2003-Unsupervised Context Sensitive Language Acquisition from a Large Corpus
Author: Zach Solan, David Horn, Eytan Ruppin, Shimon Edelman
Abstract: We describe a pattern acquisition algorithm that learns, in an unsupervised fashion, a streamlined representation of linguistic structures from a plain natural-language corpus. This paper addresses the issues of learning structured knowledge from a large-scale natural language data set, and of generalization to unseen text. The implemented algorithm represents sentences as paths on a graph whose vertices are words (or parts of words). Significant patterns, determined by recursive context-sensitive statistical inference, form new vertices. Linguistic constructions are represented by trees composed of significant patterns and their associated equivalence classes. An input module allows the algorithm to be subjected to a standard test of English as a Second Language (ESL) proficiency. The results are encouraging: the model attains a level of performance considered to be “intermediate” for 9th-grade students, despite having been trained on a corpus (CHILDES) containing transcribed speech of parents directed to small children. 1
6 0.51999533 115 nips-2003-Linear Dependent Dimensionality Reduction
7 0.50027591 147 nips-2003-Online Learning via Global Feedback for Phrase Recognition
8 0.49648699 107 nips-2003-Learning Spectral Clustering
9 0.49357852 47 nips-2003-Computing Gaussian Mixture Models with EM Using Equivalence Constraints
10 0.49304038 9 nips-2003-A Kullback-Leibler Divergence Based Kernel for SVM Classification in Multimedia Applications
11 0.48747 3 nips-2003-AUC Optimization vs. Error Rate Minimization
12 0.48426509 124 nips-2003-Max-Margin Markov Networks
13 0.4824211 106 nips-2003-Learning Non-Rigid 3D Shape from 2D Motion
14 0.47956517 192 nips-2003-Using the Forest to See the Trees: A Graphical Model Relating Features, Objects, and Scenes
15 0.47859645 20 nips-2003-All learning is Local: Multi-agent Learning in Global Reward Games
16 0.47638264 64 nips-2003-Estimating Internal Variables and Paramters of a Learning Agent by a Particle Filter
17 0.47555768 54 nips-2003-Discriminative Fields for Modeling Spatial Dependencies in Natural Images
18 0.47348323 126 nips-2003-Measure Based Regularization
19 0.47217447 93 nips-2003-Information Dynamics and Emergent Computation in Recurrent Circuits of Spiking Neurons
20 0.47165909 78 nips-2003-Gaussian Processes in Reinforcement Learning