nips nips2004 nips2004-106 knowledge-graph by maker-knowledge-mining

106 nips-2004-Machine Learning Applied to Perception: Decision Images for Gender Classification


Source: pdf

Author: Felix A. Wichmann, Arnulf B. Graf, Heinrich H. Bülthoff, Eero P. Simoncelli, Bernhard Schölkopf

Abstract: We study gender discrimination of human faces using a combination of psychophysical classification and discrimination experiments together with methods from machine learning. We reduce the dimensionality of a set of face images using principal component analysis, and then train a set of linear classifiers on this reduced representation (linear support vector machines (SVMs), relevance vector machines (RVMs), Fisher linear discriminant (FLD), and prototype (prot) classifiers) using human classification data. Because we combine a linear preprocessor with linear classifiers, the entire system acts as a linear classifier, allowing us to visualise the decision-image corresponding to the normal vector of the separating hyperplanes (SH) of each classifier. We predict that the female-to-maleness transition along the normal vector for classifiers closely mimicking human classification (SVM and RVM [1]) should be faster than the transition along any other direction. A psychophysical discrimination experiment using the decision images as stimuli is consistent with this prediction.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Bülthoff and Bernhard Schölkopf, Max Planck Institute for Biological Cybernetics, Tübingen, Germany Abstract We study gender discrimination of human faces using a combination of psychophysical classification and discrimination experiments together with methods from machine learning. [sent-9, score-0.574]

2 Because we combine a linear preprocessor with linear classifiers, the entire system acts as a linear classifier, allowing us to visualise the decision-image corresponding to the normal vector of the separating hyperplanes (SH) of each classifier. [sent-11, score-0.186]

3 We predict that the female-to-maleness transition along the normal vector for classifiers closely mimicking human classification (SVM and RVM [1]) should be faster than the transition along any other direction. [sent-12, score-0.314]

4 A psychophysical discrimination experiment using the decision images as stimuli is consistent with this prediction. [sent-13, score-0.257]

5 1 Introduction One of the central problems in vision science is to identify the features used by human subjects to classify visual stimuli. [sent-14, score-0.3]

6 We combine machine learning and psychophysical techniques to gain insight into the algorithms used by human subjects during visual classification of faces. [sent-15, score-0.409]

7 Comparing gender classification performance of humans to that of machines has attracted considerable attention in the past [2, 3, 4, 5]. [sent-16, score-0.184]

8 The main novel aspect of our study is to analyse the machine algorithms to make inferences about the features used by human subjects, thus providing an alternative to psychophysical feature extraction techniques such as the “bubbles” [6] or the noise classification image [7] techniques. [sent-17, score-0.269]

9 In this “machine-learning-psychophysics research” we first train machine learning classifiers on the responses (labels) of human subjects to re-create the human decision boundaries with learning machines. [sent-18, score-0.458]

10 Then we look for correlations between machine classifiers and several characteristics of subjects’ responses to the stimuli—proportion correct, reaction times (RT) and confidence ratings. [sent-19, score-0.083]
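
This train-then-correlate loop is easy to sketch in code. The following is a minimal illustration on synthetic stand-ins (random PCA coefficients, labels and reaction times, not the study's data), using a linear SVM as the classifier:

```python
# Sketch of the machine-learning-psychophysics loop on synthetic stand-in data;
# E, y_est and rt below are random placeholders for the 200 PCA-encoded faces,
# the subjects' gender responses, and the per-face mean reaction times.
import numpy as np
from scipy.stats import pearsonr
from sklearn.svm import SVC

rng = np.random.default_rng(0)
E = rng.standard_normal((200, 200))              # 200 faces x 200 PCA coefficients
y_est = rng.choice([-1, 1], size=200)            # subjects' labels: -1 female, +1 male
rt = rng.gamma(shape=2.0, scale=0.3, size=200)   # stand-in reaction times (seconds)

# Re-create the subjects' decision boundary with a linear classifier
clf = SVC(kernel="linear").fit(E, y_est)

# Signed distance of each face to the separating hyperplane (SH)
dist = clf.decision_function(E) / np.linalg.norm(clf.coef_)

# Stimuli far from the SH should be the "easy" ones: fast RTs, high confidence
r, p = pearsonr(np.abs(dist), rt)
print(f"correlation of |distance to SH| with RT: r = {r:.2f}, p = {p:.3f}")
```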

11 Ideally this allows us to find preprocessor-classifier pairings that are closely aligned with the algorithm employed by the human brain for the task at hand. [sent-20, score-0.147]

12 Thereafter we analyse properties of the machine closest to the human—in our case support vector machines (SVMs), and to a slightly lesser degree, relevance vector machines (RVMs)—and make predictions about human behaviour based on machine properties. [sent-21, score-0.37]

13 The decision-image has the same dimensionality as the (input) images—in our case 256 × 256—whereas the normal vector lives in the (reduced-dimensionality) space after preprocessing—in our case 200 × 1 after Principal Component Analysis (PCA). [sent-24, score-0.086]
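
The relationship between the two spaces can be made concrete with a short sketch; the faces below are random placeholders, and the back-projection of the normal vector through the PCA basis corresponds to the decision-image described above:

```python
# Minimal sketch of the two spaces: the normal vector w lives in R^200 after
# PCA, the decision-image in 256x256 pixel space; X is a random stand-in for
# the 200-face data matrix.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 256 * 256))      # data matrix: one face per row

pca = PCA(n_components=200)                    # X ~ E B + mean; E is the encoding
E = pca.fit_transform(X)                       # 200 x 200 PCA coefficients

w = rng.standard_normal(200)                   # stand-in for a classifier's normal vector
W = (pca.components_.T @ w).reshape(256, 256)  # decision-image: back-projected w
print(E.shape, W.shape)                        # (200, 200) (256, 256)
```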

14 Second, we use the normal vectors w of the classifiers to generate novel stimuli by adding (or subtracting) various “amounts” (λw/||w||) to a genderless face in PCA space. [sent-25, score-0.142]
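
A hedged sketch of this stimulus generation follows; `pca` and `w` are assumed to come from a fit like the one above, and `genderless_coeffs` is a hypothetical name for the PCA coefficients of the genderless face:

```python
# Sketch: shift a genderless face by lam along the unit-norm decision vector in
# PCA space, then invert the PCA mapping to obtain a stimulus image I(lam).
import numpy as np

def morph_stimulus(pca, genderless_coeffs, w, lam):
    shifted = genderless_coeffs + lam * w / np.linalg.norm(w)
    return pca.inverse_transform(shifted[None, :])[0].reshape(256, 256)

# One 2AFC trial then shows a pair I(-lam) (more female) and I(+lam) (more male):
# left, right = morph_stimulus(pca, g, w, -0.4), morph_stimulus(pca, g, w, +0.4)
```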

15 We predict that the female-to-maleness transition along the vectors normal to the SHs, wSVM and wRVM , should be significantly faster than those along the normal vectors of machine classifiers that do not correlate as well with human subjects. [sent-27, score-0.314]

16 A psychophysical gender discrimination experiment confirms our predictions: the female-to-maleness axes of the SVM and, to a smaller extent, the RVM, are more closely aligned with the human female-to-maleness axis than those of the prototype (Prot) and Fisher linear discriminant (FLD) classifiers. [sent-28, score-0.588]

17 2 Preprocessing and Machine Learning Methods We preprocessed the faces using PCA. [sent-29, score-0.085]

18 PCA is a good preprocessor in the current context since we have previously shown that in PCA-space strong correlations exist between man and machine [1]. [sent-30, score-0.125]

19 The face stimuli were taken from the gender-balanced Max Planck Institute (MPI) face database1 composed of 200 greyscale 256 × 256-pixel frontal views of human faces, yielding a data matrix X ∈ R^(200 × 256²). [sent-32, score-0.345]

20 For the gender discrimination task we adhere to the following convention for the class labels: y = −1 for females and y = +1 for males. [sent-33, score-0.252]

21 The combination of the encoding matrix E with the true class labels y of the MPI database yields the true dataset, whereas its combination with the class labels y_est given by the subjects yields the subject dataset. [sent-36, score-0.379]

22 To model classification in human subjects we use methods from supervised machine learning. [sent-37, score-0.305]

23 In particular, we consider linear classifiers where classification is done using an SH defined by its normal vector w and offset b. [sent-38, score-0.065]

24 Furthermore the normal vector w of our classifiers can then be written as a linear combination of the input patterns x_i with suitable coefficients α_i as w = Σ_i α_i x_i. [sent-39, score-0.111]

25 Note that in our experiments the x_i are the PCA coefficients of the images, that is x_i ∈ R^200, whereas the images themselves are in R^(256²). [sent-41, score-0.1]

26 For the subject dataset we chose the mean values of w, b and w± over all subjects. [sent-42, score-0.075]

27 1 The MPI face database is located at http://faces. [sent-43, score-0.078]

28 SVMs have an intuitive geometrical interpretation: they classify by maximizing the margin separating both classes while minimizing the classification error. [sent-49, score-0.028]

29 The RVM optimises the expansion coefficients of an SV-style decision function using a hyperprior which favours sparse solutions. [sent-51, score-0.028]

30 Common classifiers in neuroscience, cognitive science and psychology are variants of the Prototype classifier (Prot, [12]). [sent-52, score-0.041]

31 Their popularity is due to their simplicity: they classify according to the nearest mean-of-class prototype; in the simplest form all dimensions are weighted equally, but variants exist that weight the dimensions inversely proportionally to the class variance along the dimensions. [sent-53, score-0.051]

32 As we cannot estimate class variance along all 200 dimensions from only 200 stimuli, we chose to implement the simplest Prot with equal weight along all dimensions. [sent-54, score-0.079]
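
A minimal sketch of this equal-weight Prot (assuming labels in {-1, +1} as above): nearest mean-of-class prototype reduces to a linear rule whose normal vector is the difference of the class means and whose SH passes midway between them.

```python
# Equal-weight prototype classifier: nearest class mean, written as a linear rule.
import numpy as np

def fit_prototype(X, y):
    mu_plus = X[y == +1].mean(axis=0)
    mu_minus = X[y == -1].mean(axis=0)
    w = mu_plus - mu_minus                      # normal vector of the SH
    b = -0.5 * (mu_plus + mu_minus) @ w         # SH passes midway between prototypes
    return w, b

def predict_prototype(X, w, b):
    return np.sign(X @ w + b)                   # +1 male, -1 female
```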

33 The Fisher linear discriminant classifier (FLD, [13]) finds a direction in the dataset which allows best linear separation of the two classes. [sent-55, score-0.053]

34 This direction is then used as the normal vector of the separating hyperplane. [sent-56, score-0.093]

35 In fact, FLD is arguably a more principled, whitened variant of the Prot classifier: its weight vector can be written as w = S_W^(−1)(µ+ − µ−), where S_W is the within-class covariance matrix of the two classes, and µ± are the class means. [sent-57, score-0.069]

36 Consequently, if we disregard the constant offset b, we can write the decision function as ⟨w|x⟩ = ⟨S_W^(−1)(µ+ − µ−)|x⟩ = ⟨S_W^(−1/2)(µ+ − µ−)|S_W^(−1/2)x⟩, which is a prototype classifier using the prototypes µ± after whitening the space with S_W^(−1/2). [sent-58, score-0.189]
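
The identity can be checked numerically; the sketch below uses random placeholder data and adds a small ridge term, since with as many dimensions as samples S_W is near-singular:

```python
# FLD as a whitened prototype classifier: w = S_W^{-1}(mu+ - mu-), and
# <w, x> equals the prototype rule evaluated after whitening with S_W^{-1/2}.
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(2)
X = rng.standard_normal((200, 5))
y = rng.choice([-1, 1], size=200)

mu_p, mu_m = X[y == +1].mean(axis=0), X[y == -1].mean(axis=0)
Sw = np.cov(X[y == +1], rowvar=False) + np.cov(X[y == -1], rowvar=False)
Sw += 1e-6 * np.eye(Sw.shape[0])              # ridge term for numerical stability

w = np.linalg.solve(Sw, mu_p - mu_m)          # FLD weight vector

Swi = np.linalg.inv(sqrtm(Sw)).real           # S_W^{-1/2}
x = X[0]
print(np.allclose(w @ x, (Swi @ (mu_p - mu_m)) @ (Swi @ x)))   # True
```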

37 2.2 Decision-Images and Generalised Portraits We combine the linear preprocessor (PCA) X̄ = EB and the linear classifier (SVM, RVM, Prot, FLD) y(x) = ⟨w|x⟩ + b to yield a linear classification system: y = w^T E^T + b, where b = b·1. [sent-60, score-0.062]

38 The decision-images in the first row are those obtained if the classifiers are trained on the true dataset; those in the second row if trained on the subject dataset, marked on the right hand side of the figure by “true data” and “subj data”, respectively. [sent-66, score-0.109]

39 Decision-images are represented by a vector pointing to the positive class and can thus be expected to have male attributes (the negative of it looks female). [sent-67, score-0.076]

40 For the prototype learner, the eye and beard regions are most important. [sent-70, score-0.185]

41 The decision-images for the subject dataset are slightly more “face-like” and less holistic than those obtained using the true labels; the eye and mouth regions are more strongly emphasised. [sent-73, score-0.211]

42 This suggests that human subjects base their gender classification strongly on the eye and mouth regions of the face—clearly a sub-optimal strategy as revealed by the more holistic true-dataset SVM, RVM and FLD decision-images. [sent-75, score-0.597]

43 A decision-image thus represents a way to extract the visual cues and features used by human subjects during visual classification without using a priori assumptions or knowledge about the task at hand. [sent-76, score-0.321]

44 Figure 1: Decision-images W for each classifier (columns: SVM, RVM, Prot, FLD) trained on the true and on the subject dataset (rows marked "true data" and "subj data"); all images are rescaled to [0, 1] and their means set to 128 for illustration purposes (different scalers for different images). [sent-77, score-0.227]

45 The generalised portraits W± can be seen as “summary” faces in each class reflecting the decision rule of the classifier. [sent-79, score-0.476]

46 They can be viewed as an extension of the concept of a prototype: they are the prototype of the faces the classifier bases its decision on. [sent-80, score-0.249]

47 We note that w can be written as: w = Σ_i α_i x_i = Σ_{i : sign(α_i)=+1} α_i x_i − Σ_{i : sign(α_i)=−1} |α_i| x_i. [sent-81, score-0.046]

48 This allows us to define the generalised portraits W±, which are computed by inverting the PCA transformation P on the patterns w± = (Σ_{i : sign(α_i)=±1} α_i x_i) / (Σ_{i : sign(α_i)=±1} α_i). [sent-82, score-0.186]
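
As a sketch of this computation (the normalisation by the summed coefficients follows the reconstructed formula above and should be read with that caveat; `pca` is again assumed to be a fitted sklearn PCA as in the earlier sketch):

```python
# Generalised portraits: alpha-weighted means of the positively and negatively
# weighted patterns, mapped back to pixel space through the PCA basis.
import numpy as np

def generalised_portraits(pca, X_coeffs, alpha):
    portraits = {}
    for s, key in ((+1, "W+"), (-1, "W-")):
        idx = np.sign(alpha) == s
        a = np.abs(alpha[idx])
        w_s = a @ X_coeffs[idx] / a.sum()          # weighted mean in PCA space
        portraits[key] = pca.inverse_transform(w_s[None, :])[0]
    return portraits
```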

49 The generalised portraits for the SVM, RVM and FLD together with the Prot, where the prototype is the same as the generalised portrait, are shown in figure 2. [sent-84, score-0.653]

50 We also note that w can be written as w = Σ_i α_i x_i = Σ_{i : sign(α_i)=+1} α_i x_i − Σ_{i : sign(α_i)=−1} |α_i| x_i. [sent-85, score-0.046]

51 The generalised portraits can be associated with the correct class: W+ are males whereas W− are females. [sent-86, score-0.424]

52 The SVM and the FLD use patterns close to the SH for classification and hence their decision-images appear androgynous, whereas Prot and RVM tend to use patterns distant from the SH resulting in more female and male generalised portraits. [sent-87, score-0.29]

53 3 Human Gender Discrimination along the Decision-Image Axes [sent-89, score-0.028]

54 The decision-images introduced in section 2.2 are based purely on machine learning, albeit on labels provided by human subjects in the case of the subject dataset. [sent-90, score-0.378]

55 [Unfortunately the downsampling (low-pass filtering) of the faces necessary to fit them in the figure makes all the faces somewhat more androgynous than they appear when viewed at full resolution.] [sent-93, score-0.201]

56 Several characteristics of subjects’ responses—proportion correct, reaction times (RT) and confidence ratings—correlated very well with the distance of the stimuli to their separating hyperplane (SH) for support and relevance vector machines (SVMs, RVMs) but not for the simple prototype (Prot) classifier. [sent-94, score-0.328]

57 In other words: the female-to-maleness axes of the SVM and RVM should be closely aligned with those of our subjects, whereas this is not expected to be the case for FLD and Prot. [sent-96, score-0.201]
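
Between machine classifiers themselves, alignment of the decision axes reduces to the cosine of the angle between their normal vectors, as in this small sketch (an illustration, not part of the paper's reported analysis):

```python
# Cosine of the angle between two classifiers' normal vectors: 1 means the
# female-to-maleness axes coincide, 0 means they are orthogonal.
import numpy as np

def cos_angle(w1, w2):
    return w1 @ w2 / (np.linalg.norm(w1) * np.linalg.norm(w2))

# e.g. compare cos_angle(w_svm, w_rvm) with cos_angle(w_svm, w_prot)
```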

58 3.1 Psychophysical Methods Four observers—one of the authors (FAW) with extensive psychophysical training and three naïve subjects paid for their participation—took part in a standard, spatial (left versus right) two-alternative forced-choice (2AFC) discrimination experiment. [sent-98, score-0.29]

59 Subjects were presented with two faces I(−λ) and I(λ) and had to indicate which face looked more male. [sent-99, score-0.163]

60 Neither male nor female faces changed the mean luminance. [sent-101, score-0.173]

61 Subjects viewed the screen binocularly with their head stabilised by a headrest. [sent-102, score-0.031]

62 The probability of the female face being presented on the left was 0.5 on each trial, and observers indicated whether they thought the left or right face was female by touching the corresponding location on an Elo TouchSystems touch-screen immediately in front of the display; no feedback was provided. [sent-104, score-0.136] [sent-105, score-0.087] [sent-145, score-0.136]

64 [Figure 3, axis labels and legends: length of normalised decision-image vector λW/||W|| (x-axis), proportion correct gender identification (y-axis), threshold criteria at 75% and 90% correct, observers FAW and HM, classifiers SVM, RVM, Prot and FLD.] [sent-118, score-0.196] [sent-122, score-0.118] [sent-134, score-0.244]

67 (a) Shows raw data and fitted psychometric functions for one observer (FAW). [sent-140, score-0.108]

68 (b–e) For each of four observers, the threshold elevation for the RVM, Prot and FLD decision-images relative to that of the SVM; results are shown for both 75% and 90% correct together with 68%-CIs. [sent-142, score-0.208]

70 Trials were run in blocks of 256 in which eight repetitions of eight stimulus levels (±λ₁, …, ±λ₄) were presented. [sent-146, score-0.062]

71 The naïve subjects required approximately 2000 trials before their performance stabilised; thereafter they did another five to six blocks of 256 trials. [sent-150, score-0.045]

72 All results presented below are based on the trials after training; all training trials were discarded. [sent-151, score-0.044]

73 3.2 Results and Discussion Figure 3a shows the raw data and fitted psychometric functions for one of the observers. [sent-153, score-0.108]

74 Proportion correct gender identification on the y-axis is plotted against λ on the x-axis on semi-logarithmic coordinates. [sent-154, score-0.208]
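
A simplified stand-in for such a fit is sketched below, assuming a cumulative-Gaussian psychometric function in log-λ with a lapse rate; the paper used psignifit, whose parameterisation may differ, and the stimulus levels and counts here are hypothetical:

```python
# Fit a 2AFC psychometric function p(lam) = 0.5 + (0.5 - lapse) * Phi(...) and
# read off the 75%-correct threshold; levels and counts are hypothetical.
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

def pf(lam, mu, sigma, lapse):
    return 0.5 + (0.5 - lapse) * norm.cdf(np.log(lam), np.log(mu), sigma)

lam = np.array([0.05, 0.1, 0.2, 0.4, 0.8, 1.6, 3.2, 6.4])  # stimulus levels
k = np.array([4, 5, 5, 6, 7, 8, 8, 8])                     # correct of n = 8 trials
n = 8

popt, _ = curve_fit(pf, lam, k / n, p0=[0.3, 1.0, 0.01],
                    bounds=([1e-3, 1e-3, 0.0], [10.0, 10.0, 0.06]))
mu, sigma, lapse = popt

# Invert the fitted function at 75% correct
lam75 = np.exp(norm.ppf((0.75 - 0.5) / (0.5 - lapse)) * sigma + np.log(mu))
print(f"lambda @ 75% correct ~ {lam75:.3f}")
```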

75 68%-confidence intervals (CIs), indicated by horizontal lines at 75% and 90% correct in figure 3a, were estimated by the BCa bootstrap method also implemented in psignifit [16]. [sent-156, score-0.059]
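
psignifit's BCa bootstrap adds bias and acceleration corrections; the plain percentile interval below is only a simplified stand-in, and `fit_threshold` is a hypothetical helper wrapping a fit like the one sketched above:

```python
# Percentile-bootstrap CI for a psychometric threshold (simplified; not BCa).
import numpy as np

def bootstrap_ci(lam, k, n, fit_threshold, n_boot=2000, level=0.68, seed=0):
    rng = np.random.default_rng(seed)
    thresholds = []
    for _ in range(n_boot):
        k_star = rng.binomial(n, k / n)        # parametric resample of the counts
        try:
            thresholds.append(fit_threshold(lam, k_star, n))
        except RuntimeError:                   # skip the occasional failed refit
            continue
    lo, hi = np.percentile(thresholds, [50 * (1 - level), 50 * (1 + level)])
    return lo, hi                              # e.g. the 16th and 84th percentiles
```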

76 The raw data appear noisy because each data point is based on only eight trials. [sent-157, score-0.052]

77 However, none of the fitted psychometric functions failed various Monte Carlo based goodness-of-fit tests [15]. [sent-158, score-0.087]

78 Figure 3b–e shows the thresholds for all four observers normalised by λSVM (the “threshold elevation” re. the SVM). [sent-160, score-0.114]

79 Threshold elevations larger than 1.0 for RVM, Prot and FLD indicate that more of the corresponding decision-images had to be added for the human observers to be able to discriminate females from males. [sent-163, score-0.239]

80 In figure 3f we pool the data across observers, as the main trend (poorer performance for Prot and FLD compared to SVM and RVM) is apparent for all four observers. [sent-164, score-0.087]

81 The difference between SVM and RVM is small; going along the direction of both Prot and FLD, however, results in a much “slower” transition from female-to-maleness. [sent-165, score-0.051]

82 The psychophysical data are very clear: all observers require a larger λ for Prot and FLD; the length ratio ranges from 1. [sent-166, score-0.17]

83 In the pooled data all the differences are statistically significant but even at the individual subject level all differences are significant at the 90% performance level, and five of eight are significant at the 75% performance level. [sent-170, score-0.106]

84 It thus appears that SVM and RVM capture more of the psychological face-space of our human observers than Prot and FLD. [sent-171, score-0.212]

85 From our results we cannot exclude the possibility that some other direction might have yielded even steeper psychometric functions, i.e. [sent-172, score-0.087]

86 faster female-to-maleness transitions, but we can conclude that the decision-images of SVM and RVM are closer to the decision-images used by human subjects than those of Prot and FLD. [sent-174, score-0.279]

87 This is exactly as predicted by the correlations between proportion correct, RTs and confidence ratings versus distance to the hyperplane reported in [1]—high correlations for SVM and RVM, low correlations for Prot. [sent-175, score-0.168]

88 4 Summary and Conclusions We studied classification and discrimination of human faces both psychophysically and with methods from machine learning. [sent-176, score-0.289]

89 The combination of linear preprocessor (PCA) and classifier (SVM, RVM, Prot and FLD) allowed us to visualise the decision-images of a classifier corresponding to the vector normal to the SH of the classifier. [sent-177, score-0.158]

90 Decision-images can be used to determine the regions of the stimuli most useful for classification simply by analysing the distribution of light and dark regions in the decision-image. [sent-178, score-0.11]
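
One simple way to operationalise this reading of a decision-image is to flag the pixels with the largest deviation from the mean grey level, as in this small sketch (an illustration, not the paper's procedure):

```python
# Flag the most influential pixels of a 256x256 decision-image W: those whose
# deviation from the image mean falls in the top (1 - q) quantile.
import numpy as np

def informative_mask(W, q=0.9):
    dev = np.abs(W - W.mean())
    return dev >= np.quantile(dev, q)
```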

91 In addition we defined the generalised portraits to be the prototypes of all faces used by the classifier to obtain its classification. [sent-179, score-0.45]

92 For the SVM this is the weighted average of all the support vectors (SVs), for the RVM the weighted average of all the relevance vectors (RVs), and for the Prot it is the prototype itself. [sent-180, score-0.178]

93 The generalised portraits are, like the decision-images, another useful visualisation of the categorisation algorithm of the machine classifier. [sent-181, score-0.366]

94 In the machine-learning-psychophysics research we replace a complex system that is very hard to analyse (the human brain) with a reasonably complex system (a learning machine) that is complex enough to capture the essentials of our human subjects’ behaviour but is nonetheless amenable to close analysis. [sent-183, score-0.285]

95 From the analysis of the machines we then derive predictions for human subjects which we subsequently test psychophysically. [sent-184, score-0.314]

96 Given the success in predicting the steepness of the female-to-male transition along the wSVM-axis, we believe that the decision-image WSVM captures some of the essential characteristics of the human decision algorithm. [sent-185, score-0.176]

97 In addition we thank Frank Jäkel for supplying us with the code to run the touch-screen experiment. [sent-187, score-0.031]

98 Insights from machine learning applied to human visual classification. [sent-194, score-0.172]

99 A comparison of two computer-based face recognition systems with human perceptions of faces. [sent-215, score-0.225]

100 Face recognition algorithms as models of human face processing. [sent-226, score-0.225]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('prot', 0.5), ('fld', 0.469), ('rvm', 0.422), ('generalised', 0.177), ('portraits', 0.163), ('subjects', 0.154), ('gender', 0.149), ('prototype', 0.136), ('human', 0.125), ('svm', 0.12), ('classi', 0.115), ('observers', 0.087), ('sh', 0.087), ('psychometric', 0.087), ('faces', 0.085), ('psychophysical', 0.083), ('face', 0.078), ('pca', 0.068), ('stimuli', 0.064), ('faw', 0.062), ('preprocessor', 0.062), ('sw', 0.062), ('elevation', 0.062), ('ers', 0.06), ('correct', 0.059), ('female', 0.058), ('subj', 0.054), ('discrimination', 0.053), ('portrait', 0.047), ('rvms', 0.047), ('wichmann', 0.047), ('wsvm', 0.047), ('er', 0.044), ('subject', 0.042), ('normal', 0.042), ('relevance', 0.042), ('mpi', 0.041), ('holistic', 0.037), ('correlations', 0.037), ('proportion', 0.036), ('machines', 0.035), ('analyse', 0.035), ('cation', 0.034), ('dataset', 0.033), ('pooled', 0.033), ('sign', 0.032), ('androgynous', 0.031), ('bruce', 0.031), ('eb', 0.031), ('kel', 0.031), ('psigni', 0.031), ('scalers', 0.031), ('stabilised', 0.031), ('visualise', 0.031), ('eight', 0.031), ('labels', 0.031), ('tted', 0.03), ('male', 0.03), ('images', 0.029), ('dence', 0.028), ('decision', 0.028), ('separating', 0.028), ('along', 0.028), ('planck', 0.028), ('bubbles', 0.027), ('normalised', 0.027), ('females', 0.027), ('machine', 0.026), ('eye', 0.026), ('svms', 0.026), ('true', 0.025), ('whereas', 0.025), ('graf', 0.025), ('prototypes', 0.025), ('mouth', 0.025), ('xi', 0.023), ('vector', 0.023), ('transition', 0.023), ('thereafter', 0.023), ('regions', 0.023), ('class', 0.023), ('closely', 0.022), ('trials', 0.022), ('wt', 0.022), ('fisher', 0.022), ('frank', 0.022), ('psychophysics', 0.022), ('recognition', 0.022), ('raw', 0.021), ('row', 0.021), ('perception', 0.021), ('rescaled', 0.021), ('ratings', 0.021), ('cognitive', 0.021), ('dimensionality', 0.021), ('visual', 0.021), ('discriminant', 0.02), ('gure', 0.02), ('reaction', 0.02), ('psychology', 0.02)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000001 106 nips-2004-Machine Learning Applied to Perception: Decision Images for Gender Classification


2 0.11661987 191 nips-2004-The Variational Ising Classifier (VIC) Algorithm for Coherently Contaminated Data

Author: Oliver Williams, Andrew Blake, Roberto Cipolla

Abstract: There has been substantial progress in the past decade in the development of object classifiers for images, for example of faces, humans and vehicles. Here we address the problem of contaminations (e.g. occlusion, shadows) in test images which have not explicitly been encountered in training data. The Variational Ising Classifier (VIC) algorithm models contamination as a mask (a field of binary variables) with a strong spatial coherence prior. Variational inference is used to marginalize over contamination and obtain robust classification. In this way the VIC approach can turn a kernel classifier for clean data into one that can tolerate contamination, without any specific training on contaminated positives. 1

3 0.082593665 34 nips-2004-Breaking SVM Complexity with Cross-Training

Author: Léon Bottou, Jason Weston, Gökhan H. Bakir

Abstract: We propose to selectively remove examples from the training set using probabilistic estimates related to editing algorithms (Devijver and Kittler, 1982). This heuristic procedure aims at creating a separable distribution of training examples with minimal impact on the position of the decision boundary. It breaks the linear dependency between the number of SVs and the number of training examples, and sharply reduces the complexity of SVMs during both the training and prediction stages. 1

4 0.077313967 20 nips-2004-An Auditory Paradigm for Brain-Computer Interfaces

Author: N. J. Hill, Thomas N. Lal, Karin Bierig, Niels Birbaumer, Bernhard Schölkopf

Abstract: Motivated by the particular problems involved in communicating with “locked-in” paralysed patients, we aim to develop a braincomputer interface that uses auditory stimuli. We describe a paradigm that allows a user to make a binary decision by focusing attention on one of two concurrent auditory stimulus sequences. Using Support Vector Machine classification and Recursive Channel Elimination on the independent components of averaged eventrelated potentials, we show that an untrained user’s EEG data can be classified with an encouragingly high level of accuracy. This suggests that it is possible for users to modulate EEG signals in a single trial by the conscious direction of attention, well enough to be useful in BCI. 1

5 0.071407974 139 nips-2004-Optimal Aggregation of Classifiers and Boosting Maps in Functional Magnetic Resonance Imaging

Author: Vladimir Koltchinskii, Manel Martínez-ramón, Stefan Posse

Abstract: We study a method of optimal data-driven aggregation of classifiers in a convex combination and establish tight upper bounds on its excess risk with respect to a convex loss function under the assumption that the solution of optimal aggregation problem is sparse. We use a boosting type algorithm of optimal aggregation to develop aggregate classifiers of activation patterns in fMRI based on locally trained SVM classifiers. The aggregation coefficients are then used to design a ”boosting map” of the brain needed to identify the regions with most significant impact on classification. 1

6 0.070727371 46 nips-2004-Constraining a Bayesian Model of Human Visual Speed Perception

7 0.062279157 134 nips-2004-Object Classification from a Single Example Utilizing Class Relevance Metrics

8 0.058445971 68 nips-2004-Face Detection --- Efficient and Rank Deficient

9 0.058237638 3 nips-2004-A Feature Selection Algorithm Based on the Global Minimization of a Generalization Error Bound

10 0.057667337 111 nips-2004-Maximal Margin Labeling for Multi-Topic Text Categorization

11 0.056907095 182 nips-2004-Synergistic Face Detection and Pose Estimation with Energy-Based Models

12 0.055124491 11 nips-2004-A Second Order Cone programming Formulation for Classifying Missing Data

13 0.053227838 178 nips-2004-Support Vector Classification with Input Data Uncertainty

14 0.052840512 197 nips-2004-Two-Dimensional Linear Discriminant Analysis

15 0.050408859 142 nips-2004-Outlier Detection with One-class Kernel Fisher Discriminants

16 0.048690289 21 nips-2004-An Information Maximization Model of Eye Movements

17 0.048602164 49 nips-2004-Density Level Detection is Classification

18 0.048497595 144 nips-2004-Parallel Support Vector Machines: The Cascade SVM

19 0.048092242 59 nips-2004-Efficient Kernel Discriminant Analysis via QR Decomposition

20 0.048013087 93 nips-2004-Kernel Projection Machine: a New Tool for Pattern Recognition


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.141), (1, 0.042), (2, -0.043), (3, 0.013), (4, 0.039), (5, 0.073), (6, 0.132), (7, -0.067), (8, 0.078), (9, 0.022), (10, -0.053), (11, 0.015), (12, -0.08), (13, -0.025), (14, 0.024), (15, -0.088), (16, -0.095), (17, 0.045), (18, 0.021), (19, 0.077), (20, 0.007), (21, 0.002), (22, 0.039), (23, 0.006), (24, 0.106), (25, 0.04), (26, 0.016), (27, 0.033), (28, -0.082), (29, 0.038), (30, 0.147), (31, -0.044), (32, 0.039), (33, -0.013), (34, -0.091), (35, -0.093), (36, -0.023), (37, 0.047), (38, 0.214), (39, 0.049), (40, 0.025), (41, -0.048), (42, -0.074), (43, 0.125), (44, -0.039), (45, 0.007), (46, -0.1), (47, 0.109), (48, 0.109), (49, -0.0)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.91715753 106 nips-2004-Machine Learning Applied to Perception: Decision Images for Gender Classification


2 0.66937357 191 nips-2004-The Variational Ising Classifier (VIC) Algorithm for Coherently Contaminated Data


3 0.51678699 182 nips-2004-Synergistic Face Detection and Pose Estimation with Energy-Based Models

Author: Margarita Osadchy, Matthew L. Miller, Yann L. Cun

Abstract: We describe a novel method for real-time, simultaneous multi-view face detection and facial pose estimation. The method employs a convolutional network to map face images to points on a manifold, parametrized by pose, and non-face images to points far from that manifold. This network is trained by optimizing a loss function of three variables: image, pose, and face/non-face label. We test the resulting system, in a single configuration, on three standard data sets – one for frontal pose, one for rotated faces, and one for profiles – and find that its performance on each set is comparable to previous multi-view face detectors that can only handle one form of pose variation. We also show experimentally that the system’s accuracy on both face detection and pose estimation is improved by training for the two tasks together.

4 0.4938412 46 nips-2004-Constraining a Bayesian Model of Human Visual Speed Perception

Author: Alan Stocker, Eero P. Simoncelli

Abstract: It has been demonstrated that basic aspects of human visual motion perception are qualitatively consistent with a Bayesian estimation framework, where the prior probability distribution on velocity favors slow speeds. Here, we present a refined probabilistic model that can account for the typical trial-to-trial variabilities observed in psychophysical speed perception experiments. We also show that data from such experiments can be used to constrain both the likelihood and prior functions of the model. Specifically, we measured matching speeds and thresholds in a two-alternative forced choice speed discrimination task. Parametric fits to the data reveal that the likelihood function is well approximated by a LogNormal distribution with a characteristic contrast-dependent variance, and that the prior distribution on velocity exhibits significantly heavier tails than a Gaussian, and approximately follows a power-law function.

5 0.4732888 139 nips-2004-Optimal Aggregation of Classifiers and Boosting Maps in Functional Magnetic Resonance Imaging


6 0.45849422 3 nips-2004-A Feature Selection Algorithm Based on the Global Minimization of a Generalization Error Bound

7 0.43765157 68 nips-2004-Face Detection --- Efficient and Rank Deficient

8 0.4362618 192 nips-2004-The power of feature clustering: An application to object detection

9 0.43100196 111 nips-2004-Maximal Margin Labeling for Multi-Topic Text Categorization

10 0.43060356 127 nips-2004-Neighbourhood Components Analysis

11 0.39601484 14 nips-2004-A Topographic Support Vector Machine: Classification Using Local Label Configurations

12 0.38494155 20 nips-2004-An Auditory Paradigm for Brain-Computer Interfaces

13 0.37498942 101 nips-2004-Learning Syntactic Patterns for Automatic Hypernym Discovery

14 0.3719759 34 nips-2004-Breaking SVM Complexity with Cross-Training

15 0.36798996 11 nips-2004-A Second Order Cone programming Formulation for Classifying Missing Data

16 0.36566755 136 nips-2004-On Semi-Supervised Classification

17 0.36365885 120 nips-2004-Modeling Conversational Dynamics as a Mixed-Memory Markov Process

18 0.358915 156 nips-2004-Result Analysis of the NIPS 2003 Feature Selection Challenge

19 0.35706472 199 nips-2004-Using Machine Learning to Break Visual Human Interaction Proofs (HIPs)

20 0.35653073 53 nips-2004-Discriminant Saliency for Visual Recognition from Cluttered Scenes


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(3, 0.34), (13, 0.062), (15, 0.11), (26, 0.069), (31, 0.029), (33, 0.182), (35, 0.029), (39, 0.017), (50, 0.031), (56, 0.02)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.77575874 106 nips-2004-Machine Learning Applied to Perception: Decision Images for Gender Classification


2 0.71683371 177 nips-2004-Supervised Graph Inference

Author: Jean-philippe Vert, Yoshihiro Yamanishi

Abstract: We formulate the problem of graph inference where part of the graph is known as a supervised learning problem, and propose an algorithm to solve it. The method involves the learning of a mapping of the vertices to a Euclidean space where the graph is easy to infer, and can be formulated as an optimization problem in a reproducing kernel Hilbert space. We report encouraging results on the problem of metabolic network reconstruction from genomic data. 1

3 0.71465772 55 nips-2004-Distributed Occlusion Reasoning for Tracking with Nonparametric Belief Propagation

Author: Erik B. Sudderth, Michael I. Mandel, William T. Freeman, Alan S. Willsky

Abstract: We describe a three–dimensional geometric hand model suitable for visual tracking applications. The kinematic constraints implied by the model’s joints have a probabilistic structure which is well described by a graphical model. Inference in this model is complicated by the hand’s many degrees of freedom, as well as multimodal likelihoods caused by ambiguous image measurements. We use nonparametric belief propagation (NBP) to develop a tracking algorithm which exploits the graph’s structure to control complexity, while avoiding costly discretization. While kinematic constraints naturally have a local structure, self– occlusions created by the imaging process lead to complex interpendencies in color and edge–based likelihood functions. However, we show that local structure may be recovered by introducing binary hidden variables describing the occlusion state of each pixel. We augment the NBP algorithm to infer these occlusion variables in a distributed fashion, and then analytically marginalize over them to produce hand position estimates which properly account for occlusion events. We provide simulations showing that NBP may be used to refine inaccurate model initializations, as well as track hand motion through extended image sequences. 1

4 0.56295407 69 nips-2004-Fast Rates to Bayes for Kernel Machines

Author: Ingo Steinwart, Clint Scovel

Abstract: We establish learning rates to the Bayes risk for support vector machines (SVMs) with hinge loss. In particular, for SVMs with Gaussian RBF kernels we propose a geometric condition for distributions which can be used to determine approximation properties of these kernels. Finally, we compare our methods with a recent paper of G. Blanchard et al.

5 0.56144929 161 nips-2004-Self-Tuning Spectral Clustering

Author: Lihi Zelnik-manor, Pietro Perona

Abstract: We study a number of open issues in spectral clustering: (i) Selecting the appropriate scale of analysis, (ii) Handling multi-scale data, (iii) Clustering with irregular background clutter, and, (iv) Finding automatically the number of groups. We first propose that a ‘local’ scale should be used to compute the affinity between each pair of points. This local scaling leads to better clustering especially when the data includes multiple scales and when the clusters are placed within a cluttered background. We further suggest exploiting the structure of the eigenvectors to infer automatically the number of groups. This leads to a new algorithm in which the final randomly initialized k-means stage is eliminated. 1

6 0.56062049 174 nips-2004-Spike Sorting: Bayesian Clustering of Non-Stationary Data

7 0.56041896 45 nips-2004-Confidence Intervals for the Area Under the ROC Curve

8 0.55850017 167 nips-2004-Semi-supervised Learning with Penalized Probabilistic Clustering

9 0.5581907 62 nips-2004-Euclidean Embedding of Co-Occurrence Data

10 0.55749249 77 nips-2004-Hierarchical Clustering of a Mixture Model

11 0.55697364 189 nips-2004-The Power of Selective Memory: Self-Bounded Learning of Prediction Suffix Trees

12 0.55666757 206 nips-2004-Worst-Case Analysis of Selective Sampling for Linear-Threshold Algorithms

13 0.55663633 3 nips-2004-A Feature Selection Algorithm Based on the Global Minimization of a Generalization Error Bound

14 0.55659777 103 nips-2004-Limits of Spectral Clustering

15 0.55655587 31 nips-2004-Blind One-microphone Speech Separation: A Spectral Learning Approach

16 0.55631417 207 nips-2004-ℓ₀-norm Minimization for Basis Selection

17 0.55554563 44 nips-2004-Conditional Random Fields for Object Recognition

18 0.5545783 86 nips-2004-Instance-Specific Bayesian Model Averaging for Classification

19 0.55417007 179 nips-2004-Surface Reconstruction using Learned Shape Models

20 0.55386621 187 nips-2004-The Entire Regularization Path for the Support Vector Machine