nips nips2007 nips2007-109 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Mehul Parsana, Sourangshu Bhattacharya, Chiru Bhattacharya, K. Ramakrishnan
Abstract: This paper introduces kernels on attributed pointsets, which are sets of vectors embedded in a Euclidean space. The embedding gives the notion of neighborhood, which is used to define positive semidefinite kernels on pointsets. Two novel kernels on neighborhoods are proposed, one evaluating attribute similarity and the other evaluating shape similarity. The shape similarity function is motivated by spectral graph matching techniques. The kernels are tested on three real-life applications: face recognition, photo album tagging, and shot annotation in video sequences, with encouraging results. 1
Reference: text
sentIndex sentText sentNum sentScore
1 in Abstract This paper introduces kernels on attributed pointsets, which are sets of vectors embedded in a Euclidean space. [sent-12, score-0.584]
2 The embedding gives the notion of neighborhood, which is used to define positive semidefinite kernels on pointsets. [sent-13, score-0.329]
3 Two novel kernels on neighborhoods are proposed, one evaluating the attribute similarity and the other evaluating shape similarity. [sent-14, score-0.582]
4 The shape similarity function is motivated by spectral graph matching techniques. [sent-15, score-0.265]
5 The kernels are tested on three real life applications: face recognition, photo album tagging, and shot annotation in video sequences, with encouraging results. [sent-16, score-0.927]
6 1 Introduction. In recent times, one of the major challenges in kernel methods has been the design of kernels on structured data, e.g. [sent-17, score-0.473]
7 In this paper, we propose kernels on a type of structured object called attributed pointsets [18]. [sent-20, score-0.863]
8 Attributed pointsets are points embedded in a Euclidean space with a vector of attributes attached to each point. [sent-21, score-0.506]
9 The embedding of the points in the Euclidean space yields a notion of the neighborhood of each point, which is exploited in designing the new kernels. [sent-22, score-0.312]
10 Also, we describe the notion of similarity between pointsets, which models many real-life scenarios, and incorporate it into the proposed kernels. [sent-23, score-0.512]
11 The main contribution of this paper is the definition of two different kernels on neighborhoods. [sent-24, score-0.289]
12 These neighborhood kernels are then used to define kernels on the entire pointsets. [sent-25, score-0.746]
13 The first kernel treats the neighborhoods as sets of vectors for calculating the similarity. [sent-26, score-0.374]
14 The second kernel calculates the similarity in shape of the two neighborhoods. [sent-27, score-0.312]
15 We demonstrate practical applications of the kernels on the well-known task of face recognition [20], and on two other novel tasks: tagging photo albums and annotating shots in video sequences. [sent-29, score-1.143]
16 For the face recognition task, we test our kernels on benchmark datasets and compare their performance with state-of-the-art algorithms. [sent-30, score-0.592]
17 Our kernels outperform the existing methods in many cases. [sent-31, score-0.289]
18 The kernels also perform according to expectation on the two novel applications. [sent-32, score-0.289]
19 Section 2 defines attributed pointsets and contrasts them with related notions. [sent-33, score-0.574]
20 Section 3 proposes two kernels and section 4 describes experimental results. [sent-34, score-0.289]
21 2 Definition and related work An attributed pointset [18, 1] (a. [sent-35, score-0.38]
22 Also, for practical purposes pointsets with u = 2, 3 are of interest. [sent-47, score-0.368]
23 The construct of pointsets is richer than sets of vectors [17] because of the structure formed by the embedding of the points in a Euclidean space. [sent-48, score-0.558]
24 However, they are less general than attributed graphs, because not all attributed graphs can be embedded into a Euclidean space. [sent-49, score-0.517]
25 The notion of similarity between pointsets is also different from those between sets of vectors, or graphs. [sent-51, score-0.465]
26 The main aspect of similarity is that there should be correspondences (1-1 mappings) between the points of the two pointsets such that the relative positions of corresponding points are the same. [sent-52, score-0.373]
27 In the case of sets of vectors, the kernel function captures the similarity between aggregate properties of the two sets, such as the principal angles between the spanned subspaces [17], or the distance between the distributions generating the vectors [9]. [sent-54, score-0.327]
28 For example, consider recognizing faces using local descriptors calculated at some descriptor points (corner points in this case) on the face. [sent-56, score-0.483]
29 It is necessary that subsets of descriptor points found in two images of the same face should be approximately superimposable (slight changes may be due to a change of expression) and that the descriptor values for the corresponding points should be roughly the same, to ensure similar local features. [sent-57, score-0.629]
30 Thus, a face can be modeled as an attributed pointset X = {(x_i, d_i) | i = 1, ..., n}. [sent-58, score-0.591]
31 Here x_i ∈ R^2 is the coordinate of the i-th descriptor point and d_i ∈ R^v is the local descriptor vector at that point. [sent-61, score-0.389]
32 A local descriptor based kernel was proposed for object recognition in a similar setting in [12]. [sent-63, score-0.384]
33 The normalized sum kernel [12] was defined as K_NS(X^A, X^B) = (1/(n_A n_B)) Σ_{i=1}^{n_A} Σ_{j=1}^{n_B} (K(d_i^A, d_j^B))^p, where K(d_i^A, d_j^B) is some kernel function on the descriptors. [sent-71, score-0.368]
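As an illustration, here is a minimal pure-Python sketch of the normalized sum kernel above; the RBF descriptor kernel, its parameter gamma, and the toy pointsets are assumptions made for this example only.

```python
import math

def k_rbf(d1, d2, gamma=0.5):
    # RBF kernel on descriptor vectors (gamma is an assumed parameter).
    sq = sum((a - b) ** 2 for a, b in zip(d1, d2))
    return math.exp(-gamma * sq)

def k_ns(X_A, X_B, p=3):
    # Normalized sum kernel of eq. (1): average over all point pairs
    # of the descriptor kernel raised to the power p.
    total = 0.0
    for (_, dA) in X_A:
        for (_, dB) in X_B:
            total += k_rbf(dA, dB) ** p
    return total / (len(X_A) * len(X_B))

# Each pointset is a list of (coordinates, descriptor) pairs.
X_A = [((0.0, 0.0), (1.0, 0.0)), ((1.0, 0.0), (0.0, 1.0))]
X_B = [((0.1, 0.0), (1.0, 0.0)), ((1.1, 0.0), (0.0, 1.0))]
print(k_ns(X_A, X_B))  # matched descriptor pairs dominate after powering
```

Raising to the power p suppresses the contribution of mismatched pairs, so the sum approximates a matching score, as argued in [12].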
34 It was argued in [12] that raising the kernel to a high power p approximately calculates the similarity between matched pairs of vectors. [sent-72, score-0.312]
35 In fact, this kernel can be derived as a special case of the set kernel proposed in [15]. [sent-74, score-0.368]
36 The kernel K(A, B) = trace(Σ_r (A^T G_r B) F_r) becomes K(A, B) = Σ_{ij} k(a_i, b_j) f_ij for G_r = I and F = Σ_r F_r (whose entries are f_ij), where F should be positive semidefinite [15]. [sent-75, score-0.213]
37 Thus, choosing F = 11^T (all entries 1), multiplying the kernel by 1/(n_A n_B), and using K_RBF as the kernel on vectors, we get back the kernel defined in (1). [sent-76, score-0.552]
38 The normalized sum kernel is used as the basic kernel for development and validation of the new kernels proposed here. [sent-77, score-0.657]
39 3 Kernels. 3.1 Neighborhood kernels. The key idea in this section is to use the spatially co-occurring points of a point to improve the similarity values given by the kernel function. [sent-80, score-0.631]
40 In other words, we hypothesize that similar points from two pointsets should also have neighboring points which are similar. [sent-81, score-0.49]
41 Thus, for each point we define a neighborhood of the point and weight the similarity between each pair of points with the similarity between their neighborhoods. [sent-82, score-0.423]
42 The k-neighborhood N_i of a point (x_i, d_i) in a pointset X is defined as the set of k points (including the point itself) that are closest to it in the embedding Euclidean space. [sent-83, score-0.355]
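The k-neighborhood definition can be sketched directly; the tuple-based pointset representation and the toy data are assumptions for this example.

```python
def k_neighborhood(X, i, k):
    # N_i: the k points of X closest (in the embedding Euclidean
    # space) to point i, including point i itself.
    xi, _ = X[i]
    def sq_dist(point):
        xj, _ = point
        return sum((a - b) ** 2 for a, b in zip(xi, xj))
    return sorted(X, key=sq_dist)[:k]

# Toy pointset: (coordinates, descriptor) pairs.
X = [((0.0, 0.0), (1.0,)), ((0.1, 0.0), (2.0,)), ((5.0, 5.0), (3.0,))]
N0 = k_neighborhood(X, 0, 2)
print([d for _, d in N0])  # -> [(1.0,), (2.0,)]
```

Sorting by squared distance avoids the square root and, since Python's sort is stable, ties are broken by the original point order.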
43 Figure 1 shows values of K_NS and K_NK for 4 pairs of points from two pointsets modeling faces. [sent-87, score-0.368]
44 The next section proposes a kernel which uses the positions of points (x_i) in a neighborhood more strongly, to calculate similarity in shape. [sent-91, score-0.51]
45 3.2 Spectral Neighborhood Kernel. The kernel defined in the previous section still uses a set-of-vectors kernel for finding the similarity between the neighborhoods. [sent-93, score-0.511]
46 Here, we are interested in a kernel function which evaluates the similarity in relative position of the corresponding points. [sent-94, score-0.281]
47 Since the neighborhoods being compared are of fixed size, we assume that all points in a neighborhood have a corresponding point in the other. [sent-95, score-0.373]
48 We use the features given by spectral decomposition of adjacency matrix of the neighborhood to define a kernel function. [sent-98, score-0.522]
49 Given a neighborhood N_i, we define its adjacency matrix A_i as A_i(s, t) = exp(−||x_s − x_t|| / α), ∀ s, t such that (x_s, d_s), (x_t, d_t) ∈ N_i, where α is a parameter. [sent-99, score-0.251]
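The adjacency matrix above can be computed as follows; α = 1 is just an example value, and the toy neighborhood is an assumption.

```python
import math

def adjacency(neigh, alpha=1.0):
    # A(s, t) = exp(-||x_s - x_t|| / alpha) over the points of a
    # neighborhood; alpha controls the decay with distance.
    pts = [x for x, _ in neigh]
    n = len(pts)
    A = [[0.0] * n for _ in range(n)]
    for s in range(n):
        for t in range(n):
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(pts[s], pts[t])))
            A[s][t] = math.exp(-dist / alpha)
    return A

neigh = [((0.0, 0.0), None), ((3.0, 4.0), None)]
A = adjacency(neigh)
print(A[0][1])  # exp(-5.0), since ||(0,0) - (3,4)|| = 5
```

The matrix is symmetric with unit diagonal, so its spectral decomposition (used below) is well defined.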
50 Given two neighborhoods N_i^A and N_j^B, we are thus interested in a permutation π of the basis of the adjacency matrix of one of the neighborhoods (say N_j^B) such that ||A_i^A − π(A_j^B)||_F is minimized. [sent-100, score-0.41]
51 Call f_i^A = |ζ_i^A(1)| and f_j^B = |ζ_j^B(1)| the spectral projection vectors corresponding to neighborhoods N_i^A and N_j^B. [sent-109, score-0.378]
52 It is equivalent to seek a permutation π* which minimizes ||f_i^A − π(f_j^B)||, for comparing neighborhoods N_i^A and N_j^B. [sent-112, score-0.217]
53 To construct a positive semidefinite kernel giving the similarity between the vectors f_i^A and f_j^B, we use the convolution kernel technique [7] on discrete structures. [sent-115, score-0.618]
54 The convolution kernel K over X is defined as: K(x, y) = Σ_{(x_1,...,x_m) ∈ R^{-1}(x)} Σ_{(y_1,...,y_m) ∈ R^{-1}(y)} Π_{i=1}^{m} K_i(x_i, y_i). [sent-140, score-0.224]
55 Here f(1), ..., f(k) are the spectral projections of the points of neighborhood N_i^A. [sent-169, score-0.35]
56 Thus, from the above equation, the convolution kernel becomes K_SN(N_i^A, N_j^B) = (1/k!) Σ_{π∈Π} exp(−||f_i^A − π(f_j^B)||² / β). (6) [sent-174, score-0.224]
57 The spectral kernel (SK) K_SK between two pointsets X^A and X^B is thus defined as: K_SK(X^A, X^B) = (1/(n_A n_B)) Σ_{i=1}^{n_A} Σ_{j=1}^{n_B} K_RBF(d_i^A, d_j^B) K_SN(N_i^A, N_j^B). (7) The following theorem relates K_SN(N_i^A, N_j^B) to S(N_i^A, N_j^B) (eqn 4). [sent-179, score-0.673]
58 4 Experimental Results. In order to study the effectiveness of the proposed kernels for practical visual tasks, we applied them to three problems. [sent-248, score-0.289]
59 Firstly, the kernels were applied to the well-known problem of face recognition [20], and results on two benchmark datasets (AR and ORL) were compared to existing state-of-the-art methods. [sent-249, score-0.592]
60 Next we used the spectral kernel to tag images in personal photo albums using faces of people present in them. [sent-250, score-0.894]
61 Finally, the spectral kernel was used for annotation of video sequences using faces of people present. [sent-251, score-0.704]
62 Attributes. For face recognition, faces were modeled as attributed pointsets using local Gabor descriptors [10] calculated at the corner points found by the Harris corner detector [6]. [sent-252, score-1.234]
63 For image tagging and video annotation, faces were modeled as attributed pointsets using local SIFT descriptors [11], which are 128-dimensional at each point. [sent-255, score-1.131]
64 The face detector provided in OpenCV was used for detecting faces in album images and video frames. [sent-259, score-0.683]
65 Dataset. The AR dataset [13] is composed of color images of 135 people (75 men and 60 women). [sent-260, score-0.244]
66 4.1 Face Recognition on the AR face DB. The kernels proposed in this paper were tested on pointsets derived from images in the AR face DB. [sent-268, score-1.124]
67 The AR face DB is a standard benchmark dataset, on which a recent comparison of state-of-the-art methods for face recognition is given in [14]. [sent-270, score-0.477]
68 All the results reported in Table 1 have been obtained using one normal (no occlusion or change of expression) face image as the training set. [sent-272, score-0.228]
69 It can be seen that for all the images showing change of expression (Smile, Angry and Scream), the pointset kernels outperform existing methods. [sent-273, score-0.582]
70 Also, in case of occlusion of the face by glasses, the pointset kernels give better results than existing methods. (Table 2: Recognition accuracy on the ORL dataset, section 4.2.) [sent-274, score-0.258]
71 (Figure 2: Representative cluster from tagging of an album.) [sent-284, score-0.759]
72 However, in case of occlusion by a scarf, the kernel-based methods do not perform as well as Face-ARG or AMM. [sent-285, score-0.238]
73 It was observed that about 50% of the descriptor points in faces with scarves were in the scarf region of the image. [sent-287, score-0.411]
74 Summing the similarities over such a large number of extra points makes the overall kernel value noisy. [sent-288, score-0.245]
75 However, performance of the kernels is comparable to the existing methods, thus demonstrating the effectiveness of modeling faces as attributed pointsets. [sent-292, score-0.661]
76 4.2 Recognition performance on the ORL dataset. Real-life problems in face recognition also show minor variations in pose, which are addressed by testing the kernels on images in the ORL dataset. [sent-294, score-0.76]
77 The problem was posed as a multiclass classification problem, and an SVM was used along with the kernels for classification. [sent-295, score-0.322]
78 Table 2 reports the recognition accuracies of all the three kernels for two different values of parameters, and for 1, 3 and 5 training images. [sent-296, score-0.415]
79 It can be seen that even with images showing minor variations in pose, the proposed kernels perform reasonably well. [sent-297, score-0.446]
80 Also, due to changes in pose, the relative positions of points in the pointsets change. [sent-298, score-0.48]
81 This is reflected in the fact that the improvement due to the addition of position information in the kernels is minor compared to that seen on the AR dataset. [sent-299, score-0.327]
82 For higher numbers of training images, the performance of all the kernels saturates at 98%. [sent-300, score-0.289]
83 4.3 Tagging images in personal albums based on faces. Tagging images in personal albums with the names of the people present in them is a problem of high practical relevance [19]. [sent-302, score-0.961]
84 Five personal albums having 20-55 images each were downloaded, and many images had up to 6 people. [sent-307, score-0.424]
85 Faces detected in the images were represented as attributed pointsets using local SIFT descriptors, and the spectral kernel was evaluated on them. [sent-313, score-1.056]
86 Ideally, each cluster thus obtained should represent a person and images containing faces from a given cluster should be tagged with the name of that person. [sent-315, score-0.403]
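The clustering step can be sketched with a simple greedy procedure; the paper does not specify its clustering algorithm here, so the greedy method, the threshold, and the toy similarity function below are all assumptions.

```python
def greedy_cluster(items, sim, threshold=0.5):
    # Greedy clustering sketch: assign each face to the first cluster
    # whose representative (first member) is similar enough, otherwise
    # start a new cluster. sim would be the (normalized) kernel value.
    clusters = []
    for item in items:
        for c in clusters:
            if sim(item, c[0]) >= threshold:
                c.append(item)
                break
        else:
            clusters.append([item])
    return clusters

faces = [1.0, 1.05, 5.0, 5.1]                  # stand-ins for face pointsets
sim = lambda a, b: 1.0 / (1.0 + abs(a - b))    # toy similarity, not the kernel
print([len(c) for c in greedy_cluster(faces, sim, 0.8)])  # -> [2, 2]
```

Each resulting cluster would then be tagged with one person's name, and every image containing a face from that cluster inherits the tag.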
87 We intend to make the dataset publicly available if no copyrights are violated. (Table 3: Face-based album tagging.) [sent-316, score-0.267]
88 (Figure 3: Keyframes of a few shots detected, with annotation.) Table 3 reports results from tagging experiments for the five albums. [sent-325, score-0.412]
89 4.4 Video annotation based on faces. The kernels were also used to perform video shot annotation based on faces detected in video sequences. [sent-337, score-1.072]
90 The faces annotating the shots are shown on the left as thumbnails. [sent-346, score-0.296]
91 The results on detecting shots were highly encouraging, thus demonstrating the varied applicability of the proposed attributed pointset kernels. [sent-348, score-0.481]
92 5 Conclusion In this article, we propose kernels on attributed pointsets. [sent-349, score-0.495]
93 We define the notion of neighborhood in an attributed pointset and propose two new kernels. [sent-350, score-0.548]
94 The first kernel evaluates attribute similarities between the neighborhoods and uses the co-occurrence information to improve the performance of kernels on sets of vectors. [sent-351, score-0.669]
95 This kernel function is motivated by spectral graph matching techniques. [sent-353, score-0.352]
96 The proposed kernels were validated on the well-known task of face recognition on two popular benchmark datasets. [sent-354, score-0.592]
97 Results show that the proposed kernels perform competitively with state-of-the-art techniques for face recognition. [sent-355, score-0.463]
98 The spectral kernel was also used to perform two real-life tasks: tagging images in personal photo albums and annotating shots in videos. [sent-356, score-0.978]
99 Algebraic set kernels with application to inference over local image representations. [sent-430, score-0.289]
100 Automated annotation of human faces in family albums, 2003. [sent-450, score-0.257]
wordName wordTfidf (topN-words)
[('pointsets', 0.368), ('kernels', 0.289), ('nia', 0.213), ('attributed', 0.206), ('kernel', 0.184), ('face', 0.174), ('pointset', 0.174), ('neighborhood', 0.168), ('faces', 0.166), ('kn', 0.166), ('ni', 0.161), ('db', 0.16), ('njb', 0.155), ('neighborhoods', 0.144), ('tagging', 0.129), ('spectral', 0.121), ('images', 0.119), ('albums', 0.118), ('krbf', 0.116), ('nb', 0.115), ('album', 0.108), ('da', 0.107), ('descriptor', 0.107), ('ksn', 0.101), ('shots', 0.101), ('ar', 0.099), ('similarity', 0.097), ('fia', 0.097), ('recognition', 0.093), ('annotation', 0.091), ('na', 0.091), ('descriptors', 0.088), ('video', 0.086), ('scarf', 0.077), ('xm', 0.075), ('permutation', 0.073), ('orl', 0.072), ('personal', 0.068), ('xb', 0.068), ('fj', 0.067), ('nj', 0.067), ('photo', 0.062), ('points', 0.061), ('eq', 0.059), ('cluster', 0.059), ('detected', 0.058), ('people', 0.056), ('occlusion', 0.054), ('attribute', 0.052), ('corner', 0.052), ('pose', 0.051), ('adjacency', 0.049), ('semide', 0.048), ('life', 0.047), ('matching', 0.047), ('vectors', 0.046), ('xa', 0.046), ('euclidean', 0.043), ('correspondences', 0.041), ('convolution', 0.04), ('embedding', 0.04), ('eigenvectors', 0.04), ('amnon', 0.039), ('angry', 0.039), ('glasses', 0.039), ('gnu', 0.039), ('ksk', 0.039), ('lapack', 0.039), ('men', 0.039), ('scream', 0.039), ('shot', 0.039), ('smile', 0.039), ('sourangshu', 0.039), ('tpami', 0.039), ('women', 0.039), ('minor', 0.038), ('lim', 0.037), ('di', 0.037), ('gabor', 0.037), ('benchmark', 0.036), ('videos', 0.036), ('xs', 0.034), ('attached', 0.034), ('opencv', 0.034), ('reports', 0.033), ('posed', 0.033), ('encouraging', 0.031), ('xi', 0.031), ('rv', 0.031), ('calculates', 0.031), ('graphs', 0.031), ('dataset', 0.03), ('doesn', 0.03), ('detector', 0.03), ('nk', 0.03), ('illumination', 0.029), ('fij', 0.029), ('annotating', 0.029), ('fr', 0.029), ('twentieth', 0.029)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999958 109 nips-2007-Kernels on Attributed Pointsets with Applications
Author: Mehul Parsana, Sourangshu Bhattacharya, Chiru Bhattacharya, K. Ramakrishnan
Abstract: This paper introduces kernels on attributed pointsets, which are sets of vectors embedded in a Euclidean space. The embedding gives the notion of neighborhood, which is used to define positive semidefinite kernels on pointsets. Two novel kernels on neighborhoods are proposed, one evaluating attribute similarity and the other evaluating shape similarity. The shape similarity function is motivated by spectral graph matching techniques. The kernels are tested on three real-life applications: face recognition, photo album tagging, and shot annotation in video sequences, with encouraging results. 1
2 0.14351042 155 nips-2007-Predicting human gaze using low-level saliency combined with face detection
Author: Moran Cerf, Jonathan Harel, Wolfgang Einhaeuser, Christof Koch
Abstract: Under natural viewing conditions, human observers shift their gaze to allocate processing resources to subsets of the visual input. Many computational models try to predict such voluntary eye and attentional shifts. Although the important role of high level stimulus properties (e.g., semantic information) in search stands undisputed, most models are based on low-level image properties. We here demonstrate that a combined model of face detection and low-level saliency significantly outperforms a low-level model in predicting locations humans fixate on, based on eye-movement recordings of humans observing photographs of natural scenes, most of which contained at least one person. Observers, even when not instructed to look for anything particular, fixate on a face with a probability of over 80% within their first two fixations; furthermore, they exhibit more similar scanpaths when faces are present. Remarkably, our model’s predictive performance in images that do not contain faces is not impaired, and is even improved in some cases by spurious face detector responses. 1
3 0.13840529 190 nips-2007-Support Vector Machine Classification with Indefinite Kernels
Author: Ronny Luss, Alexandre D'aspremont
Abstract: In this paper, we propose a method for support vector machine classification using indefinite kernels. Instead of directly minimizing or stabilizing a nonconvex loss function, our method simultaneously finds the support vectors and a proxy kernel matrix used in computing the loss. This can be interpreted as a robust classification problem where the indefinite kernel matrix is treated as a noisy observation of the true positive semidefinite kernel. Our formulation keeps the problem convex and relatively large problems can be solved efficiently using the analytic center cutting plane method. We compare the performance of our technique with other methods on several data sets.
4 0.10804859 118 nips-2007-Learning with Transformation Invariant Kernels
Author: Christian Walder, Olivier Chapelle
Abstract: This paper considers kernels invariant to translation, rotation and dilation. We show that no non-trivial positive definite (p.d.) kernels exist which are radial and dilation invariant, only conditionally positive definite (c.p.d.) ones. Accordingly, we discuss the c.p.d. case and provide some novel analysis, including an elementary derivation of a c.p.d. representer theorem. On the practical side, we give a support vector machine (s.v.m.) algorithm for arbitrary c.p.d. kernels. For the thinplate kernel this leads to a classifier with only one parameter (the amount of regularisation), which we demonstrate to be as effective as an s.v.m. with the Gaussian kernel, even though the Gaussian involves a second parameter (the length scale). 1
5 0.10643865 18 nips-2007-A probabilistic model for generating realistic lip movements from speech
Author: Gwenn Englebienne, Tim Cootes, Magnus Rattray
Abstract: The present work aims to model the correspondence between facial motion and speech. The face and sound are modelled separately, with phonemes being the link between both. We propose a sequential model and evaluate its suitability for the generation of the facial animation from a sequence of phonemes, which we obtain from speech. We evaluate the results both by computing the error between generated sequences and real video, as well as with a rigorous double-blind test with human subjects. Experiments show that our model compares favourably to other existing methods and that the sequences generated are comparable to real video sequences. 1
6 0.091495991 160 nips-2007-Random Features for Large-Scale Kernel Machines
7 0.088023014 115 nips-2007-Learning the 2-D Topology of Images
8 0.082619213 65 nips-2007-DIFFRAC: a discriminative and flexible framework for clustering
9 0.079136871 143 nips-2007-Object Recognition by Scene Alignment
10 0.076762989 188 nips-2007-Subspace-Based Face Recognition in Analog VLSI
11 0.071567543 139 nips-2007-Nearest-Neighbor-Based Active Learning for Rare Category Detection
12 0.069356963 75 nips-2007-Efficient Bayesian Inference for Dynamically Changing Graphs
13 0.068708524 50 nips-2007-Combined discriminative and generative articulated pose and non-rigid shape estimation
14 0.067807764 94 nips-2007-Gaussian Process Models for Link Analysis and Transfer Learning
15 0.067444079 134 nips-2007-Multi-Task Learning via Conic Programming
16 0.067229778 10 nips-2007-A Randomized Algorithm for Large Scale Support Vector Learning
17 0.065711588 212 nips-2007-Using Deep Belief Nets to Learn Covariance Kernels for Gaussian Processes
18 0.064592831 7 nips-2007-A Kernel Statistical Test of Independence
19 0.062094364 49 nips-2007-Colored Maximum Variance Unfolding
20 0.060945369 137 nips-2007-Multiple-Instance Pruning For Learning Efficient Cascade Detectors
topicId topicWeight
[(0, -0.18), (1, 0.073), (2, -0.096), (3, 0.018), (4, -0.013), (5, 0.079), (6, 0.024), (7, 0.151), (8, -0.068), (9, 0.131), (10, -0.014), (11, 0.0), (12, -0.025), (13, 0.058), (14, -0.229), (15, -0.027), (16, -0.022), (17, -0.052), (18, -0.029), (19, 0.12), (20, 0.07), (21, 0.061), (22, 0.041), (23, 0.086), (24, -0.064), (25, -0.012), (26, -0.017), (27, 0.013), (28, -0.098), (29, 0.03), (30, -0.001), (31, 0.029), (32, 0.108), (33, 0.028), (34, 0.097), (35, 0.054), (36, 0.042), (37, -0.02), (38, -0.155), (39, 0.082), (40, -0.007), (41, 0.022), (42, 0.025), (43, -0.064), (44, 0.08), (45, -0.019), (46, 0.084), (47, -0.076), (48, -0.08), (49, 0.058)]
simIndex simValue paperId paperTitle
same-paper 1 0.96861184 109 nips-2007-Kernels on Attributed Pointsets with Applications
Author: Mehul Parsana, Sourangshu Bhattacharya, Chiru Bhattacharya, K. Ramakrishnan
Abstract: This paper introduces kernels on attributed pointsets, which are sets of vectors embedded in an euclidean space. The embedding gives the notion of neighborhood, which is used to define positive semidefinite kernels on pointsets. Two novel kernels on neighborhoods are proposed, one evaluating the attribute similarity and the other evaluating shape similarity. Shape similarity function is motivated from spectral graph matching techniques. The kernels are tested on three real life applications: face recognition, photo album tagging, and shot annotation in video sequences, with encouraging results. 1
2 0.5833863 190 nips-2007-Support Vector Machine Classification with Indefinite Kernels
Author: Ronny Luss, Alexandre D'aspremont
Abstract: In this paper, we propose a method for support vector machine classification using indefinite kernels. Instead of directly minimizing or stabilizing a nonconvex loss function, our method simultaneously finds the support vectors and a proxy kernel matrix used in computing the loss. This can be interpreted as a robust classification problem where the indefinite kernel matrix is treated as a noisy observation of the true positive semidefinite kernel. Our formulation keeps the problem convex and relatively large problems can be solved efficiently using the analytic center cutting plane method. We compare the performance of our technique with other methods on several data sets.
3 0.56564647 118 nips-2007-Learning with Transformation Invariant Kernels
Author: Christian Walder, Olivier Chapelle
Abstract: This paper considers kernels invariant to translation, rotation and dilation. We show that no non-trivial positive definite (p.d.) kernels exist which are radial and dilation invariant, only conditionally positive definite (c.p.d.) ones. Accordingly, we discuss the c.p.d. case and provide some novel analysis, including an elementary derivation of a c.p.d. representer theorem. On the practical side, we give a support vector machine (s.v.m.) algorithm for arbitrary c.p.d. kernels. For the thinplate kernel this leads to a classifier with only one parameter (the amount of regularisation), which we demonstrate to be as effective as an s.v.m. with the Gaussian kernel, even though the Gaussian involves a second parameter (the length scale). 1
4 0.56008989 160 nips-2007-Random Features for Large-Scale Kernel Machines
Author: Ali Rahimi, Benjamin Recht
Abstract: To accelerate the training of kernel machines, we propose to map the input data to a randomized low-dimensional feature space and then apply existing fast linear methods. The features are designed so that the inner products of the transformed data are approximately equal to those in the feature space of a user specified shiftinvariant kernel. We explore two sets of random features, provide convergence bounds on their ability to approximate various radial basis kernels, and show that in large-scale classification and regression tasks linear machine learning algorithms applied to these features outperform state-of-the-art large-scale kernel machines. 1
5 0.50996888 188 nips-2007-Subspace-Based Face Recognition in Analog VLSI
Author: Gonzalo Carvajal, Waldo Valenzuela, Miguel Figueroa
Abstract: We describe an analog-VLSI neural network for face recognition based on subspace methods. The system uses a dimensionality-reduction network whose coefficients can be either programmed or learned on-chip to perform PCA, or programmed to perform LDA. A second network with userprogrammed coefficients performs classification with Manhattan distances. The system uses on-chip compensation techniques to reduce the effects of device mismatch. Using the ORL database with 12x12-pixel images, our circuit achieves up to 85% classification performance (98% of an equivalent software implementation). 1
6 0.48129979 45 nips-2007-Classification via Minimum Incremental Coding Length (MICL)
7 0.47683021 18 nips-2007-A probabilistic model for generating realistic lip movements from speech
8 0.46095338 49 nips-2007-Colored Maximum Variance Unfolding
9 0.46015498 155 nips-2007-Predicting human gaze using low-level saliency combined with face detection
10 0.45391473 139 nips-2007-Nearest-Neighbor-Based Active Learning for Rare Category Detection
11 0.44553515 152 nips-2007-Parallelizing Support Vector Machines on Distributed Computers
12 0.44393361 70 nips-2007-Discriminative K-means for Clustering
13 0.43497574 71 nips-2007-Discriminative Keyword Selection Using Support Vector Machines
14 0.40693533 67 nips-2007-Direct Importance Estimation with Model Selection and Its Application to Covariate Shift Adaptation
15 0.4068481 7 nips-2007-A Kernel Statistical Test of Independence
16 0.40383205 115 nips-2007-Learning the 2-D Topology of Images
17 0.40099961 108 nips-2007-Kernel Measures of Conditional Dependence
18 0.39638507 134 nips-2007-Multi-Task Learning via Conic Programming
19 0.3813085 193 nips-2007-The Distribution Family of Similarity Distances
20 0.37817499 137 nips-2007-Multiple-Instance Pruning For Learning Efficient Cascade Detectors
topicId topicWeight
[(5, 0.025), (13, 0.019), (16, 0.024), (21, 0.038), (34, 0.014), (35, 0.02), (47, 0.052), (83, 0.655), (85, 0.011), (87, 0.022), (90, 0.02)]
simIndex simValue paperId paperTitle
same-paper 1 0.99668556 109 nips-2007-Kernels on Attributed Pointsets with Applications
Author: Mehul Parsana, Sourangshu Bhattacharya, Chiru Bhattacharya, K. Ramakrishnan
Abstract: This paper introduces kernels on attributed pointsets, which are sets of vectors embedded in an euclidean space. The embedding gives the notion of neighborhood, which is used to define positive semidefinite kernels on pointsets. Two novel kernels on neighborhoods are proposed, one evaluating the attribute similarity and the other evaluating shape similarity. Shape similarity function is motivated from spectral graph matching techniques. The kernels are tested on three real life applications: face recognition, photo album tagging, and shot annotation in video sequences, with encouraging results. 1
2 0.9960891 159 nips-2007-Progressive mixture rules are deviation suboptimal
Author: Jean-yves Audibert
Abstract: We consider the learning task consisting in predicting as well as the best function in a finite reference set G up to the smallest possible additive term. If R(g) denotes the generalization error of a prediction function g, under reasonable assumptions on the loss function (typically satisfied by the least square loss when the output is bounded), it is known that the progressive mixture rule g satisfies ˆ log |G| (1) ER(ˆ) ≤ ming∈G R(g) + Cst g , n where n denotes the size of the training set, and E denotes the expectation w.r.t. the training set distribution.This work shows that, surprisingly, for appropriate reference sets G, the deviation convergence rate of the progressive mixture rule is √ no better than Cst / n: it fails to achieve the expected Cst /n. We also provide an algorithm which does not suffer from this drawback, and which is optimal in both deviation and expectation convergence rates. 1
3 0.99479336 132 nips-2007-Modeling image patches with a directed hierarchy of Markov random fields
Author: Simon Osindero, Geoffrey E. Hinton
Abstract: We describe an efficient learning procedure for multilayer generative models that combine the best aspects of Markov random fields and deep, directed belief nets. The generative models can be learned one layer at a time and when learning is complete they have a very fast inference procedure for computing a good approximation to the posterior distribution in all of the hidden layers. Each hidden layer has its own MRF whose energy function is modulated by the top-down directed connections from the layer above. To generate from the model, each layer in turn must settle to equilibrium given its top-down input. We show that this type of model is good at capturing the statistics of patches of natural images.
4 0.99332261 110 nips-2007-Learning Bounds for Domain Adaptation
Author: John Blitzer, Koby Crammer, Alex Kulesza, Fernando Pereira, Jennifer Wortman
Abstract: Empirical risk minimization offers well-known learning guarantees when training and test data come from the same domain. In the real world, though, we often wish to adapt a classifier from a source domain with a large amount of training data to a different target domain with very little training data. In this work we give uniform convergence bounds for algorithms that minimize a convex combination of source and target empirical risk. The bounds explicitly model the inherent trade-off between training on a large but inaccurate source data set and a small but accurate target training set. Our theory also gives results when we have multiple source domains, each of which may have a different number of instances, and we exhibit cases in which minimizing a non-uniform combination of source risks can achieve much lower target error than standard empirical risk minimization.
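The objective this abstract analyzes can be sketched concretely, with choices the abstract does not fix: a linear classifier, logistic loss as the convex surrogate, and plain gradient descent. The mixing weight `alpha` trades the small-but-accurate target set against the large-but-possibly-inaccurate source set; the function name is hypothetical.

```python
import numpy as np

def combined_risk_classifier(Xs, ys, Xt, yt, alpha=0.7, lr=0.1, steps=500):
    """Minimize alpha * target empirical risk + (1 - alpha) * source risk.

    Logistic loss on a linear model; labels are 0/1. alpha = 1 ignores the
    source domain entirely, alpha = 0 ignores the target domain.
    """
    w = np.zeros(Xs.shape[1])
    def grad(X, y, w):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))     # predicted probabilities
        return X.T @ (p - y) / len(y)           # logistic-loss gradient
    for _ in range(steps):
        w -= lr * (alpha * grad(Xt, yt, w) + (1 - alpha) * grad(Xs, ys, w))
    return w
```

The bounds in the paper then say how to pick `alpha` as a function of the two sample sizes and the divergence between the domains.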
5 0.9919458 61 nips-2007-Convex Clustering with Exemplar-Based Models
Author: Danial Lashkari, Polina Golland
Abstract: Clustering is often formulated as the maximum likelihood estimation of a mixture model that explains the data. The EM algorithm widely used to solve the resulting optimization problem is inherently a gradient-descent method and is sensitive to initialization. The resulting solution is a local optimum in the neighborhood of the initial guess. This sensitivity to initialization presents a significant challenge in clustering large data sets into many clusters. In this paper, we present a different approach to approximate mixture fitting for clustering. We introduce an exemplar-based likelihood function that approximates the exact likelihood. This formulation leads to a convex minimization problem and an efficient algorithm with guaranteed convergence to the globally optimal solution. The resulting clustering can be thought of as a probabilistic mapping of the data points to the set of exemplars that minimizes the average distance and the information-theoretic cost of mapping. We present experimental results illustrating the performance of our algorithm and its comparison with the conventional approach to mixture model clustering. 1
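A minimal sketch of why this formulation is convex, with details assumed rather than taken from the paper: fix one isotropic Gaussian component at every data point (the exemplars) and optimize only the mixture weights. The log-likelihood is then concave in the weights, so the EM-style fixed-point update below reaches the global optimum from any initialization; the function name and bandwidth `beta` are illustrative.

```python
import numpy as np

def exemplar_clustering(X, beta=1.0, iters=200):
    """Optimize mixture weights q over components fixed at the data points.

    f[i, j] is the (unnormalized) density of the component centered at
    exemplar j evaluated at point i; only q is learned, which makes the
    negative log-likelihood convex in q.
    """
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    f = np.exp(-beta * d2)
    q = np.full(n, 1.0 / n)               # start from uniform weights
    for _ in range(iters):
        p = f * q                          # responsibilities, unnormalized
        p /= p.sum(axis=1, keepdims=True)  # normalize per data point
        q = p.mean(axis=0)                 # EM update on the weights only
    return q
```

Exemplars that end up with non-negligible weight act as cluster centers, and the responsibilities `p` give the probabilistic mapping of points to exemplars that the abstract describes.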
6 0.99003148 20 nips-2007-Adaptive Embedded Subgraph Algorithms using Walk-Sum Analysis
7 0.93118215 190 nips-2007-Support Vector Machine Classification with Indefinite Kernels
8 0.92027462 38 nips-2007-Boosting Algorithms for Maximizing the Soft Margin
9 0.90319854 116 nips-2007-Learning the structure of manifolds using random projections
10 0.90087253 54 nips-2007-Computational Equivalence of Fixed Points and No Regret Algorithms, and Convergence to Equilibria
11 0.8993426 120 nips-2007-Linear programming analysis of loopy belief propagation for weighted matching
12 0.89900005 40 nips-2007-Bundle Methods for Machine Learning
13 0.89894545 21 nips-2007-Adaptive Online Gradient Descent
14 0.89433694 65 nips-2007-DIFFRAC: a discriminative and flexible framework for clustering
15 0.88547283 75 nips-2007-Efficient Bayesian Inference for Dynamically Changing Graphs
16 0.88463628 101 nips-2007-How SVMs can estimate quantiles and the median
17 0.88201874 204 nips-2007-Theoretical Analysis of Heuristic Search Methods for Online POMDPs
18 0.8814255 178 nips-2007-Simulated Annealing: Rigorous finite-time guarantees for optimization on continuous domains
19 0.88134795 10 nips-2007-A Randomized Algorithm for Large Scale Support Vector Learning
20 0.87780142 46 nips-2007-Cluster Stability for Finite Samples