nips nips2004 nips2004-68 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Wolf Kienzle, Matthias O. Franz, Bernhard Schölkopf, Gökhan H. Bakir
Abstract: This paper proposes a method for computing fast approximations to support vector decision functions in the field of object detection. In the present approach we are building on an existing algorithm where the set of support vectors is replaced by a smaller, so-called reduced set of synthesized input space points. In contrast to the existing method that finds the reduced set via unconstrained optimization, we impose a structural constraint on the synthetic points such that the resulting approximations can be evaluated via separable filters. For applications that require scanning large images, this decreases the computational complexity by a significant amount. Experimental results show that in face detection, rank deficient approximations are 4 to 6 times faster than unconstrained reduced set systems. 1
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract This paper proposes a method for computing fast approximations to support vector decision functions in the field of object detection. [sent-4, score-0.273]
2 In the present approach we are building on an existing algorithm where the set of support vectors is replaced by a smaller, so-called reduced set of synthesized input space points. [sent-5, score-0.18]
3 In contrast to the existing method that finds the reduced set via unconstrained optimization, we impose a structural constraint on the synthetic points such that the resulting approximations can be evaluated via separable filters. [sent-6, score-0.785]
4 For applications that require scanning large images, this decreases the computational complexity by a significant amount. [sent-7, score-0.171]
5 Experimental results show that in face detection, rank deficient approximations are 4 to 6 times faster than unconstrained reduced set systems. [sent-8, score-0.913]
6 1 Introduction It has been shown that support vector machines (SVMs) provide state-of-the-art accuracies in object detection. [sent-9, score-0.12]
7 When classifying image patches of size h × w using plain gray value features, the decision function requires an h · w dimensional dot product for each SV. [sent-16, score-0.459]
8 As an example, the evaluation of a single 20 × 20 patch on a 320 × 240 image at 25 frames per second already requires 660 million operations per second. [sent-18, score-0.419]
9 In the past, research towards speeding up kernel expansions has focused exclusively on the first issue, i. [sent-19, score-0.106]
10 In [2], Burges introduced a method that, for a given SVM, creates a set of so-called reduced set vectors (RSVs) that approximate the decision function. [sent-22, score-0.214]
11 This approach has been successfully applied in the image classification domain — speedups on the order of 10 to 30 have been reported [2, 3, 4] while the full accuracy was retained. [sent-23, score-0.224]
12 Additionally, for strongly unbalanced classification problems such as face detection, the average number of RSV evaluations can be further reduced using cascaded classifiers [5, 6, 7]. [sent-24, score-0.327]
13 While this could be remedied by switching to a sparser image representation (e. [sent-29, score-0.124]
14 a wavelet basis), one could argue that in connection with SVMs, not only are plain gray values straightforward to use, but they have been shown to outperform Haar wavelets and gradients in the face detection domain [8]. [sent-31, score-0.357]
15 In this paper, we develop a method that combines the simplicity of gray value correlations with the speed advantage of more sophisticated image representations. [sent-33, score-0.279]
16 To this end, we borrow an idea from image processing: by constraining the RSVs to have a special structure, they can be evaluated via separable convolutions. [sent-34, score-0.4]
17 linear, polynomial, Gaussian and sigmoid) and decreases the average computational complexity of the RSV evaluations from O(h · w) to O(r · (h + w)), where r is a small number that allows the user to balance between speed and accuracy. [sent-37, score-0.181]
18 To evaluate our approach, we examine the performance of these approximations on the MIT+CMU face detection database (used in [10, 8, 5, 6]). [sent-38, score-0.407]
19 2 Burges’ method for reduced set approximations The present section briefly describes Burges’ reduced set method [2] on which our work is based. [sent-39, score-0.364]
20 For reasons that will become clear below, h × w image patches are written as h × w matrices (denoted by bold capital letters) whose entries are the respective pixel intensities. [sent-40, score-0.302]
21 The decision rule for a test pattern X reads f(X) = sgn(∑_{i=1}^{m} y_i α_i k(X_i, X) + b). (1) [sent-50, score-0.139]
22 In SVMs, the decision surface induced by f corresponds to a hyperplane in the reproducing kernel Hilbert space (RKHS) associated with k. [sent-51, score-0.143]
23 The corresponding normal Ψ = ∑_{i=1}^{m} y_i α_i k(X_i, ·) (2) can be approximated using a smaller, so-called reduced set (RS) {Z_1, …}. [sent-52, score-0.113]
24 3 From separable filters to rank deficient reduced sets We now describe the concept of separable filters in image processing and show how this idea extends to a broader class of linear filters and to a special class of nonlinear filters, namely those used by SVM decision functions. [sent-64, score-1.05]
25 Using the image-matrix notation, it will become clear that the separability property boils down to a matrix rank constraint. [sent-65, score-0.366]
26 1 Linear separable filters Applying a linear filter to an image amounts to a two-dimensional convolution of the image with the impulse response of the filter. [sent-67, score-0.632]
27 J = I ∗ H. (6) If H has size h × w, the convolution requires O(h · w) operations for each output pixel. [sent-71, score-0.197]
28 However, in special cases where H can be decomposed into two column vectors a and b such that H = ab⊤ (7) holds, we can rewrite (6) as J = [I ∗ a] ∗ b⊤, (8) since the convolution is associative and, in this case, ab⊤ = a ∗ b⊤. [sent-72, score-0.11]
29 This splits the original problem (6) into two convolution operations with masks of size h×1 and 1×w, respectively. [sent-73, score-0.197]
30 As a result, if a linear filter is separable in the sense of equation (7), the computational complexity of the filtering operation can be reduced from O(h · w) to O(h + w) per pixel by computing (8) instead of (6). [sent-74, score-0.381]
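The identity in (7)/(8) is easy to verify numerically. The following sketch is a hypothetical illustration, not code from the paper — the mask sizes, the toy image, and the helper `conv2d_valid` are all assumptions:

```python
import numpy as np

def conv2d_valid(img, ker):
    """Brute-force 'valid' 2-D convolution: flip the kernel, slide, dot."""
    kh, kw = ker.shape
    kf = ker[::-1, ::-1]
    win = np.lib.stride_tricks.sliding_window_view(img, (kh, kw))
    return np.einsum('ijkl,kl->ij', win, kf)

rng = np.random.default_rng(0)
a = rng.standard_normal((5, 1))    # h x 1 column mask
b = rng.standard_normal((3, 1))    # w x 1 column mask
H = a @ b.T                        # rank-one h x w impulse response, H = a b^T
I = rng.standard_normal((32, 32))  # toy image

J_full = conv2d_valid(I, H)                     # O(h*w) per output pixel, (6)
J_sep  = conv2d_valid(conv2d_valid(I, a), b.T)  # O(h+w) per output pixel, (8)
assert np.allclose(J_full, J_sep)
```

Per output pixel, the nested version touches h + w = 8 mask entries instead of h · w = 15, which is exactly the saving stated above.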
31 2 Linear rank deficient filters In view of (7) being equivalent to rank(H) ≤ 1, we now generalize the above concept to linear filters with low rank impulse responses. [sent-76, score-0.743]
32 Consider the singular value decomposition (SVD) of the h × w matrix H, H = USV⊤, (9) and recall that U and V are orthogonal matrices of size h × h and w × w, respectively, whereas S is diagonal (the diagonal entries are the singular values) and has size h × w. [sent-77, score-0.474]
33 Due to rank(S) = rank(H), we may write H as a sum of r rank-one matrices, H = ∑_{i=1}^{r} s_i u_i v_i⊤, (10) where s_i denotes the ith singular value of H and u_i, v_i are the ith columns of U and V (i. [sent-79, score-0.596]
34 As a result, the corresponding linear filter can be evaluated (analogously to (8)) as the weighted sum of r separable convolutions, J = ∑_{i=1}^{r} s_i [I ∗ u_i] ∗ v_i⊤, (11) and the computational complexity drops from O(h · w) to O(r · (h + w)) per output pixel. [sent-82, score-0.398]
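Equations (9)–(11) can be sketched as follows, under the assumption of a random full-rank impulse response (all names and sizes here are illustrative): the SVD splits the filter into r separable passes whose weighted sum reproduces the full 2-D convolution.

```python
import numpy as np

def conv2d_valid(img, ker):
    """Brute-force 'valid' 2-D convolution: flip the kernel, slide, dot."""
    kh, kw = ker.shape
    win = np.lib.stride_tricks.sliding_window_view(img, (kh, kw))
    return np.einsum('ijkl,kl->ij', win, ker[::-1, ::-1])

rng = np.random.default_rng(1)
H = rng.standard_normal((7, 5))    # generic (full-rank) impulse response
I = rng.standard_normal((40, 40))  # toy image

U, s, Vt = np.linalg.svd(H)        # H = U S V^T, equation (9)
r = np.linalg.matrix_rank(H)       # here r = min(h, w) = 5

# Equation (11): J = sum_i s_i [I * u_i] * v_i^T, i.e. r separable convolutions
J = sum(s[i] * conv2d_valid(conv2d_valid(I, U[:, i:i+1]), Vt[i:i+1, :])
        for i in range(r))
assert np.allclose(J, conv2d_valid(I, H))
```

Truncating the loop at a smaller r gives the low-rank approximation of the filter discussed in the text.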
35 In SVMs this amounts to using a kernel of the form k(H, X) = g(c(H, X)). [sent-88, score-0.142]
36 (12) If H has rank r, we may split the kernel evaluation into r separable correlations plus a scalar nonlinearity. [sent-89, score-0.651]
37 As a result, if the RSVs in a kernel expansion such as (5) satisfy this constraint, the average computational complexity decreases from O(m · h · w) to O(m · r · (h + w)) per output pixel. [sent-90, score-0.221]
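Plugging in the experiments' patch size (19 × 19) with illustrative values m′ = 6 and r = 3 gives a feel for the claimed saving; these counts are a back-of-the-envelope sketch, not measurements from the paper:

```python
# Hypothetical operation counts for the complexity statement above
# (patch size from the experiments; m' and r chosen for illustration).
h, w = 19, 19   # patch height and width
r, m = 3, 6     # rank constraint and number of RSVs (m')
full_ops = m * h * w          # O(m' * h * w): full correlations
sep_ops  = m * r * (h + w)    # O(m' * r * (h + w)): separable correlations
print(full_ops, sep_ops)      # prints: 2166 684
```

The separable evaluation needs less than a third of the operations, consistent with the speedups reported in the experiments below.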
38 While linear, polynomial and sigmoid kernels are defined as functions of input space dot products and therefore immediately satisfy equation (12), the idea applies to kernels based on the Euclidean distance as well. [sent-92, score-0.135]
39 For instance, the Gaussian kernel reads k(H, X) = exp(−γ(c(X, X) − 2c(H, X) + c(H, H))). [sent-93, score-0.104]
40 (13) Here, the middle term is the correlation which we are going to evaluate via separable filters. [sent-94, score-0.224]
41 The last term is merely a constant scalar independent of the image data. [sent-96, score-0.155]
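A small sketch of (13) — patch size, rank, and γ are arbitrary assumptions, and `c` is the Frobenius inner product written out explicitly — shows that the kernel is a scalar nonlinearity of the correlation, with only the middle term c(Z, X) depending on the test patch:

```python
import numpy as np

def c(A, B):
    """Frobenius inner product (correlation) between two image patches."""
    return np.sum(A * B)

rng = np.random.default_rng(2)
h, w, r = 19, 19, 2
# hypothetical rank-r RSV, built as a sum of r rank-one terms
Z = sum(np.outer(rng.standard_normal(h), rng.standard_normal(w))
        for _ in range(r))
X = rng.standard_normal((h, w))   # test patch
gamma = 1e-3

# Gaussian kernel as in (13): only c(Z, X) involves the image data;
# c(X, X) is cheap and c(Z, Z) is a precomputable constant.
k = np.exp(-gamma * (c(X, X) - 2.0 * c(Z, X) + c(Z, Z)))

# identical to the usual ||Z - X||_F^2 formulation
assert np.isclose(k, np.exp(-gamma * np.sum((Z - X) ** 2)))
```

With a rank-r Z, the data-dependent term c(Z, X) over all patches of an image is exactly the separable correlation of the previous section.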
42 [residue of equations (14) and (15), where ‖·‖_F denotes the Frobenius norm] 4 Finding rank deficient reduced sets In our approach we consider a special class of the approximations given by (3), namely those where the RSVs can be evaluated efficiently via separable correlations. [sent-99, score-0.858]
43 In particular, we restrict the RSV search space to the manifold spanned by all image patches that — viewed as matrices — have a fixed, small rank r (which is to be chosen a priori by the user). [sent-101, score-0.633]
44 Each RSV is written as Z_i = U_i S_i V_i⊤. (16) The rank constraint can then be imposed by allowing only the first r diagonal elements of S_i to be non-zero. [sent-103, score-0.365]
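For intuition only, the rank constraint itself can be sketched as the Eckart–Young projection onto rank-r matrices, i.e. truncating the SVD to the r leading singular values; note this is not the paper's solver, which optimizes the decomposed factors directly rather than projecting:

```python
import numpy as np

def project_rank_r(Z, r):
    """Nearest rank-r matrix in Frobenius norm (Eckart-Young):
    keep only the r leading singular values."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    s[r:] = 0.0                 # zero out all but the first r diagonal entries
    return (U * s) @ Vt         # rebuild Z with the truncated spectrum

rng = np.random.default_rng(3)
Z = rng.standard_normal((19, 19))   # hypothetical 19 x 19 patch
Zr = project_rank_r(Z, 1)
assert np.linalg.matrix_rank(Zr) == 1
```

Such a rank-1 patch is exactly the kind of RSV that admits evaluation via two 1-D correlations.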
45 The minimization problem is solved via …² (footnote 2: In this paper we call a non-square matrix orthogonal if its columns are pairwise orthogonal and have unit length.) [sent-106, score-0.2]
46 the components of the decomposed RSV image patches, i. [sent-111, score-0.166]
47 Intuitively, this amounts to rotating (rather than translating) the columns of Ui,r and Vi,r , which ensures that the resulting matrices are still orthogonal, i. [sent-117, score-0.175]
48 The first part illustrates the behavior of rank deficient approximations for a face detection SVM in terms of the convergence rate and classification accuracy for different values of r. [sent-127, score-0.827]
49 In the second part, we show how an actual face detection system, similar to that presented in [5], can be sped up using rank deficient RSVs. [sent-128, score-0.6]
50 It consisted of 19 × 19 gray level image patches containing 16081 manually collected faces (3194 of them kindly provided by Sami Romdhani) and 42972 non-faces automatically collected from a set of 206 background scenes. [sent-130, score-0.342]
51 The set was split into a training set (13331 faces and 35827 non-faces) and a validation set (2687 faces and 7145 non-faces). [sent-132, score-0.166]
52 The resulting decision function (1) achieved a hit rate of 97. [sent-135, score-0.108]
53 0% false positives on the validation set using m = 6910 SVs. [sent-137, score-0.233]
54 1 Rank deficient faces In order to see how m and r affect the accuracy of our approximations, we compute rank deficient reduced sets for m = 1 . [sent-140, score-0.549]
55 3 (the left array in Figure 1 illustrates the actual appearance of rank deficient RSVs for the m = 6 case). [sent-146, score-0.411]
56 Accuracy of the resulting decision functions is measured by the ROC score (the area under the ROC curve) on the validation set. [sent-147, score-0.196]
57 The results for our approximations are depicted in Figure 2. [sent-150, score-0.138]
58 As expected, we need a larger number of rank deficient RSVs than unconstrained RSVs to obtain similar classification accuracies, especially for small r. [sent-151, score-0.521]
59 First, a rank as [Figure 1: panels for m′ = 1 with r = full and r = 3, each shown as a sum of rank-one terms]
60 The left array shows the RSVs (Zi ) of the unconstrained (top row) and constrained (r decreases from 3 to 1 down the remaining rows) approximations for m = 6. [sent-157, score-0.413]
61 This supports the fact that the classification accuracy for r = 3 is similar to that of the unconstrained approximations (cf. [sent-159, score-0.378]
62 The right array shows the m = 1 RSVs (r = full, 3, 2, 1, top to bottom row) and their decomposition into rank one matrices according to (10). [sent-161, score-0.444]
63 For the unconstrained RSV (first row) it shows an approximate (truncated) expansion based on the three leading singular vectors. [sent-162, score-0.305]
64 This illustrates that the approach is clearly different from simply finding unconstrained RSVs and then imposing the rank constraint via SVD (in fact, the norm (4) is smaller for the r = 1 RSV than for the leading singular vector of the r = full RSV). [sent-164, score-0.729]
65 low as three seems already sufficient for our face detection SVM in the sense that for equal sizes m there is no significant loss in accuracy compared to the unconstrained approximation (at least for m > 2). [sent-165, score-0.565]
66 The associated speed benefit over unconstrained RSVs is shown in the right plot of Figure 2: the rank three approximations achieve accuracies similar to the unconstrained functions, while the number of operations reduces to less than a third. [sent-166, score-1.101]
67 2 A cascade-based face detection system In this experiment we built a cascade-based face detection system similar to [5, 6], i. [sent-170, score-0.612]
68 a cascade of RSV approximations of increasing size m . [sent-172, score-0.219]
69 As the benefit of a cascaded classifier heavily depends on the speed of the first classifier which has to be evaluated on the whole image [5, 6], our system uses a rank deficient approximation as the first stage. [sent-173, score-0.681]
70 9 using 114 multiply-adds, whereas the simplest possible unconstrained approximation m = 1, r = full needs 361 multiply-adds to achieve a ROC score of only 0. [sent-176, score-0.364]
71 In particular, if the threshold of the first stage is set to yield a hit rate of 95% on the validation set, scanning the MIT+CMU set (130 images, 507 faces) with m = 3, r = 1 discards 91. [sent-179, score-0.178]
72 5% of the false positives, whereas the m = 1, r = full can only reject 70. [sent-180, score-0.155]
73 At the same time, when scanning a 320 × 240 image 3 , the three separable convolutions plus nonlinearity require 55ms, whereas the single, full kernel evaluation takes 208ms on a Pentium 4 with 2. [sent-182, score-0.62]
74 Moreover, for the unconstrained 3 For multi-scale processing the detectors are evaluated on an image pyramid with 12 different scales using a scale decay of 0. [sent-184, score-0.366]
75 This amounts to scanning 140158 patches for a 320 × 240 image. [sent-186, score-0.267]
76 [Figure 2 x-axis: #operations (m′ · r · (h + w)), 10² to 10⁴] Figure 2: Effect of the rank parameter r on classification accuracies. [sent-195, score-0.331]
77 The left plot shows the ROC score of the rank deficient RSV approximations (cf. [sent-196, score-0.535]
78 Additionally, the solid line shows the accuracy of the RSVs without rank constraint (cf. [sent-204, score-0.381]
79 The right plot shows the same four curves, but plotted against the number of operations needed for the evaluation of the corresponding decision function when scanning large images (i. [sent-206, score-0.318]
80 Figure 3: A sample output from our demonstration system (running at 14 frames per second). [sent-209, score-0.112]
81 In this implementation, we reduced the number of false positives by adjusting the threshold of the final classifier. [sent-210, score-0.29]
82 cascade to catch up in terms of accuracy, the (at least) m = 2, r = full classifier (also with an ROC score of roughly 0. [sent-213, score-0.164]
83 The subsequent stages of our system consist of unconstrained RSV approximations of size m = 4, 8, 16, 32, respectively. [sent-216, score-0.398]
84 These sizes were chosen such that the number of false positives roughly halves after each stage, while the number of correct detections remains close to 95% on the validation set (with the decision thresholds adjusted accordingly). [sent-217, score-0.419]
85 To eliminate redundant detections, we combine overlapping detections via averaging of position and size if they are closer than 0. [sent-218, score-0.151]
86 The application now classifies a 320x240 image within 54ms (vs. [sent-225, score-0.124]
87 Note that this will not significantly affect the speed of our system (currently 14 frames per second) since 0. [sent-229, score-0.179]
88 034% false positives amounts to merely 47 patches to be processed by subsequent classifiers. [sent-230, score-0.387]
89 6 Discussion We have presented a new reduced set method for SVMs in image processing, which creates sparse kernel expansions that can be evaluated via separable filters. [sent-231, score-0.646]
90 To this end, the user-defined rank (the number of separable filters into which the RSVs are decomposed) provides a mechanism to control the tradeoff between accuracy and speed of the resulting approximation. [sent-232, score-0.638]
91 Our experiments show that for face detection, the use of rank deficient RSVs leads to a significant speedup without losing accuracy. [sent-233, score-0.503]
92 Especially when rough approximations are required, our method gives superior results compared to the existing reduced set methods since it allows for a finer granularity which is vital in cascade-based detection systems. [sent-234, score-0.413]
93 At run-time, rank deficient RSVs can be used together with unconstrained RSVs or SVs using the same canonical image representation. [sent-236, score-0.645]
94 In addition, our approach allows the use of off-the-shelf image processing libraries for separable convolutions. [sent-238, score-0.314]
95 Since such operations are essential in image processing, there exist many (often highly optimized) implementations. [sent-239, score-0.22]
96 to go directly from the training data to a sparse, separable function as opposed to taking the SVM ’detour’. [sent-242, score-0.19]
97 Improving the accuracy and speed of support vector machines. [sent-267, score-0.15]
98 Training support vector machines: an application to face detection. [sent-279, score-0.174]
99 Rapid object detection using a boosted cascade of simple features. [sent-291, score-0.204]
100 [12] RDRSLIB – a matlab library for rank deficient reduced sets in object detection, http://www. [sent-324, score-0.472]
wordName wordTfidf (topN-words)
[('rsvs', 0.554), ('rank', 0.331), ('rsv', 0.264), ('unconstrained', 0.19), ('separable', 0.19), ('face', 0.141), ('approximations', 0.138), ('detection', 0.128), ('image', 0.124), ('reduced', 0.113), ('patches', 0.106), ('burges', 0.105), ('lters', 0.102), ('positives', 0.102), ('roc', 0.101), ('svs', 0.098), ('operations', 0.096), ('scanning', 0.088), ('singular', 0.085), ('detections', 0.084), ('false', 0.075), ('decision', 0.074), ('amounts', 0.073), ('matrices', 0.072), ('svm', 0.072), ('cmu', 0.07), ('kernel', 0.069), ('convolution', 0.068), ('orthogonal', 0.068), ('speed', 0.067), ('score', 0.066), ('zi', 0.063), ('lter', 0.059), ('accuracies', 0.059), ('gray', 0.057), ('validation', 0.056), ('rkhs', 0.056), ('faces', 0.055), ('euclidian', 0.055), ('patch', 0.055), ('svms', 0.054), ('kienzle', 0.053), ('stiefel', 0.053), ('impulse', 0.053), ('evaluated', 0.052), ('accuracy', 0.05), ('full', 0.05), ('cascade', 0.048), ('classi', 0.046), ('decreases', 0.044), ('decomposed', 0.042), ('svd', 0.042), ('cascaded', 0.042), ('memo', 0.042), ('osuna', 0.042), ('array', 0.041), ('complexity', 0.039), ('per', 0.039), ('convolutions', 0.039), ('romdhani', 0.039), ('illustrates', 0.039), ('vi', 0.039), ('cient', 0.039), ('bene', 0.038), ('expansions', 0.037), ('system', 0.037), ('frames', 0.036), ('boils', 0.035), ('reads', 0.035), ('mask', 0.035), ('kernels', 0.035), ('via', 0.034), ('existing', 0.034), ('diagonal', 0.034), ('dot', 0.034), ('hit', 0.034), ('size', 0.033), ('support', 0.033), ('gradient', 0.031), ('correlations', 0.031), ('rs', 0.031), ('evaluations', 0.031), ('plain', 0.031), ('merely', 0.031), ('truncated', 0.031), ('speedup', 0.031), ('analogously', 0.031), ('sigmoid', 0.031), ('whereas', 0.03), ('plot', 0.03), ('columns', 0.03), ('sgn', 0.03), ('evaluation', 0.03), ('expansion', 0.03), ('row', 0.029), ('concept', 0.028), ('object', 0.028), ('sizes', 0.028), ('approximation', 0.028), ('creates', 0.027)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999988 68 nips-2004-Face Detection --- Efficient and Rank Deficient
Author: Wolf Kienzle, Matthias O. Franz, Bernhard Schölkopf, Gökhan H. Bakir
Abstract: This paper proposes a method for computing fast approximations to support vector decision functions in the field of object detection. In the present approach we are building on an existing algorithm where the set of support vectors is replaced by a smaller, so-called reduced set of synthesized input space points. In contrast to the existing method that finds the reduced set via unconstrained optimization, we impose a structural constraint on the synthetic points such that the resulting approximations can be evaluated via separable filters. For applications that require scanning large images, this decreases the computational complexity by a significant amount. Experimental results show that in face detection, rank deficient approximations are 4 to 6 times faster than unconstrained reduced set systems. 1
2 0.15749723 34 nips-2004-Breaking SVM Complexity with Cross-Training
Author: Léon Bottou, Jason Weston, Gökhan H. Bakir
Abstract: We propose to selectively remove examples from the training set using probabilistic estimates related to editing algorithms (Devijver and Kittler, 1982). This heuristic procedure aims at creating a separable distribution of training examples with minimal impact on the position of the decision boundary. It breaks the linear dependency between the number of SVs and the number of training examples, and sharply reduces the complexity of SVMs during both the training and prediction stages. 1
3 0.13944407 192 nips-2004-The power of feature clustering: An application to object detection
Author: Shai Avidan, Moshe Butman
Abstract: We give a fast rejection scheme that is based on image segments and demonstrate it on the canonical example of face detection. However, instead of focusing on the detection step we focus on the rejection step and show that our method is simple and fast to be learned, thus making it an excellent pre-processing step to accelerate standard machine learning classifiers, such as neural-networks, Bayes classifiers or SVM. We decompose a collection of face images into regions of pixels with similar behavior over the image set. The relationships between the mean and variance of image segments are used to form a cascade of rejectors that can reject over 99.8% of image patches, thus only a small fraction of the image patches must be passed to a full-scale classifier. Moreover, the training time for our method is much less than an hour, on a standard PC. The shape of the features (i.e. image segments) we use is data-driven, they are very cheap to compute and they form a very low dimensional feature space in which exhaustive search for the best features is tractable. 1
4 0.12937868 182 nips-2004-Synergistic Face Detection and Pose Estimation with Energy-Based Models
Author: Margarita Osadchy, Matthew L. Miller, Yann L. Cun
Abstract: We describe a novel method for real-time, simultaneous multi-view face detection and facial pose estimation. The method employs a convolutional network to map face images to points on a manifold, parametrized by pose, and non-face images to points far from that manifold. This network is trained by optimizing a loss function of three variables: image, pose, and face/non-face label. We test the resulting system, in a single configuration, on three standard data sets – one for frontal pose, one for rotated faces, and one for profiles – and find that its performance on each set is comparable to previous multi-view face detectors that can only handle one form of pose variation. We also show experimentally that the system’s accuracy on both face detection and pose estimation is improved by training for the two tasks together.
5 0.11024058 99 nips-2004-Learning Hyper-Features for Visual Identification
Author: Andras D. Ferencz, Erik G. Learned-miller, Jitendra Malik
Abstract: We address the problem of identifying specific instances of a class (cars) from a set of images all belonging to that class. Although we cannot build a model for any particular instance (as we may be provided with only one “training” example of it), we can use information extracted from observing other members of the class. We pose this task as a learning problem, in which the learner is given image pairs, labeled as matching or not, and must discover which image features are most consistent for matching instances and discriminative for mismatches. We explore a patch based representation, where we model the distributions of similarity measurements defined on the patches. Finally, we describe an algorithm that selects the most salient patches based on a mutual information criterion. This algorithm performs identification well for our challenging dataset of car images, after matching only a few, well chosen patches. 1
6 0.10610229 144 nips-2004-Parallel Support Vector Machines: The Cascade SVM
7 0.099443324 92 nips-2004-Kernel Methods for Implicit Surface Modeling
8 0.096715041 187 nips-2004-The Entire Regularization Path for the Support Vector Machine
9 0.091735423 133 nips-2004-Nonparametric Transforms of Graph Kernels for Semi-Supervised Learning
10 0.086742915 168 nips-2004-Semigroup Kernels on Finite Sets
11 0.081233904 59 nips-2004-Efficient Kernel Discriminant Analysis via QR Decomposition
12 0.078073606 160 nips-2004-Seeing through water
13 0.077344164 113 nips-2004-Maximum-Margin Matrix Factorization
14 0.076893084 191 nips-2004-The Variational Ising Classifier (VIC) Algorithm for Coherently Contaminated Data
15 0.07407102 62 nips-2004-Euclidean Embedding of Co-Occurrence Data
16 0.07341186 60 nips-2004-Efficient Kernel Machines Using the Improved Fast Gauss Transform
17 0.07292974 18 nips-2004-Algebraic Set Kernels with Application to Inference Over Local Image Representations
18 0.071208499 79 nips-2004-Hierarchical Eigensolver for Transition Matrices in Spectral Methods
19 0.071079835 127 nips-2004-Neighbourhood Components Analysis
20 0.069586203 142 nips-2004-Outlier Detection with One-class Kernel Fisher Discriminants
topicId topicWeight
[(0, -0.241), (1, 0.089), (2, -0.091), (3, -0.035), (4, 0.044), (5, 0.037), (6, 0.056), (7, -0.216), (8, 0.056), (9, -0.035), (10, -0.071), (11, -0.059), (12, -0.022), (13, 0.021), (14, -0.065), (15, -0.061), (16, -0.053), (17, 0.113), (18, 0.061), (19, -0.014), (20, 0.037), (21, -0.037), (22, -0.121), (23, -0.059), (24, 0.072), (25, 0.051), (26, -0.012), (27, 0.015), (28, 0.025), (29, -0.039), (30, 0.16), (31, -0.047), (32, 0.081), (33, -0.075), (34, 0.098), (35, -0.059), (36, -0.109), (37, 0.119), (38, -0.023), (39, 0.009), (40, 0.018), (41, 0.052), (42, -0.066), (43, 0.038), (44, -0.016), (45, 0.052), (46, 0.023), (47, -0.025), (48, -0.113), (49, 0.081)]
simIndex simValue paperId paperTitle
same-paper 1 0.95526397 68 nips-2004-Face Detection --- Efficient and Rank Deficient
Author: Wolf Kienzle, Matthias O. Franz, Bernhard Schölkopf, Gökhan H. Bakir
Abstract: This paper proposes a method for computing fast approximations to support vector decision functions in the field of object detection. In the present approach we are building on an existing algorithm where the set of support vectors is replaced by a smaller, so-called reduced set of synthesized input space points. In contrast to the existing method that finds the reduced set via unconstrained optimization, we impose a structural constraint on the synthetic points such that the resulting approximations can be evaluated via separable filters. For applications that require scanning large images, this decreases the computational complexity by a significant amount. Experimental results show that in face detection, rank deficient approximations are 4 to 6 times faster than unconstrained reduced set systems. 1
2 0.69384879 192 nips-2004-The power of feature clustering: An application to object detection
Author: Shai Avidan, Moshe Butman
Abstract: We give a fast rejection scheme that is based on image segments and demonstrate it on the canonical example of face detection. However, instead of focusing on the detection step we focus on the rejection step and show that our method is simple and fast to be learned, thus making it an excellent pre-processing step to accelerate standard machine learning classifiers, such as neural-networks, Bayes classifiers or SVM. We decompose a collection of face images into regions of pixels with similar behavior over the image set. The relationships between the mean and variance of image segments are used to form a cascade of rejectors that can reject over 99.8% of image patches, thus only a small fraction of the image patches must be passed to a full-scale classifier. Moreover, the training time for our method is much less than an hour, on a standard PC. The shape of the features (i.e. image segments) we use is data-driven, they are very cheap to compute and they form a very low dimensional feature space in which exhaustive search for the best features is tractable. 1
3 0.67676032 182 nips-2004-Synergistic Face Detection and Pose Estimation with Energy-Based Models
Author: Margarita Osadchy, Matthew L. Miller, Yann L. Cun
Abstract: We describe a novel method for real-time, simultaneous multi-view face detection and facial pose estimation. The method employs a convolutional network to map face images to points on a manifold, parametrized by pose, and non-face images to points far from that manifold. This network is trained by optimizing a loss function of three variables: image, pose, and face/non-face label. We test the resulting system, in a single configuration, on three standard data sets – one for frontal pose, one for rotated faces, and one for profiles – and find that its performance on each set is comparable to previous multi-view face detectors that can only handle one form of pose variation. We also show experimentally that the system’s accuracy on both face detection and pose estimation is improved by training for the two tasks together.
4 0.565615 34 nips-2004-Breaking SVM Complexity with Cross-Training
Author: Léon Bottou, Jason Weston, Gökhan H. Bakir
Abstract: We propose to selectively remove examples from the training set using probabilistic estimates related to editing algorithms (Devijver and Kittler, 1982). This heuristic procedure aims at creating a separable distribution of training examples with minimal impact on the position of the decision boundary. It breaks the linear dependency between the number of SVs and the number of training examples, and sharply reduces the complexity of SVMs during both the training and prediction stages. 1
5 0.55695415 144 nips-2004-Parallel Support Vector Machines: The Cascade SVM
Author: Hans P. Graf, Eric Cosatto, Léon Bottou, Igor Dourdanovic, Vladimir Vapnik
Abstract: We describe an algorithm for support vector machines (SVM) that can be parallelized efficiently and scales to very large problems with hundreds of thousands of training vectors. Instead of analyzing the whole training set in one optimization step, the data are split into subsets and optimized separately with multiple SVMs. The partial results are combined and filtered again in a ‘Cascade’ of SVMs, until the global optimum is reached. The Cascade SVM can be spread over multiple processors with minimal communication overhead and requires far less memory, since the kernel matrices are much smaller than for a regular SVM. Convergence to the global optimum is guaranteed with multiple passes through the Cascade, but already a single pass provides good generalization. A single pass is 5x – 10x faster than a regular SVM for problems of 100,000 vectors when implemented on a single processor. Parallel implementations on a cluster of 16 processors were tested with over 1 million vectors (2-class problems), converging in a day or two, while a regular SVM never converged in over a week. 1
6 0.52370077 191 nips-2004-The Variational Ising Classifier (VIC) Algorithm for Coherently Contaminated Data
7 0.51847583 99 nips-2004-Learning Hyper-Features for Visual Identification
8 0.4909234 14 nips-2004-A Topographic Support Vector Machine: Classification Using Local Label Configurations
9 0.48845565 25 nips-2004-Assignment of Multiplicative Mixtures in Natural Images
10 0.48374602 160 nips-2004-Seeing through water
11 0.4682695 106 nips-2004-Machine Learning Applied to Perception: Decision Images for Gender Classification
12 0.45971942 199 nips-2004-Using Machine Learning to Break Visual Human Interaction Proofs (HIPs)
13 0.45745885 60 nips-2004-Efficient Kernel Machines Using the Improved Fast Gauss Transform
14 0.45466486 18 nips-2004-Algebraic Set Kernels with Application to Inference Over Local Image Representations
15 0.43987259 168 nips-2004-Semigroup Kernels on Finite Sets
16 0.43564737 59 nips-2004-Efficient Kernel Discriminant Analysis via QR Decomposition
17 0.41944379 81 nips-2004-Implicit Wiener Series for Higher-Order Image Analysis
18 0.4164874 94 nips-2004-Kernels for Multi-task Learning
19 0.38637048 92 nips-2004-Kernel Methods for Implicit Surface Modeling
20 0.37720993 49 nips-2004-Density Level Detection is Classification
topicId topicWeight
[(13, 0.077), (15, 0.175), (17, 0.221), (26, 0.127), (31, 0.025), (33, 0.17), (35, 0.013), (39, 0.069), (50, 0.025), (87, 0.01)]
simIndex simValue paperId paperTitle
1 0.89502561 130 nips-2004-Newscast EM
Author: Wojtek Kowalczyk, Nikos A. Vlassis
Abstract: We propose a gossip-based distributed algorithm for Gaussian mixture learning, Newscast EM. The algorithm operates on network topologies where each node observes a local quantity and can communicate with other nodes in an arbitrary point-to-point fashion. The main difference between Newscast EM and the standard EM algorithm is that the M-step in our case is implemented in a decentralized manner: (random) pairs of nodes repeatedly exchange their local parameter estimates and combine them by (weighted) averaging. We provide theoretical evidence and demonstrate experimentally that, under this protocol, nodes converge exponentially fast to the correct estimates in each M-step of the EM algorithm.
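The decentralized M-step can be simulated directly: random pairs of nodes repeatedly replace their local estimates with the pair's mean, and every estimate contracts toward the network-wide average with no central coordinator. A toy simulation (function and variable names are illustrative):

```python
import random

def gossip_average(values, rounds=50, seed=0):
    """Newscast-style gossip averaging: at each round a random pair of
    nodes replaces both local estimates by the pair's mean. The global
    sum is preserved, so all estimates converge to the network average."""
    rng = random.Random(seed)
    v = list(values)
    for _ in range(rounds):
        i, j = rng.sample(range(len(v)), 2)
        v[i] = v[j] = (v[i] + v[j]) / 2.0
    return v

local = [1.0, 5.0, 9.0, 13.0]          # one local statistic per node
estimates = gossip_average(local, rounds=200)
mean = sum(local) / len(local)          # = 7.0, the centralized answer
print(max(abs(e - mean) for e in estimates))  # spread shrinks toward 0
```

Because pairwise averaging preserves the sum, the fixed point is exactly the centralized mean; the abstract's exponential-rate claim corresponds to the spread shrinking by a constant factor per round in expectation.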
2 0.89423537 43 nips-2004-Conditional Models of Identity Uncertainty with Application to Noun Coreference
Author: Andrew McCallum, Ben Wellner
Abstract: Coreference analysis, also known as record linkage or identity uncertainty, is a difficult and important problem in natural language processing, databases, citation matching and many other tasks. This paper introduces several discriminative, conditional-probability models for coreference analysis, all examples of undirected graphical models. Unlike many historical approaches to coreference, the models presented here are relational—they do not assume that pairwise coreference decisions should be made independently from each other. Unlike other relational models of coreference that are generative, the conditional model here can incorporate a great variety of features of the input without having to be concerned about their dependencies—paralleling the advantages of conditional random fields over hidden Markov models. We present positive results on noun phrase coreference in two standard text data sets.
3 0.88890469 191 nips-2004-The Variational Ising Classifier (VIC) Algorithm for Coherently Contaminated Data
Author: Oliver Williams, Andrew Blake, Roberto Cipolla
Abstract: There has been substantial progress in the past decade in the development of object classifiers for images, for example of faces, humans and vehicles. Here we address the problem of contaminations (e.g. occlusion, shadows) in test images which have not explicitly been encountered in training data. The Variational Ising Classifier (VIC) algorithm models contamination as a mask (a field of binary variables) with a strong spatial coherence prior. Variational inference is used to marginalize over contamination and obtain robust classification. In this way the VIC approach can turn a kernel classifier for clean data into one that can tolerate contamination, without any specific training on contaminated positives.
same-paper 4 0.87524372 68 nips-2004-Face Detection --- Efficient and Rank Deficient
Author: Wolf Kienzle, Matthias O. Franz, Bernhard Schölkopf, Gökhan H. Bakir
Abstract: This paper proposes a method for computing fast approximations to support vector decision functions in the field of object detection. In the present approach we are building on an existing algorithm where the set of support vectors is replaced by a smaller, so-called reduced set of synthesized input space points. In contrast to the existing method that finds the reduced set via unconstrained optimization, we impose a structural constraint on the synthetic points such that the resulting approximations can be evaluated via separable filters. For applications that require scanning large images, this decreases the computational complexity by a significant amount. Experimental results show that in face detection, rank deficient approximations are 4 to 6 times faster than unconstrained reduced set systems.
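Why separability pays off can be shown with a rank-one filter in plain NumPy: correlating with f = outer(col, row) as two 1-D passes gives the same output as the full 2-D pass, at O(h + w) rather than O(hw) multiply-adds per output pixel. This is a generic illustration of separable filtering, not the paper's reduced-set construction.

```python
import numpy as np

def conv2d_valid(img, f):
    """Direct 2-D 'valid' correlation: O(h*w) multiply-adds per output pixel."""
    H, W = img.shape
    h, w = f.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + h, j:j + w] * f)
    return out

def conv2d_separable(img, col, row):
    """Same filter applied as two 1-D passes when f = outer(col, row):
    O(h + w) multiply-adds per output pixel instead of O(h*w)."""
    tmp = conv2d_valid(img, col[:, None])   # vertical pass (column filter)
    return conv2d_valid(tmp, row[None, :])  # horizontal pass (row filter)

rng = np.random.default_rng(0)
img = rng.standard_normal((16, 16))
col, row = rng.standard_normal(5), rng.standard_normal(5)
f = np.outer(col, row)                 # a rank-one (separable) filter
full = conv2d_valid(img, f)
fast = conv2d_separable(img, col, row)
print(np.max(np.abs(full - fast)))     # agrees up to floating-point error
```

A rank-deficient filter of rank r generalizes this to a sum of r such separable passes, which is the source of the 4x–6x speed-ups the abstract reports for image scanning.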
5 0.75637418 69 nips-2004-Fast Rates to Bayes for Kernel Machines
Author: Ingo Steinwart, Clint Scovel
Abstract: We establish learning rates to the Bayes risk for support vector machines (SVMs) with hinge loss. In particular, for SVMs with Gaussian RBF kernels we propose a geometric condition for distributions which can be used to determine approximation properties of these kernels. Finally, we compare our methods with a recent paper by G. Blanchard et al.
6 0.75596642 192 nips-2004-The power of feature clustering: An application to object detection
7 0.75539625 172 nips-2004-Sparse Coding of Natural Images Using an Overcomplete Set of Limited Capacity Units
8 0.75335205 4 nips-2004-A Generalized Bradley-Terry Model: From Group Competition to Individual Skill
9 0.7524184 201 nips-2004-Using the Equivalent Kernel to Understand Gaussian Process Regression
10 0.75238681 110 nips-2004-Matrix Exponential Gradient Updates for On-line Learning and Bregman Projection
11 0.75205559 187 nips-2004-The Entire Regularization Path for the Support Vector Machine
12 0.7513923 78 nips-2004-Hierarchical Distributed Representations for Statistical Language Modeling
13 0.7498371 79 nips-2004-Hierarchical Eigensolver for Transition Matrices in Spectral Methods
14 0.74911994 178 nips-2004-Support Vector Classification with Input Data Uncertainty
15 0.74762166 161 nips-2004-Self-Tuning Spectral Clustering
16 0.74670237 182 nips-2004-Synergistic Face Detection and Pose Estimation with Energy-Based Models
17 0.74659908 206 nips-2004-Worst-Case Analysis of Selective Sampling for Linear-Threshold Algorithms
18 0.74626118 14 nips-2004-A Topographic Support Vector Machine: Classification Using Local Label Configurations
19 0.74590778 81 nips-2004-Implicit Wiener Series for Higher-Order Image Analysis
20 0.74546444 189 nips-2004-The Power of Selective Memory: Self-Bounded Learning of Prediction Suffix Trees