nips nips2004 nips2004-92 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Joachim Giesen, Simon Spalinger, Bernhard Schölkopf
Abstract: We describe methods for computing an implicit model of a hypersurface that is given only by a finite sampling. The methods work by mapping the sample points into a reproducing kernel Hilbert space and then determining regions in terms of hyperplanes. 1
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We describe methods for computing an implicit model of a hypersurface that is given only by a finite sampling. [sent-8, score-0.32]
2 The methods work by mapping the sample points into a reproducing kernel Hilbert space and then determining regions in terms of hyperplanes. [sent-9, score-0.4]
3 We are given sample points x1 , . . . , xm ∈ X , where the domain X is some hypersurface in Euclidean space Rd . [sent-13, score-0.219]
4 Such samplings are produced, e.g., by laser range scanners, which allow the acquisition of point data from the boundary surfaces of solids. [sent-16, score-0.183]
5 Today the most popular approach is to add connectivity information to the data by transforming them into a triangle mesh (see [4] for an example of such a transformation algorithm). [sent-18, score-0.182]
6 But recently implicit models, where the surface is modeled as the zero set of a sufficiently smooth function, have also gained popularity [1]. [sent-19, score-0.491]
7 One advantage of implicit models is that they easily allow the derivation of higher order differential quantities such as curvatures. [sent-21, score-0.206]
8 Another advantage is that a point-membership query, i.e., testing whether a query point lies on the bounded or unbounded side of the surface, boils down to determining the sign of a function evaluation at the query point. [sent-24, score-0.086]
9 The goal of this paper is, loosely speaking, to find a function which takes the value zero on a surface which (1) contains the training data and (2) is a “reasonable” implicit model of X . [sent-26, score-0.526]
10 In line with a sizeable amount of recent work on kernel methods [11], we assume that this structure is given by a (positive definite) kernel k. [sent-28, score-0.179]
11 Figure 1: In the 2-D toy example depicted, the hyperplane ⟨w, Φ(x)⟩ = ρ separates all but one of the points from the origin. [sent-31, score-0.293]
12 The outlier Φ(x) is associated with a slack variable ξ, which is penalized in the objective function (4). [sent-32, score-0.361]
13 The distance from the outlier to the hyperplane is ξ/∥w∥; the distance between hyperplane and origin is ρ/∥w∥. [sent-33, score-0.44]
14 The space H is the reproducing kernel Hilbert space (RKHS) associated with k, and Φ is called its feature map. [sent-37, score-0.245]
15 (2) Throughout, the Gaussian kernel k(x, x′ ) = exp(−∥x − x′ ∥2 /(2σ 2 )) serves as the main example. The advantage of using a positive definite kernel as a similarity measure is that it allows us to construct geometric algorithms in Hilbert spaces. [sent-39, score-0.212]
16 2 Single-Class SVMs Single-class SVMs were introduced [8, 10] to estimate quantiles C ≈ {x ∈ X |f (x) ∈ [ρ, ∞[} of an unknown distribution P on X using kernel expansions. [sent-40, score-0.144]
17 The single-class SVM approximately computes the smallest set C ∈ C containing a specified fraction of all training examples, where smallness is measured in terms of the norm in the RKHS H associated with k, and C is the family of sets corresponding to half-spaces in H. [sent-48, score-0.154]
18 (5) Since non-zero slack variables ξi are penalized in the objective function, we can expect that if w and ρ solve this problem, then the decision function f (x) = sgn (⟨w, Φ(x)⟩ − ρ) will (footnote 1: here and below, bold face Greek characters denote vectors) [sent-52, score-0.373]
19 Figure 2: Models computed with a single class SVM using a Gaussian kernel (2). [sent-61, score-0.181]
20 The three examples differ in the value chosen for σ in the kernel: a larger value in the left figure, [sent-62, score-0.144]
21 and a smaller value (0.062 times the diameter of the hemisphere) in the middle and right figure. [sent-64, score-0.096]
22 In the right figure, non-zero slack variables (outliers) were also allowed. [sent-65, score-0.139]
23 Note that the outliers in the right figure correspond to a sharp feature (non-smoothness) in the original surface. [sent-66, score-0.238]
24 equal 1 for most examples xi contained in the training set (see footnote 2), while the regularization term ∥w∥ will still be small. [sent-67, score-0.167]
25 One can show that the solution takes the form f (x) = sgn ( Σi αi k(xi , x) − ρ ) (6), where the αi are computed by solving the dual problem: minimize over α ∈ Rm the quadratic 1/2 Σij αi αj k(xi , xj ) (7), subject to 0 ≤ αi ≤ 1/(νm) and Σi αi = 1 (8). [sent-70, score-0.354]
26 Note that according to (8), the training examples contribute with nonnegative weights αi ≥ 0 to the solution (6). [sent-71, score-0.08]
27 One can show that asymptotically, a fraction ν of all training examples will have strictly positive weights, and the rest will be zero (the “ν-property”). [sent-72, score-0.116]
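As a quick illustration of this construction (a sketch only, not the authors' code), the single-class problem can be handed to an off-the-shelf solver. The example below assumes scikit-learn's OneClassSVM, a synthetic hemisphere sample, and the Gaussian parametrization k(x, x′) = exp(−∥x − x′∥²/(2σ²)); all variable names are ours.

```python
# Sketch: fit a single-class SVM to 3D surface samples and inspect the
# implicit function f(x) = sum_i alpha_i k(x_i, x) - rho it induces.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Toy data: random samples from the upper unit hemisphere.
phi = rng.uniform(0.0, 2 * np.pi, 2000)
theta = rng.uniform(0.0, np.pi / 2, 2000)
points = np.column_stack([np.sin(theta) * np.cos(phi),
                          np.sin(theta) * np.sin(phi),
                          np.cos(theta)])

nu = 0.05                     # upper bound on the outlier fraction
sigma = 0.1 * 2.0             # kernel width relative to the data diameter (here 2)
model = OneClassSVM(kernel="rbf", gamma=1.0 / (2 * sigma ** 2), nu=nu)
model.fit(points)

# decision_function(x) has the form sum_i alpha_i k(x_i, x) - rho,
# so its zero set is the implicit surface model.
f_vals = model.decision_function(points)
print("support vector fraction:", len(model.support_) / len(points))
print("outlier fraction (f < 0):", float(np.mean(f_vals < 0)))
```

The printed support-vector fraction should be close to ν, in line with the ν-property described above.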
28 That is, we are interested in f −1 (0), where f is the kernel expansion (3), and the points [sent-74, score-0.373]
29 x1 , . . . , xm ∈ X are sampled from some unknown hypersurface X ⊂ Rd . [sent-77, score-0.219]
30 If we assume that the xi are sampled without noise from X – which, for example, is a reasonable assumption for data obtained with a state-of-the-art 3d laser scanning device – we should set the slack variables in (4) and (5) to zero. [sent-80, score-0.203]
31 In the dual problem this results in removing the upper constraints on the αi in (8). [sent-81, score-0.154]
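The dual, with or without the upper bounds, is a small convex QP that any generic solver can handle. The sketch below is ours, not the paper's implementation; it assumes cvxopt and a precomputed Gaussian kernel matrix K of the sample points.

```python
# Sketch: solve the single-class dual (7)-(8) with a generic QP solver.
import numpy as np
from cvxopt import matrix, solvers

def solve_single_class_dual(K, nu=None):
    """Minimize 1/2 a^T K a  s.t.  sum_i a_i = 1 and 0 <= a_i <= 1/(nu*m).
    Passing nu=None drops the upper bound, i.e. the noise-free case in which
    the slack variables are forced to zero."""
    m = K.shape[0]
    P, q = matrix(K.astype(float)), matrix(np.zeros(m))
    G_rows, h_rows = [-np.eye(m)], [np.zeros(m)]          # -a_i <= 0
    if nu is not None:
        G_rows.append(np.eye(m))                          # a_i <= 1/(nu*m)
        h_rows.append(np.full(m, 1.0 / (nu * m)))
    G, h = matrix(np.vstack(G_rows)), matrix(np.hstack(h_rows))
    A, b = matrix(np.ones((1, m))), matrix(1.0)
    solvers.options["show_progress"] = False
    alpha = np.array(solvers.qp(P, q, G, h, A, b)["x"]).ravel()
    # Recover rho from the KKT conditions: for any a_i strictly inside its box,
    # rho = sum_j a_j k(x_j, x_i).
    upper = np.inf if nu is None else 1.0 / (nu * m)
    inside = np.flatnonzero((alpha > 1e-7) & (alpha < upper - 1e-7))
    i = inside[0] if len(inside) else int(np.argmax(alpha))
    rho = float(K[i] @ alpha)
    return alpha, rho
```

The offset recovery mirrors the usual support vector practice of reading ρ off the KKT conditions; the tolerance constants are arbitrary choices of ours.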
32 Note that sample points with a non-zero slack variable cannot be contained in f −1 (0). [sent-82, score-0.396]
33 But also sample points whose image in feature space lies above the optimal hyperplane are not contained in f −1 (0) (see Figure 1) — we will address this in the next section. [sent-83, score-0.47]
34 It turns out that it is useful in practice to allow non-zero slack variables, because they prevent f −1 (0) from decomposing into many connected components (see Figure 2 for an illustration). [sent-84, score-0.139]
35 In our experience, ensuring that the images of all sample points in feature space lie close to (or on) the optimal hyperplane can be achieved by choosing σ in the Gaussian (footnote 2: we use the convention that sgn (z) equals 1 for z ≥ 0 and −1 otherwise) [sent-85, score-0.559]
36 Figure 3: Two parallel hyperplanes ⟨w, Φ(x)⟩ = ρ + δ (∗) enclosing all but two of the points. [sent-86, score-0.08]
37 The outlier Φ(x(∗) ) is associated with a slack variable ξ (∗) , which is penalized in the objective function (9). [sent-87, score-0.361]
38 kernel (2) such that the Gaussians in the kernel expansion (3) are highly localized. (The labels in Figure 3 mark the distances ξ (∗) /∥w∥ from the outliers to the hyperplanes and the distances (ρ + δ)/∥w∥ and (ρ + δ ∗ )/∥w∥ from the origin to the hyperplanes.) [sent-89, score-0.379]
39 However, highly localized Gaussians are not well suited for interpolation — the implicit surface decomposes into several components. [sent-90, score-0.485]
40 Allowing outliers mitigates the situation to a certain extent. [sent-91, score-0.134]
41 Another way to deal with the problem is to further restrict the optimal region in feature space. [sent-92, score-0.099]
42 3 Slab SVMs A richer class of solutions, where some of the weights can be negative, is obtained if we change the geometric setup. [sent-94, score-0.068]
43 In this case, we estimate a region which is a slab in the RKHS, i.e., [sent-95, score-0.283]
44 the area enclosed between two parallel hyperplanes (see Figure 3). [sent-97, score-0.08]
45 Below we summarize some relationships of this convex quadratic optimization problem to known SV methods: 1. [sent-102, score-0.075]
46 If we drop ρ from the objective function and set δ = −ε, δ ∗ = ε (for some fixed ε ≥ 0), we obtain the ε-insensitive support vector regression algorithm [11], for a data set where all output values y1 , . . . , ym are zero. [sent-107, score-0.189]
47 This shows that the ρ in our objective function plays an important role. [sent-112, score-0.073]
48 For δ = δ ∗ = 0, the term Σi (ξi + ξi∗ ) measures the distance of the points Φ(xi ) from the hyperplane ⟨w, Φ(x)⟩ − ρ = 0 (up to a scaling by ∥w∥). [sent-114, score-0.155]
49 If ν tends to zero, this term will dominate the objective function. [sent-115, score-0.073]
50 Hence, in this case, the solution will be a hyperplane that approximates the data well in the sense that the points lie close to it in the RKHS norm. [sent-116, score-0.437]
51 The dual problem can be solved using standard quadratic programming packages. [sent-122, score-0.192]
52 The offset ρ can be computed from the value of the corresponding variable in the double dual, or using the Karush-Kuhn-Tucker (KKT) conditions, just as in other support vector methods. [sent-123, score-0.142]
53 In other words, we have an implicit description of the region in input space that corresponds to the region in between the two hyperplanes in the RKHS. [sent-125, score-0.321]
54 For δ = δ ∗ , this is a single hyperplane, corresponding to a hypersurface in input space. [sent-126, score-0.161]
55 To compute this surface we use the kernel expansion (18): Σi (αi − αi∗ ) k(xi , x). [sent-127, score-0.524]
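For reference, evaluating such an expansion is a few lines of dense linear algebra. The sketch below is ours (the function names, the optional offset ρ, and the Gaussian parametrization by σ are assumptions, not notation from the paper):

```python
# Sketch: evaluate the kernel expansion (18) at arbitrary query points.
# X_sv: (n_sv, d) support vectors; coeffs: (n_sv,) values of alpha_i - alpha_i^*.
import numpy as np

def gaussian_kernel(A, B, sigma):
    """k(a, b) = exp(-||a - b||^2 / (2 sigma^2)) for all pairs of rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kernel_expansion(X_query, X_sv, coeffs, sigma, rho=0.0):
    """Return sum_i (alpha_i - alpha_i^*) k(x_i, x) at each query point,
    minus an optional offset rho (included here only for convenience)."""
    return gaussian_kernel(X_query, X_sv, sigma) @ coeffs - rho
```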
56 The difference of the SV and OL sets consists of those points that lie precisely on the boundaries of the constraints. [sent-131, score-0.239]
57 4 Note that due to (17), the dual solution is invariant with respect to the transformation δ (∗) → δ (∗) + const. [sent-133, score-0.251]
58 — such a transformation only adds a constant to the objective function, leaving the solution unaffected. [sent-134, score-0.17]
59 In our definition, SVs are those points where the constraints are active. [sent-136, score-0.138]
60 However, the difference is marginal: (i) it follows from the KKT conditions that αi (∗) > 0 implies that the corresponding constraint is active. [sent-137, score-0.076]
61 The above statements are not symmetric with respect to exchanging the quantities with asterisks and their counterparts without asterisk. [sent-143, score-0.128]
62 This is due to the sign of ρ in the primal objective function. [sent-144, score-0.117]
63 In this case, the role of the quantities with and without asterisks would be reversed in Proposition 1. [sent-146, score-0.128]
64 Using arguments similar to those in [8, 9], it can be shown that asymptotically the two inequalities in the proposition become equalities with probability 1. [sent-154, score-0.088]
65 Implementation On larger problems, solving the dual with standard QP solvers becomes too expensive (scaling with m3 ). [sent-155, score-0.154]
66 The adaptation of known decomposition methods to the present case is straightforward, noticing that the dual of the standard ε-SV regression algorithm [11] becomes almost identical to the present dual if we set ε = (δ ∗ − δ)/2 and yi = −(δ ∗ + δ)/2 for all i. [sent-157, score-0.398]
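This reduction also gives a convenient way to prototype the method with an existing ε-SVR implementation. The sketch below uses scikit-learn's SVR as that solver; since the standard SVR dual is only almost identical to the slab dual (it treats the offset differently and uses a constant C rather than ν), this is an approximation for experimentation, not the paper's decomposition algorithm. All names are ours.

```python
# Sketch: approximate slab SVM via epsilon-SV regression, using the
# substitution epsilon = (delta* - delta)/2 and y_i = -(delta* + delta)/2.
import numpy as np
from sklearn.svm import SVR

def slab_svm_via_svr(points, delta, delta_star, sigma, C=1.0):
    """Fit an SVR with constant targets; an approximation of the slab SVM,
    since SVR handles the offset rho differently and uses C instead of nu."""
    eps = (delta_star - delta) / 2.0
    level = -(delta_star + delta) / 2.0
    y = np.full(len(points), level)
    svr = SVR(kernel="rbf", gamma=1.0 / (2 * sigma ** 2), epsilon=eps, C=C)
    svr.fit(points, y)
    return svr, level

# Usage sketch: svr, level = slab_svm_via_svr(points, -0.01, 0.01, sigma=0.05)
# The implicit value at query points X is svr.predict(X) - level; its zero set
# plays the role of the surface model.
```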
67 Experimental Results In all our experiments we used a Gaussian kernel (2). [sent-161, score-0.144]
68 To visualize the result, i.e., the zero-set f −1 (0), we generated a triangle mesh that approximates it. [sent-164, score-0.165]
69 To compute the mesh we used an adaptation of the marching cubes algorithm [5], which is a standard technique to transform an implicitly given surface into a mesh. [sent-165, score-0.537]
70 The most costly operations in the marching cubes algorithm are evaluations of the kernel expansion (18). [sent-166, score-0.617]
71 To reduce the number of these evaluations we implemented a surface following technique that exploits the fact that we already know many sample points on the surface, namely the support vectors. [sent-167, score-0.616]
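Without the surface-following optimization, a plain grid-based marching cubes run already yields a mesh. The sketch below assumes scikit-image (skimage.measure.marching_cubes) and the kernel_expansion helper from the earlier sketch, and it evaluates the expansion at every grid node, which is exactly the cost the surface-following technique avoids.

```python
# Sketch: extract a triangle mesh of the zero set f^{-1}(0) with marching cubes.
# kernel_expansion is the helper defined in the earlier sketch.
import numpy as np
from skimage.measure import marching_cubes

def mesh_zero_set(X_sv, coeffs, sigma, rho=0.0, resolution=64, margin=0.1):
    """Evaluate the expansion on a regular grid and mesh its zero level set."""
    lo = X_sv.min(axis=0) - margin
    hi = X_sv.max(axis=0) + margin
    axes = [np.linspace(lo[k], hi[k], resolution) for k in range(3)]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)
    # The dominant cost: one kernel-expansion evaluation per grid node
    # (done in chunks to keep the pairwise-distance arrays small).
    values = np.concatenate([
        kernel_expansion(chunk, X_sv, coeffs, sigma, rho)
        for chunk in np.array_split(grid, 64)
    ]).reshape(resolution, resolution, resolution)
    spacing = tuple((hi - lo) / (resolution - 1))
    verts, faces, _, _ = marching_cubes(values, level=0.0, spacing=spacing)
    return verts + lo, faces
```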
72 Our experiments indicate a nice geometric interpretation of negative coefficients αi − αi∗ . [sent-169, score-0.068]
73 The coefficients seem well suited to extract shape features from the sample point set, e.g., [sent-171, score-0.112]
74 the detection of singularities like sharp edges or feature lines — which is an important topic in computer graphics [7]. [sent-173, score-0.369]
75 In this two-stage approach, a rough model is first computed from ten percent of the sample points using a slab SVM. [sent-175, score-0.492]
76 For the remaining 90% of the sample points we compute the residual values, i.e., [sent-176, score-0.302]
77 we evaluate the kernel expansion (18) at these sample points. [sent-178, score-0.091]
78 Finally we use support vector regression (SVR) and the residual values to derive a new kernel expansion (using a smaller kernel width) whose zero set we use as our surface model. [sent-179, score-0.916]
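Put together, the second stage of this procedure can be sketched as follows. The helper is ours: rough_implicit is assumed to be a callable evaluating the rough model's expansion (18), rest_points are the remaining 90% of the samples, and the final combination rough_implicit(x) − svr.predict(x) is our reading of "a new kernel expansion whose zero set we use as our surface model", not a detail spelled out in the text.

```python
# Sketch of the refinement stage: fit SVR to the residuals of the rough model
# with a smaller kernel width, then correct the rough implicit function.
import numpy as np
from sklearn.svm import SVR

def refine_with_svr(rough_implicit, rest_points, sigma_fine, epsilon=1e-3, C=10.0):
    """rough_implicit: callable evaluating the rough model's expansion (18),
    fitted on ~10% of the samples; rest_points: the remaining ~90%."""
    residuals = rough_implicit(rest_points)          # values of (18) at the 90%
    svr = SVR(kernel="rbf", gamma=1.0 / (2 * sigma_fine ** 2),
              epsilon=epsilon, C=C)
    svr.fit(rest_points, residuals)
    # Corrected implicit function: approximately zero at every sample point.
    # (The combination rough - svr is our reading of the description above.)
    return lambda x: rough_implicit(x) - svr.predict(x)
```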
79 In the experiments, both the SVM optimization and the marching cubes rendering took up to about 2 hours. [sent-181, score-0.372]
80 Figure 4: First row: Computing a model of the Stanford bunny (35947 points) and of a golf club (16864 points) with the slab SVM. [sent-182, score-0.514]
81 The close-up of the ears and nose of the bunny shows the sample points colored according to the coefficients αi − αi∗ . [sent-183, score-0.334]
82 Dark gray points have negative coefficients and light gray points positive ones. [sent-184, score-0.276]
83 In the right figure we show the bottom of the golf club model. [sent-185, score-0.151]
84 Such details get leveled out by the limited resolution of the marching cubes method. [sent-188, score-0.335]
85 However, the information about these details is preserved and can be detected in the SVM solution, as can be seen from the color coding. [sent-189, score-0.059]
86 Second row: In the left and middle figures we show the results of the slab SVM method on the screwdriver model (27152 points) and the dinosaur model (13990 points), respectively. [sent-190, score-0.447]
87 In the right figure a color coding of the coefficients for the rockerarm data set (40177 points) is shown. [sent-191, score-0.14]
88 Note that we can extract sharp features from this data set by filtering the coefficients according to some threshold. [sent-192, score-0.081]
89 The blobby support surface (left figure) was computed from 1000 randomly chosen sample points with the slab SVM. [sent-194, score-0.848]
90 In the middle we show a color coding of the residual values of all sample points. [sent-195, score-0.404]
91 In the right figure we show the surface that we get after applying support vector regression using the residual values. [sent-199, score-0.494]
92 We therefore consider it worthwhile to explore the algorithmic aspects of implicit surface estimation in more depth, including the study of regression-based approaches. [sent-202, score-0.497]
93 Some acquisition devices provide us not only with points from a surface embedded in R3 , but also with the normals at these points. [sent-203, score-0.518]
94 We expect that using this additional normal information will improve the quality of the computed models in the sense that even more geometric details are preserved. [sent-205, score-0.105]
95 A feature of our approach is that its complexity depends only marginally on the dimension of the input space (in our examples this was three). [sent-206, score-0.058]
96 Thus the approach should also work well for hypersurfaces in higher-dimensional input spaces. [sent-207, score-0.081]
97 From an applications point of view hypersurfaces might not be as interesting as manifolds of higher co-dimension. [sent-208, score-0.081]
98 The bunny data were taken from the Stanford 3d model repository. [sent-211, score-0.121]
99 The screwdriver, dinosaur and rockerarm data were taken from the homepage of Cyberware Inc. [sent-212, score-0.162]
100 Efficient implementation of marching cubes cases with topological guarantee. [sent-259, score-0.335]
wordName wordTfidf (topN-words)
[('surface', 0.289), ('slab', 0.242), ('ol', 0.21), ('sv', 0.206), ('marching', 0.175), ('hypersurface', 0.161), ('cubes', 0.16), ('implicit', 0.159), ('hyperplane', 0.155), ('dual', 0.154), ('graphics', 0.149), ('kernel', 0.144), ('slack', 0.139), ('points', 0.138), ('bunny', 0.121), ('giesen', 0.121), ('gure', 0.103), ('svms', 0.1), ('outliers', 0.099), ('outlier', 0.092), ('expansion', 0.091), ('svs', 0.089), ('residual', 0.089), ('xi', 0.088), ('rkhs', 0.086), ('mesh', 0.084), ('sharp', 0.081), ('asterisks', 0.081), ('dinosaur', 0.081), ('golf', 0.081), ('hypersurfaces', 0.081), ('rockerarm', 0.081), ('screwdriver', 0.081), ('singularities', 0.081), ('smallness', 0.081), ('hyperplanes', 0.08), ('svm', 0.077), ('surfaces', 0.077), ('hilbert', 0.076), ('sample', 0.075), ('cients', 0.075), ('objective', 0.073), ('svr', 0.07), ('club', 0.07), ('sgn', 0.069), ('geometric', 0.068), ('support', 0.067), ('coef', 0.066), ('forum', 0.064), ('hemisphere', 0.064), ('laser', 0.064), ('lie', 0.064), ('asymptotically', 0.059), ('color', 0.059), ('feature', 0.058), ('xm', 0.058), ('penalized', 0.057), ('diameter', 0.053), ('proposition', 0.053), ('transformation', 0.052), ('subject', 0.049), ('regression', 0.049), ('libsvm', 0.049), ('devices', 0.049), ('kkt', 0.047), ('evaluations', 0.047), ('quantities', 0.047), ('triangle', 0.046), ('solution', 0.045), ('primal', 0.044), ('contained', 0.044), ('sch', 0.044), ('middle', 0.043), ('query', 0.043), ('reproducing', 0.043), ('zero', 0.043), ('acquisition', 0.042), ('rm', 0.041), ('adaptation', 0.041), ('region', 0.041), ('implies', 0.039), ('illustration', 0.039), ('lkopf', 0.038), ('fraction', 0.038), ('quadratic', 0.038), ('offset', 0.038), ('origin', 0.038), ('constraint', 0.037), ('computed', 0.037), ('suited', 0.037), ('boundaries', 0.037), ('reconstruction', 0.037), ('optimization', 0.037), ('training', 0.035), ('approximates', 0.035), ('equalities', 0.035), ('skip', 0.035), ('sizeable', 0.035), ('greek', 0.035), ('mitigates', 0.035)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999881 92 nips-2004-Kernel Methods for Implicit Surface Modeling
Author: Joachim Giesen, Simon Spalinger, Bernhard Schölkopf
Abstract: We describe methods for computing an implicit model of a hypersurface that is given only by a finite sampling. The methods work by mapping the sample points into a reproducing kernel Hilbert space and then determining regions in terms of hyperplanes. 1
2 0.29504594 179 nips-2004-Surface Reconstruction using Learned Shape Models
Author: Jan E. Solem, Fredrik Kahl
Abstract: We consider the problem of geometrical surface reconstruction from one or several images using learned shape models. While humans can effortlessly retrieve 3D shape information, this inverse problem has turned out to be difficult to perform automatically. We introduce a framework based on level set surface reconstruction and shape models for achieving this goal. Through this merging, we obtain an efficient and robust method for reconstructing surfaces of an object category of interest. The shape model includes surface cues such as point, curve and silhouette features. Based on ideas from Active Shape Models, we show how both the geometry and the appearance of these features can be modelled consistently in a multi-view context. The complete surface is obtained by evolving a level set driven by a PDE, which tries to fit the surface to the inferred 3D features. In addition, an a priori 3D surface model is used to regularize the solution, in particular, where surface features are sparse. Experiments are demonstrated on a database of real face images.
3 0.18409534 34 nips-2004-Breaking SVM Complexity with Cross-Training
Author: Léon Bottou, Jason Weston, Gökhan H. Bakir
Abstract: We propose to selectively remove examples from the training set using probabilistic estimates related to editing algorithms (Devijver and Kittler, 1982). This heuristic procedure aims at creating a separable distribution of training examples with minimal impact on the position of the decision boundary. It breaks the linear dependency between the number of SVs and the number of training examples, and sharply reduces the complexity of SVMs during both the training and prediction stages. 1
4 0.16776021 187 nips-2004-The Entire Regularization Path for the Support Vector Machine
Author: Saharon Rosset, Robert Tibshirani, Ji Zhu, Trevor J. Hastie
Abstract: In this paper we argue that the choice of the SVM cost parameter can be critical. We then derive an algorithm that can fit the entire path of SVM solutions for every value of the cost parameter, with essentially the same computational cost as fitting one SVM model. 1
5 0.15272905 142 nips-2004-Outlier Detection with One-class Kernel Fisher Discriminants
Author: Volker Roth
Abstract: The problem of detecting “atypical objects” or “outliers” is one of the classical topics in (robust) statistics. Recently, it has been proposed to address this problem by means of one-class SVM classifiers. The main conceptual shortcoming of most one-class approaches, however, is that in a strict sense they are unable to detect outliers, since the expected fraction of outliers has to be specified in advance. The method presented in this paper overcomes this problem by relating kernelized one-class classification to Gaussian density estimation in the induced feature space. Having established this relation, it is possible to identify “atypical objects” by quantifying their deviations from the Gaussian model. For RBF kernels it is shown that the Gaussian model is “rich enough” in the sense that it asymptotically provides an unbiased estimator for the true density. In order to overcome the inherent model selection problem, a cross-validated likelihood criterion for selecting all free model parameters is applied. 1
6 0.14427313 186 nips-2004-The Correlated Correspondence Algorithm for Unsupervised Registration of Nonrigid Surfaces
7 0.13847497 144 nips-2004-Parallel Support Vector Machines: The Cascade SVM
8 0.13445888 168 nips-2004-Semigroup Kernels on Finite Sets
9 0.12959345 178 nips-2004-Support Vector Classification with Input Data Uncertainty
10 0.12499857 67 nips-2004-Exponentiated Gradient Algorithms for Large-margin Structured Classification
11 0.1146201 11 nips-2004-A Second Order Cone programming Formulation for Classifying Missing Data
12 0.11013333 42 nips-2004-Computing regularization paths for learning multiple kernels
13 0.10965475 115 nips-2004-Maximum Margin Clustering
14 0.10648818 133 nips-2004-Nonparametric Transforms of Graph Kernels for Semi-Supervised Learning
15 0.099443324 68 nips-2004-Face Detection --- Efficient and Rank Deficient
16 0.099239953 3 nips-2004-A Feature Selection Algorithm Based on the Global Minimization of a Generalization Error Bound
17 0.094713233 69 nips-2004-Fast Rates to Bayes for Kernel Machines
18 0.089816451 12 nips-2004-A Temporal Kernel-Based Model for Tracking Hand Movements from Neural Activities
19 0.084257744 134 nips-2004-Object Classification from a Single Example Utilizing Class Relevance Metrics
20 0.083542578 19 nips-2004-An Application of Boosting to Graph Classification
topicId topicWeight
[(0, -0.285), (1, 0.123), (2, -0.094), (3, 0.11), (4, -0.011), (5, 0.02), (6, 0.055), (7, -0.257), (8, -0.002), (9, -0.055), (10, 0.111), (11, -0.006), (12, -0.031), (13, -0.062), (14, -0.056), (15, 0.08), (16, -0.008), (17, 0.034), (18, -0.141), (19, 0.084), (20, -0.081), (21, 0.111), (22, -0.155), (23, 0.132), (24, 0.063), (25, -0.352), (26, -0.075), (27, -0.011), (28, -0.073), (29, 0.129), (30, 0.127), (31, 0.071), (32, 0.049), (33, 0.113), (34, -0.067), (35, -0.046), (36, 0.054), (37, -0.077), (38, -0.062), (39, -0.005), (40, -0.074), (41, 0.012), (42, -0.073), (43, -0.028), (44, -0.055), (45, -0.076), (46, 0.019), (47, 0.018), (48, -0.085), (49, 0.06)]
simIndex simValue paperId paperTitle
same-paper 1 0.94538373 92 nips-2004-Kernel Methods for Implicit Surface Modeling
Author: Joachim Giesen, Simon Spalinger, Bernhard Schölkopf
Abstract: We describe methods for computing an implicit model of a hypersurface that is given only by a finite sampling. The methods work by mapping the sample points into a reproducing kernel Hilbert space and then determining regions in terms of hyperplanes. 1
2 0.81771487 179 nips-2004-Surface Reconstruction using Learned Shape Models
Author: Jan E. Solem, Fredrik Kahl
Abstract: We consider the problem of geometrical surface reconstruction from one or several images using learned shape models. While humans can effortlessly retrieve 3D shape information, this inverse problem has turned out to be difficult to perform automatically. We introduce a framework based on level set surface reconstruction and shape models for achieving this goal. Through this merging, we obtain an efficient and robust method for reconstructing surfaces of an object category of interest. The shape model includes surface cues such as point, curve and silhouette features. Based on ideas from Active Shape Models, we show how both the geometry and the appearance of these features can be modelled consistently in a multi-view context. The complete surface is obtained by evolving a level set driven by a PDE, which tries to fit the surface to the inferred 3D features. In addition, an a priori 3D surface model is used to regularize the solution, in particular, where surface features are sparse. Experiments are demonstrated on a database of real face images.
3 0.69026697 186 nips-2004-The Correlated Correspondence Algorithm for Unsupervised Registration of Nonrigid Surfaces
Author: Dragomir Anguelov, Praveen Srinivasan, Hoi-cheung Pang, Daphne Koller, Sebastian Thrun, James Davis
Abstract: We present an unsupervised algorithm for registering 3D surface scans of an object undergoing significant deformations. Our algorithm does not need markers, nor does it assume prior knowledge about object shape, the dynamics of its deformation, or scan alignment. The algorithm registers two meshes by optimizing a joint probabilistic model over all point-topoint correspondences between them. This model enforces preservation of local mesh geometry, as well as more global constraints that capture the preservation of geodesic distance between corresponding point pairs. The algorithm applies even when one of the meshes is an incomplete range scan; thus, it can be used to automatically fill in the remaining surfaces for this partial scan, even if those surfaces were previously only seen in a different configuration. We evaluate the algorithm on several real-world datasets, where we demonstrate good results in the presence of significant movement of articulated parts and non-rigid surface deformation. Finally, we show that the output of the algorithm can be used for compelling computer graphics tasks such as interpolation between two scans of a non-rigid object and automatic recovery of articulated object models. 1
4 0.53051084 34 nips-2004-Breaking SVM Complexity with Cross-Training
Author: Léon Bottou, Jason Weston, Gökhan H. Bakir
Abstract: We propose to selectively remove examples from the training set using probabilistic estimates related to editing algorithms (Devijver and Kittler, 1982). This heuristic procedure aims at creating a separable distribution of training examples with minimal impact on the position of the decision boundary. It breaks the linear dependency between the number of SVs and the number of training examples, and sharply reduces the complexity of SVMs during both the training and prediction stages. 1
5 0.51692247 144 nips-2004-Parallel Support Vector Machines: The Cascade SVM
Author: Hans P. Graf, Eric Cosatto, Léon Bottou, Igor Dourdanovic, Vladimir Vapnik
Abstract: We describe an algorithm for support vector machines (SVM) that can be parallelized efficiently and scales to very large problems with hundreds of thousands of training vectors. Instead of analyzing the whole training set in one optimization step, the data are split into subsets and optimized separately with multiple SVMs. The partial results are combined and filtered again in a ‘Cascade’ of SVMs, until the global optimum is reached. The Cascade SVM can be spread over multiple processors with minimal communication overhead and requires far less memory, since the kernel matrices are much smaller than for a regular SVM. Convergence to the global optimum is guaranteed with multiple passes through the Cascade, but already a single pass provides good generalization. A single pass is 5x – 10x faster than a regular SVM for problems of 100,000 vectors when implemented on a single processor. Parallel implementations on a cluster of 16 processors were tested with over 1 million vectors (2-class problems), converging in a day or two, while a regular SVM never converged in over a week. 1
6 0.47143787 187 nips-2004-The Entire Regularization Path for the Support Vector Machine
7 0.4407295 142 nips-2004-Outlier Detection with One-class Kernel Fisher Discriminants
8 0.41884074 168 nips-2004-Semigroup Kernels on Finite Sets
9 0.41821009 93 nips-2004-Kernel Projection Machine: a New Tool for Pattern Recognition
10 0.39279565 60 nips-2004-Efficient Kernel Machines Using the Improved Fast Gauss Transform
11 0.3887271 94 nips-2004-Kernels for Multi--task Learning
12 0.38648894 14 nips-2004-A Topographic Support Vector Machine: Classification Using Local Label Configurations
13 0.38173383 68 nips-2004-Face Detection --- Efficient and Rank Deficient
14 0.38130593 15 nips-2004-Active Learning for Anomaly and Rare-Category Detection
15 0.37572014 67 nips-2004-Exponentiated Gradient Algorithms for Large-margin Structured Classification
16 0.36410758 177 nips-2004-Supervised Graph Inference
17 0.36063638 11 nips-2004-A Second Order Cone programming Formulation for Classifying Missing Data
18 0.35209754 3 nips-2004-A Feature Selection Algorithm Based on the Global Minimization of a Generalization Error Bound
19 0.35127771 81 nips-2004-Implicit Wiener Series for Higher-Order Image Analysis
20 0.34697086 49 nips-2004-Density Level Detection is Classification
topicId topicWeight
[(13, 0.071), (15, 0.58), (26, 0.078), (31, 0.011), (33, 0.128), (35, 0.013), (39, 0.017), (50, 0.027), (76, 0.01)]
simIndex simValue paperId paperTitle
1 0.98979175 50 nips-2004-Dependent Gaussian Processes
Author: Phillip Boyle, Marcus Frean
Abstract: Gaussian processes are usually parameterised in terms of their covariance functions. However, this makes it difficult to deal with multiple outputs, because ensuring that the covariance matrix is positive definite is problematic. An alternative formulation is to treat Gaussian processes as white noise sources convolved with smoothing kernels, and to parameterise the kernel instead. Using this, we extend Gaussian processes to handle multiple, coupled outputs. 1
2 0.98805261 59 nips-2004-Efficient Kernel Discriminant Analysis via QR Decomposition
Author: Tao Xiong, Jieping Ye, Qi Li, Ravi Janardan, Vladimir Cherkassky
Abstract: Linear Discriminant Analysis (LDA) is a well-known method for feature extraction and dimension reduction. It has been used widely in many applications such as face recognition. Recently, a novel LDA algorithm based on QR Decomposition, namely LDA/QR, has been proposed, which is competitive in terms of classification accuracy with other LDA algorithms, but it has much lower costs in time and space. However, LDA/QR is based on linear projection, which may not be suitable for data with nonlinear structure. This paper first proposes an algorithm called KDA/QR, which extends the LDA/QR algorithm to deal with nonlinear data by using the kernel operator. Then an efficient approximation of KDA/QR called AKDA/QR is proposed. Experiments on face image data show that the classification accuracy of both KDA/QR and AKDA/QR are competitive with Generalized Discriminant Analysis (GDA), a general kernel discriminant analysis algorithm, while AKDA/QR has much lower time and space costs. 1
3 0.98256987 13 nips-2004-A Three Tiered Approach for Articulated Object Action Modeling and Recognition
Author: Le Lu, Gregory D. Hager, Laurent Younes
Abstract: Visual action recognition is an important problem in computer vision. In this paper, we propose a new method to probabilistically model and recognize actions of articulated objects, such as hand or body gestures, in image sequences. Our method consists of three levels of representation. At the low level, we first extract a feature vector invariant to scale and in-plane rotation by using the Fourier transform of a circular spatial histogram. Then, spectral partitioning [20] is utilized to obtain an initial clustering; this clustering is then refined using a temporal smoothness constraint. Gaussian mixture model (GMM) based clustering and density estimation in the subspace of linear discriminant analysis (LDA) are then applied to thousands of image feature vectors to obtain an intermediate level representation. Finally, at the high level we build a temporal multiresolution histogram model for each action by aggregating the clustering weights of sampled images belonging to that action. We discuss how this high level representation can be extended to achieve temporal scaling invariance and to include Bi-gram or Multi-gram transition information. Both image clustering and action recognition/segmentation results are given to show the validity of our three tiered representation.
4 0.97127378 203 nips-2004-Validity Estimates for Loopy Belief Propagation on Binary Real-world Networks
Author: Joris M. Mooij, Hilbert J. Kappen
Abstract: We introduce a computationally efficient method to estimate the validity of the BP method as a function of graph topology, the connectivity strength, frustration and network size. We present numerical results that demonstrate the correctness of our estimates for the uniform random model and for a real-world network (“C. Elegans”). Although the method is restricted to pair-wise interactions, no local evidence (zero “biases”) and binary variables, we believe that its predictions correctly capture the limitations of BP for inference and MAP estimation on arbitrary graphical models. Using this approach, we find that BP always performs better than MF. Especially for large networks with broad degree distributions (such as scale-free networks) BP turns out to significantly outperform MF. 1
5 0.95405746 197 nips-2004-Two-Dimensional Linear Discriminant Analysis
Author: Jieping Ye, Ravi Janardan, Qi Li
Abstract: Linear Discriminant Analysis (LDA) is a well-known scheme for feature extraction and dimension reduction. It has been used widely in many applications involving high-dimensional data, such as face recognition and image retrieval. An intrinsic limitation of classical LDA is the so-called singularity problem, that is, it fails when all scatter matrices are singular. A well-known approach to deal with the singularity problem is to apply an intermediate dimension reduction stage using Principal Component Analysis (PCA) before LDA. The algorithm, called PCA+LDA, is used widely in face recognition. However, PCA+LDA has high costs in time and space, due to the need for an eigen-decomposition involving the scatter matrices. In this paper, we propose a novel LDA algorithm, namely 2DLDA, which stands for 2-Dimensional Linear Discriminant Analysis. 2DLDA overcomes the singularity problem implicitly, while achieving efficiency. The key difference between 2DLDA and classical LDA lies in the model for data representation. Classical LDA works with vectorized representations of data, while the 2DLDA algorithm works with data in matrix representation. To further reduce the dimension by 2DLDA, the combination of 2DLDA and classical LDA, namely 2DLDA+LDA, is studied, where LDA is preceded by 2DLDA. The proposed algorithms are applied on face recognition and compared with PCA+LDA. Experiments show that 2DLDA and 2DLDA+LDA achieve competitive recognition accuracy, while being much more efficient. 1
same-paper 6 0.95392382 92 nips-2004-Kernel Methods for Implicit Surface Modeling
7 0.94526535 183 nips-2004-Temporal-Difference Networks
8 0.88712835 9 nips-2004-A Method for Inferring Label Sampling Mechanisms in Semi-Supervised Learning
9 0.83825201 201 nips-2004-Using the Equivalent Kernel to Understand Gaussian Process Regression
10 0.83089763 148 nips-2004-Probabilistic Computation in Spiking Populations
11 0.82180548 79 nips-2004-Hierarchical Eigensolver for Transition Matrices in Spectral Methods
12 0.81752026 168 nips-2004-Semigroup Kernels on Finite Sets
13 0.80544358 188 nips-2004-The Laplacian PDF Distance: A Cost Function for Clustering in a Kernel Feature Space
14 0.78901494 178 nips-2004-Support Vector Classification with Input Data Uncertainty
15 0.78877074 110 nips-2004-Matrix Exponential Gradient Updates for On-line Learning and Bregman Projection
16 0.78672564 94 nips-2004-Kernels for Multi--task Learning
17 0.78612638 30 nips-2004-Binet-Cauchy Kernels
18 0.77855808 98 nips-2004-Learning Gaussian Process Kernels via Hierarchical Bayes
19 0.7764408 18 nips-2004-Algebraic Set Kernels with Application to Inference Over Local Image Representations
20 0.77150595 16 nips-2004-Adaptive Discriminative Generative Model and Its Applications