nips nips2004 nips2004-179 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jan E. Solem, Fredrik Kahl
Abstract: We consider the problem of geometrical surface reconstruction from one or several images using learned shape models. While humans can effortlessly retrieve 3D shape information, this inverse problem has turned out to be difficult to perform automatically. We introduce a framework based on level set surface reconstruction and shape models for achieving this goal. Through this merging, we obtain an efficient and robust method for reconstructing surfaces of an object category of interest. The shape model includes surface cues such as point, curve and silhouette features. Based on ideas from Active Shape Models, we show how both the geometry and the appearance of these features can be modelled consistently in a multi-view context. The complete surface is obtained by evolving a level set driven by a PDE, which tries to fit the surface to the inferred 3D features. In addition, an a priori 3D surface model is used to regularize the solution, in particular, where surface features are sparse. Experiments are demonstrated on a database of real face images.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We consider the problem of geometrical surface reconstruction from one or several images using learned shape models. [sent-5, score-1.186]
2 While humans can effortlessly retrieve 3D shape information, this inverse problem has turned out to be difficult to perform automatically. [sent-6, score-0.152]
3 We introduce a framework based on level set surface reconstruction and shape models for achieving this goal. [sent-7, score-1.113]
4 Through this merging, we obtain an efficient and robust method for reconstructing surfaces of an object category of interest. [sent-8, score-0.254]
5 The shape model includes surface cues such as point, curve and silhouette features. [sent-9, score-1.069]
6 The complete surface is obtained by evolving a level set driven by a PDE, which tries to fit the surface to the inferred 3D features. [sent-11, score-1.655]
7 In addition, an a priori 3D surface model is used to regularize the solution, in particular, where surface features are sparse. [sent-12, score-1.654]
8 1 Introduction Humans have a remarkable ability to perceive 3D shape information - even from a single photograph. [sent-14, score-0.152]
9 Exactly how the human visual system works or how shape is represented is to a large extent unknown. [sent-15, score-0.178]
10 It is clear that the capabilities rely on a strong prior model and the efficient use of different surface cues. [sent-16, score-0.787]
11 The corresponding task of automatically recovering a 3D surface model for a computer has turned out to be a challenging problem, even with the addition of multiple images. [sent-17, score-0.847]
12 At the heart of our approach lies the combination of sophisticated surface reconstruction techniques and a strong statistical model of surface features. [sent-19, score-1.666]
13 The first part concerns the statistical model of surface features for inferring 3D shape. [sent-20, score-0.845]
14 The features are primarily geometric ones, such as point, curve and silhouette features. [sent-21, score-0.196]
15 Both the geometric relations and their appearances (in terms of image intensities) are modelled. [sent-22, score-0.115]
16 Also, a 3D model of the complete surface is used as a weak regularizer where surface features are sparse. [sent-24, score-1.623]
17 We are interested in automatically recovering a surface model given new image data of the object of interest. [sent-26, score-0.928]
18 It is a hard problem to robustly extract curves and apparent contours (i. [sent-27, score-0.327]
19 Our a priori model will work as a domain-specific regularizer. [sent-33, score-0.11]
20 The second part of this work deals with fitting surfaces to points and curves in 3D space, and at the same time, fitting the projections of surface contours to apparent contours in the images. [sent-34, score-1.467]
21 This variational problem leads to a surface evolution, driven by a PDE. [sent-36, score-0.786]
22 The surface is represented implicitly as the level set of a real-valued function [1]. [sent-37, score-0.851]
23 1 Related Work In the area of statistical shape models, our work is related to and inspired by Active Shape Models (ASM) [2]. [sent-39, score-0.152]
24 In the seminal work [5], a complete 3D model is built from a database of 200 laser scans of faces. [sent-44, score-0.21]
25 The so-called morphable model can be fitted to an image with very impressive results. [sent-45, score-0.198]
26 A generic face model has also been used in [7] for computing regularized structure and motion as well as in [8] based on silhouettes obtained from background subtraction. [sent-48, score-0.164]
27 In the area of using level set surface representations for fitting surfaces to 3D data, this paper is related to the work in [9] where surfaces are fitted to points using a distance potential. [sent-49, score-1.307]
28 In [10] apparent contours were incorporated for level set based surface fitting. [sent-50, score-1.112]
29 2 Contribution of the Paper The main contribution of this paper is the approach itself - the combination of a state-of-the-art surface reconstruction technique and the learned statistical model of surface features. [sent-56, score-1.74]
30 This merging results in a robust and efficient method for surface reconstruction without any need for manual intervention. [sent-57, score-0.941]
31 Another key contribution is the introduction of a multi-view feature model which is capable of representing both 2D and 3D data in a consistent manner. [sent-59, score-0.111]
32 By only incorporating distinct surface features, compared to a full morphable model, we not only gain computational efficiency, but also robustness to specularities and other illumination effects. [sent-61, score-0.901]
33 This point is also valid when compared to other surface reconstruction methods based on image-correlation [11]. [sent-62, score-0.905]
34 The main contribution within the field of level sets is in the incorporation of an a priori 3D model used for surface regularization. [sent-64, score-0.966]
35 2 Part I: A Learned Shape Model In this section, we develop a statistical model which is needed in order to be able to automatically extract and compute 3D surface features from the input images. [sent-65, score-0.876]
36 The output from the model will serve as input to the level set reconstruction algorithm in order to get a complete surface model. [sent-66, score-1.04]
37 In addition, we will use an a priori 3D surface model as a (weak) regularizer. [sent-67, score-0.853]
38 However, our measurements take place in the images, which are a non-linear function of the 3D features according to the perspective camera model. [sent-71, score-0.173]
39 Denote the projection function with f : Rd → Re , projecting all 3D features to 2D image features, for one or several images.1 [sent-72, score-0.256]
40 We assume that the image measurements are independent and normally distributed, likewise, the latent variables are assumed to be Gaussian with unit variance u ∼ N (0, I). [sent-77, score-0.127]
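As a sketch of this linear-Gaussian feature model, sampling t = Wu + µ + ε with u ~ N(0, I) can be written as follows; the sizes, loadings and noise level below are illustrative placeholders, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

d, q = 12, 3                      # feature and latent dimensions (illustrative)
W = rng.standard_normal((d, q))   # hypothetical factor-loading matrix
mu = rng.standard_normal(d)       # mean feature vector
sigma2 = 0.01                     # isotropic measurement-noise variance

def sample_features(n):
    """Draw n feature vectors t = W u + mu + eps with u ~ N(0, I)."""
    u = rng.standard_normal((n, q))
    eps = np.sqrt(sigma2) * rng.standard_normal((n, d))
    return u @ W.T + mu + eps

# The implied feature covariance is W W^T + sigma2*I; a large sample
# should reproduce it.
T = sample_features(10000)
C_model = W @ W.T + sigma2 * np.eye(d)
C_sample = np.cov(T, rowvar=False)
```

In the paper the latent dimensions are q = 12 for the geometry and q_gl = 15 for the grey-level layer (Section 5.1); there, W, µ and the noise variance are learned from the labelled training triplets rather than drawn at random.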
41 A 3D point which is visible in m > 1 images will be represented in the vector t with its 3D coordinates (X, Y, Z). [sent-87, score-0.157]
42 For points visible in only one image, m = 1, no depth information is available, and such points are represented similarly to apparent contour points. [sent-88, score-0.398]
43 A curve will be represented in the model by a number of points along the curve. [sent-89, score-0.192]
44 In the training of the model, it is important to parameterize each 3D curve such that each point on the curve approximately corresponds to the same point on the corresponding curve in the other examples. [sent-90, score-0.247]
45 As for curves, we sample the apparent contours (in the images) using arc-length parametrization. [sent-91, score-0.287]
46 However, there is no 3D information available for the apparent contours as they are view-dependent. [sent-92, score-0.287]
47 A simple way is to treat contour points as 3D points with a constant, approximate (but crude) depth estimate. [sent-93, score-0.304]
48 1 In the experiments, f (·) will model the projection of three calibrated perspective cameras. [sent-94, score-0.117]
49 2 The Grey-Level Model The missing component in the model is the relationship between 2D image features and the underlying grey-level (or color) values at these pixels. [sent-96, score-0.183]
50 Using the same notation as in (1), but now with the subscript gl for grey-level, the model can be written t_gl = W_gl u_gl + µ_gl + ε_gl, where t_gl is a vector containing the grey-level values of all the 2D image features and ε_gl is Gaussian noise in the measurements. [sent-98, score-0.651]
51 The complete statistical two-layer model with one feature model and one grey-level model is very similar to the concept of ASM [2]. [sent-101, score-0.203]
52 3 The 3D Model The two-layer feature model produces only a sparse set of features in 3D space. [sent-106, score-0.138]
53 Even if these cues are characteristic for a particular sample (or individual), they are often not enough to infer a complete surface model, in particular in regions where the features are sparse. [sent-107, score-0.862]
54 Therefore, we introduce a 3D surface model consisting of the complete mean surface, serving as a domain-specific regularizer. [sent-108, score-1.604]
55 The mean surface is obtained from laser scans with the technique described in [13]. [sent-109, score-0.878]
56 The time dependent surface Γ(t) is represented implicitly as the zero level set of a function φ(x, t) : Ω × R+ → R as Γ(t) = {x ; φ(x, t) = 0} , (4) where φ is defined such that φ(x, t) < 0 inside Γ and φ(x, t) > 0 outside. [sent-112, score-0.851]
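A minimal numerical illustration of the sign convention in (4); the grid size and the spherical example are mine, not the paper's. A signed function that is negative inside Γ and positive outside can be sampled on a regular grid and checked:

```python
import numpy as np

# Signed distance to a sphere of radius r, negative inside and positive
# outside, matching the sign convention of (4).
n, r = 64, 0.5
ax = np.linspace(-1.0, 1.0, n)
X, Y, Z = np.meshgrid(ax, ax, ax, indexing="ij")
phi = np.sqrt(X**2 + Y**2 + Z**2) - r

# Gamma = {x : phi(x) = 0} separates the sign regions; counting the
# negative voxels approximates the enclosed volume.
volume = (phi < 0).mean() * 8.0   # the grid covers [-1, 1]^3 (volume 8)
```

For the sphere the negative-voxel count should approach 4/3·π·r³ as the grid is refined, which is a quick sanity check on the implicit representation.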
57 This PDE is solved in order to move the surface according to some derived velocity v. [sent-118, score-0.774]
58 2 Surface Fitting to Points In [9] surfaces are fitted to a point set S using the level set method. [sent-121, score-0.297]
59 An initial surface is deformed in the gradient direction of an energy functional which involves elastic energy and potential energy. [sent-122, score-0.852]
60 The energy is expressed using a distance potential as the surface integral E_P(Γ) = ∫_Γ d(x) dσ, (7) where Γ is the surface, dσ the surface area element and d(x) = dist(x, S) is the distance from the point x to S. [sent-123, score-1.677]
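Numerically, a surface integral like E_P in (7) is commonly evaluated with a smoothed delta function of the level set function; the cosine-shaped delta and the sphere test case below are standard choices of mine, not taken from the paper:

```python
import numpy as np

def surface_energy(phi, d, h, eps=1.5):
    """Approximate the integral of d over the zero set of phi via
    sum( d * delta_eps(phi) * |grad phi| ) * h^3 on a regular grid."""
    w = eps * h                                    # half-width of the delta
    delta = np.where(np.abs(phi) < w,
                     (1.0 + np.cos(np.pi * phi / w)) / (2.0 * w), 0.0)
    g = np.gradient(phi, h)
    grad_norm = np.sqrt(sum(gi**2 for gi in g))
    return float((d * delta * grad_norm).sum() * h**3)

# Sanity check: with d = 1 the energy is the surface area.
n = 64
ax = np.linspace(-1.0, 1.0, n)
X, Y, Z = np.meshgrid(ax, ax, ax, indexing="ij")
phi = np.sqrt(X**2 + Y**2 + Z**2) - 0.5
area = surface_energy(phi, np.ones_like(phi), ax[1] - ax[0])
# area should be close to 4*pi*0.5^2 = pi
```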
61 Minimizing (7) gives the gradient descent flow of [9], φ_t = (∇d(x) · n + d(x)κ) |∇φ|, (8) where n is the surface normal and κ the mean curvature. [sent-125, score-0.782]
62 This motion is known to interfere with surface regularization, since all surface points are attracted to the 3D features. [sent-126, score-1.583]
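A simplified explicit discretization of (8) on a 2D grid might look as follows. This is my sketch: it uses plain central differences and omits upwinding, narrow banding and reinitialization, which a practical level set solver would need:

```python
import numpy as np

def evolve_step(phi, d, dt=1e-3, h=0.02, eps=1e-9):
    """One explicit Euler step of phi_t = (grad d . n + d*kappa) |grad phi|."""
    gx, gy = np.gradient(phi, h)
    grad_norm = np.sqrt(gx**2 + gy**2) + eps
    nx, ny = gx / grad_norm, gy / grad_norm                 # unit normal n
    kappa = np.gradient(nx, h, axis=0) + np.gradient(ny, h, axis=1)  # div(n)
    dx, dy = np.gradient(d, h)
    speed = dx * nx + dy * ny + d * kappa
    return phi + dt * speed * grad_norm

# Toy setup: a circle of radius 0.5 attracted to a circle of radius 0.3.
ax = np.linspace(-1.0, 1.0, 101)
X, Y = np.meshgrid(ax, ax, indexing="ij")
R = np.sqrt(X**2 + Y**2)
phi0 = R - 0.5                      # initial interface
d = np.abs(R - 0.3)                 # distance potential to the target
phi1 = evolve_step(phi0, d, dt=1e-3, h=ax[1] - ax[0])
```

With d identically zero the speed vanishes and the interface stays put, which is a useful correctness check for the discretization.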
63 3 Surface Fitting to Apparent Contours Let γ be an apparent contour in an image parameterized as γ(s) : I ⊂ R → R 2 . [sent-129, score-0.272]
64 The back-projected cone, written in homogeneous coordinates, C(s) = c + λP⁺γ(s), (9) (where c is the camera center and P⁺ the pseudo-inverse of the 3 × 4 camera matrix of a perspective projection) should graze the surface at the location of the contour generator. [sent-130, score-0.995]
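A small numerical check of the back-projection in (9); the camera matrix and image point below are made up. Every point c + λP⁺γ(s) on the back-projected ray reprojects to the original image point:

```python
import numpy as np

P = np.array([[800.0, 0.0, 320.0, 0.0],      # hypothetical 3x4 camera matrix
              [0.0, 800.0, 240.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
P_plus = np.linalg.pinv(P)                    # pseudo-inverse P^+

# The camera centre c spans the right null space of P (P c = 0).
_, _, Vt = np.linalg.svd(P)
c = Vt[-1]

m = np.array([100.0, 50.0, 1.0])              # homogeneous image point gamma(s)
lam = 2.5
Xh = c + lam * (P_plus @ m)                   # point on the back-projected cone

x = P @ Xh
x = x / x[2]                                  # reprojected image point
```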
65 It is undesirable to attract the surface to the entire back-projected cone of the apparent contour. [sent-131, score-0.92]
66 The cone should only touch the surface along a curve - the so-called contour generator. [sent-132, score-0.904]
67 For each point on the curve m = γ(s), let x∗ denote the point closest to Γ (If there are several, then choose the one with smallest depth). [sent-134, score-0.117]
68 This set is then added to the distance potential as d(x) = min(d(x), d′(x)), (10) where d′(x) = dist(x, S′) is updated at appropriate intervals as the surface evolves. [sent-140, score-0.823]
69 Instead of the common choice of minimal surface type models, we propose to use a learned shape model, as described in Section 2. [sent-145, score-0.938]
70 By first aligning the mean shape to the data, the deviation can be expressed similarly to (7) as E_Prior(Γ) = ∫_Γ d_Prior(x) dσ, (11) where d_Prior(x) is the distance potential of the aligned mean shape. [sent-147, score-0.697]
71 5 The Combined Motion PDE Adding all the components above gives a functional E_Tot = E_P + αE_Prior, where α ∈ R+ determines the weight of the prior shape. [sent-149, score-0.159]
72 Combining the components (8), (10) and (11) above leads to a PDE for the motion of the surface as φ_t = [(∇d(x) + α∇d_Prior(x)) · n + (d(x) + αd_Prior(x))κ] |∇φ|. (12) [sent-150, score-1.041]
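Putting (10) and the prior term together, the scalar potential driving the combined flow can be sketched as follows; the value of α and the toy arrays are illustrative:

```python
import numpy as np

def total_potential(d_points, d_contour, d_prior, alpha=0.3):
    """Merge the point/curve distance field with the contour-generator
    field via a pointwise minimum, eq. (10), and add the alpha-weighted
    distance to the aligned mean shape, as in E_Tot = E_P + alpha*E_Prior."""
    d = np.minimum(d_points, d_contour)
    return d + alpha * d_prior
```

For example, total_potential(np.array([1., 3.]), np.array([2., 1.]), np.array([0., 2.]), alpha=0.5) gives [1., 2.]: the minimum selects whichever data term is closer at each grid point before the prior is blended in.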
73 Figure 1: Extracted image features for one image triple and the reconstructed 3D surface. [sent-152, score-0.278]
74 1 The Shape Model All images were taken by a stereo setup with three (synchronized) cameras. [sent-154, score-0.147]
75 The setup was pre-calibrated for both extrinsic and intrinsic camera parameters. [sent-155, score-0.113]
76 In total, a database of 28 image triplets of faces from different persons (23 males and 5 females, ages ranging from 7 to 65) was collected. [sent-157, score-0.289]
77 The 3D mean surface was computed from a database of 24 persons (different from the 28 persons above) using a laser scanner as described in [13]. [sent-159, score-1.06]
78 The 28 triplets in the training and test set were manually labelled - 36 points, 8 curves and 5 apparent contours were extracted for each person, see Figure 1. [sent-160, score-0.366]
79 The two-layer model has q = 12 elements in the latent variable u for the geometrical part and q_gl = 15 elements in u_gl for the grey-level model. [sent-161, score-0.281]
80 In fact, even for one (frontal) input image of the test set, the model predicts the two profile views remarkably well. [sent-164, score-0.125]
81 2 Surface Reconstruction Once 3D data has been obtained by fitting the two-layer shape model to the input images, surfaces are fitted to this data by solving the PDE (12). [sent-167, score-0.385]
82 In the standard level set formulation, the surface must be closed in the computational domain in order to make φ continuous. [sent-168, score-0.825]
83 Surfaces were reconstructed for a number of persons; a selection is shown in Figures 1 and 2, where the zero set is visualized and triangulated using the marching cubes algorithm. [sent-170, score-0.144]
84 3 and dmax was set from the maximum distance of the feature points to the initial surface (4-5 voxels). [sent-172, score-0.92]
85 For the reconstructions in Figure 2, the mean and median distances (measured in voxel width) of the feature points to the reconstructed surface have been computed. [sent-173, score-1.05]
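One way to obtain such point-to-surface statistics (my sketch, assuming φ stays close to a signed distance near the zero set, so that |φ(p)| approximates the distance to the surface) is to trilinearly interpolate φ at the feature points:

```python
import numpy as np

def interp_phi(phi, pts, origin, h):
    """Trilinear interpolation of a gridded level set function phi at 3D
    points pts (n, 3); origin is the position of voxel (0,0,0), h the spacing."""
    idx = (pts - origin) / h
    i0 = np.floor(idx).astype(int)
    f = idx - i0
    out = np.zeros(len(pts))
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((f[:, 0] if dx else 1 - f[:, 0])
                     * (f[:, 1] if dy else 1 - f[:, 1])
                     * (f[:, 2] if dz else 1 - f[:, 2]))
                out += w * phi[i0[:, 0] + dx, i0[:, 1] + dy, i0[:, 2] + dz]
    return out

# Exactness check on a linear field phi(x, y, z) = x.
ax = np.linspace(-1.0, 1.0, 21)
X, _, _ = np.meshgrid(ax, ax, ax, indexing="ij")
pts = np.array([[0.13, -0.2, 0.55]])
vals = interp_phi(X, pts, np.array([-1.0, -1.0, -1.0]), 0.1)
```

The mean and median of np.abs(interp_phi(phi, feature_points, origin, h)) / h would then give distance statistics in voxel widths, as reported in the paper.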
86 The deviation is of the same order as the surface resolution. [sent-177, score-0.743]
87 This shows that we are not fitting a surface to the mean shape but that it really fits the feature data. [sent-178, score-0.97]
88 5 Conclusions and Future Work In this paper, a framework for reconstructing surface geometry based on learned shape models and level sets has been presented. [sent-180, score-1.131]
89 The approach relies solely on images as input and the output of the system is a geometric 3D surface consistent with the input images. [sent-181, score-0.854]
90 A new regularization approach is introduced in the level set framework, where the distance to an a priori 3D model is used instead of the common mean curvature regularization. [sent-182, score-0.311]
91 Figure 2: Input images and the reconstructed surface for three persons, two in the training data and one (bottom) in the test data. [sent-183, score-0.878]
92 For each person: One of the input images, triangulated surfaces and surfaces with texture. [sent-184, score-0.421]
93 [Figure 3 data: histogram of the distances to the surface (in voxel width) of all 3D feature points, for Persons 1, 2 and 3.] [sent-186, score-0.886]
94 Figure 3: Histogram of the deviations of the feature points from the surface for the second person in Figure 2 (mean 0.843, in voxel width). [sent-192, score-0.969]
95 Currently, only images taken with the tri-stereo setup have been used with heads facing the middle camera (cf. [sent-196, score-0.227]
96 Active shape model search using local grey-level models: A quantatitative evaluation. [sent-208, score-0.196]
97 Regularized bundle-adjustment to model heads from image sequences without calibration data. [sent-246, score-0.162]
98 Implicit and non-parametric shape reconstruction from unorganized points using a variational level set method. [sent-264, score-0.52]
99 Surface reconstruction from the projection of points, curves and contours. [sent-270, score-0.216]
100 Variational principles, surface evolution, PDEs, level set methods, and the stereo problem. [sent-276, score-0.864]
wordName wordTfidf (topN-words)
[('surface', 0.743), ('surfaces', 0.189), ('shape', 0.152), ('contours', 0.151), ('apparent', 0.136), ('reconstruction', 0.136), ('rior', 0.129), ('pde', 0.129), ('asm', 0.124), ('gl', 0.106), ('persons', 0.092), ('camera', 0.082), ('level', 0.082), ('image', 0.081), ('images', 0.077), ('solem', 0.074), ('morphable', 0.073), ('ep', 0.069), ('person', 0.068), ('priori', 0.066), ('curve', 0.065), ('reconstructing', 0.065), ('laser', 0.059), ('reconstructed', 0.058), ('features', 0.058), ('points', 0.057), ('ppca', 0.055), ('contour', 0.055), ('evolving', 0.052), ('specularities', 0.05), ('tgl', 0.05), ('ugl', 0.05), ('unorganized', 0.05), ('wgl', 0.05), ('dp', 0.048), ('tted', 0.048), ('distance', 0.047), ('vision', 0.047), ('geometry', 0.046), ('latent', 0.046), ('tting', 0.044), ('model', 0.044), ('cootes', 0.043), ('marching', 0.043), ('signed', 0.043), ('triangulated', 0.043), ('fredrik', 0.043), ('dist', 0.043), ('learned', 0.043), ('variational', 0.043), ('faces', 0.042), ('face', 0.041), ('cone', 0.041), ('median', 0.041), ('motion', 0.04), ('projection', 0.04), ('curves', 0.04), ('mean', 0.039), ('silhouette', 0.039), ('silhouettes', 0.039), ('reconstructions', 0.039), ('triplets', 0.039), ('stereo', 0.039), ('depth', 0.039), ('energy', 0.038), ('dmax', 0.037), ('voxel', 0.037), ('heads', 0.037), ('scans', 0.037), ('feature', 0.036), ('modelling', 0.036), ('database', 0.035), ('complete', 0.035), ('histogram', 0.035), ('illumination', 0.035), ('geometrical', 0.035), ('geometric', 0.034), ('perspective', 0.033), ('potential', 0.033), ('merging', 0.033), ('curvature', 0.033), ('fitting', 0.033), ('vn', 0.033), ('velocity', 0.031), ('setup', 0.031), ('contribution', 0.031), ('automatically', 0.031), ('utilized', 0.03), ('ot', 0.03), ('active', 0.03), ('open', 0.029), ('manual', 0.029), ('recovering', 0.029), ('deviations', 0.028), ('visible', 0.028), ('priors', 0.028), ('cues', 0.026), ('point', 0.026), ('represented', 0.026), ('figures', 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000011 179 nips-2004-Surface Reconstruction using Learned Shape Models
Author: Jan E. Solem, Fredrik Kahl
Abstract: We consider the problem of geometrical surface reconstruction from one or several images using learned shape models. While humans can effortlessly retrieve 3D shape information, this inverse problem has turned out to be difficult to perform automatically. We introduce a framework based on level set surface reconstruction and shape models for achieving this goal. Through this merging, we obtain an efficient and robust method for reconstructing surfaces of an object category of interest. The shape model includes surface cues such as point, curve and silhouette features. Based on ideas from Active Shape Models, we show how both the geometry and the appearance of these features can be modelled consistently in a multi-view context. The complete surface is obtained by evolving a level set driven by a PDE, which tries to fit the surface to the inferred 3D features. In addition, an a priori 3D surface model is used to regularize the solution, in particular, where surface features are sparse. Experiments are demonstrated on a database of real face images.
2 0.29504594 92 nips-2004-Kernel Methods for Implicit Surface Modeling
Author: Joachim Giesen, Simon Spalinger, Bernhard Schölkopf
Abstract: We describe methods for computing an implicit model of a hypersurface that is given only by a finite sampling. The methods work by mapping the sample points into a reproducing kernel Hilbert space and then determining regions in terms of hyperplanes. 1
3 0.20533384 186 nips-2004-The Correlated Correspondence Algorithm for Unsupervised Registration of Nonrigid Surfaces
Author: Dragomir Anguelov, Praveen Srinivasan, Hoi-cheung Pang, Daphne Koller, Sebastian Thrun, James Davis
Abstract: We present an unsupervised algorithm for registering 3D surface scans of an object undergoing significant deformations. Our algorithm does not need markers, nor does it assume prior knowledge about object shape, the dynamics of its deformation, or scan alignment. The algorithm registers two meshes by optimizing a joint probabilistic model over all point-topoint correspondences between them. This model enforces preservation of local mesh geometry, as well as more global constraints that capture the preservation of geodesic distance between corresponding point pairs. The algorithm applies even when one of the meshes is an incomplete range scan; thus, it can be used to automatically fill in the remaining surfaces for this partial scan, even if those surfaces were previously only seen in a different configuration. We evaluate the algorithm on several real-world datasets, where we demonstrate good results in the presence of significant movement of articulated parts and non-rigid surface deformation. Finally, we show that the output of the algorithm can be used for compelling computer graphics tasks such as interpolation between two scans of a non-rigid object and automatic recovery of articulated object models. 1
4 0.13158621 160 nips-2004-Seeing through water
Author: Alexei Efros, Volkan Isler, Jianbo Shi, Mirkó Visontai
Abstract: We consider the problem of recovering an underwater image distorted by surface waves. A large amount of video data of the distorted image is acquired. The problem is posed in terms of finding an undistorted image patch at each spatial location. This challenging reconstruction task can be formulated as a manifold learning problem, such that the center of the manifold is the image of the undistorted patch. To compute the center, we present a new technique to estimate global distances on the manifold. Our technique achieves robustness through convex flow computations and solves the “leakage” problem inherent in recent manifold embedding techniques. 1
5 0.088075802 13 nips-2004-A Three Tiered Approach for Articulated Object Action Modeling and Recognition
Author: Le Lu, Gregory D. Hager, Laurent Younes
Abstract: Visual action recognition is an important problem in computer vision. In this paper, we propose a new method to probabilistically model and recognize actions of articulated objects, such as hand or body gestures, in image sequences. Our method consists of three levels of representation. At the low level, we first extract a feature vector invariant to scale and in-plane rotation by using the Fourier transform of a circular spatial histogram. Then, spectral partitioning [20] is utilized to obtain an initial clustering; this clustering is then refined using a temporal smoothness constraint. Gaussian mixture model (GMM) based clustering and density estimation in the subspace of linear discriminant analysis (LDA) are then applied to thousands of image feature vectors to obtain an intermediate level representation. Finally, at the high level we build a temporal multiresolution histogram model for each action by aggregating the clustering weights of sampled images belonging to that action. We discuss how this high level representation can be extended to achieve temporal scaling invariance and to include Bi-gram or Multi-gram transition information. Both image clustering and action recognition/segmentation results are given to show the validity of our three tiered representation.
6 0.082203113 121 nips-2004-Modeling Nonlinear Dependencies in Natural Images using Mixture of Laplacian Distribution
7 0.071333855 83 nips-2004-Incremental Learning for Visual Tracking
8 0.06928046 172 nips-2004-Sparse Coding of Natural Images Using an Overcomplete Set of Limited Capacity Units
9 0.065532498 40 nips-2004-Common-Frame Model for Object Recognition
10 0.065308519 163 nips-2004-Semi-parametric Exponential Family PCA
11 0.064470358 134 nips-2004-Object Classification from a Single Example Utilizing Class Relevance Metrics
12 0.063425809 99 nips-2004-Learning Hyper-Features for Visual Identification
13 0.06275326 125 nips-2004-Multiple Relational Embedding
14 0.062144291 182 nips-2004-Synergistic Face Detection and Pose Estimation with Energy-Based Models
15 0.061773162 16 nips-2004-Adaptive Discriminative Generative Model and Its Applications
16 0.061145972 89 nips-2004-Joint MRI Bias Removal Using Entropy Minimization Across Images
17 0.058036607 105 nips-2004-Log-concavity Results on Gaussian Process Methods for Supervised and Unsupervised Learning
18 0.057225145 192 nips-2004-The power of feature clustering: An application to object detection
19 0.056435209 114 nips-2004-Maximum Likelihood Estimation of Intrinsic Dimension
20 0.056077477 11 nips-2004-A Second Order Cone programming Formulation for Classifying Missing Data
topicId topicWeight
[(0, -0.178), (1, 0.038), (2, -0.087), (3, -0.137), (4, 0.09), (5, 0.039), (6, 0.039), (7, -0.173), (8, 0.008), (9, 0.016), (10, 0.091), (11, -0.055), (12, 0.078), (13, -0.091), (14, -0.062), (15, 0.116), (16, 0.016), (17, -0.005), (18, -0.119), (19, 0.013), (20, -0.077), (21, 0.117), (22, -0.17), (23, 0.136), (24, 0.116), (25, -0.326), (26, -0.059), (27, -0.128), (28, -0.09), (29, 0.193), (30, 0.12), (31, 0.112), (32, -0.001), (33, 0.217), (34, -0.07), (35, -0.009), (36, 0.054), (37, -0.131), (38, -0.005), (39, -0.025), (40, -0.052), (41, 0.012), (42, -0.017), (43, 0.01), (44, -0.063), (45, -0.125), (46, -0.09), (47, -0.044), (48, -0.072), (49, 0.082)]
simIndex simValue paperId paperTitle
same-paper 1 0.96989894 179 nips-2004-Surface Reconstruction using Learned Shape Models
Author: Jan E. Solem, Fredrik Kahl
Abstract: We consider the problem of geometrical surface reconstruction from one or several images using learned shape models. While humans can effortlessly retrieve 3D shape information, this inverse problem has turned out to be difficult to perform automatically. We introduce a framework based on level set surface reconstruction and shape models for achieving this goal. Through this merging, we obtain an efficient and robust method for reconstructing surfaces of an object category of interest. The shape model includes surface cues such as point, curve and silhouette features. Based on ideas from Active Shape Models, we show how both the geometry and the appearance of these features can be modelled consistently in a multi-view context. The complete surface is obtained by evolving a level set driven by a PDE, which tries to fit the surface to the inferred 3D features. In addition, an a priori 3D surface model is used to regularize the solution, in particular, where surface features are sparse. Experiments are demonstrated on a database of real face images.
2 0.81347775 186 nips-2004-The Correlated Correspondence Algorithm for Unsupervised Registration of Nonrigid Surfaces
Author: Dragomir Anguelov, Praveen Srinivasan, Hoi-cheung Pang, Daphne Koller, Sebastian Thrun, James Davis
Abstract: We present an unsupervised algorithm for registering 3D surface scans of an object undergoing significant deformations. Our algorithm does not need markers, nor does it assume prior knowledge about object shape, the dynamics of its deformation, or scan alignment. The algorithm registers two meshes by optimizing a joint probabilistic model over all point-topoint correspondences between them. This model enforces preservation of local mesh geometry, as well as more global constraints that capture the preservation of geodesic distance between corresponding point pairs. The algorithm applies even when one of the meshes is an incomplete range scan; thus, it can be used to automatically fill in the remaining surfaces for this partial scan, even if those surfaces were previously only seen in a different configuration. We evaluate the algorithm on several real-world datasets, where we demonstrate good results in the presence of significant movement of articulated parts and non-rigid surface deformation. Finally, we show that the output of the algorithm can be used for compelling computer graphics tasks such as interpolation between two scans of a non-rigid object and automatic recovery of articulated object models. 1
3 0.69520551 92 nips-2004-Kernel Methods for Implicit Surface Modeling
Author: Joachim Giesen, Simon Spalinger, Bernhard Schölkopf
Abstract: We describe methods for computing an implicit model of a hypersurface that is given only by a finite sampling. The methods work by mapping the sample points into a reproducing kernel Hilbert space and then determining regions in terms of hyperplanes. 1
4 0.38634971 160 nips-2004-Seeing through water
Author: Alexei Efros, Volkan Isler, Jianbo Shi, Mirkó Visontai
Abstract: We consider the problem of recovering an underwater image distorted by surface waves. A large amount of video data of the distorted image is acquired. The problem is posed in terms of finding an undistorted image patch at each spatial location. This challenging reconstruction task can be formulated as a manifold learning problem, such that the center of the manifold is the image of the undistorted patch. To compute the center, we present a new technique to estimate global distances on the manifold. Our technique achieves robustness through convex flow computations and solves the “leakage” problem inherent in recent manifold embedding techniques. 1
5 0.35478061 81 nips-2004-Implicit Wiener Series for Higher-Order Image Analysis
Author: Matthias O. Franz, Bernhard Schölkopf
Abstract: The computation of classical higher-order statistics such as higher-order moments or spectra is difficult for images due to the huge number of terms to be estimated and interpreted. We propose an alternative approach in which multiplicative pixel interactions are described by a series of Wiener functionals. Since the functionals are estimated implicitly via polynomial kernels, the combinatorial explosion associated with the classical higher-order statistics is avoided. First results show that image structures such as lines or corners can be predicted correctly, and that pixel interactions up to the order of five play an important role in natural images. Most of the interesting structure in a natural image is characterized by its higher-order statistics. Arbitrarily oriented lines and edges, for instance, cannot be described by the usual pairwise statistics such as the power spectrum or the autocorrelation function: From knowing the intensity of one point on a line alone, we cannot predict its neighbouring intensities. This would require knowledge of a second point on the line, i.e., we have to consider some third-order statistics which describe the interactions between triplets of points. Analogously, the prediction of a corner neighbourhood needs at least fourth-order statistics, and so on. In terms of Fourier analysis, higher-order image structures such as edges or corners are described by phase alignments, i.e. phase correlations between several Fourier components of the image. Classically, harmonic phase interactions are measured by higher-order spectra [4]. Unfortunately, the estimation of these spectra for high-dimensional signals such as images involves the estimation and interpretation of a huge number of terms. For instance, a sixth-order spectrum of a 16×16 sized image contains roughly 1012 coefficients, about 1010 of which would have to be estimated independently if all symmetries in the spectrum are considered. 
First attempts at estimating the higher-order structure of natural images were therefore restricted to global measures such as skewness or kurtosis [8], or to submanifolds of fourth-order spectra [9]. Here, we propose an alternative approach that models the interactions of image points in a series of Wiener functionals. A Wiener functional of order n captures those image components that can be predicted from the multiplicative interaction of n image points. In contrast to higher-order spectra or moments, the estimation of a Wiener model does not require the estimation of an excessive number of terms since it can be computed implicitly via polynomial kernels. This allows us to decompose an image into components that are characterized by interactions of a given order. In the next section, we introduce the Wiener expansion and discuss its capability of modeling higher-order pixel interactions. The implicit estimation method is described in Sect. 2, followed by some examples of use in Sect. 3. We conclude in Sect. 4 by briefly discussing the results and possible improvements.

1 Modeling pixel interactions with Wiener functionals

For our analysis, we adopt a prediction framework: given a $d \times d$ neighbourhood of an image pixel, we want to predict its gray value from the gray values of the neighbours. We are particularly interested in the extent to which interactions of different orders contribute to the overall prediction. Our basic assumption is that the dependency of the central pixel value $y$ on its neighbours $x_i$, $i = 1, \ldots, m = d^2 - 1$, can be modeled as a series
$$y = H_0[x] + H_1[x] + H_2[x] + \cdots + H_n[x] + \cdots \quad (1)$$
of discrete Volterra functionals
$$H_0[x] = h_0 = \text{const.} \quad \text{and} \quad H_n[x] = \sum_{i_1=1}^{m} \cdots \sum_{i_n=1}^{m} h^{(n)}_{i_1 \ldots i_n} x_{i_1} \cdots x_{i_n}. \quad (2)$$
Here, we have stacked the gray values of the neighbourhood into the vector $x = (x_1, \ldots, x_m)^\top \in \mathbb{R}^m$.
The discrete nth-order Volterra functional is, accordingly, a linear combination of all ordered nth-order monomials of the elements of $x$, with $m^n$ coefficients $h^{(n)}_{i_1 \ldots i_n}$. Volterra functionals provide a controlled way of introducing multiplicative interactions of image points since a functional of order n contains all products of the input of order n. In terms of higher-order statistics, this means that we can control the order of the statistics used, since an nth-order Volterra series leads to dependencies between at most n + 1 pixels. Unfortunately, Volterra functionals are not orthogonal to each other: depending on the input distribution, a functional of order n generally also produces lower-order interactions, so its output contains components that are proportional to those of some lower-order monomials. For instance, the output of a second-order Volterra functional for Gaussian input generally has a mean different from zero [5]. If one wants to estimate the zeroth-order component of an image (i.e., the constant component created without pixel interactions), the constant component created by the second-order interactions needs to be subtracted. For a general Volterra series, this correction can be achieved by decomposing it into a new series $y = G_0[x] + G_1[x] + \cdots + G_n[x] + \cdots$ of functionals $G_n[x]$ that are uncorrelated, i.e., orthogonal with respect to the input. The resulting Wiener functionals¹ $G_n[x]$ are linear combinations of Volterra functionals up to order n. They are computed from the original Volterra series by a procedure akin to Gram–Schmidt orthogonalization [5]. It can be shown that any Wiener expansion of finite degree minimizes the mean squared error between the true system output and its Volterra series model [5]. The orthogonality condition ensures that a Wiener functional of order n captures only the component of the image created by the multiplicative interaction of n pixels.
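The non-orthogonality of Volterra functionals is easy to see in a small simulation (a sketch with a made-up kernel choice, not the paper's code): a second-order functional applied to Gaussian input produces a constant offset, which is exactly the kind of lower-order leakage the Gram–Schmidt-style correction removes.

```python
import numpy as np

# A second-order Volterra functional H2[x] = sum_i x_i^2 (i.e. h_ij = delta_ij,
# a coefficient choice made up for illustration) has nonzero mean under
# Gaussian input, so it leaks into the zeroth-order (constant) component.
rng = np.random.default_rng(0)
x = rng.standard_normal((100_000, 3))   # m = 3 inputs, standard Gaussian

H2 = (x ** 2).sum(axis=1)
print(round(H2.mean(), 2))              # close to 3.0, not 0

# Subtracting the expectation E[H2] = 3 yields the orthogonalized
# (Wiener-style) functional G2 with (approximately) zero mean.
G2 = H2 - 3.0
print(abs(G2.mean()) < 0.05)            # -> True
```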
In contrast to general Volterra functionals, a Wiener functional is orthogonal to all monomials of lower order [5]. (¹Strictly speaking, the term Wiener functional is reserved for Volterra functionals orthogonalized with respect to Gaussian input. Here, the term will be used for orthogonalized Volterra functionals with arbitrary input distributions.) So far, we have not gained anything compared to classical estimation of higher-order moments or spectra: an nth-order Volterra functional contains the same number of terms as the corresponding (n+1)-order spectrum, and a Wiener functional of the same order has an even higher number of coefficients, as it also consists of lower-order Volterra functionals. In the next section, we will introduce an implicit representation of the Wiener series using polynomial kernels which allows for an efficient computation of the Wiener functionals.

2 Estimating Wiener series by regression in RKHS

Volterra series as linear functionals in RKHS. The nth-order Volterra functional is a weighted sum of all nth-order monomials of the input vector $x$. We can interpret the evaluation of this functional for a given input $x$ as a map $\phi_n$, defined for $n = 0, 1, 2, \ldots$ as $\phi_0(x) = 1$ and
$$\phi_n(x) = (x_1^n,\; x_1^{n-1}x_2,\; \ldots,\; x_1 x_2^{n-1},\; x_2^n,\; \ldots,\; x_m^n)^\top, \quad (3)$$
such that $\phi_n$ maps the input $x \in \mathbb{R}^m$ into a vector $\phi_n(x) \in F_n = \mathbb{R}^{m^n}$ containing all $m^n$ ordered monomials of degree n. Using $\phi_n$, we can write the nth-order Volterra functional in Eq. (2) as a scalar product in $F_n$,
$$H_n[x] = \eta_n^\top \phi_n(x), \quad (4)$$
with the coefficients stacked into the vector $\eta_n = (h^{(n)}_{1,1,\ldots,1}, h^{(n)}_{1,2,\ldots,1}, h^{(n)}_{1,3,\ldots,1}, \ldots)^\top \in F_n$. The same idea can be applied to the entire pth-order Volterra series. By stacking the maps $\phi_n$ into a single map $\phi^{(p)}(x) = (\phi_0(x), \phi_1(x), \ldots, \phi_p(x))^\top$, one obtains a mapping from $\mathbb{R}^m$ into $F^{(p)} = \mathbb{R} \times \mathbb{R}^m \times \mathbb{R}^{m^2} \times \cdots \times \mathbb{R}^{m^p} = \mathbb{R}^M$ with dimensionality $M = \frac{1-m^{p+1}}{1-m}$. The entire pth-order Volterra series can be written as a scalar product in $F^{(p)}$,
$$\sum_{n=0}^{p} H_n[x] = (\eta^{(p)})^\top \phi^{(p)}(x), \quad (5)$$
with $\eta^{(p)} \in F^{(p)}$. Below, we will show how we can express $\eta^{(p)}$ as an expansion in terms of the training points. This will dramatically reduce the number of parameters we have to estimate. This procedure works because the space $F_n$ of nth-order monomials has a very special property: it has the structure of a reproducing kernel Hilbert space (RKHS). As a consequence, the dot product in $F_n$ can be computed by evaluating a positive definite kernel function $k_n(x_1, x_2)$. For monomials, one can easily show that (e.g., [6])
$$\phi_n(x_1)^\top \phi_n(x_2) = (x_1^\top x_2)^n =: k_n(x_1, x_2). \quad (6)$$
Since $F^{(p)}$ is generated as a direct sum of the single spaces $F_n$, the associated scalar product is simply the sum of the scalar products in the $F_n$:
$$\phi^{(p)}(x_1)^\top \phi^{(p)}(x_2) = \sum_{n=0}^{p} (x_1^\top x_2)^n = k^{(p)}(x_1, x_2). \quad (7)$$
Thus, we have shown that the discretized Volterra series can be expressed as a linear functional in an RKHS. (²A similar approach has been taken by [1] using the inhomogeneous polynomial kernel $k^{(p)}_{inh}(x_1, x_2) = (1 + x_1^\top x_2)^p$. This kernel implies a map $\phi_{inh}$ into the same space of monomials, but it weights the degrees of the monomials differently, as can be seen by applying the binomial theorem.)

Linear regression in RKHS. For our prediction problem (1), the RKHS property of the Volterra series leads to an efficient solution which is in part due to the so-called representer theorem (e.g., [6]). It states the following: suppose we are given N observations $(x_1, y_1), \ldots, (x_N, y_N)$ of the function (1), an arbitrary cost function c, a nondecreasing function $\Omega$ on $\mathbb{R}_{>0}$, and the norm $\|\cdot\|_F$ of the RKHS associated with the kernel k. If we minimize an objective function
$$c((x_1, y_1, f(x_1)), \ldots, (x_N, y_N, f(x_N))) + \Omega(\|f\|_F) \quad (8)$$
over all functions in the RKHS, then an optimal solution³ can be expressed as
$$f(x) = \sum_{j=1}^{N} a_j k(x, x_j), \quad a_j \in \mathbb{R}. \quad (9)$$
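The kernel identity in Eq. (6) is easy to verify numerically. The sketch below (our own helper names, not the paper's code) builds the explicit map of all $m^n$ ordered monomials and checks that its dot product agrees with the polynomial kernel:

```python
import itertools
import numpy as np

# Numerical check of Eq. (6): the explicit feature map phi_n, listing all m^n
# ordered monomials of degree n, has the polynomial kernel (x1.x2)^n as its
# dot product. phi_n is our own name for the map written as (3) above.
def phi_n(x, n):
    m = len(x)
    return np.array([np.prod([x[i] for i in idx])
                     for idx in itertools.product(range(m), repeat=n)])

rng = np.random.default_rng(1)
x1, x2 = rng.standard_normal(4), rng.standard_normal(4)

for n in range(4):
    explicit = phi_n(x1, n) @ phi_n(x2, n)   # dot product in F_n
    implicit = (x1 @ x2) ** n                # kernel evaluation, Eq. (6)
    assert np.isclose(explicit, implicit)
print("Eq. (6) holds for n = 0..3")
```

Note that the explicit side costs $O(m^n)$ while the kernel side costs $O(m)$, which is the whole point of the implicit representation.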
In other words, although we optimized over the entire RKHS, including functions which are defined for arbitrary input points, it turns out that we can always express the solution in terms of the observations $x_j$ only. Hence the optimization problem over the extremely large number of coefficients $\eta^{(p)}$ in Eq. (5) is transformed into one over N variables $a_j$. Let us consider the special case where the cost function is the mean squared error, $c((x_1, y_1, f(x_1)), \ldots, (x_N, y_N, f(x_N))) = \frac{1}{N} \sum_{j=1}^{N} (f(x_j) - y_j)^2$, and the regularizer $\Omega$ is zero⁴. The solution for $a = (a_1, \ldots, a_N)^\top$ is readily computed by setting the derivative of (8) with respect to the vector $a$ equal to zero; it takes the form $a = K^{-1} y$ with the Gram matrix defined as $K_{ij} = k(x_i, x_j)$, hence⁵
$$y = f(x) = a^\top z(x) = y^\top K^{-1} z(x), \quad (10)$$
where $z(x) = (k(x, x_1), k(x, x_2), \ldots, k(x, x_N))^\top \in \mathbb{R}^N$.

Implicit Wiener series estimation. As we stated above, the pth-degree Wiener expansion is the pth-order Volterra series that minimizes the squared error. This can be put into the regression framework: since any finite Volterra series can be represented as a linear functional in the corresponding RKHS, we can find the pth-order Volterra series that minimizes the squared error by linear regression. This, by definition, must be the pth-degree Wiener series, since no other Volterra series has this property⁶. From Eq. (10), we obtain the following expressions for the implicit Wiener series:
$$G_0[x] = \frac{1}{N} y^\top \mathbf{1}, \qquad \sum_{n=0}^{p} G_n[x] = \sum_{n=0}^{p} H_n[x] = y^\top K_p^{-1} z^{(p)}(x), \quad (11)$$
where the Gram matrix $K_p$ and the coefficient vector $z^{(p)}(x)$ are computed using the kernel from Eq. (7) and $\mathbf{1} = (1, 1, \ldots)^\top \in \mathbb{R}^N$. Note that the Wiener series is represented only implicitly, since we are using the RKHS representation as a sum of scalar products with the training points. Thus, we can avoid the “curse of dimensionality”, i.e., there is no need to compute the possibly large number of coefficients explicitly.

The explicit Volterra and Wiener expansions can be recovered from Eq. (11) by collecting all terms containing monomials of the desired order and summing them up. The individual nth-order Volterra functionals in a Wiener series of degree p > 0 are given implicitly by
$$H_n[x] = y^\top K_p^{-1} z_n(x), \quad (12)$$
with $z_n(x) = ((x_1^\top x)^n, (x_2^\top x)^n, \ldots, (x_N^\top x)^n)^\top$. For p = 0, the only term is the constant zeroth-order Volterra functional $H_0[x] = G_0[x]$. The coefficient vector $\eta_n = (h^{(n)}_{1,1,\ldots,1}, h^{(n)}_{1,2,\ldots,1}, h^{(n)}_{1,3,\ldots,1}, \ldots)^\top$ of the explicit Volterra functional is obtained as
$$\eta_n = \Phi_n K_p^{-1} y \quad (13)$$
using the design matrix $\Phi_n = (\phi_n(x_1), \phi_n(x_2), \ldots, \phi_n(x_N))$. The individual Wiener functionals can only be recovered by applying the regression procedure twice. If we are interested in the nth-degree Wiener functional, we have to compute the solutions for the kernels $k^{(n)}(x_1, x_2)$ and $k^{(n-1)}(x_1, x_2)$. The Wiener functional for n > 0 is then obtained from the difference of the two results as
$$G_n[x] = \sum_{i=0}^{n} G_i[x] - \sum_{i=0}^{n-1} G_i[x] = y^\top \left( K_n^{-1} z^{(n)}(x) - K_{n-1}^{-1} z^{(n-1)}(x) \right). \quad (14)$$
The corresponding ith-order Volterra functionals of the nth-degree Wiener functional are computed analogously to Eqs. (12) and (13) [3].

³ For conditions on uniqueness of the solution, see [6]. ⁴ Note that this is different from the regularized approach used by [1]: if $\Omega$ is not zero, the resulting Volterra series differ from the Wiener series, since they are not orthogonal with respect to the input. ⁵ If K is not invertible, $K^{-1}$ denotes the pseudo-inverse of K. ⁶ Assuming symmetrized Volterra kernels, which can be obtained from any Volterra expansion.

Orthogonality. The resulting Wiener functionals must fulfill the orthogonality condition, which in its strictest form states that a pth-degree Wiener functional must be orthogonal to all monomials in the input of lower order. Formally, we will prove the following

Theorem 1. The functionals obtained from Eq.
(14) fulfill the orthogonality condition
$$E[m(x)\, G_p[x]] = 0, \quad (15)$$
where E denotes the expectation over the input distribution and $m(x)$ an arbitrary ith-order monomial with i < p.

We will show that this is a consequence of the least-squares fit of any linear expansion in a set of basis functions of the form $y = \sum_{j=1}^{M} \gamma_j \varphi_j(x)$. In the case of the Wiener and Volterra expansions, the basis functions $\varphi_j(x)$ are monomials of the components of $x$. We denote the error of the expansion as $e(x) = y - \sum_{j=1}^{M} \gamma_j \varphi_j(x)$. The minimum of the expected quadratic loss L with respect to the expansion coefficient $\gamma_k$ is given by
$$\frac{\partial L}{\partial \gamma_k} = \frac{\partial}{\partial \gamma_k} E\!\left[e(x)^2\right] = -2\, E[\varphi_k(x)\, e(x)] = 0. \quad (16)$$
This means that, for an expansion in a set of basis functions minimizing the squared error, the error is orthogonal to all basis functions used in the expansion. Now let us assume we know the Wiener series expansion (which minimizes the mean squared error) of a system up to degree p − 1. The approximation error is then given by the sum of the higher-order Wiener functionals, $e(x) = \sum_{n=p}^{\infty} G_n[x]$, so $G_p[x]$ is part of the error. As a consequence of the linearity of the expectation, Eq. (16) implies
$$\sum_{n=p}^{\infty} E[\varphi_k(x)\, G_n[x]] = 0 \quad \text{and} \quad \sum_{n=p+1}^{\infty} E[\varphi_k(x)\, G_n[x]] = 0 \quad (17)$$
for any $\varphi_k$ of order less than p. The difference of the two equations yields $E[\varphi_k(x)\, G_p[x]] = 0$, so that $G_p[x]$ must be orthogonal to any of the lower-order basis functions, namely to all monomials with order smaller than p.

3 Experiments

Toy examples. In our first experiment, we check whether our intuitions about higher-order statistics described in the introduction are captured by the proposed method. In particular, we expect that arbitrarily oriented lines can only be predicted using third-order statistics. As a consequence, we should need at least a second-order Wiener functional to predict lines correctly. Our first test image (size 80 × 110, upper row in Fig. 1) contains only lines of varying orientations.
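Before turning to the experiments, the machinery of Eqs. (10)-(12), (14) and the orthogonality condition (15) can be checked numerically. The sketch below uses our own variable names and a synthetic target whose Volterra components are known by construction; it is an illustration, not the paper's implementation.

```python
import numpy as np

# Implicit fit (Eq. 10 with the kernel of Eq. 7), component extraction via
# z_n(x) (Eq. 12), the difference construction of G2 (Eq. 14), and an
# empirical check of the orthogonality condition (15).
def gram(X, p):
    D = X @ X.T
    return sum(D ** n for n in range(p + 1))     # Gram matrix of k^(p), Eq. (7)

rng = np.random.default_rng(5)
X = rng.standard_normal((200, 3))                # N = 200 samples, m = 3 inputs
y = 1.0 + X[:, 0] + 2.0 * X[:, 1] * X[:, 2]     # known order-0, -1, -2 parts

a2 = np.linalg.pinv(gram(X, 2)) @ y              # a = K^-1 y (pseudo-inverse, footnote 5)
a1 = np.linalg.pinv(gram(X, 1)) @ y

# Eq. (12) at a fresh input: each order is recovered exactly here, because the
# target is itself a second-order polynomial and N exceeds the feature dimension.
x = rng.standard_normal(3)
s = X @ x
H0, H1, H2 = a2.sum(), a2 @ s, a2 @ s ** 2
print(np.isclose(H0, 1.0), np.isclose(H1, x[0]), np.isclose(H2, 2 * x[1] * x[2]))
# -> True True True

# Eq. (14) on the sample, then Eq. (15): G2 is orthogonal (in sample) to
# every monomial of order < 2, i.e. to the constant and to each x_i.
G2 = gram(X, 2) @ a2 - gram(X, 1) @ a1
low_order = np.column_stack([np.ones(200), X])
print(np.allclose(low_order.T @ G2 / 200, 0.0, atol=1e-6))   # -> True
```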
Choosing a 5 × 5 neighbourhood, we predicted the central pixel using (11).

[Figure 1: Higher-order components of toy images. The image components of different orders are created by the corresponding Wiener functionals and are added up to obtain the different orders of reconstruction. Note that the constant zeroth-order component and reconstruction are identical. The reconstruction error (mse) is given as the mean squared error between the true grey values of the image and the reconstruction. Although the linear first-order model seems to reconstruct the lines, this is actually not true, since the linear model just smoothes over the image (note its large reconstruction error). A correct prediction is only obtained by adding a second-order component to the model. The third-order component is only significant at crossings, corners and line endings.]

Models of orders 0 to 3 were learned from the image by extracting the maximal training set of 76 × 106 patches of size 5 × 5⁷. The corresponding image components of order 0 to 3 were computed according to (14). Note that the different components generated by the Wiener functionals can also be negative; in Fig. 1, they are scaled to the gray values [0..255]. The behaviour of the models conforms to our intuition: the linear model cannot capture the line structure of the image, thus leading to a large reconstruction error which drops to nearly zero when a second-order model is used. The additional small correction achieved by the third-order model is mainly due to discretization effects. Similar to lines, we expect that we need at least a third-order model to predict crossings or corners correctly.
This is confirmed by the second and third test images shown in the corresponding rows in Fig. 1. Note that the third-order component is only significant at crossings, corners and line endings. The fourth- and fifth-order terms (not shown) have only negligible contributions. The fact that the reconstruction error does not drop to zero for the third image is caused by the line endings, which cannot be predicted to a higher accuracy than one pixel.

Application to natural images. Are there further predictable structures in natural images that are not due to lines, crossings or corners? This can be investigated by applying our method to a set of natural images (an example of size 80 × 110 is depicted in Fig. 2).

⁷ In contrast to the usual setting in machine learning, training and test set are identical in our case, since we are not interested in generalization to other images, but in analyzing the higher-order components of the image at hand.

[Figure 2: Higher-order components and reconstructions of a photograph, for model orders 0 through 8. The reconstruction mse decreases monotonically with order: 1070 (1st), 957.4 (2nd), 414.6 (3rd), 98.5 (4th), 18.5 (5th), 4.98 (6th), 1.32 (7th), 0.41 (8th). Interactions up to the fifth order play an important role. Note that significant components become sparser with increasing model order.]

Our results on a set of 10 natural images of size 50 × 70 show an approximately exponential decay of the reconstruction error as more and more higher-order terms are added to the reconstruction (Fig. 3).
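The patch-based experimental setting above can be sketched in miniature (our own code and synthetic image, not the paper's): extract every 5 × 5 neighbourhood, fit polynomial-kernel regressions of increasing order with training and test sets identical, and compare in-sample reconstruction errors.

```python
import numpy as np

# Miniature version of the toy experiment: predict each central pixel of a
# small synthetic image of crossing lines from its 5x5 neighbourhood, using
# polynomial-kernel regression of increasing order (training = test set).
def patches(img, d=5):
    r, (h, w) = d // 2, img.shape
    X, y = [], []
    for i in range(r, h - r):
        for j in range(r, w - r):
            p = img[i - r:i + r + 1, j - r:j + r + 1].ravel()
            y.append(p[d * d // 2])                 # central pixel
            X.append(np.delete(p, d * d // 2))      # its m = 24 neighbours
    return np.array(X), np.array(y)

# Two crossing diagonals, so no single fixed neighbour predicts the centre.
img = np.fromfunction(lambda i, j: ((i == j) | (i + j == 11)).astype(float), (12, 12))
X, y = patches(img)                                 # 8x8 = 64 patches

mse = {}
for p in (1, 2):
    D = X @ X.T
    K = sum(D ** n for n in range(p + 1))           # Gram matrix of k^(p)
    fitted = K @ np.linalg.pinv(K) @ y              # in-sample fit, Eqs. (10)-(11)
    mse[p] = np.mean((fitted - y) ** 2)
    print(p, mse[p])

# The second-order model can only do better: its RKHS contains the first-order one.
print(mse[2] <= mse[1] + 1e-9)                      # -> True
```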
Interestingly, terms up to order 5 still play a significant role, although the image regions with a significant component become sparser with increasing model order (see Fig. 2). Note that the nonlinear terms reduce the reconstruction error to almost 0. This suggests a high degree of higher-order redundancy in natural images that cannot be exploited by the usual linear prediction models.

[Figure 3: Mean square reconstruction error of models of order 0 to 7 for a set of 10 natural images; the mse axis runs from 0 to 700.]

4 Conclusion

The implicit estimation of Wiener functionals via polynomial kernels opens up new possibilities for the estimation of higher-order image statistics. Compared to the classical methods such as higher-order spectra, moments or cumulants, our approach avoids the combinatorial explosion caused by the exponential increase in the number of terms to be estimated and interpreted. When put into a predictive framework, multiplicative pixel interactions of different orders are easily visualized and conform to the intuitive notions about image structures such as edges, lines, crossings or corners. There is no one-to-one mapping between the classical higher-order statistics and multiplicative pixel interactions. Any nonlinear Wiener functional, for instance, creates infinitely many correlations or cumulants of higher order, and often also of lower order. On the other hand, a Wiener functional of order n produces only harmonic phase interactions up to order n + 1, but sometimes also of lower orders. Thus, when one analyzes a classical statistic of a given order, one often cannot determine by which order of pixel interaction it was created. In contrast, our method is able to isolate image components that are created by a single order of interaction. Although of preliminary nature, our results on natural images suggest an important role of statistics up to the fifth order.
Most of the currently used low-level feature detectors, such as edge or corner detectors, use at most third-order interactions. The investigation of fourth- or higher-order features is a field that might lead to new insights into the nature and role of higher-order image structures. As often observed in the literature (e.g., [2], [7]), our results seem to confirm that a large proportion of the redundancy in natural images is contained in the higher-order pixel interactions. Before any further conclusions can be drawn, however, our study needs to be extended in several directions: 1. A representative image database has to be analyzed. The images must be carefully calibrated, since nonlinear statistics can be highly calibration-sensitive. In addition, the contribution of image noise has to be investigated. 2. Currently, only images up to 9000 pixels can be analyzed due to the matrix inversion required by Eq. (11). To accommodate larger images, our method has to be reformulated as an iterative algorithm. 3. So far, we only considered 5 × 5 patches. To systematically investigate patch size effects, the analysis has to be conducted in a multi-scale framework.

References

[1] T. J. Dodd and R. F. Harrison. A new solution to Volterra series estimation. In CD-Rom Proc. 2002 IFAC World Congress, 2002.
[2] D. J. Field. What is the goal of sensory coding? Neural Computation, 6:559–601, 1994.
[3] M. O. Franz and B. Schölkopf. Implicit Wiener series. Technical Report 114, Max-Planck-Institut für biologische Kybernetik, Tübingen, June 2003.
[4] C. L. Nikias and A. P. Petropulu. Higher-order spectra analysis. Prentice Hall, Englewood Cliffs, NJ, 1993.
[5] M. Schetzen. The Volterra and Wiener theories of nonlinear systems. Krieger, Malabar, 1989.
[6] B. Schölkopf and A. J. Smola. Learning with kernels. MIT Press, Cambridge, MA, 2002.
[7] O. Schwartz and E. P. Simoncelli. Natural signal statistics and sensory gain control. Nature Neuroscience, 4(8):819–825, 2001.
[8] M. G. A. Thomson. Higher-order structure in natural scenes. J. Opt. Soc. Am. A, 16(7):1549–1553, 1999.
[9] M. G. A. Thomson. Beats, kurtosis and visual coding. Network: Comput. Neural Syst., 12:271–287, 2001.
6 0.31824479 109 nips-2004-Mass Meta-analysis in Talairach Space
7 0.3084273 40 nips-2004-Common-Frame Model for Object Recognition
8 0.29512385 99 nips-2004-Learning Hyper-Features for Visual Identification
9 0.28387275 15 nips-2004-Active Learning for Anomaly and Rare-Category Detection
10 0.27738649 89 nips-2004-Joint MRI Bias Removal Using Entropy Minimization Across Images
11 0.27101639 142 nips-2004-Outlier Detection with One-class Kernel Fisher Discriminants
12 0.26632121 25 nips-2004-Assignment of Multiplicative Mixtures in Natural Images
14 0.26599491 125 nips-2004-Multiple Relational Embedding
15 0.26263005 158 nips-2004-Sampling Methods for Unsupervised Learning
16 0.25892743 182 nips-2004-Synergistic Face Detection and Pose Estimation with Energy-Based Models
17 0.2566199 51 nips-2004-Detecting Significant Multidimensional Spatial Clusters
18 0.24270184 205 nips-2004-Who's In the Picture
19 0.23863575 13 nips-2004-A Three Tiered Approach for Articulated Object Action Modeling and Recognition
20 0.23391762 172 nips-2004-Sparse Coding of Natural Images Using an Overcomplete Set of Limited Capacity Units
topicId topicWeight
[(13, 0.069), (15, 0.17), (25, 0.01), (26, 0.065), (31, 0.013), (33, 0.205), (34, 0.229), (35, 0.039), (39, 0.026), (50, 0.041), (71, 0.021), (76, 0.012)]
simIndex simValue paperId paperTitle
same-paper 1 0.85267603 179 nips-2004-Surface Reconstruction using Learned Shape Models
Author: Jan E. Solem, Fredrik Kahl
Abstract: We consider the problem of geometrical surface reconstruction from one or several images using learned shape models. While humans can effortlessly retrieve 3D shape information, this inverse problem has turned out to be difficult to perform automatically. We introduce a framework based on level set surface reconstruction and shape models for achieving this goal. Through this merging, we obtain an efficient and robust method for reconstructing surfaces of an object category of interest. The shape model includes surface cues such as point, curve and silhouette features. Based on ideas from Active Shape Models, we show how both the geometry and the appearance of these features can be modelled consistently in a multi-view context. The complete surface is obtained by evolving a level set driven by a PDE, which tries to fit the surface to the inferred 3D features. In addition, an a priori 3D surface model is used to regularize the solution, in particular, where surface features are sparse. Experiments are demonstrated on a database of real face images.
2 0.79994875 10 nips-2004-A Probabilistic Model for Online Document Clustering with Application to Novelty Detection
Author: Jian Zhang, Zoubin Ghahramani, Yiming Yang
Abstract: In this paper we propose a probabilistic model for online document clustering. We use non-parametric Dirichlet process prior to model the growing number of clusters, and use a prior of general English language model as the base distribution to handle the generation of novel clusters. Furthermore, cluster uncertainty is modeled with a Bayesian Dirichlet-multinomial distribution. We use empirical Bayes method to estimate hyperparameters based on a historical dataset. Our probabilistic model is applied to the novelty detection task in Topic Detection and Tracking (TDT) and compared with existing approaches in the literature. 1
3 0.77400184 16 nips-2004-Adaptive Discriminative Generative Model and Its Applications
Author: Ruei-sung Lin, David A. Ross, Jongwoo Lim, Ming-Hsuan Yang
Abstract: This paper presents an adaptive discriminative generative model that generalizes the conventional Fisher Linear Discriminant algorithm and renders a proper probabilistic interpretation. Within the context of object tracking, we aim to find a discriminative generative model that best separates the target from the background. We present a computationally efficient algorithm to constantly update this discriminative model as time progresses. While most tracking algorithms operate on the premise that the object appearance or ambient lighting condition does not significantly change as time progresses, our method adapts a discriminative generative model to reflect appearance variation of the target and background, thereby facilitating the tracking task in ever-changing environments. Numerous experiments show that our method is able to learn a discriminative generative model for tracking target objects undergoing large pose and lighting changes.
4 0.77059162 133 nips-2004-Nonparametric Transforms of Graph Kernels for Semi-Supervised Learning
Author: Xiaojin Zhu, Jaz Kandola, Zoubin Ghahramani, John D. Lafferty
Abstract: We present an algorithm based on convex optimization for constructing kernels for semi-supervised learning. The kernel matrices are derived from the spectral decomposition of graph Laplacians, and combine labeled and unlabeled data in a systematic fashion. Unlike previous work using diffusion kernels and Gaussian random field kernels, a nonparametric kernel approach is presented that incorporates order constraints during optimization. This results in flexible kernels and avoids the need to choose among different parametric forms. Our approach relies on a quadratically constrained quadratic program (QCQP), and is computationally feasible for large datasets. We evaluate the kernels on real datasets using support vector machines, with encouraging results. 1
5 0.76728892 167 nips-2004-Semi-supervised Learning with Penalized Probabilistic Clustering
Author: Zhengdong Lu, Todd K. Leen
Abstract: While clustering is usually an unsupervised operation, there are circumstances in which we believe (with varying degrees of certainty) that items A and B should be assigned to the same cluster, while items A and C should not. We would like such pairwise relations to influence cluster assignments of out-of-sample data in a manner consistent with the prior knowledge expressed in the training set. Our starting point is probabilistic clustering based on Gaussian mixture models (GMM) of the data distribution. We express clustering preferences in the prior distribution over assignments of data points to clusters. This prior penalizes cluster assignments according to the degree with which they violate the preferences. We fit the model parameters with EM. Experiments on a variety of data sets show that PPC can consistently improve clustering results.
6 0.7665146 161 nips-2004-Self-Tuning Spectral Clustering
7 0.76587659 31 nips-2004-Blind One-microphone Speech Separation: A Spectral Learning Approach
8 0.76479584 178 nips-2004-Support Vector Classification with Input Data Uncertainty
9 0.76267928 110 nips-2004-Matrix Exponential Gradient Updates for On-line Learning and Bregman Projection
10 0.76266086 60 nips-2004-Efficient Kernel Machines Using the Improved Fast Gauss Transform
11 0.76233584 189 nips-2004-The Power of Selective Memory: Self-Bounded Learning of Prediction Suffix Trees
12 0.76031655 68 nips-2004-Face Detection --- Efficient and Rank Deficient
13 0.75952774 174 nips-2004-Spike Sorting: Bayesian Clustering of Non-Stationary Data
14 0.75857347 4 nips-2004-A Generalized Bradley-Terry Model: From Group Competition to Individual Skill
15 0.75813687 53 nips-2004-Discriminant Saliency for Visual Recognition from Cluttered Scenes
16 0.75804299 73 nips-2004-Generative Affine Localisation and Tracking
17 0.75802058 2 nips-2004-A Direct Formulation for Sparse PCA Using Semidefinite Programming
18 0.75743937 61 nips-2004-Efficient Out-of-Sample Extension of Dominant-Set Clusters
19 0.75737238 187 nips-2004-The Entire Regularization Path for the Support Vector Machine
20 0.7571426 127 nips-2004-Neighbourhood Components Analysis