nips nips2009 nips2009-1 knowledge-graph by maker-knowledge-mining

1 nips-2009-$L 1$-Penalized Robust Estimation for a Class of Inverse Problems Arising in Multiview Geometry


Source: pdf

Author: Arnak Dalalyan, Renaud Keriven

Abstract: We propose a new approach to the problem of robust estimation in multiview geometry. Inspired by recent advances in the sparse recovery problem of statistics, we define our estimator as a Bayesian maximum a posteriori with multivariate Laplace prior on the vector describing the outliers. This leads to an estimator in which the fidelity to the data is measured by the L∞-norm while the regularization is done by the L1-norm. The proposed procedure is fairly fast since the outlier removal is done by solving one linear program (LP). An important difference compared to existing algorithms is that for our estimator it is not necessary to specify either the number or the proportion of the outliers. We present strong theoretical results assessing the accuracy of our procedure, as well as a numerical example illustrating its efficiency on real data. 1

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract We propose a new approach to the problem of robust estimation in multiview geometry. [sent-4, score-0.348]

2 Inspired by recent advances in the sparse recovery problem of statistics, we define our estimator as a Bayesian maximum a posteriori with multivariate Laplace prior on the vector describing the outliers. [sent-5, score-0.308]

3 This leads to an estimator in which the fidelity to the data is measured by the L∞-norm while the regularization is done by the L1-norm. [sent-6, score-0.15]

4 The proposed procedure is fairly fast since the outlier removal is done by solving one linear program (LP). [sent-7, score-0.141]

5 An important difference compared to existing algorithms is that for our estimator it is not necessary to specify either the number or the proportion of the outliers. [sent-8, score-0.15]

6 1 Introduction In the present paper, we are concerned with a class of non-linear inverse problems appearing in the structure and motion problem of multiview geometry. [sent-10, score-0.268]

7 This problem, which has received a great deal of attention from the computer vision community in the last decade, consists in recovering a set of 3D points (structure) and a set of camera matrices (motion), when only 2D images of the aforementioned 3D points by some cameras are available. [sent-11, score-0.598]

8 Throughout this work we assume that the internal parameters of cameras as well as their orientations are known. [sent-12, score-0.226]

9 Thus, only the locations of camera centers and 3D points are to be estimated. [sent-13, score-0.405]

10 In solving the structure and motion problem by state-of-the-art methods, it is customary to start by establishing correspondences between pairs of 2D data points. [sent-14, score-0.126]

11 One can think of the structure and motion problem as the inverse problem of inverting the operator O that takes as input the set of 3D points and the set of cameras, and produces as output the 2D images of the 3D points by the cameras. [sent-16, score-0.205]

12 In such cases, the solutions to the structure and motion problem can be found using algebraic arguments. [sent-19, score-0.12]

13 The main flaw of algebraic solutions is their sensitivity to the noise in the data: very often, due to the noise in the measurements, there is no input that could have generated the observed output. [sent-20, score-0.131]

14 A standard approach [16] consists in measuring the distance between two elements. Figure 1: (a) One image from the dinosaur sequence. [sent-23, score-0.197]

15 Camera locations and scene points estimated by the blind L∞-cost minimization (b,c) and by the proposed “outlier aware” procedure (d,e). [sent-24, score-0.347]

16 In the structure and motion problem with more than two cameras, this leads to a hard non-convex optimization problem. [sent-26, score-0.119]

17 It has been shown that, for a number of problems, L∞ -norm based estimators can be computed very efficiently using, for example, the iterative bisection method [18, Algorithm 1, p. [sent-28, score-0.121]

18 There is, however, an issue with the L∞-techniques that dampens the enthusiasm of practitioners: they are highly sensitive to outliers (cf. [sent-30, score-0.17]

19 In fact, among all Lq -metrics with q ≥ 1, the L∞ -metric is the most seriously affected by the outliers in the data. [sent-34, score-0.17]

20 The purpose of the present work is to introduce and to theoretically investigate a new procedure of estimation in the presence of noise and outliers. [sent-37, score-0.256]

21 It can be seen as a maximum a posteriori (MAP) estimator under uniformly distributed random noise and a sparsity favoring prior on the vector of outliers. [sent-39, score-0.283]
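As a hedged illustration of this sentence (the rate λ and the notation ξ(β) are ours, not the paper's exact display), a bounded-uniform noise likelihood combined with a Laplace-type prior on ω yields an L∞-constrained, L1-penalized MAP problem:

```latex
% Sketch (assumptions: noise entries i.i.d. uniform on [-sigma, sigma];
% prior pi(omega) proportional to exp(-lambda * ||omega||_1)).
\hat\beta \in \arg\max_{\beta}\; p(\mathrm{data}\mid\beta)\,\pi(\omega)
  = \arg\max_{\beta}\; \mathbf{1}\{\|\xi(\beta)\|_\infty \le \sigma\}\,
    e^{-\lambda\|\omega\|_1}
  = \arg\min_{\beta:\ \|\xi(\beta)\|_\infty \le \sigma}\ \|\omega\|_1 .
```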

22 Interestingly, this study bridges the work on the robust estimation in multiview geometry [12, 27, 19, 21] and the theory of sparse recovery in statistics and signal processing [10, 2, 5, 6]. [sent-40, score-0.492]

23 The next section gives the precise formulation of the translation estimation and triangulation problem to which the presented methodology can be applied. [sent-42, score-0.412]

24 2 Translation estimation and triangulation Let us start by presenting a problem of multiview geometry to which our approach can be successfully applied, namely the problem of translation estimation and triangulation in the case of known rotations. [sent-47, score-0.908]

25 For rotation estimation algorithms, we refer the interested reader to [22, 14] and the references therein. [sent-48, score-0.15]

26 Let P∗_i, i = 1, . . ., m, be a sequence of m cameras that are known up to a translation. [sent-52, score-0.226]

27 Recall that a camera is characterized by a 3 × 4 matrix P with real entries that can be written as P = K[R|t], where K is an invertible 3 × 3 matrix called the camera calibration matrix, R is a 3 × 3 rotation matrix and t ∈ R³. [sent-53, score-0.54]

28 We will refer to t as the translation of the camera P. [sent-54, score-0.317]

29 (i.e., U∗_j is an element of the projective space P³), we assume that noisy images of each U∗_j by some cameras P∗_i are observed. [sent-64, score-0.226]

30 Thus, we have at our disposal the measurements x_ij = ( e_1^T P∗_i U∗_j / e_3^T P∗_i U∗_j ; e_2^T P∗_i U∗_j / e_3^T P∗_i U∗_j ) + ξ_ij, i ∈ I_j, j = 1, . . ., n. (1) [sent-65, score-0.297]

31 Here e_ℓ, ℓ = 1, 2, 3, stands for the unit vector of R³ having one as the ℓ-th coordinate and I_j is the set of indices of cameras for which the point U∗_j is visible. [sent-68, score-0.541]
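A minimal numerical sketch of the measurement model (1); the calibration matrix, rotation, translation, scene point and noise bound below are synthetic placeholders, not values from the paper:

```python
import numpy as np

def project(P, U):
    """Pinhole projection of a homogeneous 3D point U (4-vector) by camera P (3x4):
    returns (e1^T P U / e3^T P U, e2^T P U / e3^T P U)."""
    v = P @ U
    return v[:2] / v[2]

rng = np.random.default_rng(0)
K = np.diag([700.0, 700.0, 1.0])            # assumed calibration matrix
R = np.eye(3)                                # rotation (known, per the paper's setting)
t = np.array([0.1, 0.0, 2.0])                # translation to be estimated
P_star = K @ np.hstack([R, t[:, None]])      # P = K[R|t]
U_star = np.array([0.3, -0.2, 1.0, 1.0])     # a 3D point in homogeneous coordinates

sigma = 1.0 / 700.0                          # assumed bounded-noise level
x_ij = project(P_star, U_star) + rng.uniform(-sigma, sigma, size=2)  # model (1)
```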

32 j j j 2 We are now in a position to state the problem of translation estimation and triangulation in the context of multiview geometry. [sent-73, score-0.56]

33 It consists in recovering the 3-vectors {t∗_i} (translation estimation) and the 3D points {X∗_j} (triangulation) from the noisy measurements {x_ij; j = 1, . . . [sent-74, score-0.204]

34 It should be noted right away that if the point U∗_j is in front of the camera P∗_i, then e_3^T P∗_i U∗_j ≥ 0. [sent-87, score-0.216]

35 Furthermore, we will assume that none of the true 3D points U∗_j lies on the principal plane of a camera P∗_i. [sent-89, score-0.275]

36 The parameter θ we have just defined is, in general, not identifiable from the measurements {xij }. [sent-92, score-0.107]

37 This defines the estimator θ̂ as a minimizer of the cost function C_{2,2}(θ) = Σ_{i,j} ‖x_ij − x_ij(θ)‖²_2, where x_ij(θ) := [e_1^T P_i U_j ; e_2^T P_i U_j]^T / e_3^T P_i U_j is the 2-vector that we would obtain if θ were the true parameter. [sent-107, score-0.72]
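A small sketch of evaluating C_{2,2}; the data containers (dicts keyed by camera and point indices) are hypothetical:

```python
import numpy as np

def x_hat(P, U):
    """Predicted 2-vector [e1^T P U; e2^T P U] / e3^T P U for camera P and point U."""
    v = P @ U
    return v[:2] / v[2]

def cost_22(cams, points, obs):
    """C_{2,2}(theta) = sum over observed (i, j) of ||x_ij - x_ij(theta)||_2^2.

    cams:   dict i -> 3x4 camera matrix P_i(theta)
    points: dict j -> homogeneous 4-vector U_j(theta)
    obs:    dict (i, j) -> measured 2-vector x_ij
    """
    return sum(np.sum((x - x_hat(cams[i], points[j])) ** 2)
               for (i, j), x in obs.items())
```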

38 In terms of the calibration, rotation and translation, x_ij(θ) = ( e_1^T K_i(R_i X_j + t_i) / e_3^T K_i(R_i X_j + t_i) ; e_2^T K_i(R_i X_j + t_i) / e_3^T K_i(R_i X_j + t_i) ). (2) The minimization of C_{2,2} is a hard nonconvex problem. [sent-109, score-0.175]

39 The L∞ analogue is C_{∞,s}(θ) = max_{j=1,...,n} max_{i∈I_j} ‖x_ij − x_ij(θ)‖_s, with x_ij(θ) as in (2). (3) [sent-112, score-0.57]

40 The main idea behind the bisection algorithm can be summarized as follows. [sent-124, score-0.121]

41 We aim to design an algorithm computing θ̂_s ∈ arg min_θ C_{∞,s}(θ), for any prespecified s ≥ 1, over the set of all vectors θ satisfying the cheirality condition. [sent-125, score-0.12]

42 Let us introduce the residuals r_ij(θ) = x_ij − x_ij(θ), which can be represented as r_ij(θ) = ( a_ij1^T θ / c_ij^T θ ; a_ij2^T θ / c_ij^T θ )^T, (4) for some vectors a_ij1, a_ij2, c_ij ∈ R^M. [sent-126, score-1.021]

43 Furthermore, as presented in Remark 2, the cheirality conditions imply the set of linear constraints c_ij^T θ ≥ 1. [sent-127, score-0.12]

44 Thus, the problem of computing θ̂_s can be rewritten as: minimize γ subject to ‖r_ij(θ)‖_s ≤ γ and c_ij^T θ ≥ 1 for all (i, j). (5) [sent-128, score-0.244]

45 Note that the inequality ‖r_ij(θ)‖_s ≤ γ can be replaced by ‖A_ij^T θ‖_s ≤ γ c_ij^T θ with A_ij = [a_ij1 ; a_ij2]. [sent-129, score-0.244]

46 Although (5) is not a convex problem, its solution can be well approximated by solving a sequence of convex feasibility problems. [sent-130, score-0.376]
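A hedged sketch of this bisection strategy for s = ∞, where each feasibility test in (5) reduces to a linear program. The stacked matrices A1, A2 (rows a_ij1^T, a_ij2^T), C (rows c_ij^T) and the initial upper bound gamma_hi are assumptions of this illustration, not the paper's implementation (which reportedly uses a SeDuMi-style solver); scipy's linprog is used here only as a feasibility oracle:

```python
import numpy as np
from scipy.optimize import linprog

def feasible(A1, A2, C, gamma):
    """Feasibility test for (5) with s = infinity: is there a theta with
    |a_ij1^T theta| <= gamma c_ij^T theta, |a_ij2^T theta| <= gamma c_ij^T theta,
    and c_ij^T theta >= 1 for every measurement?  Zero objective => pure LP test."""
    n, M = C.shape
    G = np.vstack([A1 - gamma * C, -A1 - gamma * C,
                   A2 - gamma * C, -A2 - gamma * C,
                   -C])
    h = np.concatenate([np.zeros(4 * n), -np.ones(n)])
    res = linprog(np.zeros(M), A_ub=G, b_ub=h,
                  bounds=[(None, None)] * M, method="highs")
    return (res.status == 0), res.x

def bisection(A1, A2, C, gamma_hi, tol=1e-6):
    """Bisect on gamma, shrinking [lo, hi] around the optimal L_inf cost."""
    lo, hi, theta = 0.0, gamma_hi, None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        ok, th = feasible(A1, A2, C, mid)
        if ok:
            hi, theta = mid, th
        else:
            lo = mid
    return hi, theta
```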

47 1 The statistical model Let us first observe that, in view of (1) and (4), the model we are considering can be rewritten as ( a_ij1^T θ∗ / c_ij^T θ∗ ; a_ij2^T θ∗ / c_ij^T θ∗ )^T = ξ_ij, j = 1, . . ., n, i ∈ I_j. (6) [sent-135, score-0.462]

48 Let N = 2 Σ_j |I_j| be the total number of measurements and let M = 3(n + m − 1) be the size of the vector θ∗. [sent-139, score-0.139]

49 Similarly, let us denote by ξ the N-vector formed by concatenating the vectors ξ_ij. [sent-143, score-0.154]

50 To extend this model to cover the situation where some outliers are present in the measurements, we introduce the vector ω∗ ∈ R^N defined by ω∗_p = a_p^T θ∗ − (c_p^T θ∗) ξ_p, so that ω∗_p = 0 if the p-th measurement is an inlier and |ω∗_p| > 0 otherwise. [sent-150, score-0.286]

51 (7) holds with some small noise vector ξ, C2: min_p c_p^T θ∗ = 1, C3: ω∗ is sparse, i. [sent-153, score-0.16]

52 2 Sparsity prior and MAP estimator To derive an estimator of the parameter β ∗ , we place ourselves in the Bayesian framework. [sent-157, score-0.352]

53 To this end, we impose a probabilistic structure on the noise vector ξ and introduce a prior distribution on the unknown vector β. [sent-158, score-0.24]

54 Since the noise ξ represents the difference (in pixels) between the measurements and the true image points, it is naturally bounded and, generally, does not exceed the level of a few pixels. [sent-159, score-0.195]

55 In practice, this assumption can be enforced by decorrelating the measurements using the empirical covariance matrix [20]. [sent-163, score-0.148]

56 If the noise ξ has independent entries which are uniformly distributed in [−σ, σ] for some σ > 0, then the MAP estimator β̂ = [θ̂^T; ω̂^T]^T based on the prior π defined by Eq. [sent-178, score-0.303]

57 The proposed algorithm implicitly assumes that all the measurements x_ij for which ‖ξ_ij‖_∞ > σ are outliers, while all the others are treated as inliers. [sent-186, score-0.451]

58 If σ is unknown, a reasonable way of acting is to impose a prior distribution on the possible values of σ and to define the estimator β as a MAP estimator based on the prior incorporating the uncertainty on σ. [sent-187, score-0.404]

59 When there are no outliers and the prior on σ is decreasing, this approach leads to the estimator minimizing the L∞ cost function. [sent-188, score-0.372]

60 Step 2: Apply the bisection algorithm to the reduced data set {xp ; p ∈ J}. [sent-198, score-0.121]

61 First, when applying the bisection algorithm at Step 2, we can use C∞,s (θ) as the initial value of γu . [sent-200, score-0.121]

62 Since θ∗ is unknown, a reasonable strategy consists in adding a step in between Step 1 and Step 2, which performs the weighted minimization with weights {(c_p^T θ̂)^{−1}; p = 1, . . . [sent-202, score-0.107]

63 5 Accuracy of estimation Let us introduce some additional notation. [sent-206, score-0.125]

64 The estimator β̂ of β∗ is then defined as a solution to the optimization problem: minimize ‖ω‖_1 over β = [θ^T; ω^T]^T such that min_p c_p^T θ = 1 and |a_p^T θ − ω_p| ≤ σ c_p^T θ for every p. (11) [sent-226, score-0.182]

65 From now on, for every index set T and for every vector h, h_T stands for the vector equal to h on the index set T and zero elsewhere. [sent-229, score-0.247]
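A hedged sketch of (11) as the single LP claimed in the abstract. The constraint set is reconstructed here from (7) and conditions C1–C3 (the normalization min_p c_p^T θ = 1 is relaxed to c_p^T θ ≥ 1 so the problem stays a plain LP); the row-stacked matrices A, C and the epigraph variables u are ours:

```python
import numpy as np
from scipy.optimize import linprog

def l1_penalized_estimator(A, C, sigma):
    """Sketch of the L1-penalized robust estimator as one LP.

    A: (N, M) with rows a_p^T;  C: (N, M) with rows c_p^T;  sigma: assumed noise bound.
    Solves  min ||omega||_1  s.t.  |a_p^T theta - omega_p| <= sigma * c_p^T theta,
                                   c_p^T theta >= 1 (relaxed cheirality/normalization).
    Variables are stacked as z = [theta (M); omega (N); u (N)] with |omega| <= u.
    """
    N, M = A.shape
    Z = lambda r, c: np.zeros((r, c))
    I = np.eye(N)
    G = np.vstack([
        np.hstack([ A - sigma * C, -I, Z(N, N)]),   # a^T th - om <= sigma c^T th
        np.hstack([-A - sigma * C,  I, Z(N, N)]),   # om - a^T th <= sigma c^T th
        np.hstack([-C, Z(N, N), Z(N, N)]),          # c^T th >= 1
        np.hstack([Z(N, M),  I, -I]),               # om <= u
        np.hstack([Z(N, M), -I, -I]),               # -om <= u
    ])
    h = np.concatenate([np.zeros(2 * N), -np.ones(N), np.zeros(2 * N)])
    cost = np.concatenate([np.zeros(M + N), np.ones(N)])  # sum(u) = ||omega||_1
    res = linprog(cost, A_ub=G, b_ub=h,
                  bounds=[(None, None)] * (M + 2 * N), method="highs")
    theta, omega = res.x[:M], res.x[M:M + N]
    return theta, omega
```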

66 (resp. T1) denote the index set corresponding to the locations of the S largest entries of ω∗ (resp. [sent-233, score-0.115]

67 If δ_{T0}(θ∗) + δ_{T0∪T1}(θ∗) < 1 then, for some constant C0, it holds: ‖β̂ − β∗‖_2 ≤ C0 ‖ω∗ − ω∗_S‖_1, (12) where ω∗_S stands for the vector ω∗ with all but the S largest entries set to zero. [sent-235, score-0.181]

68 If we denote by T the indices of the S largest entries of the vector |v|, then ‖v_{T^c}‖_2 ≤ S^{−1/2} ‖v‖_1. [sent-243, score-0.151]
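This inequality follows from a standard sorting argument; a sketch of the step (our wording, not the paper's):

```latex
% With T the indices of the S largest |v_q|, every remaining entry is
% dominated by the average magnitude over T:
% \max_{q \in T^c} |v_q| \le \|v_T\|_1 / S \le \|v\|_1 / S.  Hence
\|v_{T^c}\|_2^2 = \sum_{q \in T^c} |v_q|^2
  \le \Big(\max_{q \in T^c} |v_q|\Big)\,\|v_{T^c}\|_1
  \le \frac{\|v\|_1}{S}\,\|v\|_1
\quad\Longrightarrow\quad \|v_{T^c}\|_2 \le S^{-1/2}\,\|v\|_1 .
```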

69 The assumption δ_{T0}(θ∗) + δ_{T0∪T1}(θ∗) < 1 is close in spirit to the restricted isometry assumption (cf. [sent-253, score-0.125]

70 Indeed, if δ_{T0}(θ∗) = 1, then it is possible to find θ̄ ∈ ∂P such that A_{T0^c}^T θ∗ = A_{T0^c}^T θ̄, which makes the problem of robust estimation ill-posed, since both θ∗ and θ̄ satisfy (7) with the same number of outliers. [sent-279, score-0.167]

71 A simple upper bound on δ_J, obtained by replacing the sup over ∂P by the sup over R^M, is δ_J(θ) ≤ ‖O_J^T‖, ∀θ ∈ ∂P, where O = O(A) stands for the Rank(A) × N matrix with orthonormal rows spanning the image of A^T. [sent-284, score-0.204]

72 Indeed, if the last row of the matrix A is equal to zero, as well as all the rows of C except the last row, which has all its entries equal to one, then the model described by (7) is nothing else but a linear model with unknown noise variance. [sent-288, score-0.145]

73 , for instance, [9, 7, 1]) recently introduced in sparse learning and estimation may potentially be useful for the problem of robust estimation. [sent-290, score-0.199]

74 We applied our robust estimation algorithm to the well-known dinosaur sequence. [sent-292, score-0.287]

75 Figure 2: (a)-(c) Overhead view of the scene points estimated by the KK-procedure (a), by the SH-procedure (b) and by our procedure (c). [sent-298, score-0.16]

76 (d) Boxplots of the errors when estimating the camera centers by our procedure (left) and by the KK-procedure. [sent-299, score-0.358]

77 (e) Boxplots of the errors when estimating the camera centers by our procedure (left) and by the SH-procedure. [sent-300, score-0.358]

78 of 36 images of a dinosaur on a turntable, see Fig. [sent-301, score-0.12]

79 The 2D image points which are tracked across the image sequence and the projection matrices of 36 cameras are provided as well. [sent-303, score-0.363]

80 There are 16,432 image points corresponding to 4,983 scene points. [sent-304, score-0.199]

81 This data is severely affected by outliers, which results in very poor accuracy of the “blind” L∞-cost minimization procedure. [sent-305, score-0.239]

82 As shown in Fig. 1, the estimated camera centers are not on the same plane and the scatter plot of scene points is inaccurate. [sent-307, score-0.469]

83 If for the p-th measurement |ω̂_p / c_p^T θ̂| was larger than σ/4, then it was considered an outlier and removed from the dataset. [sent-310, score-0.157]
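A small sketch of the outlier-removal rule described in sentences 83–84 (the σ/4 factor is the one quoted above; variable names and data layout are ours):

```python
import numpy as np
from collections import Counter

def remove_outliers(omega_hat, c_theta, obs_pairs, sigma):
    """Flag measurement p as an outlier when |omega_hat[p] / c_p^T theta_hat| > sigma / 4.

    omega_hat: estimated outlier vector, shape (N,)
    c_theta:   the values c_p^T theta_hat, shape (N,)
    obs_pairs: list of (camera i, scene point j) pairs, one per measurement p
    Returns indices of measurements kept as inliers.
    """
    kept = [p for p in range(len(obs_pairs))
            if abs(omega_hat[p] / c_theta[p]) <= sigma / 4.0]
    # Drop scene points left visible in only one camera, as in sentence 84.
    views = Counter(obs_pairs[p][1] for p in kept)
    return [p for p in kept if views[obs_pairs[p][1]] >= 2]
```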

84 The corresponding 3D scene point was also removed if, after the step of outlier removal, it was seen by only one camera. [sent-311, score-0.205]

85 This resulted in removing 1,306 image points and 297 scene points. [sent-312, score-0.233]

86 Panels (d) and (e) of Fig. 1 show the estimated camera centers and estimated scene points. [sent-314, score-0.41]

87 We see, in particular, that the camera centers are almost coplanar. [sent-315, score-0.309]

88 3 does not improve on the estimator computed at the first step. [sent-317, score-0.15]

89 We compared our procedure with the procedures proposed by Sim and Hartley [27], hereafter referred to as SH-procedure, and by Kanade and Ke [19], hereafter KK-procedure. [sent-319, score-0.139]

90 For the SHprocedure, we iteratively computed the L∞ -cost minimizer by removing, at each step j, the measurements that had a RE larger than Emax,j − 0. [sent-320, score-0.107]

91 We have stopped the SH-procedure when the number of removed measurements exceeded 1,500. [sent-322, score-0.151]

92 The estimator obtained by the SH-procedure has a maximal RE equal to 1.33 pixels, whereas the maximal RE for our estimator is 0. [sent-325, score-0.15] [sent-326, score-0.15]

94 7 Conclusion In this paper, we presented a rigorous Bayesian framework for the problem of translation estimation and triangulation that has led to a new robust estimation procedure. [sent-332, score-0.546]

95 This parameter-vector encapsulates the information on the scene points and the camera locations, as well as the information on the location of outliers in the data. [sent-334, score-0.546]

96 The proposed estimator exploits the sparse nature of the vector of outliers through L1 -norm minimization. [sent-335, score-0.384]

97 We have given the mathematical proof of the result demonstrating the efficiency of the proposed estimator under mild assumptions. [sent-336, score-0.15]

98 Real data analysis conducted on the dinosaur sequence supports our theoretical results. [sent-337, score-0.12]

99 Stable recovery of sparse overcomplete representations in the presence of noise. [sent-416, score-0.107]

100 Sparse structures in L-infinity norm minimization for structure and motion reconstruction. [sent-516, score-0.156]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('ct', 0.366), ('tc', 0.339), ('cameras', 0.226), ('camera', 0.216), ('xij', 0.19), ('triangulation', 0.184), ('multiview', 0.181), ('outliers', 0.17), ('ij', 0.154), ('hartley', 0.15), ('estimator', 0.15), ('bisection', 0.121), ('cheirality', 0.12), ('dinosaur', 0.12), ('measurements', 0.107), ('remark', 0.101), ('translation', 0.101), ('scene', 0.101), ('stands', 0.097), ('estimation', 0.094), ('centers', 0.093), ('rij', 0.09), ('dalalyan', 0.09), ('motion', 0.087), ('minp', 0.079), ('diag', 0.078), ('ki', 0.075), ('robust', 0.073), ('ful', 0.073), ('geometry', 0.07), ('minimization', 0.069), ('cand', 0.065), ('ri', 0.063), ('checks', 0.061), ('outlier', 0.06), ('quasiconvex', 0.06), ('seo', 0.06), ('shprocedure', 0.06), ('points', 0.059), ('uj', 0.057), ('iccv', 0.056), ('rotation', 0.056), ('ti', 0.053), ('maxj', 0.053), ('boxplots', 0.053), ('kanade', 0.053), ('pth', 0.053), ('sedumi', 0.053), ('entries', 0.052), ('prior', 0.052), ('aij', 0.05), ('lled', 0.05), ('procedure', 0.049), ('cvpr', 0.049), ('noise', 0.049), ('et', 0.048), ('inliers', 0.048), ('xj', 0.047), ('hereafter', 0.045), ('ot', 0.045), ('sim', 0.045), ('removed', 0.044), ('unknown', 0.044), ('index', 0.043), ('isometry', 0.043), ('recovery', 0.042), ('assumption', 0.041), ('cope', 0.041), ('ke', 0.041), ('correspondences', 0.039), ('dantzig', 0.039), ('coordinates', 0.039), ('image', 0.039), ('consists', 0.038), ('geometric', 0.037), ('locations', 0.037), ('cij', 0.036), ('residuals', 0.036), ('delity', 0.035), ('largest', 0.035), ('pi', 0.035), ('convex', 0.034), ('sup', 0.034), ('removing', 0.034), ('map', 0.033), ('algebraic', 0.033), ('assessing', 0.033), ('paris', 0.033), ('methodology', 0.033), ('presence', 0.033), ('optimization', 0.032), ('blind', 0.032), ('removal', 0.032), ('indices', 0.032), ('vector', 0.032), ('sparse', 0.032), ('emphasize', 0.032), ('laplace', 0.032), ('introduce', 0.031), ('get', 0.031)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000002 1 nips-2009-$L 1$-Penalized Robust Estimation for a Class of Inverse Problems Arising in Multiview Geometry

Author: Arnak Dalalyan, Renaud Keriven

Abstract: We propose a new approach to the problem of robust estimation in multiview geometry. Inspired by recent advances in the sparse recovery problem of statistics, we define our estimator as a Bayesian maximum a posteriori with multivariate Laplace prior on the vector describing the outliers. This leads to an estimator in which the fidelity to the data is measured by the L∞-norm while the regularization is done by the L1-norm. The proposed procedure is fairly fast since the outlier removal is done by solving one linear program (LP). An important difference compared to existing algorithms is that for our estimator it is not necessary to specify either the number or the proportion of the outliers. We present strong theoretical results assessing the accuracy of our procedure, as well as a numerical example illustrating its efficiency on real data. 1

2 0.21700503 61 nips-2009-Convex Relaxation of Mixture Regression with Efficient Algorithms

Author: Novi Quadrianto, John Lim, Dale Schuurmans, Tibério S. Caetano

Abstract: We develop a convex relaxation of maximum a posteriori estimation of a mixture of regression models. Although our relaxation involves a semidefinite matrix variable, we reformulate the problem to eliminate the need for general semidefinite programming. In particular, we provide two reformulations that admit fast algorithms. The first is a max-min spectral reformulation exploiting quasi-Newton descent. The second is a min-min reformulation consisting of fast alternating steps of closed-form updates. We evaluate the methods against Expectation-Maximization in a real problem of motion segmentation from video data. 1

3 0.1437812 133 nips-2009-Learning models of object structure

Author: Joseph Schlecht, Kobus Barnard

Abstract: We present an approach for learning stochastic geometric models of object categories from single view images. We focus here on models expressible as a spatially contiguous assemblage of blocks. Model topologies are learned across groups of images, and one or more such topologies is linked to an object category (e.g. chairs). Fitting learned topologies to an image can be used to identify the object class, as well as detail its geometry. The latter goes beyond labeling objects, as it provides the geometric structure of particular instances. We learn the models using joint statistical inference over category parameters, camera parameters, and instance parameters. These produce an image likelihood through a statistical imaging model. We use trans-dimensional sampling to explore topology hypotheses, and alternate between Metropolis-Hastings and stochastic dynamics to explore instance parameters. Experiments on images of furniture objects such as tables and chairs suggest that this is an effective approach for learning models that encode simple representations of category geometry and the statistics thereof, and support inferring both category and geometry on held out single view images. 1

4 0.12260006 245 nips-2009-Thresholding Procedures for High Dimensional Variable Selection and Statistical Estimation

Author: Shuheng Zhou

Abstract: Given n noisy samples with p dimensions, where n ≪ p, we show that the multistep thresholding procedure can accurately estimate a sparse vector β ∈ Rp in a linear model, under the restricted eigenvalue conditions (Bickel-Ritov-Tsybakov 09). Thus our conditions for model selection consistency are considerably weaker than what has been achieved in previous works. More importantly, this method allows very significant values of s, which is the number of non-zero elements in the true parameter. For example, it works for cases where the ordinary Lasso would have failed. Finally, we show that if X obeys a uniform uncertainty principle and if the true parameter is sufficiently sparse, the Gauss-Dantzig selector (Candès-Tao 07) achieves the ℓ2 loss within a logarithmic factor of the ideal mean square error one would achieve with an oracle which would supply perfect information about which coordinates are non-zero and which are above the noise level, while selecting a sufficiently sparse model. 1

5 0.10264356 208 nips-2009-Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices via Convex Optimization

Author: John Wright, Arvind Ganesh, Shankar Rao, Yigang Peng, Yi Ma

Abstract: Principal component analysis is a fundamental operation in computational data analysis, with myriad applications ranging from web search to bioinformatics to computer vision and image analysis. However, its performance and applicability in real scenarios are limited by a lack of robustness to outlying or corrupted observations. This paper considers the idealized “robust principal component analysis” problem of recovering a low rank matrix A from corrupted observations D = A + E. Here, the corrupted entries E are unknown and the errors can be arbitrarily large (modeling grossly corrupted observations common in visual and bioinformatic data), but are assumed to be sparse. We prove that most matrices A can be efficiently and exactly recovered from most error sign-and-support patterns by solving a simple convex program, for which we give a fast and provably convergent algorithm. Our result holds even when the rank of A grows nearly proportionally (up to a logarithmic factor) to the dimensionality of the observation space and the number of errors E grows in proportion to the total number of entries in the matrix. A by-product of our analysis is the first proportional growth results for the related problem of completing a low-rank matrix from a small fraction of its entries. Simulations and real-data examples corroborate the theoretical results, and suggest potential applications in computer vision. 1

6 0.092299692 14 nips-2009-A Parameter-free Hedging Algorithm

7 0.086930014 223 nips-2009-Sparse Metric Learning via Smooth Optimization

8 0.086415306 147 nips-2009-Matrix Completion from Noisy Entries

9 0.085942566 207 nips-2009-Robust Nonparametric Regression with Metric-Space Valued Output

10 0.085203215 225 nips-2009-Sparsistent Learning of Varying-coefficient Models with Structural Changes

11 0.083715104 243 nips-2009-The Ordered Residual Kernel for Robust Motion Subspace Clustering

12 0.082520887 20 nips-2009-A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers

13 0.075793967 224 nips-2009-Sparse and Locally Constant Gaussian Graphical Models

14 0.073424064 211 nips-2009-Segmenting Scenes by Matching Image Composites

15 0.071319714 137 nips-2009-Learning transport operators for image manifolds

16 0.070782855 141 nips-2009-Local Rules for Global MAP: When Do They Work ?

17 0.070599109 87 nips-2009-Exponential Family Graph Matching and Ranking

18 0.069498047 31 nips-2009-An LP View of the M-best MAP problem

19 0.069414318 222 nips-2009-Sparse Estimation Using General Likelihoods and Non-Factorial Priors

20 0.068815134 58 nips-2009-Constructing Topological Maps using Markov Random Fields and Loop-Closure Detection


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.233), (1, 0.058), (2, -0.018), (3, 0.057), (4, -0.025), (5, 0.031), (6, 0.082), (7, -0.094), (8, 0.198), (9, -0.066), (10, -0.02), (11, -0.014), (12, 0.06), (13, 0.029), (14, -0.026), (15, 0.01), (16, -0.016), (17, -0.139), (18, -0.104), (19, 0.019), (20, 0.027), (21, 0.037), (22, -0.018), (23, -0.081), (24, 0.039), (25, -0.053), (26, 0.13), (27, -0.091), (28, 0.126), (29, 0.038), (30, -0.134), (31, 0.023), (32, 0.073), (33, 0.022), (34, -0.04), (35, 0.182), (36, 0.051), (37, 0.038), (38, -0.045), (39, -0.027), (40, 0.036), (41, 0.031), (42, -0.005), (43, 0.068), (44, 0.007), (45, 0.009), (46, -0.003), (47, -0.035), (48, -0.04), (49, -0.031)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.93465459 1 nips-2009-$L 1$-Penalized Robust Estimation for a Class of Inverse Problems Arising in Multiview Geometry

Author: Arnak Dalalyan, Renaud Keriven

Abstract: We propose a new approach to the problem of robust estimation in multiview geometry. Inspired by recent advances in the sparse recovery problem of statistics, we define our estimator as a Bayesian maximum a posteriori with multivariate Laplace prior on the vector describing the outliers. This leads to an estimator in which the fidelity to the data is measured by the L∞-norm while the regularization is done by the L1-norm. The proposed procedure is fairly fast since the outlier removal is done by solving one linear program (LP). An important difference compared to existing algorithms is that for our estimator it is not necessary to specify either the number or the proportion of the outliers. We present strong theoretical results assessing the accuracy of our procedure, as well as a numerical example illustrating its efficiency on real data. 1

2 0.78556395 61 nips-2009-Convex Relaxation of Mixture Regression with Efficient Algorithms

Author: Novi Quadrianto, John Lim, Dale Schuurmans, Tibério S. Caetano

Abstract: We develop a convex relaxation of maximum a posteriori estimation of a mixture of regression models. Although our relaxation involves a semidefinite matrix variable, we reformulate the problem to eliminate the need for general semidefinite programming. In particular, we provide two reformulations that admit fast algorithms. The first is a max-min spectral reformulation exploiting quasi-Newton descent. The second is a min-min reformulation consisting of fast alternating steps of closed-form updates. We evaluate the methods against Expectation-Maximization in a real problem of motion segmentation from video data. 1

3 0.58979195 243 nips-2009-The Ordered Residual Kernel for Robust Motion Subspace Clustering

Author: Tat-jun Chin, Hanzi Wang, David Suter

Abstract: We present a novel and highly effective approach for multi-body motion segmentation. Drawing inspiration from robust statistical model fitting, we estimate putative subspace hypotheses from the data. However, instead of ranking them we encapsulate the hypotheses in a novel Mercer kernel which elicits the potential of two point trajectories to have emerged from the same subspace. The kernel permits the application of well-established statistical learning methods for effective outlier rejection, automatic recovery of the number of motions and accurate segmentation of the point trajectories. The method operates well under severe outliers arising from spurious trajectories or mistracks. Detailed experiments on a recent benchmark dataset (Hopkins 155) show that our method is superior to other stateof-the-art approaches in terms of recovering the number of motions, segmentation accuracy, robustness against gross outliers and computational efficiency. 1 Introduction1 Multi-body motion segmentation concerns the separation of motions arising from multiple moving objects in a video sequence. The input data is usually a set of points on the surface of the objects which are tracked throughout the video sequence. Motion segmentation can serve as a useful preprocessing step for many computer vision applications. In recent years the case of rigid (i.e. nonarticulated) objects for which the motions could be semi-dependent on each other has received much attention [18, 14, 19, 21, 22, 17]. Under this domain the affine projection model is usually adopted. Such a model implies that the point trajectories from a particular motion lie on a linear subspace of at most four, and trajectories from different motions lie on distinct subspaces. Thus multi-body motion segmentation is reduced to the problem of subspace segmentation or clustering. To realize practical algorithms, motion segmentation approaches should possess four desirable attributes: (1) Accuracy in classifying the point trajectories to the motions they respectively belong to. This is crucial for success in the subsequent vision applications, e.g. object recognition, 3D reconstruction. (2) Robustness against inlier noise (e.g. slight localization error) and gross outliers (e.g. mistracks, spurious trajectories), since getting imperfect data is almost always unavoidable in practical circumstances. (3) Ability to automatically deduce the number of motions in the data. This is pivotal to accomplish fully automated vision applications. (4) Computational efficiency. This is integral for the processing of video sequences which are usually large amounts of data. Recent work on multi-body motion segmentation can roughly be divided into algebraic or factorization methods [3, 19, 20], statistical methods [17, 7, 14, 6, 10] and clustering methods [22, 21, 5]. Notable approaches include Generalized PCA (GPCA) [19, 20], an algebraic method based on the idea that one can fit a union of m subspaces with a set of polynomials of degree m. Statistical methods often employ concepts such random hypothesis generation [4, 17], Expectation-Maximization [14, 6] 1 This work was supported by the Australian Research Council (ARC) under the project DP0878801. 1 and geometric model selection [7, 8]. Clustering based methods [22, 21, 5] are also gaining attention due to their effectiveness. They usually include a dimensionality reduction step (e.g. manifold learning [5]) followed by a clustering of the point trajectories (e.g. via spectral clustering in [21]). 
A recent benchmark [18] indicated that Local Subspace Affinity (LSA) [21] gave the best performance in terms of classification accuracy, although their result was subsequently surpassed by [5, 10]. However, we argue that most of the previous approaches do not simultaneously fulfil the qualities desirable of motion segmentation algorithms. Most notably, although some of the approaches have the means to estimate the number of motions, they are generally unreliable in this respect and require manual input of this parameter. In fact this prior knowledge was given to all the methods compared in [18]2 . Secondly, most of the methods (e.g. [19, 5]) do not explicitly deal with outliers. They will almost always breakdown when given corrupted data. These deficiencies reduce the usefulness of available motion segmentation algorithms in practical circumstances. In this paper we attempt to bridge the gap between experimental performance and practical usability. Our previous work [2] indicates that robust multi-structure model fitting can be achieved effectively with statistical learning. Here we extend this concept to motion subspace clustering. Drawing inspiration from robust statistical model fitting [4], we estimate random hypotheses of motion subspaces in the data. However, instead of ranking these hypotheses we encapsulate them in a novel Mercer kernel. The kernel can function reliably despite overwhelming sampling imbalance, and it permits the application of non-linear dimensionality reduction techniques to effectively identify and reject outlying trajectories. This is then followed by Kernel PCA [11] to maximize the separation between groups and spectral clustering [13] to recover the number of motions and clustering. Experiments on the Hopkins 155 benchmark dataset [18] show that our method is superior to other approaches in terms of the qualities described above, including computational efficiency. 1.1 Brief review of affine model multi-body motion segmentation Let {tf p ∈ R2 }f =1,...,F be the set of 2D coordinates of P trajectories tracked across F frames. In p=1,...,P multi-body motion segmentation the tf p ’s correspond to points on the surface of rigid objects which are moving. The goal is to separate the trajectories into groups corresponding to the motion they belong to. In other words, if we arrange the coordinates in the following data matrix   t11 · · · t1P  . .  ∈ R2F ×P , .. .  T= . (1) . . . tF 1 . . . tF P the goal is to find the permutation Γ ∈ RP ×P such that the columns of T · Γ are arranged according to the respective motions they belong to. It turns out that under affine projection [1, 16] trajectories from the same motion lie on a distinct subspace in R2F , and each of these motion subspaces is of dimensions 2, 3 or 4. Thus motion segmentation can be accomplished via clustering subspaces in R2F . See [1, 16] for more details. Realistically actual motion sequences might contain trajectories which do not correspond to valid objects or motions. These trajectories behave as outliers in the data and, if not taken into account, can be seriously detrimental to subspace clustering algorithms. 2 The Ordered Residual Kernel (ORK) First, we take a statistical model fitting point of view to motion segmentation. Let {xi }i=1,...,N be the set of N samples on which we want to perform model fitting. We randomly draw p-subsets from the data and use it to fit a hypothesis of the model, where p is the number of parameters that define the model. 
In motion segmentation, the xi ’s are the columns of matrix T, and p = 4 since the model is a four-dimensional subspace3 . Assume that M of such random hypotheses are drawn. i i For each data point xi compute its absolute residual set ri = {r1 , . . . , rM } as measured to the M hypotheses. For motion segmentation, the residual is the orthogonal distance to a hypothesis 2 As confirmed through private contact with the authors of [18]. Ideally we should also consider degenerate motions with subspace dimensions 2 or 3, but previous work [18] using RANSAC [4] and our results suggest this is not a pressing issue for the Hopkins 155 dataset. 3 2 i i subspace. We sort the elements in ri to obtain the sorted residual set ˜i = {rλi , . . . , rλi }, where r 1 M i i the permutation {λi , . . . , λi } is obtained such that rλi ≤ · · · ≤ rλi . Define the following 1 M 1 M ˜ θi := {λi , . . . , λi } 1 M (2) ˜ as the sorted hypothesis set of point xi , i.e. θi depicts the order in which xi becomes the inlier of the M hypotheses as a fictitious inlier threshold is increased from 0 to ∞. We define the Ordered Residual Kernel (ORK) between two data points as 1 kr (xi1 , xi2 ) := ˜ Z M/h t ˜ ˜ zt · k∩ (θi1 , θi2 ), (3) t=1 M/h where zt = 1 are the harmonic series and Z = t=1 zt is the (M/h)-th harmonic number. t Without lost of generality assume that M is wholly divisible by h. Step size h is used to obtain the Difference of Intersection Kernel (DOIK) 1 ˜1:α t ˜ ˜ ˜1:α ˜1:α ˜1:α k∩ (θi1 , θi2 ) := (|θi1 t ∩ θi2 t | − |θi1 t−1 ∩ θi2 t−1 |) (4) h ˜a:b where αt = t · h and αt−1 = (t − 1) · h. Symbol θi indicates the set formed by the a-th to ˜i . Since the contents of the sorted hypotheses set are merely permutations of the b-th elements of θ {1 . . . M }, i.e. there are no repeating elements, 0 ≤ kr (xi1 , xi2 ) ≤ 1. ˜ (5) Note that kr is independent of the type of model to be fitted, thus it is applicable to generic statistical ˜ model fitting problems. However, we concentrate on motion subspaces in this paper. Let τ be a fictitious inlier threshold. The kernel kr captures the intuition that, if τ is low, two ˜ points arising from the same subspace will have high normalized intersection since they share many common hypotheses which correspond to that subspace. If τ is high, implausible hypotheses fitted on outliers start to dominate and decrease the normalized intersection. Step size h allows us to quantify the rate of change of intersection if τ is increased from 0 to ∞, and since zt is decreasing, kr will evaluate to a high value for two points from the same subspace. In contrast, kr is always low ˜ ˜ for points not from the same subspace or that are outliers. Proof of satisfying Mercer’s condition. Let D be a fixed domain, and P(D) be the power set of D, i.e. the set of all subsets of D. Let S ⊆ P(D), and p, q ∈ S. If µ is a measure on D, then k∩ (p, q) = µ(p ∩ q), (6) called the intersection kernel, is provably a valid Mercer kernel [12]. The DOIK can be rewritten as t ˜ ˜ k∩ (θi1 , θi2 ) = 1 ˜(αt−1 +1):αt ˜(αt−1 +1):αt (|θ ∩ θi2 | h i1 ˜1:(α ) ˜(α +1):αt | + |θ (αt−1 +1):αt ∩ θ 1:(αt−1 ) |). ˜ ˜ +|θi1 t−1 ∩ θi2 t−1 i1 i2 (7) If we let D = {1 . . . M } be the set of all possible hypothesis indices and µ be uniform on D, each term in Eq. (7) is simply an intersection kernel multiplied by |D|/h. 
Since multiplying a kernel with a positive constant and adding two kernels respectively produce valid Mercer kernels [12], the DOIK and ORK are also valid Mercer kernels.• Parameter h in kr depends on the number of random hypotheses M , i.e. step size h can be set as a ˜ ratio of M . The value of M can be determined based on the size of the p-subset and the size of the data N (e.g. [23, 15]), and thus h is not contingent on knowledge of the true inlier noise scale or threshold. Moreover, our experiments in Sec. 4 show that segmentation performance is relatively insensitive to the settings of h and M . 2.1 Performance under sampling imbalance Methods based on random sampling (e.g. RANSAC [4]) are usually affected by unbalanced datasets. The probability of simultaneously retrieving p inliers from a particular structure is tiny if points 3 from that structure represent only a small minority in the data. In an unbalanced dataset the “pure” p-subsets in the M randomly drawn samples will be dominated by points from the majority structure in the data. This is a pronounced problem in motion sequences, since there is usually a background “object” whose point trajectories form a large majority in the data. In fact, for motion sequences from the Hopkins 155 dataset [18] with typically about 300 points per sequence, M has to be raised to about 20,000 before a pure p-subset from the non-background objects is sampled. However, ORK can function reliably despite serious sampling imbalance. This is because points from the same subspace are roughly equi-distance to the sampled hypotheses in their vicinity, even though these hypotheses might not pass through that subspace. Moreover, since zt in Eq. (3) is decreasing only residuals/hypotheses in the vicinity of a point are heavily weighted in the intersection. Fig. 1(a) illustrates this condition. Results in Sec. 4 show that ORK excelled even with M = 1, 000. (a) Data in R2F . (b) Data in RKHS Fkr . ˜ Figure 1: (a) ORK under sampling imbalance. (b) Data in RKHS induced by ORK. 3 Multi-Body Motion Segmentation using ORK In this section, we describe how ORK is used for multi-body motion segmentation. 3.1 Outlier rejection via non-linear dimensionality reduction Denote by Fkr the Reproducing Kernel Hilbert Space (RKHS) induced by kr . Let matrix A = ˜ ˜ [φ(x1 ) . . . φ(xN )] contain the input data after it is mapped to Fkr . The kernel matrix K = AT A is ˜ computed using the kernel function kr as ˜ Kp,q = φ(xp ), φ(xq ) = kr (xp , xq ), p, q ∈ {1 . . . N }. ˜ (8) Since kr is a valid Mercer kernel, K is guaranteed to be positive semi-definite [12]. Let K = ˜ Q∆QT be the eigenvalue decomposition (EVD) of K. Then the rank-n Kernel Singular Value Decomposition (Kernel SVD) [12] of A is 1 1 An = [AQn (∆n )− 2 ][(∆n ) 2 ][(Qn )T ] ≡ Un Σn (Vn )T . n n (9) n Via the Matlab notation, Q = Q:,1:n and ∆ = ∆1:n,1:n . The left singular vectors U is an orthonormal basis for the n-dimensional principal subspace of the whole dataset in Fkr . Projecting ˜ the data onto the principal subspace yields 1 1 B = [AQn (∆n )− 2 ]T A = (∆n ) 2 (Qn )T , (10) n×N where B = [b1 . . . bN ] ∈ R is the reduced dimension version of A. Directions of the principal subspace are dominated by inlier points, since kr evaluates to a high value generally for them, but ˜ always to a low value for gross outliers. Moreover the kernel ensures that points from the same subspace are mapped to the same cluster and vice versa. Fig. 1(b) illustrates this condition. Fig. 
2(a)(left) shows the first frame of sequence “Cars10” from the Hopkins 155 dataset [18] with 100 false trajectories of Brownian motion added to the original data (297 points). The corresponing RKHS norm histogram for n = 3 is displayed in Fig. 2(b). The existence of two distinct modes, 4 15 Outlier mode Bin count Inlier mode 10 5 0 (a) (left) Before and (right) after outlier removal. Blue dots are inliers while red dots are added outliers. 0 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 Vector norm in principal subspace 0.18 0.2 (b) Actual norm histogram of “cars10”. Figure 2: Demonstration of outlier rejection on sequence “cars10” from Hopkins 155. corresponding respectively to inliers and outliers, is evident. We exploit this observation for outlier rejection by discarding data with low norms in the principal subspace. The cut-off threshold ψ can be determined by analyzing the shape of the distribution. For instance we can fit a 1D Gaussian Mixture Model (GMM) with two components and set ψ as the point of equal Mahalanobis distance between the two components. However, our experimentation shows that an effective threshold can be obtained by simply setting ψ as the average value of all the norms, i.e. ψ= 1 N N bi . (11) i=1 This method was applied uniformly on all the sequences in our experiments in Sec. 4. Fig. 2(a)(right) shows an actual result of the method on Fig. 2(a)(left). 3.2 Recovering the number of motions and subspace clustering After outlier rejection, we further take advantage of the mapping induced by ORK for recovering the number of motions and subspace clustering. On the remaining data, we perform Kernel PCA [11] to seek the principal components which maximize the variance of the data in the RKHS, as Fig. 1(b) illustrates. Let {yi }i=1,...,N ′ be the N ′ -point subset of the input data that remains after outlier removal, where N ′ < N . Denote by C = [φ(y1 ) . . . φ(yN ′ )] the data matrix after mapping the data ˜ to Fkr , and by symbol C the result of adjusting C with the empirical mean of {φ(y1 ), . . . , φ(yN ′ )}. ˜ ˜ ˜ ˜ The centered kernel matrix K′ = CT C [11] can be obtained as 1 ˜ K′ = ν T K′ ν, ν = [IN ′ − ′ 1N ′ ,N ′ ], (12) N where K′ = CT C is the uncentered kernel matrix, Is and 1s,s are respectively the s × s identity ˜ ˜ matrix and a matrix of ones. If K′ = RΩRT is the EVD of K′ , then we obtain first-m kernel m ˜ principal components P of C as the first-m left singular vectors of C , i.e. 1 ˜ Pm = CRm (Ωm )− 2 , (13) where Rm = R:,1:m and Ω1:m,1:m ; see Eq. (9). Projecting the data on the principal components yields 1 D = [d1 . . . dN ′ ] = (Ωm ) 2 (Rm )T , (14) ′ where D ∈ Rm×N . The affine subspace span(Pm ) maximizes the spread of the centered data in the RKHS, and the projection D offers an effective representation for clustering. Fig. 3(a) shows the Kernel PCA projection results for m = 3 on the sequence in Fig. 2(a). The number of clusters in D is recovered via spectral clustering. More specifically we apply the Normalized Cut (Ncut) [13] algorithm. A fully connected graph is first derived from the data, where ′ ′ its weighted adjacency matrix W ∈ RN ×N is obtained as Wp,q = exp(− dp − dq 2 /2δ 2 ), (15) and δ is taken as the average nearest neighbour distance in the Euclidean sense among the vectors in D. The Laplacian matrix [13] is then derived from W and eigendecomposed. Under Ncut, 5 0.1 0.05 0 −0.05 −0.1 0.1 −0.15 0.15 0.08 0.1 0.05 0 −0.05 −0.1 0.06 (a) Kernel PCA and Ncut results. (b) W matrix. (c) Final result for “cars10”. 
Figure 3: Actual results on the motion sequence in Fig. 2(a)(left). the number of clusters is revealed as the number of eigenvalues of the Laplacian that are zero or numerically insignificant. With this knowledge, a subsequent k-means step is then performed to cluster the points. Fig. 3(b) shows W for the input data in Fig. 2(a)(left) after outlier removal. It can be seen that strong affinity exists between points from the same cluster, thus allowing accurate clustering. Figs. 3(a) and 3(c) illustrate the final clustering result for the data in Fig. 2(a)(left). There are several reasons why spectral clustering under our framework is more successful than previous methods. Firstly, we perform an effective outlier rejection step that removes bad trajectories that can potentially mislead the clustering. Secondly, the mapping induced by ORK deliberately separates the trajectories based on their cluster membership. Finally, we perform Kernel PCA to maximize the variance of the data. Effectively this also improves the separation of clusters, thus facilitating an accurate recovery of the number of clusters and also the subsequent segmentation. This distinguishes our work from previous clustering based methods [21, 5] which tend to operate without maximizing the between-class scatter. Results in Sec. 4 validate our claims. 4 Results Henceforth we indicate the proposed method as “ORK”. We leverage on a recently published benchmark on affine model motion segmentation [18] as a basis of comparison. The benchmark was evaluated on the Hopkins 155 dataset4 which contains 155 sequences with tracked point trajectories. A total of 120 sequences have two motions while 35 have three motions. The sequences contain degenerate and non-degenerate motions, independent and partially dependent motions, articulated motions, nonrigid motions etc. In terms of video content three categories exist: Checkerboard sequences, traffic sequences (moving cars, trucks) and articulated motions (moving faces, people). 4.1 Details on benchmarking Four major algorithms were compared in [18]: Generalized PCA (GPCA) [19], Local Subspace Affinity (LSA) [21], Multi-Stage Learning (MSL) [14] and RANSAC [17]. Here we extend the benchmark with newly reported results from Locally Linear Manifold Clustering (LLMC) [5] and Agglomerative Lossy Compression (ALC) [10, 9]. We also compare our method against Kanatani and Matsunaga’s [8] algorithm (henceforth, the “KM” method) in estimating the number of independent motions in the video sequences. Note that KM per se does not perform motion segmentation. For the sake of objective comparisons we use only implementations available publicly5. Following [18], motion segmentation performance is evaluated in terms of the labelling error of the point trajectories, where each point in a sequence has a ground truth label, i.e. number of mislabeled points . (16) classification error = total number of points Unlike [18], we also emphasize on the ability of the methods in recovering the number of motions. However, although the methods compared in [18] (except RANSAC) theoretically have the means to 4 Available at http://www.vision.jhu.edu/data/hopkins155/. For MSL and KM, see http://www.suri.cs.okayama-u.ac.jp/e-program-separate.html/. For GPCA, LSA and RANSAC, refer to the url for the Hopkins 155 dataset. 5 6 do so, their estimation of the number of motions is generally unrealiable and the benchmark results in [18] were obtained by revealing the actual number of motions to the algorithms. 
A similar initialization exists in [5, 10] where the results were obtained by giving LLMC and ALC this knowledge a priori (for LLMC, this was given at least to the variant LLMC 4m during dimensionality reduction [5], where m is the true number of motions). In the following subsections, where variants exist for the compared algorithms we use results from the best performing variant. In the following the number of random hypotheses M and step size h for ORK are fixed at 1000 and 300 respectively, and unlike the others, ORK is not given knowledge of the number of motions. 4.2 Data without gross outliers We apply ORK on the Hopkins 155 dataset. Since ORK uses random sampling we repeat it 100 times for each sequence and average the results. Table 1 depicts the obtained classification error among those from previously proposed methods. ORK (column 9) gives comparable results to the other methods for sequences with 2 motions (mean = 7.83%, median = 0.41%). For sequences with 3 motions, ORK (mean = 12.62%, median = 4.75%) outperforms GPCA and RANSAC, but is slightly less accurate than the others. However, bear in mind that unlike the other methods ORK is not given prior knowledge of the true number of motions and has to estimate this independently. Column Method 1 REF 2 GPCA Mean Median 2.03 0.00 4.59 0.38 Mean Median 5.08 2.40 28.66 28.26 3 4 5 6 LSA MSL RANSAC LLMC Sequences with 2 motions 3.45 4.14 5.56 3.62 0.59 0.00 1.18 0.00 Sequences with 3 motions 9.73 8.23 22.94 8.85 2.33 1.76 22.03 3.19 8 ALC 9 ORK 10 ORK∗ 3.03 0.00 7.83 0.41 1.27 0.00 6.26 1.02 12.62 4.75 2.09 0.05 Table 1: Classification error (%) on Hopkins 155 sequences. REF represents the reference/control method which operates based on knowledge of ground truth segmentation. Refer to [18] for details. We also separately investigate the accuracy of ORK in estimating the number of motions, and compare it against KM [8] which was proposed for this purpose. Note that such an experiment was not attempted in [18] since approaches compared therein generally do not perform reliably in estimating the number of motions. The results in Table 2 (columns 1–2) show that for sequences with two motions, KM (80.83%) outperforms ORK (67.37%) by ≈ 15 percentage points. However, for sequences with three motions, ORK (49.66%) vastly outperforms KM (14.29%) by more than doubling the percentage points of accuracy. The overall accuracy of KM (65.81%) is slightly better than ORK (63.37%), but this is mostly because sequences with two motions form the majority in the dataset (120 out of 155). This leads us to conclude that ORK is actually the superior method here. Dataset Column Method 2 motions 3 motions Overall Hopkins 155 1 2 KM ORK 80.83% 67.37% 14.29% 49.66% 65.81% 63.37% Hopkins 155 + Outliers 3 4 KM ORK 00.00% 47.58% 100.00% 50.00% 22.58% 48.13% Table 2: Accuracy in determining the number of motions in a sequence. Note that in the experiment with outliers (columns 3–4), KM returns a constant number of 3 motions for all sequences. We re-evaluate the performance of ORK by considering only results on sequences where the number of motions is estimated correctly by ORK (there are about 98 ≡ 63.37% of such cases). The results are tabulated under ORK∗ (column 10) in Table 1. It can be seen that when ORK estimates the number of motions correctly, it is significantly more accurate than the other methods. Finally, we compare the speed of the methods in Table 3. 
ORK was implemented and run in Matlab on a Dual Core Pentium 3.00GHz machine with 4GB of main memory (this is much less powerful 7 than the 8 Core Xeon 3.66GHz with 32GB memory used in [18] for the other methods in Table 3). The results show that ORK is comparable to LSA, much faster than MSL and ALC, but slower than GPCA and RANSAC. Timing results of LLMC are not available in the literature. Method 2 motions 3 motions GPCA 324ms 738ms LSA 7.584s 15.956s MSL 11h 4m 1d 23h RANSAC 175ms 258ms ALC 10m 32s 10m 32s ORK 4.249s 8.479s Table 3: Average computation time on Hopkins 155 sequences. 4.3 Data with gross outliers We next examine the ability of the proposed method in dealing with gross outliers in motion data. For each sequence in Hopkins 155, we add 100 gross outliers by creating trajectories corresponding to mistracks or spuriously occuring points. These are created by randomly initializing 100 locations in the first frame and allowing them to drift throughout the sequence according to Brownian motion. The corrupted sequences are then subjected to the algorithms for motion segmentation. Since only ORK is capable of rejecting outliers, the classification error of Eq. (16) is evaluated on the inlier points only. The results in Table 4 illustrate that ORK (column 4) is the most accurate method by a large margin. Despite being given the true number of motions a priori, GPCA, LSA and RANSAC are unable to provide satisfactory segmentation results. Column Method Mean Median Mean Median 1 2 3 4 GPCA LSA RANSAC ORK Sequences with 2 motions 28.66 24.25 30.64 16.50 30.96 26.51 32.36 10.54 Sequences with 3 motions 40.61 30.94 42.24 19.99 41.30 27.68 43.43 8.49 5 ORK∗ 1.62 0.00 2.68 0.09 Table 4: Classification error (%) on Hopkins 155 sequences with 100 gross outliers per sequence. In terms of estimating the number of motions, as shown in column 4 in Table 2 the overall accuracy of ORK is reduced to 48.13%. This is contributed mainly by the deterioration in accuracy on sequences with two motions (47.58%), although the accuracy on sequences with three motions are maintained (50.00%). This is not a surprising result, since sequences with three motions generally have more (inlying) point trajectories than sequences with two motions, thus the outlier rates for sequences with three motions are lower (recall that a fixed number of 100 false trajectories are added). On the other hand, the KM method (column 3) is completely overwhelmed by the outliers— for all the sequences with outliers it returned a constant “3” as the number of motions. We again re-evaluate ORK by considering results from sequences (now with gross outliers) where the number of motions is correctly estimated (there are about 75 ≡ 48.13% of such cases). The results tabulated under ORK∗ (column 5) in Table 4 show that the proposed method can accurately segment the point trajectories without being influenced by the gross outliers. 5 Conclusions In this paper we propose a novel and highly effective approach for multi-body motion segmentation. Our idea is based on encapsulating random hypotheses in a novel Mercel kernel and statistical learning. We evaluated our method on the Hopkins 155 dataset with results showing that the idea is superior other state-of-the-art approaches. It is by far the most accurate in terms of estimating the number of motions, and it excels in segmentation accuracy despite lacking prior knowledge of the number of motions. The proposed idea is also highly robust towards outliers in the input data. Acknowledgements. 
5 Conclusions

In this paper we proposed a novel and highly effective approach to multi-body motion segmentation, based on encapsulating random hypotheses in a novel Mercer kernel and on statistical learning. We evaluated the method on the Hopkins 155 dataset; the results show that it is superior to other state-of-the-art approaches. It is by far the most accurate in estimating the number of motions, and it excels in segmentation accuracy despite lacking prior knowledge of the number of motions. The proposed approach is also highly robust to outliers in the input data.

Acknowledgements. We are grateful to the authors of [18], especially René Vidal, for discussions and insights which have been immensely helpful.

References
[1] T. Boult and L. Brown. Factorization-based segmentation of motions. In IEEE Workshop on Motion Understanding, 1991.
[2] T.-J. Chin, H. Wang, and D. Suter. Robust fitting of multiple structures: The statistical learning approach. In ICCV, 2009.
[3] J. Costeira and T. Kanade. A multibody factorization method for independently moving objects. IJCV, 29(3):159–179, 1998.
[4] M. A. Fischler and R. C. Bolles. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Comm. of the ACM, 24:381–395, 1981.
[5] A. Goh and R. Vidal. Segmenting motions of different types by unsupervised manifold clustering. In CVPR, 2007.
[6] A. Gruber and Y. Weiss. Multibody factorization with uncertainty and missing data using the EM algorithm. In CVPR, 2004.
[7] K. Kanatani. Motion segmentation by subspace separation and model selection. In ICCV, 2001.
[8] K. Kanatani and C. Matsunaga. Estimating the number of independent motions for multibody segmentation. In ACCV, 2002.
[9] Y. Ma, H. Derksen, W. Hong, and J. Wright. Segmentation of multivariate mixed data via lossy coding and compression. TPAMI, 29(9):1546–1562, 2007.
[10] S. Rao, R. Tron, Y. Ma, and R. Vidal. Motion segmentation via robust subspace separation in the presence of outlying, incomplete, or corrupted trajectories. In CVPR, 2008.
[11] B. Schölkopf, A. Smola, and K.-R. Müller. Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10:1299–1319, 1998.
[12] J. Shawe-Taylor and N. Cristianini. Kernel methods for pattern analysis. Cambridge University Press, 2004.
[13] J. Shi and J. Malik. Normalized cuts and image segmentation. TPAMI, 22(8):888–905, 2000.
[14] Y. Sugaya and K. Kanatani. Geometric structure of degeneracy for multi-body motion segmentation. In Workshop on Statistical Methods in Video Processing, 2004.
[15] R. Toldo and A. Fusiello. Robust multiple structures estimation with J-Linkage. In ECCV, 2008.
[16] C. Tomasi and T. Kanade. Shape and motion from image streams under orthography. IJCV, 9(2):137–154, 1992.
[17] P. Torr. Geometric motion segmentation and model selection. Phil. Trans. Royal Society of London, 356(1740):1321–1340, 1998.
[18] R. Tron and R. Vidal. A benchmark for the comparison of 3-D motion segmentation algorithms. In CVPR, 2007.
[19] R. Vidal and R. Hartley. Motion segmentation with missing data by PowerFactorization and Generalized PCA. In CVPR, 2004.
[20] R. Vidal, Y. Ma, and S. Sastry. Generalized Principal Component Analysis (GPCA). TPAMI, 27(12):1–15, 2005.
[21] J. Yan and M. Pollefeys. A general framework for motion segmentation: independent, articulated, rigid, non-rigid, degenerate and non-degenerate. In ECCV, 2006.
[22] L. Zelnik-Manor and M. Irani. Degeneracies, dependencies and their implications on multi-body and multi-sequence factorization. In CVPR, 2003.
[23] W. Zhang and J. Kosecká. Nonparametric estimation of multiple structures with outliers. In Dynamical Vision, ICCV 2005 and ECCV 2006 Workshops, 2006.

4 0.54718268 79 nips-2009-Efficient Recovery of Jointly Sparse Vectors

Author: Liang Sun, Jun Liu, Jianhui Chen, Jieping Ye

Abstract: We consider the reconstruction of sparse signals in the multiple measurement vector (MMV) model, in which the signal, represented as a matrix, consists of a set of jointly sparse vectors. MMV is an extension of the single measurement vector (SMV) model employed in standard compressive sensing (CS). Recent theoretical studies focus on the convex relaxation of the MMV problem based on the (2, 1)-norm minimization, which is an extension of the well-known 1-norm minimization employed in SMV. However, the resulting convex optimization problem in MMV is significantly more difficult to solve than the one in SMV. Existing algorithms reformulate it as a second-order cone programming (SOCP) or semidefinite programming (SDP) problem, which is computationally expensive to solve for problems of moderate size. In this paper, we propose a new (dual) reformulation of the convex optimization problem in MMV and develop an efficient algorithm based on the prox-method. Interestingly, our theoretical analysis reveals the close connection between the proposed reformulation and multiple kernel learning. Our simulation studies demonstrate the scalability of the proposed algorithm.
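To make the regularizer in this abstract concrete, here is a minimal sketch of the (2, 1)-norm and its proximal operator (standard row-wise group soft-thresholding). This illustrates the joint-sparsity penalty only; it is not the paper's prox-method algorithm.

```python
import numpy as np

def norm_21(W):
    """(2,1)-norm of a matrix W: sum of the Euclidean norms of its rows.
    Minimising it drives entire rows to zero, so the recovered vectors
    (columns of W) share a common support -- joint sparsity."""
    return np.linalg.norm(W, axis=1).sum()

def prox_21(W, t):
    """Proximal operator of t * ||.||_{2,1}: shrink each row towards zero
    by t in Euclidean norm, zeroing rows whose norm falls below t."""
    row_norms = np.linalg.norm(W, axis=1, keepdims=True)
    shrink = np.maximum(1.0 - t / np.maximum(row_norms, 1e-12), 0.0)
    return shrink * W
```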

5 0.53753257 36 nips-2009-Asymptotic Analysis of MAP Estimation via the Replica Method and Compressed Sensing

Author: Sundeep Rangan, Vivek Goyal, Alyson K. Fletcher

Abstract: The replica method is a non-rigorous but widely-accepted technique from statistical physics used in the asymptotic analysis of large, random, nonlinear problems. This paper applies the replica method to non-Gaussian maximum a posteriori (MAP) estimation. It is shown that with random linear measurements and Gaussian noise, the asymptotic behavior of the MAP estimate of an n-dimensional vector “decouples” as n scalar MAP estimators. The result is a counterpart to Guo and Verdú's replica analysis of minimum mean-squared error estimation. The replica MAP analysis can be readily applied to many estimators used in compressed sensing, including basis pursuit, lasso, linear estimation with thresholding, and zero-norm-regularized estimation. In the case of lasso estimation the scalar estimator reduces to a soft-thresholding operator, and for zero-norm-regularized estimation it reduces to a hard threshold. Among other benefits, the replica method provides a computationally-tractable method for exactly computing various performance metrics including mean-squared error and sparsity pattern recovery probability.
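The two scalar estimators named at the end of this abstract are standard operators; a minimal sketch follows, with the threshold parameter t assumed given.

```python
import numpy as np

def soft_threshold(x, t):
    """Scalar soft-thresholding: the decoupled scalar MAP estimator that
    the replica analysis associates with lasso (L1-regularised) estimation."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def hard_threshold(x, t):
    """Scalar hard-thresholding: the counterpart for zero-norm-regularised
    estimation -- keep the value if its magnitude exceeds t, else zero it."""
    return np.where(np.abs(x) > t, x, 0.0)
```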

6 0.52069342 208 nips-2009-Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices via Convex Optimization

7 0.50415778 147 nips-2009-Matrix Completion from Noisy Entries

8 0.49613556 245 nips-2009-Thresholding Procedures for High Dimensional Variable Selection and Statistical Estimation

9 0.49601981 88 nips-2009-Extending Phase Mechanism to Differential Motion Opponency for Motion Pop-out

10 0.48288682 185 nips-2009-Orthogonal Matching Pursuit From Noisy Random Measurements: A New Analysis

11 0.47219026 224 nips-2009-Sparse and Locally Constant Gaussian Graphical Models

12 0.46436161 20 nips-2009-A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers

13 0.45590681 180 nips-2009-On the Convergence of the Concave-Convex Procedure

14 0.45302027 222 nips-2009-Sparse Estimation Using General Likelihoods and Non-Factorial Priors

15 0.44666678 225 nips-2009-Sparsistent Learning of Varying-coefficient Models with Structural Changes

16 0.43666521 14 nips-2009-A Parameter-free Hedging Algorithm

17 0.43378854 103 nips-2009-Graph Zeta Function in the Bethe Free Energy and Loopy Belief Propagation

18 0.42349896 223 nips-2009-Sparse Metric Learning via Smooth Optimization

19 0.417689 141 nips-2009-Local Rules for Global MAP: When Do They Work ?

20 0.41678709 137 nips-2009-Learning transport operators for image manifolds


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(7, 0.015), (24, 0.039), (25, 0.135), (26, 0.201), (35, 0.067), (36, 0.101), (39, 0.026), (57, 0.012), (58, 0.14), (61, 0.027), (71, 0.042), (81, 0.034), (86, 0.079), (91, 0.017)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.87063748 1 nips-2009-$L 1$-Penalized Robust Estimation for a Class of Inverse Problems Arising in Multiview Geometry

2 0.86401552 173 nips-2009-Nonparametric Greedy Algorithms for the Sparse Learning Problem

Author: Han Liu, Xi Chen

Abstract: This paper studies the forward greedy strategy in sparse nonparametric regression. For additive models, we propose an algorithm called additive forward regression; for general multivariate models, we propose an algorithm called generalized forward regression. Both algorithms simultaneously conduct estimation and variable selection in nonparametric settings for the high dimensional sparse learning problem. Our main emphasis is empirical: on both simulated and real data, these two simple greedy methods can clearly outperform several state-of-the-art competitors, including LASSO, a nonparametric version of LASSO called the sparse additive model (SpAM) and a recently proposed adaptive parametric forward-backward algorithm called Foba. We also provide some theoretical justifications of specific versions of the additive forward regression.
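For orientation, here is a minimal sketch of the generic forward greedy selection loop underlying such methods. This is a plain linear-regression variant, not the paper's nonparametric estimators, and the stopping size k is an assumed parameter.

```python
import numpy as np

def forward_regression(X, y, k):
    """Greedy forward selection for linear regression: at each step, add the
    variable whose inclusion most reduces the residual sum of squares,
    refitting by least squares after every addition."""
    _, p = X.shape
    selected = []
    for _ in range(k):
        best_j, best_rss = None, np.inf
        for j in (j for j in range(p) if j not in selected):
            Xs = X[:, selected + [j]]
            beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            rss = np.sum((y - Xs @ beta) ** 2)
            if rss < best_rss:
                best_j, best_rss = j, rss
        selected.append(best_j)
    beta, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
    return selected, beta
```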

3 0.75370795 254 nips-2009-Variational Gaussian-process factor analysis for modeling spatio-temporal data

Author: Jaakko Luttinen, Alexander T. Ihler

Abstract: We present a probabilistic factor analysis model which can be used for studying spatio-temporal datasets. The spatial and temporal structure is modeled by using Gaussian process priors both for the loading matrix and the factors. The posterior distributions are approximated using the variational Bayesian framework. High computational cost of Gaussian process modeling is reduced by using sparse approximations. The model is used to compute the reconstructions of the global sea surface temperatures from a historical dataset. The results suggest that the proposed model can outperform the state-of-the-art reconstruction systems.

4 0.74061251 214 nips-2009-Semi-supervised Regression using Hessian energy with an application to semi-supervised dimensionality reduction

Author: Kwang I. Kim, Florian Steinke, Matthias Hein

Abstract: Semi-supervised regression based on the graph Laplacian suffers from the fact that the solution is biased towards a constant and from a lack of extrapolating power. Based on these observations, we propose to use the second-order Hessian energy for semi-supervised regression, which overcomes both of these problems. If the data lies on or close to a low-dimensional submanifold in feature space, the Hessian energy prefers functions whose values vary linearly with respect to geodesic distance. We first derive the Hessian energy for smooth manifolds and then give a stable estimation procedure for the common case where only samples of the underlying manifold are given. The preference for "linear" functions on manifolds renders the Hessian energy particularly suited to the task of semi-supervised dimensionality reduction, where the goal is to find a user-defined embedding function, given some labeled points, which varies smoothly (and ideally linearly) along the manifold. The experimental results suggest superior performance of our method compared with semi-supervised regression using Laplacian regularization or standard supervised regression techniques applied to this task.

5 0.73708689 37 nips-2009-Asymptotically Optimal Regularization in Smooth Parametric Models

Author: Percy Liang, Guillaume Bouchard, Francis R. Bach, Michael I. Jordan

Abstract: Many types of regularization schemes have been employed in statistical learning, each motivated by some assumption about the problem domain. In this paper, we present a unified asymptotic analysis of smooth regularizers, which allows us to see how the validity of these assumptions impacts the success of a particular regularizer. In addition, our analysis motivates an algorithm for optimizing regularization parameters, which in turn can be analyzed within our framework. We apply our analysis to several examples, including hybrid generative-discriminative learning and multi-task learning.

6 0.73683846 243 nips-2009-The Ordered Residual Kernel for Robust Motion Subspace Clustering

7 0.73555219 225 nips-2009-Sparsistent Learning of Varying-coefficient Models with Structural Changes

8 0.73550171 158 nips-2009-Multi-Label Prediction via Sparse Infinite CCA

9 0.73342502 228 nips-2009-Speeding up Magnetic Resonance Image Acquisition by Bayesian Multi-Slice Adaptive Compressed Sensing

10 0.73179781 19 nips-2009-A joint maximum-entropy model for binary neural population patterns and continuous signals

11 0.73161119 162 nips-2009-Neural Implementation of Hierarchical Bayesian Inference by Importance Sampling

12 0.73086625 100 nips-2009-Gaussian process regression with Student-t likelihood

13 0.73065692 229 nips-2009-Statistical Analysis of Semi-Supervised Learning: The Limit of Infinite Unlabelled Data

14 0.73057789 174 nips-2009-Nonparametric Latent Feature Models for Link Prediction

15 0.73005027 97 nips-2009-Free energy score space

16 0.7288124 113 nips-2009-Improving Existing Fault Recovery Policies

17 0.72856176 195 nips-2009-Probabilistic Relational PCA

18 0.72847468 61 nips-2009-Convex Relaxation of Mixture Regression with Efficient Algorithms

19 0.72614491 169 nips-2009-Nonlinear Learning using Local Coordinate Coding

20 0.72528946 35 nips-2009-Approximating MAP by Compensating for Structural Relaxations