nips nips2009 nips2009-208 knowledge-graph by maker-knowledge-mining

208 nips-2009-Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices via Convex Optimization


Source: pdf

Author: John Wright, Arvind Ganesh, Shankar Rao, Yigang Peng, Yi Ma

Abstract: Principal component analysis is a fundamental operation in computational data analysis, with myriad applications ranging from web search to bioinformatics to computer vision and image analysis. However, its performance and applicability in real scenarios are limited by a lack of robustness to outlying or corrupted observations. This paper considers the idealized “robust principal component analysis” problem of recovering a low rank matrix A from corrupted observations D = A + E. Here, the corrupted entries E are unknown and the errors can be arbitrarily large (modeling grossly corrupted observations common in visual and bioinformatic data), but are assumed to be sparse. We prove that most matrices A can be efficiently and exactly recovered from most error sign-and-support patterns by solving a simple convex program, for which we give a fast and provably convergent algorithm. Our result holds even when the rank of A grows nearly proportionally (up to a logarithmic factor) to the dimensionality of the observation space and the number of errors E grows in proportion to the total number of entries in the matrix. A by-product of our analysis is the first proportional growth results for the related problem of completing a low-rank matrix from a small fraction of its entries. Simulations and real-data examples corroborate the theoretical results, and suggest potential applications in computer vision. 1

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 However, its performance and applicability in real scenarios are limited by a lack of robustness to outlying or corrupted observations. [sent-4, score-0.265]

2 This paper considers the idealized “robust principal component analysis” problem of recovering a low rank matrix A from corrupted observations D = A + E. [sent-5, score-0.745]

3 Here, the corrupted entries E are unknown and the errors can be arbitrarily large (modeling grossly corrupted observations common in visual and bioinformatic data), but are assumed to be sparse. [sent-6, score-0.716]

4 We prove that most matrices A can be efficiently and exactly recovered from most error sign-and-support patterns by solving a simple convex program, for which we give a fast and provably convergent algorithm. [sent-7, score-0.465]

5 Our result holds even when the rank of A grows nearly proportionally (up to a logarithmic factor) to the dimensionality of the observation space and the number of errors E grows in proportion to the total number of entries in the matrix. [sent-8, score-0.637]

6 A by-product of our analysis is the first proportional growth results for the related problem of completing a low-rank matrix from a small fraction of its entries. [sent-9, score-0.291]

7 In other words, if we stack all the observations as column vectors of a matrix M ∈ R^{m×n}, the matrix should be (approximately) low rank. [sent-16, score-0.267]

8 It enjoys a number of optimality properties when the data are only mildly corrupted by small noise, and can be stably and efficiently computed via the singular value decomposition. [sent-18, score-0.383]

9 One major shortcoming of classical PCA is its brittleness with respect to grossly corrupted or outlying observations [5]. [sent-25, score-0.354]

10 Gross errors are ubiquitous in modern applications in imaging and bioinformatics, where some measurements may be arbitrarily corrupted (e. [sent-26, score-0.338]

11 1 In this paper, we consider an idealization of the robust PCA problem, in which the goal is to recover a low-rank matrix A from highly corrupted measurements D = A + E. [sent-32, score-0.549]

12 The errors E can be arbitrary in magnitude, but are assumed to be sparsely supported, affecting only a fraction of the entries of D. [sent-33, score-0.303]

13 In that setting, classical PCA, computed via the singular value decomposition, remains optimal if the noise is Gaussian. [sent-35, score-0.122]

14 Here, on the other hand, even a small fraction of large errors can cause arbitrary corruption in PCA’s estimate of the low rank structure, A. [sent-36, score-0.518]

15 Our approach to robust PCA is motivated by two recent, and tightly related, lines of research. [sent-37, score-0.123]

16 The first set of results concerns the robust solution of over-determined linear systems of equations in the presence of arbitrary, but sparse errors. [sent-38, score-0.349]

17 These results imply that for generic systems of equations, it is possible to correct a constant fraction of arbitrary errors in polynomial time [11]. [sent-39, score-0.159]

18 A parallel (and still emerging) line of work concerns the problem of computing low-rank matrix solutions to underdetermined linear equations [12, 13]. [sent-41, score-0.32]

19 One of the most striking results concerns the exact completion of low-rank matrices from only a small fraction of their entries [13, 14, 15, 16]. [sent-42, score-0.55]

20 There, a similar convex relaxation is employed, replacing the highly non-convex matrix rank with the nuclear norm (or sum of singular values). [sent-43, score-0.838]
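
For reference, the nuclear norm in question is simply the sum of the singular values, so it can be evaluated with one SVD. A minimal numpy illustration (the example matrix is ours, not from the paper):

```python
import numpy as np

def nuclear_norm(A):
    """||A||_* : the sum of the singular values of A."""
    return np.linalg.svd(A, compute_uv=False).sum()

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5)) @ rng.standard_normal((5, 50))  # a rank-5 matrix
print(np.linalg.matrix_rank(A), nuclear_norm(A))
```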

21 The robust PCA problem outlined above combines aspects of both of these lines of work: we wish to recover a low-rank matrix from large but sparse errors. [sent-44, score-0.374]

22 This conclusion holds with high probability as the dimensionality m increases, implying that in high-dimensional observation spaces, sparse and low-rank structures can be efficiently and exactly separated. [sent-46, score-0.134]

23 However, this result would remain a theoretical curiosity without scalable algorithms for solving the associated convex program. [sent-48, score-0.201]

24 To this end, we discuss how a near-solution to this convex program can be obtained relatively efficiently via proximal gradient [18, 19] and iterative thresholding techniques, similar to those proposed for matrix completion in [20, 21]. [sent-49, score-0.694]

25 For large matrices, these algorithms are significantly faster and more scalable than general-purpose convex program solvers. [sent-50, score-0.246]

26 1 Random sampling approaches guarantee near-optimal estimates, but have complexity exponential in the rank of the matrix A0 . [sent-52, score-0.359]

27 2 A major difference between robust PCA and low-rank matrix completion is that here we do not know which entries are corrupted, whereas in matrix completion the support of the missing entries is given. [sent-54, score-1.019]

28 Section 2 formulates the robust principal component analysis problem more precisely and states the main results of this paper, placing these results in the context of existing work. [sent-57, score-0.19]

29 Section 3 extends existing proximal gradient techniques to give a simple, scalable algorithm for solving the robust PCA problem. [sent-59, score-0.308]

30 2 Problem Setting and Main Results We assume that the observed data matrix D ∈ R^{m×n} was generated by corrupting some of the entries of a low-rank matrix A ∈ R^{m×n}. [sent-62, score-0.335]

31 The corruption can be represented as an additive error E ∈ R^{m×n}, so that D = A + E. [sent-63, score-0.156]

32 Because the error affects only a portion of the entries of D, E is a sparse matrix. [sent-64, score-0.23]

33 The idealized (or noise-free) robust PCA problem can then be formulated as follows: Problem 2. [sent-65, score-0.184]

34 Given D = A + E, where A and E are unknown, but A is known to be low rank and E is known to be sparse, recover A. [sent-67, score-0.301]

35 This problem formulation immediately suggests a conceptual solution: seek the lowest rank A that could have generated the data, subject to the constraint that the errors are sparse: ‖E‖_0 ≤ k. [sent-68, score-0.327]

36 The Lagrangian reformulation of this optimization problem is min_{A,E} rank(A) + γ‖E‖_0 subj A + E = D. (1) [sent-69, score-0.142]

37 We can obtain a tractable optimization problem by relaxing (1), replacing the ℓ_0-norm with the ℓ_1-norm, and the rank with the nuclear norm ‖A‖_* = Σ_i σ_i(A), yielding the following convex surrogate: min_{A,E} ‖A‖_* + λ‖E‖_1 subj A + E = D. (2) [sent-72, score-0.706]

38 This relaxation can be motivated by observing that ‖A‖_* + λ‖E‖_1 is the convex envelope of rank(A) + λ‖E‖_0 over the set of (A, E) such that max(‖A‖_{2,2}, ‖E‖_{1,∞}) ≤ 1. [sent-73, score-0.154]

39 A sketch of the result is as follows: For “almost all” pairs (A0, E0) consisting of a low-rank matrix A0 and a sparse matrix E0, (A0, E0) = arg min_{A,E} ‖A‖_* + λ‖E‖_1 subj A + E = A0 + E0, and the minimizer is uniquely defined. [sent-76, score-0.496]

40 That is, under natural probabilistic models for low-rank and sparse matrices, almost all observations D = A0 + E0 generated as the sum of a low-rank matrix A0 and a sparse matrix E0 can be efficiently and exactly decomposed into their generating parts by solving a convex program. [sent-77, score-0.589]

41 From the optimality conditions for the convex program (2), it is not difficult to show that for matrices D ∈ R^{m×m}, the correct scaling is λ = O(m^{-1/2}). [sent-79, score-0.325]

42 In a sense, this problem subsumes both the low-rank matrix completion problem and the ℓ_0-minimization problem, both of which are NP-hard and hard to approximate [23]. [sent-82, score-0.586]

43 4 Notice that this is not an “equivalence” result for (1) and (2) – rather than asserting that the solutions of these two problems are equal with high probability, we directly prove that the convex program correctly decomposes D = A0 + E0 into (A0 , E0 ). [sent-83, score-0.296]

44 It should be clear that not all matrices A0 can be successfully recovered by solving the convex program (2). [sent-85, score-0.469]

45 Without additional prior knowledge, the low-rank matrix A = U S V∗ cannot be recovered from even a single gross error. [sent-89, score-0.317]

46 We consider a matrix A0 to be distributed according to the random orthogonal model of rank r if its left and right singular vectors are independent uniformly distributed m×r matrices with orthonormal columns. [sent-93, score-0.716]
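
A small sketch of sampling from this random orthogonal model (our own illustration: the orthonormal columns are obtained from QR factorizations of Gaussian matrices, and the nonzero singular values, which the model leaves arbitrary, are drawn uniformly here):

```python
import numpy as np

def sample_random_orthogonal_model(m, n, r, rng=np.random.default_rng(0)):
    """A0 = U diag(s) V^T with U (m x r) and V (n x r) having
    uniformly distributed orthonormal columns; s is arbitrary."""
    U, _ = np.linalg.qr(rng.standard_normal((m, r)))
    V, _ = np.linalg.qr(rng.standard_normal((n, r)))
    s = rng.uniform(1.0, 2.0, size=r)   # any nonzero values are allowed by the model
    return U @ np.diag(s) @ V.T

A0 = sample_random_orthogonal_model(200, 200, r=10)
```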

47 5 In this model, the nonzero singular values of A0 can be arbitrary. [sent-94, score-0.122]

48 Our model for errors is similarly natural: each entry of the matrix is independently corrupted with some probability ρs , and the signs of the corruptions are independent Rademacher random variables. [sent-95, score-0.492]

49 We consider an error matrix E0 to be drawn from the Bernoulli sign and support model with parameter ρs if the entries of sign(E0 ) are independently distributed, each taking on value 0 with probability 1 − ρs , and ±1 with probability ρs /2 each. [sent-98, score-0.3]
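
A matching sketch for the error model (the model fixes only the support and the signs; the unit magnitudes used here are an arbitrary choice for illustration, since the magnitudes are left free):

```python
import numpy as np

def sample_bernoulli_sign_support(m, n, rho_s, rng=np.random.default_rng(1)):
    """sign(E0): each entry is 0 w.p. 1 - rho_s and +/-1 w.p. rho_s/2 each."""
    support = rng.random((m, n)) < rho_s
    signs = rng.choice([-1.0, 1.0], size=(m, n))
    return support * signs

E0 = sample_bernoulli_sign_support(200, 200, rho_s=0.05)
```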

50 In other words, matrices A0 whose singular spaces are distributed according to the random orthogonal model can, with probability approaching one, be efficiently recovered from almost all corruption sign and support patterns without prior knowledge of the pattern of corruption. [sent-103, score-0.599]

51 Our line of analysis also implies strong results for the matrix completion problem studied in [13, 15, 14, 16]. [sent-104, score-0.341]

52 Contemporaneous results due to [25] show that for A0 distributed according to the random orthogonal model, and E0 with Bernoulli support, correct recovery occurs with high probability provided ‖E0‖_0 ≤ C m1. [sent-109, score-0.216]

53 However, even for constant rank r it guarantees correction of only a vanishing fraction o(m1. [sent-112, score-0.369]

54 4, states that even if r grows proportional to m/ log(m), non-vanishing fractions of errors are corrected with high probability. [sent-118, score-0.241]

55 Both analyses start from the optimality condition for the convex program (2). [sent-119, score-0.247]

56 This approach extends techniques used in [11, 26], with additional care required to handle an operator norm constraint arising from the presence of the nuclear norm in (2). [sent-121, score-0.302]

57 A similar result for spectral methods [14] gives exact completion from O(m log(m)) measurements when r = O(1). [sent-127, score-0.264]

58 3 Scalable Optimization for Robust PCA There are a number of possible approaches to solving the robust PCA semidefinite program (2). [sent-132, score-0.256]

59 However, off-the-shelf interior point solvers become impractical for data matrices larger than about 70 × 70, due to the O(m^6) complexity of solving for the step direction. [sent-134, score-0.122]

60 We encourage the interested reader to consult [18] for a more detailed explanation of the choice of the proximal points (Ãk, Ẽk), as well as a convergence proof ([18] Theorem 4. [sent-144, score-0.137]

61 Since the dominant cost of each iteration is computing the singular value decomposition, this means that it is often possible to obtain a provably robust PCA with only a constant factor more computational resources than required for conventional PCA. [sent-147, score-0.288]
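
Below is a minimal iterative-thresholding sketch in the spirit of the proximal gradient approach described above. It is not the authors' Algorithm 1 (there is no continuation on μ and no adaptive step size; the parameter choices are placeholder assumptions of ours), but it shows the two proximal operators around which such methods are built: singular value thresholding for the nuclear norm and entrywise soft-thresholding for the ℓ_1 norm, with one SVD per iteration as the dominant cost.

```python
import numpy as np

def soft(X, tau):
    """Entrywise soft-thresholding: the proximal operator of tau*||.||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding: the proximal operator of tau*||.||_*."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def robust_pca_sketch(D, lam=None, mu=None, n_iter=500):
    """Alternate prox steps on the relaxed objective
        ||A||_* + lam*||E||_1 + (1/(2*mu))*||D - A - E||_F^2.
    lam ~ 1/sqrt(m) follows the O(m^{-1/2}) scaling discussed above;
    mu and the fixed iteration count are heuristics of ours."""
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    if mu is None:
        mu = 0.1 * np.linalg.norm(D, ord=2)   # a fraction of the spectral norm
    A = np.zeros_like(D, dtype=float)
    E = np.zeros_like(D, dtype=float)
    for _ in range(n_iter):
        A = svt(D - E, mu)           # update the low-rank component
        E = soft(D - A, lam * mu)    # update the sparse component
    return A, E
```

Continuation, gradually decreasing μ toward zero over the iterations as in the methods cited above, drives the solution of this relaxed objective toward that of the constrained program (2).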

62 6 That work is similar in spirit to the work of [19], and has also applied to matrix completion in [21]. [sent-148, score-0.341]

63 We then sketch two computer vision applications involving the recovery of intrinsically low-dimensional data from gross corruption: background estimation from video and face subspace estimation under varying illumination. [sent-163, score-0.462]

64 We first demonstrate the exactness of the convex programming heuristic, as well as the efficacy of Algorithm 1, on random matrix examples of increasing dimension. [sent-165, score-0.23]

65 We generate E0 as a sparse matrix whose support is chosen uniformly at random, and whose non-zero entries are independent and uniformly distributed in the range . [sent-170, score-0.353]

66 We apply the proposed algorithm to the matrix D = A0 + E0 to recover A and E. [sent-172, score-0.17]
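
Putting the sketches above together, such a recovery experiment reads end to end as follows, with quality measured by the relative Frobenius error (one natural choice of metric for this kind of experiment):

```python
import numpy as np

D = A0 + E0                             # corrupted observation
A_hat, E_hat = robust_pca_sketch(D)     # solver sketched earlier
rel_err = np.linalg.norm(A_hat - A0, 'fro') / np.linalg.norm(A0, 'fro')
print(f"relative error in A: {rel_err:.2e}")
```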

67 Here the rank of the matrix grows in proportion (5%) to the dimensionality m, and the number of corrupted measurements grows in proportion to the number of entries m^2: 5% for the top and 10% for the bottom, respectively. [sent-193, score-0.967]

68 We next examine how the rank of A and the proportion of errors in E affect the performance of our algorithm. [sent-200, score-0.369]

69 We deem (A0 , E0 ) successfully recovered … Here, we use these intuitive examples and data to illustrate how our algorithm can be used as a simple, general tool to effectively separate low-dimensional and sparse structures occurring in real visual data. [sent-204, score-0.223]

70 White denotes perfect recovery in all experiments, and black denotes failure for all experiments. [sent-211, score-0.147]

71 We observe that there is a relatively sharp phase transition between success and failure of the algorithm roughly above the line ρr + ρs = 0.3. [sent-212, score-0.122]

72 Figure 1: Phase transition with respect to rank and error sparsity. [sent-231, score-0.33]
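
A sketch of how such a phase-transition map can be produced from the pieces sketched earlier (grid resolution, problem size, and the success threshold below are placeholder choices of ours):

```python
import numpy as np

m = 100
grid = np.linspace(0.02, 0.4, 10)
success = np.zeros((len(grid), len(grid)))
for i, rho_r in enumerate(grid):          # rank fraction: rank(A0) = rho_r * m
    for j, rho_s in enumerate(grid):      # fraction of corrupted entries
        A_true = sample_random_orthogonal_model(m, m, max(1, int(rho_r * m)))
        E_true = 100.0 * sample_bernoulli_sign_support(m, m, rho_s)  # large errors
        A_hat, _ = robust_pca_sketch(A_true + E_true)
        err = np.linalg.norm(A_hat - A_true, 'fro') / np.linalg.norm(A_true, 'fro')
        success[i, j] = float(err < 1e-2)
# `success` can then be rendered as a grayscale image (white = recovered, black = failed)
```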

73 Background modeling or subtraction from video sequences is a popular approach to detecting activity in the scene, and finds application in video surveillance from static cameras. [sent-237, score-0.306]

74 Background estimation is complicated by the presence of foreground objects such as people, as well as variability in the background itself, for example due to varying illumination. [sent-238, score-0.206]

75 In many cases, however, it is reasonable to assume that these background variations are low-rank, while the foreground activity is spatially localized, and therefore sparse. [sent-239, score-0.226]

76 If the individual frames are stacked as columns of a matrix D, this matrix can be expressed as the sum of a low-rank background matrix and a sparse error matrix representing the activity in the scene. [sent-240, score-0.819]
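
A sketch of that arrangement (frame loading and sizes are placeholders; `robust_pca_sketch` stands in for whichever RPCA solver is used):

```python
import numpy as np

def separate_background(frames):
    """frames: list of grayscale images with identical shape (h, w).
    Returns the low-rank background and the sparse activity, per frame."""
    h, w = frames[0].shape
    D = np.column_stack([f.ravel().astype(float) for f in frames])  # one column per frame
    A, E = robust_pca_sketch(D)
    backgrounds = [A[:, k].reshape(h, w) for k in range(A.shape[1])]
    activity = [E[:, k].reshape(h, w) for k in range(E.shape[1])]
    return backgrounds, activity
```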

77 In Figure 2(a)-(c), the video sequence consists of 200 frames of a scene in an airport. [sent-242, score-0.231]

78 There is no significant change in illumination in the video, but a lot of activity in the foreground. [sent-243, score-0.201]

79 In Figure 2(d)-(f), we have 550 frames from a scene in a lobby. [sent-245, score-0.128]

80 There is little activity in the video, but the illumination changes drastically towards the end of the sequence. [sent-246, score-0.201]

81 We see that our algorithm is once again able to recover the background, irrespective of the illumination change. [sent-247, score-0.199]

82 The key observation is that under certain idealized circumstances, images of the same face under varying illumination lie near an approximately nine-dimensional linear subspace known as the harmonic plane. [sent-250, score-0.36]

83 However, since faces are neither perfectly convex nor Lambertian, face images taken under directional illumination often suffer from self-shadowing, specularities, or saturations in brightness. [sent-251, score-0.488]

84 Given a matrix D whose columns represent well-aligned training images of a person’s face under various illumination conditions, our Robust PCA algorithm offers a principled way of removing such spatially localized artifacts. [sent-252, score-0.376]

85 The proposed algorithm removes the specularities in the eyes and the shadows around the nose region. [sent-254, score-0.277]

86 This technique is potentially useful for pre-processing training images in face recognition systems to remove such deviations from the low-dimensional linear model. [sent-255, score-0.119]

87 5 Discussion and Future Work Our results give strong theoretical and empirical evidence for the efficacy of using convex programming to recover low-rank matrices from corrupted observations. [sent-256, score-0.469]

88 (c) Sparse error recovered by our algorithm represents activity in the frame. [sent-262, score-0.242]

89 The background is correctly recovered even when the illumination in the room changes drastically in the frame on the last row. [sent-267, score-0.434]

90 Figure 3: Removing shadows and specularities from face images. [sent-268, score-0.319]

91 (a) Cropped and aligned images of a person’s face under different illuminations from the Extended Yale B database. [sent-269, score-0.169]

92 (c) The sparse errors returned by our algorithm correspond to specularities in the eyes, shadows around the nose region, or brightness saturations on the face. [sent-272, score-0.497]

93 The phase transition experiment in Section 4 suggests that convex programming actually succeeds even for rank(A0) < ρr m and ‖E0‖_0 < ρs m^2, where ρr and ρs are sufficiently small positive constants. [sent-274, score-0.238]

94 Another interesting and important question is whether the recovery is stable in the presence of small dense noise. [sent-275, score-0.148]

95 For matrix completion, [16] showed that a similar relaxation gives stable recovery – the error in the solution is proportional to the noise level. [sent-280, score-0.359]

96 Robust l1 norm factorization in the presence of outliers and missing data by alternative convex programming. [sent-325, score-0.215]

97 Guaranteed minimum rank solution of matrix equations via nuclear norm minimization. [sent-344, score-0.608]

98 An accelerated proximal gradient algorithm for nuclear norm regularized least squares problems. [sent-392, score-0.303]

99 Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization. [sent-403, score-0.59]

100 Fast convex optimization algorithms for exact recovery of a corrupted low-rank matrix. [sent-433, score-0.445]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('ek', 0.344), ('ak', 0.249), ('rank', 0.245), ('completion', 0.227), ('corrupted', 0.219), ('pca', 0.18), ('illumination', 0.143), ('subj', 0.142), ('recovered', 0.142), ('nuclear', 0.142), ('specularities', 0.124), ('robust', 0.123), ('singular', 0.122), ('rm', 0.121), ('convex', 0.116), ('corruption', 0.114), ('shadows', 0.114), ('matrix', 0.114), ('recovery', 0.11), ('entries', 0.107), ('background', 0.107), ('video', 0.103), ('proximal', 0.1), ('program', 0.089), ('errors', 0.082), ('face', 0.081), ('sparse', 0.081), ('matrices', 0.078), ('signs', 0.077), ('fraction', 0.077), ('tk', 0.077), ('frames', 0.075), ('wright', 0.071), ('ganesh', 0.068), ('principal', 0.067), ('candes', 0.065), ('norm', 0.061), ('idealized', 0.061), ('concerns', 0.061), ('foreground', 0.061), ('gross', 0.061), ('submitted', 0.06), ('activity', 0.058), ('underdetermined', 0.058), ('corroborating', 0.057), ('saturations', 0.057), ('stiefel', 0.057), ('recover', 0.056), ('orthogonal', 0.055), ('proportional', 0.055), ('bernoulli', 0.054), ('grows', 0.054), ('yk', 0.053), ('dimensionality', 0.053), ('scene', 0.053), ('perfectly', 0.053), ('preprint', 0.052), ('distributed', 0.051), ('illuminations', 0.05), ('lambertian', 0.05), ('fractions', 0.05), ('asserting', 0.05), ('grossly', 0.05), ('bioinformatics', 0.05), ('thresholding', 0.048), ('correction', 0.047), ('equations', 0.046), ('outlying', 0.046), ('continuation', 0.046), ('trimming', 0.046), ('cm', 0.045), ('growth', 0.045), ('uniquely', 0.045), ('solving', 0.044), ('transition', 0.043), ('recovers', 0.043), ('provably', 0.043), ('proportion', 0.042), ('phase', 0.042), ('frame', 0.042), ('static', 0.042), ('optimality', 0.042), ('error', 0.042), ('solutions', 0.041), ('scalable', 0.041), ('observations', 0.039), ('yale', 0.039), ('wm', 0.039), ('nose', 0.039), ('presence', 0.038), ('relaxation', 0.038), ('images', 0.038), ('sign', 0.037), ('lie', 0.037), ('reader', 0.037), ('failure', 0.037), ('succeeds', 0.037), ('circumstances', 0.037), ('affecting', 0.037), ('measurements', 0.037)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000004 208 nips-2009-Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices via Convex Optimization

Author: John Wright, Arvind Ganesh, Shankar Rao, Yigang Peng, Yi Ma

Abstract: Principal component analysis is a fundamental operation in computational data analysis, with myriad applications ranging from web search to bioinformatics to computer vision and image analysis. However, its performance and applicability in real scenarios are limited by a lack of robustness to outlying or corrupted observations. This paper considers the idealized “robust principal component analysis” problem of recovering a low rank matrix A from corrupted observations D = A + E. Here, the corrupted entries E are unknown and the errors can be arbitrarily large (modeling grossly corrupted observations common in visual and bioinformatic data), but are assumed to be sparse. We prove that most matrices A can be efficiently and exactly recovered from most error sign-and-support patterns by solving a simple convex program, for which we give a fast and provably convergent algorithm. Our result holds even when the rank of A grows nearly proportionally (up to a logarithmic factor) to the dimensionality of the observation space and the number of errors E grows in proportion to the total number of entries in the matrix. A by-product of our analysis is the first proportional growth results for the related problem of completing a low-rank matrix from a small fraction of its entries. Simulations and real-data examples corroborate the theoretical results, and suggest potential applications in computer vision. 1

2 0.25205579 147 nips-2009-Matrix Completion from Noisy Entries

Author: Raghunandan Keshavan, Andrea Montanari, Sewoong Oh

Abstract: Given a matrix M of low-rank, we consider the problem of reconstructing it from noisy observations of a small, random subset of its entries. The problem arises in a variety of applications, from collaborative filtering (the ‘Netflix problem’) to structure-from-motion and positioning. We study a low complexity algorithm introduced in [1], based on a combination of spectral techniques and manifold optimization, that we call here OptSpace. We prove performance guarantees that are order-optimal in a number of circumstances. 1

3 0.13742086 20 nips-2009-A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers

Author: Sahand Negahban, Bin Yu, Martin J. Wainwright, Pradeep K. Ravikumar

Abstract: High-dimensional statistical inference deals with models in which the number of parameters p is comparable to or larger than the sample size n. Since it is usually impossible to obtain consistent procedures unless p/n → 0, a line of recent work has studied models with various types of structure (e.g., sparse vectors; block-structured matrices; low-rank matrices; Markov assumptions). In such settings, a general approach to estimation is to solve a regularized convex program (known as a regularized M-estimator) which combines a loss function (measuring how well the model fits the data) with some regularization function that encourages the assumed structure. The goal of this paper is to provide a unified framework for establishing consistency and convergence rates for such regularized M-estimators under high-dimensional scaling. We state one main theorem and show how it can be used to re-derive several existing results, and also to obtain several new results on consistency and convergence rates. Our analysis also identifies two key properties of loss and regularization functions, referred to as restricted strong convexity and decomposability, that ensure the corresponding regularized M-estimators have fast convergence rates. 1

4 0.12656125 157 nips-2009-Multi-Label Prediction via Compressed Sensing

Author: John Langford, Tong Zhang, Daniel J. Hsu, Sham M. Kakade

Abstract: We consider multi-label prediction problems with large output spaces under the assumption of output sparsity – that the target (label) vectors have small support. We develop a general theory for a variant of the popular error correcting output code scheme, using ideas from compressed sensing for exploiting this sparsity. The method can be regarded as a simple reduction from multi-label regression problems to binary regression problems. We show that the number of subproblems need only be logarithmic in the total number of possible labels, making this approach radically more efficient than others. We also state and prove robustness guarantees for this method in the form of regret transform bounds (in general), and also provide a more detailed analysis for the linear prediction setting. 1

5 0.11829069 148 nips-2009-Matrix Completion from Power-Law Distributed Samples

Author: Raghu Meka, Prateek Jain, Inderjit S. Dhillon

Abstract: The low-rank matrix completion problem is a fundamental problem with many important applications. Recently, [4],[13] and [5] obtained the first non-trivial theoretical results for the problem assuming that the observed entries are sampled uniformly at random. Unfortunately, most real-world datasets do not satisfy this assumption, but instead exhibit power-law distributed samples. In this paper, we propose a graph theoretic approach to matrix completion that solves the problem for more realistic sampling models. Our method is simpler to analyze than previous methods with the analysis reducing to computing the threshold for complete cascades in random graphs, a problem of independent interest. By analyzing the graph theoretic problem, we show that our method achieves exact recovery when the observed entries are sampled from the Chung-Lu-Vu model, which can generate power-law distributed graphs. We also hypothesize that our algorithm solves the matrix completion problem from an optimal number of entries for the popular preferential attachment model and provide strong empirical evidence for the claim. Furthermore, our method is easy to implement and is substantially faster than existing methods. We demonstrate the effectiveness of our method on random instances where the low-rank matrix is sampled according to the prevalent random graph models for complex networks and present promising preliminary results on the Netflix challenge dataset. 1

6 0.11211944 137 nips-2009-Learning transport operators for image manifolds

7 0.10400365 224 nips-2009-Sparse and Locally Constant Gaussian Graphical Models

8 0.10264356 1 nips-2009-$L 1$-Penalized Robust Estimation for a Class of Inverse Problems Arising in Multiview Geometry

9 0.097094685 61 nips-2009-Convex Relaxation of Mixture Regression with Efficient Algorithms

10 0.092064753 252 nips-2009-Unsupervised Feature Selection for the $k$-means Clustering Problem

11 0.090937361 245 nips-2009-Thresholding Procedures for High Dimensional Variable Selection and Statistical Estimation

12 0.088363215 211 nips-2009-Segmenting Scenes by Matching Image Composites

13 0.082748324 222 nips-2009-Sparse Estimation Using General Likelihoods and Non-Factorial Priors

14 0.078598194 46 nips-2009-Bilinear classifiers for visual recognition

15 0.078587636 40 nips-2009-Bayesian Nonparametric Models on Decomposable Graphs

16 0.076366618 79 nips-2009-Efficient Recovery of Jointly Sparse Vectors

17 0.073144279 201 nips-2009-Region-based Segmentation and Object Detection

18 0.070030704 116 nips-2009-Information-theoretic lower bounds on the oracle complexity of convex optimization

19 0.066945553 228 nips-2009-Speeding up Magnetic Resonance Image Acquisition by Bayesian Multi-Slice Adaptive Compressed Sensing

20 0.065894447 185 nips-2009-Orthogonal Matching Pursuit From Noisy Random Measurements: A New Analysis


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.249), (1, 0.065), (2, -0.028), (3, 0.038), (4, -0.018), (5, 0.005), (6, 0.074), (7, -0.142), (8, 0.21), (9, 0.001), (10, -0.009), (11, -0.008), (12, 0.012), (13, 0.065), (14, -0.064), (15, -0.037), (16, 0.023), (17, -0.055), (18, -0.16), (19, 0.054), (20, -0.039), (21, 0.018), (22, 0.091), (23, 0.006), (24, 0.004), (25, -0.12), (26, -0.004), (27, -0.214), (28, -0.023), (29, -0.053), (30, -0.029), (31, -0.087), (32, 0.005), (33, -0.002), (34, 0.111), (35, -0.034), (36, 0.071), (37, -0.089), (38, 0.077), (39, 0.074), (40, -0.179), (41, 0.099), (42, 0.06), (43, -0.088), (44, 0.035), (45, 0.12), (46, -0.004), (47, -0.135), (48, 0.045), (49, 0.094)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96743888 208 nips-2009-Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices via Convex Optimization

Author: John Wright, Arvind Ganesh, Shankar Rao, Yigang Peng, Yi Ma

Abstract: Principal component analysis is a fundamental operation in computational data analysis, with myriad applications ranging from web search to bioinformatics to computer vision and image analysis. However, its performance and applicability in real scenarios are limited by a lack of robustness to outlying or corrupted observations. This paper considers the idealized “robust principal component analysis” problem of recovering a low rank matrix A from corrupted observations D = A + E. Here, the corrupted entries E are unknown and the errors can be arbitrarily large (modeling grossly corrupted observations common in visual and bioinformatic data), but are assumed to be sparse. We prove that most matrices A can be efficiently and exactly recovered from most error sign-and-support patterns by solving a simple convex program, for which we give a fast and provably convergent algorithm. Our result holds even when the rank of A grows nearly proportionally (up to a logarithmic factor) to the dimensionality of the observation space and the number of errors E grows in proportion to the total number of entries in the matrix. A by-product of our analysis is the first proportional growth results for the related problem of completing a low-rank matrix from a small fraction of its entries. Simulations and real-data examples corroborate the theoretical results, and suggest potential applications in computer vision. 1

2 0.88418251 147 nips-2009-Matrix Completion from Noisy Entries

Author: Raghunandan Keshavan, Andrea Montanari, Sewoong Oh

Abstract: Given a matrix M of low-rank, we consider the problem of reconstructing it from noisy observations of a small, random subset of its entries. The problem arises in a variety of applications, from collaborative filtering (the ‘Netflix problem’) to structure-from-motion and positioning. We study a low complexity algorithm introduced in [1], based on a combination of spectral techniques and manifold optimization, that we call here OptSpace. We prove performance guarantees that are order-optimal in a number of circumstances. 1

3 0.67721379 148 nips-2009-Matrix Completion from Power-Law Distributed Samples

Author: Raghu Meka, Prateek Jain, Inderjit S. Dhillon

Abstract: The low-rank matrix completion problem is a fundamental problem with many important applications. Recently, [4],[13] and [5] obtained the first non-trivial theoretical results for the problem assuming that the observed entries are sampled uniformly at random. Unfortunately, most real-world datasets do not satisfy this assumption, but instead exhibit power-law distributed samples. In this paper, we propose a graph theoretic approach to matrix completion that solves the problem for more realistic sampling models. Our method is simpler to analyze than previous methods with the analysis reducing to computing the threshold for complete cascades in random graphs, a problem of independent interest. By analyzing the graph theoretic problem, we show that our method achieves exact recovery when the observed entries are sampled from the Chung-Lu-Vu model, which can generate power-law distributed graphs. We also hypothesize that our algorithm solves the matrix completion problem from an optimal number of entries for the popular preferential attachment model and provide strong empirical evidence for the claim. Furthermore, our method is easy to implement and is substantially faster than existing methods. We demonstrate the effectiveness of our method on random instances where the low-rank matrix is sampled according to the prevalent random graph models for complex networks and present promising preliminary results on the Netflix challenge dataset. 1

4 0.57971698 252 nips-2009-Unsupervised Feature Selection for the $k$-means Clustering Problem

Author: Christos Boutsidis, Petros Drineas, Michael W. Mahoney

Abstract: We present a novel feature selection algorithm for the k-means clustering problem. Our algorithm is randomized and, assuming an accuracy parameter ϵ ∈ (0, 1), selects and appropriately rescales in an unsupervised manner Θ(k log(k/ϵ)/ϵ2 ) features from a dataset of arbitrary dimensions. We prove that, if we run any γ-approximate k-means algorithm (γ ≥ 1) on the features selected using our method, we can find a (1 + (1 + ϵ)γ)-approximate partition with high probability. 1

5 0.56639361 20 nips-2009-A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers

Author: Sahand Negahban, Bin Yu, Martin J. Wainwright, Pradeep K. Ravikumar

Abstract: High-dimensional statistical inference deals with models in which the number of parameters p is comparable to or larger than the sample size n. Since it is usually impossible to obtain consistent procedures unless p/n → 0, a line of recent work has studied models with various types of structure (e.g., sparse vectors; block-structured matrices; low-rank matrices; Markov assumptions). In such settings, a general approach to estimation is to solve a regularized convex program (known as a regularized M-estimator) which combines a loss function (measuring how well the model fits the data) with some regularization function that encourages the assumed structure. The goal of this paper is to provide a unified framework for establishing consistency and convergence rates for such regularized M-estimators under high-dimensional scaling. We state one main theorem and show how it can be used to re-derive several existing results, and also to obtain several new results on consistency and convergence rates. Our analysis also identifies two key properties of loss and regularization functions, referred to as restricted strong convexity and decomposability, that ensure the corresponding regularized M-estimators have fast convergence rates. 1

6 0.56067955 79 nips-2009-Efficient Recovery of Jointly Sparse Vectors

7 0.56044197 1 nips-2009-$L 1$-Penalized Robust Estimation for a Class of Inverse Problems Arising in Multiview Geometry

8 0.549133 185 nips-2009-Orthogonal Matching Pursuit From Noisy Random Measurements: A New Analysis

9 0.54047275 224 nips-2009-Sparse and Locally Constant Gaussian Graphical Models

10 0.50419408 245 nips-2009-Thresholding Procedures for High Dimensional Variable Selection and Statistical Estimation

11 0.4954128 180 nips-2009-On the Convergence of the Concave-Convex Procedure

12 0.49009746 33 nips-2009-Analysis of SVM with Indefinite Kernels

13 0.47404793 61 nips-2009-Convex Relaxation of Mixture Regression with Efficient Algorithms

14 0.47212762 138 nips-2009-Learning with Compressible Priors

15 0.46471128 81 nips-2009-Ensemble Nystrom Method

16 0.45895875 46 nips-2009-Bilinear classifiers for visual recognition

17 0.43851432 157 nips-2009-Multi-Label Prediction via Compressed Sensing

18 0.43072224 137 nips-2009-Learning transport operators for image manifolds

19 0.42448515 36 nips-2009-Asymptotic Analysis of MAP Estimation via the Replica Method and Compressed Sensing

20 0.4146013 222 nips-2009-Sparse Estimation Using General Likelihoods and Non-Factorial Priors


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(24, 0.022), (25, 0.074), (35, 0.034), (36, 0.081), (39, 0.026), (57, 0.012), (58, 0.571), (71, 0.036), (81, 0.017), (86, 0.045), (91, 0.01)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.98883015 43 nips-2009-Bayesian estimation of orientation preference maps

Author: Sebastian Gerwinn, Leonard White, Matthias Kaschube, Matthias Bethge, Jakob H. Macke

Abstract: Imaging techniques such as optical imaging of intrinsic signals, 2-photon calcium imaging and voltage sensitive dye imaging can be used to measure the functional organization of visual cortex across different spatial and temporal scales. Here, we present Bayesian methods based on Gaussian processes for extracting topographic maps from functional imaging data. In particular, we focus on the estimation of orientation preference maps (OPMs) from intrinsic signal imaging data. We model the underlying map as a bivariate Gaussian process, with a prior covariance function that reflects known properties of OPMs, and a noise covariance adjusted to the data. The posterior mean can be interpreted as an optimally smoothed estimate of the map, and can be used for model based interpolations of the map from sparse measurements. By sampling from the posterior distribution, we can get error bars on statistical properties such as preferred orientations, pinwheel locations or pinwheel counts. Finally, the use of an explicit probabilistic model facilitates interpretation of parameters and quantitative model comparisons. We demonstrate our model both on simulated data and on intrinsic signaling data from ferret visual cortex. 1

same-paper 2 0.97903502 208 nips-2009-Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices via Convex Optimization

Author: John Wright, Arvind Ganesh, Shankar Rao, Yigang Peng, Yi Ma

Abstract: Principal component analysis is a fundamental operation in computational data analysis, with myriad applications ranging from web search to bioinformatics to computer vision and image analysis. However, its performance and applicability in real scenarios are limited by a lack of robustness to outlying or corrupted observations. This paper considers the idealized “robust principal component analysis” problem of recovering a low rank matrix A from corrupted observations D = A + E. Here, the corrupted entries E are unknown and the errors can be arbitrarily large (modeling grossly corrupted observations common in visual and bioinformatic data), but are assumed to be sparse. We prove that most matrices A can be efficiently and exactly recovered from most error sign-and-support patterns by solving a simple convex program, for which we give a fast and provably convergent algorithm. Our result holds even when the rank of A grows nearly proportionally (up to a logarithmic factor) to the dimensionality of the observation space and the number of errors E grows in proportion to the total number of entries in the matrix. A by-product of our analysis is the first proportional growth results for the related problem of completing a low-rank matrix from a small fraction of its entries. Simulations and real-data examples corroborate the theoretical results, and suggest potential applications in computer vision. 1

3 0.96055448 257 nips-2009-White Functionals for Anomaly Detection in Dynamical Systems

Author: Marco Cuturi, Jean-philippe Vert, Alexandre D'aspremont

Abstract: We propose new methodologies to detect anomalies in discrete-time processes taking values in a probability space. These methods are based on the inference of functionals whose evaluations on successive states visited by the process are stationary and have low autocorrelations. Deviations from this behavior are used to flag anomalies. The candidate functionals are estimated in a subspace of a reproducing kernel Hilbert space associated with the original probability space considered. We provide experimental results on simulated datasets which show that these techniques compare favorably with other algorithms.

4 0.95837426 67 nips-2009-Directed Regression

Author: Yi-hao Kao, Benjamin V. Roy, Xiang Yan

Abstract: When used to guide decisions, linear regression analysis typically involves estimation of regression coefficients via ordinary least squares and their subsequent use to make decisions. When there are multiple response variables and features do not perfectly capture their relationships, it is beneficial to account for the decision objective when computing regression coefficients. Empirical optimization does so but sacrifices performance when features are well-chosen or training data are insufficient. We propose directed regression, an efficient algorithm that combines merits of ordinary least squares and empirical optimization. We demonstrate through a computational study that directed regression can generate significant performance gains over either alternative. We also develop a theory that motivates the algorithm. 1

5 0.94622654 223 nips-2009-Sparse Metric Learning via Smooth Optimization

Author: Yiming Ying, Kaizhu Huang, Colin Campbell

Abstract: In this paper we study the problem of learning a low-rank (sparse) distance matrix. We propose a novel metric learning model which can simultaneously conduct dimension reduction and learn a distance matrix. The sparse representation involves a mixed-norm regularization which is non-convex. We then show that it can be equivalently formulated as a convex saddle (min-max) problem. From this saddle representation, we develop an efficient smooth optimization approach [17] for sparse metric learning, although the learning model is based on a nondifferentiable loss function. Finally, we run experiments to validate the effectiveness and efficiency of our sparse metric learning model on various datasets.

6 0.93730104 81 nips-2009-Ensemble Nystrom Method

7 0.77230209 36 nips-2009-Asymptotic Analysis of MAP Estimation via the Replica Method and Compressed Sensing

8 0.77222949 147 nips-2009-Matrix Completion from Noisy Entries

9 0.74375463 177 nips-2009-On Learning Rotations

10 0.73295212 243 nips-2009-The Ordered Residual Kernel for Robust Motion Subspace Clustering

11 0.7258209 254 nips-2009-Variational Gaussian-process factor analysis for modeling spatio-temporal data

12 0.69984066 62 nips-2009-Correlation Coefficients are Insufficient for Analyzing Spike Count Dependencies

13 0.69782841 228 nips-2009-Speeding up Magnetic Resonance Image Acquisition by Bayesian Multi-Slice Adaptive Compressed Sensing

14 0.6977399 20 nips-2009-A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers

15 0.69401091 224 nips-2009-Sparse and Locally Constant Gaussian Graphical Models

16 0.69158655 61 nips-2009-Convex Relaxation of Mixture Regression with Efficient Algorithms

17 0.69000125 22 nips-2009-Accelerated Gradient Methods for Stochastic Optimization and Online Learning

18 0.68344015 108 nips-2009-Heterogeneous multitask learning with joint sparsity constraints

19 0.68272972 202 nips-2009-Regularized Distance Metric Learning:Theory and Algorithm

20 0.68232614 163 nips-2009-Neurometric function analysis of population codes