
297 nips-2011-Universal low-rank matrix recovery from Pauli measurements


Source: pdf

Author: Yi-Kai Liu

Abstract: We study the problem of reconstructing an unknown matrix M of rank r and dimension d using O(rd poly log d) Pauli measurements. This has applications in quantum state tomography, and is a non-commutative analogue of a well-known problem in compressed sensing: recovering a sparse vector from a few of its Fourier coefficients. We show that almost all sets of O(rd log^6 d) Pauli measurements satisfy the rank-r restricted isometry property (RIP). This implies that M can be recovered from a fixed (“universal”) set of Pauli measurements, using nuclear-norm minimization (e.g., the matrix Lasso), with nearly-optimal bounds on the error. A similar result holds for any class of measurements that use an orthonormal operator basis whose elements have small operator norm. Our proof uses Dudley’s inequality for Gaussian processes, together with bounds on covering numbers obtained via entropy duality.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Universal low-rank matrix recovery from Pauli measurements Yi-Kai Liu Applied and Computational Mathematics Division National Institute of Standards and Technology Gaithersburg, MD, USA yi-kai. [sent-1, score-0.345]

2 Abstract: We study the problem of reconstructing an unknown matrix M of rank r and dimension d using O(rd poly log d) Pauli measurements. [sent-3, score-0.231]

3 This has applications in quantum state tomography, and is a non-commutative analogue of a well-known problem in compressed sensing: recovering a sparse vector from a few of its Fourier coefficients. [sent-4, score-0.362]

4 We show that almost all sets of O(rd log^6 d) Pauli measurements satisfy the rank-r restricted isometry property (RIP). [sent-5, score-0.295]

5 A similar result holds for any class of measurements that use an orthonormal operator basis whose elements have small operator norm. [sent-9, score-0.48]

6 Our proof uses Dudley’s inequality for Gaussian processes, together with bounds on covering numbers obtained via entropy duality. [sent-10, score-0.251]

7 1 Introduction Low-rank matrix recovery is the following problem: let M be some unknown matrix of dimension d and rank r ≪ d, and let A1, A2, … [sent-11, score-0.328]

8 , Am be a set of measurement matrices; then can one reconstruct M from its inner products tr(M*A1), tr(M*A2), … [sent-14, score-0.111]

9 Remarkably, it turns out that for many useful choices of measurement matrices, low-rank matrix recovery is possible, and can even be done efficiently. [sent-21, score-0.213]

10 For example, when the Ai are Gaussian random matrices, then it is known that m = O(rd) measurements are sufficient to uniquely determine M , and furthermore, M can be reconstructed by solving a convex program (minimizing the nuclear norm) [3, 4, 5]. [sent-22, score-0.341]
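
To make this measurement model concrete, here is a minimal numpy sketch (ours, not the paper's code; all names are illustrative) of the linear map taking M to its inner products against Gaussian measurement matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 16, 2
m = 8 * r * d                       # m = O(rd) measurements

# Unknown rank-r matrix M (real-valued here, for simplicity).
M = rng.standard_normal((d, r)) @ rng.standard_normal((r, d))

# Gaussian measurement matrices A_1, ..., A_m; data y_i = <A_i, M>.
A = rng.standard_normal((m, d, d))
y = np.einsum('kij,ij->k', A, M)    # y[k] = tr(A_k^T M)
print(y.shape)                      # (m,)
```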

11 Another example is the “matrix completion” problem, where the measurements return a random subset of matrix elements of M ; in this case, m = O(rd poly log d) measurements suffice, provided that M satisfies some “incoherence” conditions [6, 7, 8, 9, 10]. [sent-23, score-0.547]

12 The Pauli basis is a non-commutative analogue of the Fourier basis in C^d; thus, low-rank matrix recovery using Pauli measurements can be viewed as a generalization of the idea of compressed sensing of sparse vectors using their Fourier coefficients [11, 12]. [sent-26, score-0.536]

13 In addition, this problem has applications in quantum state tomography, the task of learning an unknown quantum state by performing measurements [13]. [sent-27, score-0.781]

14 This is because most quantum states of physical interest are accurately described by density matrices that have low rank; and Pauli measurements are especially easy to carry out in an experiment (due to the tensor product structure of the Pauli basis). [sent-28, score-0.546]

15 1 In this paper we show stronger results on low-rank matrix recovery from Pauli measurements. [sent-29, score-0.181]

16 Previously [13, 8], it was known that, for every rank-r matrix M ∈ Cd×d , almost all choices of m = O(rd poly log d) random Pauli measurements will lead to successful recovery of M . [sent-30, score-0.45]

17 Here we show a stronger statement: there is a fixed (“universal”) set of m = O(rd poly log d) Pauli measurements, such that for all rank-r matrices M ∈ C^{d×d}, we have successful recovery. [sent-31, score-0.209]

18 We do this by showing that the random Pauli sampling operator obeys the “restricted isometry property” (RIP). [sent-32, score-0.202]

19 Intuitively, RIP says that the sampling operator is an approximate isometry, acting on the set of all low-rank matrices. [sent-33, score-0.124]

20 In geometric terms, it says that the sampling operator embeds the manifold of low-rank matrices into O(rd poly log d) dimensions, with low distortion in the 2-norm. [sent-34, score-0.31]

21 RIP for low-rank matrices is a very strong property, and prior to this work, it was only known to hold for very unstructured types of random measurements, such as Gaussian measurements [3], which are unsuitable for most applications. [sent-35, score-0.268]

22 RIP was known to fail in the matrix completion case, and whether it held for Pauli measurements was an open question. [sent-36, score-0.317]

23 Once we have established RIP for Pauli measurements, we can use known results [3, 4, 5] to show low-rank matrix recovery from a universal set of Pauli measurements. [sent-37, score-0.256]

24 In particular, using [5], we can get nearly-optimal universal bounds on the error of the reconstructed density matrix, when the data are noisy; and we can even get bounds on the recovery of arbitrary (not necessarily low-rank) matrices. [sent-38, score-0.3]

25 In the context of quantum state tomography, this implies that, given a quantum state that consists of a low-rank component Mr plus a residual full-rank component Mc, we can reconstruct Mr up to an error that is not much larger than ‖Mc‖. [sent-40, score-0.621]

26 In particular, let ‖·‖_* denote the nuclear norm, and let ‖·‖_F denote the Frobenius norm. [sent-41, score-0.174]

27 Then the error can be bounded in the nuclear norm by O(‖Mc‖_*) (assuming noiseless data), and it can be bounded in the Frobenius norm by O(‖Mc‖_F poly log d) (which holds even with noisy data²). [sent-42, score-0.411]

28 In addition, a completely arbitrary quantum state can be reconstructed up to an error of O(1/√r) in Frobenius norm. [sent-44, score-0.321]

29 Essentially, one can replace the Pauli basis with any orthonormal basis of C^{d×d} that is incoherent, i. [sent-47, score-0.149]

30 , whose elements have small operator norm (of order O(1/√d), say); a similar generalization was noted in the earlier results of [8]. [sent-49, score-0.184]

31 Also, our proof shows that the RIP actually holds in a slightly stronger sense: it holds not just for all rank-r matrices, but for all matrices X that satisfy ‖X‖_* ≤ √r ‖X‖_F. [sent-50, score-0.126]

32 RIP results were previously known for Gaussian measurements and some of their close relatives [3]. [sent-52, score-0.187]

33 Also, restricted strong convexity (RSC), a similar but somewhat weaker property, was recently shown in the context of the matrix completion problem (with additional “non-spikiness” conditions) [10]. [sent-53, score-0.186]

34 , using a concentration inequality to upper-bound the failure probability on each individual low-rank matrix X, and then taking the union bound over all such X). [sent-56, score-0.136]

35 Showing RIP for Pauli measurements seems to be more delicate, however. [sent-57, score-0.187]

36 Pauli measurements have more structure and less randomness, so the concentration of measure phenomena are weaker, and the union bound no longer gives the desired result. [sent-58, score-0.234]

37 Instead, one must take into account the favorable correlations between the behavior of the sampling operator on different matrices — intuitively, if two low-rank matrices M and M′ have overlapping supports, then good behavior on M is positively correlated with good behavior on M′. [sent-59, score-0.286]

38 This is the same approach used in classical compressed sensing, to show RIP for Fourier measurements [12, 11]. [sent-61, score-0.231]

39 To bound the correlations in this process, one then needs to bound the covering numbers of the nuclear norm ball (of matrices), rather than the ℓ1 ball (of vectors). [sent-63, score-0.542]

40 Note that in the universal result, m is slightly larger, by a factor of poly log d. [sent-64, score-0.203]

41 ) As a side note, we remark that matrix recovery can sometimes fail because there exist large sets of up to d Pauli matrices that all commute, i. [sent-68, score-0.239]

42 (These φi are of interest in quantum information — they are called stabilizer states [18]. [sent-74, score-0.254]

43 In section 2, we state our results precisely, and discuss some specific applications to quantum state tomography. [sent-82, score-0.34]

44 In particular, ‖·‖_* = ‖·‖_1 is the trace or nuclear norm, ‖·‖_F = ‖·‖_2 is the Frobenius norm, and ‖·‖ = ‖·‖_∞ is the operator norm. [sent-87, score-0.226]

45 Also, |A⟩⟩⟨⟨A| is the superoperator that maps every matrix X ∈ C^{d×d} to the matrix A tr(A*X). [sent-90, score-0.136]

46 Let M ∈ Cd×d be an unknown matrix of rank at most r. [sent-92, score-0.126]

47 , W_{d²} be an orthonormal basis for C^{d×d}, with respect to the inner product (A, B) = tr(A*B). [sent-96, score-0.13]

48 For this to be possible, the measurement matrices Wi must be “incoherent” with respect to M . [sent-106, score-0.136]

49 , W_{d²} is incoherent if the Wi all have small operator norm, ‖Wi‖ ≤ K/√d, (1) where K is a constant. [sent-111, score-0.165]

50 ) Before proceeding further, let us sketch the connection between this problem and quantum state tomography. [sent-113, score-0.347]

51 We want to learn the state of the system, which is described by a density matrix ρ ∈ C^{d×d}; ρ is positive semidefinite, has trace 1, and has rank r ≪ d when the state is nearly pure. [sent-115, score-0.236]

52 These are matrices of the form P1 ⊗ · · · ⊗ Pn, where ⊗ denotes the tensor product (Kronecker product), and each Pi is a 2 × 2 matrix chosen from the following four possibilities: I = [1 0; 0 1], σx = [0 1; 1 0], σy = [0 −i; i 0], σz = [1 0; 0 −1]. [sent-117, score-0.149]
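
A minimal numpy sketch (our illustration; `pauli_observable` is a hypothetical helper) of building such a tensor-product observable and checking that, after the 1/√d scaling introduced below, its operator norm is 1/√d:

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
paulis = [I2, sx, sy, sz]

def pauli_observable(indices):
    """Tensor product P_{i_1} ⊗ ... ⊗ P_{i_n}, indices in {0, 1, 2, 3}."""
    return reduce(np.kron, (paulis[i] for i in indices))

n = 3                                          # qubits, so d = 2^n
d = 2 ** n
W = pauli_observable([1, 3, 0]) / np.sqrt(d)   # scaled: sigma_x ⊗ sigma_z ⊗ I
print(np.linalg.norm(W, 2))                    # operator norm = 1/sqrt(d)
```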

53 This is a special case of the above measurement model, where the measurement matrices Wi are the (scaled) Pauli observables (P1 ⊗ · · · ⊗ Pn)/√d, and they are incoherent with ‖Wi‖ ≤ K/√d, K = 1. [sent-119, score-0.298]

54 , W_{d²}}, and we define the sampling operator A : C^{d×d} → C^m as (A(X))_i = (d/√m) tr(S_i* X), i = 1, … [sent-128, score-0.124]
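
A direct transcription of this definition (the d/√m normalization is our reconstruction of the garbled equation; `apply_A` is an illustrative name):

```python
import numpy as np

def apply_A(S, X):
    """(A(X))_i = (d / sqrt(m)) * tr(S_i^* X) for a length-m list S of
    d x d matrices; the d/sqrt(m) normalization follows the definition
    above (as reconstructed from the extracted text)."""
    m, d = len(S), X.shape[0]
    return (d / np.sqrt(m)) * np.array(
        [np.trace(Si.conj().T @ X) for Si in S])

rng = np.random.default_rng(0)
S = [rng.standard_normal((4, 4)) for _ in range(8)]
print(apply_A(S, np.eye(4)).shape)   # (8,)
```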

55 We will construct an estimator M̂ by minimizing the nuclear norm, subject to the constraints specified by y. [sent-135, score-0.13]

56 (Note that one can view the nuclear norm as a convex relaxation of the rank function — thus these estimators can be computed efficiently. [sent-136, score-0.276]
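
For illustration: assuming the matrix Lasso in (5) has the usual form minimize (1/2)‖A(X) − y‖₂² + μ‖X‖_*, here is a bare-bones proximal-gradient sketch with singular-value soft-thresholding — a sketch under our assumptions, not the paper's implementation:

```python
import numpy as np

def svt(X, tau):
    """Prox of tau*||.||_*: soft-threshold the singular values of X."""
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vh

def matrix_lasso(S, y, mu, step=0.1, iters=500):
    """Minimize 0.5*||A(X) - y||_2^2 + mu*||X||_* by proximal gradient,
    where (A(X))_i = tr(S_i^* X) and the S_i are Hermitian (so A(X) is
    real for Hermitian X).  The step size must stay below 1/||A||_op^2."""
    d = S[0].shape[0]
    X = np.zeros((d, d), dtype=complex)
    for _ in range(iters):
        resid = np.array([np.trace(Si.conj().T @ X) for Si in S]).real - y
        grad = sum(r * Si for r, Si in zip(resid, S))  # adjoint A^*(resid)
        X = svt(X - step * grad, step * mu)
    return X
```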

57 To do this, we will show that the sampling operator A satisfies the restricted isometry property (RIP). [sent-141, score-0.232]

58 (6) (Here, ‖A(X)‖_2 denotes the ℓ2 norm of a vector, while ‖X‖_F denotes the Frobenius norm of a matrix. [sent-146, score-0.176]

59 We will show that Pauli measurements satisfy the RIP over a slightly larger set (the set of all X ∈ C^{d×d} such that ‖X‖_* ≤ √r ‖X‖_F), provided the number of measurements m is at least Ω(rd poly log d). [sent-148, score-0.479]
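
A quick numerical sanity check of this property (our experiment, not the paper's): sample random scaled Pauli observables and inspect the ratio ‖A(X)‖₂/‖X‖_F on random rank-r matrices, which should concentrate near 1 once m is a few multiples of rd:

```python
import numpy as np
from functools import reduce

rng = np.random.default_rng(1)
paulis = [np.eye(2, dtype=complex),
          np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]], dtype=complex),
          np.array([[1, 0], [0, -1]], dtype=complex)]

n, r = 4, 2
d = 2 ** n
m = 6 * r * d                        # a few times rd measurements

# m random scaled Pauli observables S_i = (P_1 ⊗ ... ⊗ P_n) / sqrt(d).
S = [reduce(np.kron, (paulis[i] for i in rng.integers(0, 4, n))) / np.sqrt(d)
     for _ in range(m)]

# RIP ratio ||A(X)||_2 / ||X||_F on random rank-r Hermitian matrices.
for _ in range(3):
    G = rng.standard_normal((d, r)) + 1j * rng.standard_normal((d, r))
    X = G @ G.conj().T
    AX = (d / np.sqrt(m)) * np.array([np.trace(Si.conj().T @ X) for Si in S])
    print(np.linalg.norm(AX) / np.linalg.norm(X, 'fro'))  # roughly 1
```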

60 This result generalizes to measurements in any basis with small operator norm. [sent-149, score-0.331]

61 , W_{d²}} be an orthonormal basis for C^{d×d} that is incoherent in the sense of (1). [sent-155, score-0.17]

62 In the remainder of this section, we discuss its applications to low-rank matrix recovery, and quantum state tomography in particular. [sent-164, score-0.46]

63 1 with previous results [3, 4, 5], we immediately obtain bounds on the accuracy of the matrix Dantzig selector (4) and the matrix Lasso (5). [sent-167, score-0.224]

64 In particular, for the first time we can show universal recovery of low-rank matrices via Pauli measurements, and near-optimal bounds on the accuracy of the reconstruction when the data are noisy [5]. [sent-168, score-0.301]

65 (Similar results hold for measurements in any incoherent operator basis. [sent-169, score-0.352]

66 Here, we will sketch a couple of these results that are of particular interest for quantum state tomography. [sent-172, score-0.325]

67 Here, M is the density matrix describing the state of a quantum mechanical object, and A(M ) is a vector of Pauli expectation values for the state M . [sent-173, score-0.432]

68 In many situations, the ideal state has low rank (for instance, a pure state has rank 1); however, for the actual state observed in an experiment, the density matrix M is full-rank with decaying eigenvalues. [sent-177, score-0.337]
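
Concretely, one can picture the split M = Mr + Mc as the best rank-r approximation plus a full-rank tail; a small sketch (our construction, with an arbitrary geometric spectrum):

```python
import numpy as np

rng = np.random.default_rng(2)
d, r = 16, 2

# A full-rank density matrix with geometrically decaying eigenvalues.
Z = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
Q, _ = np.linalg.qr(Z)                        # random unitary basis
eigs = 0.5 ** np.arange(d)
eigs = eigs / eigs.sum()                      # trace 1, PSD
M = Q @ np.diag(eigs) @ Q.conj().T

# Split M = M_r + M_c: best rank-r approximation plus the tail.
w, V = np.linalg.eigh(M)
top = np.argsort(w)[::-1][:r]
M_r = (V[:, top] * w[top]) @ V[:, top].conj().T
M_c = M - M_r
print(np.linalg.norm(M_c, 'nuc'), np.linalg.norm(M_c, 'fro'))
```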

69 Secondly, the measurements of A(M) are inherently noisy. [sent-179, score-0.11]

70 ) We would like to reconstruct M up to a small error in the nuclear or Frobenius norm. [sent-187, score-0.157]

71 Bounding the error in nuclear norm implies that, for any measurement allowed by quantum mechanics, the probability of distinguishing the state M̂ from M is small. [sent-189, score-0.57]
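
For reference, this is the Helstrom bound: with equal priors, the best one-shot success probability of distinguishing two states is 1/2 + (1/4)‖M − M̂‖₁. A tiny helper (ours, not from the paper) for the trace distance:

```python
import numpy as np

def trace_distance(A, B):
    """(1/2)*||A - B||_1 for Hermitian A, B; the nuclear norm of a
    Hermitian matrix equals the sum of the absolute eigenvalues."""
    return 0.5 * np.abs(np.linalg.eigvalsh(A - B)).sum()
```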

72 First, we can bound the error in nuclear norm (assuming the data has no noise): Proposition 2. [sent-194, score-0.265]

73 This bounds the error of M̂ in terms of the noise strength σ and the size of the tail Mc. [sent-205, score-0.11]

74 It is universal: one sampling operator A works for all matrices M . [sent-206, score-0.205]

75 First, suppose we do not assume anything about M, besides the fact that it is a density matrix for a quantum state. [sent-214, score-0.346]

76 Thus, even for arbitrary (not necessarily low-rank) quantum states, the estimator M̂ gives nontrivial results. [sent-216, score-0.254]

77 If we know that Mc is small in nuclear norm, then we can use equation (8). [sent-219, score-0.13]

78 Choose a random Pauli sampling operator A : C^{d×d} → C^m, with m = C rd log^6 d, for some absolute constant C. [sent-223, score-0.147]

79 The first term expresses the bias-variance tradeoff for estimating Mr , while the second term depends on the Frobenius norm of Mc . [sent-228, score-0.116]

80 Note that this bound is not universal: it shows that for all matrices M , a random choice of the sampling operator A is likely to work. [sent-239, score-0.252]

81 The general approach involving Dudley’s entropy bound is similar to [12], while the technical part of the proof (bounding certain covering numbers) uses ideas from [16]. [sent-242, score-0.214]
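
For reference, the standard form of Dudley's entropy inequality (quoted from the general literature, not verbatim from the paper) bounds the expected supremum of a Gaussian process {G_t} by covering numbers in its canonical metric:

```latex
% Dudley's entropy inequality (standard form), with canonical metric d_G:
\mathbb{E}\sup_{t \in T} G_t \;\le\; C \int_0^{\infty}
    \sqrt{\log N(T, d_G, \varepsilon)}\; d\varepsilon,
\qquad
d_G(s,t) = \bigl(\mathbb{E}\,|G_s - G_t|^2\bigr)^{1/2}.
```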

82 Here the round ket notation |Sj⟩⟩ means we view the matrix Sj as an element of the vector space C^{d²} with the Hilbert–Schmidt inner product; the round bra ⟨⟨Sj| denotes the adjoint element in the (dual) vector space. [sent-253, score-0.119]

83 This lets us upper-bound the covering numbers in dG with covering numbers in ‖·‖_X: N(U2, dG, ε) ≤ N(U2, ‖·‖_X, ε/(2R)) = N((1/√r) U2, ‖·‖_X, ε/(2R√r)). (18) [sent-291, score-0.272]

84 ) This gives a simple bound on the covering numbers: N((1/√r) U2, ‖·‖_X, ε) ≤ N(B1, ‖·‖_X, ε) ≤ N(K·BX, ‖·‖_X, ε). (20) [sent-300, score-0.152]

85 When ε is large, we will use a more sophisticated bound based on Maurey’s empirical method and entropy duality, which is due to [16] (see also [17]): N(B1, ‖·‖_X, ε) ≤ exp((C1^2 K^2/ε^2) log^3 d · log m). (21) We defer the proof of (21) to the next section. [sent-305, score-0.131]

86 3 Proof of Equation (21) (covering numbers of the nuclear-norm ball) Our result will follow easily from a bound on covering numbers introduced in [16] (where it appears as Lemma 1): Lemma 3. [sent-311, score-0.214]

87 The proof uses entropy duality to reduce the problem to bounding the “dual” covering number. [sent-324, score-0.236]

88 Also define the dual map S* : E* → ℓ_∞^m, and the associated dual covering number N(S*). [sent-330, score-0.185]

89 4 Outlook We have shown that random Pauli measurements obey the restricted isometry property (RIP), which implies strong error bounds for low-rank matrix recovery. [sent-342, score-0.395]

90 The key technical tool was a bound on covering numbers of the nuclear norm ball, due to Guédon et al. [16]. [sent-343, score-0.401]

91 For matrix completion, one can compare with the work of Negahban and Wainwright [10], where the sampling operator satisfies restricted strong convexity (RSC) over a certain set of “non-spiky” low-rank matrices. [sent-345, score-0.248]

92 There are also many questions pertaining to low-rank quantum state tomography. [sent-347, score-0.297]

93 Guaranteed minimum rank solutions to linear matrix equations via nuclear norm minimization. [sent-370, score-0.344]

94 Compressed sensing and robust recovery of low rank matrices. [sent-377, score-0.178]

95 Tight oracle bounds for low-rank matrix recovery from a minimal number of random measurements. [sent-383, score-0.19]

96 Restricted strong convexity and weighted matrix completion: Optimal bounds with noise. [sent-419, score-0.126]

97 Near-optimal signal recovery from random projections: universal encoding strategies. [sent-426, score-0.188]

98 Nuclear norm penalization and optimal rates for noisy low rank matrix completion. [sent-499, score-0.214]

99 New and improved Johnson-Lindenstrauss embeddings via the restricted isometry property. [sent-520, score-0.132]

100 Stability and instance optimality for gaussian measurements in compressed sensing. [sent-545, score-0.231]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('pauli', 0.673), ('rip', 0.256), ('quantum', 0.254), ('cd', 0.248), ('mc', 0.214), ('measurements', 0.187), ('nuclear', 0.13), ('covering', 0.105), ('universal', 0.098), ('dg', 0.096), ('operator', 0.096), ('tomography', 0.095), ('recovery', 0.09), ('mr', 0.089), ('norm', 0.088), ('poly', 0.083), ('dudley', 0.082), ('matrices', 0.081), ('isometry', 0.078), ('incoherent', 0.069), ('matrix', 0.068), ('frobenius', 0.066), ('vi', 0.065), ('completion', 0.062), ('candes', 0.06), ('banach', 0.06), ('rank', 0.058), ('selector', 0.056), ('measurement', 0.055), ('dantzig', 0.054), ('orthonormal', 0.053), ('tail', 0.051), ('rd', 0.05), ('sj', 0.049), ('basis', 0.048), ('bound', 0.047), ('ball', 0.047), ('cm', 0.046), ('bx', 0.046), ('compressed', 0.044), ('state', 0.043), ('iid', 0.042), ('schatten', 0.041), ('dual', 0.04), ('wi', 0.04), ('fourier', 0.04), ('entropy', 0.04), ('observables', 0.038), ('tr', 0.038), ('gross', 0.036), ('bp', 0.035), ('lasso', 0.035), ('duality', 0.035), ('vm', 0.035), ('bounding', 0.034), ('flammia', 0.034), ('maurey', 0.034), ('standards', 0.034), ('bounds', 0.032), ('gu', 0.031), ('numbers', 0.031), ('arxiv', 0.031), ('restricted', 0.03), ('sensing', 0.03), ('vj', 0.03), ('crd', 0.03), ('cates', 0.03), ('inner', 0.029), ('fix', 0.029), ('sampling', 0.028), ('tradeoff', 0.028), ('lemma', 0.028), ('sketch', 0.028), ('reconstruct', 0.027), ('noise', 0.027), ('rsc', 0.027), ('convexity', 0.026), ('rademacher', 0.026), ('eg', 0.024), ('certi', 0.024), ('reconstructed', 0.024), ('sm', 0.024), ('embeddings', 0.024), ('density', 0.024), ('absolute', 0.023), ('stronger', 0.023), ('pn', 0.023), ('fazel', 0.023), ('adjoint', 0.022), ('log', 0.022), ('singular', 0.022), ('proof', 0.022), ('let', 0.022), ('balls', 0.021), ('zi', 0.021), ('proposition', 0.021), ('analogue', 0.021), ('recht', 0.021), ('inequality', 0.021), ('negahban', 0.02)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0 297 nips-2011-Universal low-rank matrix recovery from Pauli measurements

Author: Yi-Kai Liu

Abstract: We study the problem of reconstructing an unknown matrix M of rank r and dimension d using O(rd poly log d) Pauli measurements. This has applications in quantum state tomography, and is a non-commutative analogue of a well-known problem in compressed sensing: recovering a sparse vector from a few of its Fourier coefficients. We show that almost all sets of O(rd log^6 d) Pauli measurements satisfy the rank-r restricted isometry property (RIP). This implies that M can be recovered from a fixed (“universal”) set of Pauli measurements, using nuclear-norm minimization (e.g., the matrix Lasso), with nearly-optimal bounds on the error. A similar result holds for any class of measurements that use an orthonormal operator basis whose elements have small operator norm. Our proof uses Dudley’s inequality for Gaussian processes, together with bounds on covering numbers obtained via entropy duality.

2 0.11319225 264 nips-2011-Sparse Recovery with Brownian Sensing

Author: Alexandra Carpentier, Odalric-ambrym Maillard, Rémi Munos

Abstract: We consider the problem of recovering the parameter α ∈ R^K of a sparse function f (i.e. the number of non-zero entries of α is small compared to the number K of features) given noisy evaluations of f at a set of well-chosen sampling points. We introduce an additional randomization process, called Brownian sensing, based on the computation of stochastic integrals, which produces a Gaussian sensing matrix, for which good recovery properties are proven, independently of the number of sampling points N, even when the features are arbitrarily non-orthogonal. Under the assumption that f is Hölder continuous with exponent at least 1/2, we provide an estimate α̂ of the parameter such that ‖α̂ − α‖2 = O(‖η‖2/√N), where η is the observation noise. The method uses a set of sampling points uniformly distributed along a one-dimensional curve selected according to the features. We report numerical experiments illustrating our method.

3 0.11011837 209 nips-2011-Orthogonal Matching Pursuit with Replacement

Author: Prateek Jain, Ambuj Tewari, Inderjit S. Dhillon

Abstract: In this paper, we consider the problem of compressed sensing where the goal is to recover all sparse vectors using a small number of fixed linear measurements. For this problem, we propose a novel partial hard-thresholding operator that leads to a general family of iterative algorithms. While one extreme of the family yields well known hard thresholding algorithms like IHT and HTP [17, 10], the other end of the spectrum leads to a novel algorithm that we call Orthogonal Matching Pursuit with Replacement (OMPR). OMPR, like the classic greedy algorithm OMP, adds exactly one coordinate to the support at each iteration, based on the correlation with the current residual. However, unlike OMP, OMPR also removes one coordinate from the support. This simple change allows us to prove that OMPR has the best known guarantees for sparse recovery in terms of the Restricted Isometry Property (a condition on the measurement matrix). In contrast, OMP is known to have very weak performance guarantees under RIP. Given its simple structure, we are able to extend OMPR using locality sensitive hashing to get OMPR-Hash, the first provably sub-linear (in dimensionality) algorithm for sparse recovery. Our proof techniques are novel and flexible enough to also permit the tightest known analysis of popular iterative algorithms such as CoSaMP and Subspace Pursuit. We provide experimental results on large problems providing recovery for vectors of size up to million dimensions. We demonstrate that for large-scale problems our proposed methods are more robust and faster than existing methods.

4 0.10541181 257 nips-2011-SpaRCS: Recovering low-rank and sparse matrices from compressive measurements

Author: Andrew E. Waters, Aswin C. Sankaranarayanan, Richard Baraniuk

Abstract: We consider the problem of recovering a matrix M that is the sum of a low-rank matrix L and a sparse matrix S from a small set of linear measurements of the form y = A(M) = A(L + S). This model subsumes three important classes of signal recovery problems: compressive sensing, affine rank minimization, and robust principal component analysis. We propose a natural optimization problem for signal recovery under this model and develop a new greedy algorithm called SpaRCS to solve it. Empirically, SpaRCS inherits a number of desirable properties from the state-of-the-art CoSaMP and ADMiRA algorithms, including exponential convergence and efficient implementation. Simulation results with video compressive sensing, hyperspectral imaging, and robust matrix completion data sets demonstrate both the accuracy and efficacy of the algorithm. 1

5 0.09765885 145 nips-2011-Learning Eigenvectors for Free

Author: Wouter M. Koolen, Wojciech Kotlowski, Manfred K. Warmuth

Abstract: We extend the classical problem of predicting a sequence of outcomes from a finite alphabet to the matrix domain. In this extension, the alphabet of n outcomes is replaced by the set of all dyads, i.e. outer products uu⊤ where u is a vector in R^n of unit length. Whereas in the classical case the goal is to learn (i.e. sequentially predict as well as) the best multinomial distribution, in the matrix case we desire to learn the density matrix that best explains the observed sequence of dyads. We show how popular online algorithms for learning a multinomial distribution can be extended to learn density matrices. Intuitively, learning the n² parameters of a density matrix is much harder than learning the n parameters of a multinomial distribution. Completely surprisingly, we prove that the worst-case regrets of certain classical algorithms and their matrix generalizations are identical. The reason is that the worst-case sequence of dyads share a common eigensystem, i.e. the worst case regret is achieved in the classical case. So these matrix algorithms learn the eigenvectors without any regret.

6 0.097351253 203 nips-2011-On the accuracy of l1-filtering of signals with block-sparse structure

7 0.092659757 270 nips-2011-Statistical Performance of Convex Tensor Decomposition

8 0.09044797 211 nips-2011-Penalty Decomposition Methods for Rank Minimization

9 0.087346688 239 nips-2011-Robust Lasso with missing and grossly corrupted observations

10 0.081966221 289 nips-2011-Trace Lasso: a trace norm regularization for correlated designs

11 0.077184319 259 nips-2011-Sparse Estimation with Structured Dictionaries

12 0.075220361 73 nips-2011-Divide-and-Conquer Matrix Factorization

13 0.07426206 118 nips-2011-High-dimensional regression with noisy and missing data: Provable guarantees with non-convexity

14 0.070468813 165 nips-2011-Matrix Completion for Multi-label Image Classification

15 0.067413248 202 nips-2011-On the Universality of Online Mirror Descent

16 0.065899678 228 nips-2011-Quasi-Newton Methods for Markov Chain Monte Carlo

17 0.064284727 70 nips-2011-Dimensionality Reduction Using the Sparse Linear Model

18 0.061146867 108 nips-2011-Greedy Algorithms for Structurally Constrained High Dimensional Problems

19 0.055526532 180 nips-2011-Multiple Instance Filtering

20 0.05424739 207 nips-2011-Optimal learning rates for least squares SVMs using Gaussian kernels


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.152), (1, -0.034), (2, -0.04), (3, -0.148), (4, -0.073), (5, 0.068), (6, 0.018), (7, 0.046), (8, 0.067), (9, 0.041), (10, 0.07), (11, 0.013), (12, -0.058), (13, 0.024), (14, 0.083), (15, -0.097), (16, 0.053), (17, -0.035), (18, -0.089), (19, 0.077), (20, -0.014), (21, 0.005), (22, -0.038), (23, -0.116), (24, -0.04), (25, 0.051), (26, -0.016), (27, -0.032), (28, 0.064), (29, -0.039), (30, -0.01), (31, 0.131), (32, 0.027), (33, -0.083), (34, 0.025), (35, -0.038), (36, -0.022), (37, -0.016), (38, 0.001), (39, -0.085), (40, 0.01), (41, -0.072), (42, 0.145), (43, -0.055), (44, -0.079), (45, 0.068), (46, -0.09), (47, 0.057), (48, 0.003), (49, -0.026)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.93198401 297 nips-2011-Universal low-rank matrix recovery from Pauli measurements

Author: Yi-Kai Liu

Abstract: We study the problem of reconstructing an unknown matrix M of rank r and dimension d using O(rd poly log d) Pauli measurements. This has applications in quantum state tomography, and is a non-commutative analogue of a well-known problem in compressed sensing: recovering a sparse vector from a few of its Fourier coefficients. We show that almost all sets of O(rd log^6 d) Pauli measurements satisfy the rank-r restricted isometry property (RIP). This implies that M can be recovered from a fixed (“universal”) set of Pauli measurements, using nuclear-norm minimization (e.g., the matrix Lasso), with nearly-optimal bounds on the error. A similar result holds for any class of measurements that use an orthonormal operator basis whose elements have small operator norm. Our proof uses Dudley’s inequality for Gaussian processes, together with bounds on covering numbers obtained via entropy duality.

2 0.80202788 257 nips-2011-SpaRCS: Recovering low-rank and sparse matrices from compressive measurements

Author: Andrew E. Waters, Aswin C. Sankaranarayanan, Richard Baraniuk

Abstract: We consider the problem of recovering a matrix M that is the sum of a low-rank matrix L and a sparse matrix S from a small set of linear measurements of the form y = A(M) = A(L + S). This model subsumes three important classes of signal recovery problems: compressive sensing, affine rank minimization, and robust principal component analysis. We propose a natural optimization problem for signal recovery under this model and develop a new greedy algorithm called SpaRCS to solve it. Empirically, SpaRCS inherits a number of desirable properties from the state-of-the-art CoSaMP and ADMiRA algorithms, including exponential convergence and efficient implementation. Simulation results with video compressive sensing, hyperspectral imaging, and robust matrix completion data sets demonstrate both the accuracy and efficacy of the algorithm. 1

3 0.78338021 203 nips-2011-On the accuracy of l1-filtering of signals with block-sparse structure

Author: Fatma K. Karzan, Arkadi S. Nemirovski, Boris T. Polyak, Anatoli Juditsky

Abstract: We discuss new methods for the recovery of signals with block-sparse structure, based on 1 -minimization. Our emphasis is on the efficiently computable error bounds for the recovery routines. We optimize these bounds with respect to the method parameters to construct the estimators with improved statistical properties. We justify the proposed approach with an oracle inequality which links the properties of the recovery algorithms and the best estimation performance. 1

4 0.76048881 209 nips-2011-Orthogonal Matching Pursuit with Replacement

Author: Prateek Jain, Ambuj Tewari, Inderjit S. Dhillon

Abstract: In this paper, we consider the problem of compressed sensing where the goal is to recover all sparse vectors using a small number of fixed linear measurements. For this problem, we propose a novel partial hard-thresholding operator that leads to a general family of iterative algorithms. While one extreme of the family yields well known hard thresholding algorithms like IHT and HTP [17, 10], the other end of the spectrum leads to a novel algorithm that we call Orthogonal Matching Pursuit with Replacement (OMPR). OMPR, like the classic greedy algorithm OMP, adds exactly one coordinate to the support at each iteration, based on the correlation with the current residual. However, unlike OMP, OMPR also removes one coordinate from the support. This simple change allows us to prove that OMPR has the best known guarantees for sparse recovery in terms of the Restricted Isometry Property (a condition on the measurement matrix). In contrast, OMP is known to have very weak performance guarantees under RIP. Given its simple structure, we are able to extend OMPR using locality sensitive hashing to get OMPR-Hash, the first provably sub-linear (in dimensionality) algorithm for sparse recovery. Our proof techniques are novel and flexible enough to also permit the tightest known analysis of popular iterative algorithms such as CoSaMP and Subspace Pursuit. We provide experimental results on large problems providing recovery for vectors of size up to million dimensions. We demonstrate that for large-scale problems our proposed methods are more robust and faster than existing methods.

5 0.75663018 264 nips-2011-Sparse Recovery with Brownian Sensing

Author: Alexandra Carpentier, Odalric-ambrym Maillard, Rémi Munos

Abstract: We consider the problem of recovering the parameter α ∈ R^K of a sparse function f (i.e. the number of non-zero entries of α is small compared to the number K of features) given noisy evaluations of f at a set of well-chosen sampling points. We introduce an additional randomization process, called Brownian sensing, based on the computation of stochastic integrals, which produces a Gaussian sensing matrix, for which good recovery properties are proven, independently of the number of sampling points N, even when the features are arbitrarily non-orthogonal. Under the assumption that f is Hölder continuous with exponent at least 1/2, we provide an estimate α̂ of the parameter such that ‖α̂ − α‖2 = O(‖η‖2/√N), where η is the observation noise. The method uses a set of sampling points uniformly distributed along a one-dimensional curve selected according to the features. We report numerical experiments illustrating our method.

6 0.72323763 73 nips-2011-Divide-and-Conquer Matrix Factorization

7 0.5801093 211 nips-2011-Penalty Decomposition Methods for Rank Minimization

8 0.5380463 230 nips-2011-RTRMC: A Riemannian trust-region method for low-rank matrix completion

9 0.52213651 239 nips-2011-Robust Lasso with missing and grossly corrupted observations

10 0.52171493 172 nips-2011-Minimax Localization of Structural Information in Large Noisy Matrices

11 0.50646597 259 nips-2011-Sparse Estimation with Structured Dictionaries

12 0.49486837 265 nips-2011-Sparse recovery by thresholded non-negative least squares

13 0.46570107 108 nips-2011-Greedy Algorithms for Structurally Constrained High Dimensional Problems

14 0.45726129 270 nips-2011-Statistical Performance of Convex Tensor Decomposition

15 0.4161472 118 nips-2011-High-dimensional regression with noisy and missing data: Provable guarantees with non-convexity

16 0.40702659 65 nips-2011-Convergent Fitted Value Iteration with Linear Function Approximation

17 0.40257263 202 nips-2011-On the Universality of Online Mirror Descent

18 0.40200415 161 nips-2011-Linearized Alternating Direction Method with Adaptive Penalty for Low-Rank Representation

19 0.38303941 260 nips-2011-Sparse Features for PCA-Like Linear Regression

20 0.38173658 5 nips-2011-A Denoising View of Matrix Completion


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.057), (4, 0.036), (20, 0.037), (26, 0.048), (31, 0.066), (39, 0.271), (43, 0.088), (45, 0.084), (57, 0.06), (74, 0.055), (83, 0.035), (99, 0.063)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.77952552 297 nips-2011-Universal low-rank matrix recovery from Pauli measurements

Author: Yi-Kai Liu

Abstract: We study the problem of reconstructing an unknown matrix M of rank r and dimension d using O(rd poly log d) Pauli measurements. This has applications in quantum state tomography, and is a non-commutative analogue of a well-known problem in compressed sensing: recovering a sparse vector from a few of its Fourier coefficients. We show that almost all sets of O(rd log^6 d) Pauli measurements satisfy the rank-r restricted isometry property (RIP). This implies that M can be recovered from a fixed (“universal”) set of Pauli measurements, using nuclear-norm minimization (e.g., the matrix Lasso), with nearly-optimal bounds on the error. A similar result holds for any class of measurements that use an orthonormal operator basis whose elements have small operator norm. Our proof uses Dudley’s inequality for Gaussian processes, together with bounds on covering numbers obtained via entropy duality.

2 0.76681858 203 nips-2011-On the accuracy of l1-filtering of signals with block-sparse structure

Author: Fatma K. Karzan, Arkadi S. Nemirovski, Boris T. Polyak, Anatoli Juditsky

Abstract: We discuss new methods for the recovery of signals with block-sparse structure, based on 1 -minimization. Our emphasis is on the efficiently computable error bounds for the recovery routines. We optimize these bounds with respect to the method parameters to construct the estimators with improved statistical properties. We justify the proposed approach with an oracle inequality which links the properties of the recovery algorithms and the best estimation performance. 1

3 0.71784902 151 nips-2011-Learning a Tree of Metrics with Disjoint Visual Features

Author: Kristen Grauman, Fei Sha, Sung J. Hwang

Abstract: We introduce an approach to learn discriminative visual representations while exploiting external semantic knowledge about object category relationships. Given a hierarchical taxonomy that captures semantic similarity between the objects, we learn a corresponding tree of metrics (ToM). In this tree, we have one metric for each non-leaf node of the object hierarchy, and each metric is responsible for discriminating among its immediate subcategory children. Specifically, a Mahalanobis metric learned for a given node must satisfy the appropriate (dis)similarity constraints generated only among its subtree members’ training instances. To further exploit the semantics, we introduce a novel regularizer coupling the metrics that prefers a sparse disjoint set of features to be selected for each metric relative to its ancestor (supercategory) nodes’ metrics. Intuitively, this reflects that visual cues most useful to distinguish the generic classes (e.g., feline vs. canine) should be different than those cues most useful to distinguish their component fine-grained classes (e.g., Persian cat vs. Siamese cat). We validate our approach with multiple image datasets using the WordNet taxonomy, show its advantages over alternative metric learning approaches, and analyze the meaning of attribute features selected by our algorithm. 1

4 0.71205509 82 nips-2011-Efficient coding of natural images with a population of noisy Linear-Nonlinear neurons

Author: Yan Karklin, Eero P. Simoncelli

Abstract: Efficient coding provides a powerful principle for explaining early sensory coding. Most attempts to test this principle have been limited to linear, noiseless models, and when applied to natural images, have yielded oriented filters consistent with responses in primary visual cortex. Here we show that an efficient coding model that incorporates biologically realistic ingredients – input and output noise, nonlinear response functions, and a metabolic cost on the firing rate – predicts receptive fields and response nonlinearities similar to those observed in the retina. Specifically, we develop numerical methods for simultaneously learning the linear filters and response nonlinearities of a population of model neurons, so as to maximize information transmission subject to metabolic costs. When applied to an ensemble of natural images, the method yields filters that are center-surround and nonlinearities that are rectifying. The filters are organized into two populations, with On- and Off-centers, which independently tile the visual space. As observed in the primate retina, the Off-center neurons are more numerous and have filters with smaller spatial extent. In the absence of noise, our method reduces to a generalized version of independent components analysis, with an adapted nonlinear “contrast” function; in this case, the optimal filters are localized and oriented.

5 0.60539448 182 nips-2011-Nearest Neighbor based Greedy Coordinate Descent

Author: Inderjit S. Dhillon, Pradeep K. Ravikumar, Ambuj Tewari

Abstract: Increasingly, optimization problems in machine learning, especially those arising from high-dimensional statistical estimation, have a large number of variables. Modern statistical estimators developed over the past decade have statistical or sample complexity that depends only weakly on the number of parameters when there is some structure to the problem, such as sparsity. A central question is whether similar advances can be made in their computational complexity as well. In this paper, we propose strategies that indicate that such advances can indeed be made. In particular, we investigate the greedy coordinate descent algorithm, and note that performing the greedy step efficiently weakens the costly dependence on the problem size provided the solution is sparse. We then propose a suite of methods that perform these greedy steps efficiently by a reduction to nearest neighbor search. We also devise a more amenable form of greedy descent for composite non-smooth objectives; as well as several approximate variants of such greedy descent. We develop a practical implementation of our algorithm that combines greedy coordinate descent with locality sensitive hashing. Without tuning the latter data structure, we are not only able to significantly speed up the vanilla greedy method, but also outperform cyclic descent when the problem size becomes large. Our results indicate the effectiveness of our nearest neighbor strategies, and also point to many open questions regarding the development of computational geometric techniques tailored towards first-order optimization methods.

6 0.57316756 209 nips-2011-Orthogonal Matching Pursuit with Replacement

7 0.56837249 19 nips-2011-Active Classification based on Value of Classifier

8 0.54939181 29 nips-2011-Algorithms and hardness results for parallel large margin learning

9 0.54934454 186 nips-2011-Noise Thresholds for Spectral Clustering

10 0.54716146 183 nips-2011-Neural Reconstruction with Approximate Message Passing (NeuRAMP)

11 0.54511702 159 nips-2011-Learning with the weighted trace-norm under arbitrary sampling distributions

12 0.54459339 204 nips-2011-Online Learning: Stochastic, Constrained, and Smoothed Adversaries

13 0.54312265 258 nips-2011-Sparse Bayesian Multi-Task Learning

14 0.54138577 84 nips-2011-EigenNet: A Bayesian hybrid of generative and conditional models for sparse learning

15 0.54098135 265 nips-2011-Sparse recovery by thresholded non-negative least squares

16 0.54022717 71 nips-2011-Directed Graph Embedding: an Algorithm based on Continuous Limits of Laplacian-type Operators

17 0.53931952 22 nips-2011-Active Ranking using Pairwise Comparisons

18 0.53880239 102 nips-2011-Generalised Coupled Tensor Factorisation

19 0.53811824 92 nips-2011-Expressive Power and Approximation Errors of Restricted Boltzmann Machines

20 0.53682435 267 nips-2011-Spectral Methods for Learning Multivariate Latent Tree Structure